Game Development Community

CMDscan.l fix for block comments

by Tom Spilman · in Torque Game Engine · 06/01/2006 (7:55 pm) · 7 replies

The CMDscan.l file which is used to generate the TorqueScript scanner has a bug with how it handles block comments. Currently the handler looks like this:

"/*" {
      register int c;
      for ( ; ; )
      {
         while ( (c = CMDgetc()) != '*' && c != EOF )
         ; /* eat up text of comment */

         if ( c == '*' )
         {
            while ( (c = CMDgetc()) == '*' )
            ;
            if ( c == '/' )
               break; /* found the end */
         }

         if ( c == EOF )
         {
            CMDerror( "EOF in comment" );
            break;
         }
      }
   }

The first and biggest issue is that this fails to increment the line count. This causes the line numbers for TorqueScript opcodes to be incorrect which in turn makes the TelnetDebugger return incorrect line numbers when stepping thru script and fail to accept valid breakpoints. I think this has cause a few of the Torsion bug reports i've gotten with regards to breakpoints not working.

Second the use of CMDgetc() here is incorrect. CMDgetc() reads directly from the buffer and skips characters already in scanners cache. yyinput() is the right function which reads from the cache and refills it using CMDgetc() if need be. This is importaint because you often have a /* followed by a newline... if you skip reading from the cache you can miss a newline and all the opcode lines are off again.

Finally if you fix both of those the logic of this handler is wrong. It won't accept Javadoc style /** */ block comments as any * found screws up the loop. And this implementation came right out of the flex documentation... is that cheating? :)

So after spending 2 hours learning Flex and figuring all this out i have a new comment block handler:

"/*" {
         register int c = 0, l;
         for ( ; ; )
         {
            l = c;
            c = yyinput();

            // Is this an open comment?
            if ( c == EOF )
            {
               CMDerror( "unexpected end of file found in comment" );
               break;
            }

            // Increment line numbers.
            else if ( c == '\n' )
               lineIndex++;

            // Did we find the end of the comment?
            else if ( l == '*' && c == '/' )
               break;
         }
      }

This works correctly in all cases.

While i was at it i fixed the CMDerror() function to properly report error messages.

void CMDerror( char* format, ... )
{
   char error[1024];
   va_list args;
   va_start( args, format );
   _vsnprintf( error, sizeof(error), format, args );

   if ( fileName )
      Con::errorf(ConsoleLogEntry::Script, "%s Line: %d - %s", fileName, lineIndex, error );
   else 
      Con::errorf(ConsoleLogEntry::Script, error);

   gSyntaxError = true;
}

I hope this can make it into all the TAPs quickly... especially TGB with the forthcoming flood of contest participants.

About the author

Tom is a programmer and co-owner of Sickhead Games, LLC.


#1
06/03/2006 (1:26 pm)
Thanks Tom,
This is exactly where I was stuck :)
#2
06/03/2006 (1:46 pm)
Thanks alot Tom, looks good.
#3
06/05/2006 (11:06 am)
Tom,
I've integrated your block comment fix for the parser (TGB), however I'm not sure that your CMDError fix is necessary or desired? I resolved the issue that had advanced error reporting in CMDError not reporting to the console properly. I notice your CMDError does not do the advanced error reporting, is there a reason your solution is 'fixed' more than the current implementation in TGB? I'm not terribly familiar with the console parser, let me know if I'm missing something.


Cheers,
-Justin
#4
06/05/2006 (12:43 pm)
Thanks Tom.

For anyone interested, it is possible to take the above changes and roll them directly in to CMDscan.cc for 'case 34:' so you can avoid setting up Flex and recompiling the parser.

The compiler error messages per line numbering (#line) might be off, but CMDscan.cc is stable, so there isn't any errors reported right now anyway.
#5
06/05/2006 (1:06 pm)
@Justin - I was working from TSE which i thought was the latest now with the merge of TGE 1.4... but anyway... the advanced reporting wasn't in the source i worked from. If it was i would have fixed it something like this (off the cuff, non-compiled or tested)...

void CMDerror(char *format, ...)
{
   Compiler::gSyntaxError = true;

   const int BUFMAX = 1024;.
   char tempBuf[BUFMAX];
   va_list args;   
   va_start( args, format );   
   _vsnprintf( tempBuf, BUFMAX, format, args );

   if(fileName)
   {
      Con::errorf(ConsoleLogEntry::Script, "%s Line: %d - %s", fileName, lineIndex, tempBuf);

#ifndef NO_ADVANCED_ERROR_REPORT
      // blah!
#endif

      // blah blah!
  }
   else
      Con::errorf(ConsoleLogEntry::Script, tempBuf);
}

This leaves your advance reporting and still reports the actually syntax error message. One thing to maybe protect against is if the 'format' parameter is null, but that would be a bug IMO.

The reason to do it this way is that CMDerror should normally pass a specific error message... one better than "Syntax Error" when possible. For example if you look at my fix for the comments you'll see i do...

CMDerror( "unexpected end of file found in comment" );

By ignoring the parameters to CMDerror this string is never printed. Although the advanced error reporting marks where the error is which is a nice improvement, nothing beats a specific message.
#6
06/05/2006 (1:19 pm)
@Jeff - You can in a fix, but hopefully the next minor update to TGE will finally add the flex yacc stuff as a custom build step (I was able to get it working myself... should i post the steps?). It would make it much easier for people to post improvements and bug fixes for the compiler.
#7
07/03/2006 (1:40 am)
This has a compile error on Mac OS X and gcc4. To fix just change _vsnprintf to vsnprintf. Don't ask me why but it works.