Game Development Community

A TorqueScript (pre)compiler - anybody interested?

by Daniel Buckmaster · in Torque 3D Professional · 11/09/2013 (11:58 pm) · 34 replies

I've been thinking about this for some time, and it's been bugging me enough that I figured I'd step out and gauge interest. What I want to do is implement a preprocessor for TorqueScript. Not like the C preprocessor, but nearly a full-on compiler that parses your scripts, does any useful steps in the middle, and outputs the actual TorqueScript that your game runs.

Some 'preprocessing' steps I would want to implement are:

Optimisations. For example, detecting when a local variable is constant (i.e. it's assigned a value only once) and replacing further occurrences of it with its value. Should save some name lookups. Or another example - pulling complex expressions out of for loops and such. This is a bit of a tricky one, but consider this:
for(%i = 0; %i < getWordCount(%string); %i++)
getWordCount is called every time the condition is checked. It's more performant to write this:
%len = getWordCount(%string);
for(%i = 0; %i < %len; %i++)
But why do that when it should be the compiler's job? Of course, the transformation isn't always valid: the compiler would have to check that %string can't change length over the course of the loop. It also doesn't work if the function called (in this case getWordCount) isn't pure, and may return a different result given the same input. But in the common case it holds. You could go in all sorts of directions with this; replacing constants doesn't even scratch the surface of the possibilities.
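To make the loop example concrete, here is a rough Python sketch of the hoisting step working on a single for-header as text. Everything in it is an assumption for illustration: the PURE_FUNCTIONS whitelist, the %__len temporary name, and the regex-based matching (a real implementation would work on a parsed AST and would also have to prove that the argument isn't modified inside the loop, which this sketch does not attempt).

```python
import re

# Hypothetical whitelist: functions we *assume* are pure.
PURE_FUNCTIONS = {"getWordCount", "strlen"}

# Matches headers of the form: for(init; %var < fn(args); step)
FOR_HEADER = re.compile(
    r"for\((?P<init>[^;]*);\s*(?P<lhs>%\w+)\s*<\s*"
    r"(?P<call>(?P<fn>\w+)\((?P<args>[^()]*)\));\s*(?P<step>[^)]*)\)"
)

def hoist_loop_bound(line, temp="%__len"):
    """Return the lines that replace `line`, hoisting a pure call if possible."""
    m = FOR_HEADER.fullmatch(line.strip())
    if m is None or m.group("fn") not in PURE_FUNCTIONS:
        return [line]  # leave anything we don't understand untouched
    header = "for({}; {} < {}; {})".format(
        m.group("init"), m.group("lhs"), temp, m.group("step"))
    # NOTE: no check that the call's arguments stay unchanged in the loop body.
    return ["{} = {};".format(temp, m.group("call")), header]

for out in hoist_loop_bound("for(%i = 0; %i < getWordCount(%string); %i++)"):
    print(out)
```

Calls that aren't on the whitelist are passed through unchanged, which is the conservative default you'd want from an optimiser.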

Syntax sugar. For example, plain-language operators like 'and' and 'or' instead of '&&' and '||'. Also, maybe, removing the % prefix for local variables to make code that little bit more modern/readable. Maybe removing the requirement for semicolons. It's all up for grabs.
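As a minimal sketch of the 'and'/'or' idea, the Python below rewrites the word operators everywhere except inside string literals. It is text-based rather than AST-based, it does not handle comments, and the function name is made up for this example.

```python
import re

# String literals (double- or single-quoted) are copied through untouched.
DOUBLE = r'"(?:\\.|[^"\\])*"'
SINGLE = r"'(?:\\.|[^'\\])*'"
STRING = re.compile(DOUBLE + "|" + SINGLE)

WORD_OPS = {"and": "&&", "or": "||"}
# The lookbehind stops us touching %and, $or, or words such as "for".
WORD_OP = re.compile(r"(?<![%$\w])(and|or)\b")

def _replace(chunk):
    return WORD_OP.sub(lambda m: WORD_OPS[m.group(1)], chunk)

def desugar_word_operators(source):
    """Replace 'and'/'or' with '&&'/'||' outside string literals."""
    out, pos = [], 0
    for m in STRING.finditer(source):
        out.append(_replace(source[pos:m.start()]))  # code before the string
        out.append(m.group(0))                       # the string itself
        pos = m.end()
    out.append(_replace(source[pos:]))
    return "".join(out)

print(desugar_word_operators('if(%a and %b or %or) echo("this and that");'))
```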

String construction. Since strings are so important in TS, I'd love to implement new ways of putting them together. One way to do this is vector notation:
%pos = VectorAdd([1, 2, 3], [4, %y, %z]);
// Which compiles to
%pos = VectorAdd("1 2 3", "4" SPC %y SPC %z);
Another way is string interpolation:
%greeting = "Hello, @{%name}!";
// Which compiles to
%greeting = "Hello, " @ %name @ "!";
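Both string transformations above are simple enough to sketch in Python. The function names are hypothetical, and the sketch makes some simplifying assumptions: vector elements are either variables (starting with % or $) or plain constants, elements contain no commas of their own, and interpolated expressions contain no closing brace.

```python
import re

def compile_vector(literal):
    """Turn '[4, %y, %z]' into '"4" SPC %y SPC %z'.

    Adjacent constant elements are folded into one string literal.
    """
    items = [item.strip() for item in literal.strip()[1:-1].split(",")]
    groups = []  # list of (is_constant, text)
    for item in items:
        is_const = not item.startswith(("%", "$"))
        if is_const and groups and groups[-1][0]:
            groups[-1] = (True, groups[-1][1] + " " + item)
        else:
            groups.append((is_const, item))
    return " SPC ".join('"{}"'.format(t) if c else t for c, t in groups)

def compile_interpolation(literal):
    """Turn '"Hello, @{%name}!"' into '"Hello, " @ %name @ "!"'."""
    pieces = re.split(r"@\{(.*?)\}", literal[1:-1])
    parts = []
    for i, piece in enumerate(pieces):
        if i % 2 == 1:
            parts.append(piece.strip())       # an embedded expression
        elif piece:
            parts.append('"{}"'.format(piece))  # a non-empty literal chunk
    return " @ ".join(parts) if parts else '""'

print(compile_vector("[1, 2, 3]"))
print(compile_vector("[4, %y, %z]"))
print(compile_interpolation('"Hello, @{%name}!"'))
```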

Destructuring assignment. This is a huge one that I'd love to try. Before I explain, here's a code example:
%data = ContainerRayCast(...);
%hitObj = getWord(%data, 0);
if(%hitObj) {
   %hitPos = getWords(%data, 1, 3);
   %hitNormal = getWords(%data, 4, 6);
}
I haven't even had to write that snippet many times, but it gets boring. And returning multiple results as words in a string is a common pattern in TorqueScript. So it'd be much nicer to write something like:
[%hitObj, %hitPos, %hitNormal] = ContainerRayCast(...);
using the vector notation above to pull groups out of the word-delimited string. Note that this doesn't work entirely as given, because the last two values take up 3 words each. So in a basic implementation you'd be looking at
[%hitObj, %hx, %hy, %hz, %nx, %ny, %nz] = ContainerRayCast(...);
Which isn't awful, but could definitely be improved. Maybe like this:
[%hitObj, %hitPos 3, %hitNormal 3] = ...
providing word lengths for sequences that are > 1 word. There should also be syntax for destructuring records (newlines) and fields (tabs):
function giveMeData() {
   return "word word" TAB "tab" NL "newline";
   // Or,
   return ["word", "word": "tab"; "newline"];
}

[$a, $b: $c; $d] = giveMeData();
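For the word-length form of destructuring, the expansion is mostly bookkeeping on word indices. Here is a rough Python sketch; the function names and the %__tmp temporary are invented for illustration, only the word-level pattern is handled (not the record/field syntax), and the right-hand side is treated as opaque text.

```python
def parse_pattern(pattern):
    """Parse '[%hitObj, %hitPos 3, %hitNormal 3]' into (name, word_count) pairs."""
    targets = []
    for item in pattern.strip()[1:-1].split(","):
        fields = item.split()
        count = int(fields[1]) if len(fields) > 1 else 1
        targets.append((fields[0], count))
    return targets

def expand_destructuring(pattern, rhs, temp="%__tmp"):
    """Emit plain TorqueScript assignments for a destructuring pattern."""
    lines = ["{} = {};".format(temp, rhs)]
    index = 0
    for name, count in parse_pattern(pattern):
        if count == 1:
            lines.append("{} = getWord({}, {});".format(name, temp, index))
        else:
            # getWords takes an inclusive start and end index.
            lines.append("{} = getWords({}, {}, {});".format(
                name, temp, index, index + count - 1))
        index += count
    return lines

for line in expand_destructuring("[%hitObj, %hitPos 3, %hitNormal 3]",
                                 "ContainerRayCast(...)"):
    print(line)
```

The output matches the hand-written raycast snippet earlier in the post, minus the if(%hitObj) guard, which a real implementation would have to decide how to treat.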

Inline/anonymous functions. Like:
schedule(1000, function() {
   echo("Hi.");
});
This is a pretty easy transformation to make naively - just pull the function out into the global namespace and replace its use with an auto-generated name - but there are tricky things to implement like how it handles scope and binding. On the other hand, it'd be very easy to not even try to match other languages on that, since TorqueScript doesn't do that anyway.
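The naive version of that transformation might look like the Python sketch below. It is only a sketch: the __anon prefix is made up, the generated name is emitted as a quoted string (an assumption about what the callee expects), nested anonymous functions and braces inside string literals are not handled, and no attempt is made at scope or binding.

```python
import re

# 'function' followed directly by '(' means there is no name.
ANON = re.compile(r"\bfunction\s*\(")

def find_matching(source, open_index, open_ch, close_ch):
    """Return the index of the delimiter that closes source[open_index]."""
    depth = 0
    for i in range(open_index, len(source)):
        if source[i] == open_ch:
            depth += 1
        elif source[i] == close_ch:
            depth -= 1
            if depth == 0:
                return i
    raise ValueError("unbalanced " + open_ch)

def lift_anonymous_functions(source, prefix="__anon"):
    """Move anonymous functions to global scope, leaving their names behind."""
    lifted = []
    while True:
        m = ANON.search(source)
        if m is None:
            break
        paren_open = m.end() - 1
        paren_close = find_matching(source, paren_open, "(", ")")
        brace_open = source.index("{", paren_close)
        brace_close = find_matching(source, brace_open, "{", "}")
        name = "{}{}".format(prefix, len(lifted))
        params = source[paren_open:paren_close + 1]
        body = source[brace_open:brace_close + 1]
        lifted.append("function {}{} {}".format(name, params, body))
        source = source[:m.start()] + '"{}"'.format(name) + source[brace_close + 1:]
    return "\n".join(lifted + [source])

print(lift_anonymous_functions('schedule(1000, function() {\n   echo("Hi.");\n});'))
```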

So, who's interested in the possibilities? I'd love to try this just from the point of view of implementing my own language/compiler, and because I think TorqueScript lacks some usability.

EDIT: Two more ideas I forgot

Infix function calls. Vector maths is a bit painful, thanks to the way we need to use global functions for everything. But what if there's another way:
VectorAdd(VectorAdd(%a, %b), %c);
// becomes
%a ^vectorAdd %b ^vectorAdd %c;
I'd propose using a backslash (it got eaten by the code block, so I used a caret instead) or something to allow functions of two arguments to be called as infix operators to make code read more nicely.
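Desugaring the infix form is a left fold over the operands. A minimal Python sketch, assuming whitespace-separated tokens, simple operands, and the caret marker used in the example (the function name is hypothetical):

```python
def desugar_infix(expression, marker="^"):
    """Rewrite 'a ^f b ^g c' as 'g(f(a, b), c)', associating to the left."""
    tokens = expression.split()
    if len(tokens) % 2 == 0:
        raise ValueError("expected operand (operator operand)*")
    result = tokens[0]
    for i in range(1, len(tokens), 2):
        op, operand = tokens[i], tokens[i + 1]
        if not op.startswith(marker):
            raise ValueError("expected infix operator, got " + op)
        result = "{}({}, {})".format(op[len(marker):], result, operand)
    return result

print(desugar_infix("%a ^vectorAdd %b ^vectorAdd %c"))
```

Left associativity is a design choice here; it matches the nested VectorAdd call in the example, but operator precedence between different functions is left undefined.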

A module system.
You know, where names aren't available to you unless you've imported the appropriate module. The compiler would then need to do some wizardry to determine which files need to exec which others, but hey, with modern science I'm sure we can come up with something.
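The "which files need to exec which others" part is essentially a topological sort of the import graph. A small Python sketch using the standard library's graphlib (Python 3.9+); the file names and the shape of the imports mapping are invented for the example:

```python
from graphlib import TopologicalSorter

def exec_order(imports):
    """Order files so that each one is exec'd after everything it imports.

    `imports` maps a file name to the set of files it imports.
    A circular import raises graphlib.CycleError.
    """
    return list(TopologicalSorter(imports).static_order())

# Hypothetical project layout.
imports = {
    "main.cs": {"player.cs", "math.cs"},
    "player.cs": {"math.cs"},
    "math.cs": set(),
}
print(exec_order(imports))
```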

About the author

Studying mechatronic engineering and computer science at the University of Sydney. Game development is probably my most time-consuming hobby!

#21
11/14/2013 (5:35 pm)
We are trying for 1st Quarter 2014. We haven't firmed up pricing yet and we are researching our options. This information will be forthcoming as we get closer to release.
#22
11/15/2013 (1:43 pm)
@dB,
I did explore some compiling libraries for Python. There are several libs out there so I cannot really recommend a good one, but it seems like they have the structure to help build a compiler to compile another language. My limitation on that has been understanding the compile process period. That is why I picked up the "Dragon" compiler book (can't remember title). My intent is to make a compiler that converts TS to ECMA (JS), but a general purpose one would be nice. One hard one to tackle is packages. I have not yet found a way to do that in C++ much less my Python interface.
#23
11/22/2013 (3:00 pm)
@Vince
I have a question about OMNI. I was checking out Mono's website and, from what I can tell, Mono is like $900 for iOS and Android. Mono may not be the right fit for an MIT-licensed Torque. But you probably already know this and are willing to pay the price for Android and iOS Mono, is this correct? I think it would still be nice to have a low-cost option for MIT Torque. Just a few thoughts I was having.
#24
11/22/2013 (4:14 pm)
Demolishun: I'll be using Parsec at the moment, a Haskell parser combinator library, and pretty much the de facto standard for Haskell parsers. I'm closely following language-java, which parses Java using Parsec, so it's a great example. I have the AST data structure nearly down, and I think my next task will be to define a pretty-printer for it, then work my way up to the actual parser.

At the moment the AST type is looser than the actual language definition. For example, my AST allows you to construct statements like "string";. But in TS, a statement must be at least an assignment or a function call, not just any old expression. I think I need to do a bit of refactoring if I want to actually make it match TS perfectly. My current reasoning is that I can make my AST a superset of valid TS, but the parser can just use the subset that is actually valid TS.

IMO packages are crazy and should not be brought along into translated programs :P. I think, though, you could simulate them in JS using prototypal inheritance. When you activate a package, its prototype becomes the last package on the stack. You'd then need to convert all global function calls (i.e. not method calls) to be calls on the last package that was pushed to the stack. But wait, can namespaced (i.e. method) functions be packaged? I've forgotten. If so, you'd need to transform all method calls to package stack calls as well. Not pretty.
#25
11/22/2013 (6:01 pm)
@Anthony
OMNI and our MONO implementation will not be MIT T3D compatible. In the end, OMNI will be our own engine that we will support ourselves.
#26
11/22/2013 (6:32 pm)
Quote: But wait, can namespaced (i.e. method) functions be packaged? I've forgotten. If so, you'd need to transform all method calls to package stack calls as well. Not pretty.

If you're saying like this:

package p
{
    function Player::bleh(%this)
    {
        Parent::bleh(%this);
        // bleh
    }
};

then yes that can be done.
#27
11/23/2013 (7:08 pm)
Oh, save us. So what Parent:: function gets called? ShapeBase? Or the next package up in the stack? I'll have to do more detailed tests when it actually comes to translating the language.

Well, in that case, you'd just have to convert every method call to a call on a global package object. Of course, there are way better ways to achieve the same functionality depending on what the packages are used for, so there will probably be a bit of hand-tweaking to be done.

I've yet to see packages used in a way that makes me happy about their existence. I think Mike Hall mentioned his modular script templates use packages a lot, so I'm very interested to see the code.
#28
11/23/2013 (10:43 pm)
No, packages would be better ignored for a newer/different scripting language. The same goal would be achieved using an object/abstraction in a language such as Python. I assume a similar situation with ECMA or Haskell.

Packages are part of Torque Script's abstraction methods. I think even in TS you could use objects such as ScriptObject or SimObject to accomplish the same thing. It is just a context switch mainly for game types. Pointing a global to another object could do the same exact thing.

I am only supporting packages in ScriptT3D because I am maintaining a mixed environment. If I rewrote the console to use Python I would certainly not maintain constructs from another language.

So to translate packages to Haskell or Python I would think an object pointer in (God forbid) global space would be fine to simulate packages. In the future people would probably not write their code that way anyway. So a translation tool is probably not worth the effort. Or at least the users of such a tool will realize it will require effort to clean up the translation.

I apologize for bringing up packages at all. I realize this is a touchy subject. Using packages in TS is fine. However, in a translation to another language I say "run away!" just like Sir Robin.
#29
11/24/2013 (3:15 am)
Basic tutorial is online. I construct a testing AST then show how to traverse it to make a list of all variable names used in a function. I've pitched it at people who haven't necessarily used Haskell before, but I haven't fully explained the syntax - it's not supposed to be a language tutorial. If it needs clarifying in places, let me know! I sort of rushed the end, where I use universeBi to traverse statements.

EDIT: once you realise that traversing a three-level syntax tree and pulling out a list of variable names is literally two lines of Haskell code, you start to realise just what awesome power you wield. I feel like Frankenstein discovering the secret of life :P.
#30
11/24/2013 (8:52 am)
@Daniel nope, actually if you had a previous definition of Player::bleh() it would just call that. Does that make it even more confusing :P
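Taking that description at face value (Parent:: resolves to whatever definition existed before the package overrode it), a translated program could model each function as a stack of definitions. The Python below is a sketch under that assumption only; the FunctionTable class and its methods are invented, and package deactivation is not modelled.

```python
class FunctionTable:
    """Each function name maps to a stack of definitions, newest last."""

    def __init__(self):
        self._stacks = {}

    def define(self, name, fn):
        # A later definition (e.g. from an activated package) shadows earlier ones.
        self._stacks.setdefault(name, []).append(fn)

    def call(self, name, *args, _depth=None):
        stack = self._stacks[name]
        depth = len(stack) - 1 if _depth is None else _depth

        def parent(*parent_args):
            # Stand-in for Parent::name(...): call the previous definition.
            if depth == 0:
                raise LookupError("no parent definition for " + name)
            return self.call(name, *parent_args, _depth=depth - 1)

        return stack[depth](parent, *args)

table = FunctionTable()
table.define("Player::bleh", lambda parent, this: ["original"])
table.define("Player::bleh", lambda parent, this: parent(this) + ["package p"])
print(table.call("Player::bleh", 1234))
```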
#31
11/24/2013 (12:22 pm)
I've converted the scripts to C# 6 or 7 times.

Most of the packages can be removed, and I've done that in my last build. But there are some packages which are used to override the onescape event.

So that when you're in tools and an onescape event fires it doesn't exit the game but instead exits the tools.

This is done in several spots throughout the code. It would take a bit of planning to achieve the same thing packages do for the system events that get overridden in packages.
#32
11/26/2013 (5:31 am)
New tutorial up - this one is an example of how to perform dead-code elimination of statements after a function return. Still no parser, I'm constructing the AST by hand. I'm just having fun for now. My next goal is to put up an example of something really tricky: detecting variables that are used before they are assigned to (i.e. are uninitialised). Then I'll start on the parser in earnest.
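The dead-code pass is easy to picture even without the Haskell. Here is a rough Python equivalent on a made-up statement type (not the tutorial's AST): anything following a return in the same block is dropped, and nested blocks are cleaned recursively.

```python
from dataclasses import dataclass, field

# Hypothetical statement nodes; expressions are kept as plain text.
@dataclass
class Expr:
    text: str

@dataclass
class Return:
    value: str = ""

@dataclass
class If:
    condition: str
    then: list
    otherwise: list = field(default_factory=list)

def eliminate_dead_code(statements):
    """Drop statements that can never run because they follow a return."""
    result = []
    for stmt in statements:
        if isinstance(stmt, If):
            stmt = If(stmt.condition,
                      eliminate_dead_code(stmt.then),
                      eliminate_dead_code(stmt.otherwise))
        result.append(stmt)
        if isinstance(stmt, Return):
            break  # nothing after this point in the block is reachable
    return result

body = [
    Expr("echo(1)"),
    If("%x", [Return("1"), Expr("echo(2)")]),
    Return("0"),
    Expr("echo(3)"),
]
print(eliminate_dead_code(body))
```

This version is conservative: an if whose branches both return does not make the following statements dead, although a smarter pass could detect that.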

Also, I think the AST may be due for an overhaul. Just realised that it allows you to assign to the result of a function call :P.
#33
11/28/2013 (12:36 pm)
Last one for now. I'm pretty proud that I managed to implement that (admittedly fairly simple) name checking algorithm in 12 lines of Haskell (plus types). Of course a true pro would have reduced it to a single line, but hey :P.
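A stripped-down version of a use-before-assignment check can be sketched in Python as follows. This is not the Haskell from the tutorial: it assumes straight-line code only (no branches or loops), and each statement is pre-digested into a hypothetical (assigned variable, variables read) pair.

```python
def find_uninitialised(params, statements):
    """Return variables that are read before any assignment, in order found.

    `statements` is a list of (target, reads) pairs, where target is the
    assigned variable name (or None) and reads lists the variables read.
    """
    defined = set(params)  # parameters count as initialised
    problems = []
    for target, reads in statements:
        # Reads are checked first, so '%x = %x + 1' flags %x.
        for name in reads:
            if name not in defined and name not in problems:
                problems.append(name)
        if target is not None:
            defined.add(target)
    return problems

statements = [
    ("%a", ["%this"]),     # %a = f(%this);
    ("%b", ["%a", "%c"]),  # %b = %a + %c;   <- %c not assigned yet
    (None, ["%b", "%d"]),  # echo(%b, %d);   <- %d never assigned
    ("%c", []),            # %c = 1;
]
print(find_uninitialised(["%this"], statements))
```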

Right, now on with the exciting stuff like actually parsing something!
#34
11/28/2013 (11:05 pm)
I gotta say, Daniel, that what you're working on is really interesting to me. Thank you for keeping us updated. Nice work!