Inspirel - FMT: Grammar preprocessors

About Us Products Articles News Contact Support

Grammar preprocessor

The language features explained so far are complete in the sense that they allow to model arbitrary imperative programs with direct mapping to features of the selected target programming language. In some cases, however, it might be useful for the engineer to extend the range of available modeling patterns in order to handle repetitive constructs or some higher-level control structures. The grammar preprocessor is a hook in the early grammar processing stage that allows the user to automatically insert additional model processing steps.

Writing grammar preprocessing routines will typically rely on Wolfram’s capabilities in symbol transformation and as such requires some mastery of functions related to symbolic computation.

As a very simple introduction to grammar preprocessing the first example shown below replaces all integer constants 12345 with 54321.

The simple model for this example, with just one parameter-less operation is:

The source code generated for this operation (for example in C) is not surprising:

void my_package_op(void)
{
    int32_t temp;
    
    temp = 7;
    temp = temp + 42;
    temp = 12345 - (2 * temp);
}

The grammar preprocessor is a user-provided function that takes the complete grammar expression for a single operation body and returns a new expression, which will be used instead of the original body in subsequent processing steps.

The following simple Wolfram function is sufficient to achieve the initial goal:

This function, which is entirely based on a single ReplaceAll expression, can be tested to verify that it performs the intended replacements:

The preprocessing function operates as expected and can be installed in the model by means of configurable options (explained in a separate chapter devoted to customization):

After this option is set, all grammar-related operations include the newly installed preprocessor in their operation. In particular, the generated code now looks like this:

void my_package_op(void)
{
    int32_t temp;
    
    temp = 7;
    temp = temp + 42;
    temp = 54321 - (2 * temp);
}

The intended replacement was performed successfully and the 12345 value was changed to 54321 without changing the original model. This means that the same model can be processed again, with perhaps different grammar preprocessor, thus leading to different results.

Grammar preprocessor, once installed, is used in all operations that rely on the grammar in any way, including functions that perform correctness checks. The generated code is consistent with correctness checks as long as the same (or none) preprocessor is used for both.

There is no limit on what the preprocessor can do with the operation body, although it cannot inject new local variables to the list of locals or change the definition of existing variables.

The following, more elaborated example implements unrolling of static loops of a form:

With such a new construct, the user might write:

instead of more time-consuming:

The preprocessor function that implements appropriate expansion can be implemented this way:

Again, this happens to be a single-line function that relies on several standard Wolfram features. The ReplaceAll function is used again, but this time with RuleDelayed, since the actual number of repetitions is not static and can be different for each repeatN found. The Table function creates the requested number of copies and the resulting list is changed into a compoundT wrapper that is an internal grammar representation of a compound statement.

Now, with the preprocessor stalled:

and the following operation body:

the generated code (here in Ada) will look like:

procedure Op is
   Temp : Integer;
begin
   Temp := 7;
   Temp := Temp + 1;
   Temp := Temp + 1;
   Temp := Temp + 1;
end Op;

This demonstrates the unrolling of repeatN body and in effect the extension of the core modeling language.

The implementation of this preprocessor indicates an important issue that needs to be addressed - the compoundT symbol was used instead of the standard CompoundExpression. The reason for this is that CompoundExpression would be immediately evaluated by Wolfram when the replacement expression is computed, which would in turn leave only the last component of the expanded sequence and thus defeat the whole intent of repetition. The FMT’s internal symbol compoundT does not have this problem, as it has no special meaning in Wolfram.

This issue will be visible in most non-trivial preprocessors - and in particular in those that operate on control structures. Another example that shows (and solves) this problem is the preprocessor that extends the core language with infinite loop. The intent is to replace the following ad-hoc syntax:

with core language construct:

The naive attempt to implement the preprocessor function this way:

will fail, because While is a standard Wolfram function that will be evaluated even before the replacement is made - and certainly it is not the intent to have that infinite loop being actually executed within the preprocessor.

The solution in such cases is to use FMT’s internal symbols instead of standard ones:

Now the preprocessor function can be safely executed, even when testing it in isolation:

Another important point to consider when implementing user-provided preprocessors is that they are not automatically chained. When the new preprocessor is installed by setting the grammarPreprocessor option, the old value of that option is lost. It is therefore the responsibility of the user to ensure that preprocessors properly delegate to the previously set option if it is intended for multiple preprocessors to be used at the same time.

Just for the purpose of demonstration, the FMT package contains the default preprocessor that is automatically set up for each new model. The default preprocessor extends the core modeling language with the For loop, by replacing constructs of this form:

with core-language While loops:

The implementation of this default preprocessor is able to handle the case of nested For loops - something that for example the simple implementation of repeatN shown above cannot do, due to the fact that a single replacement expands to the subexpression that is no longer taken into account by the standard pattern matching process. The solution is based on the use of standard FixedPoint function, which repeatedly performs some computation until the result does not change any longer (the actual code in FMTOptions.wl is formatted differently):

The implementation of repeatN might need to use a similar approach in order to support nested constructs, for example:

Now it is possible to use nested constructs:

The final generated code for this operation (this time in C++) is:

void my_package::op()
{
    std::int32_t temp;
    
    temp = 7;
    temp = temp * 2;
    temp = temp + 1;
    temp = temp + 1;
    temp = temp + 1;
    temp = temp * 2;
    temp = temp + 1;
    temp = temp + 1;
    temp = temp + 1;
}

Previous: Getting more from Mathematica, next: Metaprogramming