Multi-line Strings
with Executable Inclusions

MLS is a simple Java preprocessor that adds multi-line strings to any Java compiler. The strings may contain executable inclusions: Java expressions that are expanded within the string, as in Perl.

The reasons these features are needed are described in the paper Web Applications as Java Servlets, which explains how MLS can eliminate the need for html-based languages like JSP, ASP and WebMacros by letting the application be written entirely in Java.

It is mystifying that a language like Java, which is quite advanced in most respects, would have such primitive support for a heavily used data type such as strings. Java strings are delimited by double quotes, like this: "This is a string". New line characters and embedded quotes within strings are forbidden unless rewritten as escape sequences like \" or \n. If the job involves long sequences of text, such as html pages, the best you can do is to write each line as a Java string and connect the lines with string concatenation operators, like this:

"<html>\n" +
"  <body>\n" +
"  ...and so forth\n" +
"  </body>\n" +
"</html>\n"

In addition, any double-quote characters in the html text must be escaped with a backslash. All this just to get ordinary html text into a Java program. And we haven't even started on the real work of integrating computed information (executable inclusions) into the outgoing stream of text. Getting all the quotes and plus signs right is sufficiently tedious that most developers quickly give up and turn to html-based tools like JSP, particularly when working with user interface specialists who expect html to look like html, without extraneous quotes and plus signs.

The MLS preprocessor addresses these two needs (multi-line strings and executable inclusions) by means of a single pair of digraphs, {{ and }}, which it processes differently according to the digraph nesting level. For example, this program demonstrates the simplest case, where digraphs replace double quotes for delimiting an ordinary string.

Input:

public void main (String[] args)
{
  System.err.println({{Hello world}});
}

Output:

public void main (String[] args)
{
  System.err.println("Hello world");
}

The program is color coded in alternating red and blue to make the nesting level obvious: text outside the digraphs is blue, and text inside them is red. MLS simply passes the blue text through unchanged and converts the red text to Java strings. If the red text contains new line or double-quote characters, MLS converts them to Java conventions.

Input:

public void main (String[] args)
{
  System.err.println({{The answer,
my dearest,
is "yes".
}});
}

Output:

public void main (String[] args)
{
  System.err.println("The answer,\n"+
"my dearest,\n"+
"is \"yes\".\n"+
"");
}

The executable inclusion feature lets the answer be computed at run-time instead of being hard-coded into a string. Simply enclose any Java expression within digraphs, like this:

Input:

public void main (String[] args)
{
  System.err.println({{The answer,
my dearest,
is {{ computeAnswer() }}.
}});
}

Output:

public void main (String[] args)
{
  System.err.println("The answer,\n"+
"my dearest,\n"+
"is "+ computeAnswer() +".\n"+
"");
}

MLS is governed entirely by the nested digraphs. It has no knowledge of Java other than how to emit concatenated Java strings as output. The first digraph, {{, begins a multi-line string and the second, }}, terminates it. The same pair of digraphs does double duty to begin and end executable inclusions.

Doubled brackets were chosen as the digraphs because:

MLS emits exactly one line of Java for each line of MLS source text, so any error diagnostics from the Java compiler will match the line numbers in the MLS input. Also notice that new line characters within multi-line strings are exactly represented in the generated code.

The executable inclusion feature relies on the fact that Java automatically converts any value concatenated with a string into a string. For objects it calls the type's toString() method, which every class inherits from Object; built-in types like int or float are converted to their usual text form. Thus the computeAnswer() subroutine in the above example could return any type whatsoever, including built-in types and application-specific types.
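
To make the conversion rule concrete, here is a minimal sketch of what the generated concatenation does with different return types. The Money class and the describe() helper are hypothetical names invented for this illustration; they are not part of MLS.

```java
class StringConversionDemo {
    // Hypothetical application-specific type: its toString() decides
    // how it appears when concatenated into a string.
    static final class Money {
        private final long cents;
        Money(long cents) { this.cents = cents; }
        @Override public String toString() {
            long rest = cents % 100;
            return "$" + (cents / 100) + "." + (rest < 10 ? "0" : "") + rest;
        }
    }

    // Same shape as the code MLS generates for: {{is {{ answer }}.}}
    static String describe(Object answer) {
        return "is " + answer + ".";
    }

    public static void main(String[] args) {
        System.out.println("is " + 3 + ".");            // built-in int
        System.out.println("is " + 2.5 + ".");          // built-in double
        System.out.println(describe(new Money(1999)));  // uses Money.toString()
        System.out.println(describe(null));             // null becomes the text "null"
    }
}
```

Note that a null reference is rendered as the text "null" rather than raising an exception, which is worth remembering when an inclusion can return no value.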

Since most Java compilers optimize away concatenation of known constants at compile-time, MLS adds no runtime overhead at all.
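
The claim can be checked with a small experiment. The sketch below assumes a string with no executable inclusions; it compares the concatenation MLS would generate with the equivalent hand-written literal. Because both are compile-time constant expressions, the Java language treats them as the same interned literal. The class and method names are made up for the illustration.

```java
class ConstantFoldingDemo {
    // Shape of the code MLS generates for a three-line string
    // that contains no executable inclusions.
    static String generated() {
        return "The answer,\n"+
"my dearest,\n"+
"is \"yes\".\n"+
"";
    }

    // The same text written by hand as a single literal.
    static String handWritten() {
        return "The answer,\nmy dearest,\nis \"yes\".\n";
    }

    public static void main(String[] args) {
        // Reference comparison on purpose: the compiler folds the
        // concatenation, so both methods return the same String object.
        System.out.println(generated() == handWritten()); // prints "true"
    }
}
```

A string that does contain an executable inclusion is no longer a constant expression, so the pieces around the inclusion are joined at run time, just as they would be in hand-written concatenation.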

Multi-level nesting

As implied by the color coding convention above, the digraphs can nest to any level. In the preceding example, the file as a whole (blue) is the 0th level of nesting, the multi-line string argument to the System.err.println() statement (red) is the 1st level of nesting, and the {{ computeAnswer() }} executable inclusion (blue) is the 2nd level of nesting. But any Java expression might appear where computeAnswer() appears now, including subroutine calls, which might have String arguments, which might be written as multi-line strings, which might contain executable inclusions. In other words, it is quite possible for the nesting to continue to any depth. MLS supports this even though it is not often encountered in practice.
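
As an illustration of a third level, the sketch below shows one MLS line (in the comment) whose executable inclusion calls a method with a digraph-delimited String argument, followed by the Java that the algorithm described here would emit for it. The format() helper and the message() method are hypothetical; any method taking a String would do.

```java
class NestedInclusionDemo {
    // Hypothetical helper standing in for any method with a String argument.
    static String format(String pattern, Object value) {
        return String.format(pattern, value);
    }

    static String message(int total) {
        // MLS source, with the string nested inside the inclusion:
        //   return {{Total: {{ format({{$%s}}, total) }} due}};
        // Java emitted for that line:
        return "Total: "+ format("$%s", total) +" due";
    }

    public static void main(String[] args) {
        System.out.println(message(42)); // prints "Total: $42 due"
    }
}
```

Here the outer string is level 1, the format(...) call is level 2, and its pattern argument is level 3.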

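MLS handles nested digraphs by relying on recursive calls between a pair of subroutines, doCode() and doString(). MLS starts execution in 0th level (blue) mode by passing control to the doCode() subroutine, which copies Java code through unchanged and calls doString() at each {{. doString() in turn emits a Java string and calls doCode() for each executable inclusion it contains.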

Both subroutines check for and report unbalanced nesting by throwing exceptions as appropriate.
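
The mutual recursion can be sketched compactly on an in-memory string. This is a simplified illustration, not the MLS source: the class name DigraphSketch, the translate() entry point and the use of IllegalStateException are assumptions made for the example, and it tracks depth with a parameter rather than a shared counter.

```java
class DigraphSketch {
    private final String src;
    private int pos = 0;
    private final StringBuilder out = new StringBuilder();

    private DigraphSketch(String src) { this.src = src; }

    /** Translates MLS text to Java text; throws on unbalanced digraphs. */
    static String translate(String mls) {
        DigraphSketch t = new DigraphSketch(mls);
        t.code(0);
        return t.out.toString();
    }

    private boolean at(String digraph) { return src.startsWith(digraph, pos); }

    // Code mode: copy text through unchanged; {{ opens a string.
    private void code(int depth) {
        while (pos < src.length()) {
            if (at("{{")) { pos += 2; string(depth + 1); }
            else if (at("}}")) {
                if (depth == 0) throw new IllegalStateException("extraneous }}");
                pos += 2;
                return;                       // end of this inclusion
            }
            else out.append(src.charAt(pos++));
        }
        if (depth > 0) throw new IllegalStateException("unterminated inclusion");
    }

    // String mode: emit a Java string literal; {{ opens an inclusion.
    private void string(int depth) {
        out.append('"');
        while (pos < src.length()) {
            if (at("{{")) {
                pos += 2;
                out.append("\"+"); code(depth + 1); out.append("+\"");
            }
            else if (at("}}")) { pos += 2; out.append('"'); return; }
            else {
                char c = src.charAt(pos++);
                switch (c) {
                    case '\n': out.append("\\n\"+\n\""); break; // one output line per input line
                    case '\\': out.append("\\\\"); break;
                    case '"':  out.append("\\\""); break;
                    default:   out.append(c);
                }
            }
        }
        throw new IllegalStateException("unterminated {{string}}");
    }

    public static void main(String[] args) {
        System.out.println(translate("System.err.println({{is {{ computeAnswer() }}.}});"));
    }
}
```

Running main prints System.err.println("is "+ computeAnswer() +"."); and an input with a stray }} at the top level raises the exception instead of producing output.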

Usage Instructions

By default MLS replaces the input file suffix (I use .j as the postfix for MLS files) with a .java suffix and emits each output file into the same directory as the input. In practice it is more convenient to invoke the preprocessor as mls -d outputDirectory inputFile.mls ..., in which case it will emit the output files into the specified outputDirectory.

As a convenience feature, MLS will print the name of each output file on stdout to facilitate the typical usage pattern demonstrated in the following Makefile:

CP=/java/jdk/jre/lib/rt.jar:/webapp/WEB-INF/classes:/tomcat/lib/servlet.jar
MLS=/tools/mls/mls -d ../mls.java
SRC=$(shell find * -name CVS -prune -o -name \*.mls -print)

all: $(SRC) 
	jikes -classpath $(CP) -d /webapp/WEB-INF/classes `$(MLS) $(SRC)`

This example Makefile simply recompiles the entire web site each time it is run. Better Makefiles could be devised, but I've never bothered: The MLS/Jikes combination is so fast that I've never felt a need for a more selective compilation procedure.

Source Code

For those who like to read code online before downloading it, here is the source code for the MLS preprocessor. If you plan to compile and run the code, use the Download link at the left of this page. The version shown here has been reformatted to comply with html restrictions, which may have introduced errors.

package com.sdi.tools.mls;
import java.util.*;
import java.io.*;
/**
 * Multi-line Java Strings with Executable Inclusions
 * A Java Preprocessor by Brad Cox, Ph.D.
 * bcox@superdistributed.com
 */
public class Main 
{
  private static int nestingLevel;
  private static String fileName = "";
  private static int lineNumber;
  private static PushbackInputStream in;
  private static PrintWriter out;
  private static final String usage = 
    "Usage: java com.sdi.tools.mls.Main [-d outputDirectory] inputFileName...";
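/**
 * Code mode: copies Java code through unchanged, handing control to
 * doString() at each {{ and returning at the matching }} or at end of input.
 * Creation date: (12/27/00 09:37:35)
 */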
private static void doCode()
  throws Exception
{
  nestingLevel++;
  int thisInt, nextInt;
  while((thisInt = in.read()) != -1)
  {
    switch(thisInt)
    {
      case '\n':
        lineNumber++;
        out.print((char)thisInt);
        break;
      case '{':
        nextInt = in.read();
        if (nextInt == '{')
        {
          doString(lineNumber);
          break;
        }
        else 
        {
          out.print((char)thisInt);
          // Push back the lookahead character unless it was end of input
          if (nextInt != -1)
            in.unread(nextInt);
          break;
        }
      case '}':
        // No (char) cast here: it would turn the -1 end-of-input marker into data
        nextInt = in.read();
        if (nextInt == '}')
        {
          if (--nestingLevel <= 1)
            throw new Exception(fileName + ": Extraneous }} at line " + lineNumber);
          return;
        }
        else 
        {
          out.print((char)thisInt);
          if (nextInt != -1)
            in.unread(nextInt);
          break;
        }
      default:
        out.print((char)thisInt);
        break;
    }
  }
}
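/**
 * Preprocesses one input stream: wraps it in a PushbackInputStream
 * (for one character of lookahead) and starts in code mode.
 */
public static void doStream(InputStream is, PrintWriter os)
  throws Exception
{
  in = new PushbackInputStream(is);
  out = os;
  // Start at 1 so that reported line numbers match what editors show
  lineNumber = 1;
  nestingLevel = 0;
  doCode();
}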
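/**
 * String mode: emits the text up to the matching }} as a Java string,
 * escaping quotes, backslashes and new lines, and hands each nested {{
 * to doCode() as an executable inclusion.
 * @param line line on which the string began, for error reporting
 */
private static void doString(int line)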
  throws Exception
{
  nestingLevel++;
  int thisInt, nextInt;
  out.print("\"");
  while((thisInt = in.read()) != -1)
  {
    switch(thisInt)
    {
      case '\n':
        lineNumber++;
        out.print("\\n\"+\n\"");
        break;
      case '\\':
        out.print("\\" + (char)thisInt);
        break;
      case '"':
        out.print("\\\"");
        break;
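      case '{':
        nextInt = in.read();
        if (nextInt == '{')
        {
          out.print("\"+");
          doCode();
          out.print("+\"");
          break;
        }
        else 
        {
          out.print((char)thisInt);
          // Push back the lookahead character unless it was end of input
          if (nextInt != -1)
            in.unread(nextInt);
          break;
        }
      case '}':
        nextInt = in.read();
        if (nextInt == '}')
        {
          // Balance the nestingLevel++ at the top of doString(); without this
          // the counter drifts upward and a later stray }} goes unreported
          nestingLevel--;
          out.print("\"");
          return;
        }
        else 
        {
          out.print((char)thisInt);
          if (nextInt != -1)
            in.unread(nextInt);
          break;
        }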
      default:
        out.print((char)thisInt);
        break;
    }
  }
  throw new IOException(fileName + ": unterminated {{string}} at line " + line);
}
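/**
 * Command-line entry point: preprocesses each input file into a .java
 * file, optionally under the directory given with -d, and prints the
 * name of each output file on stdout.
 * Creation date: (12/27/00 09:24:51)
 * @param args java.lang.String[]
 */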
public static void main(String[] args)
{
  File outDirectory = new File(".");
  try
  {
    Vector files = new Vector();
    for (int i = 0; i < args.length; i++)
    {
      if (args[i].startsWith("-"))
      {
        if (args[i].startsWith("-d"))
        {
          outDirectory = new File(args[++i]);
          if (!outDirectory.isDirectory() && !outDirectory.mkdirs())
          {
            System.err.println("Couldn't create " + outDirectory);
            System.exit(-1);
          }
        }
        else
          System.err.println(usage + "\n   invalid switch: " + args[i]);
      }
      else
        files.addElement(args[i]);
    }
    for (Enumeration e = files.elements(); e.hasMoreElements(); )
    {
      fileName = (String)e.nextElement();
      lineNumber = 0;
      File inFile = new File(fileName);
      BufferedInputStream bis = null;
      try
      {
        FileInputStream fis = new FileInputStream(inFile);
        bis = new BufferedInputStream(fis);
      }
      catch (FileNotFoundException ex)
      {
        System.err.println("Cannot read " + fileName);
        continue;
      }
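      // Strip the suffix if there is one; a name without a dot is used whole
      int dot = fileName.lastIndexOf(".");
      String base = (dot < 0) ? fileName : fileName.substring(0, dot);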
      File outFile = new File(outDirectory, base + ".java");
      PrintWriter pw = null;
      try
      {
        FileOutputStream fos = new FileOutputStream(outFile);
        BufferedOutputStream bos = new BufferedOutputStream(fos);
        pw = new PrintWriter(bos);
      }
      catch (IOException ex)
      {
        System.err.println("Cannot write " + outFile);
        continue;
      }
      doStream(bis, pw);
      bis.close();
      pw.close();
      /**
       * Print names of output files on stdout to support
       * the usage pattern: jikes `mls inputfiles`
       */
      System.out.println(outFile);
    }
  }
  catch (Throwable e)
  {
    System.err.println(e.getMessage());
    e.printStackTrace();
    System.exit(-1);
  }
}
}
