MLS is a simple Java preprocessor that adds multi-line strings to Java and works with any Java compiler. The strings may contain executable inclusions: Java expressions that are evaluated and expanded within the string, as in Perl.
The reasons these features are needed are described in the paper Web Applications as Java Servlets, which explains how MLS can eliminate the need for html-based languages like JSP, ASP and WebMacros by letting the application be written entirely in Java.
It is mystifying that a language like Java, which is quite advanced in most respects, would have such primitive support for a heavily used data type such as strings. Java strings are delimited by double quotes, like this: "This is a string". New line characters and embedded quotes within strings are forbidden unless rewritten as escape sequences like \" or \n. If the job involves long sequences of text, such as html pages, the best you can do is to write each line as a Java string and join the lines with string concatenation operators, like this:

    "<html>\n" +
    " <body>\n" +
    " ...and so forth\n" +
    " </body>\n" +
    "</html>\n"

In addition, any double-quote characters in the html text must be escaped with a backslash. All this just to get ordinary html text into a Java program. And we haven't even started on the real work of integrating computed information (executable inclusions) into the outgoing stream of text. Getting all the quotes and plus signs right is sufficiently tedious that most developers quickly give up and turn to html-based tools like JSP, particularly when working with user interface specialists who expect html to look like html, without extraneous quotes and plus signs.
The MLS preprocessor addresses these two needs (multi-line strings and executable inclusions) by means of a single pair of digraphs, {{ and }}, which it processes differently according to the digraph nesting level. For example, this program demonstrates the simplest case, where digraphs replace double quotes for delimiting an ordinary string.
Input | Output |
---|---|
public static void main (String[] args) { System.err.println({{Hello world}}); } | public static void main (String[] args) { System.err.println("Hello world"); } |
The program is color coded in alternating red and blue to make the nesting level obvious: blue marks Java code and red marks the string text inside the digraphs. MLS simply passes the blue text through unchanged, and converts the red text to Java strings. If the red text contains new line or double-quote characters, MLS converts them to Java conventions.
Input | Output |
---|---|
public static void main (String[] args) { System.err.println({{The answer, my dearest, is "yes". }}); } | public static void main (String[] args) { System.err.println("The answer,\n"+ "my dearest,\n"+ "is \"yes\".\n"+ ""); } |
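The string conversion described above can be sketched in a few lines of Java. This is an illustration rather than the MLS source: the names LiteralSketch and toJavaLiteral are made up for this example, and the sketch handles only the string text itself, not nested digraphs.

```java
class LiteralSketch {
    // Convert raw text into Java source for an equivalent string expression.
    // Each new line in the text ends one quoted piece and starts the next,
    // so the generated source keeps one line per input line.
    static String toJavaLiteral(String text) {
        StringBuilder sb = new StringBuilder("\"");
        for (char c : text.toCharArray()) {
            switch (c) {
                case '\n': sb.append("\\n\"+\n\""); break; // \n, close, +, reopen
                case '"':  sb.append("\\\"");       break; // escape embedded quote
                case '\\': sb.append("\\\\");       break; // escape backslash
                default:   sb.append(c);
            }
        }
        return sb.append('"').toString();
    }

    public static void main(String[] args) {
        // The text contains three lines, each ending in a new line character.
        System.out.println(toJavaLiteral("The answer,\nmy dearest,\nis \"yes\".\n"));
    }
}
```

Running the sketch prints four lines of Java source: one quoted piece per line of text, each followed by a plus sign, and a final empty string that closes the expression.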
The executable inclusion feature lets the answer be computed at run-time instead of being hard-coded into a string. Simply enclose any Java expression within digraphs, like this:
Input | Output |
---|---|
public static void main (String[] args) { System.err.println({{The answer, my dearest, is {{ computeAnswer() }}. }}); } | public static void main (String[] args) { System.err.println("The answer,\n"+ "my dearest,\n"+ "is "+computeAnswer()+".\n"+ ""); } |
Doubled brackets were chosen as the digraphs because bracket matching is supported by editors such as vi and emacs, so out-of-balance conditions are readily noticed.
MLS emits exactly one line of Java for each line of MLS source text, so any error diagnostics from the Java compiler will match the line numbers in the MLS input. Also notice that new line characters within multi-line strings are exactly represented in the generated code.
The executable inclusion feature is based on the fact that Java will automatically convert any value concatenated with a string. Objects are converted by calling the type's toString() method, which all objects inherit from Object automatically, and built-in types are converted by the language itself. Thus the computeAnswer() subroutine in the above example could return any type whatsoever, including built-in types like int or float and application-specific types.
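As a small illustration of this conversion, the sketch below concatenates three different return types into the same string shape that MLS would generate for an executable inclusion. Everything here is hypothetical and chosen for the example: the Money class, the three stand-ins for computeAnswer(), and the class name ConversionSketch.

```java
class ConversionSketch {
    // Hypothetical application type; its toString() decides how it appears in text.
    static class Money {
        private final long cents;
        Money(long cents) { this.cents = cents; }

        @Override
        public String toString() {
            long rem = cents % 100;
            return "$" + (cents / 100) + "." + (rem < 10 ? "0" : "") + rem;
        }
    }

    // Stand-ins for computeAnswer() with three different return types.
    static int intAnswer()     { return 42; }
    static float floatAnswer() { return 1.5f; }
    static Money moneyAnswer() { return new Money(1250); }

    // Each method has the shape of the Java that an inclusion expands to:
    // a string, a plus sign, the expression, a plus sign, a string.
    static String withInt()   { return "is " + intAnswer() + "."; }
    static String withFloat() { return "is " + floatAnswer() + "."; }
    static String withMoney() { return "is " + moneyAnswer() + "."; }

    public static void main(String[] args) {
        System.out.println(withInt());   // is 42.
        System.out.println(withFloat()); // is 1.5.
        System.out.println(withMoney()); // is $12.50.
    }
}
```

Overriding toString() in an application type is therefore all it takes to control how that type renders inside a multi-line string.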
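Since Java compilers fold concatenations of string literals into a single constant at compile time, the literal text that MLS generates adds no runtime overhead at all. Only the executable inclusions are concatenated at run time, exactly as they would be in hand-written code.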
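One way to observe the compile-time folding is an identity comparison. The Java language specification treats a concatenation of string literals as a constant expression and interns the result, so the concatenated form and the single literal refer to the same object. The class name ConstantFoldSketch is invented for this example, and comparing strings with == is done here only to make the point; ordinary code should use equals().

```java
class ConstantFoldSketch {
    // The shape of what MLS generates for a four-line string without inclusions.
    static final String PAGE =
        "<html>\n" +
        " <body>\n" +
        " </body>\n" +
        "</html>\n";

    // True when PAGE and the single literal are the very same object,
    // which is the case when the concatenation was folded by the compiler.
    static boolean folded() {
        return PAGE == "<html>\n <body>\n </body>\n</html>\n";
    }

    public static void main(String[] args) {
        System.out.println(folded()); // true
    }
}
```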
As implied by the color coding convention above, the digraphs can nest to any level. In the preceding example, the file as a whole (blue) is the 0th level of nesting, the multi-line string argument to the System.err.println() statement (red) is the 1st level of nesting, and the {{ computeAnswer() }} executable inclusion (blue) is the 2nd level of nesting. But any Java expression might appear where computeAnswer() appears now, including subroutine calls, which might have String arguments, which might be written as multi-line strings, which might contain executable inclusions. In other words, it is quite possible for the nesting to continue to any depth. MLS supports this even though it is not often encountered in practice.
MLS handles nested digraphs by relying on recursive calls between a pair of subroutines. MLS starts execution in 0th level (blue) mode by passing control to the doCode() subroutine.
doCode() subroutine. This simply passes input to the output unchanged. In this example, this mode applies to the 0th level nesting (the file as a whole), and also to the 2nd level nesting represented by the {{ computeAnswer() }} expression. If the doCode() subroutine detects a {{ digraph, it invokes doData() to process it. If it finds a }} digraph, it returns to its caller.
doData() subroutine (named doString() in the source listing below). This converts the incoming text to a Java string by surrounding it in quotes, joining each line to its neighbors with a '+', prefixing any internal quotes or backslashes with '\', and rewriting each new line character as \n. If doData() finds a {{ digraph, it calls doCode() to process it. If it finds a }} digraph, it returns to its caller.
Both subroutines check for and report unbalanced nesting by throwing exceptions as appropriate.
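The mutual recursion described above can be condensed into a short sketch that works on an in-memory string instead of a stream. It is a simplified illustration under stated assumptions, not the MLS implementation: the class name NestingSketch, the translate() entry point, the explicit depth parameter, and the use of IllegalStateException are all choices made for this example.

```java
class NestingSketch {
    private final String src;
    private int pos = 0;
    private final StringBuilder out = new StringBuilder();

    private NestingSketch(String src) { this.src = src; }

    // Hypothetical entry point: translate a whole source text, starting in code mode.
    static String translate(String src) {
        NestingSketch s = new NestingSketch(src);
        s.doCode(0);
        return s.out.toString();
    }

    private boolean at(String digraph) { return src.startsWith(digraph, pos); }

    // Code mode (even depths): copy the text through unchanged.
    private void doCode(int depth) {
        while (pos < src.length()) {
            if (at("{{")) {
                pos += 2;
                doData(depth + 1);
            } else if (at("}}")) {
                if (depth == 0) throw new IllegalStateException("extraneous }}");
                pos += 2;
                return;
            } else {
                out.append(src.charAt(pos++));
            }
        }
        if (depth > 0) throw new IllegalStateException("unterminated inclusion");
    }

    // Data mode (odd depths): rewrite the text as quoted Java string pieces.
    private void doData(int depth) {
        out.append('"');
        while (pos < src.length()) {
            if (at("{{")) {
                pos += 2;
                out.append("\"+");      // close the string, then concatenate
                doCode(depth + 1);
                out.append("+\"");      // concatenate, then reopen the string
            } else if (at("}}")) {
                pos += 2;
                out.append('"');
                return;
            } else {
                char c = src.charAt(pos++);
                switch (c) {
                    case '\n': out.append("\\n\"+\n\""); break;
                    case '"':  out.append("\\\"");       break;
                    case '\\': out.append("\\\\");       break;
                    default:   out.append(c);
                }
            }
        }
        throw new IllegalStateException("unterminated string");
    }

    public static void main(String[] args) {
        // prints: println("is "+ computeAnswer() +".");
        System.out.println(translate("println({{is {{ computeAnswer() }}.}});"));
    }
}
```

Passing the depth as a parameter, rather than keeping a shared counter, lets each call decide on its own whether a }} is legal, which keeps the balance check local to the call that sees it.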
By default MLS replaces the input file's suffix (I use .j for MLS files; the examples on this page use .mls) with a .java suffix and emits each output file into the same directory as the input. In practice it is more convenient to invoke the preprocessor as mls -d outputDirectory inputFile.mls ..., in which case it will emit the output files into the specified outputDirectory.
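The naming rule amounts to replacing whatever follows the last dot with .java and resolving the result against the output directory. A minimal sketch, assuming a hypothetical helper named outputFor() in a class named OutputNameSketch; the guard for names without a dot is an addition made for this example.

```java
import java.io.File;

class OutputNameSketch {
    // Derive the output file for one input name: strip the last suffix,
    // append ".java", and place the result under the output directory.
    static File outputFor(File outDirectory, String inputName) {
        int dot = inputName.lastIndexOf('.');
        // Assumption for this sketch: a name without a suffix is used as-is.
        String base = (dot < 0) ? inputName : inputName.substring(0, dot);
        return new File(outDirectory, base + ".java");
    }

    public static void main(String[] args) {
        // "." mirrors the default when no -d switch is given.
        System.out.println(outputFor(new File("."), "Page.mls"));
        System.out.println(outputFor(new File("generated"), "Page.mls"));
    }
}
```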
As a convenience feature, MLS will print the name of each output file on stdout to facilitate the typical usage pattern demonstrated in the following Makefile:

    CP=/java/jdk/jre/lib/rt.jar:/webapp/WEB-INF/classes:/tomcat/lib/servlet.jar
    MLS=/tools/mls/mls -d ../mls.java
    SRC=$(shell find * -name CVS -prune -o -name \*.mls -print)
    all: $(SRC)
    	jikes -classpath $(CP) -d /webapp/WEB-INF/classes `$(MLS) $(SRC)`

This example Makefile simply recompiles the entire web site each time it is run. Better Makefiles could be devised, but I've never bothered: The MLS/Jikes combination is so fast that I've never felt a need for a more selective compilation procedure.
For those who like to read code online before downloading it, here is the source code for the MLS preprocessor. If you plan to compile and run the code, use the Download link at the left of this page. The version shown here has been reformatted to comply with html restrictions, which may have introduced errors.

    package com.sdi.tools.mls;

    import java.util.*;
    import java.io.*;

    /**
     * Multi-line Java Strings with Executable Inclusions
     * A Java Preprocessor by Brad Cox, Ph.D.
     * bcox@superdistributed.com
     */
    public class Main {
        private static int nestingLevel;
        private static String fileName = "";
        private static int lineNumber;
        private static PushbackInputStream in;
        private static PrintWriter out;
        private static final String usage =
            "Usage: java com.sdi.tools.mls.Main [-d outputDirectory] inputFileName...";

        /**
         * Code mode: copy input to output unchanged until a digraph is found.
         * Creation date: (12/27/00 09:37:35)
         */
        private static void doCode() throws Exception {
            nestingLevel++;
            int thisInt, nextInt;
            while ((thisInt = in.read()) != -1) {
                switch (thisInt) {
                case '\n':
                    lineNumber++;
                    out.print((char)thisInt);
                    break;
                case '{':
                    nextInt = in.read();
                    if (nextInt == '{') {
                        doString(lineNumber);
                    } else {
                        out.print((char)thisInt);
                        // Do not push back the end-of-file marker.
                        if (nextInt != -1) in.unread(nextInt);
                    }
                    break;
                case '}':
                    nextInt = in.read();
                    if (nextInt == '}') {
                        // Level 1 is the file as a whole, which no }} may close.
                        if (--nestingLevel < 1)
                            throw new Exception(fileName + ": Extraneous }} at line " + lineNumber);
                        return;
                    } else {
                        out.print((char)thisInt);
                        if (nextInt != -1) in.unread(nextInt);
                    }
                    break;
                default:
                    out.print((char)thisInt);
                    break;
                }
            }
        }

        /**
         * Process a PushbackInputStream
         */
        public static void doStream(InputStream is, PrintWriter os) throws Exception {
            in = new PushbackInputStream(is);
            out = os;
            lineNumber = 0;
            nestingLevel = 0;
            doCode();
        }

        /**
         * String mode: convert input to a quoted Java string expression.
         */
        private static void doString(int line) throws Exception {
            nestingLevel++;
            int thisInt, nextInt;
            out.print("\"");
            while ((thisInt = in.read()) != -1) {
                switch (thisInt) {
                case '\n':
                    lineNumber++;
                    out.print("\\n\"+\n\"");
                    break;
                case '\\':
                    out.print("\\" + (char)thisInt);
                    break;
                case '"':
                    out.print("\\\"");
                    break;
                case '{':
                    nextInt = in.read();
                    if (nextInt == '{') {
                        out.print("\"+");
                        doCode();
                        out.print("+\"");
                    } else {
                        out.print((char)thisInt);
                        if (nextInt != -1) in.unread(nextInt);
                    }
                    break;
                case '}':
                    nextInt = in.read();
                    if (nextInt == '}') {
                        out.print("\"");
                        // Balance the increment above so the level stays accurate.
                        nestingLevel--;
                        return;
                    } else {
                        out.print((char)thisInt);
                        if (nextInt != -1) in.unread(nextInt);
                    }
                    break;
                default:
                    out.print((char)thisInt);
                    break;
                }
            }
            throw new IOException(fileName + ": unterminated {{string}} at line " + line);
        }

        /**
         * Parse the command line and preprocess each input file.
         * Creation date: (12/27/00 09:24:51)
         * @param args java.lang.String[]
         */
        public static void main(String[] args) {
            File outDirectory = new File(".");
            try {
                Vector files = new Vector();
                for (int i = 0; i < args.length; i++) {
                    if (args[i].startsWith("-")) {
                        if (args[i].startsWith("-d")) {
                            outDirectory = new File(args[++i]);
                            if (!outDirectory.isDirectory() && !outDirectory.mkdirs()) {
                                System.err.println("Couldn't create " + outDirectory);
                                System.exit(-1);
                            }
                        } else
                            System.err.println(usage + "\n invalid switch: " + args[i]);
                    } else
                        files.addElement(args[i]);
                }
                for (Enumeration e = files.elements(); e.hasMoreElements(); ) {
                    fileName = (String)e.nextElement();
                    lineNumber = 0;
                    File inFile = new File(fileName);
                    BufferedInputStream bis = null;
                    try {
                        FileInputStream fis = new FileInputStream(inFile);
                        bis = new BufferedInputStream(fis);
                    } catch (FileNotFoundException ex) {
                        System.err.println("Cannot read " + fileName);
                        continue;
                    }
                    String base = fileName.substring(0, fileName.lastIndexOf("."));
                    File outFile = new File(outDirectory, base + ".java");
                    PrintWriter pw = null;
                    try {
                        FileOutputStream fos = new FileOutputStream(outFile);
                        BufferedOutputStream bos = new BufferedOutputStream(fos);
                        pw = new PrintWriter(bos);
                    } catch (IOException ex) {
                        System.err.println("Cannot write " + outFile);
                        continue;
                    }
                    doStream(bis, pw);
                    bis.close();
                    pw.close();
                    /*
                     * Print names of output files on stdout to support
                     * the usage pattern: jikes `mls inputfiles`
                     */
                    System.out.println(outFile);
                }
            } catch (Throwable e) {
                System.err.println(e.getMessage());
                e.printStackTrace();
                System.exit(-1);
            }
        }
    }
