Wednesday 10 August 2011

Enjoying Java (Again)

Fun with Java seems like an unlikely combination these days. All the cool kids have moved to dynamic languages, or to more exciting JVM languages like Scala. Meanwhile the old dog plods along, not really learning any new tricks, doing a corporate gig as the COBOL of the 21st Century.

I have a fair amount of stuff prototyped with LuaJava, which had grown into a hairy blob that needed rewriting: the point at which exploratory programming breaks down and you have to move from scripting to programming. This does not automatically mean changing from a dynamic language, of course: 'scripting language' is pejorative and somehow implies that doing better isn't possible given the language. But here I had to make the adult choice between a language which my peers did not know and one which would be a better bet for maintainability. Still, I was determined to have fun in the process.

It is perfectly possible to enjoy Java programming, especially if you use a more dynamic way of thinking. Certain sacred cows will be inconvenienced, of course. The heart of the matter is how to be sure that a program is correct; the static types perspective is that the compiler should catch as many problems as possible, whereas the dynamic perspective is that errors will happen anyway, so make sure that they happen as soon as possible. A language like Java allows solutions along the whole continuum between these positions.

The verbosity of Java is an issue, but can often be worked around creatively. Several features introduced in Java 1.5 are helpful. (Some of these clearly came about because of competition from C#.) The first is methods that can take a variable number of arguments of a type:

 static Object[] A(Object... objects) {
     return objects;
 }

This is useful and also succinctly expresses how the varargs mechanism works: it is just syntactic sugar for passing an array. Now we can do cool things like this without the distraction of the syntactic overhead:

 Object[] ls = A("hello",2.3,A(true,2,"what"));

Auto-boxing (another feature 'inspired' by C#) makes these list expressions work as expected. The result is in fact rather close to what a Python or Lua programmer would recognize as data.
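To make the point concrete, here is a minimal sketch that builds the nested list above and prints it. The class name ListLiterals is made up for the example; A is the helper from the text.

```java
import java.util.Arrays;

class ListLiterals {
    // Same helper as in the text: varargs is just an Object[] in disguise.
    static Object[] A(Object... objects) {
        return objects;
    }

    public static void main(String[] args) {
        Object[] ls = A("hello", 2.3, A(true, 2, "what"));
        // Auto-boxing has turned the primitives into wrapper objects.
        System.out.println(ls[1].getClass().getSimpleName());   // Double
        // deepToString walks nested arrays, much like printing a Python list.
        System.out.println(Arrays.deepToString(ls));   // [hello, 2.3, [true, 2, what]]
    }
}
```

One wrinkle worth knowing: because an Object[] argument already matches Object..., a call like A(A(1,2)) passes the inner array straight through rather than wrapping it in a one-element list.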

The Collection Literals proposal would certainly be useful, but appears not to have made it into Java 7. (In any case, this proposal only applies to immutable lists, maps, etc.)

Here is an easy map constructor:

 static Map M (Object... objects) {
     Map map = new HashMap();
     for (int i = 0; i < objects.length; i += 2) {
         map.put(objects[i], objects[i+1]);
     }
     return map;
 }
 ...
 Map<String,Integer> map = M("one",1,"two",2,"three",3);

Eclipse does not approve of this code, and expresses itself with yellow ink (raw-type and unchecked-conversion warnings); we have lost some compile-time guarantees and gained something in return. Is the increased expressiveness worth the loss in type-safety? It depends how you evaluate the cost; a dynamic-language person is used to type uncertainty, and immediately thinks about error checking and testability. No seasoned programmer regards a correct compilation as anything but the first step to correct code. The dynamic strategy is to gain expressivity and flexibility, losing some static type safety, and try to fail as soon as possible in the development cycle. That is, 'late binding' is best if the binding isn't too late.
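One way to 'fail as soon as possible' with such a constructor is to check the elements when the map is built, rather than waiting for a ClassCastException at some distant point of use. The following is only a sketch of that idea: the class name CheckedMaps and the extra Class arguments are assumptions for illustration, not part of the code discussed here.

```java
import java.util.HashMap;
import java.util.Map;

class CheckedMaps {
    // Hypothetical variant of M: the caller names the key and value classes,
    // so a wrong element is reported at construction time, not at first use.
    static <K, V> Map<K, V> M(Class<K> keyType, Class<V> valueType, Object... objects) {
        if (objects.length % 2 != 0) {
            throw new IllegalArgumentException("need key/value pairs");
        }
        Map<K, V> map = new HashMap<K, V>();
        for (int i = 0; i < objects.length; i += 2) {
            // Class.cast throws ClassCastException immediately on a mismatch.
            map.put(keyType.cast(objects[i]), valueType.cast(objects[i + 1]));
        }
        return map;
    }

    public static void main(String[] args) {
        Map<String, Integer> ok = M(String.class, Integer.class, "one", 1, "two", 2);
        System.out.println(ok.get("two"));   // 2
        try {
            M(String.class, Integer.class, "one", 1, "two", "2");
        } catch (ClassCastException e) {
            System.out.println("rejected early");
        }
    }
}
```

The price is two extra arguments per call, so this sits somewhere in the middle of the continuum: noisier than the raw version, but free of warnings and quicker to complain.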

Many would argue that the type-safety guarantees that Java can provide are hopelessly inadequate anyway (any variable of object type can be null, for instance). So they seek languages where ultimately it is impossible to write incorrect programs (at the cost of making it impossible to write non-trivial programs, but I digress).

There are many applications for which Java reflection was clearly intended. There is an irritating amount of detail involved in writing command-line programs in any language, and the following is one way of approaching the problem. The immediate inspiration was lapp, a framework for writing command-line Lua scripts; the idea was that parameters could be declared as having specific types, e.g. files-for-reading, and they would then become directly available for use, with the framework closing any open files at the end.

The introduction of annotations (what C# calls attributes) was another useful innovation that came with 1.5 (again, arguably under pressure from C#).

 @Help("Demonstrating a simple command-line framework")
 public class SimpleCommand extends Commandlet {
     @Help("scale factor")
     public double scale = 1.0;
     @Help("adds two numbers together and scales")
     public double add(double x, double y) {
         return (x+y)*scale;
     }
     public static void main(String[] args) {
         new SimpleCommand().go(args);
     }
 }
 $> java SimpleCommand --help
 Demonstrating a simple command-line framework
 Flags:
 -scale: scale factor
 Commands:
 add:    adds two numbers together and scales
 $> java SimpleCommand add 1.2 4.2
 5.4
 $> java SimpleCommand -scale 2 add 1.2 4.2
 10.8

Commandlet is a straightforward little framework (about 500 lines, which is micro by Java standards) that handles some pesky details. From reflection, it knows what types a command expects and converts the parameters accordingly. (I'm not a fan of prescriptive frameworks, so it is not necessary to use @Help annotations; they are just used to provide help, rather like Python doc strings.)
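The core trick can be sketched in a few lines. This is not Commandlet itself, just a minimal illustration of the same idea under simple assumptions: the names MiniDispatch, Calc, convert and run are invented for the example, and only double, int and String parameters are handled.

```java
import java.lang.reflect.Method;

class MiniDispatch {
    // Example command class; any public method becomes a command.
    public static class Calc {
        public double add(double x, double y) { return x + y; }
        public String repeat(String s, int n) {
            StringBuilder sb = new StringBuilder();
            for (int i = 0; i < n; i++) sb.append(s);
            return sb.toString();
        }
    }

    // Convert one command-line word to the type the method declares.
    static Object convert(String s, Class<?> type) {
        if (type == double.class) return Double.parseDouble(s);
        if (type == int.class) return Integer.parseInt(s);
        if (type == String.class) return s;
        throw new IllegalArgumentException("no conversion for " + type.getName());
    }

    // Find a public method by name and argument count, convert, and invoke.
    static Object run(Object target, String name, String... words) {
        for (Method m : target.getClass().getMethods()) {
            Class<?>[] types = m.getParameterTypes();
            if (!m.getName().equals(name) || types.length != words.length) continue;
            Object[] values = new Object[words.length];
            for (int i = 0; i < words.length; i++) {
                values[i] = convert(words[i], types[i]);
            }
            try {
                return m.invoke(target, values);
            } catch (ReflectiveOperationException e) {
                throw new IllegalStateException("cannot invoke " + name, e);
            }
        }
        throw new IllegalArgumentException("unknown command " + name);
    }

    public static void main(String[] args) {
        Calc calc = new Calc();
        System.out.println(run(calc, "add", "1.5", "4.0"));    // 5.5
        System.out.println(run(calc, "repeat", "ab", "3"));    // ababab
    }
}
```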

Naturally, after I did this I discovered that Cliche does something very similar, except that it is aimed at writing interactive shells. It is more of a library (you do not derive your classes from a framework class), which is arguably a more flexible design.

The only tricky bit of reflection magic needed was to support variable argument lists:

 @Help("takes a variable number of doubles")
 public void sum(double... vals) {
     double sum = 0.0;
     for (double x : vals) {
         sum += x;
     }
     System.out.println ("sum was " + sum);
 }

Invoking this method by reflection involves passing a double[] array, so for vararg methods any extra parameters are collected and converted into a primitive array of doubles.

 if (method.isVarArgs()) {
   Class<?> type = types[nargs].getComponentType();
   Object varargs = Array.newInstance(type, parms.length - nargs);
   for (int i = nargs, j = 0; i < parms.length; ++i,++j) {
      Object res = convertString(parms[i],type);
      Array.set(varargs,j,res);
   }
   values.add(varargs);
 }
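The fragment above depends on surrounding framework state (types, nargs, parms, values). As a self-contained illustration of the same technique, the following sketch packs string arguments into a double[] and invokes a varargs method reflectively. The names VarArgsCall, packVarArgs and callSum are made up for the example, and only double elements are handled.

```java
import java.lang.reflect.Array;
import java.lang.reflect.Method;

class VarArgsCall {
    public static double sum(double... vals) {
        double total = 0.0;
        for (double x : vals) total += x;
        return total;
    }

    // Pack string words into the primitive array a varargs method expects.
    static Object packVarArgs(Method method, String[] words) {
        Class<?>[] types = method.getParameterTypes();
        // The last declared parameter is the array type, e.g. double[].
        Class<?> element = types[types.length - 1].getComponentType();
        Object array = Array.newInstance(element, words.length);
        for (int i = 0; i < words.length; i++) {
            // Only double is handled in this sketch.
            Array.setDouble(array, i, Double.parseDouble(words[i]));
        }
        return array;
    }

    // Invoke sum reflectively, the way a command framework would.
    static Object callSum(String... words) {
        try {
            Method m = VarArgsCall.class.getMethod("sum", double[].class);
            if (!m.isVarArgs()) throw new IllegalStateException("expected varargs");
            return m.invoke(null, packVarArgs(m, words));
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(callSum("1.5", "2.5", "4"));   // 8.0
    }
}
```

Note that to reflection a varargs method simply has one array parameter, which is why getMethod is asked for double[].class.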

Being one of those people who can never remember how to read a text file in Java, I've made BufferedReader a known type:

 @Help("read a file and trim each line")
 public void read(BufferedReader file) throws IOException {
     String line = file.readLine();
     while (line != null) {
         System.out.println(line.trim());
         line = file.readLine();
     }
 }

Parameters of type int, double, String, BufferedReader and PrintStream are known, and others can be added. Say I have this method:

 public void dec (byte[] arr) {
     for (byte b : arr) {
         System.out.print(b+" ");
     }
     System.out.println();
 }

Then defining how byte[] is to be read in can be done like so:

 @Converter
 public byte[] toByteArray(String s) {
     byte[] res = new byte[s.length()/2];
     for (int i = 0, j = 0; i < s.length(); i += 2, j++) {
         String hex = s.substring(i,i+2);
         res[j] = (byte)Short.parseShort(hex, 16);
     }
     return res;
 }
 $> java bytearr dec AF03EE
 -81 3 -18

The strategy is simple: if a parameter type is unknown, then look at all public methods marked with Converter and match against their return types. It's interesting to contrast this convention with the approach to converters taken by Cliche which is more of a classic Java solution (define a new interface and create an anonymous class implementing it).
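That lookup can be sketched as follows. The Converter annotation declared here is a stand-in (the real one lives in the framework, and its exact definition is an assumption), as are the names ConverterLookup and Commands. The one essential detail is RUNTIME retention, without which reflection cannot see the annotation at all.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Method;
import java.util.Arrays;

class ConverterLookup {
    // Hypothetical stand-in for the framework's marker annotation.
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.METHOD)
    public @interface Converter {}

    public static class Commands {
        @Converter
        public byte[] toByteArray(String s) {
            byte[] res = new byte[s.length() / 2];
            for (int i = 0, j = 0; i + 1 < s.length(); i += 2, j++) {
                res[j] = (byte) Integer.parseInt(s.substring(i, i + 2), 16);
            }
            return res;
        }
    }

    // Scan public methods for a @Converter whose return type is the wanted type.
    static Object convert(Object target, String text, Class<?> wanted) {
        for (Method m : target.getClass().getMethods()) {
            if (m.isAnnotationPresent(Converter.class) && m.getReturnType() == wanted) {
                try {
                    return m.invoke(target, text);
                } catch (ReflectiveOperationException e) {
                    throw new IllegalStateException(e);
                }
            }
        }
        throw new IllegalArgumentException("no converter for " + wanted.getName());
    }

    public static void main(String[] args) {
        byte[] bytes = (byte[]) convert(new Commands(), "AF03EE", byte[].class);
        System.out.println(Arrays.toString(bytes));   // [-81, 3, -18]
    }
}
```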

Similarly, you gain control over output by defining 'stringifiers':

 public byte[] test() { return new byte[] {0x4E,0x3C,0x02}; }
 @Stringifier
 public String asString(byte[] arr) {
     StringBuffer sb = new StringBuffer();
     for (int i = 0; i < arr.length; i++) {
         sb.append(String.format("%02X",arr[i]));
     }
     return sb.toString();
 }
 ...
 $> java bytearr test
 4E3C02

Here it's the first argument type that must match the output type, and the return value must be a string. As with Converters, the actual name of the method is not important.

Usually the idea is to turn data into something more readable for humans, but it's also useful to generate standard formats that other programs can easily parse. JSON is a popular representation of data that fits well with the first theme of this article, which is discovering expressive notations. The elegant little function J makes creating dynamic structured data almost as natural as it is in JavaScript or Python:

 public Json result(int a, int b) {
     return J("one",A(a,10*a),"two",A(b,10*b));
 }
 $> java JDataTest result 10 20
 {
  "two":[
   "20",
   "200"
   ],
  "one":[
   "10",
   "100"
   ]
  }

The class Json is derived from a map from strings to objects, so J can be expressed like this in a way that makes the potential errors clearer.

 static void error (String message) throws IllegalArgumentException {
     throw new IllegalArgumentException(message);
 }
 static Json J(Object... objects) throws IllegalArgumentException {
     Json res = new Json();
     if (objects.length % 2 != 0)
         error("need key/value pairs");
     for (int i = 0; i < objects.length; i += 2) {
         if (! (objects[i] instanceof String))
             error("keys must be strings");
         res.put((String) objects[i], objects[i+1]);
     }
     return res;
 }

Json also contains a method for generating an optionally pretty-printed text representation of this structure (useful, but not in itself particularly interesting code); it gives us the stringifier.

 @Stringifier
 public String json(Json map) {
     return Json.asText(map," ");
 }

There was a discussion on Reddit recently about how JSON would make a good interchange format between command-line programs. We can go further, and do a Java shell that works like PowerShell; commands pass actual data between each other, which is then put out as text by various adapters. That is, in this line the piped 'programs' are classes which will pass data directly to each other - here baz is either a custom adapter, or one gets the default output in JSON.

 $> foo | bar | baz

And (perhaps) that will be something to amuse me on a rainy Sunday ...

In the meantime, at work I wanted to show a concrete example of a Web service that could allow our server speaking CORBA to communicate with a Django web server. Corbaphobia is a well-known phenomenon in the distributed programming universe and I wanted to make all this cool remote functionality available in a simpler format. JSON seemed a better match than XML for our data. So I realized that it would actually be fairly straightforward to derive a little web server from Commandlet and in fact it took fewer than 120 lines, thanks to classes like URLDecoder. A URL like 'find?name=alice&age=16' is turned into 'find -name alice -age 16' and then interpreted by Commandlet, with all such commands returning Json data with the above stringifier. The commands take no explicit parameters, which are instead passed as flags. Such servers must therefore define a resetFields method so they can clear out the fields before each new request.
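The URL-to-command-line step might look something like this. It is a sketch rather than the MiniJsonServer code: the names UrlToCommand, toArgs and decode are invented here, and the only library call relied on is URLDecoder.decode.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.util.ArrayList;
import java.util.List;

class UrlToCommand {
    // Turn "find?name=alice&age=16" into [find, -name, alice, -age, 16].
    static List<String> toArgs(String url) {
        List<String> args = new ArrayList<String>();
        int q = url.indexOf('?');
        args.add(q < 0 ? url : url.substring(0, q));
        if (q < 0 || q == url.length() - 1) return args;
        for (String pair : url.substring(q + 1).split("&")) {
            int eq = pair.indexOf('=');
            String key = eq < 0 ? pair : pair.substring(0, eq);
            args.add("-" + decode(key));
            if (eq >= 0) args.add(decode(pair.substring(eq + 1)));
        }
        return args;
    }

    static String decode(String s) {
        try {
            return URLDecoder.decode(s, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e);   // UTF-8 is always available
        }
    }

    public static void main(String[] args) {
        System.out.println(toArgs("find?name=alice&age=16"));
        // [find, -name, alice, -age, 16]
        System.out.println(toArgs("find?name=alice+smith"));
        // [find, -name, alice smith]
    }
}
```

Splitting on '&' and '=' before decoding matters: a value that contains an encoded '&' or '=' then survives intact.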

Making such a simple web service interface to existing remote services turns out to be very straightforward, mostly due to the expressive power of the J function above.

 public String ssid;
 public int did;
 public Json getSingleDataRange() {
     if (ssid == null) return error("must provide 'ssid'");
     long sid = parseSid(ssid);
     if (sid == 0) return error("cannot parse "+ssid);
     if (did == 0) return error("must provide 'did'");
     Results results = remote.getSingleDataRange(sid, did, getTimeSpan(sid));
     // have to explicitly box the primitive arrays to serialize out to JSON
     int len = results.times.length;
     Object[] times = new Object[len];
     Object[] values = new Object[len];
     for (int i = 0; i < len; ++i) {
         times[i] = results.times[i];
         values[i] = results.values[i];
     }
     return J("times",times,"values",values);
 }
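The explicit boxing loop is the kind of thing that could be pushed into a small helper. A possible sketch, assuming nothing beyond java.lang.reflect.Array (the names Boxing and box are invented for the example):

```java
import java.lang.reflect.Array;

class Boxing {
    // Copy any primitive array (int[], double[], long[], ...) into an Object[].
    static Object[] box(Object primitiveArray) {
        int len = Array.getLength(primitiveArray);
        Object[] boxed = new Object[len];
        for (int i = 0; i < len; i++) {
            boxed[i] = Array.get(primitiveArray, i);   // Array.get boxes the element
        }
        return boxed;
    }

    public static void main(String[] args) {
        Object[] times = box(new long[] {1000L, 2000L});
        Object[] values = box(new double[] {0.5, 1.5});
        System.out.println(times[1] + " " + values[0]);   // 2000 0.5
    }
}
```

With such a helper the tail of the method could shrink to a single J(...) call, at the cost of losing the compile-time check that the argument really is an array.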

The error strategy is simple: don't worry about 404 or 500, just return some data that contains an error message.

 private Json error(String msg) {
     return J("error",msg);
 }

The test client was 33 lines of Python, another case of the right tool for the job, really just a matter of exercising standard Python libraries like httplib, json, pprint and urllib.

MiniJsonServer is what the Agile people so charmingly call a spike solution, i.e. what us less sophisticated hackers call an 'exploratory hack'. It is a hack because it transforms URLs into equivalent command-line parameters, and relies on the local state of the object to pass parameters.

We can do better than that, by annotating the arguments of the exposed methods in a style similar to that used by Cliche:

 public Json getSingleDataRange(
   @Param("ssid sensor id") String ssid,
   @Param("did detector id") int did,
   @Param("t1 start time") Time t1,
   @Param("t2 end time") Time t2
 ) {
 ...
 }

Then the parameters can be passed by name and meaningful errors generated semi-automatically. A good side-effect is that the parameters are now documented as well as named, and our little server can respond to a 'help' command with a useful summary of the server's functionality. (As the author of Cliche says, it is straightforward to re-arrange the JavaDoc of a function into this reflection-friendly form).
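Reading such annotations back is a matter of calling getParameterAnnotations. The sketch below assumes a Param annotation whose string holds the name first and the description after it, as in the example above; the annotation definition and the names ParamHelp, Service and describe are illustrative guesses, not the real implementation.

```java
import java.lang.annotation.Annotation;
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Method;
import java.util.ArrayList;
import java.util.List;

class ParamHelp {
    // Hypothetical annotation: first word is the name, the rest is the description.
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.PARAMETER)
    public @interface Param {
        String value();
    }

    public static class Service {
        public String getSingleDataRange(
                @Param("ssid sensor id") String ssid,
                @Param("did detector id") int did) {
            return ssid + ":" + did;
        }
    }

    // Build one help line per annotated parameter of the named method.
    static List<String> describe(Class<?> cls, String methodName) {
        List<String> lines = new ArrayList<String>();
        for (Method m : cls.getMethods()) {
            if (!m.getName().equals(methodName)) continue;
            Class<?>[] types = m.getParameterTypes();
            Annotation[][] annotations = m.getParameterAnnotations();
            for (int i = 0; i < types.length; i++) {
                for (Annotation a : annotations[i]) {
                    if (a instanceof Param) {
                        String[] parts = ((Param) a).value().split(" ", 2);
                        String description = parts.length > 1 ? parts[1] : "";
                        lines.add(parts[0] + " (" + types[i].getSimpleName() + "): " + description);
                    }
                }
            }
        }
        return lines;
    }

    public static void main(String[] args) {
        for (String line : describe(Service.class, "getSingleDataRange")) {
            System.out.println(line);
        }
        // ssid (String): sensor id
        // did (int): detector id
    }
}
```

The annotation has to carry the name because parameter names are not otherwise available to reflection in this version of Java.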

In this application there is freedom to specify times in various formats (not everyone likes to speak Java milliseconds-since-epoch) and we support that by defining an internal time wrapper type called Time and make the conversion more specific, depending on a flag tfmt.

There is always the temptation to push further; it's straightforward for MiniJsonServer to work as a more general local webserver. Some of this functionality fits the intended application, e.g. when the associated data from a sensor is some common binary format like an image, a request for data should return the data with the matching content type, such as 'image/jpeg'. (Encoding images in JSON strikes me as the kind of universal-language madness that got XML such a bad name.) But as I already have the Orbiter project for developing little local HTTP servers, this would be taking the fun too far.

Having a wide range of applications is good exercise for a framework, no matter how dinky. I've done a simple Swing console that integrates with Commandlet. The main method of such commandlets looks like this:

 public static void main(String[] args) {
     TestConsole me = new TestConsole();
     Console c = new Console("testing console","? ",me);
     me.go(c);
     c.setVisible(true);
 }

where Console is the Swing console window. It is relatively stupid and focussed on its job (which is the mark of a good class in my opinion) and implements the Displayable interface that provides a single method display. In turn, Commandlet implements an Evaluator interface providing eval. For maximum flexibility, Commandlet also implements Displayable and provides a default console implementation; Console implements Evaluator so that direct subclasses can use it directly. This is all standard Java thinking and decouples the engine from the interface pretty well.
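Stripped of Swing, the arrangement reduces to two one-method interfaces. In this sketch the method signatures are assumptions (the text names only display and eval), and Engine and BufferDisplay are toy stand-ins for Commandlet and Console.

```java
class Decoupling {
    // Assumed shapes of the two interfaces; the text names only the methods.
    interface Displayable { void display(String text); }
    interface Evaluator { String eval(String line); }

    // Stand-in for the engine: evaluates a line and knows nothing about Swing.
    static class Engine implements Evaluator {
        public String eval(String line) {
            return line.toUpperCase();
        }
    }

    // Stand-in for a front end: collects output instead of drawing a window.
    static class BufferDisplay implements Displayable {
        final StringBuilder out = new StringBuilder();
        public void display(String text) {
            out.append(text).append('\n');
        }
    }

    // The glue only sees the interfaces, so either side can be swapped.
    static void submit(String line, Evaluator engine, Displayable view) {
        view.display(engine.eval(line));
    }

    public static void main(String[] args) {
        BufferDisplay view = new BufferDisplay();
        submit("hello", new Engine(), view);
        System.out.print(view.out);   // HELLO
    }
}
```

A side benefit of this split is testability: the engine can be exercised against a buffer like this one without ever opening a window.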

In summary, apart from having some fun and wanting to talk about it, I wanted to show that more expressive notations in Java can be invented, if you are prepared to sacrifice some correctness. This trade-off should be done as a conscious engineering decision, and shouldn't be decided solely using abstract rules. After all, guidelines are not commandments.

The example code is available here

1 comment:

  1. Wow, you have been getting around! I am constantly in awe of your ability & energy. Java is not my cup of (ahem) coffee, but, since everything is going the way of phones, I can see myself being dragged kicking & screaming that way, sooner or later. Sigh
