Thursday, 15 December 2011

MoonScript: A Lua-Based Language

Let a Thousand Flowers Bloom

There are literally hundreds of programming languages, most of which are historical curiosities, research tools or personal projects. You would think that most of the important innovations have happened, and existing mature tools available for all tastes. That is true, to a point: but we are still exploring the space of expressive notation for solving our programming problems. Here the practical programmer and the academic researcher diverge, of course; the latter sees most new languages as rehashes of things first done in the golden age of PL development.

The last few years has been a surprisingly fertile time for new programming languages. They range from Go (native code + new runtime), Scala (bytecode + JVM) to languages like CoffeeScript or Dart which translate to JavaScript. The trend is mostly towards static typing but with type inference helping to keep the syntax clear and uncluttered, with functional influences. There is some TIOBE and anecdotal evidence that the dynamic language usage has peaked, possibly because of the maintenance difficulties of large codebases (an explicit motivation for Dart's development, which has optional typing.)

MoonScript is inspired by CoffeeScript, but generates Lua code instead. JavaScript and Lua share many similarities like first-class functions, the m["name"] == m.name equivalence, and no conventional OOP constructs. People like the convenience and familiarity of class, however, so CoffeeScript, Dart and MoonScript provide explicit support.

Trying it Out

When reviewing a new language, it's useful to remember that programming languages take time and exposure to become mature. MoonScript is now on its second release, three months after the first, and getting out of its proof-of-concept phase. 'Leaf' Cocoran is doing all the right things; he has provided syntax files for some common editors, a binary for the notoriously compiler-shy Windows community, and has a good looking website with comprehensive documentation. And he is building on a very solid base, the battle-tested Lua 5.1 release.

Also, reviews are not tutorials; I won't try to cover all the syntax, but will try to give the flavour of the language.

MoonScript could generate Lua bytecode directly, but would hit the basic issue that the "Lua VM" does not have standardized bytecode. In particular, this would exclude MoonScript from using the excellent LuaJIT implementation, which is a known problem with the otherwise powerful Metalua dialect. And having a readable high-level language as the compilation output makes it easier to develop and trouble-shoot.

Although not compiler-shy, I am lazy and just downloaded the Windows binary (it's hard to resist a language delivered as a 212KB zip file. Besides, the 'non-POSIX' world doesn't get enough attention these days.) If you want to avoid the 'moon script.moon' incantation, then it's straightforward to tell Windows how to handle MoonScripts:

 ftype moonscript=d:\utils\bin\moon.exe %1 %*
 assoc .moon=moonscript

(These make global permanent changes, so you need to run as Administrator). Then add .moon to the system PATHEXT variable. The incantation is now just 'script'.

 -- script.moon
 print "Hi Ma", "No Parentheses!"

Wish-list Driven Design

In many ways, MoonScript is like a collection of features from a Lua user's wishlist; list comprehensions, shorthand for lambdas, switch statements, explicit classes, statements-as-expressions, switch statement, local-as-default. But Lua is the C of the dynamic languages; it has a small, very powerful core and 'conventional' syntax which has made it popular as an extension language. It moves slowly and carefully, and does not pander to 'Python envy'. MoonScript, on the other hand, will appeal to people with Python exposure, since indentation is used in the same way to delimit blocks.

I would not recommend MoonScript to a beginner in the same way I could for Lua itself, since it has a very concise syntax that appeals more to seasoned professionals. There is a lot more to remember, and always a number of ways to do similar things. This example shows the conciseness of expression:

 sqr = (x) -> x*x
 print sqr 4, sqr 5

The last statement in a function becomes its value; there's still return available but usually it's only needed explicitly when leaving a function before its end. print is still the same function it is in Lua, but parentheses are usually optional.

Well, cool: we've saved a lot of typing. That translates to something like this:

 local sqr
 sqr = function(x) return x*x end
 print (sqr(4),sqr(5))

Now saving on typing is an ambiguous virtue; if we wanted the ultimate compression, we would all be using APL and playing Code Golf. Classic Lua programmers prefer keywords to sigils, and fear that their language would end up like Perl. They say that reading code is a much more common operation than writing it. And that is fine and dandy (I mostly agree), which is why we need projects like MoonScript and Metalua to get a feeling for new notation.

Personally I like how MoonScript makes the functional aspects of Lua stand out more clearly. Higher-order functions are less cluttered with keywords:

 compose = (f,g) ->
   (...) -> f g ...
 printf = compose print, string.format
 printf "the answer is %d, '%s'", 42, "Everything"

List comprehensions are another favourite functional idiom:

 range = [i for i = 1,10,0.1]
 squares = [sqr(x) for x in *range]

Note the * syntax, which satisfies the common need to iterate directly over the elements of an array. (You can still say for _,x in ipairs range) These constructs result in inlined Lua code, which is a good decision given the very fast translation times for Lua and its modest resource needs.

Multiple for clauses are allowed:

 points = [{x,y} for x in *x_coords for y in *y_coords]

That generates about 17 lines of Lua output; hand-written would be about 10, which would still slow readers down and make them have to guess the programmer's intention.

Statements can be used as expressions, which is useful in functions:

 choose = (c,a,b) -> if c then a else b

In fact, there's a clever feature that for statements return tables:

 show = (t) ->
     for k,v in pairs t
         print k, v
 res = for i = 1,10 do 2*i
 show res

This is sufficiently clever to cause unintended consequences, so the rule is that this does not happen in function expressions and cause needless table generation:

 loop = -> for i = 1,10 do i
 print loop()
 -->
 (nada)

It's probably too clever to be truly wise, given that list comprehensions do the job more explicitly. However, it can be used to go beyond the list comprehension syntax:

 values = {1,4,3,5,2}
 v = for v in *values do switch v
     when 1 then 'bad'
     when 4 then 'cool'
     when 5 then 'excellent'
     else "uh-huh"
 -- v is now {'bad','cool','uh-huh','excellent','uh-huh'}
 --

New with this release are table comprehensions, which are a generalization to map-like tables. These use curly braces to distinguish themselves, and the expression must be double-valued, with the first value becoming the new key and the second the associated value. So this expression turns an array into a set, that is t["hello"] is true if ls contains the value "hello".

 t = {k,true for k in *ls}

At this point, it's good to remember that we are still in Lua land, and there is no automatic pretty-printing of tables - which is wise because tables are the ultimate flexible data structure. (You can choose the string representation of your own types easily enough.) There is a small library included which provides a generic table dumper:

 require 'moon'
 moon.p {
     one: 'hello',
     10,
     20
 }
 -->
 {
     [1] = 10
     [2] = 20
     [one] = "hello"
 }

(For those used to Lua, note the use of : for key-value assignments in table constructors; the key may also be a keyword. For those who are not, note that tables can both be 'arrays' and 'hashes' - they are general associative arrays which are particularly efficient at regular array indexing. Like electrons, they are both waves and particles.)

However, despite the perception, there are batteries for Lua and MoonScript can consume them happily.

Using Existing Lua Libraries

MoonScript already includes LuaFileSystem and LPeg as standard libraries.

 total = 0
 for f in lfs.dir "." if string.match f "%.lua$"
   size = lfs.attributes(f).size
   print f, size
   total += size
 print 'total',total

(It also comes with lua-getopt for argument parsing, but that's an awkward, no-documentation kind of library.)

My favourite command-line parser is Lapp, which is part of the Penlight suite of libraries, since it parses the actual flag names and types from the usage string:

 -- mhead.moon
 require 'pl'
 args = lapp [[
     A simple head utility
     -n,--lines (default 10) number of lines to print
     <file> (default stdin) file to print; can be stdin
 ]]
 for i = 1,args.n
     line = args.file\read!
     if not line then break
     print line

On Windows, the directory for Lua packages is relative to the executable. So if I install moon.exe to d:\utils\bin, then Lua packages must go in d:\utils\bin\lua. To install Penlight simply involves copying its pl directory to this lua directory. For other systems, MoonScript shares the usual global Lua module path.

To read configuration files (both Unix and INI-style) we can use pl.config

 require 'pl'
 test = [[
 # test.config
 # Read timeout in seconds
 read.timeout=10
 # Write timeout in seconds
 write.timeout=5
 #acceptable ports
 ports = 1002,1003,1004
 ]]
 t = config.read stringio.open test
 pretty.dump t
 [[ --output--
  {
     ports = {
       1002,
       1003,
       1004
     },
     write_timeout = 5,
     read_timeout = 10
   }
 ]]

Classes

By the time people get to Lua (or Go) they have been sufficiently indoctrinated to instantly miss classes. Lua prefers to define more general metaprogramming techniques, which allows a competent user to roll any kind of custom OOP framework. (Unfortunately, too many people do this, so interoperability becomes a problem.) But ultimately all that matters is that the duck quacks. Lua does require you to indicate that the quacking applies to a particular duck, or to all ducks, much as a C++ programmer uses -> and ::.

In Lua, a method call is:

 duck:quack()

And in Moonscript:

 duck\quack()

(Idiomatic MoonScript would actually be duck\quack! but let's change one variable at a time.)

I'm not 100% crazy about the backslash here, but : has now got a more conventional job defining value/key pairs in MoonScript (and there was no other punctionation available ;))

So here is a MoonScript class, bit by bit:

 import insert,concat,remove from table

the import statement is one of those nice-to-haves which would be very useful in Lua, since the equivalent is the noisy and redundant:

 local insert,concat,remove = table.insert, table.concat, table.remove

A class is an intented block containing methods

 class List
     new: (t) =>
         @ls = t or {}

Note the colon, instead of =; this reminds us that we are putting these functions into a table. The fat arrow => means that the function has an implicit self argument; here it's equivalent to (self,t) ->.

@ls is just sugar for self.ls, as in Ruby. (The new method will be indirectly called as a constructor).

Our chosen representation for a list is a table field ls. This can be manipulated by the usual Lua table functions, which we previously imported:

 add: (item) =>
     insert @ls,item
 insert: (idx,item) =>
     insert @ls,idx,item
 remove: (idx) => remove @ls,idx
 len: => #@ls
 __tostring: => '['..(concat @ls,',')..']'

table.insert is basically two functions wrapped up as one - it both appends and inserts. #@ls is one of those collisions of punctuation that one just has to get used to in any language - it helps to read it as #self.ls (and you can use self explicitly if you like.)

Any Lua metamethod can be specified in a class. A MoonScript class is the metatable for all its instance objects, with __index pointing back at the metatable so the object can find its methods. This is the most common way of implementing 'classes' in Lua.

Unfortunately, # cannot be overloaded in Lua 5.1 for tables; __len only applies to userdata defined by C extensions. (This is one of the welcome Lua 5.2 improvements)

 find: (item) =>
     for i = 1,#@ls
         if @ls[i] == item then return i
 remove_value: (item) =>
     idx = self\find item
     self\remove idx if idx
 remove_values: (items) =>
     for item in *items do self\remove_value item

if statements come in two flavours: multiline, where the following block is indented, and single statements that use then. The same applies to for, where the single statement version has an explicit do.

Any statement/expression can also have an if clause, called a 'line decorator'.

Now list comprehensions get their turn:

 index_by: (indexes) =>
     List [@ls[idx] for idx in *indexes]
 copy: => List [v for v in *@ls]

And finally we define extension, iteration and concatenation:

 extend: (list) =>
     other = if list.__class == List then list.ls else list
     for v in *other do self\add v
     self
 __concat: (l1,l2) -> l1\copy!\extend l2
 iter: =>
     i,t,n = 0,@ls,#@ls
     ->
         i += 1
         if i <= n then t[i]

Any instance of a MoonScript class will have a __class field; I make this check because it's useful to allow a list to be extended by a plain vanilla table.

The iter method needs some explanation: in Lua, the for in statement expects a function that it can repeatedly call to get each successive value of the sequence, until that function returns nil. This method returns a closure with upvalues i, t and n.

And, exercising List:

 ls = List!
 ls\add 10
 ls\add 20
 ls\remove_value 10
 ls\remove_value 15
 ls\add 30
 ls\remove 1
 for v in *{1,2,3} do ls\add v
 print ls, ls\len!
 print ls\index_by {2,3}
 for v in ls\iter!
     print v
 print (List{'a','b'})..(List{'A','B'})

No type is complete without a definition of equality. == in MoonScript, as in Lua, compares simple values, not referencews. Objects are equal if they are the same object; (string equality works as expected because all Lua strings are guaranteed to be internally unique.) Generally, this is a good idea (none of JavaScript's 'helpfullness' here!). The relavant metamethod is __eq:

 __eq: (l1,l2) ->
     return false if l1\len! != l2\len!
     l1, l2 = l1.ls, l2.ls
     for i = 1,#l1
         return false if l1[i] != l2[i]
     return true

As another example of a Lua wish-item, note that ls\add without arguments is a valid expression, called a 'function stub',

 ls = List {1,2}
 add = ls\add
 add 3
 add 4
 assert ls == List {1,2,3,4}

ls\add 3 is short for ls.add ls 3; ls\add is a partial function application that binds the self argument.

See here for the complete code listing.

use as a DSL

One of the things I look for in new languages is how amenable their syntax is to the creation of embedded domain-specific languages.

The MoonScript author has a static website generator called sitegen where you specify your configuration with a site.moon file:

 require "sitegen"
 site = sitegen.create_site =>
   deploy_to "leaf@leafo.net", "www/sitegen"
 @title = "Sitegen"
 @url = "http://leafo.net/sitegen/"
 add "index.html"
 add "doc/ref.md"
 site\write!

We've met the fat arrow (=>) before in methods, but it can be used anywhere. Here => is shorthand for (self) ->. Now the @title shortcut makes sense, since it's just self.title.

The Lua equivalent is certainly more verbose, although I'm not sure we're losing much on readability:

 require "sitegen"
 local site = sitegen.create_site(function(self)
    deploy_to("leaf@leafo.net","www/sitegen")
 self.title = "sitegen"
 self.url = "http://leafo.net/sitegen/"
 add "index.html"
 add "doc/ref.md"
end)
 site:write()

sitegen is the best existing example of real working MoonScript. As the author says modestly, "apart from the compiler". (That's right, MoonScript is now self-hosting.)

Problems and Irritations

You'll notice the necessary parentheses when concatenating two List expressions. Although we are encouraged to leave them out unless they are necessary, the rules are not as straightforward as they could be. For instance, this equivalence contradicts the above rule; note how whitespace matters:

 L"frodo" + L"bilbo" == L("frodo") == L("bilbo")
 L "frodo" + L "bilbo" = L("frodo" + L("bilbo"))

This comes from a Lua convention that parens are not needed if the function has a single string argument, but results in a special case that is likely to bite more than one person. It's easy enough to learn to read indentation as syntax, but making a single space syntactically significant is going too far. I would see this as destructive interference from the base language.

There are other inconsistencies, like you cannot say @\f to call another method when inside a method body. (This would lead to serious punctuation clusters like @\f! so perhaps this is a feature ;))

Generally, the compiler is not so very helpful and gives non-specific parse errors, and the line it does complain about is often not the problem line anyway. So I found myself having to read the manual carefully to work out issues. It reminded me of my Seventies experience with mainframe Fortran, where the result of painstaking work at the card puncher would lead to the equivalent of 'Que?'. And run-time errors would usually end up as a six-digit error code which you would then have to look up in the IBM manuals, which were bolted down like Medieval manuscripts. We've come a long way, and it shows how important good diagnostics are to learning a system: the compiler as tutor.

But hey, it's a young language, these are all fixable faults. It's a must-play for anybody interested in the future evolution of Lua, and could find an audience among CoffeeScript enthusiasts who might appreciate a lighter, faster runtime, especially with more generally useful batteries.

1 comment:

  1. I'm getting started on a project using the Corona SDK and am deciding whether to use Lua on its own, or compile with Moonscript. This article was really helpful in detailing some of the differences and what you found helpful. Thanks!

    ReplyDelete