HTML
HTML is the Domain Specific Language (DSL) that really has taken over the world. It is meant to be used as semantic markup, e.g. <em> is for emphasis, which can then be mapped to bold, or whatever. Indeed, without a good stylesheet it looks like utter crap. This form has been nicely styled with PicoCSS:
<form>
  <fieldset class="grid">
    <input
      name="login"
      placeholder="Login"
      aria-label="Login"
      autocomplete="username"
    />
    <input
      type="password"
      name="password"
      placeholder="Password"
      aria-label="Password"
      autocomplete="current-password"
    />
    <input
      type="submit"
      value="Log in"
    />
  </fieldset>
</form>
HTML itself (and its data cousin XML) is a simplified form of SGML - which I present here as an example of a standards body having way too much fun.
HTML is a markup language, for presenting documents with structure. Since actually writing lots of HTML is tedious, there is a need to generate it from data.
One of the things buried in browsers is XSLT which was designed for the case of taking XML (such as an RSS feed) and converting it into HTML. It is not very pretty; here is a little taste:
…
<xsl:template match="myNS:Author">
  -- <xsl:value-of select="." />
  <xsl:if test="@company">
    :: <b> <xsl:value-of select="@company" /> </b>
  </xsl:if>
  <br />
</xsl:template>
XSLT uses an XML document to describe the conversion of one kind of XML document into another. But conceptual simplicity does not necessarily make it easier or more convenient to use.
It is more common these days to see templates, like this Go template example:
<h1>{{.PageTitle}}</h1>
<ul>
  {{range .Todos}}
    {{if .Done}}
      <li class="done">{{.Title}}</li>
    {{else}}
      <li>{{.Title}}</li>
    {{end}}
  {{end}}
</ul>
This is certainly easier to read and write, but now we have two languages intermingling with each other - PHP is a good example of this style.
Let's see what a Nushell representation would look like. XML documents are records with tag (a string), attributes (a record) and content (a list of child elements). Once we define a command tag to conveniently construct these elements, the original form example looks like this:
(form
  (fieldset -c grid
    (input -a {
      name: 'login'
      placeholder: 'Login'
      autocomplete: 'username'
      aria-label: 'Login'
    })
    (input -a {
      type: 'password'
      name: 'password'
      placeholder: 'Password'
      aria-label: 'Password'
      autocomplete: 'current-password'
    })
    (input -a {
      type: 'submit'
      value: 'Log in'
    })
  )
)
Pop that through to xml -i 2 -s (some indenting, and closing empty tags) and ... HTML.
Not bad at all, but a little more fussy than the original. The power of this representation is that it is code; we can now have fun defining more specialized and better self-documenting input constructors. Or instead of 'Login' have (gettext 'Login') and get localization.
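The records-as-elements idea is not tied to Nu. Here is a minimal sketch of the same representation in Python, assuming nothing beyond the standard library. The names tag and render are my own for this illustration and are not the Nu library's API; empty elements are always written self-closed, roughly what the -s flag does above.

```python
from html import escape


def tag(name, attrs=None, *content):
    """Build an element record: tag (str), attributes (dict), content (list)."""
    return {"tag": name, "attributes": attrs or {}, "content": list(content)}


def render(node, indent=0):
    """Serialize an element record (or a plain string) to indented markup."""
    pad = "  " * indent
    if isinstance(node, str):
        # Text nodes are escaped so that data cannot inject markup.
        return pad + escape(node)
    attrs = "".join(
        f' {key}="{escape(str(value), quote=True)}"'
        for key, value in node["attributes"].items()
    )
    if not node["content"]:
        # Empty elements are closed in place.
        return f"{pad}<{node['tag']}{attrs} />"
    children = "\n".join(render(child, indent + 1) for child in node["content"])
    return f"{pad}<{node['tag']}{attrs}>\n{children}\n{pad}</{node['tag']}>"


# A shortened version of the login form (hypothetical illustration data).
login_form = tag(
    "form", None,
    tag(
        "fieldset", {"class": "grid"},
        tag("input", {"name": "login", "placeholder": "Login",
                      "aria-label": "Login", "autocomplete": "username"}),
        tag("input", {"type": "submit", "value": "Log in"}),
    ),
)
html_text = render(login_form)
```

Because the form is ordinary data, a specialized constructor (say, a function returning a password input) is just another function that returns one of these records.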
Here's a more dynamic example, which wraps each string in a <li> tag:
tag ul ([
  'Here we go'
  'A whole list of us'
] | each {|v| tag li $v})
We define a helper list that does this. Together with a helper link for <a> elements we get this useful idiom:
(list ul li (
  [
    ['Home' '/index.html']
    ['Help' '/manual.html']
  ] | each { link }
))
The complete little library is here
YAML
YAML (which initially stood for 'Yet Another Markup Language', until its creators got more serious) is a popular way to represent arbitrarily nested data. The attraction is that little punctuation is needed (the same thing that makes a shell work) - JSON involves a lot more quoting and fussy positioning of commas.
a-map:
  # with some key value pairs
  one: 1
  two: 2
  three: drei
  four: # and here's an array
    - 4
    - 40
    - 400
title: an apparently straightforward notation
Unlike JSON, comments are allowed and whether something is a string or not is worked out for you. However, those rules are not obvious and the takeaway from years of experience is "just use quotes, man".
(Also from experience: normal users will completely fuck up things like indentation unless given an opinionated editing environment)
How did something so straightforward in intention get so unwieldy? It was first proposed in 2001, with a first version in 2004, and it had three parents. (Contrast with JSON, which was a strictly enforced subset of Javascript data literals proposed by Douglas Crockford, also in 2001.) They worked hard on a specification, and (in the ultimate analysis) simply had too much fun, as with the SGML process. It would have been better to have someone knock out a prototype over the weekend because they needed a more expressive data format.
YAML is declarative, like HTML, so templating is common; see this example from the Salt Stack:
# Declare Jinja list
{% set users = ['fred', 'bob', 'frank']%}
# Jinja `for` loop
{% for user in users%}
create_{{ user }}:
  user.present:
    - name: {{ user }}
{% endfor %}
I had to deal with this shit at one point, and I'm still triggered by Captain Yaml when he's teamed up with Kid Jinja.
Ewww, as the mean girls say.
I have done some experiments with YAML expansion, inspired by these bad experiences.
This is a semi-useful example. Here is the data:
# animals.yml
data:
  dog:
    version: 1.2
    ports: [2555]
  cat:
    version: 0.8
    ports: [1023]
    volumes: [/:/hostfs.ro]
And the template - the MAP special form takes all the values in the original data map and constructs a new object for each one. The LIST form does the same for each value in an array.
# services.yml
services:
  MAP-k,v-in-data:
    image: 'ourtech/((k)):((v.version))'
    ports:
      LIST-p-IN-v.ports: '((p)):((p))'
    IF-v.volumes?:
      volumes: ((v.volumes))
And the result is in a known format:
# docker-compose.yml
services:
  cat:
    image: ourtech/cat:0.8
    ports:
      - 1023:1023
    volumes:
      - /:/hostfs.ro
  dog:
    image: ourtech/dog:1.2
    ports:
      - 2555:2555
This experiment is in Go and I can clean it up and make it available if there's interest - I am not entirely convinced of the exact notation yet.
It's no surprise that Nu is a pleasant alternative to YAML, since it is also low-punctuation. This is an Ansible file, rendered in the Nu equivalent of JSON: Nuon (Nu Object Notation):
[{
  name: 'Write hostname'
  hosts: all
  tasks: [
    {
      name: 'write hostname using jinja2'
      ansible.builtin.template: {
        src: templates/test.j2
        dest: /tmp/hostname
      }
    }
  ]
}]
This is a tasteful way of integrating Jinja templating, since the templating is in separate files and one doesn't get that mad feeling from seeing two completely different notations in the same file:
# templates/test.j2
My name is {{ ansible_facts['hostname'] }}
With Nu, we can lean into the idea that configuration is executable. This is the same docker-compose example as before:
use expand.nu *
{
  services: (MAP $data.data {|v| NON-NULL {
      image: $"ourtech/($v._KEY):($v.version)"
      ports: (LIST $v.ports "{_}:{_}")
      volumes: $v.volumes?
    }
  })
} | to yaml
MAP operates on the values of a record (though it inserts the key as _KEY); LIST operates on a list, much like each except with a special case for scalar values.
NON-NULL lets us define a record literal that does not insert null values. So the volumes entry will not be filled if $v does not have a volumes key.
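To make the behaviour of these three helpers concrete, here is a rough Python analogue of the docker-compose example. It is only a sketch of the idea, not a port of expand.nu: the function names are mine, and I am assuming that the "special case for scalar values" means a scalar is treated as a one-item list.

```python
def non_null(record):
    """Drop entries whose value is None, in the spirit of NON-NULL."""
    return {key: value for key, value in record.items() if value is not None}


def map_record(record, fn):
    """Apply fn to each value of a record, passing the key along as _KEY."""
    return {key: fn({**value, "_KEY": key}) for key, value in record.items()}


def list_format(values, pattern):
    """Substitute each item for {_} in pattern.

    Assumption: a scalar is handled as if it were a one-item list.
    """
    if not isinstance(values, list):
        values = [values]
    return [pattern.replace("{_}", str(value)) for value in values]


# Same data as animals.yml above.
data = {
    "dog": {"version": 1.2, "ports": [2555]},
    "cat": {"version": 0.8, "ports": [1023], "volumes": ["/:/hostfs.ro"]},
}

compose = {
    "services": map_record(data, lambda v: non_null({
        "image": f"ourtech/{v['_KEY']}:{v['version']}",
        "ports": list_format(v["ports"], "{_}:{_}"),
        # .get returns None for a missing key, which non_null then drops.
        "volumes": v.get("volumes"),
    }))
}
```

The result is a plain dictionary; writing it out as YAML would need a YAML library, since Python's standard library does not include one.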
These helper commands are to be found here
'Executable Configuration' Are you Insane?
Nu is very shell-focused, so (a) it can trivially shell out to external commands and (b) there are platform-independent ways to create and remove files. It isn't designed for sandboxing like Lua.
The safest bet would be running in a container with limited access to the host filesystem, CPU quotas, the works. What the Bomb Squad would call a controlled explosion.
A less defeatist option would be to use the view ir command to dump and traverse the IR looking for call opcodes and apply some sensible allow-list rules (or deny-list? I'm not sure yet). The command ir-scan in the above module returns the list of commands referenced.
But this is not intended as a solutions article; document/data DSLs often need to be generated, and templating involves two entirely different layers of language co-existing with each other. This is particularly hard to read as a human if dealing with YAML. A more expressive language offers a way out - at the cost of allowing arbitrary computation.