TOML vs YAML

So... Let's compare.

Syntax in General

TOML syntax is sometimes described as "noisy". I really don't understand the argument. Usually it is the string syntax that gets attacked. For example, the StrictYAML author calls it 'syntax typing'. I don't know exactly why it is bad, but my counterpoint is that you can't ditch syntax typing if you want an explicitly typed language. Without it you can only get a stringly typed language, and StrictYAML is definitely a stringly typed language. Anyway, this does not apply to regular YAML.

Verdict: both are fine to my taste. TOML is a bit more explicit, YAML is a bit more elegant, but neither is clearly more readable.

String Syntax

TOML:

# 4 ways:
str1 = 'literal string' # single line literal string
str2 = '''
also literal string
'''                     # multiline literal string
str3 = "basic string"   # basic string (with escape sequences)
str4 = """
also basic string
"""                     # multiline basic string

YAML:

# 10-ish ways:
str1: bare string       # bare string
str2: "quoted string"   # double quotes, escape sequences
str3: 'single quoted'   # single quoted, literal string
str4:
    multi
    line                # some abomination, never use it please
    bare                # single quoted and double quoted will follow the same weird rules when multiline
str5: |                 # literal block
    literal block
    seems legit
str6: |-                # literal block but it does not end in a newline
    literal block
    no newline
str7: |+                # literal block but it keeps all trailing newlines
    literal block
    all your newline
    are belong to us


str8: >                 # folded block (nl -> space)
    folded block
    weird but ok
str9: >-                # folded block but it does not end in a newline (why is it not folded by default?)
    folded block
    no newline
str10: >+               # folded block but it keeps all trailing newlines (who ever thought it would be a good idea?)
    folded block
    all your newline
    are belong to us

I wouldn't claim that TOML is a saint here. This syntax, for example, gives me shivers:

# "This," she said, "is just a pointless statement." in multiline double quotes:
str7 = """"This," she said, "is just a pointless statement.""""

but string folding rules and these block modifiers in YAML are above my pain tolerance.

Verdict: TOML. Both are bad but YAML is much much worse.

Explicit vs Implicit

The infamous Norway problem:

scandinavia:
- SE
- NO
- DK
# { "scandinavia": ["SE", false, "DK"] }

Or let's define some versions:

require:
    some/package:   1.2.3.4     # fine
    other/package:  1.2.3       # ok
    more/package:   1.2         # float!
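Both surprises come from the implicit typing of unquoted scalars. What follows is a simplified sketch of that resolution step, modeled on the YAML 1.1 rules that many parsers still follow. It is not a YAML parser, and `resolve_plain_scalar` is a hypothetical helper that only covers booleans, integers and floats.

```python
import re

# Simplified patterns modeled on YAML 1.1 implicit typing (assumption:
# real parsers cover more cases, e.g. null, hex, octal, timestamps).
BOOL_RE = re.compile(
    r"^(?:yes|Yes|YES|no|No|NO|true|True|TRUE|false|False|FALSE"
    r"|on|On|ON|off|Off|OFF)$"
)
INT_RE = re.compile(r"^[-+]?[0-9]+$")
FLOAT_RE = re.compile(r"^[-+]?(?:[0-9]+\.[0-9]*|\.[0-9]+)$")


def resolve_plain_scalar(text):
    """Guess the type of an unquoted scalar roughly the way YAML 1.1 does."""
    if BOOL_RE.match(text):
        return text.lower() in ("yes", "true", "on")
    if INT_RE.match(text):
        return int(text)
    if FLOAT_RE.match(text):
        return float(text)
    return text  # everything else stays a string


for raw in ["SE", "NO", "DK", "1.2.3.4", "1.2.3", "1.2"]:
    print(f"{raw!r:10} -> {resolve_plain_scalar(raw)!r}")
```

In this sketch `NO` matches the boolean pattern and `1.2` matches the float pattern, while `1.2.3` matches nothing and therefore survives as a string.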

None of this is possible in TOML, where strings are always quoted.

Verdict: TOML. Or always use explicit strings in YAML.

Safety

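YAML is unsafe when its features are not restricted: in some implementations, language-specific tags can make the parser construct arbitrary objects. Many implementations are also unsafe by default.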

TOML itself is safe from arbitrary code execution. That is not 100% insurance against a faulty implementation, but it is still much, much safer.

Verdict: TOML.

DRY

YAML has a nice feature, the anchor / reference mechanism:

db_tpl: &tpl
    engine: pgsql
    host: localhost
db_prod:
    <<: *tpl
    name: db_prod
db_test:
    <<: *tpl
    name: db_test

But! If your config requires you to repeat yourself often, maybe the config model is wrong? Maybe you are overconfiguring. Maybe you need something other than a config, like an actual DSL script.

Verdict: YAML. But does it really solve the problem or just hide it?

Data Structures

This will be weird to explain. YAML can express any structure of nested dictionaries and arrays.

TOML has a weird limitation. If we don't count inline arrays and inline dictionaries, there are only 2 structures:

  • [<dict><dict>...]<dict>

  • [<dict><dict>...]<array><dict>

This may be a painful limitation on your config, but again, maybe your config is overly complicated?

This, however, does not excuse the clumsy syntax for an array of dictionaries.

Verdict: YAML. Despite my rant here, it's the clear winner.

Implementations

Most YAML implementations are

  • unsafe by default on one side

  • not spec compliant on the other side

It's not the fault of the implementations themselves: the spec is both unsafe and too complicated, and some parts (like composite keys) are meaningless in most mainstream languages.

Most TOML implementations are

  • not spec compliant

  • outdated or missing

And it's a chicken and egg problem: TOML is unpopular because it lacks good implementations and it lacks good implementations because it's unpopular.

Verdict: neither.

So My Choice Is

To use TOML. To design simple configs.

However, the idea of inventing the 15th standard is still very tempting here.
