syntax-lab/src/languages/json/SYNTAX.md
2026-04-25 16:51:05 +02:00

JSON Syntax

This is a JSON-like concrete syntax used for parser and source span experiments.

The grammar starts from standard JSON and makes one deliberate change for experimentation: a Document may contain zero or more JSON values. Standard JSON requires exactly one top-level value; allowing a sequence makes it easier to test error recovery and to produce multiple independent syntax trees from one input.

Grammar

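This grammar is intentionally semi-formal. The productions do not mention Whitespace explicitly: it may appear before or after any Value and around the structural characters (braces, brackets, colons, and commas), but not inside a number or keyword.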

Document      ::= Value*

Value         ::= Object
                | Array
                | String
                | Number
                | "true"
                | "false"
                | "null"

Object        ::= "{" ObjectBody? "}"
ObjectBody    ::= Member ("," Member)*
Member        ::= String ":" Value

Array         ::= "[" ArrayBody? "]"
ArrayBody     ::= Value ("," Value)*

String        ::= '"' StringChar* '"'

StringChar    ::= UnescapedStringChar
                | Escape

Escape        ::= '\"'
                | "\\"
                | "\/"
                | "\b"
                | "\f"
                | "\n"
                | "\r"
                | "\t"
                | UnicodeEscape

UnicodeEscape ::= "\u" HexDigit HexDigit HexDigit HexDigit

Number        ::= "-"? Integer Fraction? Exponent?
Integer       ::= "0" | NonZeroDigit Digit*
Fraction      ::= "." Digit+
Exponent      ::= ("e" | "E") ("+" | "-")? Digit+

Digit         ::= "0" | "1" | "2" | "3" | "4"
                | "5" | "6" | "7" | "8" | "9"

NonZeroDigit  ::= "1" | "2" | "3" | "4"
                | "5" | "6" | "7" | "8" | "9"

HexDigit      ::= Digit | "a" ... "f" | "A" ... "F"

Whitespace    ::= " " | "\t" | "\n" | "\r"

Notes:

  • Document may be empty in this experiment.
  • Objects and arrays do not allow trailing commas.
  • Object keys must be strings.
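  • Strings use JSON escapes; raw newline and raw carriage return are not allowed inside strings. UnescapedStringChar is not spelled out in the grammar: it stands for a character that needs no escaping, so it excludes at least the double quote, the backslash, raw newline, and raw carriage return.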
  • Number syntax follows JSON number rules, so leading zeroes like 012 are invalid.
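
The grammar and notes above can be sketched as a small recursive-descent recognizer. This is an illustration, not the project's parser: every name in it (is_valid_document, parse_value, and so on) is hypothetical, and it assumes that an unescaped string character is anything except a double quote, a backslash, a raw newline, or a raw carriage return. It also assumes that a number or keyword must not be directly followed by another letter or digit.

```python
import re

WS = re.compile(r"[ \t\n\r]*")
NUMBER = re.compile(r"-?(?:0|[1-9][0-9]*)(?:\.[0-9]+)?(?:[eE][+-]?[0-9]+)?")
# Assumption: UnescapedStringChar excludes '"', '\', raw LF and raw CR only.
STRING = re.compile(r'"(?:[^"\\\n\r]|\\["\\/bfnrt]|\\u[0-9a-fA-F]{4})*"')
WORD = re.compile(r"[A-Za-z0-9_]+")


class SyntaxProblem(Exception):
    """Raised when the input does not match the grammar."""


def skip_ws(text, pos):
    return WS.match(text, pos).end()


def expect(pattern, text, pos, what):
    m = pattern.match(text, pos)
    if m is None:
        raise SyntaxProblem(f"expected {what} at offset {pos}")
    return m.end()


def parse_value(text, pos):
    """Recognize one Value at pos and return the offset just past it."""
    if pos >= len(text):
        raise SyntaxProblem(f"expected a value at offset {pos}")
    ch = text[pos]
    if ch == "{":
        return parse_sequence(text, pos, "}", parse_member)
    if ch == "[":
        return parse_sequence(text, pos, "]", parse_value)
    if ch == '"':
        return expect(STRING, text, pos, "string")
    if ch in "-0123456789":
        end = expect(NUMBER, text, pos, "number")
        # Reject fused input such as 01, 1., 1e and 123abc.
        if end < len(text) and (text[end].isalnum() or text[end] == "."):
            raise SyntaxProblem(f"malformed number at offset {pos}")
        return end
    word = WORD.match(text, pos)
    if word and word.group() in ("true", "false", "null"):
        return word.end()
    raise SyntaxProblem(f"expected a value at offset {pos}")


def parse_member(text, pos):
    """Member ::= String ":" Value"""
    pos = skip_ws(text, expect(STRING, text, pos, "string key"))
    if text[pos:pos + 1] != ":":
        raise SyntaxProblem(f"expected ':' at offset {pos}")
    return parse_value(text, skip_ws(text, pos + 1))


def parse_sequence(text, pos, close, parse_item):
    """Recognize an opener at pos, comma-separated items, then `close`."""
    pos = skip_ws(text, pos + 1)
    if text[pos:pos + 1] == close:
        return pos + 1  # empty object or array
    while True:
        pos = skip_ws(text, parse_item(text, pos))
        nxt = text[pos:pos + 1]
        if nxt == close:
            return pos + 1
        if nxt != ",":
            raise SyntaxProblem(f"expected ',' or {close!r} at offset {pos}")
        # After a comma another item is required, so trailing commas fail.
        pos = skip_ws(text, pos + 1)


def is_valid_document(text):
    """Document ::= Value*  (empty input is accepted)."""
    pos = skip_ws(text, 0)
    try:
        while pos < len(text):
            pos = skip_ws(text, parse_value(text, pos))
    except SyntaxProblem:
        return False
    return True
```

Because parse_sequence takes the item parser as an argument, objects and arrays share the same comma handling; only the item rule (Member or Value) and the closing delimiter differ.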

Document

A document is a sequence of zero or more values.

true
true false null
{"x": 1} [1, 2, 3]

Empty input is valid for this experiment.
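
Since this syntax is meant for source span experiments, it can help to see where each top-level value starts and ends. The sketch below leans on Python's standard json module rather than on this project's parser, so it only approximates the grammar: the standard decoder follows standard JSON for strings (it rejects every raw control character, not just newline and carriage return). The names split_document and _reject_constant are hypothetical.

```python
import json

WHITESPACE = " \t\n\r"


def _reject_constant(name):
    # The stdlib decoder accepts NaN and Infinity by default; this grammar does not.
    raise ValueError(f"not a value in this grammar: {name}")


_DECODER = json.JSONDecoder(parse_constant=_reject_constant)


def split_document(text):
    """Return (start, end, value) for each top-level value in text.

    Raises ValueError (json.JSONDecodeError is a subclass) on invalid input.
    """
    spans = []
    pos = 0
    while True:
        # raw_decode does not skip leading whitespace, so do it here.
        while pos < len(text) and text[pos] in WHITESPACE:
            pos += 1
        if pos == len(text):
            return spans
        value, end = _DECODER.raw_decode(text, pos)
        # Reject fused tokens such as `truefalse`, `01` or `123abc`.
        if end < len(text) and text[end - 1].isalnum() and text[end].isalnum():
            raise ValueError(f"unexpected character at offset {end}")
        spans.append((pos, end, value))
        pos = end
```

For the input `true false null` this yields the spans (0, 4), (5, 10), and (11, 15); for empty input it returns an empty list.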

Values

A value is one of:

  • object
  • array
  • string
  • number
  • true
  • false
  • null

Examples:

null
true
false
"hello"
123
{"name": "Ada"}
[1, 2, 3]

Objects

Objects use braces and contain zero or more comma-separated members.

{}
{"x": 1}
{"x": 1, "y": 2}
{"nested": {"ok": true}}

Members are string keys followed by a colon and a value:

"name": "Ada"

Invalid objects (unquoted key, missing colon, trailing comma, missing comma):

{x: 1}
{"x" 1}
{"x": 1,}
{"x": 1 "y": 2}

Arrays

Arrays use brackets and contain zero or more comma-separated values.

[]
[1]
[1, 2, 3]
[true, false, null]
[{"x": 1}, ["nested"]]

Trailing commas are invalid:

[1, 2,]

Missing separators are invalid:

[1 2]
[true false]

Repeated separators are invalid:

[1,, 2]

Strings

Strings use double quotes.

""
"hello"
"quote: \""
"slash: \\"
"unicode: \u03bb"

Valid escapes:

"\""
"\\"
"\/"
"\b"
"\f"
"\n"
"\r"
"\t"
"\u0041"

Invalid strings (unterminated, unknown escape, truncated Unicode escape, raw newline):

"unterminated
"bad escape: \x"
"bad unicode: \u12"
"raw
newline"

Numbers

Numbers follow JSON number syntax.

Valid numbers:

0
-0
12
-12
1.5
0.25
1e10
1E-10
-12.34e+56

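Invalid numbers (leading zero, bare minus sign, missing fraction digits, missing integer part, missing exponent digits, trailing letters):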

01
-
1.
.5
1e
1e+
123abc
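
The Number production translates almost directly into a regular expression. The sketch below uses named groups so the sign, integer, fraction, and exponent parts can be inspected separately; number_parts is a hypothetical helper name, and the pattern uses [0-9] rather than \d so that only ASCII digits match.

```python
import re

# Number ::= "-"? Integer Fraction? Exponent?
NUMBER = re.compile(
    r"(?P<sign>-)?"
    r"(?P<integer>0|[1-9][0-9]*)"       # "0", or a non-zero digit then digits
    r"(?P<fraction>\.[0-9]+)?"          # "." must be followed by a digit
    r"(?P<exponent>[eE][+-]?[0-9]+)?"   # exponent needs at least one digit
)


def number_parts(text):
    """Return the parts of a valid number as a dict, or None if invalid."""
    m = NUMBER.fullmatch(text)
    return m.groupdict() if m else None
```

Using fullmatch matters here: a plain match would accept the leading 0 of 01 or the 123 of 123abc and silently ignore the rest.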

Keywords

The only keywords are:

true
false
null

They are case-sensitive and must match exactly, with no extra characters attached.

Invalid keyword-like fragments:

True
FALSE
nil
nullish
truefalse
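
One way to get this behavior is to read the longest identifier-like run first and only then compare it against the keyword list. That longest-match rule is an assumption about how a lexer for this syntax might work, not something the grammar states, and match_keyword is a hypothetical name.

```python
import re

KEYWORDS = ("true", "false", "null")
# Longest run of identifier-like characters starting at the current position.
WORD = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def match_keyword(text, pos=0):
    """Return (keyword, end_offset) if a keyword starts at pos, else None."""
    m = WORD.match(text, pos)
    if m and m.group() in KEYWORDS:
        return m.group(), m.end()
    return None
```

Under this rule nullish is read as one word that is not a keyword, rather than as null followed by stray letters, and truefalse is rejected for the same reason.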

Delimiters

Objects and arrays must close with matching delimiters:

{"x": 1}
[1, 2, 3]

These are invalid because the closing delimiter is mismatched or missing:

{"x": 1]
[1, 2}
{"x": 1
[1, 2
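
Delimiter matching on its own can be checked with a stack of expected closers. The sketch below is a hypothetical helper (check_delimiters) that only checks bracket pairing, not the rest of the syntax; it skips over string contents so that brackets inside strings are not counted.

```python
PAIRS = {"{": "}", "[": "]"}


def check_delimiters(text):
    """Return None if delimiters pair up, else (offset, message) for the first problem."""
    stack = []  # entries are (expected closer, offset of the opener)
    i = 0
    while i < len(text):
        ch = text[i]
        if ch == '"':
            # Skip the string body; a backslash escapes the next character.
            i += 1
            while i < len(text) and text[i] != '"':
                i += 2 if text[i] == "\\" else 1
        elif ch in PAIRS:
            stack.append((PAIRS[ch], i))
        elif ch in "}]":
            if not stack:
                return (i, f"unexpected {ch!r}")
            expected, opened_at = stack.pop()
            if ch != expected:
                return (i, f"expected {expected!r} for opener at offset {opened_at}, found {ch!r}")
        i += 1
    if stack:
        expected, opened_at = stack[-1]
        return (len(text), f"missing {expected!r} for opener at offset {opened_at}")
    return None
```

Recording the opener's offset alongside the expected closer lets an error message point at both ends of the mismatch, which is useful when experimenting with source spans.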