diff --git a/src/languages/json/SYNTAX.md b/src/languages/json/SYNTAX.md
index e69de29..d1c133f 100644
--- a/src/languages/json/SYNTAX.md
+++ b/src/languages/json/SYNTAX.md
@@ -0,0 +1,273 @@
+# JSON Syntax
+
+This is a JSON-like concrete syntax used for parser and source span experiments.
+
+The starting point is standard JSON syntax, with one deliberate experiment-friendly choice: a `Document` may contain zero or more JSON values. Standard JSON allows exactly one top-level value, but allowing a sequence makes it easier to test recovery and multiple independent syntax trees.
+
+## Grammar
+
+This grammar is intentionally semi-formal. `Whitespace` may appear between the major syntactic parts shown below.
+
+```txt
+Document ::= Value*
+
+Value ::= Object
+        | Array
+        | String
+        | Number
+        | "true"
+        | "false"
+        | "null"
+
+Object ::= "{" ObjectBody? "}"
+ObjectBody ::= Member ("," Member)*
+Member ::= String ":" Value
+
+Array ::= "[" ArrayBody? "]"
+ArrayBody ::= Value ("," Value)*
+
+String ::= '"' StringChar* '"'
+
+StringChar ::= UnescapedStringChar
+             | Escape
+
+Escape ::= '\"'
+         | "\\"
+         | "\/"
+         | "\b"
+         | "\f"
+         | "\n"
+         | "\r"
+         | "\t"
+         | UnicodeEscape
+
+UnicodeEscape ::= "\u" HexDigit HexDigit HexDigit HexDigit
+
+Number ::= "-"? Integer Fraction? Exponent?
+Integer ::= "0" | NonZeroDigit Digit*
+Fraction ::= "." Digit+
+Exponent ::= ("e" | "E") ("+" | "-")? Digit+
+
+Digit ::= "0" | "1" | "2" | "3" | "4"
+        | "5" | "6" | "7" | "8" | "9"
+
+NonZeroDigit ::= "1" | "2" | "3" | "4"
+               | "5" | "6" | "7" | "8" | "9"
+
+HexDigit ::= Digit | "a" ... "f" | "A" ... "F"
+
+Whitespace ::= " " | "\t" | "\n" | "\r"
+```
+
+Notes:
+
+- `Document` may be empty in this experiment.
+- Objects and arrays do not allow trailing commas.
+- Object keys must be strings.
+- Strings use JSON escapes; raw newline and raw carriage return are not allowed inside strings.
+- Number syntax follows JSON number rules, so leading zeroes like `012` are invalid.
+
+## Document
+
+A document is a sequence of zero or more values.
+
+```json
+true
+true false null
+{"x": 1} [1, 2, 3]
+```
+
+Empty input is valid for this experiment.
+
+## Values
+
+A value is one of:
+
+- object
+- array
+- string
+- number
+- `true`
+- `false`
+- `null`
+
+Examples:
+
+```json
+null
+true
+false
+"hello"
+123
+{"name": "Ada"}
+[1, 2, 3]
+```
+
+## Objects
+
+Objects use braces and contain zero or more comma-separated members.
+
+```json
+{}
+{"x": 1}
+{"x": 1, "y": 2}
+{"nested": {"ok": true}}
+```
+
+Members are string keys followed by a colon and a value:
+
+```json
+"name": "Ada"
+```
+
+Invalid objects:
+
+```json
+{x: 1}
+{"x" 1}
+{"x": 1,}
+{"x": 1 "y": 2}
+```
+
+## Arrays
+
+Arrays use brackets and contain zero or more comma-separated values.
+
+```json
+[]
+[1]
+[1, 2, 3]
+[true, false, null]
+[{"x": 1}, ["nested"]]
+```
+
+Trailing commas are invalid:
+
+```json
+[1, 2,]
+```
+
+Missing separators are invalid:
+
+```json
+[1 2]
+[true false]
+```
+
+Repeated separators are invalid:
+
+```json
+[1,, 2]
+```
+
+## Strings
+
+Strings use double quotes.
+
+```json
+""
+"hello"
+"quote: \""
+"slash: \\"
+"unicode: \u03bb"
+```
+
+Valid escapes:
+
+```json
+"\""
+"\\"
+"\/"
+"\b"
+"\f"
+"\n"
+"\r"
+"\t"
+"\u0041"
+```
+
+Invalid strings:
+
+```json
+"unterminated
+"bad escape: \x"
+"bad unicode: \u12"
+"raw
+newline"
+```
+
+## Numbers
+
+Numbers follow JSON number syntax.
+
+Valid numbers:
+
+```json
+0
+-0
+12
+-12
+1.5
+0.25
+1e10
+1E-10
+-12.34e+56
+```
+
+Invalid numbers:
+
+```json
+01
+-
+1.
+.5
+1e
+1e+
+123abc
+```
+
+## Keywords
+
+The only keywords are:
+
+```json
+true
+false
+null
+```
+
+These must match exactly:
+
+```json
+true
+false
+null
+```
+
+Invalid keyword-like fragments:
+
+```json
+True
+FALSE
+nil
+nullish
+truefalse
+```
+
+## Delimiters
+
+Objects and arrays must close with matching delimiters:
+
+```json
+{"x": 1}
+[1, 2, 3]
+```
+
+These are invalid:
+
+```json
+{"x": 1]
+[1, 2}
+{"x": 1
+[1, 2
+```
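
The lexical rules in this file (strings, numbers, keywords, and delimiters) can be sketched as a small tokenizer that records source spans. The Python below is a minimal illustration and not part of this change: `tokenize`, the regex names, and the token kinds are hypothetical. It assumes a number or keyword is read as one maximal run of word characters, so fragments such as `01`, `123abc`, and `truefalse` are rejected rather than split into two adjacent values. It also assumes standard JSON's ban on raw control characters inside strings, since the grammar leaves `UnescapedStringChar` undefined. It checks tokens only; structural errors such as `{"x" 1}` or mismatched delimiters are left to a parser.

```python
import re

# Lexical rules transcribed from the grammar in SYNTAX.md.
NUMBER = re.compile(r"-?(?:0|[1-9][0-9]*)(?:\.[0-9]+)?(?:[eE][+-]?[0-9]+)?")
# Assumption: raw control characters (U+0000..U+001F) are excluded, as in
# standard JSON; the grammar itself does not define UnescapedStringChar.
STRING = re.compile(r'"(?:[^"\\\x00-\x1f]|\\["\\/bfnrt]|\\u[0-9a-fA-F]{4})*"')
# Assumption: numbers and keywords are scanned as one maximal run of these
# characters and then validated as a whole.
WORD = re.compile(r"[A-Za-z0-9+\-.]+")
KEYWORDS = {"true", "false", "null"}
PUNCTUATION = set("{}[]:,")
WHITESPACE = set(" \t\n\r")


def tokenize(source):
    """Return (kind, start, end) spans; raise ValueError on a lexical error."""
    tokens = []
    pos = 0
    while pos < len(source):
        ch = source[pos]
        if ch in WHITESPACE:
            pos += 1
            continue
        if ch in PUNCTUATION:
            tokens.append((ch, pos, pos + 1))
            pos += 1
            continue
        if ch == '"':
            match = STRING.match(source, pos)
            if match is None:
                raise ValueError(f"invalid string at offset {pos}")
            tokens.append(("string", pos, match.end()))
            pos = match.end()
            continue
        match = WORD.match(source, pos)
        if match is None:
            raise ValueError(f"unexpected character at offset {pos}")
        text = match.group()
        if text in KEYWORDS:
            kind = text
        elif NUMBER.fullmatch(text):
            kind = "number"
        else:
            raise ValueError(f"invalid token {text!r} at offset {pos}")
        tokens.append((kind, pos, match.end()))
        pos = match.end()
    return tokens
```

For example, `tokenize('[1, 2]')` returns `[('[', 0, 1), ('number', 1, 2), (',', 2, 3), ('number', 4, 5), (']', 5, 6)]`, where each span is a half-open `[start, end)` offset pair into the input.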