SYNTAX.md for Lisp

Yura Dupyn 2026-04-25 16:34:19 +02:00
parent c3edf193c4
commit a824e2d9e8


@ -0,0 +1,224 @@
# Lisp Syntax
This is a small Lisp-like concrete syntax used for parser and source span experiments.
## Grammar
This grammar is intentionally semi-formal. `Whitespace` may appear between the major syntactic parts shown below.
```txt
Program ::= Expr*
Expr ::= Identifier
| Number
| RoundList
| SquareList
RoundList ::= "(" Expr* ")"
SquareList ::= "[" SquareBody? "]"
SquareBody ::= LeadingComma? SquareItems? TrailingComma?
LeadingComma ::= ","
TrailingComma ::= ","
SquareItems ::= Expr ("," Expr)*
Number ::= Digit+
Identifier ::= IdentifierStart IdentifierPart*
IdentifierStart
::= AsciiLetter
| "_"
| "-"
IdentifierPart
::= IdentifierStart
| Digit
Digit ::= "0" | "1" | "2" | "3" | "4"
| "5" | "6" | "7" | "8" | "9"
AsciiLetter ::= "a" ... "z" | "A" ... "Z"
```
Notes:
- `Program` may be empty.
- `RoundList` elements are whitespace-separated only; commas have no special meaning there.
- `SquareList` elements require commas between neighboring expressions.
- `SquareList` permits at most one leading comma and at most one trailing comma.
- `foo(bar)` is allowed because it is parsed as two adjacent expressions: `foo` and `(bar)`.
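The grammar and notes above can be sketched as a small recursive-descent parser. This is a minimal illustration, not the project's implementation: the names `TOKEN_RE`, `tokenize`, `peek`, and the `parse_*` functions are hypothetical, `Whitespace` is assumed to be whatever Python's `\s` matches, and a number immediately followed by an identifier character (such as `123abc`) is assumed to be an error. The grammar as written also derives `[,]` and `[,,]`; this sketch accepts the first and rejects the second, which is an arbitrary choice.

```python
import re

# Assumed lexical rules: one alternative per token kind in the grammar.
TOKEN_RE = re.compile(
    r"(?P<ws>\s+)"
    r"|(?P<number>[0-9]+)(?![A-Za-z0-9_-])"
    r"|(?P<identifier>[A-Za-z_-][A-Za-z0-9_-]*)"
    r"|(?P<punct>[()\[\],])"
)

def tokenize(source):
    """Return (kind, text) pairs, skipping whitespace."""
    tokens, pos = [], 0
    while pos < len(source):
        m = TOKEN_RE.match(source, pos)
        if m is None:
            raise SyntaxError(f"unexpected input at offset {pos}")
        if m.lastgroup != "ws":
            tokens.append((m.lastgroup, m.group()))
        pos = m.end()
    return tokens

def peek(tokens, pos):
    """Text of the token at pos, or None at end of input."""
    return tokens[pos][1] if pos < len(tokens) else None

def parse_program(source):
    """Program ::= Expr*"""
    tokens = tokenize(source)
    exprs, pos = [], 0
    while pos < len(tokens):
        expr, pos = parse_expr(tokens, pos)
        exprs.append(expr)
    return exprs

def parse_expr(tokens, pos):
    """Expr ::= Identifier | Number | RoundList | SquareList"""
    if pos >= len(tokens):
        raise SyntaxError("unexpected end of input")
    kind, text = tokens[pos]
    if kind in ("number", "identifier"):
        return (kind, text), pos + 1
    if text == "(":
        return parse_round(tokens, pos + 1)
    if text == "[":
        return parse_square(tokens, pos + 1)
    raise SyntaxError(f"unexpected token {text!r}")

def parse_round(tokens, pos):
    """RoundList ::= "(" Expr* ")"  (the "(" is already consumed)"""
    items = []
    while peek(tokens, pos) != ")":
        # parse_expr raises on ",", "]", or end of input.
        item, pos = parse_expr(tokens, pos)
        items.append(item)
    return ("round", items), pos + 1

def parse_square(tokens, pos):
    """SquareList ::= "[" SquareBody? "]"  (the "[" is already consumed)"""
    items = []
    if peek(tokens, pos) == ",":  # optional leading comma
        pos += 1
    while peek(tokens, pos) != "]":
        item, pos = parse_expr(tokens, pos)
        items.append(item)
        if peek(tokens, pos) == ",":  # separator or trailing comma
            pos += 1
        elif peek(tokens, pos) != "]":
            raise SyntaxError("expected ',' or ']'")
    return ("square", items), pos + 1
```

With these definitions, `parse_program("foo(bar)")` and `parse_program("foo (bar)")` produce the same two-expression result, and inputs such as `[foo bar]` or `(foo]` raise `SyntaxError`.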
## Program
A program is a sequence of zero or more expressions.
```lisp
foo
foo 123
foo(bar)
(foo 1 2) [bar, 3, baz]
```
Empty input is valid.
Whitespace may appear between expressions, but it is not required when the next expression starts with a delimiter:
```lisp
foo(bar)
```
is treated like:
```lisp
foo (bar)
```
## Expressions
An expression is one of:
- identifier
- number
- round list
- square list
## Identifiers
Identifiers must start with one of:
- an ASCII letter
- `_`
- `-`
After the first code point, identifiers may contain:
- ASCII letters
- digits
- `_`
- `-`
Examples:
```lisp
foo
abc123
abc_123
name-with-dash
_private
-operator-like
```
Not identifiers:
```lisp
123abc
@foo
💥
```
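As a rough check of the rules above, the identifier shape can be written as a single regular expression. The names `IDENTIFIER_RE` and `is_identifier` are hypothetical and only illustrate the rule; they are not part of this project.

```python
import re

# First character: ASCII letter, "_" or "-"; later characters may also be digits.
IDENTIFIER_RE = re.compile(r"[A-Za-z_-][A-Za-z0-9_-]*")

def is_identifier(text):
    """True if the whole string matches the Identifier rule."""
    return IDENTIFIER_RE.fullmatch(text) is not None
```

The character classes are spelled out instead of using `\w`, because `\w` in Python also matches non-ASCII letters and digits, which the grammar excludes.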
## Numbers
Numbers are non-empty sequences of ASCII digits.
Examples:
```lisp
0
1
123
00123
```
No signs, decimals, exponents, separators, or non-ASCII digits are supported.
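These are not valid numbers. Note that `-1` instead matches the `Identifier` rule, because `-` is an identifier start and digits are identifier parts: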
```lisp
-1
1.2
1e5
123abc
```
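A minimal sketch of the number rule, assuming a hypothetical helper named `is_number`. It uses an explicit `[0-9]` class because Python's `str.isdigit()` and the regex `\d` both accept non-ASCII digits, which this syntax does not support.

```python
import re

# ASCII digits only; leading zeros are allowed, signs and dots are not.
NUMBER_RE = re.compile(r"[0-9]+")

def is_number(text):
    """True if the whole string is a non-empty run of ASCII digits."""
    return NUMBER_RE.fullmatch(text) is not None
```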
## Round Lists
Round lists use parentheses and contain zero or more expressions.
```lisp
()
(foo)
(foo 1 2)
(foo (bar 1) baz)
```
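Round lists do not use separators; elements are delimited by whitespace or by list delimiters. A comma is not a separator here, and the grammar above has no rule that admits one inside parentheses.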
## Square Lists
Square lists use brackets and require commas between neighboring elements.
```lisp
[]
[foo]
[foo, bar]
[foo, 1, (bar 2)]
```
Square lists allow one optional leading comma:
```lisp
[,foo]
[,foo, bar]
```
Square lists allow one optional trailing comma:
```lisp
[foo,]
[foo, bar,]
```
Leading and trailing commas can be combined:
```lisp
[,foo, bar,]
```
Repeated leading commas are invalid:
```lisp
[, , foo]
```
Missing separators between neighboring elements are invalid:
```lisp
[foo bar]
[foo, bar baz]
```
Repeated separators after an element are invalid:
```lisp
[foo,, bar]
[foo, bar,,]
```
## Delimiters
Lists must be closed with the matching delimiter:
```lisp
(foo)
[foo]
```
These are invalid:
```lisp
(foo]
[foo)
(foo
[foo
```
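The matching rule can be checked on its own with a stack of open delimiters. The sketch below is an illustration under assumptions: `CLOSER_FOR` and `check_delimiters` are hypothetical names, and scanning raw characters is only safe because the grammar above has no string literals or comments that could contain brackets. It reports a character offset, which may be useful for source span experiments.

```python
# Maps each opening delimiter to the closer it requires.
CLOSER_FOR = {"(": ")", "[": "]"}

def check_delimiters(source):
    """Return None if delimiters match, else (offset, message) for the first problem."""
    stack = []  # (opener, offset) pairs still waiting for their closer
    for offset, ch in enumerate(source):
        if ch in CLOSER_FOR:
            stack.append((ch, offset))
        elif ch in ")]":
            if not stack:
                return offset, f"unexpected {ch!r}"
            opener, opened_at = stack.pop()
            if CLOSER_FOR[opener] != ch:
                return offset, f"{opener!r} opened at offset {opened_at} closed by {ch!r}"
    if stack:
        # Report the innermost list that was never closed.
        opener, opened_at = stack[-1]
        return opened_at, f"{opener!r} opened at offset {opened_at} is never closed"
    return None
```

For a mismatched closer such as `(foo]`, the reported offset points at the closer; for an unclosed list such as `(foo`, it points at the opener.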