Update the ABNF rules for single-line comments #32

Open
chirsz-ever wants to merge 1 commit into JSONC-org:main from chirsz-ever:chirsz/ls-ps

@chirsz-ever
Contributor

The advantages of this modification are:

  • It better conforms to the definition of a single-line comment in ECMA-262, making it more concise.
  • It supports single-line comments ending with EOF in a natural way.
  • It can reject LS or PS characters appearing in single-line comments.

The corresponding problem is that it requires greedy matching, which is not specified in the ABNF specification (RFC 5234).
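To illustrate why greedy matching matters here, the following is a minimal sketch using Python regular expressions as a stand-in for an ABNF repetition. The two patterns are assumptions made for illustration (a rule shaped like "//" followed by any number of non-line-terminator characters); they are not the grammar text from this PR.

```python
import re

# Hypothetical regex stand-ins for a rule shaped like:
#   "//" *<any character that is not LF, CR, LS (U+2028) or PS (U+2029)>
GREEDY = re.compile(r"//[^\n\r\u2028\u2029]*")   # repetition takes as much as it can
LAZY = re.compile(r"//[^\n\r\u2028\u2029]*?")    # repetition takes as little as it can


def match_comment(pattern, text):
    """Return the text matched at the start of `text`, or None if nothing matches."""
    m = pattern.match(text)
    return m.group(0) if m else None


# Greedy repetition consumes the whole comment, including one that ends at EOF.
greedy_eof = match_comment(GREEDY, "// trailing comment")
# Non-greedy repetition is satisfied by zero characters and stops right after "//".
lazy_eof = match_comment(LAZY, "// trailing comment")
# The comment stops before a line terminator; the terminator is not part of it.
greedy_lf = match_comment(GREEDY, "// note\n{}")
greedy_ls = match_comment(GREEDY, "// a\u2028b")
```

Without the longest-match assumption, the rule is ambiguous: both the greedy and the lazy result are valid matches of the same repetition, but only the greedy one leaves the input at a sensible position for the next token.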

The JSON specification does not require greedy matching; ECMA-262 explicitly provides a description equivalent to greedy matching [1]:

> The source text of an ECMAScript Script or Module is first converted into a sequence of input elements, which are tokens, line terminators, comments, or white space. The source text is scanned from left to right, repeatedly taking the longest possible sequence of code points as the next input element.

After running npm run railroad, the generated grammar/railroad-diagram.html appears to have font differences compared to the current version on jsonc.org, so I have not submitted the corresponding changes.

Footnotes

  1. https://262.ecma-international.org/16.0/index.html?_gl=1*1bsegw2*_ga*MTYzMTg1NjA4OS4xNzc2ODUwNjM4*_ga_TDCK4DWEPP*czE3NzY4NTA2MzgkbzEkZzAkdDE3NzY4NTA2MzgkajYwJGwwJGgw#sec-ecmascript-language-lexical-grammar

Specify that greedy mode should be used.

Resolve cases where single-line comments end with EOF, LS or PS.
@DecimalTurn
Member

DecimalTurn commented May 17, 2026

> The corresponding problem is that it requires greedy matching, which is not specified in the ABNF specification (RFC 5234).

It's not part of the spec, but all the practical implementations of ABNF-based parsers I've seen make this assumption. It's likely the only sane approach.

Hence, this is unnecessary in the README.md for the grammar:

> ABNF (RFC 5234) does not specify whether greedy matching should be used for repetitive syntax. However, greedy matching is mandatory in the single-line-comment rule to ensure correct behavior.

@DecimalTurn
Member

Regarding the simplicity of having no single-line-comment-end rule: it does make things simpler, which is nice, but we lose the semantics of LF and CR being valid line-ending characters, as opposed to PS and LS, which are invalid.

This lost distinction only surfaces in the parser, which will fail the parse because PS and LS can't be the start of the next token, while LF and CR can.

Aside from simplicity, the upside of your approach is that, in terms of tokenization, the line ending gets tokenized as regular whitespace, just like the other whitespace in the document, instead of having a separate ad hoc tokenization for that specific whitespace at the end of a single-line comment.
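The behavior described above can be sketched as follows. This is a hypothetical illustration, not code from any JSONC implementation: the names skip_trivia, starts_value and VALUE_START are made up for this example, and it assumes the comment rule stops before any of LF, CR, LS or PS and leaves the terminator for the ordinary whitespace handling.

```python
JSON_WHITESPACE = " \t\n\r"             # whitespace characters allowed by JSON
LINE_TERMINATORS = "\n\r\u2028\u2029"   # LF, CR, LS, PS
VALUE_START = set('{["-0123456789tfn')  # characters that can begin a JSON value


def skip_trivia(text: str, pos: int = 0) -> int:
    """Skip whitespace and single-line comments; return the index of the next token."""
    while pos < len(text):
        if text[pos] in JSON_WHITESPACE:
            pos += 1
        elif text.startswith("//", pos):
            pos += 2
            # Greedy: consume every character that is not a line terminator.
            # The terminator itself is left for the whitespace branch above.
            while pos < len(text) and text[pos] not in LINE_TERMINATORS:
                pos += 1
        else:
            break
    return pos


def starts_value(text: str) -> bool:
    """Report whether a JSON value can start after the leading trivia."""
    pos = skip_trivia(text)
    return pos < len(text) and text[pos] in VALUE_START


after_lf = starts_value("// c\n{}")       # LF is plain whitespace, "{" follows
after_crlf = starts_value("// c\r\n[1]")  # CR and LF are plain whitespace
after_ls = starts_value("// c\u2028{}")   # LS is not whitespace and starts no token
eof_pos = skip_trivia("// only comment")  # comment ending at EOF needs no special case
```

In this sketch there is no dedicated comment-end rule: LF and CR are accepted only because they are ordinary whitespace, and LS or PS are rejected only because nothing in the grammar can start with them.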

All things considered, simplicity is probably the best approach here.

@DecimalTurn
Member

@chirsz-ever, would you like this to be merged in the initial release (v0.1) or in the future v1.0-rc1 release?

If you want it to be in the initial release, the exclusion for PS and LS can't be there, since pretty much no implementation of JSONC supports this exclusion. We can only make that change in v1.

@chirsz-ever
Contributor Author

Do you have any plans or roadmap for versions of the JSONC specification? I haven't found any descriptions of possible different versions (v0.1, v1.0-rc1, etc.) on the website or in the repository.

You are welcome to directly modify the content of this PR or push a version unrelated to LS and PS to the main branch.

My suggestion is to add trailing commas and reject LS and PS characters in single-line comments all at once in a final version. The reality is that most popular parsers support both comments and trailing commas, while LS and PS in single-line comments are extremely rare edge cases that almost never occur in practice. Therefore, my suggested approach should not actually cause many problems.

Differences between versions can be confusing, which I think is why Douglas Crockford decided not to version JSON and never change the syntax. Perhaps we should also adhere to the same principle.
