Skip to content

Conversation

@liancheng
Copy link

@liancheng liancheng commented Dec 22, 2025

TL;DR: This PR fixes a correctness issue where the current grammar fails to parse certain legal Jsonnet object comprehensions.

Note

The first commit contains the major change, while the second commit contains changes from running tree-sitter generate.

When parsing an object comprehension, the current grammar cannot handle the following legal cases:

  • Field with a trailing comma:

    {
      ['f' + i]: i,
      for i in std.range(1, 10)
    }
    (document
      (object
        (member
          (field
            (fieldname
              (binary
                left: (string
                  (string_start)
                  (string_content)
                  (string_end))
                operator: (additive)
                right: (id)))
            (id)))
        (ERROR
          (fieldname
            (id))
          (ERROR)
          (ERROR))))
    

    What makes it worse, by default, jsonnetfmt adds a trailing comma when the field is on a separate line.

  • Field with a leading object local binding:

    {
      local j = 1,
      ['f' + i]: i + j,
      for i in std.range(1, 10)
    }
    (document
      (object
        (member
          (objlocal
            (local)
            (bind
              (id)
              (number))))
        (member
          (field
            (fieldname
              (binary
                left: (string
                  (string_start)
                  (string_content)
                  (string_end))
                operator: (additive)
                right: (id)))
            (binary
              left: (id)
              operator: (additive)
              right: (id))))
        (ERROR
          (fieldname
            (id))
          (ERROR)
          (ERROR))))
    
  • Field with a leading assertion:

    {
      assert true,
      ['f' + i]: i + j
      for i in std.range(1, 10)
    }
    (document
      (object
        (member
          (assert
            (true)))
        (member
          (field
            (fieldname
              (binary
                left: (string
                  (string_start)
                  (string_content)
                  (string_end))
                operator: (additive)
                right: (id)))
            (binary
              left: (id)
              operator: (additive)
              right: (id))))
        (ERROR)
        (ERROR)))
    

This is because the field with a trailing comma, the leading object local binding, and the leading assertion always match commaSep($.member, true) branch, making it impossible to match the $.objforloop branch.

Following the Jsonnet specification, a Jsonnet object comprehension allows:

  1. Exactly one field, whose field name must be computed.
  2. Zero or more leading and following object local bindings and assertions around the field.
  3. An optional comma after the field when there is no following object local bindings or assertions.

Unfortunately, all these combined make building a grammar strictly following the specification non-trivial, and lead to convoluted grammar rules.

Note

The current grammar does not strictly follow the specification either, as it allows static field names in object comprehensions, which is illegal, e.g.:

{
  f: i
  for i in std.range(1, 10)
}

This PR suggests a fix with a more relaxed but much simpler grammar. Basically, an objinside is now 1 or more members followed by optional forspec and compspec. However, it comes with the following drawbacks, which are arguably acceptable:

  • It allows multiple fields in an object comprehension.

    For example, the new grammar does not fail the following illegal document:

    {
      ['x' + i]: i + j,
      ['x' + i]: i + j,
      for i in std.range(1, 10)
    }
  • For object comprehensions the current grammar parses, this PR removes the objforloop node, which is a somewhat backward-incompatible change. However, it does not affect any existing queries.

    Given the following document:

    {
      ['item' + i]: i
      for i in std.range(0,10)
    }

    Before this PR:

    (document
      (object
        (objforloop
          (field
            (fieldname
              (binary
                left:
                  (string
                    (string_start)
                    (string_content)
                    (string_end))
                operator: (additive)
                right: (id)))
            (id))
          (forspec
            (id)
            (functioncall
              (fieldaccess (id) last: (id))
              (args (number) (number)))))))
    

    After this PR:

    (document
      (object
        (member
          (field
            (fieldname
              (binary
                left:
                  (string
                    (string_start)
                    (string_content)
                    (string_end))
                operator: (additive)
                right: (id)))
            (id)))
        (forspec
          (id)
          (functioncall
            (fieldaccess (id) last: (id))
            (args (number) (number))))))
    

    Note that the objforloop node is replaced by a member node, and the child forspec node is lifted as a sibling of the member node.

@liancheng liancheng marked this pull request as ready for review December 22, 2025 22:01
@liancheng
Copy link
Author

liancheng commented Dec 22, 2025

I personally hit production Jsonnet documents containing object comprehension expressions with leading object local bindings. Although this is an imperfect fix, it does parse all legal object comprehension expressions.

Given the following Jsonnet document:

{
  ['x' + i]: i,
  for i in std.range(1, 10)
}

Here is how tree-sitter parses it in Neovim before and after this PR:

  • Before:

    image
  • After:

    image

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant