Description
Did you check existing issues?
- I have read all the tree-sitter docs if it relates to using the parser
- I have searched the existing issues of tree-sitter-python
Tree-Sitter CLI Version, if relevant (output of tree-sitter --version)
tree-sitter 0.25.0
Describe the bug
When parsing Python code, comments placed directly below a function definition end up at different positions in the tree, depending on whether the function has a docstring.
Here is an example where the parsing works as expected.
The function func_2 takes no arguments and has a three-line docstring. The function body starts with a three-line comment, followed by a global statement:
def func_2():
    '''
    a function with no specific return type and no parameters
    '''
    #
    # assign a global variable
    #
    global glbStrVar1

As expected, the three comment lines appear after the expression_statement that holds the docstring. Their positions are marked with ">". They are children of the block element:
function_definition [Point(row=136, column=0) - Point(row=182, column=20)]
  def [Point(row=136, column=0) - Point(row=136, column=3)]
  identifier [Point(row=136, column=4) - Point(row=136, column=10)]
  parameters [Point(row=136, column=10) - Point(row=136, column=12)]
    ( [Point(row=136, column=10) - Point(row=136, column=11)]
    ) [Point(row=136, column=11) - Point(row=136, column=12)]
  : [Point(row=136, column=12) - Point(row=136, column=13)]
  block [Point(row=137, column=4) - Point(row=182, column=20)]
    expression_statement [Point(row=137, column=4) - Point(row=139, column=7)]
      string [Point(row=137, column=4) - Point(row=139, column=7)]
        string_start [Point(row=137, column=4) - Point(row=137, column=7)]
        string_content [Point(row=137, column=7) - Point(row=139, column=4)]
        string_end [Point(row=139, column=4) - Point(row=139, column=7)]
>   comment [Point(row=141, column=4) - Point(row=141, column=5)]
>   comment [Point(row=142, column=4) - Point(row=142, column=30)]
>   comment [Point(row=143, column=4) - Point(row=143, column=5)]
    global_statement [Point(row=145, column=4) - Point(row=145, column=21)]
      global [Point(row=145, column=4) - Point(row=145, column=10)]
      identifier [Point(row=145, column=11) - Point(row=145, column=21)]
Now the strange part: the function func_3 also takes no arguments, but it has no docstring. The function body starts with a three-line comment, followed by an assignment:
def func_3():
    #
    # call a function
    #
    typedDefaultParameter = func_1(typedDefaultParameter)

Parsing this code produces a concrete syntax tree whose element order I did not expect. The positions of the comment lines are again marked with ">". The comments are now siblings of the block element and precede it, instead of being children of the block that forms the function body:
function_definition [Point(row=184, column=0) - Point(row=193, column=17)]
  def [Point(row=184, column=0) - Point(row=184, column=3)]
  identifier [Point(row=184, column=4) - Point(row=184, column=10)]
  parameters [Point(row=184, column=10) - Point(row=184, column=12)]
    ( [Point(row=184, column=10) - Point(row=184, column=11)]
    ) [Point(row=184, column=11) - Point(row=184, column=12)]
  : [Point(row=184, column=12) - Point(row=184, column=13)]
> comment [Point(row=186, column=4) - Point(row=186, column=5)]
> comment [Point(row=187, column=4) - Point(row=187, column=21)]
> comment [Point(row=188, column=4) - Point(row=188, column=5)]
  block [Point(row=190, column=4) - Point(row=193, column=17)]
    expression_statement [Point(row=190, column=4) - Point(row=190, column=57)]
      assignment [Point(row=190, column=4) - Point(row=190, column=57)]
        identifier [Point(row=190, column=4) - Point(row=190, column=25)]
        = [Point(row=190, column=26) - Point(row=190, column=27)]
        call [Point(row=190, column=28) - Point(row=190, column=57)]
          identifier [Point(row=190, column=28) - Point(row=190, column=34)]
          argument_list [Point(row=190, column=34) - Point(row=190, column=57)]
            ( [Point(row=190, column=34) - Point(row=190, column=35)]
            identifier [Point(row=190, column=35) - Point(row=190, column=56)]
            ) [Point(row=190, column=56) - Point(row=190, column=57)]
Steps To Reproduce/Bad Parse Tree
- Take the function definition snippets for func_2 and func_3.
- Parse the code into a tree representation.
- Observe where the three comment lines appear in each tree.
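The steps above can be sketched with the Python bindings. This is only a minimal sketch: it assumes that recent releases of the `tree_sitter` and `tree_sitter_python` packages are installed, and the report does not state which binding produced the trees shown here. The helper names `parse_python` and `dump` are hypothetical and exist only for this illustration.

```python
# func_3 from this report: no docstring, body starts with a comment.
SOURCE = b'''def func_3():
    #
    # call a function
    #
    typedDefaultParameter = func_1(typedDefaultParameter)
'''


def parse_python(source: bytes):
    """Parse Python source and return the root node.

    Assumption: the py-tree-sitter bindings and the tree-sitter-python
    grammar package are available. They are imported lazily so that
    `dump` can be used without them.
    """
    import tree_sitter_python
    from tree_sitter import Language, Parser

    parser = Parser(Language(tree_sitter_python.language()))
    return parser.parse(source).root_node


def dump(node, depth=0):
    """Return one indented line per node; comment nodes are marked with '>'."""
    marker = ">" if node.type == "comment" else " "
    indent = "  " * depth
    lines = [f"{marker} {indent}{node.type} [{node.start_point} - {node.end_point}]"]
    for child in node.children:
        lines.extend(dump(child, depth + 1))
    return lines


# Usage (requires the packages named above):
#   print("\n".join(dump(parse_python(SOURCE))))
```

`dump` only relies on the `type`, `children`, `start_point` and `end_point` attributes of a node, so the indentation of each printed line shows directly whether a comment is a child of the block or a sibling of it.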
Expected Behavior/Parse Tree
The comment, which is meant to describe the first statement within the block, should appear as a child of the block element, regardless of whether a leading docstring exists, as shown here:
function_definition [Point(row=184, column=0) - Point(row=193, column=17)]
  def [Point(row=184, column=0) - Point(row=184, column=3)]
  identifier [Point(row=184, column=4) - Point(row=184, column=10)]
  parameters [Point(row=184, column=10) - Point(row=184, column=12)]
    ( [Point(row=184, column=10) - Point(row=184, column=11)]
    ) [Point(row=184, column=11) - Point(row=184, column=12)]
  : [Point(row=184, column=12) - Point(row=184, column=13)]
  block [Point(row=190, column=4) - Point(row=193, column=17)]
>   comment [Point(row=186, column=4) - Point(row=186, column=5)]
>   comment [Point(row=187, column=4) - Point(row=187, column=21)]
>   comment [Point(row=188, column=4) - Point(row=188, column=5)]
    expression_statement [Point(row=190, column=4) - Point(row=190, column=57)]
      assignment [Point(row=190, column=4) - Point(row=190, column=57)]
        identifier [Point(row=190, column=4) - Point(row=190, column=25)]
        = [Point(row=190, column=26) - Point(row=190, column=27)]
        call [Point(row=190, column=28) - Point(row=190, column=57)]
          identifier [Point(row=190, column=28) - Point(row=190, column=34)]
          argument_list [Point(row=190, column=34) - Point(row=190, column=57)]
            ( [Point(row=190, column=34) - Point(row=190, column=35)]
            identifier [Point(row=190, column=35) - Point(row=190, column=56)]
            ) [Point(row=190, column=56) - Point(row=190, column=57)]
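The expected behaviour can also be stated as a check on the tree. The following is a rough sketch, not part of the grammar or its test suite: `misplaced_comments` is a hypothetical helper, and it assumes only that nodes expose `type` and `children`, as the nodes in the trees above appear to.

```python
def misplaced_comments(func_node):
    """Return comment nodes that are direct children of a function_definition
    and sit between its ':' token and its body block."""
    seen_colon = False
    found = []
    for child in func_node.children:
        if child.type == ":":
            seen_colon = True
        elif child.type == "block":
            # Everything from here on belongs to the function body.
            break
        elif seen_colon and child.type == "comment":
            found.append(child)
    return found
```

For the func_3 tree reported above this would return the three comment nodes; for the expected tree, and for func_2, it would return an empty list.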
Repro
# Please see the first section for code examples. The row numbers in the tree representation may differ when reproducing, because the snippets were cut from a larger real-world file.