-
Notifications
You must be signed in to change notification settings - Fork 40
Description
Sometimes, it would be very convenient to implement a parser using left recursion, like in this example:
number = digit.at_least(1).map(''.join).map(int)
@generate
def expr():
lhs = yield expr
op = yield string('+') | string('-')
rhs = yield number
if op == '+':
return lhs + rhs
else:
return lhs - rhs
expr |= numberNote that this cannot be easily rewritten to use number +/- expr, because we need to parse 1 - 2 - 3 as (1 - 2) - 3 and not 1 - (2 - 3).
This currently doesn't work because calling expr inside of expr causes infinite recursion. The proper way to implement this is to unwrap the recursion manually, i.e. to use an explicit loop that consumes and adds/subtracts a sequence of numbers. However, this is much less convenient than implementing 'the mental model' directly.
This RFC proposes for parsy to be able to do such a transformation automatically.
A proof-of-concept (barely readable) implementation is here, and here's an explanation in pseudocode (that skips over many small details):
recursion_detected = False
parser_is_running = False
saved_result = None
def invoke_parser():
global parser_is_running, recursion_detected, saved_result
if not parser_is_running:
# we are the first invocation of this parser at this index
parser_is_running = True
res = normal_logic()
else:
# we are the second left-recursive invocation
recursion_detected = True
return saved_result or fail('recursion detected')
# we are the first invocation again
if not recursion_detected or res.is_fail():
parser_is_running = False # and other cleanup
return res
while True:
saved_result = res
res = normal_logic()
if res.is_fail():
parser_is_running = False # and other cleanup
return saved_resultThis works as if the recursion magically failed at just the right depth, giving the alternative sub-parsers (number in the example above) a chance to try their logic.
Unresolved questions:
- Should this be implemented for every parser, or should this be an explicit wrapper? If latter, do we call it
parsy.recursion_catcher(),Parser.left_recursive()or what? - Does this work for cases that are more complex than a simple left recursion in one of the parsers?
- How do we make it thread- (and multiple independent invocation-) safe?