Even smaller toke #17241
@arc, this would be a good branch for smoke-testing on non-Linux systems. But without Thank you very much.
Does this fix the recursion bug introduced in the last toke update? See #17220
I couldn't make this crash with the #17220 code, but assuming the recursion will be converted to iteration seems dangerous to me. Reviewing the code, it still looks like it's recursing.
8214886 to e51b401
@jkeenan I'm not aware of a way to rename a branch without breaking any extant pull request pointing to it, so I've pushed an additional copy of this branch to `smoke-me/arc/smaller-toke-bis`.
On 11/1/19 11:51 AM, Aaron Crane wrote:
@jkeenan <https://github.com/jkeenan> I'm not aware of a way to rename a
branch without breaking any extant pull request pointing to it, so I've
pushed an additional copy of this branch to |smoke-me/arc/smaller-toke-bis|.
Yes, git and github have limitations around re-naming of remote
branches. Thanks for the branch; I've started smoking it.
jimk
@arc, this is really impressive. Nice work.
This sounds excellent :-)
It's still recursing deeply:
That's with "1\n" followed by many spaces followed by "\n" (and I didn't try to see how deep it went with my input)
Sorry, I thought I had the updated PR checked out. I can't reproduce the recursion any more. Sorry about the noise.
This removes a goto label.
With the removal of another goto label!
This permits some additional pleasing simplifications.
I introduced these parameters as part of mechanically refactoring goto-heavy logic into subroutines. However, they aren't actually needed through most of the code. Even in the recursive case (in which yyl_try() or one of its callees will call itself), we can reset the variables to zero.
This makes calls to it much easier to understand.
I thought I was going to end up using this for more stuff, but I've found better approaches. This commit also removes two more goto targets.
With this commit, yyl_try() has few enough arguments that the RETRY() macro no longer serves any useful purpose; delete it too.
There's exactly one place where we need to consult it (and that only for producing good error messages in a specific group of term-after-term situations). The reason for passing it around was so that it could be reset to false early on in the process of lexing a token, while then allowing the three separate cases that might need to set it true to do so independently. Instead, centralise the logic of determining when it needs to be true.
I tagged Dave just because he's done some lexer work in the past, and expressed some interest at P5H. But I think this has had enough positive comments now, so I'm going to rebase and merge.
The downside of writing these calls recursively is that not all compilers will compile the tail-position calls as jumps; that's especially true in earlier versions of this refactoring process (where yyl_try() took a large number of arguments), but it's not in general something we can expect to happen, especially in the presence of `-O0` or similar compiler options. This can lead to call-stack overflow in some circumstances.
Most recursive calls to yyl_try() occur within yyl_try() itself, so we can easily replace them with an explicit `goto` (which is what most compilers would use for the recursive calls anyway, now that yyl_try() takes ≤3 parameters).
There are only two other recursive-call cases. One is yyl_fake_eof(), which as far as I can tell is never called repeatedly within a single file; this seems safe. The other is yyl_eol(). It has exactly two distinct return paths, so this commit moves the retry logic into its yyl_try() caller.
With this change, we no longer seem to trigger call-stack overflow.
Closes #17220
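For readers unfamiliar with the pattern, here is a minimal sketch of the transformation that commit message describes. It is not taken from toke.c: the `struct lexer`, `enum tok`, `next_token_recursive()`, `next_token_iterative()`, `skip_eol()` and `count_words()` names are all hypothetical, and the "lexer" only splits on whitespace. It shows a tail-position retry written as a recursive call, the same retry written as an explicit `goto`, and a helper with two outcomes that reports "retry" to its caller rather than calling back into the lexer itself.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical token kinds for this sketch; not Perl's real token set. */
enum tok { TOK_EOF, TOK_WORD };

/* Hypothetical lexer state: a cursor into a NUL-terminated buffer. */
struct lexer { const char *p; };

/* Recursive form: every skipped whitespace character is a tail call.
 * If the compiler does not turn the call into a jump (for example at
 * -O0), stack depth grows with the length of the whitespace run. */
enum tok next_token_recursive(struct lexer *lx)
{
    char c = *lx->p;
    if (c == '\0')
        return TOK_EOF;
    if (c == ' ' || c == '\t' || c == '\n') {
        lx->p++;
        return next_token_recursive(lx);    /* tail-position retry */
    }
    while (*lx->p && *lx->p != ' ' && *lx->p != '\t' && *lx->p != '\n')
        lx->p++;
    return TOK_WORD;
}

/* Helper with exactly two outcomes: it either consumes a newline and
 * asks the caller to retry (true), or does nothing (false). It never
 * calls back into the lexer, so it adds no recursion of its own. */
bool skip_eol(struct lexer *lx)
{
    if (*lx->p != '\n')
        return false;
    lx->p++;
    return true;
}

/* Iterative form: the retry is an explicit goto, so stack depth stays
 * constant regardless of input or optimisation level. */
enum tok next_token_iterative(struct lexer *lx)
{
  retry:
    if (*lx->p == '\0')
        return TOK_EOF;
    if (skip_eol(lx))
        goto retry;                         /* retry decided by caller */
    if (*lx->p == ' ' || *lx->p == '\t') {
        lx->p++;
        goto retry;
    }
    while (*lx->p && *lx->p != ' ' && *lx->p != '\t' && *lx->p != '\n')
        lx->p++;
    return TOK_WORD;
}

/* Count word tokens in src using the iterative lexer. */
size_t count_words(const char *src)
{
    struct lexer lx = { src };
    size_t n = 0;
    assert(src != NULL);
    while (next_token_iterative(&lx) == TOK_WORD)
        n++;
    return n;
}
```

In this toy version, an input shaped like the one reported earlier in the thread ("1\n", then a long run of spaces, then "\n") costs one stack frame per space in the recursive form when tail calls are not optimised, while the `goto` form handles it in a single frame.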
e51b401 to 18828ce
This branch continues the work merged in 5015bd0, factoring code out of `Perl_yylex()` and its callees, in the hope of making the lexer easier to understand locally.
After these changes, the largest remaining piece of `Perl_yylex()` is just over 900 lines (down from originally >4100), and consists of a single switch statement, all of whose `case` groups are independent.
This branch also contains a note in perldelta that this major refactoring has taken place.