CommonJS tokenizer attempt #326
Test for shared dependency bundle bug
I did some perf measurements on this and unfortunately the tokenizer runs at less than a quarter of the speed of the regex approach: http://jsperf.com/tokenizer-v-regex-commonjs-require-extraction/2. Will experiment with optimizations.
@crisptrutski your magical performance eye would be very valuable here...
I managed to optimize the function by using regular expressions for the seeking states instead of stepping through every character - http://jsperf.com/tokenizer-v-regex-commonjs-require-extraction/4 (guybedford/extract-requires@817736a). It's still 60% slower, but that's better than 98% slower! I've also conveniently just remembered that tokenizing can't be done comprehensively, because division is indistinguishable from a regular expression literal without deeper lexing knowledge. So I believe that leaves us with:
I'm tempted to go with (1), because this doesn't seem a strong enough reason on its own to use a full parser. But I'm open to (3).
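To make the "regexes for the seeking states" idea and the division problem concrete, here is a minimal sketch. It is not the code from extract-requires: every name in it (`extractRequires`, `matchAt`, the regex constants) is hypothetical, and it assumes a modern engine with sticky (`y`) regexes. It skips strings and comments as whole tokens, but it has no way to tell a regex literal from a division, so a quote inside a regex literal throws it off.

```javascript
// Hypothetical sketch, NOT the extract-requires implementation.
// Sticky (y) regexes consume a whole token at a known position.
const STRING = /"(?:[^"\\\n]|\\[\s\S])*"|'(?:[^'\\\n]|\\[\s\S])*'/y;
const LINE_COMMENT = /\/\/[^\n]*/y;
const BLOCK_COMMENT = /\/\*[\s\S]*?\*\//y;
const REQUIRE = /require\s*\(\s*(?:"([^"\\\n]+)"|'([^'\\\n]+)')\s*\)/y;
// Seek regex: jump straight to the next character that could matter,
// instead of stepping through the source one character at a time.
const SEEK = /["'\/]|\brequire\b/g;

// Try to match `re` exactly at `pos`; returns the match array or null.
function matchAt(re, source, pos) {
  re.lastIndex = pos;
  return re.exec(source);
}

function extractRequires(source) {
  const deps = [];
  let pos = 0;
  while (pos < source.length) {
    SEEK.lastIndex = pos;
    const hit = SEEK.exec(source);
    if (hit === null) break;
    pos = hit.index;

    // Strings and comments are skipped whole, so requires inside them are ignored.
    const skipped =
      matchAt(STRING, source, pos) ||
      matchAt(LINE_COMMENT, source, pos) ||
      matchAt(BLOCK_COMMENT, source, pos);
    if (skipped) {
      pos += skipped[0].length;
      continue;
    }

    const req = matchAt(REQUIRE, source, pos);
    if (req) {
      deps.push(req[1] || req[2]);
      pos += req[0].length;
      continue;
    }

    // A lone slash, an unterminated quote, or a bare `require`: step past it.
    pos += hit[0].length;
  }
  return deps;
}

const sample = [
  'var a = require("a");',
  '// var b = require("b");',
  'var s = "require(\'c\')";',
  '/* require("d") */ var e = require(\'e\');',
  'var ratio = total / count / 2;',
].join('\n');

console.log(extractRequires(sample)); // [ 'a', 'e' ]

// The ambiguity: the `"` inside the regex literal is taken for a string start,
// so the string "swallows" the real require call and nothing is found.
console.log(extractRequires('var re = /"/; require("x");')); // []
```

The second call shows why a scanner like this cannot be comprehensive on its own: without knowing whether `/` opens a regex literal, it misreads everything up to the next quote.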
(1) will mean some libraries won't be able to be installed at all (as per #311), right? I encountered this bug while trying to install the Could we still solve this by running a more expensive parsing "offline" (when installing locally or via CDN), and perhaps accept a more naive (and in this case broken) parsing when done dynamically from the original source?
OK, I've managed to work out a way to combine tokenizing and regexes to form a MORE accurate method that won't fall down with division confusion. @theefer, that is, I now have a replacement PR that can work for this problem. The performance results are in http://jsperf.com/tokenizer-v-regex-commonjs-require-extraction/5. The question here is now:
@crisptrutski if you have any further perf suggestions for this updated version at https://github.com/guybedford/extract-requires/blob/master/extract-requires.js, let me know. I'm going to look into the specific Twitter API case now and see how plausible a workaround is.
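The comment above does not spell out how the updated version avoids the division confusion, so the following is only an illustration of one common heuristic, not a description of extract-requires. The function name `slashStartsRegex` and the keyword list are assumptions made for this sketch. It guesses whether a `/` opens a regex literal by looking at the previous significant character, and it assumes the caller has already established that the slash is outside strings and comments.

```javascript
// Hypothetical heuristic, NOT taken from extract-requires.
// Keywords after which a `/` starts a regex literal rather than a division.
const REGEX_PRECEDING_KEYWORDS = new Set([
  "return", "typeof", "void", "delete", "in", "instanceof",
  "new", "throw", "case", "do", "else",
]);

// Guess whether the `/` at `slashIndex` opens a regex literal.
function slashStartsRegex(source, slashIndex) {
  // Walk back over whitespace to the previous significant character.
  let i = slashIndex - 1;
  while (i >= 0 && /\s/.test(source[i])) i--;
  if (i < 0) return true; // start of input: nothing to divide

  const prev = source[i];

  // After a closing bracket or a string, assume a value precedes: division.
  if (prev === ")" || prev === "]" || prev === '"' || prev === "'") return false;

  // After an identifier or number it is a division, unless the word is a
  // keyword such as `return`, where an expression (and so a regex) can follow.
  if (/[\w$]/.test(prev)) {
    let start = i;
    while (start > 0 && /[\w$]/.test(source[start - 1])) start--;
    return REGEX_PRECEDING_KEYWORDS.has(source.slice(start, i + 1));
  }

  // After an operator or punctuator such as = ( , ; { assume a regex.
  return true;
}

const cases = ['var re = /"/;', "total / count", "return /x/.test(s)", "f(a) / 2"];
for (const code of cases) {
  console.log(code, "->", slashStartsRegex(code, code.indexOf("/")));
}
```

This stays cheap because it never builds a token stream, but it is still a guess: `if (x) /re/.test(y)` is misread as a division, and a `/` after `}` is ambiguous without knowing whether the brace closed a block or an object literal. Those cases are what a full parser would resolve.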
For what it's worth, I've updated the branch with this valid option as well.
Closing for now, since there is a workaround, as @theefer says. We can come back to this one day if necessary, possibly comparing the performance of this approach to parsing with Traceur directly.
For #311, here is a tokenizing approach for CommonJS require extraction.
I'm hosting the tokenizer project separately at https://github.com/guybedford/extract-requires and will be expanding the tests; this is just the very first commit for now.
Looking promising so far. Seeing how the performance benchmarks compare will be the big thing.
//cc @theefer
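For context on what a tokenizing approach is competing against, here is a rough sketch of regex-only require extraction. The pattern is an assumption made for illustration, not the regex the loader actually uses, and `extractRequiresNaive` is a hypothetical name. It shows the kind of false positive that scanning with a single regex produces.

```javascript
// Hypothetical regex-only baseline, NOT the loader's actual pattern.
const REQUIRE_RE = /\brequire\s*\(\s*(["'])([^"'()]+)\1\s*\)/g;

function extractRequiresNaive(source) {
  const deps = [];
  let match;
  REQUIRE_RE.lastIndex = 0; // the regex is global, so reset its state per call
  while ((match = REQUIRE_RE.exec(source)) !== null) {
    deps.push(match[2]);
  }
  return deps;
}

const sample = [
  'var a = require("a");',
  '// var b = require("b");',
  'var s = "require(\'c\')";',
].join('\n');

// Only "a" is a real dependency; "b" is commented out and "c" is inside a string.
console.log(extractRequiresNaive(sample)); // [ 'a', 'b', 'c' ]
```

A single pass with one regex is fast, which is why the benchmark comparison matters, but it cannot see that a match sits inside a comment or a string.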