Default / lazy parser not cached #253


Closed
Rafiot opened this issue Jan 31, 2025 · 6 comments · Fixed by #255

@Rafiot

Rafiot commented Jan 31, 2025

I'm guessing I'm doing something wrong, because it makes very little sense given the documentation, but in short: on my machine, the slowest parser is ua-parser-rs.

Here is what I'm doing in IPython (I get the same results with the plain Python interpreter; IPython just makes timing easier):

from ua_parser import parse
ua_string = 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_9_4) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/41.0.2272.104 Safari/537.36'

%timeit parse(ua_string).with_defaults()

  • With ua-parser-rs:

    734 ms ± 27.7 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

  • With just google-re2:

    188 ms ± 10.9 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)

  • After uninstalling ua-parser-rs and google-re2:

    1.07 ms ± 33.3 μs per loop (mean ± std. dev. of 7 runs, 1,000 loops each)

I'm on Ubuntu 24.10 with Python 3.12.7.

Do you have any idea what's going on there? Something related to the cache being reloaded on every call? I couldn't find a way to avoid that.

@masklinn
Contributor

masklinn commented Jan 31, 2025

Note: I'm leaving this comment for posterity, and the information in it is technically true, but it's a reply to a complete misunderstanding of the report. If you're facing the same issue you can skip it; the comments afterwards are the investigation proper, and the fix has been released as part of 1.0.1.

I'm guessing I'm doing something wrong [...] Something related to the cache being reloaded on every call? I couldn't find a way to avoid that.

For what that's worth,¹ only the basic Python parser is cached by default:² in my benchmarks, while a (sufficiently large) cache does improve the performance of the native parsers, I didn't consider the additional memory worth it given how little effect it had on what I considered a real-world benchmark.

Since you're looping on a single user agent string, it makes sense that the one cached parser would be faster than the two native ones: it's basically an ideal situation for a cache (once the UA has been parsed, every future call is just a cache hit). However, the difference in scale you report is significantly larger than I would have expected, and so is the difference between re2 and regex. I'll have to see how IPython interacts with things when I'm able.

In the meantime, could you provide details on the machine you're running things on (e.g. CPU model / architecture)? I've mostly been benching on my dev machine (it might have been smart to have some benches I could run on GHA; I should do that).

And could you try the bench script provided with ua-parser? Most of my benchmarking has been done using https://raw.githubusercontent.com/ua-parser/uap-python/refs/heads/master/samples/useragents.txt so I'd be grateful if you could download that file and run:

python -mua_parser bench useragents.txt

Then maybe try to run a pared-down configuration on id.txt (that's the same user agent repeated 100000 times):

python -mua_parser bench id.txt --caches none sieve --cachesizes 1

Note that it might take a fairly long time (possibly minutes).

Footnotes

  1. It's documented, but not made super clear in the readme, as I assumed it wouldn't generally be visible or relevant.

  2. You can create a custom parser with caching though; cf. caches and other advanced parser customisations in the documentation.
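The cache-hit effect discussed above can be illustrated with a plain `functools.lru_cache` wrapper around a stand-in parse function (a sketch only: `parse_ua` and its return value are hypothetical, not ua_parser's API). Once a given UA string has been resolved, every repeat lookup is a cache hit, which is exactly the situation a single-UA `%timeit` loop creates:

```python
from functools import lru_cache

calls = 0  # counts how often the "expensive" work actually runs

@lru_cache(maxsize=128)
def parse_ua(ua: str) -> tuple:
    """Hypothetical stand-in for an expensive resolver call."""
    global calls
    calls += 1
    # trivial "parse" for illustration: the product token before the first slash
    return (ua.split("/", 1)[0],)

ua = "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_9_4) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/41.0.2272.104 Safari/537.36"
for _ in range(1000):
    parse_ua(ua)

print(calls)  # the work ran once; the other 999 calls were cache hits
```

With a varied real-world stream of UA strings the hit rate drops, which is why the benchmarks below sweep cache sizes rather than assume caching always pays off.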

@Rafiot
Author

Rafiot commented Feb 1, 2025

For context, my real-world use case is here: https://github.com/Lookyloo/lookyloo/blob/main/lookyloo/helpers.py#L389
It is triggered when I submit a URL to capture without specifying a user agent: the capture falls back to the default user agent from Chromium, which is then parsed.
I noticed the API took ~1s to respond when it should be immediate (and it is with the basic Python parser).

Either way, it's not really relevant in this context, and the amount of UAs I need to parse doesn't require anything highly performant. But I have long-running processes, so I'm happy to use the Rust parser if I can make sure it is loaded only once.

And I think this is the way to go (?):

import ua_parser

base = ua_parser.regex.Resolver(ua_parser.loaders.load_lazy_builtins())
ua_parser.parser = ua_parser.Parser(base)
ua_string = 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_9_4) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/41.0.2272.104 Safari/537.36'

%timeit ua_parser.parse(ua_string).with_defaults()

=> 24 μs ± 383 ns per loop (mean ± std. dev. of 7 runs, 10,000 loops each)


CPU on my machine: Intel(R) Core(TM) i7-8665U CPU @ 1.90GHz / x86_64
CPU on the server: Intel(R) Xeon(R) CPU E5-2640 v3 @ 2.60GHz / x86_64


Benchmarks (on my machine)
useragents.txt: 75158 lines, 20322 unique (27%)
basic              : 50.65s ( 674us/line)
basic-lru-10       : 52.29s ( 696us/line)
basic-lru-20       : 52.49s ( 698us/line)
basic-lru-50       : 51.95s ( 691us/line)
basic-lru-100      : 50.82s ( 676us/line)
basic-lru-200      : 47.56s ( 633us/line)
basic-lru-500      : 41.02s ( 546us/line)
basic-lru-1000     : 34.04s ( 453us/line)
basic-lru-2000     : 28.64s ( 381us/line)
basic-lru-5000     : 20.56s ( 274us/line)
basic-s3fifo-10    : 53.35s ( 710us/line)
basic-s3fifo-20    : 52.96s ( 705us/line)
basic-s3fifo-50    : 47.29s ( 629us/line)
basic-s3fifo-100   : 44.54s ( 593us/line)
basic-s3fifo-200   : 40.24s ( 535us/line)
basic-s3fifo-500   : 34.81s ( 463us/line)
basic-s3fifo-1000  : 30.15s ( 401us/line)
basic-s3fifo-2000  : 25.08s ( 334us/line)
basic-s3fifo-5000  : 18.84s ( 251us/line)
basic-sieve-10     : 56.27s ( 749us/line)
basic-sieve-20     : 53.51s ( 712us/line)
basic-sieve-50     : 48.31s ( 643us/line)
basic-sieve-100    : 44.51s ( 592us/line)
basic-sieve-200    : 40.12s ( 534us/line)
basic-sieve-500    : 34.16s ( 455us/line)
basic-sieve-1000   : 28.32s ( 377us/line)
basic-sieve-2000   : 23.23s ( 309us/line)
basic-sieve-5000   : 18.25s ( 243us/line)
re2                :  4.58s (  61us/line)
re2-lru-10         :  5.02s (  67us/line)
re2-lru-20         :  4.93s (  66us/line)
re2-lru-50         :  5.02s (  67us/line)
re2-lru-100        :  4.57s (  61us/line)
re2-lru-200        :  4.37s (  58us/line)
re2-lru-500        :  3.72s (  50us/line)
re2-lru-1000       :  3.35s (  45us/line)
re2-lru-2000       :  2.69s (  36us/line)
re2-lru-5000       :  2.01s (  27us/line)
re2-s3fifo-10      :  4.65s (  62us/line)
re2-s3fifo-20      :  4.54s (  60us/line)
re2-s3fifo-50      :  4.38s (  58us/line)
re2-s3fifo-100     :  4.03s (  54us/line)
re2-s3fifo-200     :  3.93s (  52us/line)
re2-s3fifo-500     :  3.16s (  42us/line)
re2-s3fifo-1000    :  2.71s (  36us/line)
re2-s3fifo-2000    :  2.31s (  31us/line)
re2-s3fifo-5000    :  1.95s (  26us/line)
re2-sieve-10       :  4.57s (  61us/line)
re2-sieve-20       :  4.63s (  62us/line)
re2-sieve-50       :  4.33s (  58us/line)
re2-sieve-100      :  4.14s (  55us/line)
re2-sieve-200      :  3.73s (  50us/line)
re2-sieve-500      :  3.08s (  41us/line)
re2-sieve-1000     :  2.65s (  35us/line)
re2-sieve-2000     :  2.31s (  31us/line)
re2-sieve-5000     :  1.84s (  25us/line)
regex              :  2.58s (  34us/line)
regex-lru-10       :  2.75s (  37us/line)
regex-lru-20       :  2.69s (  36us/line)
regex-lru-50       :  2.68s (  36us/line)
regex-lru-100      :  2.62s (  35us/line)
regex-lru-200      :  2.82s (  38us/line)
regex-lru-500      :  2.39s (  32us/line)
regex-lru-1000     :  2.12s (  28us/line)
regex-lru-2000     :  1.74s (  23us/line)
regex-lru-5000     :  1.42s (  19us/line)
regex-s3fifo-10    :  2.94s (  39us/line)
regex-s3fifo-20    :  2.93s (  39us/line)
regex-s3fifo-50    :  2.69s (  36us/line)
regex-s3fifo-100   :  2.54s (  34us/line)
regex-s3fifo-200   :  2.32s (  31us/line)
regex-s3fifo-500   :  1.99s (  27us/line)
regex-s3fifo-1000  :  1.71s (  23us/line)
regex-s3fifo-2000  :  1.46s (  19us/line)
regex-s3fifo-5000  :  1.18s (  16us/line)
regex-sieve-10     :  2.85s (  38us/line)
regex-sieve-20     :  2.88s (  38us/line)
regex-sieve-50     :  2.66s (  35us/line)
regex-sieve-100    :  2.53s (  34us/line)
regex-sieve-200    :  2.27s (  30us/line)
regex-sieve-500    :  1.96s (  26us/line)
regex-sieve-1000   :  1.72s (  23us/line)
regex-sieve-2000   :  1.48s (  20us/line)
regex-sieve-5000   :  1.23s (  16us/line)
legacy             : 45.64s ( 607us/line)

@masklinn
Contributor

masklinn commented Feb 1, 2025

For context, my real-world use case is here: https://github.com/Lookyloo/lookyloo/blob/main/lookyloo/helpers.py#L389 It is triggered when I submit a URL to capture without specifying a user agent: the capture falls back to the default user agent from Chromium, which is then parsed. I noticed the API took ~1s to respond when it should be immediate (and it is with the basic Python parser).

Oooh, I see now, I completely misunderstood your report (probably because mentions of caching prime me, given how much time I've spent on that). I'm dreadfully sorry: you're talking about caching the parser itself after lazily instantiating it!

At a glance, it looks like I broke the parser memoization in #230: I forgot to keep the assignment of the parser onto the parser global: 6fb7b58#diff-f71b6cb0226b7966e6f4e7aa9b42e7482dc8eb9410323f3f0536dde7f04130c1L135-R140
That is consistent with what you report: the regex and re2 parsers have high instantiation overhead because they have to construct the regex filter, whereas the basic / pure Python parser pretty much just stores the matchers it's handed.

Thanks for the report! And sorry again for the misunderstanding.

And I think it is the way to go (?):

import ua_parser

base = ua_parser.regex.Resolver(ua_parser.loaders.load_lazy_builtins())
ua_parser.parser = ua_parser.Parser(base)
ua_string = 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_9_4) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/41.0.2272.104 Safari/537.36'

That looks about right, yes. There should actually be a from_matchers method on ua_parser which builds the resolver for you using the "best available resolver" heuristics.

@masklinn masklinn changed the title Cache not cached (?) Default / lazy parser not cached Feb 1, 2025
@Rafiot
Author

Rafiot commented Feb 1, 2025

Do not worry at all! I should have explained my issue better.

For now, I'll just use the default parser (as it is not blocking), but I will go back to the Rust one as soon as the bug is fixed: as it's on long-running processes, it makes sense to have one long(ish) initialization and quick parsing.

masklinn added a commit to masklinn/uap-python that referenced this issue Feb 1, 2025
Reported by @Rafiot: the lazy parser is not memoised. This has limited
effect on the basic / pure Python parser, as its initialisation is
trivial, but it *significantly* impacts the re2 and regex parsers, as
they need to process regexes into a filter tree.

The memoization was mistakenly removed in ua-parser#230: while refactoring
initialisation I removed the setting of the `parser` global.

- add a test to ensure the parser is correctly memoised, not
  re-instantiated every time
- reinstate setting the global
- add a mutex on `__getattr__`, it should only be used on first access
  and avoids two threads creating an expensive parser at the same
  time (which is a waste of CPU)

Fixes ua-parser#253
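The pattern the commit message describes can be sketched in plain Python with a memoised global behind a double-checked lock (a sketch only: the real fix hooks module-level attribute access via `__getattr__`, and `_build_parser` / `get_parser` are stand-in names, not ua_parser's API):

```python
import threading

_lock = threading.Lock()
_parser = None
build_count = 0  # tracks how many times the expensive build actually ran


def _build_parser():
    """Hypothetical stand-in for the expensive construction
    (e.g. compiling regexes into a filter tree)."""
    global build_count
    build_count += 1
    return object()


def get_parser():
    """Memoised accessor: the global is assigned on first access, and a
    mutex ensures two threads racing on that first access don't both
    build the expensive parser."""
    global _parser
    if _parser is None:             # fast path once initialised
        with _lock:
            if _parser is None:     # re-check under the lock
                _parser = _build_parser()
    return _parser


# even with concurrent first accesses, the parser is built exactly once
threads = [threading.Thread(target=get_parser) for _ in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(build_count)  # 1
```

Without the re-check under the lock (or without the lock entirely), several threads hitting the cold path at once could each pay the expensive construction cost, which is the CPU waste the last bullet point guards against.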
@masklinn
Contributor

masklinn commented Feb 1, 2025

FWIW I just published 1.0.1, which should fix the issue. Using your test case (kinda: I'm just using timeit at the CLI), on 1.0.0 I get:

  • 589 usec per loop on the basic parser
  • 85.6 msec per loop on the re2 parser
  • 349 msec per loop on the regex parser

Which does track with your observations, at least in terms of scaling.

With 1.0.1 off of PyPI:

  • 1.54 usec per loop on the basic parser
  • 24.9 usec per loop on the re2 parser
  • 13.6 usec per loop on the regex parser

And thanks yet again for the report, and sorry for the trouble.

@Rafiot
Author

Rafiot commented Feb 1, 2025

Excellent, thank you very much, it works as expected!
