Cache Parse() results #26

Merged: 1 commit into ua-parser:master on Oct 15, 2015
Conversation

mattrobenolt (Member) commented:

Each parse takes ~2ms on my machine, and it's pretty common throughout the life of a running
process to parse identical user-agent strings. This adds a very primitive cache, similar in
spirit to the cache inside the `urlparse` package.

Before:
```
$ python -m timeit -s 'from ua_parser.user_agent_parser import Parse' 'Parse("Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/46.0.2490.52 Safari/537.36")'
100 loops, best of 3: 2.14 msec per loop
```

After:
```
$ python -m timeit -s 'from ua_parser.user_agent_parser import Parse' 'Parse("Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/46.0.2490.52 Safari/537.36")'
1000000 loops, best of 3: 0.956 usec per loop
```

Cache memory overhead:

Given the user agent `Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/46.0.2490.52 Safari/537.36`, we get 280 bytes per parsed object:

```
>>> sys.getsizeof(Parse('Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/46.0.2490.52 Safari/537.36'))
280
```

So a `MAX_CACHE_SIZE` of 20 will incur an overhead of 5,600 bytes. Granted, this ignores the cache keys stored in the `_parsed_cache` dict, but we're in the ballpark of a few KB total at most. :)
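For context, here is a minimal sketch of the clear-when-full caching pattern described above. The `MAX_CACHE_SIZE` and `_parsed_cache` names come from this PR; `_do_parse` is a hypothetical stand-in for the existing uncached parse logic:

```
# A minimal sketch of the clear-when-full caching pattern borrowed from
# urlparse. MAX_CACHE_SIZE and _parsed_cache are the names used in this
# PR; _do_parse is a hypothetical stand-in for the uncached parse logic.
MAX_CACHE_SIZE = 20
_parsed_cache = {}

def _do_parse(user_agent_string, **jsParseBits):
    # Hypothetical: the original, uncached body of Parse() goes here.
    raise NotImplementedError

def Parse(user_agent_string, **jsParseBits):
    jsParseBits = jsParseBits or {}
    key = (user_agent_string, repr(jsParseBits))
    cached = _parsed_cache.get(key)
    if cached is not None:
        return cached
    # Primitive eviction, as in urlparse: once the cache fills up,
    # drop everything instead of tracking recency.
    if len(_parsed_cache) >= MAX_CACHE_SIZE:
        _parsed_cache.clear()
    result = _do_parse(user_agent_string, **jsParseBits)
    _parsed_cache[key] = result
    return result
```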

```
@@ -205,12 +209,20 @@ def Parse(user_agent_string, **jsParseBits):
     A dictionary containing all parsed bits
   """
   jsParseBits = jsParseBits or {}
-  return {
+  key = (user_agent_string, repr(jsParseBits))
```
mattrobenolt (Member Author) commented on the `key = (user_agent_string, repr(jsParseBits))` line:
I'm not happy with using repr() like this, but we need a hashable type to use as a key.
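A hashable alternative to `repr()` here, purely illustrative and not what this PR merged, would be to freeze the kwargs into a `frozenset`:

```
def _cache_key(user_agent_string, jsParseBits):
    # Hypothetical helper, not in the PR: frozenset(...) is hashable,
    # so the tuple works as a dict key without relying on repr()
    # formatting. Assumes the jsParseBits values are themselves
    # hashable (in practice they are strings).
    return (user_agent_string, frozenset(jsParseBits.items()))
```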

mattrobenolt added a commit to getsentry/sentry that referenced this pull request Oct 11, 2015
selwin pushed a commit to selwin/uap-python that referenced this pull request Oct 14, 2015: "updating ua object to match implementations"
elsigh added a commit that referenced this pull request Oct 15, 2015
elsigh merged commit e5445f6 into ua-parser:master Oct 15, 2015
mattrobenolt deleted the cache branch October 15, 2015