
Commit acd4042 (parent e334c85)

Update readme.md


readme.md

Lines changed: 91 additions & 127 deletions

# Python Cache: How to Speed Up Your Code with Effective Caching

This article will show you how to use caching in Python with your web scraping tasks. You can read the [<u>full article</u>](https://oxylabs.io/blog/python-cache-how-to-use-effectively) on our blog, where we delve deeper into the different caching strategies.

## How to implement a cache in Python

There are different ways to implement caching in Python for different caching strategies. Here we’ll see two methods of Python caching for a simple web scraping example. If you’re new to web scraping, take a look at our [<u>step-by-step Python web scraping guide</u>](https://oxylabs.io/blog/python-web-scraping).

### Install the required libraries

We’ll use the [<u>requests library</u>](https://pypi.org/project/requests/) to make HTTP requests to a website. Install it with [<u>pip</u>](https://pypi.org/project/pip/) by entering the following command in your terminal:

```bash
python -m pip install requests
```

Other libraries we’ll use in this project, specifically `time` and `functools`, come natively with Python 3.11.2, so you don’t have to install them.

### Method 1: Python caching using a manual decorator

A [<u>decorator</u>](https://peps.python.org/pep-0318/) in Python is a function that accepts another function as an argument and outputs a new function, so we can extend the behavior of the original function without modifying it. A common use of decorators is memoization, which involves storing a function’s results for specific inputs and saving them in the cache for future use.

Let’s start by creating a simple function that takes a URL as a function argument, requests that URL, and returns the response text:

```python
def get_html_data(url):
    response = requests.get(url)
    return response.text
```

Now, let's move toward creating a memoized version of this function:

```python
def memoize(func):
    cache = {}

    def wrapper(*args):
        if args in cache:
            return cache[args]
        else:
            result = func(*args)
            cache[args] = result
            return result

    return wrapper


@memoize
def get_html_data_cached(url):
    response = requests.get(url)
    return response.text
```

The `wrapper` function checks whether the current input arguments have been cached before and, if so, returns the previously cached result. If not, it calls the original function and caches the result before returning it. In this case, we define a `memoize` decorator that creates a `cache` dictionary to hold the results of previous function calls.

By adding `@memoize` above the function definition, we can use the memoize decorator to enhance the `get_html_data` function. This generates a new memoized function that we’ve called `get_html_data_cached`. It only makes a single network request for a URL and then stores the response in the cache for further requests.

Let’s use the `time` module to compare the execution speeds of the `get_html_data` function and the memoized `get_html_data_cached` function:

```python
import time

start_time = time.time()
get_html_data('https://books.toscrape.com/')
print('Time taken (normal function):', time.time() - start_time)

start_time = time.time()
get_html_data_cached('https://books.toscrape.com/')
print('Time taken (memoized function using manual decorator):', time.time() - start_time)
```

Here’s what the complete code looks like:

```python
# Import the required modules
import time
import requests


# Function to get the HTML content
def get_html_data(url):
    response = requests.get(url)
    return response.text


# Memoize decorator to cache the data
def memoize(func):
    cache = {}

    # Inner wrapper function to store the data in the cache
    def wrapper(*args):
        if args in cache:
            return cache[args]
        else:
            result = func(*args)
            cache[args] = result
            return result

    return wrapper


# Memoized function to get the HTML content
@memoize
def get_html_data_cached(url):
    response = requests.get(url)
    return response.text


# Get the time it took for a normal function
start_time = time.time()
get_html_data('https://books.toscrape.com/')
print('Time taken (normal function):', time.time() - start_time)

# Get the time it took for a memoized function (manual decorator)
start_time = time.time()
get_html_data_cached('https://books.toscrape.com/')
print('Time taken (memoized function using manual decorator):', time.time() - start_time)
```

And here’s the output:

![](images/output_normal_memoized.png)

Notice the time difference between the two functions. Both take almost the same time here, since each makes only a single request, but the real strength of caching lies in re-access: if we increase the number of calls to these functions, the time difference will increase significantly (see [<u>Performance Comparison</u>](#performance-comparison)).

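You can see this re-access benefit in isolation by timing two consecutive calls to the memoized `get_html_data_cached` function defined above; the small snippet below is an illustration that reuses the previous example, and the second call should return almost instantly because it’s served from the cache:

```python
import time

# First call: performs the network request and fills the cache.
start_time = time.time()
get_html_data_cached('https://books.toscrape.com/')
print('First call (fills the cache):', time.time() - start_time)

# Second call with the same URL: answered from the cache, no network request.
start_time = time.time()
get_html_data_cached('https://books.toscrape.com/')
print('Second call (served from cache):', time.time() - start_time)
```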
### Method 2: Python caching using LRU cache decorator

Another method to implement caching in Python is to use the built-in `@lru_cache` decorator from `functools`. This decorator implements a cache using the least recently used (LRU) caching strategy. The LRU cache has a fixed size, which means that once it’s full, it discards the data that hasn’t been used recently.

To use the `@lru_cache` decorator, we can create a new function for extracting HTML content and place the decorator right above its definition. Make sure to import `lru_cache` from the `functools` module before using the decorator:

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def get_html_data_lru(url):
    response = requests.get(url)
    return response.text
```

In the above example, the `get_html_data_lru` function is memoized using the `@lru_cache` decorator. The cache can grow indefinitely when the `maxsize` option is set to `None`.

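As an illustrative aside (the bounded variant, the `get_html_data_bounded` name, and the extra catalogue URLs below are example values rather than part of the original article), here’s a sketch of what a fixed-size LRU cache looks like: with `maxsize=2`, only the two most recently used URLs stay cached, and the wrapped function’s `cache_info()` method reports hits, misses, and the current cache size.

```python
from functools import lru_cache

import requests


# Illustrative bounded cache: only the 2 most recently used URLs are kept.
@lru_cache(maxsize=2)
def get_html_data_bounded(url):
    response = requests.get(url)
    return response.text


get_html_data_bounded('https://books.toscrape.com/')                       # miss, cached
get_html_data_bounded('https://books.toscrape.com/catalogue/page-2.html')  # miss, cached
get_html_data_bounded('https://books.toscrape.com/')                       # hit
get_html_data_bounded('https://books.toscrape.com/catalogue/page-3.html')  # miss, evicts page-2
print(get_html_data_bounded.cache_info())
# CacheInfo(hits=1, misses=3, maxsize=2, currsize=2)
```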
To use the `@lru_cache` decorator, just add it above the `get_html_data_lru` function. Here’s the complete code sample:

```python
# Import the required modules
from functools import lru_cache
import time
import requests


# Function to get the HTML content
def get_html_data(url):
    response = requests.get(url)
    return response.text


# Memoized using LRU cache
@lru_cache(maxsize=None)
def get_html_data_lru(url):
    response = requests.get(url)
    return response.text


# Get the time it took for a normal function
start_time = time.time()
get_html_data('https://books.toscrape.com/')
print('Time taken (normal function):', time.time() - start_time)

# Get the time it took for a memoized function (LRU cache)
start_time = time.time()
get_html_data_lru('https://books.toscrape.com/')
print('Time taken (memoized function with LRU cache):', time.time() - start_time)
```

This produced the following output:

![](images/output_normal_lru.png)

### Performance comparison

In the following table, we’ve determined the execution times of all three functions for different numbers of requests to these functions:

As the number of requests to the functions increases, you can see a significant reduction in execution times when using the caching strategy. The following comparison chart depicts these results:

![](images/comparison-chart.png)

The comparison results clearly show that using a caching strategy in your code can significantly improve overall performance and speed.

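If you want to reproduce a comparison like this yourself, the sketch below is one possible way to do it (the `benchmark` helper and the request counts are an illustration added here, not code from the article): it redefines the three functions from this readme and times `n` repeated calls to each one.

```python
# Illustrative benchmark: time n repeated calls to each of the three functions.
import time
from functools import lru_cache

import requests


def get_html_data(url):
    response = requests.get(url)
    return response.text


def memoize(func):
    cache = {}

    def wrapper(*args):
        if args not in cache:
            cache[args] = func(*args)
        return cache[args]

    return wrapper


# Build the cached variants from the plain function.
get_html_data_cached = memoize(get_html_data)
get_html_data_lru = lru_cache(maxsize=None)(get_html_data)


def benchmark(func, url, n):
    """Return the total time in seconds for n consecutive calls to func(url)."""
    start = time.time()
    for _ in range(n):
        func(url)
    return time.time() - start


url = 'https://books.toscrape.com/'
for n in (1, 10, 100):
    print(f'{n} request(s):')
    print('  normal function:  ', benchmark(get_html_data, url, n))
    print('  manual decorator: ', benchmark(get_html_data_cached, url, n))
    print('  lru_cache:        ', benchmark(get_html_data_lru, url, n))
```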