Most developers do not use cached_property and lru_cache from the functools module of the standard library, nor do they cache HTTP requests and responses in an external file or database. The examples in this article were tested under Python 3.8.
Using functools.cached_property
Let's say you have an intensive calculation. It takes time and CPU, and it runs all the time: for example, a webshop that needs to calculate some values every time a client accesses the site. Here is an example of cached_property in use:
    from functools import cached_property
    import statistics
    from time import time


    class DataSet:
        def __init__(self, sequence_of_numbers):
            self._data = sequence_of_numbers

        @cached_property
        def stdev(self):
            return statistics.stdev(self._data)

        @cached_property
        def variance(self):
            return statistics.variance(self._data)


    numbers = range(1,10000)
    testDataSet = DataSet(numbers)

    # First access: both properties are computed and stored on the instance.
    start = time()
    result = testDataSet.stdev
    result = testDataSet.variance
    end = time()
    print(f"First run: {(end - start):.6f} second")

    # Second access: both values come straight from the cache.
    start = time()
    result = testDataSet.stdev
    result = testDataSet.variance
    end = time()
    print(f"Second run: {(end - start):.6f} second")

    # Baseline: the same calculations without any caching.
    start = time()
    result = statistics.stdev(numbers)
    result = statistics.variance(numbers)
    end = time()
    print(f"RAW run: {(end - start):.6f} second")
Output would look similar to this:
    First run: 0.247226 second
    Second run: 0.000002 second
    RAW run: 0.242232 second
You can run the code online: Python code example IDE Online
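One detail worth knowing is that cached_property stores its result on the instance and never refreshes it on its own. The following is a minimal sketch of that behavior; the Circle class and its calls counter are hypothetical and exist only for illustration.

```python
from functools import cached_property


class Circle:
    """Hypothetical example class, not part of the article's code."""

    def __init__(self, radius):
        self.radius = radius
        self.calls = 0  # counts how often the property body really runs

    @cached_property
    def area(self):
        self.calls += 1
        return 3.14159 * self.radius ** 2


c = Circle(2)
first = c.area               # computed and stored in c.__dict__["area"]
second = c.area              # read back from the instance dictionary
calls_before_reset = c.calls

c.radius = 3                 # changing the input does not refresh the cache
stale = c.area

del c.area                   # deleting the attribute drops the cached value
fresh = c.area               # recomputed with the new radius

print(calls_before_reset, c.calls, stale == first, fresh > stale)
```

If the underlying data can change, deleting the attribute as shown is the way to force a recalculation.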
Using functools.lru_cache
lru_cache is a decorator that wraps a function in a memoizing callable, which saves the results of up to maxsize most recent calls. Again, suppose you have a lot of calculation and you want to keep earlier results so that later ones can build on them. In the example below, once the value for N is cached, computing N+1 takes a single step instead of a complete recalculation.
    from functools import lru_cache
    from time import time


    @lru_cache(maxsize=None)
    def fib(n):
        if n < 2:
            return n
        return fib(n-1) + fib(n-2)


    # First run: every value is computed once and cached.
    start = time()
    result = [fib(n) for n in range(40000)]
    end = time()
    print(f"First run: {(end - start):.6f} second")

    # Second run: identical calls, all served from the cache.
    start = time()
    result = [fib(n) for n in range(40000)]
    end = time()
    print(f"Second run: {(end - start):.6f} second")

    # Third run: one value fewer, still fully cached.
    start = time()
    result = [fib(n) for n in range(39999)]
    end = time()
    print(f"Third run: {(end - start):.6f} second")

    # Fourth run: one new value, built from two cached ones.
    start = time()
    result = [fib(n) for n in range(40001)]
    end = time()
    print(f"Fourth run: {(end - start):.6f} second")

    print(fib.cache_info())
Output would be:
    First run: 0.278697 second
    Second run: 0.017155 second
    Third run: 0.017530 second
    Fourth run: 0.065415 second
    CacheInfo(hits=199997, misses=40001, maxsize=None, currsize=40001)
The first run fills the cache. The second run reuses it, the third run asks for one value fewer (N-1), and the fourth asks for one value more (N+1).
As we can see, the last three runs all reuse the cache. The same approach works for database queries, calculations, or any CPU-heavy operation that is repeated with the same arguments.
Here is an online IDE you can run and view: lru_cache example
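The example above uses maxsize=None, so nothing is ever evicted. As a small sketch of what happens with a bounded cache, the snippet below uses a hypothetical square function with maxsize=2 and records which arguments were really computed.

```python
from functools import lru_cache

computed = []  # records every argument that was really computed


@lru_cache(maxsize=2)
def square(n):
    """Hypothetical stand-in for an expensive function."""
    computed.append(n)
    return n * n


square(1)  # miss, cache holds 1
square(2)  # miss, cache holds 1 and 2
square(1)  # hit, 1 becomes the most recently used entry
square(3)  # miss, cache is full so the least recently used entry (2) is evicted
square(2)  # miss again, because 2 was evicted

info = square.cache_info()
print(computed)  # [1, 2, 3, 2]
print(info)      # CacheInfo(hits=1, misses=4, maxsize=2, currsize=2)

square.cache_clear()  # empties the cache and resets the counters
print(square.cache_info())
```

A bounded maxsize keeps memory use predictable, at the price of recomputing entries that have fallen out of the cache.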
HTTP request caching
With lru_cache we can also cache web requests for static pages. Another option is to store the result in a file named after the input data.
Let us look at the first option:
    from functools import lru_cache
    import urllib.error
    import urllib.request
    from time import time


    @lru_cache(maxsize=32)
    def get_pep(num):
        'Retrieve text of a Python Enhancement Proposal'
        resource = 'http://www.python.org/dev/peps/pep-%04d/' % num
        try:
            with urllib.request.urlopen(resource) as s:
                return s.read()
        except urllib.error.HTTPError:
            return 'Not Found'


    # First run: each distinct PEP number triggers one HTTP request.
    start = time()
    for n in 8, 290, 308, 320, 8, 218, 320, 279, 289, 320, 9991:
        pep = get_pep(n)
        #print(n, len(pep))
    end = time()
    print(f"First run: {(end - start):.6f} second")
    print(get_pep.cache_info())
    print("\n")

    # Second run: every call is answered from the cache.
    start = time()
    for n in 8, 290, 308, 320, 8, 218, 320, 279, 289, 320, 9991:
        pep = get_pep(n)
        #print(n, len(pep))
    end = time()
    print(f"Second run: {(end - start):.6f} second")
    print(get_pep.cache_info())
If we run this code, we get:
    First run: 0.897728 second
    CacheInfo(hits=3, misses=8, maxsize=32, currsize=8)

    Second run: 0.000026 second
    CacheInfo(hits=14, misses=8, maxsize=32, currsize=8)
You can run this code: HTTP Caching
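Note that get_pep catches the HTTP error and returns 'Not Found', so the failed lookup is cached like any other result. lru_cache stores only return values, never exceptions. The sketch below illustrates the difference with two hypothetical functions and no network access.

```python
from functools import lru_cache

attempts = {"raising": 0, "returning": 0}  # how often each body really runs


@lru_cache(maxsize=32)
def lookup_raising(key):
    """Hypothetical lookup that signals failure with an exception."""
    attempts["raising"] += 1
    raise LookupError(key)


@lru_cache(maxsize=32)
def lookup_returning(key):
    """Hypothetical lookup that signals failure with a return value."""
    attempts["returning"] += 1
    return "Not Found"


for _ in range(3):
    try:
        lookup_raising("x")   # the exception propagates, nothing is stored
    except LookupError:
        pass
    lookup_returning("x")     # stored after the first call

print(attempts)  # {'raising': 3, 'returning': 1}
```

Whether a failure should be cached depends on the service: caching it saves requests, but a page that appears later will stay 'Not Found' until the cache is cleared.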
Now let us talk about real projects in real life. You have an IP address or a word, and you need to check it or get a replacement for it from an external service. But there are 2^32-1 IP addresses or 50 million words, and you do not want to lose the information you have already received from these services. Caching inside the Python process is not enough for this, because the cache disappears when the process exits. So what are we going to do? We put the results in a file or a database.
Example code:
    import urllib.error
    import urllib.request
    from time import time


    def get_pep(num):
        'Retrieve text of a Python Enhancement Proposal'
        resource = 'http://www.python.org/dev/peps/pep-%04d/' % num
        # Serve the page from the local file if it was fetched before.
        try:
            with open(str(num), "rb") as f:
                return f.read()
        except IOError:
            pass  # not cached yet, fall through to the network
        try:
            with urllib.request.urlopen(resource) as s:
                txt = s.read()
        except urllib.error.HTTPError:
            return 'Not Found'  # failures are not written to disk
        # Store the raw bytes so that later runs can skip the request.
        with open(str(num), "wb") as ff:
            ff.write(txt)
        return txt


    start = time()
    for n in 8, 290, 308, 320, 8, 218, 320, 279, 289, 320, 9991:
        pep = get_pep(n)
    end = time()
    print(f"First run: {(end - start):.6f} second")
    print("\n")

    start = time()
    for n in 8, 290, 308, 320, 8, 218, 320, 279, 289, 320, 9991:
        pep = get_pep(n)
    end = time()
    print(f"Second run: {(end - start):.6f} second")
You can run the code online: caching results from http. This code produces output similar to:
    First run: 4.196623 second

    Second run: 0.358382 second
Why is this better? In short: if you have 20 million keys or words and you run the job day after day, it is better to keep the results in a database or in files, where they survive between runs. This example (writing to a file) is the simplest proof of concept. I was too lazy to implement storing the records in MySQL, PostgreSQL, or SQLite.
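For readers who want to try the database variant, here is a minimal sketch using the sqlite3 module from the standard library. It is one possible design, not a tested production setup: the table layout, the open_cache and cached_fetch helpers, and the slow_lookup function (which stands in for a real HTTP request) are all assumptions made for illustration.

```python
import sqlite3


def open_cache(path=":memory:"):
    """Open (or create) the cache table. Pass a file path to keep it between runs."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS cache (key TEXT PRIMARY KEY, value BLOB)"
    )
    return conn


def cached_fetch(conn, key, fetch):
    """Return the stored value for key, calling fetch(key) only on a miss."""
    row = conn.execute("SELECT value FROM cache WHERE key = ?", (key,)).fetchone()
    if row is not None:
        return row[0]
    value = fetch(key)
    conn.execute(
        "INSERT OR REPLACE INTO cache (key, value) VALUES (?, ?)", (key, value)
    )
    conn.commit()
    return value


fetched = []  # records which keys really reached the slow service


def slow_lookup(key):
    """Hypothetical stand-in for an HTTP request or another slow service."""
    fetched.append(key)
    return f"result for {key}".encode()


conn = open_cache()  # in-memory here; use a file name such as "cache.db" to persist
for key in ["8", "290", "8", "320", "290"]:
    cached_fetch(conn, key, slow_lookup)

print(fetched)  # ['8', '290', '320']
```

Compared with one file per key, a single SQLite file avoids millions of small files and gives you indexed lookups by key.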