Wednesday, March 2, 2016

LRU Caches

Precious little pieces preserving the balance of nature.
Python features LRU caches. For this purpose, the decorator @functools.lru_cache is provided. You can configure the size of the cache as well as whether equal arguments of different types should be distinguished.

RLU stands for "least recently used", i.e. if the maximum size of the cache has been reached and a new item is to be inserted, the item with the oldest access timestamp will be discarded to make room for the new resident. The cache size can be unlimited which especially useful for short running scripts.

Let's get our hands dirty:
from time import time

def fib(n):
    return fib(n-1) + fib(n-2) if n > 1 else 1

for j in range(0, 40, 5):
    a = time()
    f = fib(j)
    b = time()
    print('{t:10.8f} fib({j})={f}'.format(t=b-a, j=j, f=f))
Which results in:
0.00000048 fib(0)=1
0.00000048 fib(0)=1
0.00000238 fib(5)=8
0.00001502 fib(10)=89
0.00017858 fib(15)=987
0.00217247 fib(20)=10946
0.01955438 fib(25)=121393
0.21021986 fib(30)=1346269
2.30364680 fib(35)=14930352
Runtime increases dramatically.

What if you really need fib(100)? It seems you are screwed then, right? Let's see how lru_cache remedies the situation here:
from functools import lru_cache

@lru_cache(maxsize=None)  # unlimited cache size
def fib(n):
    return fib(n-1) + fib(n-2) if n > 1 else 1

for j in range(0, 1000, 100):
    a = time()
    f = fib(j)
    b = time()
    print('{t:10.8f} fib({j})={f:e}'.format(t=b-a, j=j, f=f))

print(fib.cache_info())
Et voilĂ :
0.00000381 fib(0)=1.000000e+00
0.00016809 fib(100)=5.731478e+20
0.00013471 fib(200)=4.539737e+41
0.00013041 fib(300)=3.595793e+62
0.00014305 fib(400)=2.848123e+83
0.00013804 fib(500)=2.255915e+104
0.00013733 fib(600)=1.786845e+125
0.00017452 fib(700)=1.415308e+146
0.00013733 fib(800)=1.121024e+167
0.00014663 fib(900)=8.879303e+187
CacheInfo(hits=907, misses=901, maxsize=None, currsize=901)
Handsome execution times even for very large Fibonacci numbers.

Conclusion
  • performance gained - yay
  • intention of the implementation preserved - yay
  • recursion depth issue not solved - try range(0, 10000, 1000)
That's for now. There'll be another post covering automatic cache invalidation. Stay tuned.

Best,
Sven

No comments:

Post a Comment