Jump to content United States-English
HP.com Home Products and Services Support and Drivers Solutions How to Buy
» Contact HP
More options
HP.com home
Programming with Judy: C LanguageJudy Version 4.0

Appendix B Caching and Memory Management

» 

Technical documentation

Complete book in PDF
» Feedback
Content starts here

 » Table of Contents

 » Glossary

 » Index

Why use caching?

Cache is a small, fast memory holding recently accessed data. The purpose of caching is to avoid unnecessary reloads of hot data that is likely to be used soon or frequently. CPU caching is usually used to access processor memory but other types of caching can also be used for a local copy of data accessible over a network (such as web browser caching or disk buffer caching).

CPU "cache lines" exist because CPUs are becoming so much faster than RAM. When a computer must fill a cache line by accessing RAM (cache line fill), it can cause a 30-150 instruction delay (freeze) during which the CPU has to wait. Detailed profiles reveal that many programs spend a lot of time waiting for load and store instructions.

The caching curve

The figure below shows the knee in the caching curve that often occurs when the population (the number of valid keys or indexes) grows beyond a certain point. Note that as the population increases, the time to access any index in the population also increases exponentially showing a strong upward trend in the graph at a certain point.

Figure B-1 The caching curve

The caching curve

What happened here? The software requires access to data that is not in cache. Most of the time this means having to access an index that is not in cache memory and having to fill cache from main memory. The more frequently cache-line fills are required the more performance degrades over time. If this gets bad enough, the computer spends most of its time thrashing by repeatedly accessing main memory. Hence the saying, "Thrashing is virtual crashing."

Caching and memory efficiency

Judy uses minimal memory to achieve efficient access to stored data. Instead of caching data on local disk or network server, Judy arrays are memory resident.

Judy is cache-line optimized software: able to retrieve lines of cached data directly from the computer's cache with maximum speed. Judy minimizes cache-line fills, accessing an average of 4 to 6 cache-line fills to get to any value in an array. That means you can store one, one hundred, or 4 billion values in a Judy array. On average, Judy requires three cache-line fills to reach the data in a four gigabyte (4GB) element array. A one terabyte (1TB) element array would add one cache line fill.

Judy will never take more than 6 levels of indirection to reach the data.

Judy also minimizes the amount of memory required for array indexes. The amount of memory used is proportional to the array population, the density of the data, and the clustering of the data. JudyL was designed to allocate not more than 12 bytes per index for 32-bit machines and 24 bytes per index for 64-bit machines. For JudyL, eight bytes of memory or less per index is easily achievable on a 32-bit machine (16-bytes or less on a 64-bit machine), a ratio that is better than most other in-memory search and retrieval methods.

Printable version
Privacy statement Using this site means you accept its terms Feedback to webmaster
© Hewlett-Packard Development Company, L.P.