On most modern computers, the CPU runs at a faster speed than the rest of the computer. This is mainly due to the amount of time it takes for an electrical signal to travel across the motherboard. The CPU is small, so signals have less distance to travel, and it can be clocked at a faster rate.
Because of this, each time the CPU needs to access RAM (which is located on the motherboard), it has to wait a number of cycles until the data has been read or written. This wastes time and slows the processor down.
Cache memory is a small amount of very fast memory which is usually located on the CPU chip (or very close to it on the motherboard). The CPU can access cache memory more quickly than normal RAM, but there is much less of it.
Cache memory is used to store both instructions and data, so they can be accessed faster by the CPU.
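To see why a small, fast cache helps, here is a rough calculation of the average number of cycles per memory access. The function name, the 95% hit rate and the cycle counts are illustrative assumptions, not measurements from a real system. It also assumes that a miss costs the cache check plus a full RAM access.

```python
def average_access_cycles(hit_rate, cache_cycles, ram_cycles):
    """Average cycles per access for a single cache in front of RAM.

    Assumption: on a miss, the CPU pays for the cache check and then
    for the RAM access as well.
    """
    miss_rate = 1 - hit_rate
    return hit_rate * cache_cycles + miss_rate * (cache_cycles + ram_cycles)


# Illustrative numbers only: a 2-cycle cache and 100-cycle RAM.
with_cache = average_access_cycles(0.95, 2, 100)
without_cache = 100

print(f"With cache: about {with_cache:.1f} cycles per access")
print(f"Without cache: {without_cache} cycles per access")
```

With these assumed figures, the average access takes roughly 7 cycles instead of 100, even though the cache only holds a small part of the data.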
The system tries to keep the most frequently used data in the cache. It uses an algorithm to decide which data should go in the cache. Algorithms can be quite complex, but here are some simple techniques:
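As an illustration only, one widely taught simple technique is "least recently used" (LRU): when the cache is full, the entry that has gone unused for the longest time is thrown out. The sketch below is a software model, not how real cache hardware is built. The class name `LRUCache`, the `read` method and the dictionary standing in for RAM are all hypothetical.

```python
from collections import OrderedDict


class LRUCache:
    """A toy cache that evicts the least recently used address when full."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.entries = OrderedDict()  # oldest entry first, newest last
        self.hits = 0
        self.misses = 0

    def read(self, address, ram):
        if address in self.entries:
            # Cache hit: mark this address as the most recently used.
            self.hits += 1
            self.entries.move_to_end(address)
            return self.entries[address]
        # Cache miss: fetch from (slow) RAM and keep a copy in the cache.
        self.misses += 1
        value = ram[address]
        self.entries[address] = value
        if len(self.entries) > self.capacity:
            # Cache is over capacity: drop the least recently used entry.
            self.entries.popitem(last=False)
        return value


# A dictionary stands in for RAM (hypothetical contents).
ram = {addr: addr * 10 for addr in range(8)}
cache = LRUCache(capacity=2)
for addr in [0, 1, 0, 2, 1]:
    cache.read(addr, ram)

print(cache.hits, cache.misses)
print(list(cache.entries))
```

In this run, the second read of address 0 is a hit. Reading address 2 then pushes out address 1, because address 1 was used less recently than address 0.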
Often the cache memory is arranged in levels:
Some systems have more levels. Even the slowest cache memory is faster than the system RAM, otherwise there would be no point using it!
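The idea of levels can be sketched as a search that tries the fastest cache first and only falls back to RAM if every cache misses. The level names (L1, L2, L3), the cycle counts and the `lookup` function are assumptions chosen for illustration. Real systems differ in the number of levels and their speeds.

```python
# Assumed levels, fastest first: (name, cycles needed to check that level).
LEVELS = [("L1", 4), ("L2", 12), ("L3", 40)]
RAM_CYCLES = 200  # assumed cost of going all the way to system RAM


def lookup(address, contents):
    """Return (where the address was found, total cycles spent).

    `contents` maps each level name to the set of addresses it holds.
    """
    cycles = 0
    for name, cost in LEVELS:
        cycles += cost  # pay for checking this level
        if address in contents[name]:
            return name, cycles
    # Not in any cache: fetch from RAM.
    return "RAM", cycles + RAM_CYCLES


# Hypothetical cache contents: the slower levels hold more addresses.
contents = {"L1": {1}, "L2": {1, 2}, "L3": {1, 2, 3}}

for address in [1, 2, 3, 9]:
    print(address, lookup(address, contents))
```

With these assumed numbers, an address found in L1 costs 4 cycles, while one that is in no cache costs 256 cycles, which shows why it pays to keep frequently used data in the fastest level.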
Often, each core has its own fast cache (e.g. the L1 cache), but the slower caches are shared between all the cores. The faster caches are always on the CPU chip, but the slowest cache is sometimes on the motherboard close to the chip. Here is an example, although the details vary for different systems:
Copyright (c) Axlesoft Ltd 2021