Cache In Multicore Programming


angeln123
April 1st, 2009, 10:38 AM
Cache is memory placed between the processor and main system memory (RAM). While cache is not as
fast as registers, it is faster than RAM. It holds more than the registers but does not have the capacity of
main memory. Cache increases the effective memory transfer rate and, therefore, overall processor
performance. Cache holds copies of the data and instructions most recently used by the processor. Small
chunks of memory are fetched from main memory and stored in cache in anticipation that they will be
needed by the processor. Programs tend to exhibit both temporal locality and spatial locality.
Temporal locality is the tendency to reuse recently accessed instructions or data.
Spatial locality is the tendency to access instructions or data that are physically close to items
that were most recently accessed.
One of the primary functions of cache is to take advantage of the temporal and spatial locality
that programs exhibit. Cache is often divided into two levels, level 1 and level 2.
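
To make spatial locality concrete, here is a small C++ sketch. It is not from the original post, and the function names are made up for illustration. Both functions add up the same matrix, which is assumed to be stored row by row in one contiguous vector. The first walks memory in order, so neighbouring accesses tend to fall in the same cache line. The second jumps a whole row ahead on every access, which usually causes more cache misses on large matrices. How big the difference is depends on the hardware and on the matrix size.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Sum a rows x cols matrix stored in row-major order, visiting elements in
// the order they sit in memory (good spatial locality).
long long sum_row_major(const std::vector<int>& m, std::size_t rows, std::size_t cols) {
    assert(m.size() == rows * cols);
    long long total = 0;
    for (std::size_t r = 0; r < rows; ++r)
        for (std::size_t c = 0; c < cols; ++c)
            total += m[r * cols + c];
    return total;
}

// Same sum, but visiting column by column: each access lands cols elements
// away from the previous one (poor spatial locality for large matrices).
long long sum_column_major(const std::vector<int>& m, std::size_t rows, std::size_t cols) {
    assert(m.size() == rows * cols);
    long long total = 0;
    for (std::size_t c = 0; c < cols; ++c)
        for (std::size_t r = 0; r < rows; ++r)
            total += m[r * cols + c];
    return total;
}
```

Both functions return the same value; only the order of the memory accesses differs. Timing them on a matrix much larger than the L2 cache is one way to see the effect on a given machine.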

Level 1 Cache
Level 1 cache is small, sometimes as small as 16K. L1 cache is usually located inside the processor
and is used to capture the most recently used bytes of instructions or data.
Level 2 Cache
Level 2 cache is bigger and slower than L1 cache. It was traditionally placed on the motherboard (outside the
processor), but on most current processors it is built into the processor itself. L2 cache is typically
measured in megabytes. L2 cache can hold
an even bigger chunk of the most recently used instructions, data, and items in their near vicinity
than L1 holds. Because L1 and L2 are faster than general-purpose RAM, the more accurate the guesses about
what the program is going to do next, the better the overall system performance, because the right
chunks of data will already be in either L1 or L2 cache. This saves a trip out to RAM, to virtual
memory or, even worse, to external storage.
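
The effect of "the right chunks already being in cache" can be sketched with a toy model. The C++ below is an illustration only, not how any real processor is implemented: it models a single direct-mapped cache with hypothetical names (DirectMappedCache, count_hits) and simply counts how many accesses find their line already loaded.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <limits>
#include <vector>

// Toy direct-mapped cache: every memory line maps to exactly one slot.
class DirectMappedCache {
public:
    DirectMappedCache(std::size_t num_lines, std::size_t line_bytes)
        : line_bytes_(line_bytes),
          tags_(num_lines, std::numeric_limits<std::uint64_t>::max()) {  // max = empty slot
        assert(num_lines > 0 && line_bytes > 0);
    }

    // Returns true on a hit. On a miss, loads the line that contains addr.
    bool access(std::uint64_t addr) {
        const std::uint64_t line = addr / line_bytes_;
        const std::size_t slot = static_cast<std::size_t>(line % tags_.size());
        if (tags_[slot] == line) return true;
        tags_[slot] = line;
        return false;
    }

private:
    std::size_t line_bytes_;
    std::vector<std::uint64_t> tags_;  // which memory line each slot currently holds
};

// Issue `accesses` reads starting at `start`, `stride` bytes apart, and count hits.
std::size_t count_hits(DirectMappedCache& cache, std::uint64_t start,
                       std::uint64_t stride, std::size_t accesses) {
    std::size_t hits = 0;
    for (std::size_t i = 0; i < accesses; ++i)
        if (cache.access(start + i * stride)) ++hits;
    return hits;
}
```

With 512 lines of 32 bytes (16K in total), reading 1024 consecutive bytes touches 32 lines, so 32 accesses miss and the other 992 hit. Reading 1024 bytes spaced 32 bytes apart touches a new line every time, so none of them hit.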

Compiler Switches for Cache?
Most developers doing multicore application development will not be concerned with manually
managing cache unless, of course, they are doing kernel development, compiler development, or other
types of low-level system programming. However, most mainstream compilers offer options that give the
compiler a hint as to how much L1 or L2 cache is available, or a hint about the properties of that cache.
For example, the Sun C++ compiler has an -xcache switch. The
man page for that switch shows the syntax and its use.
-xcache=c defines the cache properties that the optimizer can use. It does not guarantee that any
particular cache property is used. Although this option can be used alone, it is part of the expansion of
the -xtarget option; its primary use is to override a value supplied by the -xtarget option.
-xcache=16/32/4:1024/32/1 specifies the following:

                 Level 1 cache        Level 2 cache
Size             16K bytes            1024K bytes
Line size        32 bytes             32 bytes
Associativity    4-way                Direct mapping
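
To show how the value above breaks down, here is a small C++ sketch that splits a string of the form size/line/associativity for each level, with levels separated by a colon. This is only an illustration of the format as described in the post. The names CacheLevel and parse_xcache are invented here and have nothing to do with how the compiler itself reads the option.

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <stdexcept>
#include <string>
#include <vector>

struct CacheLevel {
    std::size_t size_kb;     // cache size in kilobytes
    std::size_t line_bytes;  // line size in bytes
    std::size_t ways;        // associativity (1 = direct mapped)

    // Number of sets = total bytes / (line size * associativity).
    std::size_t sets() const {
        assert(line_bytes > 0 && ways > 0);
        return size_kb * 1024 / (line_bytes * ways);
    }
};

// Parse a value such as "16/32/4:1024/32/1" into one CacheLevel per level.
std::vector<CacheLevel> parse_xcache(const std::string& spec) {
    std::vector<CacheLevel> levels;
    std::istringstream all(spec);
    std::string level_text;
    while (std::getline(all, level_text, ':')) {
        std::istringstream one(level_text);
        CacheLevel lv{};
        char slash1 = 0, slash2 = 0;
        one >> lv.size_kb >> slash1 >> lv.line_bytes >> slash2 >> lv.ways;
        if (one.fail() || slash1 != '/' || slash2 != '/' ||
            lv.line_bytes == 0 || lv.ways == 0)
            throw std::invalid_argument("bad cache description: " + level_text);
        levels.push_back(lv);
    }
    return levels;
}
```

For the example value, the first level works out to 16 * 1024 / (32 * 4) = 128 sets and the second to 1024 * 1024 / (32 * 1) = 32768 sets.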

Developing software to truly take advantage of CMP (chip multiprocessors) requires careful thought about the
instruction set of the target processor or family of processors and about memory usage. This includes being
aware of opportunities for optimizations, such as loop unrolling, high-speed vector manipulations, SIMD
processing, and MP compiler directives, and giving compilers hints for values such as the size of L1 or L2 cache.
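
As one example of the optimizations listed above, here is a rough C++ sketch of loop unrolling done by hand. It is an illustration only, and the function name is made up. Optimizing compilers often unroll loops themselves, so whether writing it out like this helps has to be measured on the target processor.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Sum a vector with the loop unrolled four times. The four separate
// accumulators give the processor independent additions to overlap.
long long sum_unrolled(const std::vector<int>& v) {
    const std::size_t n = v.size();
    long long a0 = 0, a1 = 0, a2 = 0, a3 = 0;
    std::size_t i = 0;
    for (; i + 4 <= n; i += 4) {
        a0 += v[i];
        a1 += v[i + 1];
        a2 += v[i + 2];
        a3 += v[i + 3];
    }
    long long total = a0 + a1 + a2 + a3;
    for (; i < n; ++i)  // leftover elements when n is not a multiple of 4
        total += v[i];
    assert(i == n);     // every element has been visited exactly once
    return total;
}
```

The result is the same as a plain one-element-at-a-time loop; only the shape of the loop changes.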
