While shrinking an existing CPU's components or cooling it better allows you to increase its Clock Speed, physics dictates that computer chips can only get so small (unless you get exotic). Adding more stages to the instruction pipeline lets the processor do more in less time, but an excessively long pipeline makes non-linear code (code full of branches) execute slowly. "How can we keep making processors faster now that all of our old tricks are running into a brick wall?" CPU makers asked themselves. Simple: pack more than one of these modern processor cores into each computer.
The multi-core processor tries to integrate as many CPU cores and their supporting components as possible into a given package size. Although each CPU core can still only do one thing at a time (until we come up with something else), the benefits are twofold:
- The computer can run as many processes at once as there are cores.
- A single process can spread itself across all the cores for maximum throughput.
The only problem is that this only boosts the performance of programs that are highly CPU-bound, that is, programs that spend most of their time crunching numbers. Good examples of CPU-bound programs include image and video editors and 3D renderers. Highly I/O-bound programs idle most of the time and thus do not benefit from more than one core. These include word processors, Internet browsers, and even the GUI of the OS. Performance benefits for games vary: some games scale with more cores, while others see no improvement at all.
An additional problem is the use of resources outside the CPU cores. While it is fairly obvious that adding another CPU won't help much when the GPU is the bottleneck, there are other issues. An easily overlooked one is where the CPU stores its data. Heavy processing tends to involve a lot of data, and you don't want the processor waiting on memory to deliver it, especially not when multiple processors are fighting each other for access to that memory.
There's also the issue of coordination. You do not want a CPU core to work on a job that another core is already working on. Nor do you want the cores stomping on one another's memory space. This was especially a problem with the Sega Saturn and part of the reason why developers hated it. Granted, Sega's Virtua Fighter Remix (the less said about the first version, the better) did provide a good example of how to use the system, but this was a new thing at the time, and many programmers stuck with tried-and-true programming for a single processor. Another issue is whether or not the OS can address those cores.
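To give a rough feel for what "coordination" means in code, here is a minimal sketch in Python. Everything in it (the `count_with_lock` name, the shared counter) is made up for illustration and has nothing to do with how Saturn games were actually written. Note also that threads in standard CPython mostly take turns rather than truly running on separate cores at once, but the pattern of guarding shared memory with a lock is the same idea.

```python
import threading


def count_with_lock(num_threads: int, increments: int) -> int:
    """Have several threads bump one shared counter, guarded by a lock."""
    counter = 0
    lock = threading.Lock()

    def worker() -> None:
        nonlocal counter
        for _ in range(increments):
            # Only one thread at a time may read-modify-write the counter,
            # so no update is lost to another thread stomping on it.
            with lock:
                counter += 1

    threads = [threading.Thread(target=worker) for _ in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter


print(count_with_lock(4, 1000))  # 4000
```

The price of the lock is that the workers spend part of their time waiting on each other, which is one reason why more cores do not automatically mean proportionally more speed.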
A common misconception is that a multi-core processor is the same as a single core with multiplied speed. A single core running at the multiplied speed will handle the same workload at least as fast as the multi-core processor, and usually faster, especially with processes that can't run on more than one core. For example, if Processes A, B, and C take 1, 2, and 3 seconds respectively on one core, a quad-core processor will get them all done in 3 seconds (each process gets its own core, and the fourth core sits idle). A single-core processor running four times as fast will get them done in (1 + 2 + 3)/4, or 1.5 seconds.
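The arithmetic in that example can be sketched in a few lines of Python. The function names are invented for this illustration, and the model is deliberately simplistic: it assumes each process is stuck on a single core, that there are at least as many cores as processes, and that nothing else (memory, I/O, the OS) gets in the way.

```python
def multi_core_time(run_times, cores):
    """Finish time when every process gets its own core.

    Assumes no more processes than cores, so the slowest process
    decides when everything is done.
    """
    if len(run_times) > cores:
        raise ValueError("this sketch assumes one core per process")
    return max(run_times)


def fast_single_core_time(run_times, speedup):
    """Finish time on one core that runs `speedup` times faster.

    The processes run one after another, but each takes
    1/speedup of its original time.
    """
    return sum(run_times) / speedup


run_times = [1, 2, 3]  # seconds for Processes A, B, and C

print(multi_core_time(run_times, cores=4))          # 3
print(fast_single_core_time(run_times, speedup=4))  # 1.5
```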
The limitations of parallelizing code are spelled out by Amdahl's Law on The Other Wiki.
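For the curious, Amdahl's Law in its usual form says that if a fraction p of a program's work can be parallelized, the best possible speedup on n cores is 1 / ((1 - p) + p / n). Below is a small Python sketch of that formula. The name `amdahl_speedup` is just a label chosen here, and the result is a theoretical ceiling that ignores coordination and memory overhead.

```python
def amdahl_speedup(parallel_fraction: float, cores: int) -> float:
    """Theoretical best-case speedup under Amdahl's Law."""
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if cores < 1:
        raise ValueError("cores must be at least 1")
    # The serial part cannot be sped up no matter how many cores you add.
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / cores)


# A program that is only half parallelizable, on a quad-core:
print(amdahl_speedup(0.5, 4))  # 1.6
```

Even with an absurd number of cores, that half-parallel program can never run more than twice as fast, because the serial half still takes just as long as before.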