Random Access Memory / Useful Notes

Not to be confused with the album by Daft Punk.

While the CPU is the heart of a computer system, moving data back and forth and processing it as required, this data still needs to be held somewhere. That's where RAM comes into play. RAM stands for Random Access Memory: any place in the memory can be read or written at any time without having to wait. This contrasts with sequential access memory, where you have to rewind or fast-forward a tape or wait for a certain time to access data.

Most people simply call RAM "memory" now. It comprises the main operating part of the computer's storage hierarchy. Just as Clock Speed is misunderstood to be the only measure of Central Processing Unit power, capacity is thought to be the only important measurement when it comes to Random Access Memory.

Memory is not all about capacity. Unless a system or game is idle, memory will not hold the same data indefinitely. The system is constantly moving data on and off the memory chips to keep up with the ever-changing workload. In other words, capacity is important, but so is how fast data can be moved on and off the chip. In situations where the machine has to multitask (such as PCs, PlayStation 3, and Xbox 360), capacity can increase performance, but the returns diminish quickly (i.e., if you double the RAM, it might really boost performance, but if you double it again, it won't do much). More available RAM is helpful for storing more data that you wish to use immediately. It prevents more frequent access to the slower hard drive/DVD/Blu-ray disc, which to a processor takes an eternity.

Like a CPU, memory speed is measured in Clock Speed and latency, and latency tends to affect memory more than processors. This is because one also has to take into account the speed of the bus, the shared electrical pathway between components. With RAM embedded on the CPU die, there is a very short distance and a dedicated pathway that the bits can travel across, while RAM placed in other areas requires the bits to travel the shared bus, which may have other devices using it. This means factors such as the bus speed and the number of other devices requiring the bus can contribute to data-transfer latency. Even the physical length of the bus can become a non-trivial factor in how fast data can be moved in and out of RAM.

In addition to clock speed, latency, and capacity, memory is also measured in bandwidth. Bandwidth is the amount of data that can flow between the processors and the memory each second. The bandwidth figure tends to be much higher than the memory capacity, typically 500 to 1000 times greater. The full bandwidth is unlikely to ever be used up (which is why the quoted figure is called a "theoretical maximum"; it could reach that maximum, in theory). The headroom is just there to ensure the smoothest running between the memory and processors. How these measurements compare depends on the type of memory.
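
As a rough illustration of where a "theoretical maximum" bandwidth figure comes from, peak bandwidth is usually estimated as the number of transfers per second multiplied by the width of the bus. The sketch below is a back-of-the-envelope calculation, not a measurement; the function name and the example figures (a 64-bit bus at 3200 million transfers per second) are assumptions chosen purely for illustration.

```python
def peak_bandwidth_bytes_per_s(transfers_per_second: float, bus_width_bits: int) -> float:
    """Theoretical peak: every transfer moves one full bus width of data."""
    return transfers_per_second * bus_width_bits / 8


# Hypothetical example: 3200 million transfers/s over a 64-bit bus.
peak = peak_bandwidth_bytes_per_s(3200e6, 64)
print(f"{peak / 1e9:.1f} GB/s")  # prints "25.6 GB/s"
```

Real systems fall short of this number because of refresh cycles, bus contention, and access patterns, which is why it stays "theoretical".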

One of the problems with memory and the CPU during the development of computers is something called the Memory Wall. While CPU performance improved by about 56% annually from 1986 to 2000, RAM performance only improved by about 10% annually. Thus it's only a matter of time before RAM becomes too slow for the CPU; that is, the CPU will finish its task and sit idle waiting for data to arrive from or be written to RAM. While improvements in CPU efficiency (for example, Intel's Core 2 processors versus the Pentium D processors) have stalled this problem, physics essentially dictates that unless memory performance starts improving, CPU performance will start suffering.
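
To get a feel for how quickly those two growth rates pull apart, here is a minimal sketch of the compound-growth arithmetic. It uses only the figures quoted above (56% and 10% per year over 1986 to 2000); the function name is made up for this example, and the result is an order-of-magnitude illustration rather than a benchmark.

```python
def relative_gap(cpu_growth: float, ram_growth: float, years: int) -> float:
    """How many times further ahead the CPU is after `years` of compound growth."""
    return ((1 + cpu_growth) / (1 + ram_growth)) ** years


# Figures from the text: about 56% vs. 10% per year, 1986 to 2000 (14 years).
gap = relative_gap(0.56, 0.10, 2000 - 1986)
print(f"CPU performance pulled ahead of RAM by roughly {gap:.0f}x")
```

Even a modest yearly difference compounds into a gap of more than a hundred times over that period, which is the whole point of the "wall".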

A misunderstood aspect of memory is that more memory automatically equates to better performance. This belief probably started in The '90s, when "just good enough" computers were sold. Technology was improving at such a rapid pace that the amount of RAM in a recently purchased computer might not be enough to run a program half a year down the road. The amount of RAM available to a computer is a massive YMMV in terms of performance. But the test to determine whether a system would benefit from more is actually simple. If RAM is constantly full and the system is using the hard drive's swap file, the system could definitely benefit from more RAM. If RAM is barely being used, then adding more won't help. This is changing on modern operating systems, however, where spare memory is passively used as a cache holding extra data files for fast access by programs, filling up the longer the system is on. If the memory is needed for active use, then the cache is pushed out to make room.

As a tangible example, imagine you're grocery shopping. You opt for the smaller basket at first. Later in life or at some point, you start to require more goods. If you continue to use the basket, it overflows and you have to complete the shopping trip and unload what you have back home and come back another time. However, if you use the much larger cart, you can fit more at once and do everything in one trip. But just because adding one shopping cart made your life easier doesn't mean adding another will. (Though you can "cache" groceries you may need for future use into the extra cart and set that aside, much like an operating system does for pre-fetching data that may be needed later.)

There are several ways to classify the RAM types, but the two most used are the technological classification (that is, by the technology underlying each type), and usage classification, breaking the types down by their purpose. Here they are:

Base PC Memory models:

Like the Commodore Amiga, PC memory is segmented into several different subtypes.

Conventional memory: The base computer memory type, defined by up to the first 640 kilobytes of RAM. This is the base memory of the computer, and is addressable by all PC programs.
Upper Memory Area (UMA): On many systems, it is possible to populate the motherboard with more RAM beyond 640k, up to 1MB. This excess RAM area is known as the Upper Memory Area and co-exists with "memory holes" intended for communication with EMS cards and other peripherals. On systems where more memory is needed, more than 640k worth of RAM DIP-chips are installed, and the OS and device drivers work in tandem to mark which regions are memory holes and which are extra UMA memory. These UMA locations can then be "backfilled" by UMA-aware programs, in blocks.
Expanded Memory Specification (EMS): The result of a joint venture between Lotus, Intel and Microsoft, the specification was designed to work around the 1MB limit imposed by the 8086 processor. The technique breaks up memory on an expansion card into pages (16k each in the LIM specification) and opens a 64k "memory window" along with a special page register, both located at specific parts of the upper memory area, below the 1MB limit. An EMS-aware program writes the number of the page it wants into the page register, which tells the memory mapper on the card which page to make available to the program via the memory window (this works somewhat similarly to certain NES game cartridges). Despite the introduction of XMS memory, EMS remained the standard for many years, with EMS emulators appearing when XMS became the norm, solely because the standard was so widespread in the business world.
Extended Memory Specification (XMS): With the arrival of the 286 CPU and its much larger memory address limit, manufacturers realized that they could stop using paging to improve read and write speeds. XMS memory resides immediately above the 1MB memory location (memory locations from 640k through 1023k are typically reserved for communication with expansion cards or for backfilled memory using the "Upper Memory Area" scheme). Unlike EMS, it is straightforward to use and lacks the complicated page-swapping mechanism, and as a result it is considerably faster than EMS cards. However, due to the widespread use of older programs that use EMS memory, EMS emulators like QEMM became a must. MS-DOS 5.0 onwards shipped with the EMM386 EMS emulator included in the box. The XMS standard remains in use to this day. Under MS-DOS, XMS memory must be enabled via a driver, typically HIMEM.SYS, due to MS-DOS' 16-bit legacy code, which remained largely unchanged through its lifespan. A program must then use an extender (or in the case of Windows 9x, use its own memory manager) to enter 32-bit mode to access the XMS memory. Windows NT and its various offspring, as well as various Unix flavors, boot into 32-bit (and later, 64-bit) mode from the get-go, do not need drivers and extenders, and let programs access all XMS memory directly.
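
The page-register trick used by EMS (often called bank switching) can be sketched in a few lines. This is a deliberately simplified toy model, not the real EMS interface: the class name, method names, and sizes are all invented for illustration, and it shows a single window onto a single page at a time.

```python
class BankSwitchedMemory:
    """Toy model: a large paged store seen through one small fixed window."""

    def __init__(self, page_count: int, page_size: int) -> None:
        self.page_size = page_size
        # Memory on the (hypothetical) expansion card, split into pages.
        self.pages = [bytearray(page_size) for _ in range(page_count)]
        self.page_register = 0  # which page the window currently shows

    def select_page(self, page: int) -> None:
        """Write to the page register: remap the window to another page."""
        if not 0 <= page < len(self.pages):
            raise ValueError("no such page")
        self.page_register = page

    def read(self, offset: int) -> int:
        """Read a byte through the window (offset is within the window)."""
        return self.pages[self.page_register][offset]

    def write(self, offset: int, value: int) -> None:
        """Write a byte through the window."""
        self.pages[self.page_register][offset] = value


card = BankSwitchedMemory(page_count=4, page_size=1024)
card.select_page(2)
card.write(0, 0xAB)
card.select_page(0)
assert card.read(0) == 0      # same window address, different page
card.select_page(2)
assert card.read(0) == 0xAB   # the data is still sitting on page 2
```

The program only ever sees one window's worth of addresses, which is why the CPU's address limit no longer caps the total memory, and also why every page switch costs extra work compared with memory that can simply be addressed directly.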

By technology:

Historical

Well before modern memory types became available, early machines still needed to store their data. Even the ENIAC, which didn't even have a stored program (it was controlled by sequentially wiring all the modules together), had some storage for data. Initially this was the very straightforward and obvious solution: static memory, that is, keeping the data in electronic circuits named triggers, or flip-flops, that could remain in one of two stable states. But because word size in those early machines was somewhere between 20 and 40 bits, and one flip-flop can hold only one bit of information while requiring at least four electronic valves at a time when the only available type of valve was a huge and fragile vacuum tube, having more than a couple dozen such "registers" was simply impractical.

That's where everything got interesting. To hold bigger amounts of data, several technologies were used, some of them decidedly odd, like storing the data as acoustic waves (yes, bursts of sound) in mercury-filled tubes, or as magnetic pulses on a rotating drum. Technically, these types of memory weren't even random-access; they were sequential, but they simulated RAM relatively well. Then there was a technology where the bits were stored as dots on the phosphor surface of a CRT, which had the advantage that the programmer could literally see the bits stored, which often helped in software debugging.

But most of these technologies were not terribly practical; they were expensive, slow and (especially in the case of mercury delay lines) environmentally dangerous. Dr. An Wang (then at Harvard) proposed a solution which took the industry by storm: magnetic core memory.

Core memory consisted of thousands of tiny (1-2 mm wide) donut-shaped magnetic cores, set on a grid of metal wires. By manipulating the voltages put on these wires, the state of any individual core could be read or written. Since there were no moving parts, as with a delay line or a drum, access time was much quicker. Core memory was also substantially denser than either delay-line or CRT memory, and used less power as well. It also held its content when the power was off, a property that was widely exploited at the time.

In addition to their compact size (for example, a unit holding 1K, a rather generous amount for the time, was a flat frame only 20x20x1 cm), core memories were also rather cheap. Cores had to be assembled by hand, even in their last days (early attempts to mechanize the process failed and were abandoned once semiconductor RAM appeared), so most manufacturers used the cheap labor of East Asian seamstresses and embroiderers (who had been made redundant by the widespread adoption of sewing machines), thus keeping it affordable. Most Mainframes and Minicomputers used core memory, and it was ingrained into the minds of the people who worked on them to such an extent that even now you can run into situations where the word "core" is used as a synonym for RAM, even though computers in general haven't used it since The '70s.

Solid State RAM

Solid state RAM was the technology that finally ended the core era. It was an outgrowth of the attempts to miniaturize electronic circuits. Transistors had replaced vacuum tubes in early computers relatively quickly, due to their smaller size, reliability (they had no glass envelopes to break or filaments to burn out) and much lower power consumption. However, even the smallest transistors at the time were about the size of a small pencil eraser, and it took hundreds of them to make a working computer, so computers still remained bulky and expensive. In The '50s two engineers independently figured out how to put several transistors and other electronic components on the same piece of semiconductor, and thus the integrated circuit was born. Electronic circuits started to shrink almost overnight, and one of their first applications in the computer industry was RAM.

Static RAM, as mentioned above, is a type of memory where each bit is represented by a state of a certain type of circuit called a flip-flop. With one IC replacing several transistors and their attendant circuitry, static memory became much more affordable, and started appearing in larger and larger amounts. The main advantage of static memory is that it's very quick: basically, its speed is only limited by the speed of the physical processes inside the transistor, and these are extremely rapid. It also requires power only to write something, and takes only a token amount when reading or storing, so it dissipates almost no heat. But still, each bit of static memory typically takes four to six transistors to store, so it remains relatively bulky and expensive.
Dynamic RAM, on the other hand, uses capacitors to store bits (it generally requires one capacitor and one transistor to store one bit, which takes much less silicon space), so it's much more compact and thus cheap. Unfortunately, capacitors tend to lose charge over time, so they have to be periodically recharged, usually by reading the memory and writing the same data again, a process called "memory refresh". This process takes either the attention of the CPU or additional support circuitry on the memory chip itself, and, to add insult to injury, the need to constantly refresh the memory contents means that when the power gets turned off, all memory gets completely erased. By contrast, core, being magnetic, was completely non-volatile, and static RAM required so little power that it could be kept alive with a simple lithium watch battery. Still, the enormous density that DRAM offers makes it the most affordable and most widely used type of memory ever.
Magnetic RAM is basically a return to core on a new level, where each ferrite donut of the old-style core is replaced by a ferrite grain in an IC. It has the density advantage of DRAM (there is some penalty, but it's not that big), its speed is closer to static RAM, it's completely non-volatile and it can be written as fast as it is read (not to mention as many times as needed), negating most of the Flash Memory drawbacks. Unfortunately, due to Flash selling like ice cream on a hot day, few producers could spare their fabs to produce it, and it requires significant motherboard redesign to boot. This and several technological bottlenecks seem to have locked it in Development Hell for the time being.
- On a side note, there's also a security issue with non-volatile memory. For example, if a computer doing encryption had non-volatile memory, a clever hacker could turn off the machine, take out the memory, and do a dump without fear of losing the contents. For the same thing to happen with DRAM (it actually loses its contents over time, not instantaneously), the person would have to dump the RAM chip in liquid nitrogen to slow the discharge process to a crawl. However, with the advent of on-the-fly encryption co-processors that encrypt data in the memory management unit itself before anything is even written out to the memory, non-volatile memory is making a comeback in a highly upgraded form.
There's a new type of memory on the horizon based on memristors. While theorized in 1971 as the fourth passive two-terminal electrical component, the memristor wasn't actually fabricated until 2008. Memristors have the property that resistance increases when current flows through one way, and decreases when current flows the other way. This changes the voltage across the part, which can be used to read a 0 or 1. Since it's a passive part, it's very fast and requires no power to retain its state. Currently this technology is being marketed on server-class hardware as NV-DIMM (Non-Volatile Dual-Inline Memory Module). With information going to and from the CPU now being encrypted before it is even written to memory by a dedicated co-processor, when used with the correct hardware, this type of memory is relatively secure.
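
The "memory refresh" that dynamic RAM needs, as described above, can be illustrated with a toy simulation of a single leaky cell. This is only a sketch: the leak rate and read threshold are made-up numbers, and all names are invented for the example. Real DRAM is refreshed row by row by dedicated circuitry on a schedule of milliseconds.

```python
from typing import Optional


class DramCell:
    """Toy model of one DRAM bit: a capacitor whose charge leaks over time."""

    LEAK_PER_TICK = 0.2   # fraction of full charge lost per tick (made-up figure)
    THRESHOLD = 0.5       # charge above this reads as 1 (made-up figure)

    def __init__(self) -> None:
        self.charge = 0.0

    def write(self, bit: int) -> None:
        self.charge = 1.0 if bit else 0.0

    def tick(self) -> None:
        """Let one unit of time pass; the capacitor leaks."""
        self.charge = max(0.0, self.charge - self.LEAK_PER_TICK)

    def read(self) -> int:
        return 1 if self.charge > self.THRESHOLD else 0

    def refresh(self) -> None:
        """Read the bit and write it straight back, restoring full charge."""
        self.write(self.read())


def survives(ticks: int, refresh_every: Optional[int]) -> bool:
    """Store a 1, let time pass, and report whether it still reads as 1."""
    cell = DramCell()
    cell.write(1)
    for t in range(1, ticks + 1):
        cell.tick()
        if refresh_every and t % refresh_every == 0:
            cell.refresh()
    return cell.read() == 1


print(survives(10, None))  # False: without refresh the bit leaks away
print(survives(10, 2))     # True: refreshed often enough, the bit is kept
print(survives(10, 3))     # False: refreshed too late, a 0 gets written back
```

The last case shows why the refresh interval matters: once the charge has dropped below the threshold, a refresh faithfully rewrites the wrong value.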

Obsolete modern RAM types

Fast Page RAM - an evolution of regular DRAM, used from the 286 era up until the early Pentium era. It had an access time of around 70ns. A typical module of the era would hold up to 8MB of Fast Page RAM and run at 66MHz speed. Also worth noting is that halfway through the RAM's lifespan, there was a slot design change and the pin count of the connector went up from 30 to 72. Both versions are known as SIMMs, or Single Inline Memory Modules; the DIMM, or Dual Inline Memory Module, is a later design.
Extended Data Output (EDO) RAM - Starting from the middle of the Pentium era, this RAM type emerged to replace Fast Page RAM. It is electrically backwards compatible with Fast Page RAM and uses the same 72-pin SIMM slots, and also runs at the same 66MHz speed as its predecessor. However, the access time was improved to 60ns. An enhanced version supporting burst operations (BEDO RAM) was introduced late in the RAM type's life, but by then the market had already chosen SDRAM as its successor, since SDRAM was the cheapest of the three (the RAM was also put in competition with Rambus DRAM). An EDO module can be up to 128MB in size.
Single Data Rate (SDR) RAM - Initially marketed as Synchronous Dynamic RAM (SDRAM), the technology was later renamed Single Data Rate RAM to signal that it is the precursor to Double Data Rate RAM. Introduced at 66MHz speed late in the Pentium's life and facing competition from both BEDO RAM and Rambus DRAM, this RAM type eventually emerged as the consumer's choice due to its affordability, and is the direct predecessor of DDR RAM. It was constantly worked on, and by the time it was finally usurped by DDR RAM in the Pentium 4 era, it had reached speeds of up to 133MHz and module sizes of up to 512MB.

DDR RAM

The DDR stands for "Double Data Rate". Typically, RAM transfers data once per clock cycle, while this kind of memory does it twice. It does come at the cost of slightly higher latency, but doubling the transfer rate is a huge advantage for gaming. Once DDR became commercially available, the Xbox was the first console to use it, while the competing PlayStation 2, and later the PlayStation 3, used the rival Rambus DRAM (see below). Each generation of DDR has reduced the operating voltage, which means it uses less power for each memory transfer. However, increasing speeds mean overall power use may still be higher.

Currently, we're into the fifth generation of DDR RAM. The generations are as follows:

The original DDR RAM (sometimes retroactively called DDR1): 266MHz-400MHz, Module size ranges from 128MB up to 2GB
DDR2 RAM: 533MHz-1066MHz, module sizes range from 512MB up to 4GB. Few motherboards and processors support driving the RAM at 1066MHz speed without overclocking; most max out at 800MHz.
DDR3 RAM: 800MHz-2.8GHz, module sizes range from 1GB to 16GB, with 32GB modules on the roadmap. It was still being developed in tandem with DDR4 RAM (presumably because of embedded or lower cost applications) and as of 2015 reached speeds of up to 2.8GHz. However, most consumer CPUs supported only up to 2133MHz, after which manufacturers switched to DDR4 RAM. The last consumer chips to use DDR3 RAM are the AMD Kaveri and the Intel Broadwell CPUs.
DDR4 RAM: 1.6GHz-4.3GHz. Memory module sizes start at 4GB. The RAM reached its full 4.3GHz potential by the end of 2016. As of late 2019, CPUs still only officially supported a maximum RAM speed of 3.2GHz. However, newer CPUs unofficially support much higher speeds, with over 5GHz possible with overclocking. The last consumer chips to use DDR4 RAM are the AMD Zen 3 and Intel Rocket Lake CPUs.
DDR5 RAM: 4.8GHz-8.4GHz. Launched in mid-2020, with the first modules arriving in mid-2021 and Intel's Alder Lake CPUs being the first to support the standard. AMD's Zen 4 CPUs also use DDR5 memory. DDR5 RAM is a lot more like GDDR5X in that it actually has two parallel access lines to the RAM module as opposed to its predecessors' single line, making it more like quad data rate RAM on paper. DDR5 RAM is also the first of its kind to support oddball sizes (for example, 24GB, 48GB and so on, per module).

GDDR RAM is a variant of DDR designed specifically for use with GPUs. It allows higher memory bandwidth as well as adding some extra functions, such as the ability to fill whole memory blocks with a single colour. The cost however, is higher latency, but most of the work GPUs do is highly predictable, so memory requests can be made ahead of time. Although based on DDR RAM, it has evolved somewhat separately and so doesn't quite match up in terms of generations. GDDR4 and 5 were both based on DDR3. This was followed by GDDR5X, which is technically quad data rate and not really DDR at all. GDDR6 is an evolution of this, diverging further from the standard DDR. The first commercial GPUs using GDDR6 were released in 2018, and as of mid-2019 it is used by all new GPUs from both Nvidia and AMD.

Rambus DRAM

"Rambus Dynamic RAM" focuses on slightly higher bandwidth, and much higher clock speed. It does come at the cost of higher power consumption, higher capacity, and slower latency. The last one has been reduced in later versions, to the point where the XDR variant on the PS3 has latency no slower than DDR memory.

This meant the earlier versions were not that good for graphics. It didn't hurt the PlayStation 2, which used it for regular memory, not for video memory, but the Nintendo 64 did use it for video memory. This was one of many bottlenecks that kept the system from performing as well as its graphics looked.

Rambus DRAM is evidently good for video playback, which is why the PS2 and PS3 are considered such good movie players for their times. The PlayStation Portable doesn't use that kind of memory, given that the increased power consumption would drain the battery. This has meant that UMD movie playback on TVs is notably washed out. Rambus DRAM was briefly used in the early 2000s for home PCs; however, although it was indeed blazing fast, upgrading it was way too expensive. Module manufacturers passed the high licensing fees down to consumers, and many motherboard manufacturers felt that the licensing fees Rambus charged were too high (those who put up with them again passed the cost down to consumers). This made both the RAM modules and the motherboards more expensive than the other option, which was SDRAM.

Although Rambus produced a specification for XDR2, the idea had already effectively been outcompeted by GDDR and was never used. The PlayStation 3 was the last significant product to use this memory type.

EDRAM

Typically, a Graphics Processing Unit does not have cache. Video memory fills that role. But "Embedded Dynamic RAM" is pretty damn close. It's stuck right next to the processor instead of inside of it. The gain is larger size (but still much smaller than standard memory), and its clock speed still matches the processor. The tradeoffs are smaller bandwidth (but still about 10 to 100 times more than standard memory), higher latency (but still much, much lower than standard memory), and increased manufacturing cost.

The greatly increased speed means that the video memory can handle about the same amount of data as standard memory of a much larger size. The PS2, GameCube, Wii, and 360 all use this kind of memory, while the Xbox One uses this memory type exclusively for graphics. People thought the size was too small, but once they caught on, the systems ran just fine with that memory. Future systems with the Cell Processor plan on using this kind of memory, as it's practically fast enough to keep up with that processor.

1T-SRAM

1T-SRAM is something of a Non-Indicative Name. The technology uses DRAM (which typically uses 1 transistor, hence "1T"), but has the support circuitry built in to handle DRAM refreshing so that the memory controller on the system doesn't have to do it. This makes it look like SRAM to the rest of the system. However, this limits how fast the RAM can operate. Probably its most famous use was being the system RAM in the GameCube and Wii.

High Bandwidth Memory (HBM)

HBM is the result of a wall that GDDR-type memory hit. That is, even though GDDR5 has reached an impressive 7.0 GHz, it takes a lot of power to run it. In order to reduce power consumption and increase memory bandwidth at the same time, AMD teamed up with Samsung and SK Hynix to create HBM. The idea is to simply stack RAM dies on top of each other, use high-density through-silicon vias as communication channels, and use an interposing layer as the base that the GPU also sits on to talk to the memory. The result is a staggering 4096-bit bus interface in its first implementation and dozens of watts in power savings for the same amount of memory. The concept is similar to the package-on-package manufacturing used by system-on-chip companies, where the processor is stacked on top of RAM in the same package. HBM is currently in its third generation; the only difference from the first generation is a faster signalling speed, and the memory bus is still 4096 bits wide.

As of 2023, usage of HBM has been mostly relegated to top-end GPU designs. The main issue is cost, with the last reported cost of HBM 2 being nearly 3x that of GDDR 5, and that's without the interposer. In addition, it's likely that for volume production, HBM may present yield issues. If the final assembly has any problems, it's much harder to troubleshoot and isolate the problem in order to fix it. GDDR-based VRAM designs are modular, which makes it easier to isolate and replace a troublesome component.

By usage:

Registers

This is the fastest type of memory available. All processors typically have registers, which are normally very fast static RAM. Each stores about a word of data, the number of bits the processor can handle at once. The most important one is the instruction counter, as it holds the location of the next instruction. Another one that's always found is the status word. Others may be present, and some of them may not be usable by the programmer. Interestingly enough, complex instruction set processors have relatively few registers, while reduced instruction set processors easily have over 100. For example, most modern x86 CPUs, being very advanced RISC machines internally, use automatic renaming of their ~128 internal registers to simulate several sets of the 14 traditional x86 registers, thus allowing several CISC instructions to be run at once.

Cache

Just in case you are wondering, it's pronounced "cash", not "ca-shay". This is memory stuck right in the CPU. Why? Well, often the CPU needs to store data for certain processing. It doesn't need to store a lot, but it needs to store it in memory as fast as possible. Cache fills that purpose. By sticking it right inside the processor, the latency is no worse than the processor's, and the clock speed matches. It does mean that the cache can only be so large. The 360 has the most cache of any home console, and it's just 1 megabyte in size. But it's the speed that counts, since it's designed to keep up with the CPU. Many modern PCs and consoles have multiple cache levels of varying sizes; for example, a modern PC from 2010 has at least three levels. Usually, Level 1 is the fastest but has the smallest amount of storage, and as the level goes higher, the speed is reduced but the storage amount is increased.
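
The payoff of stacking cache levels like this can be estimated with the textbook "average memory access time" idea: the time to check a level, plus the fraction of requests that miss multiplied by the cost of going one level further out. The sketch below is a rough model only; the latencies and hit rates are invented for illustration and do not describe any particular CPU, and the function name is made up for this example.

```python
from typing import Sequence, Tuple


def average_access_time(levels: Sequence[Tuple[float, float]],
                        memory_latency: float) -> float:
    """Estimate the average access time for a stack of caches.

    `levels` lists (latency, hit_rate) pairs from Level 1 outward;
    `memory_latency` is the cost of going all the way to main RAM.
    """
    # Work from the outermost level inward: a miss at one level costs
    # the average access time of everything beyond it.
    time_beyond = memory_latency
    for latency, hit_rate in reversed(levels):
        time_beyond = latency + (1.0 - hit_rate) * time_beyond
    return time_beyond


# Made-up figures in nanoseconds: three cache levels in front of 100 ns RAM.
caches = [(1.0, 0.90), (4.0, 0.80), (20.0, 0.50)]
print(f"{average_access_time(caches, 100.0):.1f} ns with caches")
print(f"{average_access_time([], 100.0):.1f} ns without")
```

With these numbers, the small, fast Level 1 cache ends up dominating the average, which is why a tiny amount of very fast memory is worth so much.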
