Useful Notes / Gaming Audio

Gaming audio has evolved a great deal since the first machines to feature it appeared in the 1970s.

In the beginning: Beeps and clicks

On early machines, there was either no sound hardware at all, or the sound hardware was extremely simple, able only to click a speaker or play simple tones. This was also the default sound system on IBM PC compatibles for many years, as well as on early Apple computers like the Apple II, IIe and IIc (the Apple IIGS had PCM sample playback capabilities courtesy of an Ensoniq PCM codec).

Programmable sound generators

One step up from simple beeps and clicks was the programmable sound generator (PSG), a set of oscillators on a chip that could be programmed in real time. The simplest ones, like the Texas Instruments SN76496 and the General Instrument (later Microchip) AY-3-8910, had 3 square-wave channels with independent volume controls plus a white noise generator. At the other end of the spectrum was the Commodore 64's SID, a full 3-channel hybrid analog synthesizer with triangle, sawtooth and pulse wave oscillators, filters and white noise. The Atari ST and MSX also used a PSG audio chip, though later versions of the Atari ST also included a PCM audio codec for rudimentary speech and sound effects support. The Atari 8-Bit Computers used a POKEY chip, which is also a PSG. Most sound cards released for the Apple II were PSG-based as well. Many PSGs could be fooled into playing back sampled audio by feeding PCM values into the volume control registers thousands of times a second, as heard in the "Say-Gah!" voice clip of some Game Gear games, and PSGs have even been used for CPU-driven speech synthesis on several occasions. It's safe to say that from the late 70s to the early 80s, PSG chips were the backbone of gaming audio. Still not convinced? Creative Labs' first sound card for the PC, before the widely successful SoundBlaster, was the CMS (Creative Music System, later rebranded as the Game Blaster), which was PSG-based as well.
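The volume-register trick can be sketched in a few lines. This is only a rough model, not an emulation of any particular chip: the 4-bit volume range and the linear volume steps are assumptions (real PSGs generally use logarithmic steps), and the function names are made up for illustration.

```python
import math

VOLUME_LEVELS = 16  # assumption: a 4-bit volume register (values 0..15)

def square_wave(freq_hz, volume, sample_rate, n_samples):
    """Model one PSG-style tone channel: a square wave scaled by its volume."""
    out = []
    for n in range(n_samples):
        phase = (n * freq_hz / sample_rate) % 1.0
        level = 1 if phase < 0.5 else -1
        out.append(level * volume)
    return out

def pcm_to_volume_writes(pcm_samples):
    """Reduce unsigned 8-bit PCM samples (0..255) to 4-bit volume values.

    On real hardware the CPU would write each value to a channel's volume
    register at the sample rate while the tone output is held steady, so
    the volume level itself traces out the waveform.
    """
    return [s >> 4 for s in pcm_samples]

# Build one cycle of an 8-bit sine "recording" and convert it to register writes.
pcm = [int(127.5 + 127.5 * math.sin(2 * math.pi * n / 32)) for n in range(32)]
writes = pcm_to_volume_writes(pcm)
```

Dropping from 8 bits to 4 bits per sample is why audio played this way sounds so coarse, and the constant stream of register writes is why it ties up the CPU.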

FM synthesizers

The next step up was the FM synthesizer. FM synthesizers work by having one oscillator modulate the frequency of another in real time, with up to 4 oscillators working together to make a note. The technique works best for woodwinds and many keyboard instruments like the harpsichord (one keyboard instrument that the FM synthesizer cannot reproduce reliably is the grand piano, whose ADSR (attack, decay, sustain, release) qualities proved too difficult to simulate using FM). Early FM synths had problems with percussion sounds (these tended to sound "flat", especially on first and second generation synthesizers, and were still a problem on third generation synths) and string instruments (these sounded "plasticky" and "toyish"), but most of the problems were ironed out in later generation synthesizers.
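As a rough illustration of the idea, here is a minimal two-oscillator FM voice. It uses the textbook formula sin(2*pi*fc*t + I*sin(2*pi*fm*t)), which is strictly speaking phase modulation but is what is normally meant by "FM synthesis"; the function names and the parameter values are assumptions chosen for the example, not taken from any real chip.

```python
import math

def fm_sample(t, carrier_hz, modulator_hz, mod_index, amplitude=1.0):
    """One output sample at time t (seconds) of a two-oscillator FM voice.

    The modulator is a plain sine wave; its output is added to the
    carrier's phase, which is what bends the carrier into a richer timbre.
    """
    modulator = math.sin(2 * math.pi * modulator_hz * t)
    return amplitude * math.sin(2 * math.pi * carrier_hz * t + mod_index * modulator)

def fm_tone(carrier_hz, ratio, mod_index, n_samples, sample_rate=44100):
    """Render n_samples of a tone whose modulator runs at carrier_hz * ratio."""
    return [
        fm_sample(i / sample_rate, carrier_hz, carrier_hz * ratio, mod_index)
        for i in range(n_samples)
    ]

# A 440 Hz note with the modulator one octave above the carrier.
tone = fm_tone(440.0, 2.0, 1.5, n_samples=441)
```

With a modulation index of 0 the voice is a pure sine wave; raising the index adds overtones, and changing the carrier-to-modulator ratio changes which overtones appear. Chips with 4 oscillators chain or sum several such stages.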

When FM synthesis was popular, Yamaha held the patent rights, so pretty much all arcade and console games that used FM used a Yamaha FM synthesizer chip to do the work. The Sega Genesis had a Yamaha FM chip plus a TI PSG chip inside.

Additionally, the Yamaha OPL chips were found on the MoonSound, MSX Music and MSX Sound expansion cards for the MSX, and on the popular AdLib sound card for PCs (which became the de facto standard until usurped by the SoundBlaster in the early 90s), as well as in most SoundBlaster PC sound cards and clones to provide AdLib compatibility. There was even an OPL-1 based FM synthesis module (allegedly upgradable to OPL-2) for the Commodore 64, in case the user needed even better quality music than the SID could provide.

The biggest drawback of FM synthesis is that it cannot reproduce PCM audio at all, which to some people made it a step backwards. This resulted in the influx of "hybrid" cards mentioned below, which couple an FM synthesizer (usually an OPL-2, later an OPL-3) with a PCM codec for speech and sound effects. This limitation also led AdLib users to add a Covox Speech Thing or Disney Sound Source to supplement the AdLib's musical capabilities.

Sample playback and PCM (Wavetable) synthesis

The crown of gaming audio, Pulse-Code Modulation systems work from actual samples of instruments, making their sound much richer. Since they can reproduce practically any sound, all sorts of odd effects are possible. PCM engines typically don't do any mathematical synthesis on their own, preferring instead to mix samples together at various speeds and volumes; however, high-end samplers used in music composition can filter the sound and do all sorts of other tricks. DSPs may be present to add effects like echo and reverb.
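The "mix samples together at various speeds and volumes" idea can be sketched like this. It is a bare-bones model with hypothetical function names: it picks the nearest earlier sample instead of interpolating, and it clamps the mixed result instead of scaling it, both of which are simplifications.

```python
def play_sample(sample, speed, volume, n_out):
    """Play a recorded sample at a given speed and volume.

    speed 1.0 plays the recording as-is, 2.0 plays it an octave higher
    (and twice as fast), 0.5 an octave lower. Output is silence once the
    recording runs out.
    """
    out = []
    pos = 0.0
    for _ in range(n_out):
        idx = int(pos)
        out.append(sample[idx] * volume if idx < len(sample) else 0.0)
        pos += speed
    return out

def mix(voices):
    """Sum several voices sample by sample, clamping the result to [-1, 1]."""
    length = max(len(v) for v in voices)
    mixed = []
    for i in range(length):
        total = sum(v[i] for v in voices if i < len(v))
        mixed.append(max(-1.0, min(1.0, total)))
    return mixed

# One short "instrument" recording played as two different notes at once.
recording = [0.0, 0.5, 1.0, 0.5]
low_note = play_sample(recording, 0.5, 0.5, 4)
high_note = play_sample(recording, 2.0, 1.0, 3)
chord = mix([low_note, high_note])
```

A single recording can therefore stand in for a whole instrument: each note is just the same sample stepped through at a different rate.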

The first popular gaming platforms to use a PCM synthesis chipset were the Amiga, the SNES and, believe it or not, pinball systems, mainly those that used Midway's DCS PCM synthesis board (which also saw use in Mortal Kombat and Revolution X cabinets, since it not only reproduces instruments more faithfully, but can also transparently loop fully-voiced music tracks, which is an important feature of the latter game). The NES and Sega Genesis both had rudimentary PCM support, but this was mainly used for pre-recorded voices, sound effects and drums. The Yamaha FM chip used by the Genesis (the YM2612, also called the OPN2) has a PCM playback mode, but the Genesis can also resort to manipulating the PSG to play back PCM sounds if needed (notable as this is how the Sonic 3 Launch Base Zone BGM managed to have a percussion track and still have the "Go!" voice samples). Pretty much every system introduced since uses PCM.

On PCs, PCM started to take over from FM-only cards in the early 1990s, when the first sound cards with samplers on-board and audio codecs appeared. On the low end were "hybrids" which used PCM sample playback for sound effects and speech, but FM synthesis for music. Most SoundBlaster cards except the AWE, Live!, Audigy and X-Fi series of cards were these, as were the numerous SoundBlaster "clones".

On the higher end were cards that were full PCM wavetable synthesis devices. These used audio samples provided by the user for music synthesis, but offered rudimentary PCM sample playback on a separate codec for sound effects and speech as well. These cards were later joined by Aureal's Vortex and Nvidia's SoundStorm, which used the same DLS format as Microsoft's DirectMusic software synthesizer. On the PC, however, the larger publishing houses were slow to take advantage of these cards, and full support only appeared in the mid-90s despite the first of them appearing as early as 1991. Fans of such cards blame the poor uptake on most major publishing houses choosing not to support them when porting games to the PC. Support did appear very early on from publishers dedicated to the PC platform, like Apogee Software, Epic Games and id Software. In fact, many of Epic Games' titles sound better on the Gravis Ultrasound than on anything else, because they fully support the Ultrasound's wavetable engine but not the wavetable engines of competing cards: AWE32/64 support in their games is basically no different from SoundBlaster 16 mode, with software mixing only and no support for the EMU wavetable synthesizer.

As CPU power increased, especially after the Pentium and PowerPC processors became popular (around 1995), PC games began using software PCM engines to play instruments and sound effects. However, it wasn't until the early 2000s that wavetable sound cards became a niche and PCs switched fully to software-driven PCM engines. Much of the delay could be attributed to sloppy code and poor optimization, however, as the Mac had no problems with software-driven synthesis, while PCs saw bad CPU load spikes and frame rate issues when playing music using software-driven synthesis until at least the Pentium III era. As of 2014, no consumer cards have a hardware wavetable chip anymore and cards with such circuits are now only found in the realm of professionals.

Red Book CD audio

Once games started to ship on CDs, Red Book audio for game soundtracks became common. The audio could be played from the CD just like music on a music CD, while the game data lived in memory. This technology was actually developed in tandem with PCM sample playback and competed with PCM synthesis, and it is sometimes used together with PCM sample playback (for example, in the PC port of Wipeout and in Quake II, where the music is played from the music CD partition of the disc while the sound effects are played through PCM sample playback). A nice side effect is that the game CD is its own soundtrack CD, so the soundtrack can be enjoyed on any regular CD player. It also adds an extra layer of complexity for copy protection, in that multi-partition game discs are difficult to duplicate reliably. Additionally, the music often sounds better than PCM sampled music, since real instruments could be played and recorded. On the downside, looping music tends to be difficult if not impossible to implement, as is evident in Sonic CD on the Sega CD, where the music had a short fade-out and fade-in section when repeating.
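For a sense of scale, the arithmetic behind Red Book audio is simple. The constants below are the standard CD audio parameters (44.1 kHz, 16-bit, stereo, 75 sectors of 2,352 bytes per second); the function names are just for this example.

```python
SAMPLE_RATE_HZ = 44_100      # samples per second, per channel
BYTES_PER_SAMPLE = 2         # 16-bit samples
CHANNELS = 2                 # stereo
SECTOR_BYTES = 2352          # audio bytes in one CD sector
SECTORS_PER_SECOND = 75

def bytes_per_second():
    """Raw data rate of uncompressed CD audio."""
    return SAMPLE_RATE_HZ * BYTES_PER_SAMPLE * CHANNELS

def track_size_bytes(seconds):
    """Disc space taken up by an audio track of the given length."""
    return bytes_per_second() * seconds

# A three-minute music track.
three_minutes = track_size_bytes(180)
```

A three-minute track comes to 31,752,000 bytes, roughly 30 MB, so every audio track on a game disc is paid for out of the space that would otherwise be available for game data.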

This was one of the main draws of the Apple Macintosh in the early 90s, when educational and adventure games alike started using CD audio for music as an alternative to FM synthesis.

Compressed audio files

With the move from CDs to DVDs (and later, digital downloads), game developers could no longer use Red Book audio for their games. Additionally, CD audio had the drawback that transparently looping music is difficult if not impossible. So they turned to another technique: compressed audio files. Essentially, audio files used in modern games are like standard MP3s, except with a different compression algorithm and with metadata describing loop points. Such files have all the advantages of Red Book audio and several more, such as better looping.

Initially, processors were not powerful enough to handle this without choking (though this is technically only true for the PC, largely due to its inefficient APIs, while Macs had no such issues; nevertheless, developers often overlooked the method because they were writing games to be multi-platform). Today, most triple-A games tend to use proprietary audio formats like ADX and Bink Audio, while indie games tend to use consumer formats such as MP3 and OGG. Processors have not only gotten leaps and bounds faster, but have also gone multi-core, making it trivial to decode compressed music while still having enough grunt to handle the general graphics and gameplay logic. Coupled with the fact that games are now often better optimized than before, the earlier issues that plagued software-driven wavetable synthesis no longer apply. As a matter of fact, compressed audio files tend to use less CPU power than software-driven wavetable synthesis, since the CPU only has to decode two channels of audio, whereas with wavetable the CPU has to interpret and generate 16 channels of audio and then downmix them to two.

The only thing holding the technology back was Windows' inefficient API. The issue was not resolved until the arrival of DirectX, specifically DirectSound, and of the new WDM audio model, which can mix multiple audio streams and premiered with Windows 98. On earlier versions of Windows, unless the software itself did the mixing (something that most developers didn't do), it wouldn't be able to play the BGM and sound effects at the same time.
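How loop-point metadata gets used can be sketched as follows. This assumes the music has already been decoded to a list of samples and that the file supplies a loop start and loop end measured in samples; the function name and the sample values are hypothetical, and real formats each store this metadata in their own way.

```python
def render_looped(pcm, loop_start, loop_end, n_out):
    """Play decoded audio with a seamless loop.

    Playback starts at sample 0 (the intro), and every time it reaches
    loop_end it jumps straight back to loop_start with no gap or fade.
    loop_end is exclusive.
    """
    if not (0 <= loop_start < loop_end <= len(pcm)):
        raise ValueError("loop points must satisfy 0 <= start < end <= len(pcm)")
    out = []
    pos = 0
    for _ in range(n_out):
        out.append(pcm[pos])
        pos += 1
        if pos >= loop_end:
            pos = loop_start
    return out

# Samples 0-1 are an intro that plays once; samples 2-4 repeat forever.
decoded = [10, 11, 20, 21, 22, 99]
played = render_looped(decoded, loop_start=2, loop_end=5, n_out=10)
```

Because the jump happens at an exact sample position, the intro plays once and the body repeats without the fade-out and fade-in that CD audio needed.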

Many of the proprietary formats are driven by the API or licensed game engine. For example, a game using the CRI middleware will tend to use ADX or CRI Audio.

See also Awesome Music (Video Games) and Pac-Man Fever.