Graphics rendering is the process of composing the images that a player sees on the screen while playing a game. Quite a few tasks are needed to produce even a simple graphical game like Super Mario Bros., let alone a more modern game like Gears of War or Modern Warfare. Since 2D and 3D drawing methods are entirely different, they'll be covered in different sections.
2D Graphics rendering

2D graphics can be summed up in two different methods: raster graphics and vector graphics.
Raster Graphics

Raster graphics is a very common method: elements are drawn on the screen pixel by pixel based on the contents of a memory buffer.
Initially there were only two major methods of raster graphics: text mode and direct drawing. In the former, the video processing chip had a table of characters in ROM that it knew how to draw; rendering was a matter of looking up values in this table and drawing them. As text mode offered higher effective resolution than early direct drawing, special characters were sometimes added to the table to allow basic tiled graphics, though this was a hack at best and smooth motion was practically impossible. Direct drawing started out requiring the software to supply each pixel just as the display was about to draw it. Later, when memory became more affordable, each pixel was written into a framebuffer in that memory before being sent to the display.
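The framebuffer style of direct drawing can be sketched in a few lines; the tiny 8x8 resolution and the color index values here are illustrative, not any real hardware's:

```python
# Minimal framebuffer sketch: a block of memory with one entry per
# pixel. "Drawing" is just writing values into it; the display
# hardware then scans the whole buffer out to the screen.
WIDTH, HEIGHT = 8, 8

def make_framebuffer():
    return [0] * (WIDTH * HEIGHT)   # 0 = background color index

def plot(fb, x, y, color):
    # Write one pixel, ignoring out-of-bounds coordinates.
    if 0 <= x < WIDTH and 0 <= y < HEIGHT:
        fb[y * WIDTH + x] = color

fb = make_framebuffer()
plot(fb, 3, 2, 7)    # set the pixel at column 3, row 2 to color 7
```
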
2D consoles in the later 8-bit era took a hybrid approach to rendering graphics. The screen was divided into layers, static with simple animations and meant to be the foreground and background of the view, while sprites handled interactive objects with complex animations. Everything was built up from tiles on a fixed grid defined by a tilemap, similar to text mode, in order to reduce memory requirements. As hardware got better, layers could be blended together for transparency effects.
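Tile-based rendering can be sketched like so; the 2x2 tiles and the two-entry tile set are illustrative (real hardware of the era typically used 8x8 tiles):

```python
# A tilemap names which tile goes in each grid cell; the tile set
# stores each tile's pixel pattern once, so repeated tiles cost no
# extra memory.
TILE = 2   # tile edge length in pixels (illustrative)
TILESET = {
    0: [[0, 0], [0, 0]],   # empty tile
    1: [[1, 1], [1, 1]],   # solid tile
}
TILEMAP = [
    [0, 1],
    [1, 0],
]

def render(tilemap):
    h = len(tilemap) * TILE
    w = len(tilemap[0]) * TILE
    fb = [[0] * w for _ in range(h)]
    for ty, row in enumerate(tilemap):
        for tx, index in enumerate(row):
            pattern = TILESET[index]
            for py in range(TILE):
                for px in range(TILE):
                    fb[ty * TILE + py][tx * TILE + px] = pattern[py][px]
    return fb

frame = render(TILEMAP)   # a 4x4 pixel checkerboard
```
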
One limitation of all 2D consoles was how many sprites the video processor could handle at once per horizontal line. If there were too many, sprites could simply be dropped and rendered invisible. The Atari VCS, ColecoVision, NES, and many other early consoles worked around sprites being permanently dropped by rotating which sprites got displayed each frame, which resulted in flickering (most famously in the Atari VCS's version of Pac-Man).
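That rotation trick (often called sprite multiplexing) can be sketched as follows; the limit of two sprites per line and the sprite names are illustrative:

```python
# When more sprites overlap a scanline than the hardware can draw,
# rotate which ones are shown each frame so every sprite appears some
# of the time; the player perceives this as flicker.
MAX_PER_LINE = 2   # illustrative hardware limit

def visible_sprites(sprites_on_line, frame):
    n = len(sprites_on_line)
    if n <= MAX_PER_LINE:
        return list(sprites_on_line)
    # Rotate the starting index by frame number so dropped sprites
    # change from frame to frame instead of vanishing permanently.
    start = frame % n
    return [sprites_on_line[(start + i) % n] for i in range(MAX_PER_LINE)]

sprites = ["hero", "ghost_a", "ghost_b"]
frame0 = visible_sprites(sprites, 0)   # ['hero', 'ghost_a']
frame1 = visible_sprites(sprites, 1)   # ['ghost_a', 'ghost_b']
```
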
Today, 2D raster graphics are drawn directly to the screen. Elements are typically pulled from a sprite sheet, an image containing elements of known sizes. The difference from tiled 2D is that each element can now be placed at the pixel level, rather than at a grid position that may cover, for example, a 4x4 pixel area, and that elements can be any arbitrary size.
Vector Graphics

Vector graphics, on the other hand, take a mathematical approach: everything is rendered on the fly from points and the lines that connect them. Among the earliest games to use vector graphics were Atari's Asteroids and Battlezone, and the Vectrex was an entire console based on them. Early vector graphics were simply wireframes of the model or image being rendered, hence the lack of color or any features beyond the outline. Eventually the enclosed spaces could be filled as hardware got more powerful.
The advantage of vector graphics is its infinite scalability. Because everything is created on the fly, a low resolution vector image looks just as good as a high definition one, whereas scaling up a low resolution raster image produces blurry or pixelated results (though AI-based upscaling algorithms now do a good job of preserving detail). The major downside is that rendering this way is more computationally expensive, since everything has to be calculated rather than plucked from a table. To put it another way, raster graphics is like assembling a clip-art scene, while vector graphics requires the artist to draw everything in.
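The scalability argument in miniature: a vector shape is just coordinates, so scaling multiplies them and the shape stays exact at any target resolution:

```python
# A vector shape is stored as points; scaling multiplies coordinates,
# so no pixel detail is invented or lost at any size.
def scale(points, factor):
    return [(x * factor, y * factor) for x, y in points]

triangle = [(0, 0), (4, 0), (2, 3)]
big = scale(triangle, 10)   # same shape, ten times the size
```
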
Graphical user interface elements are built from vectors, because they need to scale, combined with some raster elements like icons. Fonts are also built from vectors for the same reason.
3D Graphics rendering

Much like 2D graphics rendering, 3D has two main methods.
Voxel 3D Graphics

Voxel is a portmanteau of volumetric and pixel, or more precisely, volumetric picture element. It is the most basic element of a 3D space, akin to a pixel being the smallest element of a picture; in fact, a voxel model works much like raster graphics do. It's an old concept and is still relatively unused due to hardware constraints (see the disadvantages below).
Voxels are advantageous for a few reasons:
- Voxels can represent a 3D object in a similar way that a picture represents a 2D one. Imagine what you can do to a 2D picture, applied in another dimension.
- Since voxels fill up a space to represent an object, you can break objects apart without creating new geometry as a polygon-based renderer must; you simply break off a chunk of the object.
- Voxels can have their own color, eliminating the need for textures entirely.
However, there are still a few things to overcome:
- Voxels require a lot more memory than a 2D image (or even a polygonal 3D model). A 16x16 image at 1 byte per pixel requires 256 bytes; a 16x16x16 model at 1 byte per voxel requires 4,096 bytes. A way around this is to group identical neighboring voxels into one larger voxel, as octree-based storage schemes do.
- Detailed voxel models are computationally expensive to set up, hence they are limited mostly to major industries that need the detail.
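The memory figures above can be checked directly, and a dictionary of occupied cells can stand in for octree-style clumping in a sketch:

```python
# Dense voxel storage grows with the cube of the resolution; sparse
# storage (here a dict of occupied cells) only pays for voxels that
# are actually filled.
def dense_bytes(n, bytes_per_voxel=1):
    return n ** 3 * bytes_per_voxel

assert dense_bytes(16) == 4096   # vs. 256 bytes for a 16x16 image

# A hollow 16x16x16 shell: only the surface voxels are stored.
sparse = {}
for x in range(16):
    for y in range(16):
        for z in range(16):
            if 0 in (x, y, z) or 15 in (x, y, z):
                sparse[(x, y, z)] = 1   # surface voxel, color index 1
```

Here the shell needs only 1,352 entries instead of 4,096, and the savings grow rapidly at higher resolutions.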
Voxlap and Atomontage are currently the only known voxel-based graphics engines with potential for games. Other games have used voxel-based technology for certain elements: Minecraft uses voxels for map data, and Command & Conquer: Tiberian Sun uses them for vehicles.
Polygonal 3D Graphics

Much like 2D vector graphics, polygonal 3D graphics take a mathematical approach to representing the object. The polygons themselves have all the benefits of vector graphics; the other elements are typically constrained in the same ways as raster graphics.
Polygonal 3D graphics consist of the following elements:
- Vertex: A point in space that represents a corner of a polygon. This is the smallest unit of a 3D scene.
- Polygon: A 2D plane that occupies the space between 3 or 4 vertices. Early polygonal graphics used quadrilaterals as the basic unit because they were computationally convenient. Triangles are used today because they are the simplest polygon, and any three points are guaranteed to lie on a single plane.
- Texture: An image applied to a polygon to achieve some effect; it allows detailing the polygon without spending more polygons. A basic example is applying a brick wall picture to a flat polygon to make it look like a brick wall.
- Sprites and particles: These are 2D elements. Sprites usually face the player regardless of the viewpoint, a technique called "billboarding" in the industry. Particles are many small sprites combined to create complex effects, like explosions and smoke.
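A minimal sketch of how these elements fit together as data; the names and layout are illustrative, not any particular engine's format:

```python
# A mesh is a shared pool of vertices plus triangles that index into
# it; per-vertex UV coordinates say where each vertex samples its
# texture.
vertices = [
    (0.0, 0.0, 0.0),   # vertex 0
    (1.0, 0.0, 0.0),   # vertex 1
    (1.0, 1.0, 0.0),   # vertex 2
    (0.0, 1.0, 0.0),   # vertex 3
]
uvs = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]

# A quad built from two triangles; sharing vertices 0 and 2 means
# those points are stored once, not twice.
triangles = [(0, 1, 2), (0, 2, 3)]
```
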
In the early days of 3D graphics, the CPU was responsible for rendering everything, which was known as software rendering. Things didn't really take off until CPUs started including floating point units, which allowed for much smoother animation. As graphics hardware got more powerful, it took over more and more of the rendering process. This has culminated in today's arrangement, where the CPU's only graphics rendering job is to create display lists: instructions on what the GPU needs to do in order to render the scene.
- Job batching: The CPU takes the data from the game code (physics solutions, AI decisions, user input etc.) and creates a batch of jobs called display lists. These are instructions to tell the GPU how to render graphics or compute things.
- Transform: All vertices and polygons are positioned in 3D space.
- Geometry Instancing: Some models are only loaded once and any entity using the model will use that one copy, rather than load yet another copy. This saves memory and rendering time.
- Primitives generation/Tessellation: Polygons can be added or subtracted as necessary to detail the silhouette. This saves memory, but not necessarily rendering time.
- Ray Tracing (if used): Rays are cast out from the viewpoint and interact with the 3D scene. The final color of the pixel is determined by what the ray finds visible, along with the lighting and coloring properties that come with it.
- Clipping: Once the world is set up, it would be inefficient to render things the player can't see, so clipping removes assets that aren't visible. For example, if the player is in a house, the 3D engine will not render rooms that are not immediately visible. A common form is backface culling, which skips polygons facing away from the camera, since on a closed object they can never be seen.
- Rasterization: This takes the 3D scene and creates a 2D representation of it for the monitor. This is done by laying the pixels over the 3D scene; if a triangle passes over a sample point in a pixel, the triangle is rendered for that pixel. There are two types:
- Immediate: Polygons are rasterized at the desired screen resolution. Easiest to implement, but requires a lot of memory bandwidth, as entire frames of information need to be passed around. It can also cause excessive rendering, since covered polygons may still be rendered and lit.
- Tile-based: The screen size is divided into tiles, usually 16x16 or 32x32. Polygons are sorted to figure out if a polygon is in that tile and which parts need lighting. Then the GPU works on these tiled chunks. Harder to implement but is efficient, as each tile can live in the GPU's cache, memory transfers are less bandwidth intensive, and since polygons are sorted, it prevents rendering of polygons that can't be seen.
- This article shows the differences between the two modes.
- Lighting: The color of each pixel is figured out.
- Forward rendering: The gist of forward rendering is that for every piece of geometry, all shader operations are done on a single render target, which becomes the final output. While very simple to implement, it scales poorly with the number of geometry entities and lights.
- Deferred rendering: Geometry is rendered without lighting to generate multiple render targets, usually consisting of applied textures (see below) and depth. These render targets are combined, then lit. While more complicated to implement, it scales with basically just the number of lights. The downside is that it has trouble with transparency and some anti-aliasing methods.
- Forward Plus: An improvement over forward rendering. The scene is broken up into tiles, and for each tile the renderer determines which lights actually influence it. Forward rendering then proceeds as usual, but each tile considers only its own lights rather than every light in the scene. This scales as well as deferred rendering without its shortcomings.
- An article explaining the differences between Forward, Deferred, and Forward Plus can be found here.
- Post Processing: Special coloring effects are applied to the 2D output.
- Final Output: The image is buffered to be sent to the monitor.
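The sample-point test at the heart of the rasterization step can be sketched with edge functions (signed cross products); the tiny 4x4 resolution is illustrative:

```python
# A pixel is covered when its center lies inside the triangle, which
# edge functions decide: the point must be on the same side of all
# three edges of a counter-clockwise triangle.
def edge(a, b, p):
    # Positive when p is to the left of the directed line a -> b.
    return (b[0] - a[0]) * (p[1] - a[1]) - (b[1] - a[1]) * (p[0] - a[0])

def rasterize(tri, width, height):
    a, b, c = tri
    covered = []
    for y in range(height):
        for x in range(width):
            p = (x + 0.5, y + 0.5)   # sample at the pixel center
            if edge(a, b, p) >= 0 and edge(b, c, p) >= 0 and edge(c, a, p) >= 0:
                covered.append((x, y))
    return covered

pixels = rasterize(((0.0, 0.0), (4.0, 0.0), (0.0, 4.0)), 4, 4)
```

GPUs do the same test in parallel over many sample points at once; the immediate and tile-based modes above differ mainly in how the screen is carved up before this test runs.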
Textures come in several varieties:
- Diffuse map: Basically just a picture providing what the object looks like. For example, a brick wall picture could be placed on a polygon to resemble one.
- Specular map: Shows where and how shiny the object is.
- Environment map: A texture with an approximation of the world's surroundings. This is used in simple reflections. It could be either in the form of sphere maps, cube maps or in some cases paraboloid maps, which can produce cubemap-like reflections while using fewer passes.
- Bump map: A texture that affects how lighting is done on the surface of a polygon. It can make a flat surface look like it has features.
- Normal map: A type of bump map that stores the "direction" each pixel is facing. This can make a flat object look like it was made of many polygons.
- Parallax map: A type of bump map that stores a pixel's "depth". Most commonly used with bricks, where the bricks look like they can obscure the cement holding them, but the entire wall is actually flat.
- Height/Displacement Map: Used to show how much a polygon should stick out. Originally used for landscaping to create easy terrain. Used more commonly for tessellation, or adding polygons to enhance the model's silhouette.
Lighting can be computed at several levels of granularity:
- Flat shading: Lighting is calculated per polygon. It gives everything a faceted, polygonal look.
- Gouraud (per-vertex) lighting: Lighting is computed at each vertex, and each pixel is shaded by interpolating the light its surrounding vertices received. Better than flat shading, but it occasionally produces poor results, such as highlights that fall in the middle of a polygon being missed.
- Phong (per-pixel) lighting: Lighting is computed for each individual pixel, for every polygon (after clipping) and texture the pixel covers.
- Ray tracing: Each pixel casts one or more rays that simulate light; color is calculated by how the rays interact with objects.
- Ray casting: Color is determined by the first object the ray intersects. This was used in early 3D games like Wolfenstein 3D.
- Path tracing: Multiple rays at random angles per pixel are shot out into the scene, with the culmination of these rays determining the final color of the pixel. This is considered to be the holy grail of real-time 3D rendering, and has been too computationally expensive to perform for real-time applications to an acceptable degree until late 2018.
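Ray casting reduces to an intersection test per ray. A minimal sketch with a single sphere, solving the quadratic for where the ray meets the surface (scene contents and coordinates are illustrative):

```python
import math

# One ray per pixel; the pixel's color would come from the first
# object the ray hits. Here the only object is a sphere.
def hit_sphere(origin, direction, center, radius):
    # Solve |origin + t*direction - center|^2 = radius^2 for t.
    oc = [o - c for o, c in zip(origin, center)]
    a = sum(d * d for d in direction)
    b = 2.0 * sum(o * d for o, d in zip(oc, direction))
    c = sum(o * o for o in oc) - radius * radius
    disc = b * b - 4 * a * c
    if disc < 0:
        return None                              # ray misses entirely
    return (-b - math.sqrt(disc)) / (2 * a)      # nearest intersection

# A ray fired straight ahead hits a sphere centered 5 units away:
t = hit_sphere((0, 0, 0), (0, 0, -1), (0, 0, -5), 1.0)
miss = hit_sphere((0, 0, 0), (0, 1, 0), (0, 0, -5), 1.0)
```

Path tracing repeats this kind of test many times per pixel, bouncing rays at random angles and averaging the results.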
Some broader concepts build on top of these:
- Materials: A composite property for objects which combines texture mapping, sound, and physics. For example, if the game engine comes with a wood material, applying it to an object makes it look like wood, scrape like wood, sound like wood, and break like wood. Likewise, applying a metallic material would make the same object look like metal, shine like metal, and sound like metal.
- High dynamic range lighting: An enhancement for lighting. In standard lighting, all light sources are clamped to a dynamic range of 256 values, which causes strange artifacts: for instance, if a surface reflects 50% of light, then light from the sun can look as dim as a flashlight. With HDR lighting, lighting is computed in a higher dynamic range, then sampled down. This allows light from the sun to remain as bright as it should be, even if the reflectivity is low. Note this is different from outputting HDR to displays.
- Physically based rendering: A style of rendering gaining ground since 2012. It's not a more complex type of lighting, but a change in philosophy: lights and lighting effects are made to behave realistically. An example is "conservation of energy": a reflection of light cannot be brighter than the light itself, which was previously sometimes violated to achieve a desired effect.