Media Notes / Graphics Rendering


Graphics rendering is the process of composing the images that a player sees on the screen while playing a game. Quite a few tasks are needed to produce anything from a simple graphical game like Super Mario Bros. to a more modern game like Gears of War or Modern Warfare. Since 2D and 3D drawing methods are entirely different, they'll be covered in separate sections.

2D Graphics rendering

2D graphics can be summed up as two different methods: Raster graphics and Vector graphics.

Raster Graphics

Raster graphics is a very common method and involves drawing elements on the screen pixel by pixel based on what's in some memory buffer.

Initially there were only two major methods of raster graphics: text mode and direct drawing. In the former, the video processing chip had a table of characters in ROM that it knew how to draw, so rendering something meant looking up values in that table and drawing them. Since text mode could cover the screen at a higher effective resolution than direct drawing could afford, special characters were sometimes added to the table to allow for basic tiled graphics, though this was a hack job at best and smooth motion was practically impossible. Direct drawing originally required the software to compute each pixel just as the display was about to draw it; later, when memory was more affordable, each pixel was drawn into a buffer in memory before being sent to the display.

2D consoles in the later 8-bit era started taking a hybrid approach to rendering graphics. The screen was divided up into layers that were static with simple animations and meant to be the foreground and background of the view, while sprites were meant for interactive objects that had complex animations. Everything was built up from tiles on a fixed grid from a tilemap, similar to text mode, in order to make it easier on memory requirements. As hardware got better, layers could be blended together for transparency effects.
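
To make the tile-and-sprite idea concrete, here is a minimal sketch in C of how a background layer might be composed from a tilemap. The 8x8 tile size, 32x30 map, and array layout are illustrative assumptions rather than any particular console's hardware, which did this lookup on the fly as the screen was scanned out and composited the sprites on top.

    #include <stdint.h>

    /* Hypothetical sizes for illustration only. */
    #define TILE_W    8
    #define TILE_H    8
    #define MAP_COLS  32                 /* 32x30 tiles = 256x240 pixels */
    #define MAP_ROWS  30
    #define SCREEN_W  (MAP_COLS * TILE_W)
    #define SCREEN_H  (MAP_ROWS * TILE_H)

    /* Each tilemap entry names a tile in the tile set, much like text mode
     * named characters in a character ROM. */
    static uint8_t tilemap[MAP_ROWS][MAP_COLS];
    static uint8_t tileset[256][TILE_H][TILE_W];
    static uint8_t framebuffer[SCREEN_H][SCREEN_W];

    /* Compose the background layer by routing every pixel through the
     * tilemap; sprites would then be drawn over the result. */
    void draw_background(void)
    {
        for (int y = 0; y < SCREEN_H; y++) {
            for (int x = 0; x < SCREEN_W; x++) {
                uint8_t tile = tilemap[y / TILE_H][x / TILE_W];
                framebuffer[y][x] = tileset[tile][y % TILE_H][x % TILE_W];
            }
        }
    }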

One limitation in all 2D consoles was how many sprites the video processor could handle at once per horizontal line. If there were too many, some sprites were simply dropped and rendered invisible. The Atari VCS, ColecoVision, NES, and many other early consoles worked around this by rotating which sprites got displayed on each frame so that no single sprite was dropped all the time, which resulted in flickering (most famously in the Atari VCS's version of Pac-Man).

Today, 2D raster graphics are drawn directly to the screen with elements pulled from a picture map. Unlike early 2D consoles, these picture maps can have elements of arbitrary size, though dimensions are typically limited to powers of 2 for the sake of making things easier for the computer to process. Also unlike 2D consoles, these elements can be placed anywhere on the screen rather than being forced into fixed positions on a tile grid.

Vector Graphics

Vector graphics, on the other hand, is a mathematical approach: everything is rendered on the fly using points and the lines that connect them. Among the earliest games to use vector graphics were Atari's Asteroids and Battlezone, and the Vectrex was an entire console built around vector graphics. Early vector graphics were simply wireframes of the model or image being rendered, hence the lack of color or any features other than the outline. Eventually the spaces could be filled in as hardware got more powerful.

The advantage of vector graphics is its infinite scalability. Because everything is created on the fly, a low resolution vector image will look just as good as a high definition one, whereas scaling up a low resolution raster image gives you a blurry or pixelated high resolution one, often with ugly results. While AI upscaling can do a good job of guessing the detail, it still won't make, say, Mario's sprite from the first Super Mario Bros. look like his later 2D incarnations on the DS.

Its major downside is that it's more expensive to render an image this way, since everything has to be calculated rather than plucked from a table. To put it another way, raster graphics is like putting together a clip-art scene, while vector graphics requires the artist to draw everything in.
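
As a rough illustration of that tradeoff, the C sketch below scales a vector shape by simply multiplying its coordinates, which stays sharp at any size, while scaling a raster image means resampling pixels (the nearest-neighbour version shown is what produces the blocky look). The Point type and function names are made up for this example.

    #include <stddef.h>

    /* A vector shape is just a list of points; scaling it is exact. */
    typedef struct { float x, y; } Point;

    void scale_shape(Point *pts, size_t count, float factor)
    {
        for (size_t i = 0; i < count; i++) {
            pts[i].x *= factor;   /* lines redrawn between the new points */
            pts[i].y *= factor;   /* stay perfectly sharp at any size     */
        }
    }

    /* Scaling a raster image instead means inventing new pixels. This
     * nearest-neighbour resample is what gives enlarged sprites their
     * blocky, pixelated look. */
    void scale_raster(const unsigned char *src, int src_w, int src_h,
                      unsigned char *dst, int dst_w, int dst_h)
    {
        for (int y = 0; y < dst_h; y++)
            for (int x = 0; x < dst_w; x++)
                dst[y * dst_w + x] =
                    src[(y * src_h / dst_h) * src_w + (x * src_w / dst_w)];
    }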

A major use of vector graphics is graphical user interface elements, because of their need to scale, while raster elements may still be used for things like icons or diagrams. Fonts are also built from vectors for the same reason.

3D Graphics rendering

Much like 2D Graphics rendering, 3D has two main methods.

Voxel 3D Graphics

Voxel is a portmanteau of volumetric and pixel, or more precisely, volumetric picture element. In a 3D space, this is the most basic element, akin to a pixel being the smallest element of a picture; a voxel model is essentially the 3D equivalent of a raster image. It's an old concept but still relatively unused due to hardware constraints (see the disadvantages below).

Voxels are advantageous for a few reasons:

  • Voxels can represent a 3D object in a similar way that a picture represents a 2D one. Imagine what you can do to a 2D picture and apply it with another dimension.
  • Since voxels fill up a space to represent an object, you could break apart objects without the need for creating new geometry as in a polygon based renderer. You would simply break off a chunk of the object.
  • Voxels can have their own color, eliminating the need for textures entirely.

However, there's still a few things to overcome:

  • Voxels require a lot more memory than a 2D image (or even a 3D polygonal model). A 16x16 image at 1 byte per pixel, for instance, requires 256 bytes to store; a 16x16x16 voxel model at 1 byte per voxel requires 4,096 bytes. A way around this is to find groups of identical voxels and clump them together into one big voxel (a sketch of this idea follows this list).
  • Detailed voxel models are computationally expensive to set up, so they are limited mostly to industries that need the detail.
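
As a rough illustration of the numbers above and of the clumping idea, here is a small C sketch. The 16-voxel cube and the uniform-block test are simplified assumptions; real engines build this idea out into structures like sparse voxel octrees.

    #include <stdint.h>
    #include <stdbool.h>

    /* A 16x16 image at one byte per pixel is 256 bytes; a 16x16x16 volume
     * at one byte per voxel is 16*16*16 = 4096 bytes. */
    #define N 16
    static uint8_t volume[N][N][N];

    /* One way to "clump" voxels: if every voxel in a cubic block holds the
     * same value, the whole block can be stored as a single bigger voxel. */
    bool block_is_uniform(int x0, int y0, int z0, int size, uint8_t *value_out)
    {
        uint8_t first = volume[z0][y0][x0];
        for (int z = z0; z < z0 + size; z++)
            for (int y = y0; y < y0 + size; y++)
                for (int x = x0; x < x0 + size; x++)
                    if (volume[z][y][x] != first)
                        return false;
        *value_out = first;
        return true;
    }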

Despite these limitations, voxels were used occasionally in games during the 90s, such as in the Comanche series and for mapping the ground detail in Outcast. They gained more prominence towards the late 2000s and throughout the 2010s, especially when Minecraft hit the scene, which uses voxels for its map data. Voxels are also used readily in polygonal 3D graphics to aid in lighting and other graphical effects.

Some attempts have been made to build a game engine using voxels entirely, examples including Voxlap and Atomontage. Some games have also used voxels for all assets, such as Teardown.

Polygonal 3D Graphics

Much like 2D vector graphics, polygonal 3D graphics takes a mathematical approach to representing objects. The polygons themselves have all the benefits of vector graphics, while the other elements are typically constrained in the same ways as raster graphics. Modern advances have taken inspiration from sources like puppetry and sculpting to create more artistically advanced methods of creation and animation.

Polygonal 3D graphics are composed of the following elements:

  • Vertex: A point in space representing the coordinates of one corner of a polygon. This is the smallest unit of a 3D scene.
  • Polygon: A polygon is a collection of connected vertices. They can be represented either as just the lines connecting the vertices (much like how you'd draw a 3D cube with two squares and lines between the corners), or with the space between them filled in. A collection of polygons is typically called a mesh or an object (a minimal sketch of how a mesh might be stored in memory follows this list). Early polygonal graphics used quadrilaterals as the most basic unit because it was computationally simple, later moving to triangles, the smallest polygonal shape, as hardware caught up to the task of rendering more finely-grained meshes. By the late 2010s, further improvements in both hardware and artistic techniques led to a hybrid approach: lightweight and easy-to-manipulate quadrilaterals are favored for sculpting, animation, and storage, with surfaces subdivided into clean grids of optimized triangles for final rendering or fine-grained physics simulation.
  • Texture: Textures are images applied to polygons to achieve some effect. It allows detailing the polygon without spending more polygons. A basic example is applying a brick wall picture to a flat polygon to make it look like a brick wall.
  • Sprites: These are 2D elements, typically made of one or two polygons in pure 3D engines. Sprites tend to always face the player in an effect called "billboarding." Sprites were common in early 3D games to depict most objects that weren't part of the world geometry such as enemies, items, and even greenery such as trees. Today they build up "fuzzy" things like grass, fur, and leaves, but are also used to depict far away objects.
  • Armature: also referred to as a "rig" or "skeleton", an armature is a stick-figure-like system of control points (usually called "bones") used for advanced animation, like characters or complex objects. The vertices of a mesh are assigned to follow the movements of bones through an often-tedious process called "weight painting", with the end result being puppet-like control of the mesh in a way that can be intuitively animated by hand or with motion capture. Advanced rigs frequently include some degree of automation, such as marionette-like controls for dynamic placement of feet and hands, and, of course, Jiggle Physics.
  • Shader: a set of rules defining how light interacts with a surface or object, based on the textures they are given. Simple shaders might only calculate shadows or a particular type of reflected highlight, while complex principled shaders can take input from a stack of texture types to simulate a huge variety of materials in a unified process. When extreme detail and realism are required, many different shaders might be combined to compensate for each method's limitations. For instance, rather than create a glowing crystal by brute force simulation, an artist might pair a realistic glass shader for the surface with a simple shader that efficiently "fakes" the interior effects in a way that might look better than full physically-accurate simulation.
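
As a rough picture of how the elements above fit together in memory, here is a minimal C sketch of a mesh layout. The exact fields and names are illustrative assumptions; real engines pack vertex data in many different ways.

    #include <stdint.h>

    /* A vertex carries its position plus whatever the shaders need,
     * commonly a surface normal and a texture (UV) coordinate. */
    typedef struct {
        float position[3];   /* x, y, z in model space                */
        float normal[3];     /* used by the lighting stage            */
        float uv[2];         /* where this vertex sits on the texture */
    } Vertex;

    /* Triangles don't store vertices directly; they index into the vertex
     * array, so corners shared between polygons are stored only once. */
    typedef struct {
        uint32_t indices[3];
    } Triangle;

    typedef struct {
        Vertex   *vertices;
        uint32_t  vertex_count;
        Triangle *triangles;
        uint32_t  triangle_count;
    } Mesh;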

In the early days of 3D graphics, the CPU was responsible for rendering everything, which was known as software rendering. Things didn't really take off until CPUs started including floating point units, which allowed for much smoother animations. As graphics hardware got more powerful, it started taking over parts of the rendering process, culminating in today's arrangement, where the CPU's only graphics rendering job is to create display lists: instructions on what the GPU needs to do in order to render the scene.


     3D rendering steps 
  • Job batching: The CPU takes the data from the game code (physics solutions, AI decisions, user input etc.) and creates a batch of jobs called display lists. These are instructions to tell the GPU how to render graphics or compute things.
  • Geometry Work: 3D models are moved to their new location, with some optimizations or other work done on the models with the ultimate goal of presenting as much detail as possible with the minimum amount of processing. Vertex, geometry, and mesh shading stages of the rendering pipeline happen here.
    • Geometry Instancing: Some models are only loaded once and any entity using the model will use that one copy, rather than load yet another copy. This saves memory and rendering time.
    • Primitives generation/Tessellation: Polygons can be added or subtracted as necessary to detail the silhouette. This saves memory, but not necessarily rendering time.
  • Transform: The coordinates of models are transformed from world space (how the game sees the world) to camera space (how the player sees the world).
  • Ray Tracing (if used): From the viewpoint of the player camera, rays are cast out and they interact with the 3D scene. The final color of the pixel is determined by what the ray returns as visible along with lighting and coloring properties with it.
  • Clipping: After getting the scene from the camera's point of view, it would be inefficient to render things that can't be seen by the player, so clipping removes assets the player can't see so they won't be rendered. For example, if the player is in a house, the 3D engine will not render rooms that are not immediately visible. A common form of clipping is backface culling, which discards polygons facing away from the camera; this is why the same polygon can be visible from one side but invisible from the other.
  • Rasterization: This takes the 3D scene and creates a 2D representation of it for the monitor. This is done by laying the pixels over the 3D scene and if a triangle passes over a sample point in the pixel, the triangle will be rendered for that pixel. In 3D graphics parlance, these samples are known as "fragments." There are two types of rasterization:
    • Immediate: Polygons are rasterized at the desired screen resolution. Easiest to implement, but requires a lot of memory bandwidth as entire frames of information need to be passed around. It can also cause excessive rendering since covered polygons may be included.
    • Tile-based: The screen size is divided into tiles, usually 16x16 or 32x32. Polygons are sorted to figure out if a polygon is in that tile and which parts need lighting. Then the GPU works on these tiled chunks. Harder to implement but is more efficient as each tile can live in the GPU's cache, memory transfers are less bandwidth intensive, and since polygons are sorted, it prevents rendering of polygons that can't be seen.
  • Texture mapping: Various textures are applied to the polygons. This requires some complex math to project a 2D image onto a 3D surface properly. Since there's not much variation in how to map a texture to a polygon, GPUs have been able to do this really fast with dedicated, fixed function hardware.
  • Lighting: The color of each pixel is figured out, usually based on the textures that were applied earlier. The pixel shader stage happens here.
    • Forward rendering: The gist of forward rendering is that every object is shaded against the lights in the scene into a single render target, which becomes the final output. While simple to implement, the computational cost scales with the number of objects multiplied by the number of lights. A common optimization is to limit how far a light can influence objects, which often results in either a low number of dynamic lights or dynamic lights with an unrealistically short range.
    • Deferred rendering: Geometry is rendered without lighting to generate multiple render targets, usually consisting of the applied textures (see below) and depth. These render targets are then combined and shaded. While more complicated to implement, the computational cost scales with the number of objects plus the number of lights (a sketch contrasting the two approaches follows this list). The downsides are that it doesn't support transparent objects, since it only considers the front-most thing each pixel "sees", and it only supports one lighting model, so you can't mix, say, Cel Shading with realistic lighting.
      • When it was the dominant rendering method, deferred rendering was typically used for a bulk of the scene, while forward rendering was used in other places, such as rendering semi-transparent objects.
    • Forward Plus: An improvement over forward rendering. The screen is broken up into tiles, and for each tile the renderer figures out which lights actually influence it. Forward rendering is then applied as usual, but using only the lights for that tile rather than considering every single light. This scales about as well as deferred rendering up to around 100 light sources, and scales better beyond that.
  • Post Processing: Special coloring effects are applied to the 2D output.
  • Final Output: The image is buffered to be sent to the monitor.
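
To make the objects-times-lights versus objects-plus-lights difference concrete, here is a rough C sketch of the two loop shapes from the lighting step above. The shading functions are hypothetical stubs, not any real engine's API; only the loop structure matters.

    /* Hypothetical stand-ins, not any real engine's API; they exist only
     * to show how the amount of work scales. */
    static void shade_object_with_light(int object, int light) { (void)object; (void)light; }
    static void write_gbuffer(int object)                      { (void)object; }
    static void shade_pixel_with_light(int pixel, int light)   { (void)pixel;  (void)light; }

    /* Forward rendering: every object is shaded against every light,
     * so the cost grows as objects x lights. */
    void render_forward(int object_count, int light_count)
    {
        for (int o = 0; o < object_count; o++)
            for (int l = 0; l < light_count; l++)
                shade_object_with_light(o, l);
    }

    /* Deferred rendering: one geometry pass over the objects, then one
     * lighting pass over the screen, so the cost grows as objects + lights
     * (times the fixed number of pixels) instead. */
    void render_deferred(int object_count, int light_count, int pixel_count)
    {
        for (int o = 0; o < object_count; o++)
            write_gbuffer(o);

        for (int p = 0; p < pixel_count; p++)
            for (int l = 0; l < light_count; l++)
                shade_pixel_with_light(p, l);
    }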

     Texture types 
The overall point of a texture is to provide detail to polygons or meshes without needing to spend more polygons to represent that detail. While there are some obvious downsides, the tradeoff is faster rendering times. A texture type is usually called a "map", because it's a "map" of how the pixels should look on screen.

  • Absorption Map: defines where a type of volumetric shader should absorb light in the way smoke or fog might.
  • Bump map: A bump map creates the effect of raised or lowered details on flat surfaces, such as tiny scratches on metal or glass, or the grain of wood. Works best with small details and simple distortions, and can become obvious at extreme viewing angles.
  • Diffuse map: Basically this is just a picture defining the color of a surface. For example, a brick wall picture could be placed on a polygon to resemble one.
  • Displacement map: Tells the render engine to treat the surface as actually being raised or lowered in an area, as if sculpting the actual mesh. It was normally used for large flat meshes, such as landscapes, as a fast way to create features. It is now often combined with tessellation, which lets it sculpt a basic model into a highly detailed one.
  • Emission map: defines areas of an object to emit light instead of interact with it.
  • Environment map: A texture with an approximation of the world's surroundings. This is used in simple reflections, and to create realism when accuracy is less important than looking good or matching a scene it will be composited into. It could be either in the form of sphere maps, cube maps or in some cases paraboloid maps, which can produce cube map-like reflections while using fewer passes.
  • Normal Map: Alters the angle at which the render engine "sees" the pixel- it might be used to create the effect of dents in metal, distortions in glass, or scales on a reptile. Works best for shallow details that need realistic shading, and bizarre Alien Geometry effects where part of an object interacts with light and shadow in strange ways.
  • Parallax Map: Alters how "deep" the rendering engine sees the pixel. This allows even more three-dimensional detail to be seen on the surface of a polygon. For instance, if you had a parallax map of a grill over a vent, moving up and down would give the appearance that the vanes of the grill obscure each other. The illusion fails in certain cases, however, such as looking at the corner of a parallax-mapped cube. Also, if the map isn't detailed enough, you can see discrete steps in the depth of the pixel.
  • Refraction map: Combines with an IOR (index of refraction) value to create light-distorting effects, like glass or the "ripple" of a Negative Space Wedgie.
  • Scattering Map: Similar to an absorption map, but with a glow-like effect for simulating subtle atmospherics and haze.
  • Shadow Map: Adds shadows dynamically (i.e., calculated per frame) to surfaces, rather than relying on the diffuse map to bake them in. It is generated by essentially re-rendering the scene from the light's point of view, minus several steps, since it only needs to record what's blocking the light (a sketch of the depth comparison follows this list). While cheaper than ray-traced shadows for each individual light, the calculations can easily pile up with more lights and higher shadow map resolutions, to the point where ray tracing becomes cheaper.
  • Specular map: Shows where and how shiny the object is.
  • Subsurface Scattering map: defines a type of internal translucent glow seen in materials like skin or plastic.
  • Transparency map: Shows where an object should be translucent or transparent.
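
As a rough sketch of how a shadow map is consumed during shading, here is a minimal C depth comparison. The map resolution and bias value are illustrative assumptions.

    #include <stdbool.h>

    #define SHADOW_RES 1024   /* illustrative shadow map resolution */

    /* The shadow map holds, for each texel, the depth of the closest
     * surface as seen from the light. A pixel is in shadow if something
     * nearer to the light was recorded at its position. */
    bool in_shadow(const float shadow_map[SHADOW_RES][SHADOW_RES],
                   float light_space_x,     /* 0..1 across the map */
                   float light_space_y,     /* 0..1 down the map   */
                   float depth_from_light)
    {
        int sx = (int)(light_space_x * (SHADOW_RES - 1));
        int sy = (int)(light_space_y * (SHADOW_RES - 1));

        /* The small bias avoids "shadow acne" caused by precision limits. */
        const float bias = 0.005f;
        return depth_from_light - bias > shadow_map[sy][sx];
    }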

    Lighting methods 
  • Flat shading: Lighting is calculated on the polygonal level. It makes everything have a polygonal/pixelated look.
  • Gouraud (per-vertex) lighting: Lighting done on each vertex, each pixel is shaded accordingly based on the light the vertex received. It's a better lighting effect than flat shading, but it does produce poor lighting results occasionally.
  • Phong (Per-pixel) lighting: Lighting done on each individual pixel. Basically lighting is done on each pixel against the polygons and textures it covers.
  • Ray tracing: A family of algorithms where, from the point of view, one or more rays are shot out to simulate light. Color is calculated by how these rays interact with objects.
    • Ray casting: Color is determined by the first object the ray intersects. This was used in early 3D games like Wolfenstein 3-D.
    • Sparse Voxel Octree: Instead of the world being represented by polygons, it's represented by relatively large voxels. This makes it easier to detect if a ray is hitting something. However, due to the size of the voxels, it's mostly used for soft lighting effects like global illumination or shadowing.
    • Ray Marching: An approach where, starting from the camera, the renderer asks how far away the closest object in the scene is. It then makes a sphere with that distance as its radius, "marches" to the edge of the sphere in the direction of the ray, and repeats until the sphere is considered small enough to say the ray has hit something (a minimal marching loop is sketched after this list). While faster than path tracing, the speedup only holds while the spheres stay reasonably large, and settling for larger spheres means it can't resolve fine details.
    • Path tracing: This is normally what's meant when the term "ray tracing" is used. Multiple rays at random angles per pixel are shot out into the scene, and the accumulation of these rays determines the final color of the pixel. The more rays that can be shot, the better the result. This is considered the holy grail of real-time 3D rendering, and was too computationally expensive to perform in real time to an acceptable degree until late 2018.
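
Here is a minimal sphere-tracing loop in C to make the ray marching description above concrete. The signed-distance function is a toy single-sphere scene, and the step limits and thresholds are illustrative assumptions.

    #include <stdbool.h>
    #include <math.h>

    /* Toy signed-distance function: a single sphere of radius 1 at the
     * origin stands in for a real scene. */
    static float distance_to_scene(float x, float y, float z)
    {
        return sqrtf(x * x + y * y + z * z) - 1.0f;
    }

    /* March along the ray: each step asks how far the nearest surface is
     * and jumps that far (the "sphere"), stopping when the distance is
     * small enough to call it a hit. */
    bool ray_march(float ox, float oy, float oz,   /* ray origin           */
                   float dx, float dy, float dz,   /* ray direction (unit) */
                   float *hit_distance)
    {
        float t = 0.0f;
        for (int i = 0; i < 128; i++) {
            float d = distance_to_scene(ox + dx * t, oy + dy * t, oz + dz * t);
            if (d < 0.001f) {          /* sphere small enough: call it a hit */
                *hit_distance = t;
                return true;
            }
            t += d;                    /* safe to advance this far           */
            if (t > 1000.0f)           /* wandered off into empty space      */
                break;
        }
        return false;
    }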

    Other terms 
  • High dynamic range lighting: An enhancement for calculating lighting. In standard lighting, all light sources are clamped to a dynamic range of 256 values, which can lead to strange artifacts: if a surface reflects 50% of light, for instance, light from the sun can end up looking only as bright as a flashlight. With HDR lighting, lighting is calculated in a higher dynamic range, then sampled down. This allows light from the sun to remain as bright as it should be, even if the reflectivity is low. Note this is different from outputting HDR to displays.
  • Level of detail (LOD) and Mipmapping: An early optimization in rendering. If a 3D model or texture is far enough away from the player camera, it would be wasteful to render it at full detail, since the player can only make out so much of it. Instead, the model or texture is swapped for a lower-detail version. The term LOD itself is generally used for 3D models, while mipmapping refers specifically to textures, though LOD may be used for textures as well.
  • Materials: A composite type of property for objects which combines texture mapping, sound, and physics. For example, if the game engine comes with a wood material, applying it to an object makes it look like wood, scrape like wood, sound like wood, and break like wood. Likewise, applying a metallic material would make the same object look like metal, shine like metal, and sound like metal.
  • Particles: A system that simulates "fuzzy" things like smoke or fire, but can also be used for more ordered or larger systems like schools of fish, sparks flying from electrical arcs, or even galaxies. The graphical element used tends to be a collection of sprites, combined with a physics system that controls where the particles come from (emission) and how they move through the world (simulation); a minimal update loop is sketched after this list. While fast at rendering these fuzzy things, particles tend to be simulated in a vacuum and aren't really influenced by the rest of the game world. For instance, smoke particle systems tend to clip through solid objects as if they didn't exist.
  • Physically based rendering: A style of rendering gaining ground since 2012. It's not a more complex type of lighting, but a change in philosophy to make lights and lighting effects behave realistically. An example is "conservation of energy": a reflection of light cannot be brighter than the light itself, a rule that had sometimes been broken in older renderers to achieve a desired effect.
  • Volumetrics: a type of shader that considers the space inside an object rather than its surface. An Energy Ball, for instance, might be defined by a simple cube and a round gradient within that space.
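
As a rough sketch of the emission/simulation split described under Particles above, here is a minimal C update loop. The particle count, gravity constant, and field layout are illustrative assumptions.

    #define MAX_PARTICLES 1024

    typedef struct {
        float x, y, z;      /* position                        */
        float vx, vy, vz;   /* velocity                        */
        float life;         /* seconds remaining; <= 0 is dead */
    } Particle;

    static Particle particles[MAX_PARTICLES];

    /* One simulation step: move every live particle and age it. An emitter
     * elsewhere would refill dead slots with fresh particles. */
    void simulate_particles(float dt)
    {
        for (int i = 0; i < MAX_PARTICLES; i++) {
            if (particles[i].life <= 0.0f)
                continue;                       /* dead slot */

            /* No collision test here, which is why smoke particles happily
             * drift through solid walls. */
            particles[i].vy   += -9.8f * dt;    /* gravity */
            particles[i].x    += particles[i].vx * dt;
            particles[i].y    += particles[i].vy * dt;
            particles[i].z    += particles[i].vz * dt;
            particles[i].life -= dt;
        }
    }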
