Nobody denies that Intel's 0.13-micron-architecture Pentium 4 "Northwood" is a powerful processor, but it's only the second strongest chip in many PC gamers' desktops. The GeForce 6800 Ultra graphics processing unit (GPU) that Nvidia Corp. introduced last week has over 220 million transistors -- as many as four Northwoods or two of AMD's Athlon 64s, without using millions for on-chip cache as the PC processors do. If you think technical jargon about clock speeds and memory interfaces is only for CPUs, it's time you got up to speed on another silicon arena, where the warring superpowers are Nvidia and ATI Technologies instead of Intel and AMD. Let's take an introductory look at the engines that power today's games' and animated films' increasingly realistic 3D worlds.
Nvidia coined the acronym GPU -- and defined it as a single-chip processor with integrated transform, lighting, triangle setup/clipping, and rendering engines, capable of producing at least 10 million polygons per second -- when introducing its GeForce 256 in August 1999. Archrival ATI tries to avoid the term, referring to "visual processing units" (or, for its motherboard chipsets, "integrated graphics processors"), but the GPU tag has become popular enough for our purposes.
Whether in a game player's PC or a scientific engineer's workstation, the GPU is designed to take a load off the system processor by handling the majority of 3D rendering and setup duties. Ever-more-complex transform and lighting (T&L) engines and vertex and pixel processors have driven GPUs' growth in size and complexity, but most consist of the same basic components, seen below in ATI's block diagram of its R300 (Radeon 9700 Pro) core.
Some old-school parts of the chip include the 2D engine for productivity applications and image editing -- once the main performance consideration, now considerably overshadowed by the 3D circuitry -- and interfaces that pass data in and out of the GPU, whether the AGP bus (soon to be pushed aside by PCI Express) or the display outputs for different types of monitors such as CRTs and LCDs. (If you run into the term RAMDAC, it's short for Random Access Memory Digital-to-Analog Converter, the component that converts digital image data to analog signals for a CRT's red, green, and blue electron guns; a higher RAMDAC clock supports higher screen-refresh rates at a given resolution.)
Moving further into the GPU brings us to today's main attractions, the 3D-specific components. These may differ in terms of naming conventions and architectural design, but remain pretty consistent in terms of function, and include various setup engines, memory compression algorithms (HyperZ III in ATI's chart), an antialiasing unit, memory interface, and 3D rendering engine.
Rendering Hardware
Strictly speaking, of course, when we talk about 3D games or 3D graphics, we're almost always talking about something viewed on a 2D screen. The latter is made up of pixels, while a computerized 3D model is composed of meshes of polygons -- typically triangles whose corners (vertices) define three points in space (three sets of X, Y, and Z coordinates). Using more and smaller triangles makes objects look smoother and more realistic, just as using more pixels smooths out jagged lines in 2D.
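To make the "more and smaller triangles" point concrete, here is a minimal Python sketch that approximates a round disc with a fan of triangles and measures how close the mesh gets to the true shape. The function names (disc_mesh, triangle_area, mesh_area) are our own illustrative choices, not part of any graphics API, and real models are built by artists and modeling tools rather than generated this way.

```python
import math


def disc_mesh(n):
    """Return (vertices, triangles) for a fan of n triangles approximating a unit disc."""
    vertices = [(0.0, 0.0, 0.0)]  # centre vertex, index 0
    for k in range(n):
        angle = 2.0 * math.pi * k / n
        vertices.append((math.cos(angle), math.sin(angle), 0.0))
    # Each triangle is three indexes into the shared vertex list.
    triangles = [(0, 1 + k, 1 + (k + 1) % n) for k in range(n)]
    return vertices, triangles


def triangle_area(a, b, c):
    """Area of a 3D triangle: half the length of the cross product of two edges."""
    ux, uy, uz = (b[i] - a[i] for i in range(3))
    vx, vy, vz = (c[i] - a[i] for i in range(3))
    cx = uy * vz - uz * vy
    cy = uz * vx - ux * vz
    cz = ux * vy - uy * vx
    return 0.5 * math.sqrt(cx * cx + cy * cy + cz * cz)


def mesh_area(vertices, triangles):
    """Total surface area of an indexed triangle mesh."""
    return sum(triangle_area(*(vertices[i] for i in tri)) for tri in triangles)


if __name__ == "__main__":
    for n in (6, 24, 96):
        verts, tris = disc_mesh(n)
        print(n, "triangles: area", round(mesh_area(verts, tris), 4),
              "vs true disc area", round(math.pi, 4))
```

As the triangle count rises, the mesh area creeps toward pi, the area of the true disc. That is the numerical counterpart of a curved edge looking less faceted on screen.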
Rendering is the process of converting a 3D model to 2D for display. Progress in rendering processor and memory architectures has outpaced that of any other PC technology in recent years; just as today's superscalar CPUs can execute multiple instructions in one clock cycle, GPUs have grown from four to eight to, in the case of Nvidia's brand-new GeForce 6800 Ultra, 16 parallel pipelines, each able to render one pixel per clock.
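The core of that 3D-to-2D conversion is a perspective projection of each vertex onto the screen. The Python sketch below shows the idea for a single point. It assumes a camera at the origin looking down the positive Z axis, and the focal_length, width, and height parameters are illustrative defaults of our own. Actual hardware works with 4x4 matrices and adds clipping, depth testing, and rasterization, all of which are omitted here.

```python
def project_vertex(vertex, focal_length=1.0, width=640, height=480):
    """Perspective-project a camera-space (x, y, z) point to pixel coordinates.

    Assumes the camera sits at the origin looking down +z, so z must be > 0.
    """
    x, y, z = vertex
    if z <= 0:
        raise ValueError("vertex is behind the camera")
    # Perspective divide: distant points move toward the centre of the view.
    ndc_x = focal_length * x / z
    ndc_y = focal_length * y / z
    # Map the [-1, 1] range to pixel coordinates; screen y grows downward.
    px = (ndc_x + 1.0) * 0.5 * width
    py = (1.0 - ndc_y) * 0.5 * height
    return px, py


if __name__ == "__main__":
    # The same triangle at two distances: the farther copy covers fewer pixels.
    for depth in (2.0, 4.0):
        triangle = [(-1.0, -1.0, depth), (1.0, -1.0, depth), (0.0, 1.0, depth)]
        print(depth, [project_vertex(v) for v in triangle])
```

Dividing by Z is what makes distant objects shrink toward the middle of the view, and a GPU repeats this kind of per-vertex arithmetic for every triangle in every frame.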
The pipelines, combined with texture-mapping units (TMUs), produce the end-result image data. More pipelines mean faster performance, while more TMUs mean better-looking pixels; the speed-versus-quality equation is often noted in pipeline x TMU form, with ATI's Radeon 9800 series featuring an 8x1 design while Nvidia's GeForce FX 5900 has a 4x2 architecture.
Real-world speed measurement for GPUs involves not megahertz or gigahertz, but the number of pixels rendered. This can be expressed as peak pixel fillrate -- the number of pipelines times clock speed, such as 8 times 380MHz to yield 3.04 gigapixels/sec for the Radeon 9800 Pro. Another popular spec is texel fillrate, or number of textured pixels per second, which multiplies the pixel fillrate by the number of TMUs per pipe.
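The two formulas above are simple enough to check in a few lines of Python. This sketch reproduces the 8-pipeline, 380MHz figure and compares an 8x1 layout with a 4x2 layout at the same clock. The function names are ours, and the 380MHz clock used for the 4x2 case is a hypothetical value chosen only to keep the comparison even, not the actual clock of any Nvidia part. These are theoretical peaks; memory bandwidth and other bottlenecks keep real throughput lower.

```python
def pixel_fillrate_mpix(pipelines, clock_mhz):
    """Peak pixel fillrate in megapixels/sec: one pixel per pipeline per clock."""
    return pipelines * clock_mhz


def texel_fillrate_mtex(pipelines, tmus_per_pipe, clock_mhz):
    """Peak texel fillrate in megatexels/sec: pixel fillrate times TMUs per pipe."""
    return pixel_fillrate_mpix(pipelines, clock_mhz) * tmus_per_pipe


if __name__ == "__main__":
    clock = 380  # MHz, the figure used in the text
    for pipes, tmus in ((8, 1), (4, 2)):
        pix = pixel_fillrate_mpix(pipes, clock) / 1000
        tex = texel_fillrate_mtex(pipes, tmus, clock) / 1000
        print(f"{pipes}x{tmus} at {clock}MHz: {pix} Gpixels/sec, {tex} Gtexels/sec")
```

At equal clocks the two layouts tie on texel fillrate, but the 8x1 design has twice the pixel fillrate, so it pulls ahead whenever a pixel needs only one texture.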
Two very popular buzzwords in the GPU marketplace are pixel and vertex shaders. These are actually programs or functions performed by pixel and vertex processors within the GPU, which load data into registers, execute shader instructions, and render various visual effects and textures. Successive versions of Microsoft's DirectX programming specification, such as the current DirectX 9 versus its predecessor DirectX 8.1, permit more complex vertex and pixel shaders with more instructions and higher mathematical accuracy, which in turn permit more realistic-looking models.
The programmable nature of current vertex and pixel shaders means developers can not only use default instructions, but design new ones to fit their custom needs. Pop Finding Nemo into the DVD player, and you'll see the benefit of high-end pixel shaders and their ability to render complex surfaces.
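For readers who have never seen a shader, the idea of a small program that runs once per vertex or once per pixel can be sketched in ordinary Python. This is only an illustration running on the CPU: real shaders are written in shader assembly or a high-level shading language and executed by the GPU, and the function names and the simple diffuse (Lambertian) lighting model below are our own choices for the example, not anything DirectX prescribes.

```python
def normalize(v):
    """Scale a 3D vector to unit length."""
    length = sum(c * c for c in v) ** 0.5
    return tuple(c / length for c in v)


def dot(a, b):
    """Dot product of two 3D vectors."""
    return sum(x * y for x, y in zip(a, b))


def vertex_shader(position, offset=(0.0, 0.0, 0.0), scale=1.0):
    """Runs once per vertex: here, a simple scale followed by a translation."""
    return tuple(scale * p + o for p, o in zip(position, offset))


def pixel_shader(base_color, normal, light_dir):
    """Runs once per pixel: diffuse lighting from a single light direction."""
    # Surfaces facing the light get full colour; those facing away go black.
    intensity = max(0.0, dot(normalize(normal), normalize(light_dir)))
    return tuple(round(c * intensity) for c in base_color)


if __name__ == "__main__":
    print(vertex_shader((1.0, 2.0, 3.0), offset=(1.0, 0.0, 0.0), scale=2.0))
    orange = (200, 100, 50)
    for light in ((0.0, 0.0, 1.0), (0.0, 1.0, 1.0), (0.0, 0.0, -1.0)):
        print(light, pixel_shader(orange, (0.0, 0.0, 1.0), light))
```

Swapping in a different function body changes the look of every pixel it touches, which is the flexibility that programmable shaders give developers.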