diff --git a/_posts/2025-01-22-riva128-part-1.md b/_posts/2025-01-22-riva128-part-1.md index b6e612b..d9f9a33 100644 --- a/_posts/2025-01-22-riva128-part-1.md +++ b/_posts/2025-01-22-riva128-part-1.md @@ -30,45 +30,93 @@ Please contact me @ thefrozenstar_ on Discord, via the 86Box Discord, my email a ## Introduction -The NVIDIA RIVA 128 is a graphics card released in 1997 by NVIDIA (nowadays of AI and $2000 overpriced quad-slot GPU fame). It was a Direct3D 5.0-capable accelerator, and one of the first to use a standard graphics API, such as DirectX, as its "native" API. I have been working on emulating this graphics card for the last several months; currently, while VGA works and the drivers are loading successfully on Windows 2000, they are not rendering any kind of accelerated output yet. Many people, including me, have asked for and even tried to develop emulation for this graphics card and other similar cards (such as its successor, "NV4" or RIVA TNT), but have not succeeded yet (although many of these efforts continue). This is the first part of a series where I explore the architecture and my experiences in emulating this graphics card. I can't guarantee success, but if it was successful, it appears that it would be the first time that a full functional emulation of this card has been developed (although later NVIDIA cards have been emulated to at least some extent, such as the GeForce 3 in Cxbx-Reloaded and Xemu). +The NVIDIA RIVA 128 is a graphics card released in 1997 by NVIDIA, nowadays of AI and $2000 overpriced quad-slot GPU fame. It was a Direct3D 5.0-capable accelerator, and one of the first to use a standard graphics API such as DirectX as its "native" API. I have been working on emulating this graphics card for the last several months; currently, while VGA works and the drivers are loading successfully on Windows 2000, they are not rendering any kind of accelerated output yet. Many people, myself included, have asked for and even tried to develop emulation for this graphics card and other similar cards (such as its successor, "NV4" or RIVA TNT), but have not succeeded yet, although many of these efforts continue. This is the first part of a series where I explore the architecture and my experiences in emulating this graphics card. I can't guarantee success, but if this effort succeeds, it appears it would be the first fully functional emulation of this card ever developed, although later NVIDIA cards have been emulated to at least some extent, such as the GeForce 3 in Cxbx-Reloaded and Xemu. -This is the first part of a series of blog posts that aims to demystify, once and for all, NVIDIA RIVA 128. This is the first part, which will dive into the history of NVIDIA up to the release of the NVIDIA RIVA 128, and a brief overview of how the RIVA 128 actually works. The second part will dive into the architecture of NVIDIA's drivers and how they relate to the hardware, and the third part will follow the lifetime of a graphics object from birth to display on the screen in extreme detail. Then, part four and an unknown number of parts after part four will go into detail on the experiences of developing a functional emulation for this graphics card. +This is the first part in a series of blog posts that aims to demystify, once and for all, NVIDIA RIVA 128. This first part will dive into the history of NVIDIA up to the release of the RIVA 128, and give a brief overview of how the chip actually works. 
The second part will dive into the architecture of NVIDIA's drivers and how they relate to the hardware, and the third part will follow the lifetime of a graphics object from birth to display on the screen in extreme detail. Then, part four (and however many parts follow it) will go into detail on my experiences developing a functional emulation for this graphics card. --- -## A brief history +## A not-so-brief history ### Beginnings -NVIDIA was conceived in 1992 by three engineers from LSI Logic and Sun Microsystems - Jensen Huang (now one of the world's richest men, still the CEO and, apparently, mobbed by fans in the country of his birth, Taiwan), Curtis Priem (whose boss almost convinced him to work on Java instead of founding the company) and Chris Malachowsky (a veteran of graphics chip development). They saw a business opportunity in the PC graphics and audio market, which was dominated by low-end, high-volume players such as S3 Graphics, Tseng Labs, Cirrus Logic and Matrox (the only one that still exists today - after exiting the consumer graphics market in 2003 and ceasing to design graphics cards entirely in 2014). The company was formally founded on April 5, 1993, after all three left their jobs at LSI Logic and Sun between December 1992 and March 1993. Immediately (well, after the requisite $3 million of venture capital funding was acquired - a little nepotism owing to their reputation helped) began work on its first generation graphics chip; it was one of the first of a rush of dozens of companies attempting to develop graphics cards - both established players in the 2D graphics market such as Number Nine and S3, and new companies, almost all of which no longer exist - and many of which failed to even release a single graphics card. The name was initially GXNV ("GX next version", after a graphics card Malachowsky led the development of at Sun), but Huang requested him to rename the card to NV1 in order to not get sued. This also inspired the name of the company - NVIDIA, after other names such as "Primal Graphics" and "Huaprimal" were considered and rejected, and their originally chosen name - Invision - turned out to have been trademarked by a toilet paper company. In a perhaps ironic twist of fate, toilet paper turned out to be an apt metaphor for the sales, if not quality, of their first product, which Jensen Huang appears to be embarassed to discuss when asked, and has been quotes as saying "You don't build NV1 because you're great". The product was released in 1995 after a two-year development cycle and the creation of what NVIDIA dubbed a hardware simulator, but actually appears to have been simply a set of Windows 3.x drivers intended to emulate their architecture, called the NV0 in 1994. + +NVIDIA was conceived in 1992 by three LSI Logic and Sun Microsystems engineers: Jensen Huang (now one of the world's richest men, still the CEO and, apparently, mobbed by fans in his country of birth, Taiwan), Curtis Priem (whose boss almost convinced him to work on Java instead of founding the company) and Chris Malachowsky (a veteran of graphics chip development). They saw a business opportunity in the PC graphics and audio market, which was dominated by low-end, high-volume players such as S3 Graphics, Tseng Labs, Cirrus Logic and Matrox[^matrox]. The company was formally founded on April 5, 1993, after all three left their jobs at LSI Logic and Sun between December 1992 and March 1993. 
+ +[^matrox]: Only Matrox is both still around and still in the graphics space, after exiting the consumer market in 2003 and ceasing to design graphics cards entirely in 2014 before recently returning with Intel Arc-based designs. Cirrus Logic is still around as an audio chip designer, stemming from their acquisition of Crystal in the 1990s. + +After the requisite $3 million of venture capital funding was acquired (a little nepotism owing to their reputation helped), work immediately began on a first-generation graphics chip; NVIDIA was one of the first of a rush of dozens of companies attempting to develop graphics cards - both established players in the 2D graphics market such as Number Nine and S3, and new companies, almost all of which no longer exist - many of which failed to even release a single graphics card. The name was initially GXNV for "GX next version", after a graphics card Malachowsky led the development of at Sun, but Huang asked him to rename the card to NV1 in order to not get sued. This also inspired the name of the company - NVIDIA, after other names such as "Primal Graphics" and "Huaprimal" were considered and rejected, and their originally chosen name of "Invision" turned out to have been trademarked by a toilet paper company. + +In a perhaps ironic twist of fate, toilet paper turned out to be an apt metaphor for the sales, if not quality, of their first product, which Jensen Huang appears to be embarrassed to discuss when asked, and has been quoted as saying "You don't build NV1 because you're great". The product was released in 1995 after a two-year development cycle and the creation, in 1994, of what NVIDIA dubbed a hardware simulator called the NV0, which actually appears to have been simply a set of Windows 3.x drivers intended to emulate their architecture. ### The NV1 -The NV1 was a combination graphics, audio, DRM (yes, really) and game port card implementing what NVIDIA dubbed the "NV Unified Media Architecture (UMA)"; the chip was manufactured by SGS-Thomson Microelectronics - now STMicroelectronics - on the 350 nanometer node, who also white-labelled NVIDIA's design (except the DAC, which seems to have been designed by SGS, at least based on the original contract text from 1993) as the STG-2000 (without audio functionality - this was also called the "NV1-V32", for 32-bit VRAM, in internal documentation, with NVIDIA's being the NV1-D64). The card was designed to implement a reasonable level of 3D graphics functionality, as well as audio, public-key encryption for DRM purposes (which was never used, as it would have required the cooperation of software companies) and Sega Saturn game ports in a single megabyte of memory (memory cost $50 a megabyte when the initial design of the NV1 chip began in 1993). In order to achieve this, many techniques had to be used that ultimately compromised the quality of the 3D rendering of the card, such as using forward texture mapping, where a texel (output pixel) of a texture is directly mapped to a point on the screen, instead of the more traditional inverse texture mapping, which iterates through pixels and maps texels from those. While this has memory space advantages (as you can cache the texture in the very limited amount of VRAM NVIDIA had to work with very easily), it has many more disadvantages - firstly, this approach does not support UV mapping (a special coordinate system used to map textures to three-dimensional objects) and other aspects of what would be considered to be today basic graphical functionality. 
Additionally, the fundamental implementation of 3D rendering used quad patching instead of traditional triangle-based approaches - this has very advantageous implications for things like curved surfaces, and may have been a very effective design for the CAD/CAM customers purchasing more high end 3D products. However, it turned out to not be particularly useful at all for the actually intended target market - gaming. There was also a total lack of SoundBlaster compatibility (required for audio to sound half-decent in many games) in the audio engine, and partially-emulated and very slow VGA compatibility, which led to slow performance in the games people *actually played*, unless your favourite game was a crappier, slower version of Descent, Virtua Cop or Daytona USA for some reason. Another body blow to NVIDIA was received when Microsoft released Direct3D in 1996 with DirectX 2.0, which simultaneously used triangles, became the standard 3D API and killed all of the numerous non-OpenGL proprietary 3D apis, including S3's S3D and later Metal, ATIs 3DCIF, and NVIDIA's NVLIB. -The upshot of all of this was, despite its innovative silicon design, what can be understood as nothing less than the total failure of NVIDIA to sell, or convince anyone to develop for in any way, the NV1. While Diamond Multimedia bought 250,000 chips to place into boards (marketed as the Diamond "Edge 3D" series), barely any of them sold, and those that did sell were often returned, leading to the chips themselves being returned to NVIDIA and hundreds of thousands of chips sitting simply unused in warehouses. Barely any NV1-capable software was released - the few pieces of software that were released came via a partnership with Sega, which I will elaborate on later - and most of it was forced to run under software emulators for Direct3D (or other APIs) written by Priem - which was only possible due to the software architecture NVIDIA chose for their drivers - which were slow (slower and worse-looking than software), buggy, and extremely unappealing. NVIDIA lost $6.4 million in 1995 on a revenue of $1.1 million, and $3 million on a revenue of $3.9 million in 1996; most of the capital that allowed NVIDIA to continue operating were from the milestone payments from SGS-Thomson for developing the card, their NV2 contract with Sega (which wlll be explored later), and their venture capital funding - not the very few NV1 sales. The card reviewed poorly, had very little software and ultimately no sales, and despite various desperate efforts to revive it, including releasing the SDK for free (a proprietary API called NVLIB was used to develop games against the NV1) and straight up begging their customers on their website to spam developers with requests to develop NV1-compatible versions of games, the card was effectively dead within a year. +The **NV1** was a combination graphics, audio, DRM (yes, really) and game port card implementing what NVIDIA dubbed the "NV Unified Media Architecture" (UMA); the chip was manufactured on the 350 nanometer node by SGS-Thomson Microelectronics (now STMicroelectronics), who also white-labelled NVIDIA's design (which allegedly[^contract] featured a DAC block designed by SGS-Thomson) as the STG-2000, a version without audio functionality that is also called the "NV1-V32" (for 32-bit VRAM) in internal documentation, as opposed to NVIDIA's design being the NV1-D64. 
The card was designed to implement a reasonable level of 3D graphics functionality, as well as audio, public-key encryption for DRM purposes (ultimately never used as it would have required the cooperation of software companies) and Sega Saturn game ports, all within a single megabyte of RAM, as memory costs were around $50 a megabyte when initial design began in 1993. -### The NV2 and the near destruction of the company -Nevertheless, NVIDIA (which by this time had close to a hundred employees, including sales and marketing teams), and especially its cofounders, remained confident in their architecture and overall prospects of success . They had managed to solidify a business relationship with Sega, to the point where they had initially won the contract to provide the graphics hardware for the successor to the Sega Saturn, at that time codenamed "V08". The GPU was codenamed "Mutara" (after the nebula critical to the plot in Star Trek II: The Wrath of Khan) and the overall architecture was the NV2. It maintained many of the functional characteristics of the NV1 and was essentially a more powerful successor to that card. +[^contract]: Source: [Strategic Collaboration Agreement between NVIDIA and SGS-Thomson](http://web.archive.org/web/20240722140726/https://contracts.onecle.com/nvidia/sgs.collab.1993.11.10.shtml), originally covering NV1 but revised to include NV3, apparently part of a filing with the US Securities and Exchange Commission. -However, problems started to emerge almost immediately. Game developers, especially Sega's internal game developers, were not happy with having to use a GPU with such a heterodox design (for example, porting games to or from the PC, which Sega did do at the time, would be made far harder). This position was especially championed by Yu Suzuki, the head of one of Sega's most prestigious internal development teams Sega-AM2 (responsible, among others, for the Daytona USA, Virtua Racing, Virtua Fighter, and Shenmue series), who sent his best graphics programmer to interface with NVIDIA and push for triangles. At this point, the story diverges - some tellings claim that NVIDIA simply refused to accede to Sega's request and this severely damaged their relationship; others that the NV2 wasn't killed until it failed to produce any video during a demonstration, and Sega still paid NVIDIA for developing it to prevent bankruptcy (it does appear that one engineer got it working for the sole purpose of receiving a milestone payment). At some point, Sega, as a traditional Japanese company, couldn't simply kill the deal, so officially, they relegated the NV2 to be used in the successor to the educational aimed-at-toddlers Sega Pico, while in reality, Sega of America had already been told to "not worry" about NVIDIA anymore. NVIDIA got the hint, and the NV2 was cancelled. A bonus fact that isn't really applicable anywhere is that the NV2 appears to have been manufactured by the then-just founded Helios Semiconductor, based on available sources, the only NVIDIA card to have been manufactured by them. +In order to achieve this, many techniques had to be used that ultimately compromised the quality of the 3D rendering of the card, such as using forward texture mapping, where each texel (texture element) is directly mapped to a point on the screen, instead of the more traditional inverse texture mapping, which iterates through screen pixels and fetches texels for them. While this has memory space advantages (the texture can easily be cached in the very limited amount of VRAM NVIDIA had to work with), it has many more disadvantages; firstly, this approach does not support UV mapping (a special coordinate system used to map textures to three-dimensional objects) and other aspects of what would today be considered basic graphical functionality.
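+
+To make the forward-versus-inverse distinction concrete, here is a minimal C sketch of the two loops for the simplest possible case, a screen-aligned rectangle. This is purely illustrative - the names and setup are mine, not NV1's actual datapath - but it shows why forward mapping *scatters* texels (leaving holes or overdraw when the on-screen size doesn't match the texture) while inverse mapping *gathers* them, giving every destination pixel exactly one value and a natural home for UV coordinates:
+
+```c
+#include <stdint.h>
+
+#define TEX_W    64
+#define TEX_H    64
+#define SCREEN_W 320
+#define SCREEN_H 200
+
+static uint8_t texture[TEX_H][TEX_W];
+static uint8_t screen[SCREEN_H][SCREEN_W];
+
+/* Forward mapping (NV1-style): walk the texture and project each texel onto
+ * the screen. If dst_w > TEX_W, some screen pixels are never written (holes);
+ * if dst_w < TEX_W, several texels fight over one pixel (overdraw). */
+void forward_map(int dst_x, int dst_y, int dst_w, int dst_h)
+{
+    for (int v = 0; v < TEX_H; v++) {
+        for (int u = 0; u < TEX_W; u++) {
+            int sx = dst_x + u * dst_w / TEX_W; /* scatter: texel -> pixel */
+            int sy = dst_y + v * dst_h / TEX_H;
+            if (sx >= 0 && sx < SCREEN_W && sy >= 0 && sy < SCREEN_H)
+                screen[sy][sx] = texture[v][u];
+        }
+    }
+}
+
+/* Inverse mapping (the conventional approach): walk the covered screen pixels
+ * and fetch the texel under each one. The (u, v) pair computed here is exactly
+ * where UV coordinates, filtering and mipmapping slot in on later hardware. */
+void inverse_map(int dst_x, int dst_y, int dst_w, int dst_h)
+{
+    for (int sy = dst_y; sy < dst_y + dst_h; sy++) {
+        for (int sx = dst_x; sx < dst_x + dst_w; sx++) {
+            int u = (sx - dst_x) * TEX_W / dst_w; /* gather: pixel -> texel */
+            int v = (sy - dst_y) * TEX_H / dst_h;
+            if (sx >= 0 && sx < SCREEN_W && sy >= 0 && sy < SCREEN_H)
+                screen[sy][sx] = texture[v][u];
+        }
+    }
+}
+```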
-At this point, NVIDIA had no sales, no customers, and barely any money (at some point in late 1996, NVIDIA had $3 million - and was burning through $330,000 a month). Most of the NV2 team had been redeployed to the next generation - the NV3. No venture capital funding was going to be forthcoming due to the failure to actually create any products people wanted to buy, at least not without extremely unfavourable terms on things like ownership. It was effectively almost a complete failure and a waste of years of the employees time. By the end of 1996, things had gotten infinitely worse - the competition was heating up. Despite NV1 being the first texture-mapped consumer GPU ever released, they had been fundamentally outclassed by their competition. It was a one-two punch: initially, Rendition, founded around the same time as NVIDIA in 1993 released its custom RISC architecture V1000 chip. While not particularly fast, this was, for a few months, the only card that could run Quake (the hottest game of 1996) in hardware accelerated mode and was an early market leader (as well as S3's laughably bad ViRGE - Video and Rendering Graphics Engine - which at launch was slower than software on high-end CPUs, and was reserved for OEM bargain-bin disaster machines). +Additionally, the fundamental implementation of 3D rendering used quad patching instead of traditional triangle-based approaches; this has very advantageous implications for things like curved surfaces, and may have been a very effective design for the CAD/CAM customers purchasing more high-end 3D products; however, it turned out not to be particularly useful at all for the actual target market of gaming. There was also a total lack of Sound Blaster compatibility (very much a requirement for half-decent audio in games back then) in the audio engine, and VGA compatibility was very slow and partially emulated, which led to slow performance in the games people *actually played*, unless your favourite game was a crappier, slower version of *Descent*, *Virtua Cop* or *Daytona USA* for some reason. Another body blow to NVIDIA was received when Microsoft released Direct3D in 1996 with DirectX 2.0, which not only used triangles, but also became the standard 3D API and deprecated all of the numerous non-OpenGL proprietary APIs of the time, including S3's S3D and MeTaL[^metal], ATI's 3DCIF, and NVIDIA's own NVLIB. 
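+
+For a sense of what "quad patching" means in practice, here is a generic biquadratic Bezier patch evaluator in C. I should stress this is an illustration of the general technique, not NV1's documented formulation: nine control points define a curved surface, and hardware can tessellate it by sweeping (s, t), which is elegant for curved CAD/CAM geometry but awkward for content authored as flat triangles:
+
+```c
+typedef struct { float x, y, z; } vec3_t;
+
+/* Evaluate a quadratic Bezier curve through three control points at t. */
+static vec3_t bezier2(vec3_t p0, vec3_t p1, vec3_t p2, float t)
+{
+    float a = (1.0f - t) * (1.0f - t);
+    float b = 2.0f * (1.0f - t) * t;
+    float c = t * t;
+    vec3_t r = { a * p0.x + b * p1.x + c * p2.x,
+                 a * p0.y + b * p1.y + c * p2.y,
+                 a * p0.z + b * p1.z + c * p2.z };
+    return r;
+}
+
+/* Evaluate a biquadratic patch (3x3 control points) at (s, t): first reduce
+ * each row of control points along s, then reduce the results along t. A
+ * flat triangle has to be faked with degenerate control points, which is
+ * exactly the mismatch with triangle-based APIs like Direct3D. */
+vec3_t patch_eval(vec3_t cp[3][3], float s, float t)
+{
+    vec3_t row[3];
+    for (int i = 0; i < 3; i++)
+        row[i] = bezier2(cp[i][0], cp[i][1], cp[i][2], s);
+    return bezier2(row[0], row[1], row[2], t);
+}
+```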
In many of the cases of supposed "wonder innovations" in the tech industry, it turns out to be too good to be true, but when their card, the "Voodoo Graphics" was first released in the form of the "Righteous 3D" by Orchid in October 1996, it turned out to be true. Despite the fact that it was a 3D-only card and required a 2D card to be installed, and the fact it could not accelerate graphics in a window (which almost all other 3d cards could do), the card's performance was so high relative to the other efforts (including the NV1) that it not only had rave reviews on its own but kicked off a revolution in consumer 3D graphics, which especially caught fire when GLQuake was released in January 1997. +[^metal]: S3's later API for the Savage family, not to be confused with Apple's Metal from many years later. -The reasons that 3dfx was able to design such an effective GPU when all others failed are numerous - the price of RAM plummeted by 80% through 1996 (which allowed 3DFX to cut their estimated retail price for the Voodoo from $1000 to $300), many of their staff members came from what at that time was perhaps the most respected and certainly the largest company in the graphics industry - SiliconGraphics (which by 1997 had over fifteen years of experience in developing graphical hardware, but was also suffering from rampant mismanagement and in what would prove to be terminal decline), and while 3dfx used the proprietary Glide API, it also supported OpenGL and Direct3D - Glide was designed to be very similar to OpenGL while allowing for 3dfx to approximate standard graphical techniques, which, as well as their driver design - the Voodoo only accelerates edge interpolation (where a triangle is converted into "spans" of horizontal lines, and the positions of nearby vertexes are used to determine the span's start and end positions), texture mapping and blending, span interpolation (which to simplify a complex topic generally involves, in a GPU of this era, z-buffering, also known as depth buffering, sorting polygons back to front, and color buffering, storing the color of each pixel sent to the screen in a buffer which allows for blending and alpha transparency), and final presentation of the rendered 3D scene - the rest was all done in software. All of these factors were key in what proved to be an exceptionally low price for what was considered to be an exceptionally high quality for the time of the card. +The upshot of all of this was what can be understood as nothing less than the total failure of NVIDIA to sell or convince anyone to develop for NV1 in any way, despite its innovative silicon design. While Diamond Multimedia purchased 250,000 chips to place into their "Edge 3D" series of boards, barely any of them sold, and those that did sell were often returned, leading to the chips themselves being returned to NVIDIA and hundreds of thousands of chips sitting simply unused in warehouses. Barely any NV1-capable software was released, with the few pieces of software that do exist coming via a partnership with Sega (more on that later), while most others were forced to run under software emulators for Direct3D (or other APIs) written by Priem, which were made possible by the software architecture NVIDIA chose for their drivers, but were slower and worse-looking than software rendering, buggy, and extremely unappealing. 
-Effectively, NVIDIA had to design a graphics architecture that could at very least get close to 3dfx's performance, on a shoestring budget, with very little resources (60% of the staff, including the entire sales and marketing teams, having been laid off to preserve money). Since they did not have the time, they not only could not completely redesign the NV1 from scratch if they felt the need to do this (this would take two years - time that NVIDIA simply didn't have - and any design that came out of this effort would be immediately obsoleted by other companies, such as 3dfx's Voodoo line and ATI with its initially rather pointless, but rapidly advancing in performance and driver stability, Rage series of chips) the chip would have to work reasonably well on the first tapeout, as they simply did not have the capital to produce more revisions of the chip, The fact they were able to achieve a successful design in the form of the NV3 under such conditions was testament to the intelligence, skill and luck of NVIDIA's designers. Later on in this blogpost, we will explore how they managed to achieve this. +NVIDIA lost $6.4 million in 1995 on a revenue of $1.1 million, and $3 million on a revenue of $3.9 million in 1996. Most of the capital that allowed NVIDIA to continue operating came from the milestone payments from SGS-Thomson for developing the card, their NV2 contract with Sega (again, more on that later), and their venture capital funding; not the very few NV1 sales. The card reviewed poorly, had very little software and ultimately no sales, and despite various desperate efforts to revive it, including releasing the SDK for free (including the proprietary NVLIB API used to develop games against the NV1) and straight up begging their customers on their website to spam developers with requests to develop NV1-compatible versions of games, the card was effectively dead within a year. 
+### The NV2 -After the NV2 disaster, the company made several calls on the NV3's design that turned out to be very good decisions. First, they acquiesced to Sega's advice (which they might have done already, but too late, to save the Mutara V08/NV2) and moved to an inverse texture mapping triangle based model (although some remnants of the original quad patching design remain) and removed the never-used DRM functionality from the card. This may have been assisted by the replacement of Curtis Priem with the rather egg-shaped David Kirk, perhaps notable as a "Special Thanks" credit on Gex and the producer of the truly unparalleled *3D Baseball* on the Sega Saturn during his time at Crystal Dynamics, as chief designer - Priem insisted on including the DRM functionality with the NV1, because back when he worked at Sun, the game he had written as a demo of the GX GPU designed by Malachowsky was regularly pirated. Another decision that turned out to pay very large dividends was deciding to forgo a native API entirely and entirely build the card around accelerating the most popular graphical APIs - which led to an initial focus on Direct3D (although OpenGL drivers were first publicly released in alpha form in December 1997, and released fully in early 1998). Initially DirectX 3.0 was targeted, but 5.0 came out late during the development of the chip (4.0 was cancelled due to [**lack of developer interest in its new functionality**](https://devblogs.microsoft.com/oldnewthing/20040122-00/?p=40963)) and the chip is mostly Direct3D 5.0 compliant (with the exception of some blending modes such as additive blending, which Jensen Huang later claimed was due to Microsoft not giving them the specification in time), which was made much easier by the design of their driver (which allowed, and still allows, graphical APIs to be plugged in as "clients" to the Resource Manager kernel - as I mentioned earlier, this will be explained in full detail later). The VGA core (which was so separate from the main GPU on the NV1 that it had its own PCI ID) was replaced by a VGA core licensed from Weitek (who would soon exit the graphics market), which was placed in the chip parallel to the main GPU with its own 32-bit bus, which massively accelerated performance in unaccelerated VESA titles, like Doom - and provided a real advantage over the 3D-only 3dfx cards (3dfx did have a combination card, the SST-96 or Voodoo Rush, but it used a crappy Alliance card and was generally considered a failure). Finally, Huang, in his capacity as the CEO, allowed the chip to be expanded (in terms of physical size and number of gates) from its original specification, allowing for a more complex design with more features. +Nevertheless, NVIDIA grew to close to a hundred employees, including sales and marketing teams. The company, and especially its cofounders, remained confident in their architecture and overall prospects of success. They had managed to solidify a business relationship with Sega, to the point where they had initially won the contract to provide the graphics hardware for the successor to the Sega Saturn, at that time codenamed "V08". The GPU was codenamed "Mutara" (after the nebula critical to the plot in *Star Trek II: The Wrath of Khan*) and the overall architecture was the **NV2**. It maintained many of the functional characteristics of the NV1 and was essentially a more powerful successor to that card. 
According to available sources, this would have been the only NVIDIA chip manufactured by the then-just founded Helios Semiconductor. -The initial revision of the architecture appears to have been completed in January 1997. Then, aided by hardware simulation software (unlike the NV0, an actual hardware simulation) purchased from another almost-bankrupt company, an exhaustive test set was completed. The first bug presented itself almost immediately when the "C" character in the MS-DOS codepage appeared incorrectly, Windows took 15 minutes to boot, and moving the mouse cursor required a map of the screen so you didn't lose it by moving too far, but ultimately the testing was completed. However, NVIDIA didn't have the money to respin the silicon for a second stepping if problems appeared, so it had to work at least reasonably well in the first stepping. Luckily for NVIDIA, when the card came back it worked well enough to be sold to NVIDIA's board partners (almost certainly due to that hardware simulation package they had), and the company survived - most accounts indicate it was only three or four weeks away from bankruptcy; when 3dfx saw the RIVA 128 at its reveal at the CGDC 1997 conference, the response of one of the founders was "You guys are still around?" - NVIDIA's financial problems were so severe that 3dfx almost *bought* NVIDIA, effectively for the purpose of killing the company as a theoretical competitor, but refused, as they assumed they would be bankrupt within months anyway (a disastrous decision). However, this revision of the chip - revision A - was not the revision that NVIDIA actually commercialised; SGS-Thompson dropped the plans for the STG-3000 at some point, which led NVIDIA, now flush with cash (revenue in the first nine months of 1997 was only $5.5 million, but skyrocketed up to $23.5 million in the last three months - the first three month period of the RIVA 128's availability, owing to the numerous sales of RIVA 128 chips to add-in board partners), to create a new revision of the chip to remove the sound functionality (although some remnants of it were left after it was removed); some errata was also fixed and other minor adjustments made to the silicon - there are mentions of quality problems with early cards in a lawsuit filed against STB Systems (who were the first OEM partner for the RIVA 128), it is not clear if the problems were on STB or NVIDIA's end and respun the chip, with the revision B silicon being completed in October 1997 and presumably available a month or two later. It is most likely that some revision A cards were sold at retail, but based on the dates, these would have to be very early units, with the earliest NVIDIA RIVA 128 drivers that I have discovered (labelled as "Version 0.75") dated August 1997 (these also have NV1 support - and actually are the only Windows NT drivers with NV1 support), and reviews starting to drop on websites like Anandtech in the first half of September 1997. There are no known drivers for the audio functionality in the revision A of the RIVA 128 available, so anyone wishing to use it would have to write custom drivers to actually use it. +However, problems started to emerge almost immediately. Game developers, especially Sega's internal teams, were not happy with having to use a GPU with such a heterodox design; for example, porting games to or from the PC, which Sega did do at the time, would be made far harder. 
This position was especially championed by Yu Suzuki, head of one of Sega's most prestigious internal development teams, Sega-AM2, responsible for the *Daytona USA*, *Virtua Racing*, *Virtua Fighter*, and *Shenmue* series among others, who sent his best graphics programmer to interface with NVIDIA and push for triangles. At this point, the story diverges: some tellings claim that NVIDIA simply refused to accede to Sega's request and this severely damaged their relationship; others that the NV2 wasn't killed until it failed to produce any video during a demonstration, and Sega still paid NVIDIA for developing it to prevent bankruptcy, with one engineer apparently getting it working for the sole purpose of receiving a milestone payment. -The card generally reviewed quite well at its launch and was considered as the fastest graphics card released in 1997, with raw speed, although not video quality, higher than the Voodoo1 - most likely, the lower quality of the NV3 architecture's graphics output owes much to the card's rushed development (due to NVIDIA's financial situation) leading to shortcuts being taken in the GPU design process in order to ship on time. For example, some of the Direct3D 5.0 blending modes are not supported, and per-polygon mipmapping, a graphical technique involving scaling down textures as you move away from an object in order to prevent shimmering, was used instead of the more accurate per pixel approach, causing seams between different mipmapping layers. The dithering quality and the quality of the RIVA 128's bilinear texture filtering were often criticised. Furthermore, some games exhibited seams between polygons, and the drivers were generally very rough at launch, especially if the graphics card was an upgrade and previous drivers were not. While NVIDIA were able to fix many of the driver issues by the time of the version 3.xx drivers, which were released in 1998 and 1999, and even wrote a fairly decent OpenGL ICD, the standards for graphical quality had risen over time and what was considered "decent" in 1997 was considered to be "bad" and even "awful" by 1999. Nevertheless, Over a million units were sold within a few months and NVIDIA's immediate existence as a company was secured; an enhanced version (revision C, also called "NV3T"), branded as the RIVA 128 ZX, was released in March 1998 in order to compete with a hypothetically very fast and much-hyped, but not actually very good card, the Intel/Lockheed Martin i740 chip (the predecessor of the universally detested Intel iGPUs). As of 2024, Intel has finally managed to produce a graphics card people actually want to buy (and aren't forced to due to a lack of financial resources, as with their iGPUs), the mid-range Intel Arc B580, based on the "Battlemage" architecture, Intel's 16th-generation GPU architecture (or in Intel parlance, the 13th, because of great names such as "Generation 12.7"). Better late than never, I guess. +At some point, Sega, as a traditional Japanese company, couldn't simply kill the deal, so the NV2 was officially relegated to be used in the successor to the educational toddler-aimed Sega Pico, while in reality, Sega of America had already been told to "not worry" about NVIDIA anymore. NVIDIA got the hint, and the NV2 was cancelled. 
With NV1 and NV2 out of the picture, NVIDIA had no sales, no customers, and barely any money; at some point in late 1996, the company had $3 million and was burning through $330,000 a month, and most of the NV2 team had been redeployed to the next-generation NV3. No venture capital funding was going to be forthcoming due to the failure to actually create any products people wanted to buy, at least not without extremely unfavourable terms on things like ownership. The company was effectively almost a complete failure and a waste of years of the employees' time. -After all of this history and exposition, we are finally ready to actually explore the GPU behind the RIVA 128 series. I refer to it as NV3, as NV3 is the architecture behind it (and this can be used to refer to all cards manufactured using it, including STG-3000, if a prototype form of one does ever turn out to exist, RIVA 128, and RIVA 128 ZX). Note that the architecture of the 32-bit Weitek core will not be discussed in length here unless it is absolutely required. It's pretty much a standard SVGA, and really is not that interesting compared to the main card. It's not even substantially integrated with the main GPU, although there are a few areas in the design that allow the main GPU to write directly to the Weitek's registers. +### Near destruction of the company + +By the end of 1996, things had gotten infinitely worse, with the competition heating up; despite NV1 being the first texture-mapped consumer GPU ever released, they had been fundamentally outclassed by their competition. It was a one-two punch: initially, Rendition - founded around the same time as NVIDIA in 1993 - released its V1000 chip based on a custom RISC architecture, and while not particularly fast, it was, for a few months, the only card that could run Quake (the hottest game of 1996) in hardware accelerated mode. The V1000 was an early market leader, alongside S3's laughably bad ViRGE (Video and Rendering Graphics Engine) which was infamously slower than software rendering on high-end CPUs at launch, and was reserved for high-volume OEM bargain-bin disaster machines. + +However, this was nothing compared to the body blow about to hit the entire industry, NVIDIA included. At a conference in early 1996, an $80,000 machine from SiliconGraphics, then the world leader in accelerated graphics, crashed during a demo by the then-CEO Ed McCracken. If accounts of the event are to be believed, while the machine rebooted, people who had heard rumours left the room and headed downstairs to another demo by a then-tiny company made up of ex-SGI employees calling itself "3D/fx" (later shortened to 3dfx), claiming comparable graphics quality for $250... with demos to prove it. As with many cases of supposed "wonder innovations" in the tech industry, it sounded too good to be true, but when their card, the "Voodoo Graphics", was first released in the form of the "Righteous 3D" by Orchid in October 1996, it turned out to be true. Despite the fact that it was a 3D-only card and required a 2D card to be installed, and the fact it could not accelerate graphics in a window (which almost all other cards could do), performance was so high relative to other products (including the NV1) that it not only had rave reviews on its own but also kicked off a revolution in consumer 3D graphics, which especially caught fire when GLQuake was released in January 1997. + +The reasons for 3dfx being able to design such an effective GPU when all others failed were numerous. 
The price of RAM plummeted by 80% through 1996, allowing the Voodoo's estimated retail price to be cut from $1000 to $300; many of their staff members came from SiliconGraphics, perhaps the most respected and certainly the largest company in the graphics industry of that time[^sgi]; and while 3dfx used the proprietary Glide API, it also supported OpenGL and Direct3D. Glide was designed to be very similar to OpenGL while allowing 3dfx to approximate standard graphical techniques, and their driver design helped as well: the Voodoo only accelerates edge interpolation[^edge], texture mapping and blending, span interpolation[^span], and final presentation of the rendered 3D scene; the rest was all done in software. All of these factors were key to delivering what was, for the time, an exceptionally high-quality card at an exceptionally low price. + +[^sgi]: By 1997, SGI had over 15 years of experience in developing graphical hardware, while also suffering from rampant mismanagement in what would prove to be their terminal decline. + +[^edge]: Where a triangle is converted into "spans" of horizontal lines, and the positions of nearby vertexes are used to determine the span's start and end positions. + +[^span]: To simplify a complex topic, in a GPU of this era, span interpolation generally involves z-buffering (also known as depth buffering), sorting polygons back to front, and color buffering, storing the color of each pixel sent to the screen in a buffer which allows for blending and alpha transparency.
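+
+To illustrate the edge/span model the two footnotes above describe, here is a heavily simplified C sketch of a flat-shaded triangle rasteriser built on it. The structure is my own illustration of the general technique rather than anything 3dfx-specific, and it skips the per-pixel Z, colour and texture interpolation, which is exactly the work a chip like the Voodoo accelerates inside the span loop:
+
+```c
+#include <stdint.h>
+
+#define SCREEN_W 640
+#define SCREEN_H 480
+
+static uint32_t framebuffer[SCREEN_H][SCREEN_W];
+
+typedef struct { float x, y; } vertex_t;
+
+/* Edge interpolation: the x position of the edge a->b at scanline y. */
+static float edge_x(vertex_t a, vertex_t b, float y)
+{
+    return a.x + (b.x - a.x) * (y - a.y) / (b.y - a.y);
+}
+
+/* Fill one horizontal span; on real hardware of this era, the per-pixel
+ * z-test, texel fetch and blending would happen inside this loop. */
+static void fill_span(int y, int x0, int x1, uint32_t colour)
+{
+    if (x0 > x1) { int t = x0; x0 = x1; x1 = t; }
+    for (int x = x0; x <= x1; x++)
+        if (x >= 0 && x < SCREEN_W && y >= 0 && y < SCREEN_H)
+            framebuffer[y][x] = colour;
+}
+
+/* Rasterise a triangle whose vertices are pre-sorted by ascending y: walk
+ * the long edge v0->v2 on one side and the two short edges on the other,
+ * turning the triangle into a stack of spans. */
+void draw_triangle(vertex_t v0, vertex_t v1, vertex_t v2, uint32_t colour)
+{
+    for (int y = (int)v0.y; y < (int)v2.y; y++) {
+        float xa = edge_x(v0, v2, (float)y);
+        float xb = (y < (int)v1.y) ? edge_x(v0, v1, (float)y)
+                                   : edge_x(v1, v2, (float)y);
+        fill_span(y, (int)xa, (int)xb, colour);
+    }
+}
+```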
+ +Effectively, NVIDIA had to design a graphics architecture that could at the very least get close to 3dfx's performance, on a shoestring budget and with very little resources, as 60% of their staff (including the entire sales and marketing teams) had been laid off to preserve money. They could not do a complete redesign of the NV1 from scratch if they felt the need to, as it would take two years (time they simply didn't have) and any design that came out of this effort would be immediately obsoleted by competitors, such as 3dfx's Voodoo series, and ATI's Rage which was initially rather pointless but rapidly advancing in performance and driver stability. The chip would also have to work reasonably well on the first tapeout, as there was no capital to produce more revisions of the chip. The fact NVIDIA were able to achieve a successful design in the form of the NV3 under such conditions was a testament to the intelligence, skill and luck of their designers; we will explore how they managed to achieve this later in this write-up. + +### The NV3 + +It was with these financial, competitive and time constraints in mind that design on the NV3 began in 1996. This chip would eventually be commercialised as the RIVA 128, standing for "Real-time Interactive Video and Animation accelerator", with the 128 a nod to its 128-bit internal bus width, very large at the time. NVIDIA retained SGS-Thomson (soon to be STMicroelectronics) as their manufacturing partner, in exchange for SGS-Thomson cancelling their competing STG-3001 GPU. In a similar vein to the NV1, NVIDIA was to sell the chip as "NV3" and SGS-Thomson was to white-label it as the STG-3000, once again separated by audio functionality. Cancelling their own part would prove to be a terrible decision for SGS-Thomson, as NVIDIA later dropped them in favour of TSMC for manufacturing of the RIVA 128 ZX, due to both yield issues and pressure from venture capital funders. STMicro went on to manufacture PowerVR chips for a few more years, before dropping out of the market entirely by 2001. + +After the NV2 disaster, the company made several calls on the NV3's design that turned out to be very good decisions. First, they acquiesced to Sega's advice (which they might have already done to save the Mutara V08/NV2 but it was too late) and moved to an inverse texture mapping triangle-based model, although some remnants of the original quad patching design remain. The unused DRM functionality was also removed, which may have been assisted by the replacement of Curtis Priem with David Kirk[^dkirk] as chief designer, as Priem had insisted on including the DRM functionality with the NV1, citing piracy issues with the game he had written as a demo of the Malachowsky-designed GX GPU back when he worked at Sun. + +[^dkirk]: The rather egg-shaped David Kirk is perhaps notable as a "Special Thanks" credit on *Gex* and the producer of the truly unparalleled *3D Baseball* on the Sega Saturn during his time at Crystal Dynamics. + +Another decision that turned out to pay very large dividends was deciding to forgo a native API entirely and build the card around accelerating the most popular graphical APIs, which led to an initial focus on Direct3D, although OpenGL drivers were first publicly released in alpha form in December 1997 and fully in early 1998. DirectX 3.0 was the initial target, and after 4.0 was [cancelled due to lack of developer interest in its new functionality](https://devblogs.microsoft.com/oldnewthing/20040122-00/?p=40963), 5.0 came out late during development; the chip ended up mostly Direct3D 5.0 compliant, with the exception of some blending modes such as additive blending, which Jensen Huang later claimed was due to Microsoft not giving them the specification in time. This compliance was made much easier by the design of their driver, which allowed (and still allows) graphical APIs to be plugged in as "clients" to the Resource Manager kernel; as I mentioned earlier, this will be explained in full detail later.
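+
+As a purely hypothetical sketch of that "clients" idea - the names and structures here are mine for illustration, not NVIDIA's actual Resource Manager interface, which part two will cover properly - the architecture amounts to something like this:
+
+```c
+#include <stdint.h>
+
+/* One API front-end (Direct3D, OpenGL, ...) as seen by the resource manager;
+ * hypothetical field names, loosely modelling the "API as client" concept. */
+typedef struct rm_client {
+    const char *name;                         /* e.g. "Direct3D"              */
+    int  (*create_object)(uint32_t class_id); /* allocate a graphics object   */
+    void (*submit)(const uint32_t *methods, int count); /* feed the HW FIFO   */
+} rm_client_t;
+
+#define RM_MAX_CLIENTS 8
+static const rm_client_t *rm_clients[RM_MAX_CLIENTS];
+static int rm_client_count;
+
+/* Any number of API front-ends can plug into the same kernel this way, which
+ * is what let NVIDIA chase whichever API mattered most at any given time. */
+int rm_register_client(const rm_client_t *client)
+{
+    if (rm_client_count >= RM_MAX_CLIENTS)
+        return -1;
+    rm_clients[rm_client_count++] = client;
+    return 0;
+}
+```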
+ +The VGA core, previously so separate from the main GPU on the NV1 that it had its own PCI ID, was replaced by a new one licensed from Weitek, who would soon exit the graphics market. The core was placed in the chip parallel to the main GPU with its own 32-bit bus, which massively accelerated performance in unaccelerated VESA titles such as *Doom*, and provided a real advantage over the 3D-only 3dfx cards, especially as their combination SST-96 or Voodoo Rush card used a questionable Alliance chip and was generally considered a failure. Finally, Huang, in his capacity as the CEO, allowed the chip to be expanded (in terms of physical size and number of gates) from its original specification, allowing for a more complex design with more features. + +The initial revision of the architecture appears to have been completed in January 1997. Then, aided by hardware simulation software (an actual hardware simulation unlike the NV0) purchased from another almost-bankrupt company, an exhaustive test set was completed. The first bugs presented themselves almost immediately: the "C" character in the MS-DOS codepage appeared incorrectly, Windows took 15 minutes to boot, and moving the mouse cursor required a map of the screen so you didn't lose it by moving too far, but ultimately the testing was completed. However, NVIDIA didn't have the money to respin the silicon for a second stepping if problems appeared, so it had to work at least reasonably well in the first stepping. + +### RIVA 128 + +Luckily for NVIDIA, the NV3 chip worked well enough to be sold to their board partners (almost certainly thanks to that hardware simulation package), and the company survived. Most accounts indicate they were only three or four weeks away from bankruptcy; when 3dfx saw the RIVA 128 at its reveal at the CGDC 1997 conference, one founder responded with "you guys are still around?". In fact, 3dfx almost *bought* NVIDIA, effectively for the purpose of killing the company as a theoretical competitor, but decided against it on the assumption that NVIDIA would be bankrupt within months anyway; a disastrous decision. However, this revision A of the chip was not the one NVIDIA actually commercialised; SGS-Thomson dropped the plans for the STG-3000 at some point, which led NVIDIA, now flush with cash[^revenue], to create a new revision of the chip to remove the sound functionality (although some parts remained), fix some errata and make other minor adjustments to the silicon. The chip was respun, with the revision B silicon being completed in October 1997 and presumably available a month or two later; it is most likely that some revision A cards were sold at retail, but based on the dates, these would have to be very early units[^stb], with the earliest NVIDIA RIVA 128 drivers that I have discovered (labelled as "Version 0.75" and also doubling as the only NV1 drivers for Windows NT) being dated August 1997, and reviews starting to drop on websites such as AnandTech in the first half of September 1997. There are no known drivers available for the audio functionality in the revision A RIVA 128, so anyone wishing to use it would have to write custom drivers. + +[^revenue]: NVIDIA's revenue in the first nine months of 1997 was only $5.5 million, but skyrocketed up to $23.5 million in the last three months, which correspond to the first three months of the RIVA 128's availability, owing to the numerous sales of chips to add-in board partners. + +[^stb]: While there are mentions of quality problems with early cards in a lawsuit involving STB Systems, the RIVA 128's first OEM partner, it is not clear if the problems were on STB or NVIDIA's end. + +The chip was generally well-reviewed at its launch and considered the fastest graphics chip released in 1997, beating the Voodoo1 in raw speed but not output video quality, most likely due to NVIDIA's financial situation leading to rushed development of the chip with shortcuts taken in the design process in order to ship on time. Examples of this lower quality include the lack of support for some of Direct3D 5.0's blending modes, and the use of per-polygon mipmapping[^mipmap] instead of the more accurate per-pixel approach, causing seams between different mipmapping layers; the dithering and bilinear texture filtering quality were often criticised as well, and some games exhibited seams between polygons. Furthermore, the drivers were generally very rough at launch, especially if the card was an upgrade and the previous card's drivers had not been removed; while NVIDIA were able to fix many driver issues by the 3.xx versions released in 1998 and 1999, and even wrote a fairly decent OpenGL ICD, the standards for graphical quality had risen over time and what was considered "decent" in 1997 was considered to be "bad" and even "awful" by 1999. + +[^mipmap]: Mipmapping is a graphical technique involving scaling down textures as you move away from an object in order to prevent shimmering.
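+
+A tiny C demo (illustrative only; the footprint numbers are made up) shows why the per-polygon shortcut produces visible seams: the correct mip level changes smoothly across a receding surface, while a single per-polygon level is fixed for the whole polygon, so two adjacent polygons with different average footprints snap to different levels right at their shared edge:
+
+```c
+#include <math.h>
+#include <stdio.h>
+
+/* Map a texel-to-pixel footprint ratio to a mip level (0 = full-size texture). */
+static int mip_level(float texels_per_pixel)
+{
+    return (int)floorf(log2f(texels_per_pixel < 1.0f ? 1.0f : texels_per_pixel));
+}
+
+int main(void)
+{
+    /* A textured surface receding into the distance: each screen pixel covers
+     * progressively more texels, from 1x at the near edge to 8x at the far edge. */
+    for (int x = 0; x < 8; x++) {
+        float footprint = 1.0f + (float)x;
+        /* Per-pixel: recomputed at every pixel, so the level steps gradually. */
+        int per_pixel = mip_level(footprint);
+        /* Per-polygon (RIVA 128-style): one level, chosen from an average
+         * footprint, is applied to the entire polygon; a neighbouring polygon
+         * with a different average jumps to a different level, i.e. a seam. */
+        int per_polygon = mip_level(4.5f);
+        printf("x=%d per-pixel mip=%d per-polygon mip=%d\n", x, per_pixel, per_polygon);
+    }
+    return 0;
+}
+```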
+ +Nevertheless, over a million RIVA 128 units were sold within a few months, and NVIDIA's immediate existence as a company was secured; an enhanced version (revision C, also called "NV3T") was released in March 1998 as the RIVA 128 ZX, in order to compete with the Intel/Lockheed Martin i740, a chip which was hyped as being very fast on paper but turned out to be not very good. The i740 kicked off Intel's long line of sub-par integrated GPUs, before the company finally returned to the discrete market in recent years under the Arc brand, with the current Battlemage line being their *13th or 16th generation* product depending on who you ask. + +After all of this history and exposition, we are finally ready to actually explore the GPU behind the RIVA 128 series. I refer to it as NV3 as a catch-all term for all chips using this architecture, including the RIVA 128, RIVA 128 ZX, and the hypothetical STG-3000. Note that the 32-bit Weitek VGA core's architecture will not be discussed at length here unless absolutely required, as it is pretty much a standard SVGA core, and really is not that interesting compared to the main GPU; they're not even substantially integrated, although there are a few areas in the design that allow the main GPU to write directly to the Weitek's registers. ---