NVIDIA GeForce RTX 4090 Graphics Card Specs, Performance, Price & Availability

Product Info

NVIDIA GeForce RTX 4090

2022

Manufacturer	NVIDIA
Type	Graphics Card
Platforms
Expected Price	$1599 US
Expected Release Date	2022

NVIDIA GeForce RTX 4090 graphics card is going to be the next-gen flagship for the green team, ushering in performance levels never before seen in the PC gaming segment, and here's everything from specs, price, and performance that you need to know.

NVIDIA GeForce RTX 4090 - The Next-Generation BFGPU For The Ultimate Gamer

[Launched- 11/10/22]

[Our Review]

The NVIDIA GeForce RTX 3090 series proved that the green team can go to extreme lengths to secure its lead in the PC graphics segment. Labeled 'BFGPU', a new breed of enthusiast & ultimate graphics cards, these provide the best possible performance and PC gaming features in a package that's second to none.

NVIDIA's direction with the BFGPU was to design a graphics card not just for the ultimate gamer but also for professional content creators who want the best graphics performance at hand, and to power the next generation of AAA gaming titles with superb visuals and insane fluidity. It's not just the FPS that matters these days; visuals and a smoother frame rate matter too, and this is exactly what the GeForce RTX 30 series is made to excel at.

We should expect similar things with the next-generation flagship too but an important factor to consider is that GPUs are becoming more power-hungry and more pricey. It is a trend that might continue into the future as we get better products but in return, there's always a cost to pay for end consumers. So starting with what we know so far, first we should take a look at the brand new Ada Lovelace or AD10* class GPUs that will be powering the next-gen GeForce RTX 40 series cards.

NVIDIA's AD102 'Ada Lovelace' GPU - The Next-Gen Powerhouse

The NVIDIA Ada Lovelace AD102 GPU features up to 12 GPCs (Graphics Processing Clusters). That is 5 more GPCs than the Ampere GA102 GPU. Each GPC will consist of 6 TPCs with 2 SMs each, which is the same configuration as the existing chip. Each SM (Streaming Multiprocessor) will house four sub-cores, which is also the same as the GA102 GPU. What's changed is the FP32 & INT32 core configuration. Each SM will include 64 FP32 units, but the combined FP32+INT32 unit count will go up to 128. This is because half of the FP32 units don't share the same sub-core as the INT32 units. The 64 FP32 cores are separate from the 64 INT32 cores.

So in total, each sub-core will consist of 16 FP32 plus 16 INT32 units for a total of 32 units. Each SM will have a total of 64 FP32 units plus 64 INT32 units for a total of 128 units. And since there are a total of 144 SM units (12 per GPC), we are looking at a total of 18,432 cores. Each SM will also include two Warp Schedulers (32 threads/CLK) for 64 warps per SM & their own L0 i-cache. This is a 33% increase in warps/threads vs the GA102 GPU. The register file size is 16,384 x 32-bit. Each SM also carries its own 128 KB of L1 data cache and shared memory, so that's 18 MB of L1 cache across the full die.
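The unit counts above can be reproduced with a few multiplications. The snippet below is only a sanity-check sketch using the figures quoted in the text; the constant and function names are made up for illustration.

```python
# Full AD102 die hierarchy, using the counts quoted in the text above.
GPCS = 12
TPCS_PER_GPC = 6
SMS_PER_TPC = 2
SUB_CORES_PER_SM = 4
FP32_PER_SUB_CORE = 16
INT32_PER_SUB_CORE = 16
L1_KB_PER_SM = 128


def ad102_totals():
    """Return SM, core, and L1 cache totals for the full die."""
    sms = GPCS * TPCS_PER_GPC * SMS_PER_TPC
    units_per_sm = SUB_CORES_PER_SM * (FP32_PER_SUB_CORE + INT32_PER_SUB_CORE)
    return {
        "sms": sms,
        "units_per_sm": units_per_sm,
        "cores": sms * units_per_sm,
        "l1_mb": sms * L1_KB_PER_SM / 1024,
    }


print(ad102_totals())
```

The result lines up with the 144 SMs, 18,432 cores, and 18 MB of L1 cache mentioned above.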

Moving over to the cache, this is another segment where NVIDIA has given a big boost over the existing Ampere GPUs. The L2 cache will be increased to 96 MB as mentioned in the leaks. This is a 16x increase over the Ampere GPU that hosts just 6 MB of L2 cache. The cache will be shared across the GPU. The GPU will also feature up to 192 ROPs for the full-die.

There are also going to be the latest 4th Generation Tensor and 3rd Generation RT (Raytracing) cores infused on the Ada Lovelace GPUs which will help boost DLSS & Raytracing performance to the next level. Overall, the Ada Lovelace AD102 GPU will offer:

71% More GPCs (Versus Ampere)
71% More Cores (Versus Ampere)
50% More L1 Cache (Versus Ampere)
16x More L2 Cache (Versus Ampere)
71% More ROPs (Versus Ampere)
4th Gen Tensor & 3rd Gen RT Cores
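
As a rough check on the percentages above, the ratios can be worked out from full-die figures. The AD102 numbers come from the text, while the GA102 numbers (7 GPCs, 10,752 cores, 6 MB L2, 112 ROPs) are assumptions based on the full Ampere die rather than anything stated here.

```python
# AD102 values are quoted in the article; GA102 values are assumed
# full-die figures (84 SMs), not the cut-down RTX 3090.
AD102 = {"gpcs": 12, "cores": 18432, "l2_mb": 96, "rops": 192}
GA102 = {"gpcs": 7, "cores": 10752, "l2_mb": 6, "rops": 112}


def uplift_percent(new, old):
    """Percentage increase of new over old."""
    return (new / old - 1) * 100


for key in AD102:
    ratio = AD102[key] / GA102[key]
    pct = uplift_percent(AD102[key], GA102[key])
    print(f"{key}: {ratio:.2f}x ({pct:.0f}% more)")
```

Under those assumptions, GPCs, cores, and ROPs all scale by the same 12:7 ratio, which is where the 71% figure comes from, and the L2 cache ratio works out to 16x.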

The full die has not been featured on any GPU so far, not even the L40 which has 2 SMs disabled. It is likely that as yields progress, we will eventually see a gaming and workstation product using the full-fat AD102. Till then, the RTX 4090 is the top gaming graphics card while the RTX 6000 Ada is the top workstation solution.

NVIDIA AD102 'Ada Lovelace' Gaming GPU Block Diagram:

NVIDIA AD102 'Ada Lovelace' Gaming GPU 'SM' Block Diagram:

NVIDIA GeForce RTX 4090

82.6 TFLOPS of peak single-precision (FP32) performance
165.2 TFLOPS of peak half-precision (FP16) performance
660.6 Tensor TFLOPS
1321.2 Tensor TFLOPs with sparsity
191 RT-TFLOPs
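
For readers wondering where these throughput numbers come from, here is a small sketch. It assumes the usual peak formula of 2 FLOPs (one fused multiply-add) per CUDA core per clock, plus the 16,384 cores and 2520 MHz boost clock from the official specs table. The 2x, 8x, and 16x multipliers are simply inferred from the listed figures and are not an official NVIDIA formula.

```python
def peak_fp32_tflops(cuda_cores, boost_mhz):
    """Peak FP32 rate, assuming one FMA (2 FLOPs) per core per clock."""
    return 2 * cuda_cores * boost_mhz * 1e6 / 1e12


fp32 = peak_fp32_tflops(16384, 2520)

# Multipliers inferred from the listed numbers (an assumption).
rates = {
    "FP32": fp32,
    "FP16": 2 * fp32,
    "Tensor": 8 * fp32,
    "Tensor (sparsity)": 16 * fp32,
}

for name, tflops in rates.items():
    print(f"{name}: {tflops:.1f} TFLOPS")
```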

At the heart of the NVIDIA GeForce RTX 4090 graphics card lies the Ada Lovelace AD102 GPU. The GPU measures 608.5 mm2 and utilizes the TSMC 4N process node, which is an optimized version of TSMC's 5nm (N5) node designed for the green team. The GPU features an insane 76.3 Billion transistors.

NVIDIA Ampere "GeForce RTX 30" GPUs Full Breakdown:

Graphics Card	NVIDIA GeForce RTX 2070 SUPER	NVIDIA GeForce RTX 3070	NVIDIA GeForce RTX 2080	NVIDIA GeForce RTX 3080	NVIDIA Titan RTX	NVIDIA GeForce RTX 3090
GPU Codename	TU106	GA104	TU104	GA102	TU102	GA102
GPU Architecture	NVIDIA Turing	NVIDIA Ampere	NVIDIA Turing	NVIDIA Ampere	NVIDIA Turing	NVIDIA Ampere
GPCs	5 or 6	6	6	6	6	7
TPCs	20	23	23	34	36	41
SMs	40	46	46	68	72	82
CUDA Cores / SM	64	128	64	128	64	128
CUDA Cores / GPU	2560	5888	2944	8704	4608	10496
Tensor Cores / SM	8 (2nd Gen)	4 (3rd Gen)	8 (2nd Gen)	4 (3rd Gen)	8 (2nd Gen)	4 (3rd Gen)
Tensor Cores / GPU	320 (2nd Gen)	184 (3rd Gen)	368 (2nd Gen)	272 (3rd Gen)	576 (2nd Gen)	328 (3rd Gen)
RT Cores	40 (1st Gen)	46 (2nd Gen)	46 (1st Gen)	68 (2nd Gen)	72 (1st Gen)	82 (2nd Gen)
GPU Boost Clock (MHz)	1770	1725	1800	1710	1770	1695
Peak FP32 TFLOPS (non-Tensor)	9.1	20.3	10.6	29.8	16.3	35.6
Peak FP16 TFLOPS (non-Tensor)	18.1	20.3	21.2	29.8	32.6	35.6
Peak BF16 TFLOPS (non-Tensor)	NA	20.3	NA	29.8	NA	35.6
Peak INT32 TOPS (non-Tensor)	9.1	10.2	10.6	14.9	16.3	17.8
Peak FP16 Tensor TFLOPS with FP16 Accumulate	72.5	81.3/162.6	84.8	119/238	130.5	142/284
Peak FP16 Tensor TFLOPS with FP32 Accumulate	36.3	40.6/81.3	42.4	59.5/119	65.2	71/142
Peak BF16 Tensor TFLOPS with FP32 Accumulate	NA	40.6/81.3	NA	59.5/119	NA	71/142
Peak TF32 Tensor TFLOPS	NA	20.3/40.6	NA	29.8/59.5	NA	35.6/71
Peak INT8 Tensor TOPS	145	162.6/325.2	169.6	238/476	261	284/568
Peak INT4 Tensor TOPS	290	325.2/650.4	339.1	476/952	522	568/1136
Frame Buffer Memory Size and Type	8 GB GDDR6	8 GB GDDR6	8 GB GDDR6	10 GB GDDR6X	24 GB GDDR6	24 GB GDDR6X
Memory Interface	256-bit	256-bit	256-bit	320-bit	384-bit	384-bit
Memory Clock (Data Rate)	14 Gbps	14 Gbps	14 Gbps	19 Gbps	14 Gbps	19.5 Gbps
Memory Bandwidth	448 GB/sec	448 GB/sec	448 GB/sec	760 GB/sec	672 GB/sec	936 GB/sec
ROPs	64	96	64	96	96	112
Pixel Fill-rate (Gigapixels/sec)	113.3	165.6	115.2	164.2	169.9	193
Texture Units	160	184	184	272	288	328
Texel Fill-rate (Gigatexels/sec)	283.2	317.4	331.2	465	509.8	566
L1 Data Cache/Shared Memory	3840 KB	5888 KB	4416 KB	8704 KB	6912 KB	10496 KB
L2 Cache Size	4096 KB	4096 KB	4096 KB	5120 KB	6144 KB	6144 KB
Register File Size	10240 KB	11776 KB	11776 KB	17408 KB	18432 KB	20992 KB
TGP (Total Graphics Power)	215W	220W	225W	320W	280W	350W
Transistor Count	13.6 Billion	17.4 Billion	13.6 Billion	28.3 Billion	18.6 Billion	28.3 Billion
Die Size	545 mm2	392.5 mm2	545 mm2	628.4 mm2	754 mm2	628.4 mm2
Manufacturing Process	TSMC 12 nm FFN (FinFET NVIDIA)	Samsung 8 nm 8N NVIDIA Custom Process	TSMC 12 nm FFN (FinFET NVIDIA)	Samsung 8 nm 8N NVIDIA Custom Process	TSMC 12 nm FFN (FinFET NVIDIA)	Samsung 8 nm 8N NVIDIA Custom Process

NVIDIA Ada GPUs - AD102, AD103, AD104 For The First Wave of Gaming Cards

NVIDIA is first introducing three brand new Ada GPUs which include the AD102, AD103 & AD104. The AD102 GPU is going to be featured on the GeForce RTX 4090, the AD103 is going to be used by the GeForce RTX 4080 16 GB graphics cards and the AD104 GPU is going to be featured on the GeForce RTX 4080 12 GB graphics cards.

The Ada GPUs are based on the TSMC 4N process node which is a custom process designed exclusively for NVIDIA. It is essentially an optimized version of the N5 (5nm) process, offering drastic increases in transistors, cores, and frequency. The top AD102 GPU packs 70% more cores and also offers 76.3 Billion transistors while offering over 2x the performance per watt.

NVIDIA Ada AD102 GPU

The full AD102 GPU is made up of 12 graphics processing clusters with 12 SM units on each cluster. That makes up 144 SM units for a total of 18,432 cores, 144 RT cores, 576 Tensor Cores, 576 Texture Units, and a 384-bit bus interface in a 76.3 billion transistor package measuring 608.5 mm2.

NVIDIA GeForce RTX 4090 Graphics Cards Specifications

The NVIDIA GeForce RTX 4090 will use 128 of the 144 SMs for a total of 16,384 CUDA cores. The full AD102 die packs 96 MB of L2 cache and 192 ROPs, but since the RTX 4090 is a cut-down design, it features lower counts of 72 MB of L2 cache and 176 ROPs. Thanks to the TSMC 4N process, the boost clock is rated at 2520 MHz, and NVIDIA is claiming speeds of over 3 GHz with overclocking.

As for memory specs, the GeForce RTX 4090 will feature 24 GB of GDDR6X memory clocked at 21 Gbps across a 384-bit bus interface. This will provide up to 1 TB/s of bandwidth, the same as the existing RTX 3090 Ti graphics card. As far as power consumption is concerned, the TBP is rated at 450W. The card will be powered by a single 16-pin connector which delivers up to 600W of power. Custom models will be offering higher TBP targets.
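
The bandwidth figure follows directly from the data rate and bus width. Here is a minimal sketch of that arithmetic, with a made-up helper name, using the RTX 4090 numbers above and the RTX 3090 numbers from the Ampere table for comparison.

```python
def memory_bandwidth_gbs(data_rate_gbps, bus_width_bits):
    """Peak bandwidth in GB/s: per-pin data rate times bus width, over 8 bits per byte."""
    return data_rate_gbps * bus_width_bits / 8


rtx_4090 = memory_bandwidth_gbs(21.0, 384)
rtx_3090 = memory_bandwidth_gbs(19.5, 384)

print(f"RTX 4090: {rtx_4090:.0f} GB/s")
print(f"RTX 3090: {rtx_3090:.0f} GB/s")
```

The RTX 4090 result of 1008 GB/s is the "up to 1 TB/s" quoted above, and the RTX 3090 result matches the 936 GB/sec in the Ampere table.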

NVIDIA Founders Edition Designed To Utilize Up To 600W of Power For Higher Overclocking

As for its brand new Founders Edition cards, the GeForce RTX 4090 24 GB and RTX 4080 16 GB, NVIDIA has produced a compact PCB similar to the ones we saw on the previous generation. Designing a PCB like this helps improve airflow and cooling performance.

NVIDIA says that they have further optimized the Dual Axial Flow Through system, increasing fan sizes and fin volume by 10%, offering 20% higher air flow and upgrading to a 23-phase power supply (20+3 Phase for RTX 4090). Memory temperatures are reduced, and the new, substantially more powerful Ada GPUs are kept cool in ventilated cases, giving gamers excellent overclocking headroom. NVIDIA went through a rigorous testing procedure and is said to have evaluated as many as 50 fan designs before finalizing the one we are getting on the new cards. The cooler is used to dissipate heat from the heatsink assembly that comprises a vapor chamber, a big jump from the previous design too.

The NVIDIA GeForce RTX 4080 also uses the same cooler as the RTX 4090 Founders Edition and since it has a lower TDP, it should deliver even better thermal performance.

Each GeForce RTX 40 Series Founders Edition graphics card reduces cable clutter by leveraging the new standard GPU power input of next-gen ATX 3.0 power supplies, the PCIe Gen-5 16-pin Connector. This enables you to power GeForce RTX 40 Series graphics cards with just a single cable, improving the aesthetics of your build. If you are using a previous-gen power supply, an adapter cable is included in the box, allowing you to plug in three 8-pin power connectors, with an optional fourth connector for more overclocking headroom. ATX 3.0 power supplies will be available in October from ASUS, Cooler Master, FSP, Gigabyte, iBuyPower, MSI, and ThermalTake, with more models to come.

One advantage of the new 16-pin connector is that while the Founders Edition cards are designed for 450W & 320W, respectively, they can utilize the extra headroom provided through the new connector for extreme overclocking, with the RTX 4090 going for that full 600W mark. The new power delivery also gives the RTX 40 series a 10x improvement in power transient response compared to the previous generation.
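
To put the power headroom into numbers, here is a quick sketch. The 150W rating per 8-pin PCIe plug is an assumption based on the standard connector rating and is not quoted in this article; the helper names are hypothetical.

```python
PCIE_8PIN_WATTS = 150        # assumption: standard 8-pin PCIe plug rating
CONNECTOR_16PIN_WATTS = 600  # 16-pin limit quoted in the article
RTX_4090_TBP_WATTS = 450


def adapter_watts(num_8pin_plugs):
    """Power available through the bundled adapter for a given plug count."""
    return num_8pin_plugs * PCIE_8PIN_WATTS


def headroom(tbp_watts, limit_watts=CONNECTOR_16PIN_WATTS):
    """Return (extra watts, extra as a percentage of TBP)."""
    extra = limit_watts - tbp_watts
    return extra, extra / tbp_watts * 100


print(adapter_watts(3), adapter_watts(4))
extra, pct = headroom(RTX_4090_TBP_WATTS)
print(f"RTX 4090: {extra}W above TBP ({pct:.0f}%)")
```

Under that assumption, three plugs cover the stock 450W while the optional fourth unlocks the full 600W, which is roughly a third more power than the rated TBP.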

The new cards also feature DP 1.4a (4K 12-bit HDR @ 240Hz) and HDMI 2.1 (4K 120Hz HDR / 8K 60Hz HDR). All cards are compliant with the PCIe Gen 4 interface on existing motherboards and also feature full compliance with the Resizable-BAR technologies.

NVIDIA GeForce RTX 4090 Founders Edition PCB:

Next-Gen Micron GDDR6X Dies Run 10C Cooler Thanks To New Process Node

NVIDIA has also leveraged Micron's latest GDDR6X memory chips for its GeForce RTX 40 graphics cards. They run 10C cooler and are more power efficient, and since they are all 16Gb DRAM dies, they can be mounted on one side of the PCB, where they are cooled better than dual-sided memory.
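
A back-of-the-envelope sketch shows why 16Gb dies allow a single-sided layout on the 24 GB RTX 4090. It assumes one DRAM die per package and a 32-bit interface per GDDR6X package; neither detail is stated in this article, and the function names are made up for illustration.

```python
def packages_for_capacity(capacity_gb, die_density_gbit):
    """Packages needed for a capacity, assuming one die per package."""
    return capacity_gb * 8 // die_density_gbit


def packages_for_bus(bus_width_bits, bits_per_package=32):
    """Packages the bus can address one-to-one (assumed 32 bits each)."""
    return bus_width_bits // bits_per_package


needed = packages_for_capacity(24, 16)
slots = packages_for_bus(384)
layout = "single-sided" if needed <= slots else "dual-sided"
print(needed, slots, layout)

# With hypothetical 8Gb dies, the same 24 GB would need twice as many packages.
print(packages_for_capacity(24, 8))
```

With 16Gb dies, the 12 packages needed for 24 GB match the 12 slots a 384-bit bus offers, so nothing has to go on the back of the PCB.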

NVIDIA GeForce RTX 4090 Graphics Cards Performance

The NVIDIA GeForce RTX 4090 is the first gaming card to hit the 100 TFLOPs compute horsepower limit.

So we decided it was time to test how far we can push the NVIDIA GeForce RTX 4090 Founders Edition with some overclocking. To get to 100 TFLOPs, we first pushed the power limit and temp limit sliders all the way to the max and upped the Core and Memory clocks by +275 and +1100 MHz, respectively. This wasn't enough as the card was being limited by its power design. That is when we got our hands on MSI's latest Afterburner, which allowed us to raise the core voltage. At +100%, we saw some performance regression, so we had to stick with +55%, which showed us some good results.

With the overclock applied on our NVIDIA GeForce RTX 4090 graphics card, we saw a maximum GPU core clock of 3150 MHz on the AD102 Ada GPU, a maximum power draw of 547W, and our temps peaked at 69C. All of this was done on air, with no exotic liquid cooling, chillers, or LN2.

And behold, we saw the magical number of not 100 but almost 101 TFLOPs right in front of our eyes. To put things into perspective, this is a 22% compute boost over the stock RTX 4090 and a 2.5x compute performance boost over the RTX 3090 Ti. The AD102 GPU also ripped apart the data-center-focused Hopper H100 GPUs by offering over 50% better FP32 performance. Ada Lovelace is truly a game changer, and we can definitely see it becoming a popular compute and AI graphics card when Quadro variants of the said chip launch as the RTX 6000 Ada and L40.
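
The uplift figures above can be checked with simple ratios. The sketch below assumes the usual 2 FLOPs per core per clock peak formula and a 40 TFLOPs rating for the RTX 3090 Ti, which is not stated in this article. It also backs out the core clock that 101 TFLOPs would imply.

```python
CUDA_CORES = 16384
STOCK_TFLOPS = 82.6
OC_TFLOPS = 101.0
RTX_3090_TI_TFLOPS = 40.0  # assumption: not quoted in the article


def implied_clock_mhz(tflops, cuda_cores):
    """Clock needed to reach a TFLOPs figure at 2 FLOPs per core per clock."""
    return tflops * 1e12 / (2 * cuda_cores) / 1e6


def peak_tflops(cuda_cores, clock_mhz):
    """Peak FP32 TFLOPs at a given clock, same 2 FLOPs per clock assumption."""
    return 2 * cuda_cores * clock_mhz * 1e6 / 1e12


uplift_vs_stock = (OC_TFLOPS / STOCK_TFLOPS - 1) * 100
ratio_vs_3090_ti = OC_TFLOPS / RTX_3090_TI_TFLOPS

print(f"Uplift vs stock: {uplift_vs_stock:.0f}%")
print(f"Ratio vs RTX 3090 Ti: {ratio_vs_3090_ti:.1f}x")
print(f"Clock implied by 101 TFLOPs: {implied_clock_mhz(OC_TFLOPS, CUDA_CORES):.0f} MHz")
print(f"Peak at 3150 MHz: {peak_tflops(CUDA_CORES, 3150):.1f} TFLOPs")
```

By this math, 101 TFLOPs corresponds to a clock a little under the 3150 MHz maximum we recorded, which would make sense if the reading was taken at a sustained clock rather than the momentary peak.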

FP32 Compute Horsepower Comparisons (Higher is Better)

[Chart: Compute Power, with bars for RTX 4090 OC, RTX 4090 Stock, RTX 3090 Ti, RX 6900 XTX, Xbox Series X, and PlayStation 5]

This will be a 2x compute performance uplift for each graphics card versus its predecessor and this is without even factoring in the RT and Tensor core performance which are expected to get major lifts too in their respective department. Now FLOPs aren't necessarily reflective of the graphics or gaming performance but they do provide a metric that can be used for comparison. A 2-2.5x gain over the RTX 3090 & RTX 3090 Ti would be very disruptive and it makes sense why NVIDIA is going so hard with higher power limits on their cards.

Gamers should expect 4K gaming to be buttery smooth on these graphics cards and with DLSS, we might even see playable 60 FPS at 8K resolution which is something that NVIDIA has been trying to achieve with its RTX 3090 series BFGPUs for a while now.

NVIDIA GeForce RTX 4090 Graphics Cards Price & Availability

Now coming to the prices, the NVIDIA GeForce RTX 3090 Ti & RTX 3090 graphics cards are undoubtedly the most expensive single-chip GPUs to date. The NVIDIA GeForce RTX 4090 will come at a price of $1599 US for the Founders Edition variant and will be available on the 12th of October.

NVIDIA GeForce RTX 40 Series Official Specs:

Graphics Card Name	NVIDIA GeForce RTX 4090	NVIDIA GeForce RTX 4090 D	NVIDIA GeForce RTX 4080	NVIDIA GeForce RTX 4070 Ti	NVIDIA GeForce RTX 4070	NVIDIA GeForce RTX 4060 Ti	NVIDIA GeForce RTX 4060
GPU Name	Ada Lovelace AD102-300	Ada Lovelace AD102-250	Ada Lovelace AD103-300	Ada Lovelace AD104-400	Ada Lovelace AD104-250	Ada Lovelace AD106-350	Ada Lovelace AD107-400
Process Node	TSMC 4N	TSMC 4N	TSMC 4N	TSMC 4N	TSMC 4N	TSMC 4N	TSMC 4N
Die Size	608mm2	608mm2	378.6mm2	294.5mm2	294.5mm2	190.0mm2	146.0mm2
Transistors	76 Billion	76 Billion	45.9 Billion	35.8 Billion	35.8 Billion	22.9 Billion	TBD
CUDA Cores	16384	14592	9728	7680	5888	4352	3072
TMUs / ROPs	512 / 176	TBD	320 / 112	240 / 80	184 / 64	136 / 48	TBD
Tensor / RT Cores	512 / 128	456 / 128	304 / 76	240 / 60	184 / 46	136 / 34	TBD
L2 Cache	72 MB	72 MB	64 MB	48 MB	36 MB	32 MB	24 MB
Base Clock	2230 MHz	2280 MHz	2210 MHz	2310 MHz	1920 MHz	2310 MHz	1830 MHz
Boost Clock	2520 MHz	2520 MHz	2510 MHz	2610 MHz	2475 MHz	2535 MHz	2460 MHz
FP32 Compute	83 TFLOPs	TBD	49 TFLOPs	40 TFLOPs	29 TFLOPs	22 TFLOPs	15 TFLOPs
RT TFLOPs	191 TFLOPs	TBD	113 TFLOPs	82 TFLOPs	67 TFLOPs	51 TFLOPs	35 TFLOPs
Tensor-TOPs	1321 TOPs	TBD	780 TOPs	641 TOPs	466 TOPs	353 TOPs	242 TOPs
Memory Capacity	24 GB GDDR6X	24 GB GDDR6X	16 GB GDDR6X	12 GB GDDR6X	12 GB GDDR6X	8-16 GB GDDR6	8 GB GDDR6
Memory Bus	384-bit	384-bit	256-bit	192-bit	192-bit	128-bit	128-bit
Memory Speed	21.0 Gbps	21.0 Gbps	23.0 Gbps	21.0 Gbps	21.0 Gbps	18.0 Gbps	17.0 Gbps
Bandwidth	1008 GB/s	1008 GB/s	736 GB/s	504 GB/s	504 GB/s	288 GB/s (554 GB/s Effective)	272 GB/s (453 GB/s Effective)
TBP	450W	425W	320W	285W	200W	160-165W	115W
Price (MSRP / FE)	$1599 US / 1949 EU	12,999 RMB (China-Only)	$1199 US / 1469 EU	$799 US	$599 US	$399-$499 US	$299 US
Price (Current)	$1599 US / 1859 EU	12,999 RMB (China-Only)	$1199 US / 1399 EU	$799 US	$599 US	$399-$499 US	$299 US
Launch (Availability)	12th October 2022	28th December 2023	16th November 2022	5th January 2023	13th April 2023	24th May / 18th July 2023	29th June 2023
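
For a rough sense of value across the lineup, the table figures can be turned into compute per watt and price per TFLOP. This is only an illustrative sketch using the FP32, TBP, and US MSRP values listed above; the RTX 4090 D and RTX 4060 Ti are left out because their entries are TBD or given as ranges. FP32 throughput is not a direct measure of gaming performance.

```python
# (name, FP32 TFLOPs, TBP in watts, MSRP in USD), copied from the table above.
CARDS = [
    ("RTX 4090", 83, 450, 1599),
    ("RTX 4080", 49, 320, 1199),
    ("RTX 4070 Ti", 40, 285, 799),
    ("RTX 4070", 29, 200, 599),
    ("RTX 4060", 15, 115, 299),
]


def value_metrics(tflops, watts, usd):
    """Return (GFLOPs per watt, USD per TFLOP)."""
    return tflops * 1000 / watts, usd / tflops


for name, tflops, watts, usd in CARDS:
    gflops_per_watt, usd_per_tflop = value_metrics(tflops, watts, usd)
    print(f"{name}: {gflops_per_watt:.0f} GFLOPs/W, ${usd_per_tflop:.2f} per TFLOP")
```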
