How Nvidia’s DLSS 3 works (and why AMD FSR can’t catch up for now)

Nvidia’s RTX 40-series graphics cards are arriving in a few short weeks, but among all the hardware improvements lies what could be Nvidia’s golden egg: DLSS 3. It’s much more than just an update to Nvidia’s popular DLSS (Deep Learning Super Sampling) feature, and it could end up defining Nvidia’s next generation much more than the graphics cards themselves.

AMD has been working hard to get its FidelityFX Super Resolution (FSR) on par with DLSS, and for the past several months, it’s been successful. DLSS 3 looks like it will change that dynamic, and this time, FSR may not be able to catch up anytime soon.

How DLSS 3 works (and how it doesn’t)

A chart showing how Nvidia's DLSS 3 technology works. Image: Nvidia

You’d be forgiven for thinking that DLSS 3 is a completely new version of DLSS, but it’s not. Or at least, it’s not entirely new. The backbone of DLSS 3 is the same super-resolution technology that’s available in DLSS titles today, and Nvidia will presumably continue improving it with new versions. Nvidia says you’ll see the super-resolution portion of DLSS 3 as a separate option in the graphics settings now.

The new part is frame generation. DLSS 3 will generate an entirely unique frame every other frame, essentially generating seven out of every eight pixels you see. You can see an illustration of that in the flow chart below. In the case of 4K, your GPU only renders the pixels for 1080p and uses that information for not only the current frame but also the next frame.

A chart showing how DLSS 3 reconstructs frames. Image: Nvidia
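The seven-out-of-eight figure follows from simple arithmetic on the 4K example: a 1080p render covers a quarter of a 4K frame's pixels, and only every other displayed frame is rendered at all. The sketch below just works through that math; the function name and parameters are illustrative, not part of any Nvidia API.

```python
from fractions import Fraction

def rendered_fraction(output_res, render_res, generated_per_rendered=1):
    """Fraction of displayed pixels that the GPU actually renders.

    output_res / render_res are (width, height) tuples.
    generated_per_rendered is how many AI-generated frames follow each
    rendered frame (1 means every other frame is generated).
    """
    out_w, out_h = output_res
    in_w, in_h = render_res
    # Share of pixels rendered within a single upscaled frame.
    per_frame = Fraction(in_w * in_h, out_w * out_h)
    # Only one frame in each group of (1 + generated) is rendered at all.
    return per_frame / (1 + generated_per_rendered)

rendered = rendered_fraction((3840, 2160), (1920, 1080))
print(rendered)      # 1/8
print(1 - rendered)  # 7/8
```

With super resolution alone (no generated frames), the same function gives 1/4 rendered, which is why frame generation roughly halves the rendered share again.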

Frame generation, according to Nvidia, will be a separate toggle from super resolution. That’s because frame generation only works on RTX 40-series GPUs for now, while the super resolution will continue to work on all RTX graphics cards, even in games that have updated to DLSS 3. It should go without saying, but if half of your frames are completely generated, that’s going to boost your performance by a lot.

Frame generation isn’t just some AI secret sauce, though. In DLSS 2 and tools like FSR, motion vectors are a key input for the upscaling. They describe where objects are moving from one frame to the next, but motion vectors only apply to geometry in a scene. Elements that don’t have 3D geometry, like shadows, reflections, and particles, have traditionally been masked out of the upscaling process to avoid visual artifacts.

A chart showing motion through Nvidia's DLSS 3. Image: Nvidia

Masking isn’t an option when an AI is generating an entirely unique frame, which is where the Optical Flow Accelerator in RTX 40-series GPUs comes into play. It’s like a motion vector, but the graphics card is tracking the movement of individual pixels from one frame to the next. This optical flow field, along with motion vectors, depth, and color, contributes to the AI-generated frame.
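To make the idea of a per-pixel flow field concrete, here is a toy sketch that guesses an in-between frame by moving each pixel halfway along its flow vector. It is not how DLSS 3 works internally (Nvidia feeds the flow field into a trained network alongside motion vectors, depth, and color); the function name, the integer flow values, and the "moving pixels win" rule are all assumptions made for illustration.

```python
def warp_halfway(frame, flow):
    """Forward-warp each pixel by half its flow vector.

    frame: 2D list of pixel values.
    flow:  2D list of (dx, dy) integer offsets, one per pixel, describing
           where that pixel ends up in the NEXT frame.
    Returns a 2D list; positions nothing lands on stay None (holes that a
    real system would have to fill in).
    """
    h, w = len(frame), len(frame[0])
    out = [[None] * w for _ in range(h)]
    coords = [(x, y) for y in range(h) for x in range(w)]
    # Assumption: draw static pixels first so moving pixels land on top.
    coords.sort(key=lambda p: abs(flow[p[1]][p[0]][0]) + abs(flow[p[1]][p[0]][1]))
    for x, y in coords:
        dx, dy = flow[y][x]
        nx, ny = x + dx // 2, y + dy // 2  # halfway point (integer division)
        if 0 <= nx < w and 0 <= ny < h:
            out[ny][nx] = frame[y][x]
    return out

# One bright pixel (9) moving four pixels to the right over a static background.
frame = [[9, 0, 0, 0, 0]]
flow = [[(4, 0), (0, 0), (0, 0), (0, 0), (0, 0)]]
print(warp_halfway(frame, flow))  # [[None, 0, 9, 0, 0]]
```

The `None` left behind at the pixel's old position is the kind of gap a generated frame has to paper over, which hints at why a trained model is attractive for this step.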

It sounds like all upsides, but there’s a big problem with frames generated by the AI: they increase latency. The frame generated by the AI never passes through your PC; it’s a “fake” frame, so you won’t see it on traditional fps readouts in games or tools like FRAPS. So, latency doesn’t go down despite having so many extra frames, and due to the computational overhead of optical flow, the latency actually goes up. Because of that, DLSS 3 requires Nvidia Reflex to offset the higher latency.

Normally, your CPU stores up a render queue for your graphics card to make sure your GPU is never waiting for work to do (that would cause stutters and frame rate drops). Reflex removes the render queue and syncs your GPU and CPU so that as soon as your CPU can send instructions, the GPU starts processing them. When applied over the top of DLSS 3, Nvidia says Reflex can sometimes even result in a latency reduction.
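A rough way to see why removing the render queue helps is a back-of-the-envelope latency model. The sketch below assumes that each queued frame waits about one GPU frame time before it is processed; that assumption, the function name, and the millisecond figures are illustrative only and are not measurements or Nvidia's own model.

```python
def input_latency_ms(cpu_ms, gpu_ms, queued_frames):
    """Very simplified input-to-render latency estimate.

    cpu_ms:        time the CPU spends preparing a frame.
    gpu_ms:        time the GPU spends rendering a frame.
    queued_frames: frames already waiting in the render queue.
    Assumes each queued frame adds roughly one GPU frame time of waiting.
    """
    return cpu_ms + queued_frames * gpu_ms + gpu_ms

with_queue = input_latency_ms(cpu_ms=5, gpu_ms=10, queued_frames=2)
without_queue = input_latency_ms(cpu_ms=5, gpu_ms=10, queued_frames=0)
print(with_queue, without_queue)  # 35 15
```

Under this model, whatever overhead frame generation adds can be partly or fully offset by the time saved when the queue is emptied, which is consistent with Nvidia's claim that latency sometimes ends up lower.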

Where AI makes a difference

Microsoft Flight Simulator | NVIDIA DLSS 3 - Exclusive First-Look

AMD’s FSR 2.0 doesn’t use AI, and as I wrote about a while back, it proves that you can get the same quality as DLSS with algorithms instead of machine learning. DLSS 3 changes that with its unique frame generation capabilities, as well as the introduction of optical flow.

Optical flow isn’t a new idea; it’s been around for decades and has applications in everything from video-editing applications to self-driving cars. However, calculating optical flow with machine learning is relatively new due to an increase in datasets to train AI models on. The reason why you’d want to use AI is simple: it produces fewer visual errors given enough training and it doesn’t have as much overhead at runtime.

DLSS is executing at runtime. It’s possible to develop an algorithm, free of machine learning, to estimate how each pixel moves from one frame to the next, but it’s computationally expensive, which runs counter to the whole point of supersampling in the first place. With an AI model that doesn’t require a lot of horsepower and enough training data (and rest assured, Nvidia has plenty of training data to work with), you can achieve optical flow that is high quality and can execute at runtime.

That leads to an improvement in frame rate even in games that are CPU limited. Supersampling only applies to your resolution, which is almost exclusively dependent on your GPU. With a new frame that bypasses CPU processing, DLSS 3 can double frame rates in games even if you have a complete CPU bottleneck. That’s impressive and currently only possible with AI.
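The CPU-bottleneck point can be illustrated with an idealized model in which the rendered frame rate is capped by the slower of the CPU and GPU, and frame generation adds one generated frame per rendered frame. The function name and the fps numbers are hypothetical, and the model ignores the overhead of generating frames, so treat it as a sketch of the reasoning rather than a performance prediction.

```python
def displayed_fps(cpu_fps, gpu_fps, frame_generation=False):
    """Idealized displayed frame rate.

    cpu_fps: frames per second the CPU can prepare.
    gpu_fps: frames per second the GPU can render (upscaling raises this).
    frame_generation: if True, one generated frame follows each rendered
                      frame without involving the CPU.
    """
    rendered = min(cpu_fps, gpu_fps)  # the slower component sets the pace
    return rendered * 2 if frame_generation else rendered

# A CPU-limited game: the CPU tops out at 60 fps.
print(displayed_fps(cpu_fps=60, gpu_fps=90))                          # 60
print(displayed_fps(cpu_fps=60, gpu_fps=150))                         # 60 (upscaling alone does not help)
print(displayed_fps(cpu_fps=60, gpu_fps=150, frame_generation=True))  # 120
```

Upscaling only raises `gpu_fps`, so it cannot lift the result past the CPU cap; generated frames sit outside that cap entirely.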

Why FSR 2.0 can’t catch up (for now)

FSR and DLSS image quality comparison in God of War.

AMD has truly done the impossible with FSR 2.0. It looks fantastic, and the fact that it’s brand-agnostic is even better. I’ve been ready to ditch DLSS for FSR 2.0 since I first saw it in Deathloop. But as much as I enjoy FSR 2.0 and think it’s a great piece of kit from AMD, it’s not going to catch up to DLSS 3 any time soon.

For starters, developing an algorithm that can track each pixel between frames free of artifacts is tough enough, especially in a 3D environment with dense fine detail (Cyberpunk 2077 is a prime example). It’s possible, but tough. The bigger issue, however, is how bloated that algorithm would need to be. Tracking each pixel through 3D space, doing the optical flow calculation, generating a frame, and cleaning up any mishaps that happen along the way: it’s a lot to ask.

Getting that to run while a game is executing and still providing a frame rate improvement on the level of FSR 2.0 or DLSS, that’s even more to ask. Nvidia, even with dedicated processors and a trained model, still has to use Reflex to offset the higher latency imposed by optical flow. Without that hardware or software, FSR would likely trade too much latency to generate frames.

I have no doubt that AMD and other developers will get there eventually (or find another way around the problem), but that could be a few years down the road. It’s hard to say right now.

Coming Soon - GeForce RTX 4090 DLSS 3 First Look Teaser Trailer

What’s easy to say is that DLSS 3 looks very exciting. Of course, we’ll have to wait until it’s here to validate Nvidia’s performance claims and see how image quality holds up. So far, we just have a short video from Digital Foundry showing off DLSS 3 footage (above), which I’d highly recommend watching until we see further third-party testing. From our current vantage point, though, DLSS 3 certainly looks promising.

This article is part of ReSpec, an ongoing biweekly column that includes discussions, advice, and in-depth reporting on the tech behind PC gaming.

