Just Published: NVIDIA Fully-Fused Network Execution

A 2021 NVIDIA application fuses a small network's layers into one GPU kernel for speed. Published research, decoded at the claim level.

Here's what was published, and published is not granted. Application US20220284658A1, "Fully-Fused Neural Network Execution," lists inventors Thomas Müller, Nikolaus Binder, Fabrice Rousselle, Jan Novák, and Alexander Keller, all NVIDIA Research names well known for real-time neural graphics. The CPC codes are graphics-and-compute classes G06T 15/06, G06T 15/005, and crucially G06N 3/10 (neural-network implementation/hardware).

The mechanism is kernel fusion. Running a neural network on a GPU normally launches separate kernels, at least one per layer, and each one pays launch overhead, reads its inputs from global memory, and writes its outputs back. "Fully fused" execution collapses an entire small network into a single GPU kernel so intermediate results stay in fast on-chip memory and never round-trip to slower global memory. For tiny networks evaluated millions of times, exactly the case in neural graphics and real-time rendering, that fusion can be the difference between interactive and unusable.

“A fully-connected neural network may be configured for execution by a processor as a fully-fused neural network by limiting slow global memory accesses to reading and writing inputs to and outputs from the fully-connected neural network.” (U.S. Patent Application 2022/0284658 A1)
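To make the memory-traffic argument concrete, here is a minimal CPU-side sketch in Python. It is not NVIDIA's kernel and not the claimed method; it only counts how many activation values would cross a slow "global memory" boundary when a small fully-connected network runs layer by layer versus in one fused pass. The helper names (make_layers, run_layer_by_layer, run_fused), the ReLU activation, and the layer widths are all assumptions chosen for illustration, and weights are treated as already resident in fast memory in both versions.

```python
import random


def relu(x):
    return x if x > 0.0 else 0.0


def dense(weights, vec, activate):
    """One fully-connected layer: matrix-vector product, optional ReLU."""
    out = [sum(w * v for w, v in zip(row, vec)) for row in weights]
    return [relu(o) for o in out] if activate else out


def make_layers(widths, seed=0):
    """Hypothetical helper: random weight matrices for consecutive layer widths."""
    rng = random.Random(seed)
    return [[[rng.uniform(-1.0, 1.0) for _ in range(n_in)] for _ in range(n_out)]
            for n_in, n_out in zip(widths, widths[1:])]


def run_layer_by_layer(layers, batch, traffic):
    """Unfused: every layer loads its inputs from, and stores its outputs to,
    simulated global memory, the way one-kernel-per-layer execution would."""
    acts = batch
    for k, weights in enumerate(layers):
        is_last = k == len(layers) - 1
        next_acts = []
        for vec in acts:
            traffic["reads"] += len(vec)    # load this layer's input
            out = dense(weights, vec, activate=not is_last)
            traffic["writes"] += len(out)   # store this layer's output
            next_acts.append(out)
        acts = next_acts
    return acts


def run_fused(layers, batch, traffic):
    """Fused: only the network input is loaded and only the final output is
    stored; intermediate activations live in a local variable."""
    results = []
    for vec in batch:
        traffic["reads"] += len(vec)        # load the network input once
        local = vec
        for k, weights in enumerate(layers):
            local = dense(weights, local, activate=k != len(layers) - 1)
        traffic["writes"] += len(local)     # store the network output once
        results.append(local)
    return results


if __name__ == "__main__":
    layers = make_layers([3, 4, 4, 2])      # assumed widths: 3 -> 4 -> 4 -> 2
    rng = random.Random(1)
    batch = [[rng.uniform(-1.0, 1.0) for _ in range(3)] for _ in range(5)]

    unfused = {"reads": 0, "writes": 0}
    fused = {"reads": 0, "writes": 0}
    a = run_layer_by_layer(layers, batch, unfused)
    b = run_fused(layers, batch, fused)

    print("same outputs:", a == b)
    print("layer by layer:", unfused)
    print("fused:", fused)
```

For the assumed 3-4-4-2 network and a batch of 5 inputs, the layer-by-layer version counts 55 reads and 50 writes, while the fused version counts 15 reads and 10 writes and produces identical outputs. The gap grows with every additional hidden layer, which is the effect the quoted passage describes; a real GPU implementation would also have to fit weights and activations into registers and shared memory, which this sketch does not model.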

Because this is a publication, treat the claims as sought, not allowed. The right framing is what NVIDIA is pursuing: ownership of an execution technique that makes small networks run at graphics-pipeline speed on its GPUs. That ties the method directly to NVIDIA's hardware advantage, which is the strategic logic behind filing it.

On scope, the publication caveat is the headline. This is an A1 application; the enforceable boundary will be whatever claim set issues, if one does. Calling it a granted patent, or describing it as covering all network-on-GPU execution, would be wrong on both the legal status and the scope.

The takeaway: US20220284658A1 shows NVIDIA seeking patent protection at the seam between machine learning and graphics, the execution layer where its silicon expertise is the moat, with marquee research inventors and a method tuned to make small networks fly on a GPU.
