NPU
Neural Processing Unit — a dedicated co-processor optimized for the matrix math of neural network inference. Found in modern smartphone SoCs, laptop CPUs, and Copilot+ PCs.
An NPU executes ML workloads at much lower power than a CPU or GPU can. Smartphone NPUs run features like computational photography, on-device translation, and voice recognition continuously without draining battery.
In laptops
- Apple Neural Engine — 38 TOPS in M5 generation.
- Qualcomm Hexagon (X Elite/X2) — 45–80 TOPS.
- Intel AI Boost (Lunar Lake / Panther Lake) — 48+ TOPS.
- AMD XDNA 2 (Ryzen AI 300/400) — 50+ TOPS.
What it enables
Windows Studio Effects (background blur, eye contact, auto-framing) run on the NPU instead of the CPU/GPU, freeing those for productive work. On macOS, Live Captions, dictation, and image segmentation all run on Neural Engine.
NPU vs GPU
NPUs are more efficient per watt; GPUs are far higher throughput. Heavy generative AI workloads (Stable Diffusion, large LLMs) still favor a GPU.