WebGPU LLM In The Browser
N0X runs open-source chat models directly in supported browsers through WebGPU and MLC WebLLM. Models download once, cache locally, and stream responses without an account or hosted inference backend.
Private by default
Worker-based inference
Tiny model path
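The flow described above (check for WebGPU support, then stream a reply piece by piece) can be sketched roughly as follows. This is a minimal illustration, not N0X's actual code: `hasWebGPU`, `collectStream`, `fakeStream`, and the `ChatChunk` type are hypothetical names, and the chunk shape is assumed to follow the OpenAI-style streaming format that WebLLM mirrors. No model is loaded here; a fake stream stands in for the engine.

```typescript
// Minimal shape of an OpenAI-style streaming chunk.
// Assumption: declared locally for illustration, not imported from WebLLM.
interface ChatChunk {
  choices: Array<{ delta: { content?: string | null } }>;
}

// Synchronous capability check: browsers with WebGPU expose `navigator.gpu`.
// The navigator-like object is passed in so the function also runs outside a browser.
function hasWebGPU(nav: { gpu?: unknown } | undefined): boolean {
  return nav !== undefined && nav.gpu !== undefined && nav.gpu !== null;
}

// Consume a chunk stream, forward each non-empty text delta to the caller,
// and return the fully assembled reply once the stream ends.
async function collectStream(
  chunks: AsyncIterable<ChatChunk>,
  onToken: (text: string) => void = () => {},
): Promise<string> {
  let reply = "";
  for await (const chunk of chunks) {
    const text = chunk.choices[0]?.delta?.content ?? "";
    if (text.length > 0) {
      onToken(text);
      reply += text;
    }
  }
  return reply;
}

// Hypothetical stand-in for a model stream, used only for illustration.
async function* fakeStream(parts: string[]): AsyncGenerator<ChatChunk> {
  for (const part of parts) {
    yield { choices: [{ delta: { content: part } }] };
  }
}
```

In a real page, the object passed to `hasWebGPU` would be the browser's `navigator`, and the stream passed to `collectStream` would come from the inference engine running in a worker, with `onToken` appending each piece to the chat view. A fuller check would probably also request a GPU adapter, since the presence of `navigator.gpu` alone does not guarantee usable hardware.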