Ollama Out-of-Bounds Read Vulnerability Allows Remote Process Memory Leak

hubie

from SoylentNews on 2026-05-20 18:06 (#75RQA)

"Fnord666" writes:

Ollama Out-of-Bounds Read Vulnerability Allows Remote Process Memory Leak:

Cybersecurity researchers have disclosed a critical security vulnerability in Ollama that, if successfully exploited, could allow a remote, unauthenticated attacker to leak the server's entire process memory.
The out-of-bounds read flaw, which likely impacts over 300,000 servers globally, is tracked as CVE-2026-7482 (CVSS score: 9.1). It has been codenamed Bleeding Llama by Cyera.
Ollama is a popular open-source framework that allows large language models (LLMs) to be run locally instead of in the cloud. On GitHub, the project has more than 171,000 stars and has been forked over 16,100 times.
"Ollama before 0.17.1 contains a heap out-of-bounds read vulnerability in the GGUF model loader," according to a description of the flaw in CVE.org. "The /api/create endpoint accepts an attacker-supplied GGUF file in which the declared tensor offset and size exceed the file's actual length; during quantization in fs/ggml/gguf.go and server/quantization.go (WriteTo()), the server reads past the allocated heap buffer."
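The class of check described in that advisory can be illustrated with a minimal Go sketch. This is not Ollama's actual code or its fix: the tensorInfo type, the validateTensorRange function, and the error value are hypothetical names invented for illustration. The sketch only shows how a loader might reject a tensor whose declared offset and size reach past the data that is actually present.

```go
package main

import (
	"errors"
	"fmt"
)

// tensorInfo is a hypothetical, simplified stand-in for one tensor entry
// in a model file header. It is not Ollama's real type.
type tensorInfo struct {
	Name   string
	Offset uint64 // declared byte offset into the tensor data region
	Size   uint64 // declared byte length of the tensor data
}

var errOutOfBounds = errors.New("tensor range exceeds available data")

// validateTensorRange returns an error unless [Offset, Offset+Size) lies
// entirely inside a data region that is dataLen bytes long.
func validateTensorRange(t tensorInfo, dataLen uint64) error {
	if t.Offset > dataLen {
		return fmt.Errorf("%s: offset %d is past %d bytes: %w",
			t.Name, t.Offset, dataLen, errOutOfBounds)
	}
	// Compare against the remaining bytes instead of computing
	// Offset+Size, which could wrap around for huge declared sizes.
	if t.Size > dataLen-t.Offset {
		return fmt.Errorf("%s: size %d exceeds the %d bytes remaining: %w",
			t.Name, t.Size, dataLen-t.Offset, errOutOfBounds)
	}
	return nil
}

func main() {
	const dataLen = 1024 // bytes actually present in the uploaded file

	ok := tensorInfo{Name: "ok", Offset: 0, Size: 512}
	bad := tensorInfo{Name: "bad", Offset: 512, Size: 1 << 40}

	fmt.Println(validateTensorRange(ok, dataLen))
	fmt.Println(validateTensorRange(bad, dataLen))
}
```

Checking the size against the remaining length, rather than adding offset and size first, keeps the comparison correct even when an attacker declares values close to the 64-bit maximum.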
GGUF, short for GPT-Generated Unified Format, is a file format that's used to store large language models so that they can be easily loaded and executed locally. It's analogous to other popular model-saving formats like PyTorch .pt/.pth (based on Python's pickle module), safetensors, and Open Neural Network Exchange (ONNX).
The problem, at its core, stems from Ollama's use of Go's unsafe package when creating a model from a GGUF file, specifically in a function named "WriteTo()." Code that uses unsafe can perform operations that bypass the language's memory safety guarantees.
In a hypothetical attack scenario, a bad actor sends a specially crafted GGUF file, with a tensor's shape set to a very large number, to an exposed Ollama server. Creating a model from that file through the /api/create endpoint triggers the out-of-bounds heap read, and successful exploitation could leak sensitive data from the Ollama process memory.
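Because the attack relies on an oversized tensor shape, a loader can also sanity-check the shape itself before touching any data. The Go sketch below is an illustration under stated assumptions, not Ollama's implementation: elementCount and shapeFits are hypothetical helpers, and the fixed bytesPerElem parameter is a simplification, since quantized tensor types pack elements into blocks rather than using a constant width.

```go
package main

import (
	"fmt"
	"math/bits"
)

// elementCount multiplies the dimensions of a tensor shape. It reports
// ok=false if the product does not fit in a uint64.
func elementCount(shape []uint64) (n uint64, ok bool) {
	n = 1
	for _, dim := range shape {
		hi, lo := bits.Mul64(n, dim) // full 128-bit product as (hi, lo)
		if hi != 0 {
			return 0, false // product overflowed 64 bits
		}
		n = lo
	}
	return n, true
}

// shapeFits reports whether a tensor with the given shape, stored at
// bytesPerElem bytes per element (an assumed fixed width), fits within
// the available number of bytes.
func shapeFits(shape []uint64, bytesPerElem, available uint64) bool {
	n, ok := elementCount(shape)
	if !ok {
		return false
	}
	hi, total := bits.Mul64(n, bytesPerElem)
	return hi == 0 && total <= available
}

func main() {
	// A 4x4 tensor of 4-byte elements needs exactly 64 bytes.
	fmt.Println(shapeFits([]uint64{4, 4}, 4, 64)) // true

	// A shape whose element count overflows 64 bits is rejected.
	fmt.Println(shapeFits([]uint64{1 << 40, 1 << 40}, 4, 64)) // false
}
```

Using bits.Mul64 makes the overflow explicit: a plain multiplication would silently wrap and could make an enormous shape look small enough to pass a later size comparison.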
This may include environment variables, API keys, system prompts, and concurrent users' conversation data. This data can be exfiltrated by uploading the resulting model artifact through the /api/push endpoint to an attacker-controlled registry.
[...] "On top of that, engineers often connect Ollama to tools like Claude Code. In those cases, the impact is even higher - all tool outputs flow to the Ollama server, get saved in the heap, and potentially end up in an attacker's hands."
