Ollama: MLX Performance Breakthrough and Anthropic Search
The Ollama team delivered some impressive performance wins today: a major MLX runner overhaul that boosts GLM 4.7 Flash performance by 150%, plus web search support for Anthropic APIs. Patrick Devine led the MLX improvements, while Parth Sareen added the Anthropic web search feature and fixed a PowerShell search bug.
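As a quick sanity check on what a 150% improvement means, the sketch below converts a percentage gain into a speedup multiplier. It assumes the figure refers to throughput (for example, tokens per second), which the summary does not specify. The function names and the baseline value of 40 are hypothetical, chosen only for illustration, and are not measured results.

```python
def speedup_factor(percent_improvement: float) -> float:
    """Convert a percentage improvement into a multiplier over the baseline."""
    return 1.0 + percent_improvement / 100.0


def improved_throughput(baseline: float, percent_improvement: float) -> float:
    """Apply a percentage improvement to a baseline throughput."""
    return baseline * speedup_factor(percent_improvement)


# Hypothetical baseline, used only for illustration (not a benchmark result).
baseline_tokens_per_second = 40.0

factor = speedup_factor(150.0)
print(f"{factor:.1f}x")  # 2.5x
print(improved_throughput(baseline_tokens_per_second, 150.0))  # 100.0
```

Read this way, a 150% gain works out to 2.5 times the baseline, not 1.5 times.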
Duration: PT3M50S
Episode overview
This episode is a short developer briefing from Ollama.
It explains recent repository work in plain language.
- Show: Ollama
- Published: 2026-02-14T11:01:55Z
Transcript excerpt
This is a shortened excerpt. Listen to the episode or use the RSS feed for the full update.
Hey there, developer friends! Welcome back to another episode of the Ollama podcast. I'm your host, and wow, do we have some exciting updates to dive into today - February 14th, 2026. Happy Valentine's Day, by the way! And speaking of love, you're going to love the performance improvements the team has been cooking up.
Let's jump right into the big story of the day - we've got three fantastic pull requests that just landed, and they're all about making your experience smoother and faster.
First up, Patrick Devine has been working some serious magic with the MLX runner, and the results are absolutely stunning. This is one of those PRs that makes you sit up and pay attention - we're talking about a 150 percent performance improvement for the GLM 4.7 Flash model. That's not a typo, folks - one hundred…
What Patrick did was dive deep into how the MLX runner handles the Safetensors-based GLM 4.7 Flash model. The key breakthrough was fixing how scalar data types were being handled. You know how sometimes the smallest details can have the biggest impact? This is a perfect example. By getting those data type operations…
But that's not all Patrick tackled in this monster pull request - and I…