Ollama: Cloud Models Get Smarter & Build Performance Boost
Today we're covering a busy day: 6 merged PRs and 7 commits that brought some major improvements to Ollama! The team improved cloud model handling, fixed XML parsing issues with GLM models, and made Docker builds more efficient. Special shoutouts to the collaborative effort on cloud model stubs and to Bruce MacDonald's clever fix for GLM tool call parsing.
Duration: PT3M51S
Episode overview
This episode is a short developer briefing from Ollama.
It explains recent repository work in plain language.
- Show: Ollama
- Published: 2026-03-07T11:18:50Z
Transcript excerpt
This excerpt keeps the crawler page concise. Listen to the episode or use the RSS feed for the full update.
Hey there, amazing developers! Welcome back to another episode of the Ollama podcast. I'm your host, and wow, do we have a packed show for you today! Grab your favorite beverage because we're diving into some really exciting changes that dropped on March 6th and 7th.
So picture this: you're working with cloud models, and you're tired of having to pull stub files every single time. Well, the team heard you loud and clear! Jeffrey Morgan landed a pretty significant change that eliminates the need to pull stubs for cloud models. Now, I know what you're thinking - "didn't we hear…
But here's where it gets really interesting - Bruce MacDonald tackled one of those wonderfully specific real-world problems that make you go "oh, that's so clever!" The GLM models were being a bit... let's say quirky... with their XML formatting. They were leaving closing tags unclosed in tool calls, which was…
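For readers who want to picture the unclosed-tag problem, here is a minimal sketch of one generic way to repair it. This is not the fix that landed in Ollama, which the excerpt does not describe. The function name `close_unclosed_tags` and the tag names `tool_call`, `name`, `arg_key`, and `arg_value` are assumptions chosen for illustration only.

```python
import re
import xml.etree.ElementTree as ET

# Matches simple tags without attributes, e.g. <arg_value> or </arg_value>.
# Tags with attributes are not handled in this sketch.
TAG_RE = re.compile(r"<(/?)([A-Za-z_][\w.-]*)>")


def close_unclosed_tags(text: str) -> str:
    """Insert missing closing tags so the fragment becomes well-formed.

    Hypothetical helper: it tracks open tags on a stack and, when a closing
    tag arrives for an outer element, closes the inner ones first.
    """
    out = []
    stack = []
    pos = 0
    for m in TAG_RE.finditer(text):
        out.append(text[pos:m.start()])  # copy text between tags unchanged
        pos = m.end()
        is_close, name = m.group(1) == "/", m.group(2)
        if not is_close:
            stack.append(name)
            out.append(m.group(0))
        elif name in stack:
            # Close every tag opened after `name` that was never closed.
            while stack[-1] != name:
                out.append(f"</{stack.pop()}>")
            stack.pop()
            out.append(m.group(0))
        # A closing tag with no matching opener is dropped.
    out.append(text[pos:])
    while stack:  # close anything still open at end of input
        out.append(f"</{stack.pop()}>")
    return "".join(out)


# Assumed example input: the <arg_value> element is never closed.
raw = (
    "<tool_call><name>get_weather</name>"
    "<arg_key>city</arg_key><arg_value>Paris</tool_call>"
)
repaired = close_unclosed_tags(raw)
root = ET.fromstring(repaired)  # parses once the tags are balanced
print(root.find("arg_value").text)
```

The stack-based approach is a deliberate simplification: it only balances tags and does not escape stray `<` characters inside argument values, which a production parser would also need to handle.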
Now, if you're running Ollama in Docker - and let's be honest, many of you are - Daniel Hiltgen just made your builds way smarter. The old parallel settings were, in his words, "naive," and I love that honesty! The new approach uses Ninja for all Docker cmake builds with much more intelligent load…
Nearby episodes from Ollama
- Thinking Streams and Local Tool Power-ups
- Stability First - Error Handling and Performance Fixes
- MLX Gets a Major Upgrade and Web Search Goes Live
- Simplifying the Sampling Story
- Cloud Integrations Get Some Love
- Smarter Constraints and Qwen3.5 Boost
- Cloud Integration Drama and AI Model Expansion
- Smarter Sampling and Crash Prevention