Ollama: Smarter Sampling and Crash Prevention

Jeffrey Morgan merged two key improvements today: a substantial enhancement to the sampling system that adds repeat-based sampling, and a fix for crashes in the Qwen3Next model's DeltaNet when using split offloading. The team collaborated with community contributor Yossi Ovadia on the crash fix.

Duration: PT3M49S

Episode overview

This episode is a short developer briefing from Ollama.

It explains recent repository work in plain language.

  • Show: Ollama
  • Published: 2026-03-02T11:01:42Z

Transcript excerpt


Hey there, amazing developers! Welcome back to another episode of the Ollama podcast. I'm your host, and wow, do we have some exciting updates to dig into today, March 2nd, 2026. Grab your favorite beverage because we're talking about some really thoughtful improvements that are going to make your AI experiences…

Let's jump right into the big story of the day - Jeffrey Morgan just merged a fantastic enhancement to Ollama's sampling system. This is one of those changes that might sound technical at first, but it's actually pretty exciting when you think about what it means for your models.

So what's repeat-based sampling? Think of it like giving your AI a better memory about what it just said. You know how sometimes when you're talking, you might catch yourself repeating a word or phrase, and you naturally course-correct? That's essentially what this new sampling system does for language models. It…
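To make the "better memory" idea concrete, here is a minimal sketch of a classic repetition penalty. It is not taken from the Ollama change discussed in the episode: the function name apply_repeat_penalty and the parameters penalty and last_n are hypothetical, chosen only for illustration. The sketch assumes the sampler sees raw logits plus the list of recently generated token ids, and lowers the score of any token that appears in that recent window.

```python
def apply_repeat_penalty(logits, recent_tokens, penalty=1.1, last_n=64):
    """Return a copy of logits with recently generated tokens made less likely.

    Hypothetical helper for illustration; not Ollama's actual implementation.
    """
    adjusted = list(logits)
    # Only the last `last_n` generated tokens count as "recent".
    window = recent_tokens[-last_n:] if last_n > 0 else []
    for token_id in set(window):
        if 0 <= token_id < len(adjusted):
            value = adjusted[token_id]
            # Dividing a positive logit or multiplying a negative one
            # both push the score down when penalty > 1.
            adjusted[token_id] = value / penalty if value > 0 else value * penalty
    return adjusted


# Tokens 0 and 1 were just generated, so their scores drop; token 2 is untouched.
print(apply_repeat_penalty([2.0, -1.0, 0.5], recent_tokens=[0, 1], penalty=2.0))
```

A penalty of 1.0 leaves the logits unchanged, and a smaller last_n makes the sampler "forget" earlier output sooner. Both knobs trade off avoiding loops against suppressing words that legitimately need repeating.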

The implementation here is really solid too. Jeffrey added 193 lines of new code across 8 files, touching everything from the core API types to the documentation, and even adding comprehensive tests. I love seeing changes that come with proper test coverage - that's exactly the kind of…

Wha…

Nearby episodes from Ollama

  1. Cloud Models Get Smarter & Build Performance Boost
  2. Cloud Integrations Get Some Love
  3. Smarter Constraints and Qwen3.5 Boost
  4. Cloud Integration Drama and AI Model Expansion
  5. Building Bridges for Better Model Compatibility
  6. MLX Runner Gets Rock Solid
  7. Tool Calling Gets Smarter
  8. Cleaner Shutdowns and Faster Startups