Ollama: Nemotron Architecture Lands with Unified Cache Vision

Jeffrey Morgan merged a large pull request that adds Nemotron architecture support to Ollama: over 3,000 lines of new code across 22 files. The change introduces a unified recurrent cache that lays the groundwork for supporting additional architectures such as Qwen3.5 and LFM.

Duration: PT3M49S

Episode overview

This episode is a short developer briefing from Ollama. It explains recent repository work in plain language.

  • Show: Ollama
  • Published: 2026-02-23T11:03:10Z
  • Audio duration: PT3M49S

Transcript excerpt

This is a shortened excerpt; each paragraph is cut off where the ellipsis appears. Listen to the episode or use the RSS feed for the full update.

Hey there, code friends! Welcome back to another episode of the Ollama podcast. I'm so excited to be here with you on this beautiful February 23rd morning, and wow, do we have some incredible progress to dive into today. Grab your favorite beverage because we're talking about some seriously impressive architectural…

So picture this - you're building a house, and instead of just adding another room, you decide to completely reimagine the foundation to support not just your current needs, but three different house styles you want to build in the future. That's exactly what Jeffrey Morgan accomplished with yesterday's massive pull…

This isn't just any ordinary feature add, folks. We're talking about over three thousand lines of new code spread across twenty-two files. But here's what makes this really exciting - Jeffrey didn't just bolt on Nemotron support. He took a step back and said, "You know what? Let's build something that's going to…

Let's talk about what actually landed in the codebase. We've got brand new converter files specifically for Nemotron, complete with comprehensive test suites - because good developers always write tests, right? There's a whole new kvcache package…

Nearby episodes from Ollama

  1. Tool Calling Gets Smarter
  2. Cleaner Shutdowns and Faster Startups
  3. Qwen 3.5 Architecture Lands with Safety Upgrades
  4. Memory Management Revolution
  5. Fixing the WSL Plugin Problem
  6. Smarter UIs and Smoother Onboarding
  7. Tokenizer Consolidation & MLX Library Improvements
  8. Rolling Back and Rolling Forward