Ollama: Tokenizer Love and Better Model Support

Today we're diving into some fantastic tokenizer improvements that make Ollama even more versatile! Daniel Hiltgen delivered two key enhancements - adding SentencePiece-style BPE support for better model compatibility, and fixing a tokenizer configuration bug in the MLX pipeline. Plus, Parth Sareen updated the Pi integration docs to help more developers get started.

Duration: PT3M52S

Episode overview

This episode is a short developer briefing from Ollama.

It explains recent repository work in plain language.

  • Show: Ollama
  • Published: 2026-04-01T10:00:33Z

Transcript excerpt

This is a short excerpt, not the full transcript. Listen to the episode or use the RSS feed for the full update.

Hey there, fellow developers! Welcome back to another episode of the Ollama podcast. I'm your host, and wow, do we have some exciting updates to chat about today. Grab your favorite beverage because we're diving into some really cool tokenizer improvements that are going to make your AI models work even better.

So picture this - you know how different AI models sometimes have their own quirky ways of handling text? Well, the Ollama team has been hard at work making sure we can support even more of these models seamlessly. And today's changes are a perfect example of that dedication to compatibility and correctness.

Let's start with the star of the show - Daniel Hiltgen just merged a fantastic enhancement that adds SentencePiece-style BPE support to our tokenizer. Now, if you're thinking "wait, what's that?" - don't worry, I've got you covered. Byte Pair Encoding, or BPE, is basically how we break down text into smaller pieces…
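For listeners who like to see it in code, here is a minimal sketch of the BPE idea described above: start from single characters and repeatedly merge the highest-priority adjacent pair. This is an illustration only, not Ollama's tokenizer; the function name `bpe_encode` and the tiny merge table are made up for the example.

```python
def bpe_encode(word, merges):
    """Split a word into characters, then apply merge rules by priority.

    `merges` is an ordered list of symbol pairs; earlier entries win.
    Hypothetical helper for illustration, not part of Ollama.
    """
    ranks = {pair: i for i, pair in enumerate(merges)}
    symbols = list(word)
    while len(symbols) > 1:
        pairs = [(symbols[i], symbols[i + 1]) for i in range(len(symbols) - 1)]
        # Pick the adjacent pair with the lowest rank (highest priority).
        best = min(pairs, key=lambda p: ranks.get(p, float("inf")))
        if best not in ranks:
            break  # no known merge applies any more
        merged = []
        i = 0
        while i < len(symbols):
            if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == best:
                merged.append(symbols[i] + symbols[i + 1])
                i += 2
            else:
                merged.append(symbols[i])
                i += 1
        symbols = merged
    return symbols


# Toy merge table, assumed for the example only.
merges = [("l", "o"), ("lo", "w"), ("e", "r")]
tokens = bpe_encode("lower", merges)
print(tokens)
```

With this toy table, "lower" ends up as the two pieces "low" and "er". Real tokenizers learn thousands of merges from data, but the loop is the same shape.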

The cool thing about this update is that some models use a special Unicode character - U+2581 - to represent spaces. It's like a secret code for spaces that certain models prefer. Daniel's implementation adds a new option called WithSentencePieceNormalizer…
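To make the U+2581 idea concrete, here is a small sketch of a SentencePiece-style space normalizer. It only shows the character swap; the helper names are hypothetical, and the actual behavior and signature of WithSentencePieceNormalizer in Ollama are not shown in this excerpt and may differ.

```python
# U+2581 (LOWER ONE EIGHTH BLOCK) is the marker SentencePiece-style
# vocabularies use in place of a space.
SPACE_MARKER = "\u2581"


def sentencepiece_normalize(text):
    """Replace each ASCII space with U+2581 before tokenization.

    Hypothetical helper for illustration; not Ollama's API.
    """
    return text.replace(" ", SPACE_MARKER)


def sentencepiece_denormalize(text):
    """Turn U+2581 markers back into spaces after decoding."""
    return text.replace(SPACE_MARKER, " ")


normalized = sentencepiece_normalize("hello big world")
restored = sentencepiece_denormalize(normalized)
```

The point of the swap is that spaces become an ordinary visible symbol, so the BPE merges can treat "start of a word" as part of a token rather than relying on a separate pre-splitting step.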

The…

Nearby episodes from Ollama

  1. Weekly Recap - Gemma4 Integration & Audio Support
  2. Performance Lessons and Gemma4 Refinements
  3. Gemma4 Arrives with Audio Magic
  4. Modernizing Codex Configuration
  5. Legacy Compatibility and Developer Experience Wins
  6. Smoothing the Launch Experience
  7. Fixing the Inconsistencies That Matter
  8. Smart Caching and Better User Experience