Ollama: MLX Performance Boost and Model Updates

Six pull requests merged, including MLX runner optimizations that deliver a 1.5% generation throughput improvement and allow tokenization to run concurrently with GPU work. Model recommendations now feature kimi-k2.6.

2026-04-21T00:00:00Z

Duration: PT1M55S

Episode overview

This episode is a short developer briefing from Ollama that explains recent repository work in plain language.

Show: Ollama

Transcript excerpt

Good morning, this is your Ollama development briefing for April 21st, 2026.

Jesse Gross merged MLX Sampler Improvements, adding logprobs support to the MLX runner and optimizing the sampling process. The changes avoid multiple sorts when both top-P and top-K filters are active, delivering a 1.5% generation throughput improvement with gemma4 models.
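To illustrate the idea of sharing one sort between both filters, here is a minimal Go sketch. It is not the MLX runner's code: the function name `filterTopKTopP`, its signature, and the choice to apply top-P to the un-renormalized probabilities are all assumptions made for illustration.

```go
package main

import (
	"fmt"
	"sort"
)

// filterTopKTopP is a hypothetical helper, not Ollama's MLX sampler.
// It sorts token indices by probability once and then applies both the
// top-K cut and the top-P (nucleus) cut to that same ordering.
func filterTopKTopP(probs []float64, topK int, topP float64) []int {
	idx := make([]int, len(probs))
	for i := range idx {
		idx[i] = i
	}

	// Single sort, descending by probability; both filters reuse it.
	sort.SliceStable(idx, func(a, b int) bool {
		return probs[idx[a]] > probs[idx[b]]
	})

	// Top-K: keep only the K most likely tokens.
	if topK > 0 && topK < len(idx) {
		idx = idx[:topK]
	}

	// Top-P: keep the shortest prefix whose cumulative probability
	// reaches topP (assumption: no renormalization after top-K).
	if topP > 0 && topP < 1 {
		sum := 0.0
		for i, id := range idx {
			sum += probs[id]
			if sum >= topP {
				idx = idx[:i+1]
				break
			}
		}
	}
	return idx
}

func main() {
	probs := []float64{0.1, 0.5, 0.05, 0.3, 0.05}
	fmt.Println(filterTopKTopP(probs, 3, 0.7)) // prints [1 3]
}
```

Because the candidates are already in descending order after the first sort, the top-P pass is a single linear scan rather than a second sort.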

Gross also merged tokenization improvements that move prompt processing out of the GPU goroutine into individual request handlers. This allows CPU tokenization to happen concurrently while the GPU handles current requests, improving overall pipeline efficiency.
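The pattern described here can be sketched in Go roughly as follows. This is an illustrative outline under stated assumptions, not the runner's implementation: `request`, `tokenize`, `gpuWorker`, `handle`, and `run` are hypothetical names, whitespace splitting stands in for real tokenization, and counting tokens stands in for model evaluation.

```go
package main

import (
	"fmt"
	"strings"
	"sync"
)

// request carries already-tokenized input to the GPU worker.
type request struct {
	tokens []string
	result chan int
}

// tokenize stands in for CPU-bound tokenization (assumption: whitespace split).
func tokenize(prompt string) []string { return strings.Fields(prompt) }

// gpuWorker is the single goroutine that owns the simulated GPU.
// It only receives tokens, so it never spends time tokenizing.
func gpuWorker(queue <-chan request) {
	for req := range queue {
		req.result <- len(req.tokens) // placeholder for model evaluation
	}
}

// handle runs in each request's own goroutine: it tokenizes on the CPU
// and only then hands the work to the GPU goroutine.
func handle(prompt string, queue chan<- request) int {
	req := request{tokens: tokenize(prompt), result: make(chan int, 1)}
	queue <- req
	return <-req.result
}

// run starts one handler goroutine per prompt and collects the results.
func run(prompts []string) []int {
	queue := make(chan request)
	go gpuWorker(queue)

	out := make([]int, len(prompts))
	var wg sync.WaitGroup
	for i, p := range prompts {
		wg.Add(1)
		go func(i int, p string) {
			defer wg.Done()
			out[i] = handle(p, queue)
		}(i, p)
	}
	wg.Wait()
	close(queue)
	return out
}

func main() {
	fmt.Println(run([]string{"hello world", "why is the sky blue"})) // prints [2 5]
}
```

While the worker is busy with one request, the other handlers can finish tokenizing, so their input is ready the moment the GPU goroutine becomes free.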

Parth Sareen merged a server fix enabling format constraints for gemma4 models when thinking mode is disabled. This resolves a user-blocking issue with format constraints.

Michael Verrilli merged capability detection fixes for the interactive TUI mode. The terminal interface was missing multimodal detection, causing image and audio files to be treated as unknown commands instead of valid attachments.
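As a rough picture of what capability-aware input handling might look like, the Go sketch below checks for a supported attachment before treating input as a command. Everything in it is hypothetical: the `capabilities` struct, the `classify` function, the extension list, and the rule that commands start with a slash are assumptions for illustration and are not taken from the Ollama TUI.

```go
package main

import (
	"fmt"
	"path/filepath"
	"strings"
)

// capabilities is a hypothetical summary of what the loaded model accepts.
type capabilities struct {
	vision bool
	audio  bool
}

// classify decides how a line of user input should be treated.
// Attachments are checked first so that a file path is not mistaken
// for a command when the model can actually consume the file.
func classify(input string, caps capabilities) string {
	input = strings.TrimSpace(input)
	ext := strings.ToLower(filepath.Ext(input))

	switch ext {
	case ".png", ".jpg", ".jpeg":
		if caps.vision {
			return "attachment"
		}
	case ".wav", ".mp3":
		if caps.audio {
			return "attachment"
		}
	}

	// Assumption: commands in this sketch begin with a slash.
	if strings.HasPrefix(input, "/") {
		return "command"
	}
	return "text"
}

func main() {
	multimodal := capabilities{vision: true, audio: true}
	fmt.Println(classify("/tmp/cat.png", multimodal))     // prints attachment
	fmt.Println(classify("/tmp/cat.png", capabilities{})) // prints command
	fmt.Println(classify("/help", multimodal))            // prints command
}
```

The second call shows the failure mode in miniature: without detected capabilities, the same path falls through to the command branch.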

Matteo Celani merged a model picker fix that resolves stale model displays when switching between chats. The issue occurred when streaming messages stored model objects instead of…
