Ollama: Weekly Recap - MLX Performance & Launch Integrations

This week brought significant MLX runner improvements, including logprobs support and batched sampling, plus a new Kimi CLI integration and improved OpenClaw onboarding flows. Performance improved by up to 1.5% across multiple model types.

Episode overview

This episode is a short developer briefing from Ollama.

It explains recent repository work in plain language.

Show: Ollama
Published: 2026-04-27T00:00:00Z
Audio duration: PT2M44S

Transcript excerpt

This excerpt is abridged. Listen to the episode or use the RSS feed for the full update.

Good morning. This is your Ollama weekly recap for April 20th through 27th, 2026.

16 PRs merged, 20 additional commits this week.

Starting with performance improvements. Jesse Gross delivered major MLX runner enhancements, introducing logprobs support that matches OpenAI semantics with full-vocabulary log-softmax calculations. The MLX sampler received significant optimization, combining top-P and top-K filters to avoid multiple sorts,…
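
As a rough illustration of the two sampler ideas mentioned above, the following Python sketch computes log-probabilities with a log-softmax over the full vocabulary and then applies top-K and top-P using a single sort. It is not the MLX runner's code: the function names, the pure-Python lists, and the choice to measure top-P against the full-vocabulary probabilities (rather than renormalizing after top-K) are all assumptions made for the example.

```python
import math


def log_softmax(logits):
    """Numerically stable log-softmax over every entry in `logits`."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_z for x in logits]


def top_k_top_p(logits, k, p):
    """Return (kept_token_ids, logprobs) using one sort for both filters.

    Hypothetical helper: keeps at most `k` tokens, stopping early once the
    cumulative full-vocabulary probability reaches `p`.
    """
    logprobs = log_softmax(logits)
    # A single descending sort serves both the top-K and the top-P cut.
    order = sorted(range(len(logits)), key=lambda i: logprobs[i], reverse=True)
    kept = []
    cumulative = 0.0
    for token_id in order[:k]:
        kept.append(token_id)
        cumulative += math.exp(logprobs[token_id])
        if cumulative >= p:
            break
    return kept, logprobs


kept, logprobs = top_k_top_p([2.0, 1.0, 0.5, -1.0], k=3, p=0.8)
```

Because the log-probabilities come from the unfiltered distribution, they stay comparable across requests regardless of which sampling filters are active, which is one plausible reason to compute them over the full vocabulary.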

Additional MLX optimizations include moving prompt tokenization from GPU to CPU threads, enabling parallel processing of requests. Thread safety improvements were made to array management using atomic operations and mutexes. The GLM4 MoE Lite model gained a fused sigmoid router, improving generation performance by…
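
The general pattern of tokenizing prompts on CPU worker threads while guarding shared state with a mutex can be sketched as follows. This is only an illustration under stated assumptions: `tokenize` is a stand-in whitespace splitter, and `TokenCounter` and `tokenize_batch` are hypothetical names that do not come from the Ollama codebase.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class TokenCounter:
    """Shared counter protected by a mutex so worker threads can update it safely."""

    def __init__(self):
        self._lock = threading.Lock()
        self._total = 0

    def add(self, n):
        with self._lock:
            self._total += n

    @property
    def total(self):
        with self._lock:
            return self._total


def tokenize(prompt):
    """Stand-in tokenizer (assumption): splits on whitespace."""
    return prompt.split()


def tokenize_batch(prompts, counter, workers=4):
    """Tokenize each prompt on a CPU worker thread, preserving input order."""

    def work(prompt):
        tokens = tokenize(prompt)
        counter.add(len(tokens))
        return tokens

    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(work, prompts))


counter = TokenCounter()
batches = tokenize_batch(["hello world", "a b c"], counter)
```

Keeping tokenization off the thread that drives the accelerator means one request's prompt preparation does not have to wait for another request's generation step.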

On the integration front, Kimi CLI support was added to the launch system with full installer flow. OpenClaw integration received significant hardening, including bundled web search capabilities and improved onboarding reliability. The launcher now stages plugin dependencies under stable per-user directories and…

Model management saw several updates. The recommended models list now maintains fixed canonical ordering, with Kimi K2.6 replacing…
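
One simple way to keep a list in a fixed canonical order is to rank entries against a predefined sequence, as in this sketch. The model names are placeholders and `order_recommended` is a hypothetical helper; neither reflects the actual recommended list or its implementation.

```python
# Placeholder names (assumption); the real recommended list is not reproduced here.
CANONICAL_ORDER = ["model-a", "model-b", "model-c"]


def order_recommended(available):
    """Sort known models by canonical rank; unknown models keep their input order at the end."""
    rank = {name: i for i, name in enumerate(CANONICAL_ORDER)}
    known = sorted((m for m in available if m in rank), key=rank.__getitem__)
    unknown = [m for m in available if m not in rank]
    return known + unknown


ordered = order_recommended(["model-c", "extra", "model-a"])
```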

Nearby episodes from Ollama

Metal GPU Stability and Gemma4 Updates 2026-04-30T00:00:00Z
Launch Experience Improvements and Model Recommendations 2026-04-29T00:00:00Z
Multi-Sequence Batching and New Model Support 2026-04-28T00:00:00Z
Tokenizer Bug Fix for BPE Processing 2026-04-27T00:00:00Z
MLX Sampling Performance Enhancement 2026-04-25T00:00:00Z
OpenAI Reasoning Integration 2026-04-24T00:00:00Z
Launch System Improvements and Integration Fixes 2026-04-23T00:00:00Z
Launch System Overhaul and Documentation Updates 2026-04-22T00:00:00Z