Ollama: Speculative Decoding and Codex App Updates

The Ollama team merged five pull requests: one adds DFlash speculative decoding to the MLX runner to improve performance, and four refine the Codex app, including more robust restart handling and documentation updates.

Duration: PT0S

Episode overview

This episode is a short developer briefing from Ollama.

It explains recent repository work in plain language.

  • Show: Ollama
  • Published: 2026-05-15T10:01:04Z

Transcript excerpt

Good morning, this is your Ollama developer briefing for May 15th, 2026.

Patrick Devine merged a significant performance enhancement, adding DFlash speculative decoding to the MLX runner. This 1,900-line addition introduces block diffusion speculative decoding with support for Qwen 3.6 models, both mixture-of-experts and dense variants. The implementation includes draft model recurrent…
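For readers unfamiliar with the general idea, here is a minimal sketch of greedy speculative decoding with a block-proposing draft model. It is not the DFlash implementation and does not use the MLX runner's API: `speculative_step`, `generate`, `toy_target`, and `toy_draft` are hypothetical names invented for illustration, and the "models" are plain functions over integer tokens. The sketch also checks draft tokens one position at a time for clarity, whereas a real runner would typically verify the whole block in a single forward pass of the target model.

```python
from typing import Callable, List

Token = int
# Assumed interfaces (hypothetical, for illustration only):
NextToken = Callable[[List[Token]], Token]               # greedy next token from the target
DraftBlock = Callable[[List[Token], int], List[Token]]   # draft proposes k tokens at once


def speculative_step(context: List[Token], target_next: NextToken,
                     draft_block: DraftBlock, block_size: int) -> List[Token]:
    """Return the tokens accepted in one draft-then-verify round."""
    proposal = draft_block(context, block_size)
    accepted: List[Token] = []
    for tok in proposal:
        expected = target_next(context + accepted)
        if tok != expected:
            # First mismatch: keep the target's token and discard the rest of the block.
            accepted.append(expected)
            return accepted
        accepted.append(tok)
    # Whole block accepted: the target contributes one extra token.
    accepted.append(target_next(context + accepted))
    return accepted


def generate(prompt: List[Token], target_next: NextToken, draft_block: DraftBlock,
             block_size: int, max_new_tokens: int) -> List[Token]:
    out = list(prompt)
    while len(out) - len(prompt) < max_new_tokens:
        out.extend(speculative_step(out, target_next, draft_block, block_size))
    return out[:len(prompt) + max_new_tokens]


def toy_target(ctx: List[Token]) -> Token:
    # Toy "target model": counts upward modulo 10.
    return (ctx[-1] + 1) % 10


def toy_draft(ctx: List[Token], k: int) -> List[Token]:
    # Toy "draft model": mostly agrees with the target, but always guesses 0
    # at the third position of each block.
    tokens, last = [], ctx[-1]
    for i in range(k):
        last = (last + 1) % 10
        tokens.append(0 if i == 2 else last)
    return tokens


result = generate([0], toy_target, toy_draft, block_size=4, max_new_tokens=8)
```

The property worth noticing is that, under greedy decoding, the output matches what the target model would have produced on its own; the draft only affects how many tokens are accepted per round, which is where the speedup comes from.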

Parth Sareen contributed four merged pull requests centered on the Codex app. The first addressed restart reliability by implementing more robust restart mechanisms while keeping the existing safeguards. Sareen also updated UI copy across the launch commands and registry components, and made substantial…

The additional commits mirror these merged pull requests, with no standalone changes beyond the integrated work.

What's next: The team appears focused on stabilizing the Codex app for launch while continuing MLX runner optimizations. Performance testing of the new speculative decoding implementation will likely be a priority.

That's your Ollama update for today. Back tomorrow with more developer news.
