PyTorch: Metal Shaders Get a Precision Fix

Today we're diving into a crucial Metal shader fix that resolves half-precision type mismatches, plus CPU performance improvements from new u8s8 support for integer matrix multiplication. We also saw several reverts and re-implementations as the team iterated on opaque object support and Dynamo optimizations.

Duration: PT4M10S

Episode overview

This episode is a short developer briefing from PyTorch.

It explains recent repository work in plain language.

  • Show: PyTorch
  • Published: 2026-03-12T10:07:53Z

Transcript excerpt

Hey there, amazing developers! Welcome back to another episode of the PyTorch podcast. I'm your host, and it's March 12th, 2026. Grab your coffee because we've got some really interesting updates from the PyTorch world - including a super important fix that's going to make Metal shader development so much smoother.

Let's jump right into our main story today. We had one merged pull request that I'm genuinely excited about, and it's one of those fixes that might seem small but has huge implications. The team tackled a really tricky issue with Metal shader codegen where half-precision types were causing compilation failures.

Here's what was happening - and I love this because it's such a great example of how different systems handle types differently. Metal Shading Language is pretty strict about implicit conversions, especially when you're trying to convert from float to bfloat. The PyTorch codegen was generating bare float literals…

The fix touched three key methods in the MPS codegen - the constant method was completely ignoring its dtype parameter, the masked method was assigning bare literals in else branches, and the where method was passing literals through ternaries without…
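To make the three spots mentioned above a little more concrete, here is a minimal Python sketch of how a shader code generator might emit dtype-aware literals. Everything in it is an assumption for illustration: the helper names (`cast`, `constant`, `where`, `masked`), their signatures, the `MSL_TYPES` mapping, and the choice of `static_cast` are hypothetical and are not taken from PyTorch's MPS codegen or from the merged pull request. It only handles finite values.

```python
# Hypothetical sketch: these helper names are illustrative,
# not PyTorch's actual MPS codegen API.

# Assumed mapping from dtype names to Metal Shading Language scalar types.
MSL_TYPES = {"float32": "float", "float16": "half", "bfloat16": "bfloat"}


def cast(expr: str, dtype: str) -> str:
    """Wrap an MSL expression in an explicit cast to the target type."""
    return f"static_cast<{MSL_TYPES[dtype]}>({expr})"


def constant(value: float, dtype: str) -> str:
    """Emit a literal that honors dtype instead of a bare float literal.

    Only finite values are handled in this sketch.
    """
    return cast(repr(float(value)), dtype)


def where(cond: str, a: str, b: str, dtype: str) -> str:
    """Emit a ternary whose branches are both cast to dtype."""
    return f"({cond}) ? {cast(a, dtype)} : {cast(b, dtype)}"


def masked(var: str, mask: str, body: str, other: float, dtype: str) -> str:
    """Emit an if/else that assigns a typed fallback in the else branch."""
    msl_type = MSL_TYPES[dtype]
    return (
        f"{msl_type} {var};\n"
        f"if ({mask}) {{ {var} = {body}; }} "
        f"else {{ {var} = {constant(other, dtype)}; }}"
    )


print(constant(0, "bfloat16"))
print(where("x > 0", "x", "0.0", "float16"))
print(masked("tmp0", "m", "in0[i]", 0, "bfloat16"))
```

The idea in this sketch is simply that every literal passes through one helper that knows the target dtype, so a bare `0.0` never reaches a `half` or `bfloat` variable without an explicit cast.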

This…

Nearby episodes from PyTorch

  1. Polish & Performance Day
  2. Distributed Computing Gets Real - Compilation, Clustering, and Convolutions
  3. Performance Revolution and Developer Experience Upgrades
  4. Windows Testing Gets Flexible & Dynamic Shapes Take Flight
  5. The Testing & Error Handling Polish Episode
  6. Stream Safety and Performance Wins
  7. Subclass Evolution and Memory Management Improvements
  8. Performance Tuning and Code Health Day