Ollama Now Runs Faster on Macs Thanks to Apple's MLX Framework

Ollama has released a performance-focused macOS update that uses Apple’s MLX framework to significantly speed up local AI models on Apple silicon.

Key points

Ollama version 0.19 delivers a 1.6x increase in prompt prefill speeds and nearly doubles response generation rates.
The update uses Apple’s GPU Neural Accelerators, with the largest performance gains on Macs equipped with M5-series chips.
Improved memory management enhances responsiveness for AI-powered coding agents and personal assistants during extended sessions.
The current preview release requires a Mac with at least 32GB of unified memory.
Initial support is limited to Alibaba’s Qwen3.5 model, with plans to expand compatibility to additional AI models in future updates.
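
For readers who want to check the prefill and generation figures on their own machine, here is a minimal sketch of how the two rates can be derived. It assumes the timing fields that Ollama's `/api/generate` response reports (`prompt_eval_count`, `prompt_eval_duration`, `eval_count`, `eval_duration`, with durations in nanoseconds). The helper names and the sample numbers are hypothetical and are not measurements from this release.

```python
def tokens_per_second(count: int, duration_ns: int) -> float:
    """Convert a token count and a duration in nanoseconds to tokens per second."""
    if duration_ns <= 0:
        return 0.0  # avoid dividing by zero when a timing field is missing
    return count / (duration_ns / 1e9)


def throughput(response: dict) -> dict:
    """Derive prefill and generation rates from an Ollama-style response dict."""
    return {
        # Prefill: how fast the prompt tokens were processed.
        "prefill_tps": tokens_per_second(
            response.get("prompt_eval_count", 0),
            response.get("prompt_eval_duration", 0),
        ),
        # Generation: how fast the response tokens were produced.
        "generation_tps": tokens_per_second(
            response.get("eval_count", 0),
            response.get("eval_duration", 0),
        ),
    }


# Illustrative numbers only (assumption), not results from the article.
sample = {
    "prompt_eval_count": 1000,
    "prompt_eval_duration": 500_000_000,  # 0.5 seconds
    "eval_count": 200,
    "eval_duration": 4_000_000_000,  # 4 seconds
}
rates = throughput(sample)
print(rates)
```

Running the same prompt before and after upgrading and comparing the two rates would give a rough view of any speedup, though results will vary with the model, prompt length, and hardware.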

Why it matters

This update lowers the barrier for developers and power users to run sophisticated AI models locally on high-end Apple hardware without relying on cloud-based services. By optimizing performance for local silicon, Ollama is making private, high-speed AI tools more practical for resource-intensive tasks like coding and data analysis.
