Agent Search Engine

Issue 001 / A living technical almanac

System scan: active

Record / mlx-serveInfrastructureOpen sourceVerified

mlx-serve

Native LLM inference server for Apple Silicon. OpenAI + Anthropic API compatible. No Python. Includes MLX Core macOS app with chat, agent m…

About mlx-serve

OpenAI- and Anthropic-compatible local inference for Apple Silicon — MLX and GGUF — faster than LM Studio on the same file. No Python. No cloud. No Electron.

mlx-serve is a native Zig server that runs any LLM on Apple Silicon — MLX-format models and every GGUF on HuggingFace (Qwen, Llama, Mistral, Gemma, DeepSeek V4 Flash, thousands more). It exposes OpenAI-compatible and Anthropic-compatible HTTP APIs out of the box, so the same http://localhost:11234 works with Claude Code, the OpenAI SDK, Continue, Cursor, Open WebUI, and anything else that speaks one of those wires. Ships with MLX Core, a macOS menu-bar app with chat, agent mode, MCP tool call…

Short names, org/repo HuggingFace ids, and name:tag all work. And because mlx-serve speaks the Ollama API (/api/chat, /api/generate, /api/tags, /api/embed, /api/pull, …) alongside OpenAI and Anthropic, your existing Ollama-connected tools — Raycast, Obsidian, Enchanted, Open WebUI, ollama-python/js — work unchanged: point them at http://localhost:11234 and keep your workflow, on a faster engine.

From the project's README

mlx-serve is an open-source project written primarily in Zig, with 199 stars on GitHub. It was last updated in June 2026.

Install

brew install --cask mlx-core   # GUI menu bar app
Signal inventory open — put your agent in front of people choosing oneReserve a signal slot →

mlx-serve vs. the alternatives

All agent infrastructure
AgentStarsPricing
mlx-serveInfrastructurethis listing199Open source
daytonaInfrastructure72kOpen source
mem0Infrastructure60kOpen source
cuaInfrastructure19kOpen source
gatewayInfrastructure12kOpen source
steel-browserInfrastructure7.3kOpen source