I built the Playwright for desktop apps. 80% token savings

agent-desktop is a new native CLI tool, written in Rust, that lets AI agents interact with desktop applications by reading OS accessibility trees instead of matching pixels in screenshots.

Key Points

  • Provides structured, machine-readable JSON output for 53 distinct commands, including mouse, keyboard, and window management.
  • Uses progressive skeleton traversal to reduce token usage by 78–96% when interacting with complex applications like Slack or VS Code.
  • Features a C-ABI library (libagent_desktop_ffi) for direct integration with Python, Swift, Go, Ruby, and Node.js without requiring process forking.
  • Assigns deterministic element references (e.g., @e1) to interactive UI components, allowing agents to perform precise actions like clicking or typing.
  • Requires macOS 13.0+ and Accessibility permissions, with support for Windows and Linux currently in development.
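
The two ideas in the list above, deterministic element references and progressive skeleton traversal, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the tree, the field names (role, name, ref, collapsed_children), and the helper functions are all hypothetical and are not agent-desktop's actual schema or API. The sketch numbers interactive elements in document order, then emits a depth-limited outline so an agent can expand only the branches it needs.

```python
import json

# Hypothetical accessibility tree; roles and fields are illustrative only,
# not agent-desktop's real output format.
TREE = {
    "role": "window", "name": "Editor", "children": [
        {"role": "toolbar", "name": "Main", "children": [
            {"role": "button", "name": "Save", "children": []},
            {"role": "button", "name": "Run", "children": []},
        ]},
        {"role": "group", "name": "Sidebar", "children": [
            {"role": "textfield", "name": "Search", "children": []},
            {"role": "list", "name": "Files", "children": [
                {"role": "button", "name": "main.rs", "children": []},
            ]},
        ]},
    ],
}

# Assumed set of roles that an agent can act on.
INTERACTIVE = {"button", "textfield", "checkbox", "link"}


def annotate(node, counter=None):
    """Copy the tree, tagging interactive nodes with @e1, @e2, ... in
    depth-first document order, so the same tree always yields the same refs."""
    if counter is None:
        counter = [0]
    out = {"role": node["role"], "name": node["name"]}
    if node["role"] in INTERACTIVE:
        counter[0] += 1
        out["ref"] = f"@e{counter[0]}"
    out["children"] = [annotate(child, counter) for child in node["children"]]
    return out


def skeleton(node, max_depth):
    """Return the tree cut off at max_depth; deeper subtrees are replaced
    by a count of their direct children."""
    out = {key: value for key, value in node.items() if key != "children"}
    if max_depth == 0:
        if node["children"]:
            out["collapsed_children"] = len(node["children"])
    else:
        out["children"] = [skeleton(c, max_depth - 1) for c in node["children"]]
    return out


full = annotate(TREE)
shallow = skeleton(full, 1)

# Serialized length is a crude stand-in for token count.
full_size = len(json.dumps(full))
shallow_size = len(json.dumps(shallow))
```

Comparing shallow_size with full_size gives a rough feel for why an outline costs fewer tokens than a full dump; the real savings depend on the application and on how the tool serializes its output.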

Why it Matters

This tool aims to make autonomous AI agents more reliable and efficient by replacing error-prone screenshot analysis with direct, structured OS-level data. By reducing token overhead and removing the need for visual processing, it allows faster and more accurate automation of desktop applications that expose an accessibility tree.
Published on GitHub.com by lahfir