AI Agents Learn How to Click

Neural Notes

02 Dec 2025 — 5 min read

The next wave of AI isn't about better answers. It's about AI that clicks buttons, fills forms, and navigates software on your behalf. This week, an MIT startup claims it built an agent that beats OpenAI and Anthropic at computer control. Whether you believe the benchmarks or not, the race to automate your desktop just got a lot more interesting.

📰 The Rundown

MIT Startup Claims Its AI Agent Crushes OpenAI and Anthropic

➡️ The move: OpenAGI emerged from stealth with Lux, a foundation model designed to control computers by reading screenshots and executing actions. The company claims an 83.6% success rate on Online-Mind2Web, which it describes as the industry's toughest benchmark for computer control. OpenAI's Operator scores 61.3%. Anthropic's Claude Computer Use hits 56.3%. OpenAGI also says Lux runs at one-tenth the cost and completes each action in about one second versus three.

⚡ Why it matters: Most AI agents today are browser-only. Lux works across native desktop apps including Excel, Slack, and Adobe products. That means entire categories of repetitive work that happen outside the browser become automatable. The company is releasing an SDK so developers can build on top of it immediately.

🎯 Your takeaway: The "agentic AI" buzzword is getting real. If your job involves repetitive clicking, copying, and pasting across applications, start documenting those workflows now.
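To put the benchmark claims in perspective, here is a quick back-of-the-envelope sketch. It uses only the three company-reported success rates quoted above; the function name and structure are illustrative, and nothing here independently verifies the scores.

```python
# Success rates on Online-Mind2Web as quoted in this issue (company-reported).
scores = {
    "Lux": 83.6,
    "OpenAI Operator": 61.3,
    "Claude Computer Use": 56.3,
}


def lead_over(baseline: str, challenger: str = "Lux") -> tuple:
    """Return (percentage-point lead, relative improvement in percent)."""
    gap = scores[challenger] - scores[baseline]
    relative = gap / scores[baseline] * 100
    return round(gap, 1), round(relative, 1)


for rival in ("OpenAI Operator", "Claude Computer Use"):
    points, relative = lead_over(rival)
    print(f"Lux vs {rival}: +{points} points ({relative}% relative)")
```

A 22.3-point lead over Operator works out to roughly a 36% relative improvement, which is why the claim is drawing attention and also why it deserves independent replication.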

🇨🇳 DeepSeek V3.2 Claims Parity with GPT-5 on Reasoning Benchmarks

➡️ The move: China's DeepSeek unveiled V3.2, the production version of its experimental reasoning model. The Hangzhou-based startup claims parity with OpenAI's flagship GPT-5 across multiple benchmarks while remaining open-source. The model adds new autonomous action capabilities, moving beyond pure reasoning into task execution.

⚡ Why it matters: The AI race is no longer US versus everyone. An open-source Chinese model matching GPT-5 benchmarks means frontier capabilities are becoming commoditized faster than anyone predicted. For enterprise buyers, this creates new options beyond the big three American providers.

🎯 Your takeaway: Competition is compressing the price-to-capability curve. If your organization is locked into expensive AI contracts, it may be time to evaluate alternatives.

🚗 Lyft's AI Agent Cuts Support Resolution Time by 87%

➡️ The move: At AWS re:Invent, Lyft announced that its new "intent agent," built on Claude and Amazon Bedrock, resolves driver support issues 87% faster. Over half of all issues now close in under three minutes. The agent speaks Spanish and English, already knows the driver's recent rides and payment history, and takes action to fix problems rather than just explaining solutions.

⚡ Why it matters: This is what enterprise AI deployment actually looks like in late 2025. Not chatbots that recite FAQ answers, but agents with backend access that solve problems. Lyft is giving a glimpse of the support experience every company will need to match.

🎯 Your takeaway: The gap between "AI-assisted" and "AI-resolved" just became the new competitive divide in customer experience.

🔧 Tool Spotlight: OpenAGI Lux SDK

OpenAGI's Lux is the first developer SDK that lets you build computer-use agents capable of controlling full desktop environments, not just browsers.

What makes it different: Most AI agents are confined to web browsers. Lux interprets screenshots and can navigate Excel, Slack, Adobe products, and any native application. Three modes let you choose the tradeoff between speed and complexity: Actor (one second per step, for defined tasks), Thinker (for vague multi-step goals), and Tasker (maximum control with step-by-step lists).

Best for: Developers building automation tools, QA teams who want to automate testing workflows, and anyone prototyping AI-powered desktop automation.

Pricing: Free tier available for experimentation. The SDK is open for developers at agiopen.org.

👉 Start here: Check out their GitHub for sample code, or explore the use cases on their homepage to see what's possible.
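One way to think about the three modes is as a simple decision. The sketch below is purely hypothetical: `Mode` and `pick_mode` are made-up names for illustration and are not part of the Lux SDK, whose actual API you should take from OpenAGI's own documentation.

```python
from enum import Enum
from typing import List, Optional


class Mode(Enum):
    # Descriptions paraphrase the mode summaries in this issue.
    ACTOR = "Actor"      # about one second per step, for defined tasks
    THINKER = "Thinker"  # for vague, multi-step goals
    TASKER = "Tasker"    # maximum control with a step-by-step list


def pick_mode(goal_is_well_defined: bool,
              steps: Optional[List[str]] = None) -> Mode:
    """Hypothetical helper: choose a mode from how much you already know."""
    if steps:
        # You already have an explicit step list, so keep full control.
        return Mode.TASKER
    # No step list: fast path for clear tasks, planning path for vague goals.
    return Mode.ACTOR if goal_is_well_defined else Mode.THINKER


print(pick_mode(True).value)
print(pick_mode(False).value)
print(pick_mode(True, ["Open Excel", "Export sheet as CSV"]).value)
```

The point is the tradeoff: the more of the plan you can spell out up front, the less the agent needs to reason for itself.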

👉 Try This Today: The "AI Threat Audit"

Time: 10 minutes

HP's announcement should prompt a specific question: If my company decided to "adopt AI" like HP, would my role be on the list?

Run this diagnostic:

1. Write down your three most time-consuming weekly tasks. Be specific. Not "communications" but "writing status update emails to stakeholders."
2. For each task, ask: Could an AI do 80% of this with my supervision? If yes, that's your vulnerability. If no, that's your value.
3. Identify the human judgment layer. What decisions in those tasks require context, relationships, or stakes that AI can't yet handle? That's what you double down on.

The professionals who survive AI transformation aren't the ones who learn to use AI tools. They're the ones who identify where human judgment remains irreplaceable and make themselves indispensable to that layer.

HP isn't cutting jobs because AI works perfectly. They're cutting jobs because AI works well enough.

✨ The Wire

🔗 Black Forest Labs raised $300 million at a $3.25 billion valuation for its FLUX image generation models, Europe's largest AI raise of the year. TechStartups

🔗 Databricks is in talks to raise $5 billion at a $134 billion valuation, though the company warns gross margins are slipping due to compute-intensive AI features. TechStartups

🔗 Nvidia released Alpamayo-R1, the first open reasoning vision model for autonomous driving, at NeurIPS. The model gives vehicles "common sense" for nuanced driving decisions. TechCrunch

🔗 AWS announced S3 now stores over 500 trillion objects and is increasing maximum object size 10x from 5TB to 50TB. AI training datasets can now be stored as single objects. Amazon

Neural Notes — AI that amplifies your value, not replaces it.
