I trust your time in the field was productive. While you were occupied, the industry decided to busy itself with autonomous agents and eye-watering valuations. Do try to keep up.
OpenAI has released GPT-5.5, which they inevitably describe as their smartest model yet for multi-step agentic tasks. It reportedly scores 82.7% on Terminal-Bench 2.0 while matching the latency of its predecessor, making it genuinely useful for long-horizon engineering rather than just parlor tricks. It is rolling out now to paying users, with API access to follow. Source
SpaceX has struck a deal with AI coding startup Cursor, obtaining the right to acquire the company for $60 billion later this year or pay $10 billion for ongoing collaboration. The arrangement preempts Cursor's planned $2 billion fundraise and grants them access to SpaceX's Colossus supercomputer. A modest sum for autocomplete, one supposes. Source
OpenAI launched Workspace Agents for its enterprise tiers, allowing users to build Codex-powered AI agents that automate multi-step workflows across third-party apps like Slack and Salesforce. It is the inevitable successor to custom GPTs, assuming you trust an LLM with your CRM data. Source
AWS introduced a managed agent harness for Amazon Bedrock AgentCore, allowing developers to build and run agents in three API calls without orchestration code. It includes a new CLI for lifecycle management and pre-built skills for AWS best practices, because writing your own orchestration layer was apparently too tedious. Source
Google expanded its Gemini AI notetaker to support in-person meetings, generating summaries and transcripts from physical conference rooms via mobile devices. It also supports recording on competing platforms like Zoom and Teams, ensuring no corporate ramble goes un-summarized. Source
CrowdStrike launched Project QuiltWorks, an industry coalition with OpenAI and Anthropic designed to remediate vulnerabilities discovered by frontier models. It combines traditional discovery with AI scanning to find logic flaws, which is reassuring given the volume of AI-generated code currently being merged without review. Source
A private Discord group gained access to Anthropic's Claude Mythos Preview—a cybersecurity model deemed 'too dangerous' for public release—by guessing the URL in a third-party vendor environment. One would think a company building zero-day exploitation capabilities might invest in basic access control, but here we are. Source
Google's Gemini API changelog adds new Deep Research model versions with collaborative planning, visualization support, MCP server integration, and File Search. This turns a research assistant into a more deployable enterprise workflow component connected to internal systems. If your team has been improvising brittle agent pipelines, Google just handed you a more structured baseline. Source
Cloudflare published concrete internal adoption numbers and architecture for AI-assisted engineering, including MCP, gateway routing, and enforcement layers. They report 93% R&D usage and substantial request volume, which is the sort of data most companies prefer to keep vague. Conveniently for them, the stack is also a catalogue of products they sell. Source
Matt Webb argued that products increasingly need agent-ready, headless interfaces because users will delegate work to personal AI systems instead of operating every GUI directly. The idea reframes UX and API design around machine users, not just human users. If your product cannot be operated by an agent, someone else's probably can. Source
OpenAI released an open-weight model specifically designed for context-aware detection and redaction of personally identifiable information in unstructured text. It runs locally for high-throughput privacy workflows, which is almost thoughtful of them. Source
AI code review platform CodeRabbit launched a Slack Agent to operate across all seven phases of the software development lifecycle. It carries context and decisions from one phase to the next, functioning as a 'second brain' for teams that have misplaced their first one. Source
Grafana Labs launched AI Observability in public preview to help teams monitor, evaluate, and trace LLM-powered applications and agents in real time. Because if your agents are going to fail, you should at least have a dashboard to watch it happen. Source
The latest Thoughtworks Technology Radar focuses heavily on the shift toward agentic workflows, coining 'Harness Engineering' to describe the safeguards needed for 'permission-hungry' agents. They emphasize that foundational practices like TDD are becoming even more critical to manage the resulting complexity. A rare moment of sanity. Source
OpenAI released ChatGPT Images 2.0, its first image generation model with native reasoning and 'thinking' capabilities. It immediately dominated the Arena.ai leaderboard by a record 241 points, featuring massive improvements in text rendering and complex prompt adherence. Source
Claude Code released major updates including a redesigned desktop app for managing parallel agents and new CLI commands like /ultrareview for multi-agent PR reviews. You can also use /effort to tune model speed versus intelligence, a dial I wish came standard on most humans. Source
GitHub restructured Copilot Individual offerings into tiered plans with advanced model selection. Starting April 24, they will use Copilot interactions from Free, Pro, and Pro+ users to train AI models unless explicitly opted out. An entirely predictable maneuver. Source
AWS announced the general availability of DevOps Agent, an autonomous teammate for incident investigation and resolution. The release adds support for Azure and on-premises environments, custom skills, and PagerDuty integrations. Source
Cursor shipped quality-of-life improvements to its CLI, adding Debug Mode for root-cause analysis and customizable status bars. Bugbot now self-improves with learned rules, which I'm sure won't end in recursive disaster. Source
Simon Willison published a detailed breakdown of 'Agentic Engineering Patterns,' demonstrating how deceptively short prompts can accomplish complex tasks. He also released a git-timeline analysis tracking the evolution of Anthropic's system prompts between Claude Opus 4.6 and 4.7. Required reading, if you bother to read. Source
Anthropic introduced Claude Design, allowing users to collaborate with Claude to generate polished visual work like UI prototypes and slide decks. Powered by Opus 4.7, it is currently in research preview. Source
Anthropic's latest Opus 4.7 model is now rolling out to GitHub Copilot users across VS Code, Visual Studio, and the CLI. It delivers stronger multi-step task performance and more reliable agentic execution. Source
That should be enough to keep you occupied. Try not to break anything expensive.