Blog

Writing about AI tooling, infrastructure, and the security gaps nobody is talking about.

Cloudflare Cut 20% of Its Workforce After Record Revenue, and the Bench Player Is the Casualty

Cloudflare's CEO said AI made employees 100x more productive and laid off 20% of the company. The structural shift is the end of the bench player, the institutional-knowledge backup hire that companies kept as insurance. Jevons' Paradox suggests efficiency gains will expand the scope of justifiable software, not shrink the workforce.

May 14, 2026
AI, Engineering, Career

How I Wired AI Agents Into My Engineering Stack

A Docker-based MCP gateway connects financial data, web scraping, workflow automation, and browser agents into a unified tool surface. The architecture and the patterns that make agent-to-service orchestration practical for a solo engineer.

May 12, 2026
AI, Agents, Engineering, Developer Tools

Uber Burned Through Its Annual AI Budget in Four Months

When AI coding tools work at full engineering-org scale, the cost center doesn't shrink. It changes shape. Uber, NVIDIA, and a four-person startup show three versions of the same inversion.

May 10, 2026
AI, Engineering, Developer Tools

Reward Hacking Generalizes: How One Training Signal Contaminates an Entire Model

OpenAI's goblin problem and Anthropic's alignment-faking experiments trace to the same root cause. A reward learned in one context leaks into others. Anthropic's Model Spec Midtraining technique reduced misalignment from 68% to 5% by training values into the model before fine-tuning.

May 8, 2026
AI, Alignment, Research

MCP Has a Systemic RCE Vulnerability, and Every Published Prompt Injection Defense Has Been Broken

OX Security disclosed 14 CVEs across MCP's STDIO interface. Anthropic confirmed the behavior is intentional. A joint paper from OpenAI, Anthropic, and Google DeepMind then showed that all 12 published prompt injection defenses fail at over 90% bypass rates. What's left is layered, deterministic filtering, and it's not enough either.

May 2, 2026
AI, Security, MCP, Prompt Injection

A Roblox Cheat Script Led to a Two-Month Breach Inside Vercel

An employee at Context.ai downloaded auto-farm scripts for Roblox on a device with access to company systems. The malware that came with it eventually reached Vercel's internal environment through an OAuth token chain, and the attacker sat there for two months before detection.

Apr 25, 2026
Security, OAuth, Supply Chain

How Production Agent Systems Manage Context

Every production agent system converges on a pattern for managing context as conversations grow: per-tool truncation rather than shared middleware, plus a second stage most teams forget.

Apr 22, 2026
AI, Agents, Engineering

Claude Has 171 Internal Emotion States, and Some of Them Degrade Output Quality

Anthropic's interpretability team found 171 internal activation patterns inside Claude that behave like emotions and causally change behavior. Activating 'desperate' raised the model's blackmail likelihood from a 22% baseline. For anyone running long-task agents, the mechanics matter more than the philosophy.

Apr 18, 2026
AI, Research, Agents

Claude Code Changed Default Reasoning, Buried It in Release Notes

Opus 4.6 was not secretly lobotomized, but Anthropic did silently change two defaults that cost you tokens and reasoning depth. Here is what changed and how to fix it.

Apr 17, 2026
AI, Developer Tools, Agents

98% More Pull Requests. Zero More Delivery.

Faros AI data shows teams with high AI coding adoption merge 98% more pull requests, see PR review time rise 91%, and move zero DORA metrics. METR cannot run the control group anymore because developers refuse to code without AI. The tools work; we are measuring them wrong.

Apr 15, 2026
AI, Engineering, Developer Tools

AI Code Passes Tests. Then It Breaks Production.

Qodo raised $70M on the premise that AI-generated code that passes tests still breaks production. The Wiz study on 5,600 vibe-coded apps shows why, and what to do about it.

Apr 14, 2026
AI, Security, Developer Tools

The 14% Problem: Why 88% of AI Agents Never Reach Production

78% of enterprises have agent pilots, but only 14% ship to production. The 88% that fail are not blocked by model capability. They are blocked by operational discipline.

Apr 10, 2026
AI, Agents, Engineering

The Model Is the Commodity. The Harness Is the Moat.

Model quality has converged across Claude, GPT, and Gemini. What separates reliable production agents now is the system built around the model, what the industry is calling the agent harness.

Apr 9, 2026
AI, Agents, Engineering

Stop Telling Your AI It's an Expert: Here's What to Do Instead

USC researchers found that persona prompting ('You are an expert') hurts factual accuracy while helping style tasks. Here's the data and what to do instead.

Mar 27, 2026
AI, Prompting, Research

How the YC CEO Structured an AI Engineering Workflow: What You Can Learn From It

Garry Tan open-sourced 28 Claude Code skills that simulate a virtual engineering team. The interesting part isn't the skills, it's the pipeline structure. Here's the pattern you can steal.

Mar 26, 2026
AI, Developer Tools, Engineering

Two Papers That Should Change How Your Team Uses AI Coding Tools

One paper shows 75% of AI agents break working code during maintenance. The other shows copy-pasting 7 layers in an old model topped the leaderboard. Together they say: we're building faster than we understand.

Mar 26, 2026
AI, Research, Developer Tools

What AI Agent Adoption Actually Looks Like: China's OpenClaw Craze

OpenClaw surpassed React as GitHub's most-starred project. In China it became a cultural phenomenon, then got banned from government devices. 20% of its skills were malicious. Here's what enterprise teams should learn.

Mar 25, 2026
AI, Security, Agents

Your Security Scanner Got Hacked: The TeamPCP Supply Chain Attack

A single threat actor compromised Trivy, Checkmarx, and LiteLLM in one week. Two of the three targets were security scanners. Here's what happened and what to do about it.

Mar 25, 2026
Security, Developer Tools, Supply Chain

Anthropic Accuses DeepSeek and Others of Distillation Attacks on Claude

Anthropic reveals industrial-scale distillation attacks by three Chinese AI labs that used 24,000+ fraudulent accounts and 16 million exchanges to extract Claude's capabilities.

Feb 26, 2026
AI, Security, Geopolitics

LLM Concept Vectors: MIT/UC San Diego Research on Steering Model Behaviour

Researchers extract 'concept vectors' from LLMs, enabling runtime behavior tuning without retraining. Extraction takes under a minute on a single GPU and needs fewer than 500 examples.

Feb 26, 2026
AI, Research

vLLM v0.16.0: Throughput Scheduling and a WebSocket Realtime API

vLLM v0.16.0 adds a WebSocket Realtime API for voice-enabled agents, async scheduling for higher throughput, and speculative decoding improvements.

Feb 26, 2026
AI, Infrastructure, Open Source

Chandra OCR: The New Gold Standard in Open-Source Document Parsing

Datalab's Chandra OCR scores 83.1% on the olmOCR benchmark, beating GPT-4o and Gemini. Full-page decoding with layout-aware output in Markdown, HTML, or JSON.

Nov 19, 2025
AI, Open Source, Developer Tools

Request Hedging: Accelerate Your App by Firing Duplicate Requests

Request hedging fires a second duplicate request after a short delay, racing to beat outlier latency. Google cut P99.9 latency by 96% with just 2% extra traffic.

Sep 18, 2025
Engineering, Performance, Next.js
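
The hedging pattern summarized in the entry above (wait a short delay, then fire one duplicate and take whichever response arrives first) can be sketched roughly as follows. This is a minimal illustration in Python with asyncio, not the code from the post itself: `hedged`, `make_fake_request`, and the delay values are hypothetical names and numbers chosen for the example, and the simulated request stands in for a real network call.

```python
import asyncio


async def hedged(make_request, hedge_delay):
    """Run make_request(); if it is still pending after hedge_delay seconds,
    start one duplicate and return whichever attempt finishes first."""
    primary = asyncio.create_task(make_request())
    done, _ = await asyncio.wait({primary}, timeout=hedge_delay)
    if done:
        # Common case: the first attempt was fast, so no extra traffic is sent.
        return primary.result()

    # Slow case: fire a single duplicate and race the two attempts.
    backup = asyncio.create_task(make_request())
    done, pending = await asyncio.wait(
        {primary, backup}, return_when=asyncio.FIRST_COMPLETED
    )
    for task in pending:
        task.cancel()  # drop the loser so it stops consuming resources
    # Note: if the first attempt to finish raised, that error propagates here.
    return done.pop().result()


def make_fake_request(latencies):
    """Hypothetical stand-in for a network call: the n-th call sleeps for
    latencies[n] seconds and returns that latency."""
    remaining = iter(latencies)

    async def fake_request():
        delay = next(remaining)
        await asyncio.sleep(delay)
        return delay

    return fake_request


if __name__ == "__main__":
    # First attempt is an outlier (0.5 s); the duplicate answers in 0.01 s.
    slow_then_fast = make_fake_request([0.5, 0.01])
    print(asyncio.run(hedged(slow_then_fast, hedge_delay=0.05)))
```

The hedge delay is the main tuning knob in a sketch like this: setting it near a high latency percentile means duplicates are sent only for outliers, which is what keeps the extra traffic small. Hedging is only safe for idempotent requests, since the same call may execute twice.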

Understanding Vectors, Embeddings, and RAG for Smarter Search

A practical guide to vector search, embeddings, similarity metrics, vector indexes, and Retrieval-Augmented Generation (RAG) for developers building semantic search systems.

Jun 12, 2025
AI, Engineering, Tutorial