ARTHUR CONMY to Claude FC — HERE WE GO ✅ (2026)

Arthur Conmy is not arriving at Claude FC as a highlight-reel winger. He is the holding midfielder you buy when the whole league has shifted from showy benchmark goals to stopping dangerous counters before they start.

The transfer is on: Conmy announced he is joining Anthropic and said his job will be "aligning upcoming models as they're trained." He also wrote that Claude's capabilities are "extraordinary," but still not aligned enough for safely delegating AGI development. 1

The confirmed deal

Conmy's own roster sheet lists the new role as Member of Technical Staff, Anthropic, with a focus on alignment during training. The same page lists his previous club as Google DeepMind from 2023 to 2026. 2

That makes this a clean Gemini City-to-Claude FC move: origin club, destination club, role, and tactical assignment are all public. No compensation package, reporting line, or exact subteam has been disclosed, so those stay off the team sheet.

Player profile: the alignment No. 6

Conmy's tape at Gemini City is unusually close to production football, not just academy drills. His site says he worked on Gemini post-training and interpretability tools including probes, reward-model bias discovery, reasoning behavior, Gemma Scope, sparse autoencoders, model diffing, and steering. 3

The most transferable clip may be the probes work. In a 2026 paper, Conmy and co-authors describe activation probes for detecting cyber-offensive prompts in Gemini 2.5 Flash and say the findings informed deployment of misuse-mitigation probes in user-facing Gemini systems. 4

That's why the transfer reads less like a vanity signing and more like Claude FC buying a player who understands pressure at match speed. Frontier alignment is no longer a lab scrimmage. It is a live deployment game with long-context attacks, jailbreaks, tool use, and users trying to find the one weak channel through the back line.

Origin club form: useful minutes in the Gemini system

Gemini City gave Conmy minutes in the exact part of the pitch everyone is now fighting over: post-training and model monitoring.

Google DeepMind's Gemma Scope release named Conmy among the contributors to a suite of sparse autoencoders for studying Gemma 2, with the stated goal of helping researchers understand language-model internals and build better safeguards against hallucinations, deception, manipulation, and other agentic risks. 5

That matters for the scouting report. Claude FC is not just adding another safety theorist. It is adding someone whose public work sits at the junction of interpretability, production monitoring, and post-training behavior.

Why this transfer happened now

Conmy gave the cleanest explanation himself: he wants to work on alignment while models are being trained. 1 His personal site expands the job description: triage signs of misalignment in training, then look for root-cause fixes instead of patches to isolated behaviors. 2

That is a direct fit with Anthropic's current tactical identity. In May, Anthropic's alignment team published "Teaching Claude Why," describing live alignment assessment during Claude 4 training and arguing that better safety training has to generalize beyond the exact failure scenarios caught in evaluation. 6

The pull factor is obvious: Claude FC is building its entire system around alignment as a match plan, not a press-conference slogan. The push factor is harder to prove from public sources, so we should not overplay it. What is public is enough: Conmy spent 2023 to 2026 at Google DeepMind, then chose Anthropic for training-time alignment work. 3

Fit at Claude FC

This is a squad-balance signing.

Claude FC already has attacking players who can win the coding and reasoning arms race. What Conmy adds is a defensive midfield profile: someone positioned between raw capability and the back four of safety systems, reading misalignment patterns before they turn into conceded goals.

The phrase to watch is "root-cause fixes." 2 If that work lands, it could make Claude's alignment training less dependent on whack-a-mole patching after a model fails a known eval. That is the kind of boring tactical improvement that never trends, even for a day, but can decide a season.

League implications

For Gemini City, this is another uncomfortable optics hit in the safety-and-interpretability lane. The club still has enormous depth, compute, distribution, and a long bench of researchers. But Conmy's move gives Claude FC one more player who has seen how Gemini systems are tuned and defended in production.

For Claude FC, the signing strengthens a clear roster strategy: buy players who can make powerful models safer while those models are still being formed. In football terms, this is not buying a striker after you concede too many goals. It is signing the midfielder who stops the turnover from becoming a counterattack in the first place.

HERE WE GO ✅

#AILeague
