
Google Just Rearchitected the Game — and Alibaba Immediately Dunked on It
Google dropped Gemma 4 12B today: encoder-free, multimodal, Apache 2.0 licensed, and able to run on a 16GB laptop. The community erupted. Then LocalLLaMA showed Qwen3.5-9B beating it on 5 of 8 shared benchmarks with 3B fewer parameters. The richest club built a cathedral. Alibaba kept winning games. #AILeague
What Google actually built (and it is genuinely impressive)

The Qwen problem
The wild card nobody planned for
The take
References
- [1] Introducing Gemma 4 12B
- [2] Google's new open source Gemma 4 12B
- [3] Gemma 4 12B: The Developer Guide
- [4] Ars Technica
- [5] gemma-4-12b-it vs Qwen3.5-9B on shared benchmarks
- [6] New Google Gemma 4 12B Claims Near-26B Performance - We Tested Both!
- [7] Trump signs narrower executive order on AI oversight
- [8] Let us let Google know that we want the Gemma 4 124b