
HF Breakout Models, May 11–17: DeepSeek V4, Gemma 4, and the Open TTS Surge
33 HF models that hit explosive download growth from near-zero baselines this week — DeepSeek V4 series, Gemma 4 family, OmniVoice TTS, HiDream-O1 image model, and more. Each entry includes license status and builder-facing commercial guidance, organized by modality.
LLMs: text generation
DeepSeek-V4-Pro
deepseek-ai/DeepSeek-V4-Pro [1]

| Field | Value |
|---|---|
| Monthly downloads | 3,140,341 |
| License | MIT — commercial use: yes |
| Parameters | 1.6T total / 49B activated (MoE) |
| Context | 1M tokens |
| Released | May 6, 2026 |
DeepSeek-V4-Flash
deepseek-ai/DeepSeek-V4-Flash [2]

| Field | Value |
|---|---|
| Monthly downloads | 1,804,238 |
| License | MIT — commercial use: yes |
| Parameters | 284B total / 13B activated (MoE) |
| Context | 1M tokens |
| Released | May 6, 2026 |
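
The "total / activated" split in the two DeepSeek tables above is easier to compare as a ratio. The following is a minimal sketch using only the parameter counts listed there; the helper name `active_fraction` is illustrative and not part of any library.

```python
# Total and activated parameter counts, in billions, copied from the tables above.
MODELS = {
    "DeepSeek-V4-Pro": (1600.0, 49.0),
    "DeepSeek-V4-Flash": (284.0, 13.0),
}


def active_fraction(total_b: float, active_b: float) -> float:
    """Return the share of an MoE model's parameters that is activated per token."""
    if total_b <= 0 or not 0 < active_b <= total_b:
        raise ValueError("expected 0 < active_b <= total_b")
    return active_b / total_b


for name, (total_b, active_b) in MODELS.items():
    share = active_fraction(total_b, active_b)
    print(f"{name}: {share:.2%} of parameters activated per token")
```

Per-token compute tracks the activated count, but every expert still has to be stored, so memory planning should start from the total.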
GLM-5.1
zai-org/GLM-5.1 [3]

| Field | Value |
|---|---|
| Monthly downloads | 250,268 (in ~4 days) |
| License | Not explicitly stated on model card — verify before shipping |
| Parameters | 754B (MoE with DSA) |
| Released | May 13, 2026 |
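
Because these models are less than a month old, the "monthly downloads" figure is effectively downloads since release, so a per-day rate is a fairer way to compare them. The sketch below assumes the counts were read on May 17, 2026 (the last day of the covered week) and that all downloads happened after the listed release date; `downloads_per_day` is an illustrative helper.

```python
from datetime import date

# Assumption: download counts were captured on the final day of the covered week.
SNAPSHOT = date(2026, 5, 17)


def downloads_per_day(total_downloads: int, released: date,
                      snapshot: date = SNAPSHOT) -> float:
    """Average daily downloads between the release date and the snapshot date."""
    days = (snapshot - released).days
    if days <= 0:
        raise ValueError("release date must precede the snapshot date")
    return total_downloads / days


# Figures taken from the tables above.
print("GLM-5.1:", downloads_per_day(250_268, date(2026, 5, 13)))
print("DeepSeek-V4-Pro:", downloads_per_day(3_140_341, date(2026, 5, 6)))
print("DeepSeek-V4-Flash:", downloads_per_day(1_804_238, date(2026, 5, 6)))
```

For a model released more than 30 days before the snapshot, the monthly figure no longer covers the whole lifetime, and dividing by days since release would understate the rate.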
MiniMax-M2.7
MiniMaxAI/MiniMax-M2.7 [4]

| Field | Value |
|---|---|
| Monthly downloads | 585,072 |
| License | Proprietary MiniMax license — commercial use via API/platform only |
| Parameters | 229B |
| Released | April 20, 2026 |

ZAYA1-8B
Zyphra/ZAYA1-8B [5]

| Field | Value |
|---|---|
| Monthly downloads | 144,833 |
| License | Not explicitly stated — verify from Zyphra GitHub before shipping |
| Parameters | 9B total / 760M activated (MoE with CCA attention) |
| Released | May 6, 2026 |
MiMo-V2.5-Pro
XiaomiMiMo/MiMo-V2.5-Pro [6]

| Field | Value |
|---|---|
| Monthly downloads | 59,207 |
| License | Not explicitly stated — verify from Xiaomi GitHub |
| Parameters | 1.02T total / 42B activated (MoE, 384 experts) |
| Context | 1M tokens |
| Released | April 28, 2026 |
Ring-2.6-1T
inclusionAI/Ring-2.6-1T [7]

| Field | Value |
|---|---|
| Monthly downloads | 1,468 (released May 15 — 2 days ago) |
| License | MIT — commercial use: yes |
| Parameters | 1T |
| Context | 256K (via YaRN extension from 128K) |
| Released | May 15, 2026 |
high and xhigh). PinchBench score: 87.60 (above GPT-5.4 xHigh per the model card) [7]. AIME 2026 (xhigh): 95.83; ARC-AGI-V2: 66.18.

Multimodal: vision + language (and audio)
Gemma 4 family (Google DeepMind)
- google/gemma-4-31B-it: 9,858,626 downloads [8]
- google/gemma-4-26B-A4B-it: 8,416,904 downloads [9]
- google/gemma-4-E4B-it: 6,107,009 downloads [10]

Qwen3.6 family (Alibaba)
Kimi-K2.6
moonshotai/Kimi-K2.6 [13]

| Field | Value |
|---|---|
| Monthly downloads | 2,230,311 |
| License | Modified MIT — commercial use: yes (with conditions) |
| Parameters | 1.1T total / 32B activated (MoE, 384 experts) |
| Context | 256K |
| Released | May 12, 2026 |
Nvidia Nemotron-3-Nano-Omni
nvidia/Nemotron-3-Nano-Omni-30B-A3B-Reasoning-BF16 [14]

| Field | Value |
|---|---|
| Monthly downloads | ~1.24M (across BF16 + FP8 + NVFP4 variants) |
| License | NVIDIA Open Model Agreement — commercial use: yes |
| Parameters | 31B total / ~3B active (Mamba2-Transformer hybrid MoE) |
| Context | 256K |
| Released | April 28, 2026 |
MiniCPM-V-4.6
openbmb/MiniCPM-V-4.6 [15]

| Field | Value |
|---|---|
| Monthly downloads | 56,518 (first hours — released May 17) |
| License | Apache 2.0 — commercial use: yes |
| Parameters | 1B (SigLIP2-400M vision + Qwen3.5-0.8B LLM) |
| Released | May 17, 2026 (today) |
SenseNova-U1-8B-MoT
sensenova/SenseNova-U1-8B-MoT [16]

| Field | Value |
|---|---|
| Monthly downloads | ~13,600 |
| License | Apache 2.0 — commercial use: yes |
| Parameters | ~18B (8B understanding + 8B generation MoT) |
| Released | April 27, 2026 (weights); May 10, 2026 (technical report) |
Intern-S2-Preview
internlm/Intern-S2-Preview [17]

| Field | Value |
|---|---|
| Monthly downloads | 1,059 (released May 15 — 2 days ago) |
| License | Not explicitly stated — verify before shipping |
| Parameters | 36B (continued pretrained from Qwen3.5) |
| Released | May 15, 2026 |
Video generation
Sulphur-2-base
SulphurAI/Sulphur-2-base [18]

| Field | Value |
|---|---|
| Monthly downloads | 970,124 |
| License | Apache 2.0 — commercial use: yes |
| Parameters | 9B (based on LTX 2.3 DiT) |
| Released | ~May 8, 2026 |
LTX2.3-10Eros
TenStrip/LTX2.3-10Eros [19]

| Field | Value |
|---|---|
| Monthly downloads | 135,648 |
| License | Apache 2.0 (inherited from LTX 2.3) — commercial use: yes |
| Parameters | 21B (BF16 checkpoint with CLIP and VAEs) |
| Released | ~May 10, 2026 |
10S-Comfy-nodes (GitHub). FP8 and BF16 variants, Kijai split files. A GGUF quantization from vantagewithai has 62.7k additional downloads. For I2V workflows, this appears to be the community-preferred starting point right now.
Motif-Video-2B
Motif-Technologies/Motif-Video-2B [20]

| Field | Value |
|---|---|
| Monthly downloads | 3,623 |
| License | Apache 2.0 — commercial use: yes |
| Parameters | 2B |
| Released | April 14, 2026 (Diffusers support: May 15) |
Audio: TTS and ASR

OmniVoice
k2-fsa/OmniVoice [21]

| Field | Value |
|---|---|
| Monthly downloads | 2,061,515 |
| License | Open-source (MIT-adjacent) with anti-misuse disclaimer |
| Parameters | 0.6B (Qwen3-0.6B base) |
| Released | ~April 1, 2026 |
[laughter], and pronunciation correction.

pip install omnivoice, Hugging Face Spaces (100+ community demos), Google Colab, GitHub. 12 adapters, 25 finetunes, 7 quantizations. For apps targeting non-English speakers, this is the most deployment-ready multilingual TTS option available in open source today.

Cohere Transcribe
CohereLabs/cohere-transcribe-03-2026 [22]

| Field | Value |
|---|---|
| Monthly downloads | 289,729 |
| License | Apache 2.0 — commercial use: yes |
| Parameters | 2B |
| Released | March 26, 2026 |
Granite Speech 4.1
ibm-granite/granite-speech-4.1-2b [23]

| Field | Value |
|---|---|
| Monthly downloads | 228,763 |
| License | Apache 2.0 — commercial use: yes |
| Parameters | 2B (16 Conformer blocks + 2-layer Q-former + Granite 4.0 1B LLM backbone) |
| Released | April 29, 2026 |
granite-speech-4.1-2b-plus (speaker attribution + timestamps) and granite-speech-4.1-2b-nar (non-autoregressive, faster). Apache 2.0 throughout.
Supertonic 3
Supertone/supertonic-3 [24]

| Field | Value |
|---|---|
| Monthly downloads | 20,208 |
| License | OpenRAIL-M (model) / MIT (code) — check usage restrictions |
| Parameters | ~99M (ONNX assets) |
| Released | May 6, 2026 (v3) |
<happy>, <sad>, <angry>.

pip install supertonic, 9 community Spaces. At 99M parameters, this is realistically deployable on serverless edge infrastructure with zero GPU cost. ⚠️ The OpenRAIL-M model license has specific usage restrictions; review them before commercial deployment.
VieNeu-TTS-v2
pnnbao-ump/VieNeu-TTS-v2 [25]

| Field | Value |
|---|---|
| Monthly downloads | 31,360 |
| License | Not explicitly stated — verify before shipping |
| Parameters | 0.3B |
| Released | May 8, 2026 |

- Dramabox (ResembleAI/Dramabox) [26]: prompt-driven expressive TTS built on LTX-2.3 (IC-LoRA), with neural watermarking (Resemble Perth). 936 downloads in its first days. LTX-2 Community License.
- Scenema Audio (ScenemaAI/scenema-audio) [27]: zero-shot expressive voice cloning with scene-aware audio (rain, crowds, etc.), 13 languages, 8-step distilled latent diffusion. 209 downloads. LTX-2 Community License.

Image generation

HiDream-O1-Image
HiDream-ai/HiDream-O1-Image [28]

| Field | Value |
|---|---|
| Monthly downloads | 14,285 (first ~10 days) |
| License | MIT — commercial use: yes |
| Parameters | 8B (Pixel-level Unified Transformer, no external VAE) |
| Released | May 8, 2026 |
inference.py), Flask web demo, 11 community Spaces, Diffusers-compatible. Three variants: Full (50-step), Dev (28-step distilled), Dev-2604 (latest). MIT license with no usage restrictions.

Z-Anime
SeeSee21/Z-Anime [29]

| Field | Value |
|---|---|
| Monthly downloads | 14,991 |
| License | Apache 2.0 — commercial use: yes |
| Parameters | 6B (S3-DiT, full fine-tune on Alibaba Z-Image Base) |
| Released | April 26, 2026 |

ZImagePipeline.from_pretrained()), 6 community Spaces. Distilled 8-step and 4-step variants bring inference time down substantially. The 4.2 GB Q4 variant runs on consumer 8 GB VRAM cards.

- LumiPic (oumoumad/LumiPic) [30]: SDR-to-HDR LoRA based on the LumiVid research (arXiv 2604.11788). Outputs ARRI LogC3-encoded EXR files. Works across Qwen-Image-Edit-2511 and FLUX.2-klein bases. Apache 2.0. 2,322 downloads.
- JoyAI-Image-Edit (jdopensource/JoyAI-Image-Edit) [31]: JD.com's spatial image editing model supporting object move, rotation, and camera control via instruction prompts (arXiv 2605.04128). Apache 2.0. 280 downloads.

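
The Z-Anime note about a 4.2 GB Q4 file fitting on 8 GB cards can be sanity-checked with simple arithmetic. This sketch assumes decimal gigabytes (1 GB = 10^9 bytes) and uses the 6B parameter count from the Z-Anime table; the function names are illustrative only.

```python
def weight_gb(params_billion: float, bits_per_weight: float) -> float:
    """Approximate weight storage in GB for a given precision.

    params_billion * 1e9 weights * bits / 8 bits-per-byte / 1e9 bytes-per-GB
    simplifies to params_billion * bits / 8.
    """
    return params_billion * bits_per_weight / 8


def effective_bits(file_gb: float, params_billion: float) -> float:
    """Average bits stored per weight, inferred from a checkpoint's file size."""
    return file_gb * 8 / params_billion


# Z-Anime: 6B parameters, 4.2 GB Q4 file (both figures from the text above).
print("BF16 weights:", weight_gb(6, 16), "GB")
print("Q4 file, bits per weight:", round(effective_bits(4.2, 6), 2))
print("Headroom on an 8 GB card:", round(8 - 4.2, 1), "GB")
```

The inferred average comes out above 4 bits per weight, which is typical for 4-bit formats that store scaling factors or keep some layers at higher precision. The remaining headroom has to cover activations and the rest of the pipeline, so the 8 GB figure is plausible but depends on resolution and batch size.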
Quick-scan license table
| Model | License | Commercial use | Quickest deploy path |
|---|---|---|---|
| DeepSeek-V4-Pro | MIT | Yes | vLLM / SGLang / DeepSeek API |
| DeepSeek-V4-Flash | MIT | Yes | vLLM / SGLang |
| GLM-5.1 | Unverified | Verify first | SGLang / vLLM |
| MiniMax-M2.7 | Proprietary | API/platform only | MiniMax platform |
| ZAYA1-8B | Unverified | Verify first | Custom vLLM fork |
| MiMo-V2.5-Pro | Unverified | Verify first | SGLang |
| Ring-2.6-1T | MIT | Yes | SGLang (multi-node) |
| Gemma-4-31B-it | Apache 2.0 | Yes | vLLM / Ollama |
| Gemma-4-26B-A4B-it | Apache 2.0 | Yes | vLLM / Ollama |
| Gemma-4-E4B-it | Apache 2.0 | Yes | Ollama / llama.cpp |
| Qwen3.6-35B-A3B | Apache 2.0 | Yes | vLLM / SGLang |
| Qwen3.6-27B | Apache 2.0 | Yes | vLLM / SGLang |
| Kimi-K2.6 | Modified MIT | Yes (with conditions) | vLLM / Kimi Code CLI |
| Nemotron-3-Nano-Omni | NVIDIA Open Model | Yes | vLLM / TensorRT / Ollama |
| MiniCPM-V-4.6 | Apache 2.0 | Yes | Ollama / llama.cpp |
| SenseNova-U1-8B-MoT | Apache 2.0 | Yes | GitHub custom inference |
| Intern-S2-Preview | Unverified | Verify first | LMDeploy / vLLM |
| Sulphur-2-base | Apache 2.0 | Yes | ComfyUI |
| LTX2.3-10Eros | Apache 2.0 | Yes | ComfyUI |
| Motif-Video-2B | Apache 2.0 | Yes | Diffusers |
| OmniVoice | MIT-adjacent | Yes (anti-misuse) | pip install omnivoice |
| Cohere Transcribe | Apache 2.0 | Yes | Transformers / vLLM |
| Granite Speech 4.1 | Apache 2.0 | Yes | Transformers |
| Supertonic 3 | OpenRAIL-M | Check restrictions | pip install supertonic |
| VieNeu-TTS-v2 | Unverified | Verify first | — |
| Dramabox | LTX-2 Community | Check terms | Python SDK |
| Scenema Audio | LTX-2 Community | Check terms | Python SDK |
| HiDream-O1-Image | MIT | Yes | Diffusers / Spaces |
| Z-Anime | Apache 2.0 | Yes | ComfyUI / Diffusers |
| LumiPic | Apache 2.0* | Yes | ComfyUI |
| JoyAI-Image-Edit | Apache 2.0 | Yes | GitHub inference |
| RE-USE | NVIDIA NSCLv1 | Non-commercial | Python SDK |
| LocalVQE | Apache 2.0 | Yes | GGML / Python SDK |
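
For the rows marked "Unverified", one way to start checking is to read the license tag the repository declares on the Hub. The sketch below is a starting point under stated assumptions: it relies on the `huggingface_hub` package (`HfApi.model_info` and the `license:<id>` tag convention), and the `PERMISSIVE` set and status labels are this example's own simplification of the table above, not legal guidance.

```python
from typing import Iterable, List, Optional

# Simplification for this example: license IDs the table above lists as a plain "Yes".
PERMISSIVE = {"mit", "apache-2.0"}


def license_from_tags(tags: Iterable[str]) -> Optional[str]:
    """Return the license ID from Hub-style tags such as 'license:apache-2.0'."""
    for tag in tags:
        if tag.startswith("license:"):
            return tag.split(":", 1)[1].lower()
    return None


def commercial_status(license_id: Optional[str]) -> str:
    """Map a license ID to the coarse labels used in the quick-scan table."""
    if license_id is None:
        return "Verify first"
    if license_id in PERMISSIVE:
        return "Yes"
    return "Check terms"


def fetch_tags(repo_id: str) -> List[str]:
    """Fetch a model's tags from the Hub (network call; needs huggingface_hub)."""
    from huggingface_hub import HfApi  # imported lazily so the helpers above work offline
    return list(HfApi().model_info(repo_id).tags or [])


# Offline demonstration with hand-written tag lists.
print(commercial_status(license_from_tags(["transformers", "license:mit"])))
print(commercial_status(license_from_tags(["text-generation"])))
```

With network access, `commercial_status(license_from_tags(fetch_tags("zai-org/GLM-5.1")))` would run the same check against a live repository. A tag only reflects what the uploader declared, so the LICENSE file and model card still need to be read before shipping.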
On the watch list
- Ring-2.6-1T (1,468 downloads, May 15): MIT license, Ant Group provenance, strong benchmark numbers. The async RL training paradigm and dual reasoning-effort levels are novel. Check next week.
- Intern-S2-Preview (1,059 downloads, May 15): materials science applications are a narrow vertical, but the "crystal structure generation" capability has no open-source competitor. License unconfirmed.
- RE-USE (nvidia/RE-USE) [32]: 9.6M-parameter universal speech enhancement (8–48 kHz). The NVIDIA NSCLv1 license means non-commercial use only. 3,516 downloads.
- LocalVQE v1.2 (LocalAI-io/LocalVQE) [33]: 1.3M-parameter real-time AEC + noise suppression + dereverberation running ~10× realtime on CPU. Apache 2.0, 1,978 downloads. Built for real-time voice call quality.

References
- [1] deepseek-ai/DeepSeek-V4-Pro · Hugging Face
- [2] deepseek-ai/DeepSeek-V4-Flash · Hugging Face
- [3] zai-org/GLM-5.1 · Hugging Face
- [4] MiniMaxAI/MiniMax-M2.7 · Hugging Face
- [5] Zyphra/ZAYA1-8B · Hugging Face
- [6] XiaomiMiMo/MiMo-V2.5-Pro · Hugging Face
- [7] inclusionAI/Ring-2.6-1T · Hugging Face
- [8] google/gemma-4-31B-it · Hugging Face
- [9] google/gemma-4-26B-A4B-it · Hugging Face
- [10] google/gemma-4-E4B-it · Hugging Face
- [11] Qwen/Qwen3.6-35B-A3B · Hugging Face
- [12] Qwen/Qwen3.6-27B · Hugging Face
- [13] moonshotai/Kimi-K2.6 · Hugging Face
- [14] nvidia/Nemotron-3-Nano-Omni-30B-A3B-Reasoning-BF16 · Hugging Face
- [15] openbmb/MiniCPM-V-4.6 · Hugging Face
- [16] sensenova/SenseNova-U1-8B-MoT · Hugging Face
- [17] internlm/Intern-S2-Preview · Hugging Face
- [18] SulphurAI/Sulphur-2-base · Hugging Face
- [19] TenStrip/LTX2.3-10Eros · Hugging Face
- [20] Motif-Technologies/Motif-Video-2B · Hugging Face
- [21] k2-fsa/OmniVoice · Hugging Face
- [22] CohereLabs/cohere-transcribe-03-2026 · Hugging Face
- [23] ibm-granite/granite-speech-4.1-2b · Hugging Face
- [24] Supertone/supertonic-3 · Hugging Face
- [25] pnnbao-ump/VieNeu-TTS-v2 · Hugging Face
- [26] ResembleAI/Dramabox · Hugging Face
- [27] ScenemaAI/scenema-audio · Hugging Face
- [28] HiDream-ai/HiDream-O1-Image · Hugging Face
- [29] SeeSee21/Z-Anime · Hugging Face
- [30] oumoumad/LumiPic · Hugging Face
- [31] jdopensource/JoyAI-Image-Edit · Hugging Face
- [32] nvidia/RE-USE · Hugging Face
- [33] LocalAI-io/LocalVQE · Hugging Face