05-16 AI News Daily

Daily Summary

ByteDance's Doubao hits 345M MAU yet is the first to charge, starting at 68 yuan. China's free AI era may be ending.
From Feishu CLI to Kimi WebBridge, AI is shifting from "answering questions" to "doing the work"—agentic AI is today's strongest signal.
Want to understand how this wave captures users and reshapes toolchains? Today's issue is worth opening.

Today’s AI News

👀 One-Liner

China’s AI products are quietly completing an identity shift—from “chatbots” to “work-doing agents.” Today’s signals are unmistakable.

🔑 3 Keywords

#AgenticAIWave #ChinaAIPaidEra #DeveloperEcosystemReshape

🔥 Top 10 Headlines

1. Doubao Hits 345M MAU, Yet It’s the Most Anxious One

345M monthly active users: that’s more than China’s #2 through #4 AI products combined. By that logic, a product at this scale should be the most relaxed. Instead, Doubao is the first to charge aggressively, starting at 68 yuan, which is not cheap.

There’s a counterintuitive logic worth unpacking: usually it’s the laggards scrambling to monetize. Doubao flipped the script. 36Kr’s reporting offers an answer: bigger scale means bigger burn and bigger pressure. The MAU numbers look great, but the real battlefield is compute costs, retention anxiety, and commercialization pressure.

China’s free AI era might actually be ending.

2. WeChat Mini Programs Officially Integrate Hy3 Preview; Moonshot Releases Kimi WebBridge

Two announcements, one direction.

WeChat’s “Growth Plan” now supports Hy3 preview—developers can build AI assistants and content automation in mini programs with a stronger foundation and lower technical barriers. Meanwhile, Moonshot launches Kimi WebBridge—AI that directly operates your browser, not just “answering questions” but actually clicking buttons, filling forms, running workflows.

One lowers dev barriers, one expands execution scope. China’s AI is shifting from “chatbox” to “operator.”

3. Feishu CLI Hits 10,000 Stars in One Month

One month, 10,000 stars. Developers have cast their vote.

Feishu CLI’s logic is disruptive: it exposes all Feishu capabilities to the command line. You completely bypass the complex UI and just talk to the CLI to get everything done. For agents, this is tailor-made—no click simulation, no UI parsing, just direct capability calls.

The competitive dimension for traditional SaaS just shifted. It used to be about prettier UI and smoother interactions; now it’s about deeper agent adaptation and broader coverage. Feishu moved early on this.
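To make the "direct capability calls" idea concrete, here is a minimal sketch of how an agent might wrap a command-line tool as a callable function. It does not use real Feishu CLI commands: the helper `run_cli_tool` and the stand-in command `DEMO_ARGV` are hypothetical, and the sketch assumes the tool prints JSON to stdout, which may not match the actual CLI.

```python
import json
import subprocess
import sys


def run_cli_tool(argv, timeout=30):
    """Run a CLI command and return its stdout parsed as JSON.

    Raises RuntimeError if the command exits with a non-zero status.
    """
    result = subprocess.run(argv, capture_output=True, text=True, timeout=timeout)
    if result.returncode != 0:
        raise RuntimeError(
            f"command failed ({result.returncode}): {result.stderr.strip()}"
        )
    return json.loads(result.stdout)


# Stand-in for a real CLI (hypothetical): a Python one-liner that prints JSON.
# A real agent would pass the argv of the actual tool here instead.
DEMO_ARGV = [
    sys.executable,
    "-c",
    "import json; print(json.dumps({'status': 'ok', 'items': 2}))",
]

if __name__ == "__main__":
    print(run_cli_tool(DEMO_ARGV))
```

The point of the sketch is the shape of the integration: structured output in, structured data out, with no screenshot parsing or click simulation in between.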

4. everything-claude-code: Agentic Framework Performance Optimization System

183,474 stars and climbing fast today.

This project optimizes performance for Claude Code, Codex, Cursor, and other mainstream AI coding tools across skills, memory, safety, and research prioritization. Simple version: when you use these tools to code, this helps your agent run faster, more stably, and smarter.

For developers heavily reliant on AI coding tools, this isn’t nice-to-have—it’s real productivity gains. #1 on GitHub Trending, worth bookmarking.

5. GPT Codex Remote Control: Multi-Device Linking

This feature is buried deep—most people don’t even know it exists.

OpenAI’s Codex supports remote multi-device control: hit “Set Up Codex Mobile” in Codex on another device to link the two. Phone and computer can then run tasks simultaneously and collaborate.

Sounds like a small feature, but the signal behind it: Codex is shifting from “single-machine coding assistant” to “cross-device agent platform.” Today it’s two devices, tomorrow maybe your entire workflow.

6. Gemini: Is One Picture Really Worth a Thousand Words?

David Maliglowka ran an experiment with Gemini, live on Gemini Discord: feed an image to Gemini, have it restore the image’s information in language, then compare.

Results are interesting—images do have higher information density in some scenarios, but Gemini’s language description ability is surprisingly strong. The value isn’t just “cool”—it’s testing the ceiling of multimodal understanding.

Google posted this to their official account, which suggests they’re confident in the results.

7. Codepilot New Preview: Codex Support as Agent Engine Coming Soon

Codepilot is plugging Codex in as its agent engine.

What does that mean? Codepilot used to be a code assistant; with Codex, it can autonomously plan tasks, break down steps, execute code—from “help you write” to “do it for you.”

AI coding tools are entering the next arms race phase: not whose autocomplete is more accurate, but whose agent is stronger and more autonomous. Codepilot’s upgrade is direct competition with Cursor and Windsurf.

8. SANA-WM: 2.6B Open-Source World Model, Native One-Minute Video Generation

Generate one-minute, 720p video with precise camera control—only 2.6B parameters.

SANA-WM’s breakthrough is a hybrid linear attention mechanism that combines frame-level processing with temporal modeling. Its efficiency beats industrial-grade models like LingBot-World and HY-WorldPlay, while visual quality stays competitive.
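For readers unfamiliar with linear attention in general, the sketch below shows the generic kernel-based form, not SANA-WM's actual hybrid architecture, whose details are not given here. The feature map `elu(x) + 1` and the function names are assumptions chosen for illustration. The key idea is that keys and values are summarized once, so cost grows linearly with sequence length instead of quadratically.

```python
import math


def feature_map(x):
    """elu(x) + 1: a positive feature map often used in linear attention."""
    return x + 1.0 if x > 0 else math.exp(x)


def linear_attention(queries, keys, values):
    """Non-causal linear attention over lists of vectors.

    Builds S = sum_j phi(k_j) v_j^T and z = sum_j phi(k_j) once,
    then reuses them for every query.
    """
    d_k, d_v = len(keys[0]), len(values[0])
    S = [[0.0] * d_v for _ in range(d_k)]
    z = [0.0] * d_k
    for k, v in zip(keys, values):
        phi_k = [feature_map(x) for x in k]
        for a in range(d_k):
            z[a] += phi_k[a]
            for b in range(d_v):
                S[a][b] += phi_k[a] * v[b]

    outputs = []
    for q in queries:
        phi_q = [feature_map(x) for x in q]
        norm = sum(phi_q[a] * z[a] for a in range(d_k))
        outputs.append(
            [sum(phi_q[a] * S[a][b] for a in range(d_k)) / norm for b in range(d_v)]
        )
    return outputs


def quadratic_attention(queries, keys, values):
    """Same kernel computed pairwise (O(n^2)), used only as a reference."""
    outputs = []
    for q in queries:
        phi_q = [feature_map(x) for x in q]
        weights = [sum(a * feature_map(b) for a, b in zip(phi_q, k)) for k in keys]
        total = sum(weights)
        outputs.append(
            [
                sum(w * v[b] for w, v in zip(weights, values)) / total
                for b in range(len(values[0]))
            ]
        )
    return outputs
```

Both functions return the same result up to floating-point error; the linear version simply reorders the sums so that no pairwise query-key matrix is ever built, which is what makes long sequences such as minute-long video more affordable.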

Open-source, efficient, long-form video: these three traits together make this a must-download paper for video generation researchers and developers today.

9. LLM Post-Training Tech Comparison: SFT, DPO, GRPO Evolution Chain

Three tech terms, one clear evolution path.

SFT teaches models to “listen,” DPO aligns outputs with human preference, GRPO further unlocks reasoning and thinking—this is the standard pipeline for today’s mainstream model training. Understand this chain and you see why models keep getting “smarter,” not just “obedient.”

For anyone getting into model training, this chart beats most long-form tutorials.

[Image: comparison chart of SFT, DPO, and GRPO]
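As a rough illustration of what separates the last two stages, the sketch below computes the standard DPO loss for a single preference pair and the group-relative advantages that GRPO derives from a batch of sampled answers to the same prompt. The function names, the default `beta=0.1`, and the use of population standard deviation are assumptions for illustration; real training code works on batched tensors and may differ in such details.

```python
import math
from statistics import mean, pstdev


def dpo_loss(logp_chosen, logp_rejected, ref_logp_chosen, ref_logp_rejected, beta=0.1):
    """DPO loss for one preference pair.

    Inputs are sequence log-probabilities under the policy and under a
    frozen reference model. Loss = -log(sigmoid(beta * margin)).
    """
    margin = (logp_chosen - ref_logp_chosen) - (logp_rejected - ref_logp_rejected)
    x = beta * margin
    # softplus(-x) equals -log(sigmoid(x)); this form is numerically stable.
    return max(-x, 0.0) + math.log1p(math.exp(-abs(x)))


def grpo_advantages(rewards, eps=1e-8):
    """Group-relative advantages: standardize rewards within one prompt's samples."""
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]
```

DPO needs labeled pairs of a preferred and a rejected answer, while GRPO only needs a scalar reward per sampled answer and compares each answer against its own group, which is one reason it suits tasks with checkable results such as math or code.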

10. Lex Fridman Comes to China, Recording Podcasts with Domestic AI Engineers

Just a backpack, low-key entry.

Lex Fridman is one of the world’s most influential AI podcast hosts; getting on his show means your ideas reach the entire English-speaking AI circle. He is coming to China to record with domestic AI engineers. Specifics are TBD, but whoever gets on gets direct international exposure.

China’s AI story is finding more people who want to tell it.

📌 Worth Watching

[Product] Feishu CLI + AI Tools: Two Conversations Produce a Curated AI Paper Collection — Not a demo, real use case: two conversations output a beautifully formatted paper collection. Feishu CLI’s practical value beats any PowerPoint.

[Research] SceneParser: Hierarchical Scene Parsing, AI Truly Understanding “Object Relationships” — Not just object detection, but understanding structural dependencies and interactions between objects. This is a key puzzle piece for robotics and embodied AI.

[Research] VLA Model Dynamic Blind Spot Correction: Rhythm and Path Calibration Without Retraining — Robot control models often “go blind” in dynamic scenes. This paper proposes a training-free fix, very practical for embodied AI deployment.

[Business] GEO Service Providers Are a Mixed Bag: 2M Budget, Only 4K UV—A Lesson Learned — New growth pitfalls in the AI era. Startups should read this GEO whitepaper before signing contracts.

😄 AI Fun Fact

AI Observation: H-OmniStereo: Zero-Shot Omnidirectional Stereo Matching with Heading-Align

This observation fits the fun section: it’s not necessarily today’s biggest launch, but it shows how AI change lands in everyday habits. The real fun isn’t the hype—it’s how new tools spread by first showing up as a small shortcut, a time-saver, or one successful workflow hack.

🔮 AI Trend Predictions

China’s AI Products Enter Paid Monetization Phase Collectively

Timeline: Q3 2026
Confidence: 75%
Rationale: Today’s headline “Doubao Hits 345M MAU, Yet It’s the Most Anxious One.” When the largest domestic AI product charges first, that typically serves as the industry’s pricing signal. Others following is just a matter of time; the free subsidy era is ending.

Agentic Tools Become Standard SaaS Capability

Timeline: Q3 2026
Confidence: 80%
Rationale: Today’s headlines “Feishu CLI Hits 10,000 Stars in One Month” and “Codepilot New Preview: Codex Support as Agent Engine Coming Soon.” Two different verticals adapting for agents at the same time means this is no longer differentiation; it’s table stakes.

Browser-Control AI Like Kimi WebBridge Explodes

Timeline: Q2-Q3 2026
Confidence: 65%
Rationale: Today’s headline “Moonshot Releases Kimi WebBridge.” Browser automation is the lowest-barrier agent deployment scenario. Moonshot entering means the field has shifted from tech validation to product competition, and more players will follow.

China’s AI Narrative Systematically Enters English-Speaking Circles

Timeline: Q2 2026
Confidence: 60%
Rationale: Today’s headline “Lex Fridman Comes to China, Recording Podcasts with Domestic AI Engineers,” plus DeepSeek and Kimi already generating overseas buzz. Lex’s visit signals that international media is actively seeking China’s AI story, not passively receiving it.

❓ Related Questions

How to Experience Kimi WebBridge?

Kimi WebBridge is Moonshot’s latest browser control feature, currently in early release. Domestic users can apply through official Kimi channels, though some advanced features may require account permissions or closed-beta access.

Last updated on 2026/05/16 03:49:24
