12-29-Daily AI News Daily
AI Daily 2025/12/29
Today’s Summary
Jim Fan reviews the robotics field: hardware is impressive, but software lags significantly, and VLM solutions “don’t feel right.” 🤔 A 32B open-source model achieves OpenAI-level deep research capabilities by replacing “predicting the next token” with “deciding the next atomic action.” ✨ Vibe Coding’s hard-earned lessons are worth saving, and Claude Code’s journey from a side project to a $1 billion product is also fascinating. 🤩
💡 Tip: Want to be among the first to try the latest AI models mentioned (Claude 4.5, GPT, Gemini 3 Pro)? Don’t have an account? Head over to Aivora to grab one, get started in a minute, and enjoy worry-free after-sales support.
Today’s AI News
👀 One-Liner
DeepMind’s documentary has surpassed 200 million views, yet Jim Fan describes the robotics field as a “Wild West”—software is far from catching up with hardware. 🤯
🔑 3 Keywords
#RoboticsDilemma #VibeCodingBestPractices #DeepResearchAgent
🔥 Top 10 Blockbusters
1. Jim Fan’s Year-End Review: Three Lessons from Robotics
Jim Fan, a senior research scientist at NVIDIA, throws cold water on the hype surrounding cool robots like Optimus and Figure. He points out that while the hardware is impressive, the software lags significantly. What’s more, these robots are delicate: overheating, motor failures, and firmware glitches are common, so they need constant “coddling.” He also says outright that today’s mainstream VLM-based VLA solutions “don’t feel right,” because vision-language models are pre-trained to answer questions and often discard the low-level details that dexterous manipulation depends on. He’s betting on video world models as the right path forward. Get ready for an interesting 2026 in the robotics scene. 🤖
2. Step-DeepResearch: A 32B Parameter Agent Rivals OpenAI and Gemini in Deep Research
Step-DeepResearch, a 32B-parameter open-source model, scored 61.42 on Scale AI benchmarks, comparable to the proprietary and costly deep research systems from OpenAI and Google. What’s the secret? It replaces “predicting the next token” with “deciding the next atomic action,” where the actions cover four steps: planning, deep search, reflective verification, and report generation. The coolest part is its incredibly simple architecture: a single ReAct-style Agent with no fancy multi-agent orchestration. This formula of “medium-scale models + correct training data = expert-level research capabilities” is definitely one to remember. 🧠
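To make “deciding the next atomic action” concrete, here is a minimal sketch of a single-agent loop over the four steps named above. Everything in it is hypothetical: the `State` fields, the `choose_action` policy (fixed rules standing in for the trained model), and the stub tools are illustrative assumptions, not Step-DeepResearch’s actual implementation.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class State:
    """Working memory of the single agent (hypothetical structure)."""
    question: str
    plan: List[str] = field(default_factory=list)
    evidence: List[str] = field(default_factory=list)
    verified: bool = False
    report: Optional[str] = None
    trace: List[str] = field(default_factory=list)


def choose_action(state: State) -> str:
    """Stand-in for the model: pick the next atomic action from the state.

    A trained model would make this decision; fixed rules are used here
    so the sketch runs without any model.
    """
    if not state.plan:
        return "plan"
    if len(state.evidence) < len(state.plan):
        return "search"
    if not state.verified:
        return "verify"
    return "report"


def step(state: State, action: str) -> None:
    """Apply one atomic action. All tools are stubs."""
    state.trace.append(action)
    if action == "plan":
        state.plan = [f"sub-question {i} of: {state.question}" for i in (1, 2)]
    elif action == "search":
        target = state.plan[len(state.evidence)]
        state.evidence.append(f"stub finding for '{target}'")
    elif action == "verify":
        # Reflective check: every planned sub-question has evidence.
        state.verified = len(state.evidence) == len(state.plan)
    elif action == "report":
        state.report = "\n".join(state.evidence)


def run(question: str, max_steps: int = 20) -> State:
    """One ReAct-style loop: decide, act, observe, repeat until a report exists."""
    state = State(question)
    for _ in range(max_steps):
        if state.report is not None:
            break
        step(state, choose_action(state))
    return state
```

Calling `run("some research question")` and inspecting `.trace` shows the sequence of actions taken. The point of the design is that one loop and one decision function replace any orchestration between multiple agents.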
3. Vibe Coding Best Practices: Hard-Earned Lessons from 1 Million Lines of Code
“Anyone can Vibe Code”? That’s naive. After writing a million lines of code with AI, the co-founder of @vibekanban has distilled some ironclad rules. First, plan before you code: once the AI starts writing, it tends toward “minimal changes,” which makes the architecture increasingly rigid. Second, enable YOLO mode and let the AI run autonomously, but only if your codebase has automated tests. Third, state explicitly in the system prompt, “We’re aiming for the simplest changes, migration costs don’t matter”; otherwise, it’ll get lazy. There’s also a cheeky trick: use ESLint to prevent the AI from arbitrarily disabling lint rules. This guide is definitely worth bookmarking. ✅
4. The Origin of Claude Code: How a Side Project Became a $1 Billion ARR Product
Claude Code’s origin story is quite something: it started as a side project by developer Boris Cherny in September 2024. Back then, Claude often messed up even simple Bash commands and crashed after a few minutes. Now, powered by Claude Sonnet 4.5 and Opus 4.5, it can run for hours or even days on highly complex tasks. The key tech is its “Stop Hooks” mechanism: when Claude is about to stop, a script can “poke” it to keep working, for example by running the tests and sending it back to fix any failures. Anthropic defines an AI Agent as “LLM + autonomous tool invocation in a loop,” and Claude Code is a perfect embodiment of this definition. 💯
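A Stop hook of this kind could look roughly like the sketch below. It assumes the hook receives a JSON payload on stdin containing a `stop_hook_active` flag and can block the stop by printing `{"decision": "block", "reason": ...}`. That shape is an assumption drawn from Claude Code’s hooks documentation and should be checked against the current docs. The test command is a placeholder.

```python
import json
import subprocess
import sys
from typing import Optional

# Hypothetical test command; replace with the project's own.
TEST_COMMAND = ["pytest", "-q"]


def decide(stop_hook_active: bool, tests_passed: bool, test_output: str) -> Optional[dict]:
    """Return a blocking decision, or None to let the agent stop.

    stop_hook_active is assumed to be True when the agent is already
    continuing because of an earlier hook; in that case we let it stop
    to avoid an endless loop.
    """
    if stop_hook_active or tests_passed:
        return None
    return {
        "decision": "block",
        # Keep only the tail of the output so the reason stays short.
        "reason": "Tests are failing; fix them before stopping.\n" + test_output[-2000:],
    }


def main() -> int:
    """Read the hook payload, run the tests, and print a decision if needed."""
    payload = json.load(sys.stdin)  # hook input (assumed to be JSON on stdin)
    result = subprocess.run(TEST_COMMAND, capture_output=True, text=True)
    decision = decide(
        stop_hook_active=bool(payload.get("stop_hook_active", False)),
        tests_passed=result.returncode == 0,
        test_output=result.stdout + result.stderr,
    )
    if decision is not None:
        print(json.dumps(decision))
    return 0
```

To run it as a script, add `sys.exit(main())` at the bottom. Note that this version pokes the agent only once per stop attempt; a longer-running setup would need its own retry limit instead of relying on `stop_hook_active` alone.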
5. DeepMind Documentary “The Thinking Game” Hits 200 Million Views in 4 Weeks
Demis Hassabis himself is hyping up the DeepMind documentary, “The Thinking Game,” which tells the story of AlphaFold’s creation. It racked up over 200 million YouTube views in just four weeks. If you’re curious about how an AGI lab operates or how Nobel Prize-level projects come to life, this is a great watch for the holidays. With director Greg Kohs and music by Dan Deacon, the production team is seriously stacked. 🎬
6. LongVideoAgent: Enabling AI to Truly “Understand” Hour-Long Videos
Today’s multimodal large models handle long videos poorly: they typically fall back on “compressed summaries + frantic frame extraction,” which loses all the juicy details. This new paper introduces LongVideoAgent, which splits the work across a main Agent for reasoning and decision-making, a localization Agent that finds relevant segments, and a visual Agent that extracts fine-grained details. Reinforcement learning teaches the main Agent when to explore and when to stop. The results? GPT-5-mini jumped from 62.4% to 71.1% on long video Q&A benchmarks, and Qwen2.5-3B soared from 23.5% to 47.4%, effectively doubling its performance. An Agent-based design looks like the right way to approach long video understanding. 🎥
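The division of labor described above can be sketched as follows. This is a toy illustration under stated assumptions, not the paper’s code: the `Segment` type, the word-overlap matching in `grounding_agent`, and the keyword-based stopping rule in `master_agent` are all hypothetical stand-ins (in the paper, the explore-or-stop decision is learned with reinforcement learning).

```python
from dataclasses import dataclass
from typing import List, Optional, Set, Tuple


@dataclass
class Segment:
    start_s: int
    end_s: int
    caption: str  # coarse summary visible to the grounding agent
    detail: str   # fine-grained content only the vision agent reads


def grounding_agent(segments: List[Segment], query: str, skip: Set[int]) -> Optional[int]:
    """Return the index of the unvisited segment whose caption best overlaps the query."""
    words = set(query.lower().split())
    best, best_score = None, 0
    for i, seg in enumerate(segments):
        if i in skip:
            continue
        score = len(words & set(seg.caption.lower().split()))
        if score > best_score:
            best, best_score = i, score
    return best


def vision_agent(segment: Segment) -> str:
    """Stub for extracting fine-grained details from one segment."""
    return segment.detail


def master_agent(
    segments: List[Segment], query: str, stop_word: str, max_steps: int = 5
) -> Tuple[List[str], List[int]]:
    """Explore segments until the notes mention stop_word or nothing relevant is left.

    The keyword check is a fixed rule standing in for a learned stopping decision.
    """
    visited: Set[int] = set()
    notes: List[str] = []
    for _ in range(max_steps):
        if any(stop_word in note for note in notes):
            break  # enough evidence: stop exploring
        idx = grounding_agent(segments, query, visited)
        if idx is None:
            break  # no relevant segment remains
        visited.add(idx)
        notes.append(vision_agent(segments[idx]))
    return notes, sorted(visited)
```

The main agent only pays for detailed inspection of the segments it chooses to visit, which is the contrast with summarizing or sampling the whole video up front.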
7. Genfocus: A Small AI Model Dedicated to Adjusting Depth of Field and Aperture
Genfocus is a super interesting model, specifically designed to adjust depth of field and aperture effects in images. It can even transform shallow depth-of-field photos into fully focused ones. It’s not a huge, all-encompassing image editing model; it just does one thing exceptionally well. The model is already open-source on HuggingFace, so photography enthusiasts can definitely have some fun with it. 📸
8. Sam Altman: Google Remains a Huge Threat, ChatGPT Needs “Red Alert” Twice a Year
Sam Altman’s latest statement is turning heads: Google remains a massive threat to OpenAI, and the ChatGPT team might “go into red alert twice a year, and it will last for a long time.” It’s a candid admission that the AI competition is far from over, and the clash of titans will continue. 💥
9. A Little Trick for a Free Month of ChatGPT Plus
Here’s a sweet trick worth snagging: after canceling your ChatGPT Plus subscription, OpenAI might give you a free month (100% off) to keep you around. No idea how long this strategy will last, but it’s working right now! 😉
10. China Releases Draft Regulations for AI Human Interaction
China is currently drafting regulatory rules for “AI with human-like interactive capabilities.” The specific details aren’t fully public yet, but the direction is clear: as AI becomes more human-like, regulation needs to keep pace. This is a significant signal for domestic teams working on AI companions and AI customer service. 🚦
📌 Worth Watching
- [Open Source] awesome-llm-apps - A collection of LLM applications with 84K Stars, packed with RAG and Agent examples.
- [Open Source] vibe-kanban - A Kanban tool said to boost Claude Code/Codex efficiency by 10x, with 7K Stars.
- [Open Source] Mole - A deep cleaning tool for Mac with 21K Stars, created by Chinese developers.
- [Open Source] Fresh - A simple yet powerful terminal text editor, with no need to memorize Vim shortcuts.
- [Product] Omni-Design Custom Workshop - A Refly.ai product that can break any complex concept down into 4K HD images.
- [Business] French Telecom Giant Orange Employees Using AI Tool Site Made by Chinese Developers - Does that count as indirectly serving a Fortune 500 company?
- [Other] Fei-Fei Li’s “K12 Education is a Waste of Time” Statement Clarified - Her original statement was severely misinterpreted; reading the original text is recommended.
📊 More Updates
| Item | Type | Description |
|---|---|---|
| TheAlgorithms/Python | Open Source | Python implementations of all algorithms |
| RustPython | Open Source | A Python interpreter written in Rust |
| QuantConnect Lean | Open Source | An algorithmic trading engine |
| VLNVerse | Research | Wu Qi’s team’s full-stack embodied navigation platform |
| How to Build a $100K AI SaaS Without Code | Tutorial | Video |
| Nuggt Canvas | Tool | A better-looking MCP client |
❓ Related Questions
How to Experience ChatGPT Plus?
ChatGPT Plus currently requires a $20 monthly subscription to access advanced features like GPT-4o. For users in mainland China, this might involve difficulties with credit card payments or account registration restrictions.
Solution:
- Aivora offers ready-made ChatGPT Plus accounts.
- Aivora provides instant delivery, so you can use it right after ordering, without dealing with payment or registration hassles.
- Aivora offers stable, exclusive accounts with worry-free after-sales support.
Visit aivora.cn to see the full list of AI account services.