12-29 Daily AI News



AI Daily 2025/12/29

AI Daily

Today’s Summary

Jim Fan reviews the robotics field: hardware is impressive, but software lags significantly, and VLM solutions “don’t feel right.” 🤔 A 32B open-source model achieves OpenAI-level deep research capabilities by replacing “predicting the next token” with “deciding the next atomic action.” ✨ Vibe Coding’s hard-earned lessons are worth saving, and Claude Code’s journey from a side project to a $1 billion product is also fascinating. 🤩

⚡ Quick Navigation

📰 Today’s AI News

💡 Tip: Want to be among the first to try the latest AI models mentioned (Claude 4.5, GPT, Gemini 3 Pro)? Don’t have an account? Head over to Aivora to grab one, get started in a minute, and enjoy worry-free after-sales support.

Today’s AI News

👀 One-Liner

DeepMind’s documentary has surpassed 200 million views, yet Jim Fan describes the robotics field as a “Wild West”—software is far from catching up with hardware. 🤯

🔑 3 Keywords

#RoboticsDilemma #VibeCodingBestPractices #DeepResearchAgent

🔥 Top 10 Blockbusters

1. Jim Fan’s Year-End Review: Three Lessons from Robotics

Jim Fan, NVIDIA’s Senior Research Scientist, throws cold water on the hype surrounding cool robots like Optimus and Figure. He points out that while hardware is impressive, software is lagging significantly. What’s more, these robots are delicate—overheating, motor failures, and firmware glitches are common, requiring constant “coddling.” He also directly states that current mainstream VLM-based VLA solutions “don’t feel right” because visual language models are pre-trained to answer questions, often discarding low-level details crucial for dexterous manipulation. He’s betting on video world models as the right path forward. Get ready for an interesting 2026 in the robotics scene. 🤖

2. Step-DeepResearch: A 32B Parameter Agent Outperforms OpenAI and Gemini in Deep Research

Step-DeepResearch, a 32B parameter open-source model, scores 61.42 on Scale AI’s benchmarks, putting it on par with OpenAI’s and Google’s proprietary, costly deep research systems. What’s the secret? It replaces “predicting the next token” with “deciding the next atomic action”: a four-step loop of planning, deep search, reflective verification, and report generation. The coolest part is its incredibly simple architecture: a single ReAct-style Agent with no fancy multi-agent orchestration. This formula of “medium-scale models + correct training data = expert-level research capabilities” is definitely one to remember. 🧠
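
To make “deciding the next atomic action” concrete, here is a minimal, hypothetical Python sketch of a single ReAct-style research loop. The action names mirror the four steps above; the policy and tool functions are placeholders, not Step-DeepResearch’s actual code or API.

```python
# Minimal sketch of a single-agent, ReAct-style deep-research loop:
# one model, one loop, each step emitting an "atomic action"
# (plan / search / reflect / report) instead of free-form text.
# Every function below is a placeholder, not Step-DeepResearch's API.

from dataclasses import dataclass, field

ACTIONS = ("plan", "search", "reflect", "report")

@dataclass
class AgentState:
    question: str
    scratchpad: list[str] = field(default_factory=list)  # action/observation trace

def decide_next_action(state: AgentState) -> str:
    """Placeholder policy: a real agent would ask the LLM to choose the next
    atomic action given the scratchpad. Here we simply walk the four phases."""
    return ACTIONS[min(len(state.scratchpad), len(ACTIONS) - 1)]

def execute(action: str, state: AgentState) -> str:
    """Placeholder tools: real implementations would call a planner prompt,
    a web-search tool, a verification prompt, and a report writer."""
    return f"[{action}] stub result for: {state.question}"

def run_agent(question: str, max_steps: int = 8) -> str:
    state = AgentState(question)
    for _ in range(max_steps):
        action = decide_next_action(state)        # decide the next atomic action
        observation = execute(action, state)      # run the matching tool/prompt
        state.scratchpad.append(f"{action}: {observation}")
        if action == "report":                    # report generation ends the loop
            return observation
    return "max steps reached without a report"

if __name__ == "__main__":
    print(run_agent("How do open-source deep-research agents compare to closed ones?"))
```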

3. Vibe Coding Best Practices: Hard-Earned Lessons from 1 Million Lines of Code

“Anyone can Vibe Code”? That’s naive. After writing a million lines of code with AI, the co-founder of @vibekanban has distilled some ironclad rules. First, plan before you code: once AI starts writing, it tends towards “minimal changes,” which leads to increasingly rigid architectures. Second, enable YOLO mode and let the AI run autonomously, but only if your codebase has automated tests. Third, explicitly tell the AI in the system prompt that you want the simplest overall solution and that migration costs don’t matter; otherwise it’ll get lazy and just patch around the existing design. There’s also a cheeky trick: use ESLint to stop the AI from arbitrarily disabling lint rules. This guide? Definitely worth bookmarking. ✅
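
That last guardrail is easy to operationalize. Here is a hypothetical pre-commit check in Python (not from the @vibekanban write-up) that rejects staged changes which add new eslint-disable comments, so an agent running in YOLO mode has to fix lint errors instead of silencing them.

```python
# Hypothetical guardrail (not from the original post): block commits that add
# new eslint-disable comments, so an autonomous AI has to fix lint errors
# rather than suppress them. Install as .git/hooks/pre-commit (or call it
# from your existing hook).

import re
import subprocess
import sys

DISABLE_PATTERN = re.compile(r"eslint-disable")

def added_lines_in_staged_diff() -> list[str]:
    """Return the '+' lines of the staged diff, excluding '+++' file headers."""
    diff = subprocess.run(
        ["git", "diff", "--cached", "--unified=0"],
        capture_output=True, text=True, check=True,
    ).stdout
    return [
        line[1:] for line in diff.splitlines()
        if line.startswith("+") and not line.startswith("+++")
    ]

def main() -> int:
    offending = [l for l in added_lines_in_staged_diff() if DISABLE_PATTERN.search(l)]
    if offending:
        print("Commit rejected: new eslint-disable comments were added:")
        for line in offending:
            print("  " + line.strip())
        return 1
    return 0

if __name__ == "__main__":
    sys.exit(main())
```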

4. The Origin of Claude Code: How a Side Project Became a $1 Billion ARR Product

Claude Code’s origin story is quite something: it started as just a side project by developer Boris Cherny in September 2024. Back then, Claude often messed up even simple Bash commands and crashed after a few minutes. But now, powered by Claude Sonnet 4.5 and Opus 4.5, it can run for hours or even days, tackling super complex tasks. The key tech is its “Stop Hooks” mechanism—when Claude wants to pause, you can “poke” it with a script to keep working, like running tests or automatically fixing failures. Anthropic defines an AI Agent as “LLM + cyclical autonomous tool invocation,” and Claude Code is a perfect embodiment of this definition. 💯
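
To picture how that “poke” works in practice: a Stop hook is just a script that decides whether the agent is actually done. The sketch below is a hypothetical Python version that re-runs the test suite and, if anything fails, asks the agent to keep going. The JSON it prints ({"decision": "block", "reason": ...}) follows Claude Code’s documented hook output as I understand it; treat the exact schema as an assumption and check the official hooks documentation before relying on it.

```python
# Hypothetical Stop-hook script in the spirit of the mechanism described above.
# Assumption: printing {"decision": "block", "reason": ...} tells Claude Code
# not to stop yet; verify the current schema against the official hooks docs.

import json
import subprocess
import sys

def tests_pass() -> tuple[bool, str]:
    """Run the project's test suite; return (passed, last lines of output)."""
    result = subprocess.run(
        ["python", "-m", "pytest", "-q"],   # swap in your project's test command
        capture_output=True, text=True,
    )
    tail = "\n".join(result.stdout.splitlines()[-20:])
    return result.returncode == 0, tail

def main() -> None:
    passed, output = tests_pass()
    if passed:
        sys.exit(0)  # exiting quietly lets the agent stop normally
    print(json.dumps({
        "decision": "block",
        "reason": "Tests are still failing; keep fixing them:\n" + output,
    }))
    sys.exit(0)

if __name__ == "__main__":
    main()
```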

5. DeepMind Documentary “The Thinking Game” Hits 200 Million Views in 4 Weeks

Demis Hassabis himself is hyping up the DeepMind documentary, “The Thinking Game,” which tells the story of AlphaFold’s creation. It racked up over 200 million YouTube views in just four weeks. If you’re curious about how an AGI lab operates or how Nobel Prize-level projects come to life, this is a great watch for the holidays. With director Greg Kohs and music by Dan Deacon, the production team is seriously stacked. 🎬

6. LongVideoAgent: Enabling AI to Truly “Understand” One-Hour Long Videos

LongVideoAgent tackles the current limitations of multimodal large models when processing long videos, which typically resort to “compressed summaries + frantic frame extraction,” losing all the juicy details. This new paper introduces LongVideoAgent: a main Agent for reasoning and decision-making, a localization Agent to find relevant segments, and a visual Agent to extract fine-grained details. Reinforcement learning teaches the main Agent when to explore and when to stop. The results? GPT-5-mini jumped from 62.4% to 71.1% on long video Q&A benchmarks, and Qwen2.5-3B soared from 23.5% to 47.4%, effectively doubling its performance. An Agent-based design is truly the right way to approach long video understanding. 🎥
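
The three-role split is easy to picture as a control loop: the main agent decides whether it has gathered enough evidence, the localization step proposes which segment to look at next, and the visual step extracts details from that segment. The skeleton below is purely illustrative Python with placeholder functions, not code from the LongVideoAgent paper.

```python
# Illustrative skeleton of the three-role design described above
# (main / localization / visual agents). All functions are placeholders,
# not code from the LongVideoAgent paper.

from dataclasses import dataclass

@dataclass
class Segment:
    start_s: float
    end_s: float

def localize(question: str, notes: list[str]) -> Segment:
    """Localization agent: propose the next video segment worth inspecting."""
    return Segment(start_s=60.0 * len(notes), end_s=60.0 * (len(notes) + 1))

def inspect(segment: Segment, question: str) -> str:
    """Visual agent: extract fine-grained details from one short segment."""
    return f"details for {segment.start_s:.0f}-{segment.end_s:.0f}s"

def enough_evidence(notes: list[str]) -> bool:
    """Main agent's explore-or-stop decision, the part the paper trains with RL."""
    return len(notes) >= 3

def answer_long_video_question(question: str, max_hops: int = 10) -> str:
    notes: list[str] = []
    for _ in range(max_hops):
        if enough_evidence(notes):                 # main agent: stop or explore more?
            break
        segment = localize(question, notes)        # localization agent picks a segment
        notes.append(inspect(segment, question))   # visual agent reads it closely
    return f"answer to {question!r} grounded in {len(notes)} inspected segments"

if __name__ == "__main__":
    print(answer_long_video_question("What does the speaker demo in the first hour?"))
```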

7. Genfocus: A Small AI Model Dedicated to Adjusting Depth of Field and Aperture

Genfocus is a super interesting model, specifically designed to adjust depth of field and aperture effects in images. It can even transform shallow depth-of-field photos into fully focused ones. It’s not a huge, all-encompassing image editing model; it just does one thing exceptionally well. The model is already open-source on HuggingFace, so photography enthusiasts can definitely have some fun with it. 📸

8. Sam Altman: Google Remains a Huge Threat, ChatGPT Needs “Red Alert” Twice a Year

Sam Altman’s latest comments have people talking: Google remains a massive threat to OpenAI, and the ChatGPT team may need to “go into red alert twice a year, and it will last for a long time.” It’s a candid admission: the AI competition is far from over, and the clash of titans will definitely continue. 💥

9. A Little Trick for a Free Month of ChatGPT Plus

Here’s a sweet trick worth snagging: after canceling your ChatGPT Plus subscription, OpenAI might give you a free month (100% off) to keep you around. No idea how long this strategy will last, but it’s working right now! 😉

10. China Releases Draft Regulations for AI Human Interaction

China is currently drafting regulatory rules for “AI with human-like interactive capabilities.” The specific details aren’t fully public yet, but the direction is clear: as AI becomes more human-like, regulation needs to keep pace. This is a significant signal for domestic teams working on AI companions and AI customer service. 🚦

📌 Worth Watching


[Open Source] awesome-llm-apps - awesome-llm-apps is an 84K Star collection of LLM applications, packed with RAG and Agent examples.

[Open Source] vibe-kanban - vibe-kanban is a Kanban tool that boosts Claude Code/Codex efficiency by 10x, boasting 7K Stars.

[Open Source] Mole - Mole is a deep cleaning tool for Mac, with 21K Stars, created by Chinese developers.

[Open Source] Fresh - Fresh is a simple yet powerful terminal text editor, no need to memorize Vim shortcuts.

[Product] Omni-Design Custom Workshop - Omni-Design Custom Workshop, by Refly.ai, can deconstruct any complex concept into 4K HD images.

[Business] Orange Employees Are Using an AI Tool Site Built by Chinese Developers - Staff at the French telecom giant Orange are using an AI tool site developed by Chinese creators, which means it is indirectly serving a Fortune 500 company.

[Other] Li Feifei’s “K12 Education is a Waste of Time” Statement Clarified - Li Feifei’s original statement was severely misinterpreted; it’s recommended to read the original text.


📊 More Updates

1. [Open Source] TheAlgorithms/Python - Python implementations of all algorithms (GitHub)
2. [Open Source] RustPython - A Python interpreter written in Rust (GitHub)
3. [Open Source] QuantConnect Lean - An algorithmic trading engine (GitHub)
4. [Research] VLNVerse - Wu Qi’s team’s full-stack embodied navigation platform (Details)
5. [Tutorial] How to Build a $100K AI SaaS Without Code (Video)
6. [Tool] Nuggt Canvas - A better-looking MCP client (Reddit)

❓ Related Questions

How to Experience ChatGPT Plus?

ChatGPT Plus currently requires a $20 monthly subscription to access advanced features like GPT-4o. For users in mainland China, both credit card payment and account registration can be obstacles.

Solution:

  • Aivora offers ready-made ChatGPT Plus accounts.
  • Aivora provides instant delivery, so you can use it right after ordering, without dealing with payment or registration hassles.
  • Aivora offers stable, exclusive accounts with worry-free after-sales support.

Visit aivora.cn to see the full list of AI account services.

Last updated on