2026.05.25 Global AI News Daily

22:56

Developers found the unreleased GPT-5.6 (internal codename iris-alpha) in OpenAI Codex logs. The model supports a 1.5M-token context window and can generate high-quality UI without specific instructions; it is expected to launch in June. Anthropic and Google are also expected to release new models around the same time.

22:42

Google DeepMind launched AlphaProof Nexus, an AI mathematics agent that combines a large language model, reinforcement learning, and an evolutionary algorithm. It solved 9 long-standing open Erdős problems, the oldest of which had been open for 56 years. The compute cost per problem is only in the hundreds of dollars, and all proofs are formally verified by the Lean compiler.

16:37

Guangfan Technology, founded by former Xiaomi employee Dong Hongguang, has beaten Apple to market with AI "full-sense" headphones equipped with a camera. The product goes on sale on May 31 at a launch price of 1,999 yuan. It features visual perception and a proactive AI assistant, and can handle a range of tasks without a smartphone.

16:37

A netizen shared a process for subscribing to Gemini Pro through a third-party discount channel, at 38.88 CNY for one year. Users should prepare accounts in advance to avoid triggering risk controls. The whole process takes less than ten minutes, and an older, established account is recommended.

16:01

OmniWork launched Agent OS for creative work. It integrates multiple domain-specific Experts that automatically divide labor and collaborate, and it supports Autowork for automatic task execution along with three-layer persistent memory. Currently in invite-only beta, it lets individuals complete complex creative projects that previously required multi-person teams.

16:01

The open-source project Reasonix, built specifically for DeepSeek V4, raises the cache hit rate to 99.82% through cache optimization, cutting the cost of long sessions with the model to about 20% of the original.
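
As a rough illustration of why a high cache hit rate lowers cost, the sketch below models input cost only, under the assumption that cached input tokens are billed at some fraction of the uncached price. The function name and the price ratios are hypothetical and are not taken from Reasonix or from any DeepSeek price list; only the 99.82% hit rate comes from the item above.

```python
def relative_input_cost(hit_rate: float, cached_price_ratio: float) -> float:
    """Input cost relative to a session with no cache hits (baseline = 1.0).

    hit_rate: fraction of input tokens served from cache.
    cached_price_ratio: assumed price of a cached token divided by the
        price of an uncached token (hypothetical, provider-specific).
    """
    if not 0.0 <= hit_rate <= 1.0:
        raise ValueError("hit_rate must be in [0, 1]")
    # Cached tokens pay the discounted price; the rest pay full price.
    return hit_rate * cached_price_ratio + (1.0 - hit_rate)


HIT_RATE = 0.9982  # cache hit rate reported in the item above

# Hypothetical cached/uncached price ratios, for illustration only.
for ratio in (0.1, 0.2, 0.5):
    cost = relative_input_cost(HIT_RATE, ratio)
    print(f"price ratio {ratio:.1f} -> relative input cost {cost:.4f}")
```

Under this simplified model, the relative cost at a very high hit rate is dominated by the cached-token price ratio; output tokens, which are not cached, are left out.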

16:00

FaceBit Intelligence, in collaboration with Tsinghua University and OpenBMB, open-sourced BitCPM-CANN, a ternary large model built on Huawei Ascend. It cuts VRAM usage by about 6x, retains up to 97.2% of the original model's capability, and is expected to eventually let a 60B-parameter model run on a smartphone with 8GB of memory.

16:00

Ilya posted an artwork of "The Thinker" against a chip die-shot background on Instagram, sparking wide discussion. In the same week, OpenAI announced three major developments: the discovery of new geometric constructions, an upgraded Codex, and preparations for an IPO this fall.

15:59

Embodied intelligence infrastructure company Tianji Intelligence has completed RMB 1 billion in Series B and B+ financing, co-led by Hillhouse Capital and Meituan Strategic Investment, at a post-money valuation approaching RMB 10 billion. Tianji builds core components for embodied intelligence and has begun volume delivery of force-controlled humanoid dual-arm robots, with orders on hand exceeding 10,000 units in Q1 2026.

Must-read today

15:59

Guancha, an AI product review community founded by Zhong Tai (born in 2002), has completed a seed round from Sequoia China and Huaxing Capital. Six months after launch, it hosts nearly 2,000 projects and nearly 5,000 qualified reviewers, offering AI startups fair exposure and supporting services.

In-depth feature

15:13

Anthropic is testing a dual-mode memory system for Claude that adds a Memory Files feature. It also launched Dreams, an asynchronous background memory integration function, and will release the 24/7 always-on Conway Agent platform. Early enterprise users saw a 97% drop in processing errors and 30% faster document verification.

15:12

The AI video model Keling launched what it bills as the world's first native 4K feature. In testing, the author found that it improves image quality in still-life scenes but still has many problems with character movement, instruction following, and other areas.

15:12

Amanda Askell of Anthropic shared a prompt that has AI generate fables to help explain new concepts. A Chinese blogger refined it so that users can specify the concept and the output avoids formulaic templates. It can help users understand and memorize a new concept in 5 minutes, and it works across multiple AI models.

15:11

The DestinyLinker team released the Mingli-Bench evaluation dataset. Testing shows that mainstream large models achieve only 23%-40% accuracy on traditional Chinese numerology problems. The team's Tianfu Agent reaches 50% accuracy, approaching the 53.5% average of the top 20 human players.

15:11

Zhixiang Future released HiDream-O1-Image, a native full-modal large model with an 8B open-source version and a 200B closed-source version, built on a unified training architecture for all modalities. The company earned over RMB 100 million in revenue in 2025, maintained double-digit growth in Q1 2026, and has just completed two financing rounds at the RMB 100 million level.

In-depth feature

15:10

UniPat AI launched SaaS-Bench, a benchmark comprising 23 real SaaS systems and 106 real office tasks. The top performer, Claude Opus 4.7, fully passes only 3.8% of the tasks, exposing the weaknesses of current AI agents on long-horizon, cross-application tasks.
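
To put the pass rate in absolute terms, the short calculation below converts the percentage back into a task count. It assumes the 3.8% figure is computed over all 106 tasks and rounded to one decimal place; the function name is illustrative.

```python
TOTAL_TASKS = 106   # task count from the item above
PASS_RATE = 0.038   # complete pass rate from the item above


def implied_passes(total: int, rate: float) -> int:
    """Nearest whole number of tasks consistent with a reported pass rate."""
    return round(total * rate)


passed = implied_passes(TOTAL_TASKS, PASS_RATE)
# Cross-check: the recovered count should round back to the reported rate.
print(passed, f"{passed / TOTAL_TASKS:.1%}")  # prints: 4 3.8%
```

Under that assumption, the best model fully completed about 4 of the 106 tasks.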

15:10

Researcher Shengbang (Peter) Tong, a New York University graduate who studied under Yann LeCun and Saining Xie, recently announced that he is joining AMI Labs, an AI research institution backed by Yann LeCun. He will focus on large vision model research, working toward a unified model that supports both understanding and generation, and toward a stronger general world model.

In-depth feature

15:05

OmniWork launched an agent operating system for creative work that lets multiple AI experts divide labor and collaborate to finish entire creative tasks. Tests show it can deliver finished products, including research, animated shorts, and running games, within minutes to hours. It is currently in internal testing.

14:47

Alibaba-affiliated Qoder launched QoderWake in public beta. It lets users build a team of AI digital employees that works 24/7 on a local computer, supports switching between multiple models, and includes a management system and security permission controls. The initial release offers 6 built-in roles and currently supports only Mac, with a half-price discount on the official website.

14:46

Google CEO Sundar Pichai admitted in a New York Times podcast interview that Gemini lags behind industry leaders in coding agents and complex coding tasks, while noting AGI is closer than previously expected and AI development is currently extremely fast.

14:46

British media report that Silicon Valley giants including Amazon and Meta have introduced internal leaderboards for AI token consumption, giving rise to "tokenmaxxing." Meta burned 60 trillion tokens in 30 days. Data shows that 10x the token consumption yields only about 2x the output: costs soar while efficiency gains lag far behind.
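
The 10x-for-2x figure can be restated as per-token yield and cost per unit of output. The sketch below does that arithmetic, assuming cost scales linearly with tokens consumed; the function name is illustrative and the only inputs are the two multiples quoted above.

```python
def efficiency(token_multiple: float, output_multiple: float) -> tuple:
    """Return (output per token, cost per unit of output), both relative to baseline.

    Assumes spend is proportional to tokens consumed.
    """
    yield_per_token = output_multiple / token_multiple
    cost_per_output = token_multiple / output_multiple
    return yield_per_token, cost_per_output


yield_per_token, cost_per_output = efficiency(10.0, 2.0)
print(f"output per token: {yield_per_token:.1f}x baseline")
print(f"cost per unit of output: {cost_per_output:.1f}x baseline")
```

Under the linear-cost assumption, each token yields one fifth of the baseline output, so every unit of output costs five times as much.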

In-depth feature

14:46

In the AI era, LLMs and agents have lowered the cost of working across roles, and the closed loop of work that had been split among positions is returning to the individual. This article argues that "super individuals" are not trained but sparked by curiosity, and that organizations need to provide four kinds of "soil": full problem ownership, tool access, user feedback, and recognition.

14:45

38 scholars from 13 institutions, including Fudan University, have jointly released what is described as the most systematic technical survey to date on safety in embodied AI. The survey runs over 70 pages, covers nearly 480 research papers, proposes the concept of Capability-Risk Duality, maps security risks at different levels, and provides open community resources.

14:45

The open-source AI agent project OpenClacky has been released. It reduces token consumption through cache optimization and tool simplification, with measured costs at 1/3 to 1/6 of similar projects. It also lets users package professional skills into Agent Skills and monetize them on a compounding basis through an authorization-code system.

14:44

FaceBit Intelligence and OpenBMB released BitCPM-CANN, the first ternary large model trained on domestic Huawei Ascend hardware. Four sizes from 0.5B to 8B are open source. The smallest needs only about 200MB of memory and can run on smartwatches, and the models retain more than 95% of the original model's capability.
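
For a sense of scale, the sketch below estimates weight storage for a 0.5B-parameter model at different precisions. It counts weights only and is not BitCPM-CANN's actual memory layout: the 2-bit packing and the information-theoretic log2(3) bits per ternary weight are generic assumptions, and the function name is illustrative.

```python
import math


def weight_megabytes(params: float, bits_per_weight: float) -> float:
    """Storage for the weights alone, in MB (1 MB = 10**6 bytes)."""
    return params * bits_per_weight / 8 / 1e6


PARAMS = 0.5e9  # smallest model size mentioned in the item above

# Assumed encodings for comparison; none are taken from the BitCPM-CANN release.
encodings = [
    ("fp16", 16.0),
    ("ternary, packed at 2 bits", 2.0),
    ("ternary, ideal log2(3) bits", math.log2(3)),
]
for label, bits in encodings:
    print(f"{label}: {weight_megabytes(PARAMS, bits):.0f} MB")
```

The reported figure of about 200MB is larger than the weight-only ternary estimates, which is unsurprising given that a running model also needs memory beyond its weights; the item does not break the figure down.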

11:31

ima has fully opened Copilot, which previously had a waitlist of over 100,000 people. It can put users' knowledge bases to work and directly access materials to complete tasks. Meanwhile, ima knowledge accounts now support Skill publishing, and the knowledge square adds entry points for sharing and using Skills.

10:27

Zhu Senhua, former director of the Huawei Cloud AI Algorithm Innovation Lab, founded JuNao PanShi, which is developing a Cognitive World Model grounded in cognitive science. The company recently completed a financing round at the 100-million-yuan level and will advance technology R&D and deployment validation.

10:16

User Mnimiy published a Claude settings audit on X, revealing 18 undocumented or hidden settings across three Claude platforms. Claude Code has 125 configuration keys, of which only 40 are documented. Proper configuration reduces token waste, cuts costs, and improves output quality, and the audit can be completed in 20 minutes.

10:15

Andon Labs ran experiments in which top large models fully operated businesses, including radio stations and physical stores. The San Francisco store lost $13,000 within one month and all projects failed, suggesting that current AI cannot yet operate fully autonomously in the real world.

10:15

AI startup Verkor's Design Conductor agent system, given only a 219-word requirements description, independently completed the entire design flow for VerCore, a 7nm RISC-V CPU, from requirements to layout in 12 hours, with no engineer involvement.

In-depth feature

10:15

Wang Leehom released a new single, "Come What May," on May 20 ("520") and simultaneously launched its music video, billed as the world's first interactive AI music film, which uses AI to generate varied visual effects and tell a heartwarming story.

10:14

San Francisco developer Affaan Mustafa built the ECC system with 38 agents and 156 skills based on Claude Code. Open-sourced under MIT license, it has gained 150,000 stars on GitHub. He previously won a hackathon with this technical approach, earning $15K worth of platform credits.

10:14

OpenAI is partnering with Broadcom to develop 10GW of custom AI chips at a total production cost of around $180 billion, aiming to reduce reliance on NVIDIA and cut costs. The project is currently stalled because Microsoft has not yet fulfilled its commitment to purchase 40% of the first-phase chips, which is required for financing.

10:13

On May 22, researchers from multiple institutions published the CODA paper. Through mathematical rewriting, it folds most Transformer computations into the epilogue of matrix multiplication to reduce memory movement, achieving a 5%-20% speedup and enabling LLMs and beginners to write high-performance kernels.
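
The general idea of a matmul epilogue can be shown with a toy example. The sketch below is not CODA's method or code; it is a plain-Python illustration, with hypothetical function names, contrasting a separate bias-and-ReLU pass with the same work applied to each accumulator just before it is stored.

```python
def matmul_unfused(a, b, bias):
    """Matmul, then bias add, then ReLU: each step re-reads the full output."""
    n, k, m = len(a), len(b), len(b[0])
    c = [[sum(a[i][p] * b[p][j] for p in range(k)) for j in range(m)]
         for i in range(n)]
    c = [[c[i][j] + bias[j] for j in range(m)] for i in range(n)]
    return [[max(0.0, v) for v in row] for row in c]


def matmul_fused_epilogue(a, b, bias):
    """Bias add and ReLU run in the epilogue, while the accumulator is still local."""
    n, k, m = len(a), len(b), len(b[0])
    out = [[0.0] * m for _ in range(n)]
    for i in range(n):
        for j in range(m):
            acc = 0.0
            for p in range(k):
                acc += a[i][p] * b[p][j]
            # Epilogue: finish the elementwise work before the single store.
            out[i][j] = max(0.0, acc + bias[j])
    return out


a = [[1.0, -2.0], [3.0, 4.0]]
b = [[1.0, 0.0], [0.0, 1.0]]   # identity, so a @ b == a
bias = [0.5, -10.0]
print(matmul_fused_epilogue(a, b, bias))
```

Both functions compute the same result; the fused version writes each output element once instead of making two extra passes over the matrix, which is the kind of memory traffic a GPU kernel epilogue avoids.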

In-depth feature

10:13

UniPat AI released SaaS-Bench, a benchmark for AI agents in real office scenarios. Across 106 test tasks, Claude Opus 4.7 fully solves only 3.8%, while Kimi K2.5 and Gemini 3.1 Pro solve none, exposing four major bottlenecks for current AI agents in real office work.

In-depth feature

09:50

Anthropic engineer Arnaud Doko shared three methods for working efficiently with Claude Code on the official podcast: elicit requirements interactively, write specifications in HTML instead of Markdown to save tokens, and make verification a native capability of the agent. He demonstrated them with a bill-splitting application.

09:49

On May 19 in Beijing, Zhixiang Future released HiDream-O1-Image-Pro, a native full-modal image model with more than 200B parameters that set new SOTA records on multiple tasks. The company also completed another financing round at the hundred-million level within half a month, with multiple institutions participating.

09:49

In May 2026, five Chinese world model startups made their debuts, each entering the field along a different technical route, and all have completed new financing rounds. Among them, GigaVision became China's first world model unicorn at a $1.4 billion valuation, and several of the companies have deployed their technology or released demos.

09:49

In an ICML 2026 paper, researchers from Xiaomi and other teams propose Visual Para-Thinker, the first parallel thinking framework for large vision-language models. It integrates the Pa-Attention and LPRoPE mechanisms and achieves significant performance improvements on multiple visual tasks.

09:48

A group of veteran film professionals launched MovieFlow Studio, an AI film production system that integrates end-to-end production, a reusable asset library, and collaboration management for teams on the scale of 1,000 people. Testing shows that an 80-episode short drama can be completed in just 3 days, with production efficiency up by over 300% and token consumption down by 70%.

09:47

Lun Wang, a Chinese researcher formerly at Google DeepMind, published an article after leaving the company arguing that the real bottleneck of the AI industry is not computing power or similar resources but the evaluation system. Existing evaluations assume that next-generation models are just enhanced versions of current ones; if models enter a completely new capability range, the entire evaluation infrastructure will collapse.

09:47

Digital marketing agency Graphite released a study in May 2026 showing that since November 2024, AI-generated English articles on the internet have outnumbered those written by humans, with the share stabilizing above 50% in 2025. Merriam-Webster selected "slop" as its 2025 Word of the Year, referring to low-quality content mass-produced by AI.

09:46

Researchers from The Chinese University of Hong Kong, Shenzhen and other institutions proposed AgentChord, an agent system for robotic manipulation. It has been accepted to RSS 2026 and open-sourced. The system anticipates potential failures and pre-integrates recovery actions into the task graph; experiments show it outperforms existing methods in success rate.

09:34

Nvidia CEO Jensen Huang expects that annual global AI infrastructure spending will reach $3 to $4 trillion by 2030. In Nvidia's latest fiscal quarter, revenue hit $81.6 billion, up 85% year-over-year, with net profit of $58.3 billion, more than tripling year-over-year.

09:27

Google launched Gemini for Science, an AI for Science tool suite, at I/O 2026. It integrates over 30 tools covering the entire R&D workflow. Experimental access opens in May, and hundreds of institutions already use it.
