eWeek
2 days ago
FrontierMath Benchmark Exposes AI Struggles in Advanced Math
FrontierMath Benchmark tests AI's limits in solving complex math, revealing challenges in advanced reasoning despite progress ...
MIT Technology Review
6 days ago
The way we measure progress in AI is terrible
A benchmark is essentially a test that an AI takes. It can be in a multiple-choice format like the most popular one, the ...
3 days ago
Alibaba releases Qwen with Questions, an open reasoning model that beats o1-preview
QwQ uses inference-time scaling to solve complex reasoning and planning questions, besting OpenAI's o1 in several benchmarks.
4 days ago
on MSN
Alibaba releases QwQ-32B-Preview, an AI rival to OpenAI's o1
This model is focused on advancing AI reasoning capabilities. In contrast to most AI, QwQ-32B-Preview and similar models can ...
2 hours ago
Chinese AI firms rush out costly ‘reasoning’ models to take on OpenAI’s o1
Alibaba Cloud is the latest among a slew of Chinese firms to roll out the AI models that take more time to reason through ...
1 day ago
DeepSeek challenges OpenAI's o1 in chain of thought - but it's missing a few links
To explore the matter, I put OpenAI's o1 against R1-Lite, the newest model from China-based startup DeepSeek. R1-Lite goes ...
InfoQ
5 days ago
Epoch AI Unveils FrontierMath: A New Frontier in Testing AI's Mathematical Reasoning ...
A monthly overview of things you need to know as an architect or aspiring architect.
MIT Technology Review
6 days ago
The Download: rethinking AI benchmarks, and the ethics of AI agents
Every time a new AI model is released, it’s typically touted as acing its performance against a series of benchmarks.
ReadWrite
4 days ago
Alibaba’s new AI model goes head to head with OpenAI o1
AI from Alibaba has taken a dramatic leap, as its new model, QwQ-32B, brings a new reasoning challenger to the market.
5 days ago
on MSN
Alibaba releases an ‘open’ challenger to OpenAI’s o1 reasoning model
Per Alibaba’s testing, QwQ-32B-Preview beats OpenAI’s o1-preview model on the AIME and MATH tests. AIME uses other AI models ...