Chinese AI Computing Stocks Rally as Xiaomi Open-Sources Its First Large Model Built for Reasoning - Tiger Community (老虎社区)



Tiger News Digest (老虎资讯综合), 2025-04-30

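On Wednesday, US-listed Chinese AI computing stocks rallied: Kingsoft Cloud rose more than 12% at one point, GDS Holdings gained more than 6%, and VNET Group gained more than 5%. The same day, Xiaomi open-sourced "Xiaomi MiMo", its first large model built for reasoning.

According to Xiaomi, MiMo was designed for reasoning from the start, with pre-training and post-training working together to improve reasoning ability across the board.

On public benchmarks for mathematical reasoning (AIME 24-25) and competitive coding (LiveCodeBench v5), MiMo, at only 7B parameters, outperformed OpenAI's closed-source reasoning model o1-mini and Alibaba Qwen's larger open-source reasoning model QwQ-32B-Preview.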

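Reinforcement learning potential surpasses a classic open-source 32B model

Since DeepSeek-R1 set off an industry-wide wave of work on reinforcement learning (RL), DeepSeek-R1-Distill-7B and Qwen2.5-32B have become widely used starting models for RL.

Trained on the same RL data, MiMo-7B shows a clear lead in RL potential on math and code.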

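Pre-training plus post-training, jointly improving reasoning

MiMo's gains in reasoning are driven jointly by innovations in data, algorithms, and other areas across the pre-training and post-training stages, including: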

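Pre-training: the core idea is to expose the model to more reasoning patterns
- Data: focused on mining reasoning-rich corpora, and synthesized about 200B tokens of reasoning data.
- Training: three-stage training with progressively increasing difficulty, totaling 25T training tokens.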
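
To put the two pre-training figures side by side, the short calculation below estimates what share of the training budget the synthetic reasoning data would represent. It is only a rough sketch: the article does not say whether the 200B synthetic tokens are counted inside the 25T total or how often they are repeated, so the assumption here is that each synthetic token is seen once and is included in the total. The variable names are illustrative.

```python
# Figures quoted in the article (both are approximate).
SYNTHETIC_REASONING_TOKENS = 200 * 10**9   # about 200B tokens
TOTAL_TRAINING_TOKENS = 25 * 10**12        # 25T tokens

# Assumption (not stated in the article): the synthetic data is part of
# the 25T total and each token is seen exactly once.
share = SYNTHETIC_REASONING_TOKENS / TOTAL_TRAINING_TOKENS

print(f"Synthetic reasoning data: {share:.1%} of training tokens")
```

Under that assumption the synthetic data is a small fraction of the tokens seen, which fits the article's framing that most of the effort went into mining reasoning-rich text from existing corpora.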

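Post-training: the core is an efficient, stable RL algorithm and framework
- Algorithm: proposes Test Difficulty Driven Reward to mitigate reward sparsity on hard algorithmic problems, and introduces an Easy Data Re-Sampling strategy to stabilize RL training.
- Framework: designed the Seamless Rollout system, which speeds up RL training by 2.29x and validation by 1.96x.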
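
The article does not give the formula behind Test Difficulty Driven Reward, so the following is not Xiaomi's method. It is a minimal sketch of the general problem it names: with an all-or-nothing reward, a solution that passes some but not all test cases earns zero, so hard problems rarely produce any learning signal. One generic way to densify the signal is partial credit weighted by test difficulty. The function names and the difficulty weights below are hypothetical.

```python
from typing import Sequence


def binary_reward(passed: Sequence[bool]) -> float:
    """All-or-nothing reward: 1.0 only if every test case passes."""
    return 1.0 if all(passed) else 0.0


def difficulty_weighted_reward(
    passed: Sequence[bool], difficulty: Sequence[float]
) -> float:
    """Partial credit: fraction of total difficulty weight earned by passed tests.

    The weights are hypothetical; the article does not say how difficulty
    is measured or how it enters the reward.
    """
    if len(passed) != len(difficulty):
        raise ValueError("passed and difficulty must have the same length")
    total = sum(difficulty)
    if total <= 0:
        raise ValueError("difficulty weights must sum to a positive value")
    earned = sum(w for ok, w in zip(passed, difficulty) if ok)
    return earned / total


# A solution that passes the two easy tests but fails the two hard ones.
passed = [True, True, False, False]
difficulty = [1.0, 1.0, 2.0, 4.0]  # hypothetical weights

sparse = binary_reward(passed)                           # no signal at all
dense = difficulty_weighted_reward(passed, difficulty)   # partial credit

print(f"binary reward: {sparse}, difficulty-weighted reward: {dense}")
```

In this toy case the binary reward gives the policy nothing to learn from, while the weighted version still distinguishes a partly correct solution from a completely wrong one.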



Comments (1)

Blackland
·2025-04-30
There are so many models now that it's exhausting. What are they actually good for?
