
(FM) MiniMax-M1: Scaling Test-Time Compute Efficiently with Lightning Attention
About this content
Join us to explore MiniMax-M1, a revolutionary development from MiniMax, hailed as the world's first open-weight, large-scale hybrid-attention reasoning model. At its core, MiniMax-M1 leverages a sophisticated hybrid Mixture-of-Experts (MoE) architecture paired with a novel lightning attention mechanism, which together facilitate the efficient scaling of test-time compute. A significant advancement is its native support for an impressive 1 million token context length, an eightfold expansion compared to competitors like DeepSeek R1, making it exceptionally well-suited for complex tasks demanding the processing of extensive inputs and prolonged reasoning.
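As a rough intuition for how the attention design keeps very long inputs tractable, the sketch below shows the causal linear-attention recurrence that lightning-style attention builds on: a fixed-size state is carried across tokens instead of a full sequence-by-sequence attention matrix, so compute grows linearly with context length. This is an illustrative, unnormalized Python sketch under the assumption of PyTorch tensors, not the actual lightning attention kernel described in the report.

```python
import torch

def causal_linear_attention(q, k, v):
    """Illustrative causal linear-attention recurrence (not the real kernel).

    q, k, v have shape (seq_len, d). Only a d x d running state is kept,
    so the cost is O(seq_len * d^2) rather than O(seq_len^2 * d).
    """
    seq_len, d = q.shape
    state = torch.zeros(d, d, dtype=q.dtype)      # running sum of k_t v_t^T
    outputs = []
    for t in range(seq_len):
        state = state + torch.outer(k[t], v[t])   # update fixed-size state
        outputs.append(q[t] @ state)              # read out with current query
    return torch.stack(outputs)
```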
Further enhancing its capabilities, MiniMax-M1 was trained with CISPO, a new reinforcement learning algorithm. By clipping importance sampling weights rather than token updates, CISPO notably improves RL efficiency, as demonstrated by the model's full RL training completing in just three weeks on 512 H800 GPUs at a cost of only $534,700. The model is particularly strong in practical applications such as complex software engineering, tool use, and long-context tasks, having been trained in diverse real-world software engineering environments. While its design and performance are described in detail, the technical report does not explicitly discuss the model's limitations.
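For listeners curious what "clipping importance sampling weights rather than token updates" looks like, here is a minimal, hedged sketch of a CISPO-style loss in PyTorch. The function name cispo_loss and the parameters eps_low and eps_high are illustrative assumptions rather than the report's actual code; the key idea is that the importance sampling ratio is clamped and detached, so every token still contributes a gradient instead of being dropped by a PPO-style update clip.

```python
import torch

def cispo_loss(logp_new, logp_old, advantages, eps_low=0.2, eps_high=0.2):
    """Sketch of a CISPO-style objective (illustrative, not the official code).

    logp_new, logp_old, advantages: per-token tensors of equal shape.
    """
    # Token-level importance sampling ratio between current and behavior policy.
    ratio = torch.exp(logp_new - logp_old)
    # Clip the IS weight itself (not the update) and stop gradients through it.
    clipped_ratio = torch.clamp(ratio, 1.0 - eps_low, 1.0 + eps_high).detach()
    # REINFORCE-style surrogate: every token keeps a gradient via logp_new.
    per_token = clipped_ratio * advantages * logp_new
    # Maximize the surrogate, i.e. minimize its negative mean over tokens.
    return -per_token.mean()
```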
To learn more, explore the full technical report: https://arxiv.org/abs/2506.13585.