
Omni-Infer v0.5.0 has been released, delivering inference acceleration for ultra-large-scale MoE models. The model, hardware, precision, and deployment combinations supported in this release are listed below:
| Model | Hardware | Precision | Deployment form |
|---|---|---|---|
| DeepSeek-R1 | A3 | INT8 | PD disaggregation |
| DeepSeek-R1 | A3 | W4A8C16 | PD disaggregation |
| DeepSeek-R1 | A3 | BF16 | PD disaggregation |
| DeepSeek-R1 | A2 | INT8 | PD disaggregation |
| Qwen2.5-7B | A3 | INT8 | Hybrid deployment (TP>=1, DP=1) |
| Qwen2.5-7B | A2 | INT8 | Hybrid deployment (TP>=1, DP=1) |
| QwQ | A3 | BF16 | PD disaggregation |
| Qwen3-32B | A3 | BF16 | PD disaggregation |
| Qwen3-235B | A3 | INT8 | PD disaggregation |
| Kimi-K2 | A3 | W4A8C16 | PD disaggregation |
Pre-built Docker images are provided for each supported hardware and architecture combination:

| Hardware | Architecture | Image (pull command) | Tar package |
|---|---|---|---|
| A3 | arm | docker pull swr.cn-east-4.myhuaweicloud.com/omni/omni_infer-a3-arm:release_v0.5.0 | omni_infer-a3-arm:v0.5.0 |
| A3 | x86 | docker pull swr.cn-east-4.myhuaweicloud.com/omni/omni_infer-a3-x86:release_v0.5.0 | omni_infer-a3-x86:v0.5.0 |
| A2 | arm | docker pull swr.cn-east-4.myhuaweicloud.com/omni/omni_infer-a2-arm:release_v0.5.0 | omni_infer-a2-arm:v0.5.0 |
| A2 | x86 | docker pull swr.cn-east-4.myhuaweicloud.com/omni/omni_infer-a2-x86:release_v0.5.0 | omni_infer-a2-x86:v0.5.0 |
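As a minimal sketch, assuming Docker is installed on the target machine, the commands below fetch the A3 (arm) image with the pull command from the table, or load it from the offline tar package; the tar file name here is an assumption based on the table entry, and the other hardware/architecture combinations follow the same pattern.

```bash
# Pull the A3 (arm) image directly from the registry
# (the other combinations use the pull commands listed in the table above).
docker pull swr.cn-east-4.myhuaweicloud.com/omni/omni_infer-a3-arm:release_v0.5.0

# Alternatively, if the image was delivered as an offline tar package,
# load it locally (file name assumed from the table entry).
docker load -i omni_infer-a3-arm_v0.5.0.tar

# Verify that the image is now available locally.
docker images | grep omni_infer
```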
For details, see: https://gitee.com/omniai/omniinfer/releases/v0.5.0