ERNIE 4.5 Gets a Major Inference Speed Boost
How the new PLAS sparse attention update delivers performance gains for long-context inference on ERNIE 4.5 models.
FastDeploy 2.0: A Large-Scale Model Inference and Deployment Toolkit with Native Support for ERNIE 4.5
As large models such as the ERNIE 4.5 family continue to be open-sourced, interest in their inference performance and deployment efficiency has grown across both research and industry. FastDeploy 2.0, built on the PaddlePaddle framework, addresses this demand with an end-to-end toolkit for efficient deployment and high-performance inference of large models.
Announcing the Open Source Release of the ERNIE 4.5 Model Family
We introduce ERNIE 4.5, a new family of large-scale multimodal models comprising 10 distinct variants. The family consists of Mixture-of-Experts (MoE) models with 47B and 3B active parameters, the largest having 424B total parameters, as well as a 0.3B dense model.