Knowledge of training best practices can reduce the cost of training a desired model significantly. Here, we link to readings and resources on effectively using a given resource budget for model training, including canonical papers on fitting scaling laws.
5 Resources for Model Training: Efficiency & Resource Allocation
Efficiency & Resource Allocation
Scaling Laws for Neural Language Models
Provides scaling laws for determining the optimal allocation of a fixed compute budget.
Text
Cerebras Model Lab
A calculator for applying compute-optimal scaling laws to a given budget, including factoring in expected total inference usage.
Text Speech Vision
Scaling Data-Constrained Language Models
Demonstrates an optimal allocation of compute when dataset size is bounded.
Text
Training Compute-Optimal Language Models
Proposes an optimal allocation of computational budget between model and dataset size, and shows experimental design for fitting scaling laws for compute allocation in a new setting.
Text
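As a rough illustration of what "allocating a compute budget between model and dataset size" can look like in practice, here is a minimal Python sketch. It relies on two assumptions that are not stated on this page: the common approximation that training cost is about 6 FLOPs per parameter per token (C ≈ 6·N·D), and the widely quoted rule of thumb of about 20 training tokens per parameter. The function names and the default ratio are hypothetical and chosen for illustration only; the papers listed above fit these quantities empirically, and the right values may differ in a new setting. The second helper shows one simple way to fit a power law by linear regression in log-log space, which is a simplified stand-in for the fitting procedures those papers describe.

```python
import math


def compute_optimal_allocation(compute_flops, tokens_per_param=20.0):
    """Split a FLOP budget into parameter count N and token count D.

    Assumptions (illustrative, not taken from this page):
      * training cost C ~= 6 * N * D FLOPs
      * D ~= tokens_per_param * N (a rule-of-thumb ratio)
    Solving both for N gives N = sqrt(C / (6 * tokens_per_param)).
    """
    if compute_flops <= 0 or tokens_per_param <= 0:
        raise ValueError("compute_flops and tokens_per_param must be positive")
    n_params = math.sqrt(compute_flops / (6.0 * tokens_per_param))
    n_tokens = tokens_per_param * n_params
    return n_params, n_tokens


def fit_power_law(xs, ys):
    """Fit y ~= coeff * x**exponent by least squares on (log x, log y)."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need at least two (x, y) pairs of equal length")
    log_x = [math.log(x) for x in xs]
    log_y = [math.log(y) for y in ys]
    mean_x = sum(log_x) / len(log_x)
    mean_y = sum(log_y) / len(log_y)
    # Ordinary least-squares slope and intercept in log space.
    sxx = sum((lx - mean_x) ** 2 for lx in log_x)
    sxy = sum((lx - mean_x) * (ly - mean_y) for lx, ly in zip(log_x, log_y))
    exponent = sxy / sxx
    coeff = math.exp(mean_y - exponent * mean_x)
    return coeff, exponent


if __name__ == "__main__":
    n, d = compute_optimal_allocation(1.2e20)
    print(f"params ~ {n:.3g}, tokens ~ {d:.3g}")
```

Under these assumptions, a budget of 1.2e20 FLOPs corresponds to roughly 1e9 parameters trained on roughly 2e10 tokens. In a real project the ratio and exponents would come from fitting small-scale training runs, as the papers above describe, rather than from fixed defaults.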