Model lightweighting

Also known as: lightweighting

Model lightweighting is a set of techniques for shrinking large AI models into smaller ones with minimal performance loss. It is a prerequisite for on-device AI.

Model lightweighting reduces the size and computational cost of a huge AI model, one made up of billions of numeric parameters, while preserving as much of its performance as possible. It is like turning a thick encyclopedia into a paperback that keeps only the core content. Representative techniques include quantization, pruning, and knowledge distillation.
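To make two of these techniques concrete, here is a minimal sketch in plain Python. It assumes a toy list of random weights rather than a real model, and the helper names (`quantize_int8`, `dequantize`, `prune_by_magnitude`) are hypothetical, chosen only for illustration. It shows symmetric 8-bit quantization, which stores each weight as a small integer plus one shared scale, and magnitude pruning, which zeroes the weights closest to zero.

```python
import random

def quantize_int8(weights):
    """Symmetric per-tensor quantization: map floats to integers in [-127, 127]."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127 if max_abs > 0 else 1.0
    # Round to the nearest integer step and clamp to the int8 range.
    q = [max(-127, min(127, round(w / scale))) for w in weights]
    return q, scale

def dequantize(q, scale):
    """Recover approximate float weights from the integers and the shared scale."""
    return [v * scale for v in q]

def prune_by_magnitude(weights, sparsity):
    """Zero out the `sparsity` fraction of weights with the smallest magnitude."""
    k = int(len(weights) * sparsity)
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    smallest = set(order[:k])
    return [0.0 if i in smallest else w for i, w in enumerate(weights)]

# Toy stand-in for one layer of a model (an assumption, not real weights).
random.seed(0)
weights = [random.gauss(0.0, 0.05) for _ in range(1000)]

q, scale = quantize_int8(weights)
restored = dequantize(q, scale)
max_err = max(abs(a - b) for a, b in zip(weights, restored))

pruned = prune_by_magnitude(weights, sparsity=0.5)
zeros = sum(1 for w in pruned if w == 0.0)

print(f"largest quantization error: {max_err:.6f} (step size {scale:.6f})")
print(f"weights zeroed by pruning: {zeros} of {len(pruned)}")
```

Real toolkits typically quantize per channel or per block and fine-tune after pruning. Knowledge distillation is not shown here because it requires training a smaller model on the outputs of the larger one.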

Large models need expensive servers and GPUs to run, so lightweighting is essential for running AI on small devices such as smartphones and laptops. Companies that want to lower the cost of operating AI services invest in lightweighting for the same reason.
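A rough back-of-the-envelope calculation shows why precision matters for small devices. The sketch below assumes a hypothetical 7-billion-parameter model and counts only the memory needed to hold the weights. Activations, caches, and runtime overhead are ignored, so real requirements are higher.

```python
def weight_memory_gb(num_params, bits_per_param):
    """Memory needed to store the weights alone, in gigabytes (1 GB = 1e9 bytes)."""
    return num_params * bits_per_param / 8 / 1e9

NUM_PARAMS = 7_000_000_000  # hypothetical model size, for illustration only

for label, bits in [("float32", 32), ("float16", 16), ("int8", 8), ("int4", 4)]:
    print(f"{label:>8}: {weight_memory_gb(NUM_PARAMS, bits):.1f} GB")
```

Under these assumptions the weights take 28.0 GB at 32 bits per parameter but 3.5 GB at 4 bits, which is the difference between needing a server GPU and fitting within the memory of a laptop or a high-end phone.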

Lightweighting is not free. The more aggressively a model is compressed, the more likely it is to lose performance, particularly subtle reasoning ability and rarely used knowledge, so the key is to find a balance that suits the intended use.
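The trade-off can be illustrated with a small experiment, sketched here under stated assumptions: the weights are random toy values, the quantizer is a simple symmetric one, and the function name `quantization_rmse` is hypothetical. Weight reconstruction error is only a proxy for the loss in model quality, but it shows the direction of the effect as fewer bits are used.

```python
import random

def quantization_rmse(weights, bits):
    """Root-mean-square error after symmetric quantization to `bits` bits."""
    levels = 2 ** (bits - 1) - 1          # e.g. 127 for 8 bits, 7 for 4 bits
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / levels
    squared_error = 0.0
    for w in weights:
        restored = round(w / scale) * scale  # quantize, then dequantize
        squared_error += (w - restored) ** 2
    return (squared_error / len(weights)) ** 0.5

# Toy weights (an assumption); real layers have different distributions.
random.seed(0)
weights = [random.gauss(0.0, 0.05) for _ in range(10_000)]

errors = {bits: quantization_rmse(weights, bits) for bits in (8, 4, 2)}
for bits, err in errors.items():
    print(f"{bits}-bit: RMSE {err:.5f}")
```

In this sketch the error grows each time the bit width is cut, which mirrors the point above: more compression saves more memory but moves the model further from its original behavior.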

