Mixture of Experts (MoE)
Mixture of Experts is an architecture in which the model is divided internally into several experts, and only some of them are activated for each input. It is one of the key techniques for delivering strong performance at low computational cost.
Rather than running one huge model in its entirety, a Mixture of Experts (MoE) model splits its interior into several expert blocks and computes only the ones needed for each input (in language models, typically each token). Just as a general hospital directs patients to the department that fits their symptoms, a component called a router assigns each input to the appropriate experts.
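The routing step can be sketched in a few lines of Python. This is a minimal illustration, not any particular model's implementation: it assumes a linear router followed by a softmax, picks the `top_k` highest-scoring experts, and renormalizes their weights, which is one common choice among several. The names `route` and `moe_forward`, and the toy experts that simply scale their input, are hypothetical.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def route(x, router_weights, top_k=2):
    """Return [(expert_index, weight), ...] for the top_k experts chosen for x."""
    # One score per expert: dot product of the input with that expert's router row.
    scores = [sum(w * v for w, v in zip(row, x)) for row in router_weights]
    probs = softmax(scores)
    # Keep only the top_k most probable experts.
    chosen = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:top_k]
    # Assumption: renormalize so the selected weights sum to 1.
    norm = sum(probs[i] for i in chosen)
    return [(i, probs[i] / norm) for i in chosen]


def moe_forward(x, router_weights, experts, top_k=2):
    """Run only the selected experts and combine their outputs by router weight."""
    out = [0.0] * len(x)
    for idx, weight in route(x, router_weights, top_k):
        y = experts[idx](x)  # experts that were not selected are never called
        out = [o + weight * y_i for o, y_i in zip(out, y)]
    return out


# Toy setup: 4 experts, each a hypothetical stand-in that just scales the input.
random.seed(0)
dim, num_experts = 3, 4
router_weights = [[random.uniform(-1, 1) for _ in range(dim)]
                  for _ in range(num_experts)]
experts = [lambda v, s=s: [s * v_i for v_i in v] for s in (1.0, 2.0, 3.0, 4.0)]

x = [0.5, -1.0, 2.0]
selected = route(x, router_weights, top_k=2)
output = moe_forward(x, router_weights, experts, top_k=2)
```

In a real model the router weights are learned together with the experts, and each expert is a full feed-forward block rather than a scaling function.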
Because only a fraction of the parameters take part in the computation for any given input while the total parameter count stays large, a larger model can be run at the same computational cost. Thanks to this efficiency, many recently released large language models adopt the MoE architecture.
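The gap between total and active parameters can be made concrete with rough arithmetic. The sketch below uses made-up numbers chosen purely for illustration (they do not describe any real model) and assumes 16-bit weights. `moe_param_counts` is a hypothetical helper.

```python
def moe_param_counts(shared_params, params_per_expert, num_experts, top_k):
    """Return (total, active) parameter counts for a simplified MoE model.

    shared_params: parameters every input passes through (attention, embeddings, ...)
    params_per_expert: parameters in one expert, summed over all layers
    """
    total = shared_params + num_experts * params_per_expert
    active = shared_params + top_k * params_per_expert
    return total, active


# Illustrative, made-up configuration: 8 experts, 2 active per input.
total, active = moe_param_counts(
    shared_params=2_000_000_000,
    params_per_expert=1_000_000_000,
    num_experts=8,
    top_k=2,
)

active_fraction = active / total   # share of parameters used per input
bytes_per_param = 2                # assumption: 16-bit weights
memory_gb = total * bytes_per_param / 1e9  # all experts must still be loaded

print(f"total={total:,} active={active:,} "
      f"fraction={active_fraction:.0%} memory={memory_gb:.0f} GB")
```

With these numbers, each input touches 4 billion of the 10 billion parameters, so per-input compute resembles that of a much smaller dense model, while memory still has to hold all 10 billion.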
A common misconception is that each expert handles a field humans would recognize, such as math or law. In reality, roles are divided automatically during training, and the resulting division may not match human intuition.
✅ Why it matters
- It can deliver the performance of a larger model at the same computational cost
- Only a few experts run during inference, which makes it efficient in both speed and cost
- It has established itself as a practical alternative in the race to scale up model size
⚠️ Limits and debates
- Memory requirements remain large because all parameters must be loaded into memory, even though only some are used for each input
- Training is prone to instability, which makes these models harder to train
- The way experts specialize often does not match human areas of expertise