
Pre-learning

Also known as: pre-training

Pre-training is the first stage of AI model training, in which the model learns the basics of language from large amounts of text. The model is then adapted to a specific purpose through fine-tuning.

Pre-training is the first step in building an AI model. The model reads large amounts of text, such as web documents and books, and is trained repeatedly to predict the next word. It can be likened to a general education in Korean, math, and common sense before studying for a specific exam; in the process, the model picks up the foundations of grammar, knowledge, and reasoning. Because building a new model from scratch for each task is inefficient, the idea is to train once and reuse the result for many purposes. Pre-training underpins modern LLMs: the P in GPT stands for "Pre-trained", and a pre-trained model is refined through fine-tuning and RLHF before it becomes an actual service.
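To make "predicting the next word" concrete, here is a minimal sketch in Python. It is a toy illustration only: real LLMs use neural networks trained on vast corpora, whereas this example just counts which word follows which in a tiny made-up text. The function names `train_bigram` and `predict_next` and the sample corpus are assumptions invented for this example.

```python
from collections import Counter, defaultdict


def train_bigram(text):
    """Count, for each word, which words follow it in the text."""
    counts = defaultdict(Counter)
    words = text.lower().split()
    # Pair every word with the word that comes right after it.
    for current, following in zip(words, words[1:]):
        counts[current][following] += 1
    return counts


def predict_next(counts, word):
    """Return the most frequent follower of `word`, or None if unseen."""
    followers = counts.get(word.lower())
    if not followers:
        return None
    return followers.most_common(1)[0][0]


# Hypothetical miniature "training corpus" (an assumption for illustration).
corpus = "the cat sat on the mat . the cat ate the fish ."
model = train_bigram(corpus)

print(predict_next(model, "the"))  # prints "cat" (follows "the" twice)
print(predict_next(model, "dog"))  # prints "None" (never seen in the corpus)
```

The same principle applies at scale: the more text the model has seen, the better its guesses about what comes next, which is why pre-training relies on very large amounts of text.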

The main concerns are cost and copyright: pre-training requires enormous amounts of GPU compute and electricity, which makes it extremely expensive, and disputes over the copyright of training data have led to lawsuits.

