
Data labeling

Also known as: labeling, annotation

Data labeling is the task of attaching the correct answer (a label) to each item of training data. It is also a large, labor-intensive industry behind AI.

Data labeling is the process of attaching correct answers to the data an AI system will learn from. Think of it as writing the answer key for a workbook: tagging a picture of a cat as "cat", or marking sentences as positive or negative. Because AI often learns from data paired with correct answers, the quality of the labels directly shapes how well the resulting model performs. Labeling covers tasks such as marking pedestrians and lanes in autonomous-driving video and rating whether chatbot answers are good or bad. It is also a large-scale manual industry behind the polished surface of AI.
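To make the idea concrete, here is a minimal Python sketch of what labeled data can look like. The `LabeledExample` class, the file names, and the sentences are hypothetical and chosen only to mirror the cat and sentiment examples above; real projects use whatever schema their tooling defines.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LabeledExample:
    data: str   # the raw item (a file name or a sentence in this sketch)
    label: str  # the "correct answer" attached by an annotator


# Hypothetical examples mirroring the cases described above.
image_examples = [
    LabeledExample("photo_001.jpg", "cat"),
    LabeledExample("photo_002.jpg", "dog"),
]
sentiment_examples = [
    LabeledExample("I loved this movie.", "positive"),
    LabeledExample("The service was terrible.", "negative"),
]


def label_counts(examples):
    """Count how many examples carry each label."""
    counts = {}
    for ex in examples:
        counts[ex.label] = counts.get(ex.label, 0) + 1
    return counts


print(label_counts(sentiment_examples))  # {'positive': 1, 'negative': 1}
```

A quick count like this is one simple way to notice a lopsided or mislabeled dataset before training on it.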

Because much of this work is outsourced to workers in low-wage countries, working conditions have become a subject of controversy. More recently, automation has grown as AI assists with labeling, but human review is still essential.
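One common way to combine AI-assisted labeling with human review is to let a model suggest labels and send uncertain suggestions to a person. The sketch below illustrates that pattern under stated assumptions: the function `route_for_review`, the 0.9 threshold, and the sample confidence scores are all hypothetical, not a description of any particular labeling tool.

```python
def route_for_review(predictions, threshold=0.9):
    """Split model-suggested labels into two queues.

    predictions: list of (item, suggested_label, confidence) tuples.
    threshold: hypothetical cutoff; a real project would tune this value.
    """
    accepted, needs_review = [], []
    for item, label, confidence in predictions:
        if confidence >= threshold:
            accepted.append((item, label))      # provisionally accepted
        else:
            needs_review.append((item, label))  # sent to a human annotator
    return accepted, needs_review


# Hypothetical model suggestions with made-up confidence scores.
predictions = [
    ("photo_001.jpg", "cat", 0.98),
    ("photo_002.jpg", "dog", 0.55),
]
accepted, needs_review = route_for_review(predictions)
print(accepted)      # [('photo_001.jpg', 'cat')]
print(needs_review)  # [('photo_002.jpg', 'dog')]
```

Even the provisionally accepted items would typically be spot-checked by people, since a confident model can still be wrong.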

