Token
A token is the smallest text unit an AI model processes, and it is commonly used to calculate usage limits and costs.
Before an AI model reads or writes text, the text is split into tokens: whole words or smaller pieces of words. In English, for example, a common word is usually a single token, while a long or rare word is split into several tokens. In Korean, even a single word is often split into several tokens. Like building with Lego one brick at a time, the model understands and generates text token by token.
Tokens matter because AI service fees, usage limits, and the amount of text a model can handle at one time (the context window) are all measured in tokens. You need this concept to read the per-token price in an API pricing table and the context length in a model specification.
In practice, it also helps to know that sentences with the same meaning need different numbers of tokens depending on the language. Korean tends to consume more tokens than English.
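To make the link between token counts, fees, and the context window concrete, here is a minimal Python sketch. Everything in it is an assumption for illustration: the `ModelSpec` fields, the prices, and the 8,000-token context window are hypothetical and do not describe any real service. It also assumes that input and output tokens are priced separately and that both count toward the context window, which is a common arrangement but not a universal one.

```python
from dataclasses import dataclass


@dataclass
class ModelSpec:
    # All values are hypothetical, chosen only to keep the arithmetic simple.
    context_window: int              # max tokens per request (input + output)
    input_price_per_million: float   # price per 1,000,000 input tokens
    output_price_per_million: float  # price per 1,000,000 output tokens


def estimate_cost(spec: ModelSpec, input_tokens: int, output_tokens: int) -> float:
    """Estimate the fee for one request from its token counts."""
    total = (input_tokens * spec.input_price_per_million
             + output_tokens * spec.output_price_per_million)
    return total / 1_000_000


def fits_context(spec: ModelSpec, input_tokens: int, output_tokens: int) -> bool:
    """Check whether the request stays within the context window."""
    return input_tokens + output_tokens <= spec.context_window


# Hypothetical model: 8,000-token window, 1.0 per 1M input, 2.0 per 1M output.
spec = ModelSpec(context_window=8_000,
                 input_price_per_million=1.0,
                 output_price_per_million=2.0)

print(estimate_cost(spec, 3_000, 1_000))  # (3000*1.0 + 1000*2.0) / 1e6
print(fits_context(spec, 3_000, 1_000))   # 4,000 tokens fit in 8,000
print(fits_context(spec, 7_500, 1_000))   # 8,500 tokens do not fit
```

The token counts themselves have to come from the tokenizer of the model you actually use. The sketch only shows what happens to those counts once you have them.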
✅ Why it matters
- Lets you understand AI fees and usage limits and manage costs
- Provides the basis for reading model specifications such as the context window
- Serves as the unit for measuring workload when processing long documents or using APIs
⚠️ Limits and debates
- Each model splits text into tokens differently, so token counts and per-token prices are hard to compare directly across services
- Korean consumes more tokens than English, so it tends to cost more for the same content
- Token counts do not match character or word counts exactly, so they are hard to estimate intuitively
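The mismatch between characters, words, and tokens can be glimpsed with a small Python sketch. It does not count tokens, because that requires a specific model's tokenizer. It only compares character, word, and UTF-8 byte counts for a greeting in each language. Some tokenizers work from UTF-8 bytes, so the byte count is offered here as a rough hint at why Korean text may need more tokens, not as a measurement. The helper name `text_stats` is hypothetical.

```python
def text_stats(text: str) -> dict:
    """Return simple size measures for a piece of text (not token counts)."""
    return {
        "characters": len(text),                  # Unicode code points
        "words": len(text.split()),               # whitespace-separated words
        "utf8_bytes": len(text.encode("utf-8")),  # size when encoded as UTF-8
    }


english = "Hello"
korean = "안녕하세요"

print(text_stats(english))  # {'characters': 5, 'words': 1, 'utf8_bytes': 5}
print(text_stats(korean))   # {'characters': 5, 'words': 1, 'utf8_bytes': 15}
```

Both greetings have five characters and one word, yet the Korean one takes three times as many bytes, since each Hangul syllable is encoded as three bytes in UTF-8. Actual token counts depend on the tokenizer and can differ from all three measures.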