
Inference model

Also known as: reasoning model, thinking model

An inference model is an AI model that works through a step-by-step thinking process before answering. It is strong on difficult problems such as math and coding, but its responses are slower and more expensive.

An inference model does not answer immediately when it receives a question. Instead, it develops and reviews its reasoning step by step internally before responding. It is like a student who writes out the solution in a workbook instead of answering straight away from mental arithmetic.

Because earlier LLMs kept making mistakes in mathematics, logic, and complex coding, the idea that giving a model time to think before answering improves accuracy gained attention. Major AI companies are now competing to release inference models, pursuing a new direction of scaling that increases computation at answer time instead of growing the model at training time.

However, because the model consumes tokens in proportion to how much it thinks, responses are slow and costs are high, which makes it inefficient for simple questions. There is also research suggesting that the thought process shown to the user may not reflect the model's actual internal decision-making.
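
The cost side of this trade-off can be illustrated with simple arithmetic. The sketch below is only an illustration: the prices, token counts, and generation speed are made-up numbers, not any vendor's real rates, and it assumes that thinking tokens are billed and generated like ordinary output tokens. The names `Pricing`, `estimate_cost`, and `estimate_seconds` are hypothetical helpers introduced for this example.

```python
from dataclasses import dataclass


@dataclass
class Pricing:
    # Hypothetical prices in USD per one million tokens (illustrative only).
    input_per_million: float
    output_per_million: float


def estimate_cost(pricing, prompt_tokens, answer_tokens, thinking_tokens=0):
    """Estimate the cost of one request in USD.

    Assumption: thinking tokens are billed at the output-token rate.
    """
    billed_output = answer_tokens + thinking_tokens
    total = (prompt_tokens * pricing.input_per_million
             + billed_output * pricing.output_per_million)
    return total / 1_000_000


def estimate_seconds(answer_tokens, thinking_tokens, tokens_per_second):
    """Estimate generation time, assuming a constant generation speed."""
    return (answer_tokens + thinking_tokens) / tokens_per_second


# Illustrative numbers: a short question answered with and without thinking.
pricing = Pricing(input_per_million=1.0, output_per_million=4.0)

direct_cost = estimate_cost(pricing, prompt_tokens=50, answer_tokens=100)
thinking_cost = estimate_cost(pricing, prompt_tokens=50, answer_tokens=100,
                              thinking_tokens=2000)

direct_time = estimate_seconds(100, 0, tokens_per_second=50)
thinking_time = estimate_seconds(100, 2000, tokens_per_second=50)

print(f"direct:   ${direct_cost:.5f}, {direct_time:.0f} s")
print(f"thinking: ${thinking_cost:.5f}, {thinking_time:.0f} s")
```

Under these assumed numbers, the same 100-token answer costs roughly 19 times more and takes about 20 times longer once 2,000 thinking tokens are added, which is why a simple question rarely justifies the extra reasoning.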

✅ Why it matters

⚠️ Limits and debates
