AI Glossary

What is Inference?

Inference is the process of using a trained AI model to generate predictions or outputs from new input data.

What is the core idea behind AI inference?

Training builds the model. Inference uses it.
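The split can be illustrated with a minimal sketch. It assumes a toy one-variable linear model fitted by ordinary least squares. The function names `train` and `infer` and the data are hypothetical, chosen only for illustration.

```python
# Training: estimate slope and intercept from labeled examples
# (ordinary least squares on a single input variable).
def train(xs, ys):
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var = sum((x - mean_x) ** 2 for x in xs)
    slope = cov / var
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Inference: apply the already-learned parameters to an input
# that was not part of the training data. No learning happens here.
def infer(model, x):
    slope, intercept = model
    return slope * x + intercept


# Illustrative data that follows y = 2x + 1.
model = train([1.0, 2.0, 3.0, 4.0], [3.0, 5.0, 7.0, 9.0])
prediction = infer(model, 10.0)
print(prediction)
```

Training runs once and produces the parameters. Inference can then run any number of times against those fixed parameters.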

How does AI inference differ from related concepts?

Concept | Difference
Inference vs Training | Training learns from data. Inference applies what was learned.
Inference vs Serving | Inference is prediction. Serving manages delivery.
Inference vs Compute | Compute is the resource. Inference is the task.

How does AI inference work?

What are the limitations of AI inference?

Why is AI inference important?

Inference is what makes AI useful in practice. Every time you interact with an AI tool, inference is happening. Inference costs are a major factor in AI deployment economics.
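A rough back-of-the-envelope sketch shows how per-call costs add up. It assumes a service billed per token, which is only one possible pricing model. Every number below is hypothetical and not taken from any real provider.

```python
def monthly_inference_cost(requests_per_day, tokens_per_request,
                           price_per_million_tokens, days=30):
    """Estimate monthly spend under an assumed per-token pricing model."""
    total_tokens = requests_per_day * tokens_per_request * days
    return total_tokens / 1_000_000 * price_per_million_tokens


# Hypothetical workload: 1,000,000 requests per day, 500 tokens each,
# priced at 2.00 (in any currency) per million tokens.
cost = monthly_inference_cost(1_000_000, 500, 2.0)
print(cost)
```

Because the estimate is a simple product, halving either the tokens per request or the price per token halves the bill, which is why per-call efficiency matters at scale.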

How is AI inference used in practice?

Inference is performed billions of times daily across AI services. Common optimization techniques include quantization, distillation, caching, and specialized hardware.
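As one illustration of quantization, the sketch below maps floating-point weights to 8-bit integers using a single shared scale (symmetric quantization). It is a simplified example under that assumption, not how any particular framework implements it. The function names and weights are hypothetical.

```python
def quantize_int8(weights):
    """Map floats to integers in [-127, 127] using one shared scale."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127 if max_abs > 0 else 1.0
    # Round to the nearest integer step and clamp to the int8 range.
    quantized = [max(-127, min(127, round(w / scale))) for w in weights]
    return quantized, scale


def dequantize(quantized, scale):
    """Recover approximate float values from the integers."""
    return [v * scale for v in quantized]


weights = [0.5, -1.27, 0.0, 1.0]  # illustrative values only
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)

# The rounding error per weight is bounded by half a quantization step.
max_error = max(abs(a - b) for a, b in zip(weights, restored))
print(q, max_error)
```

The integers take less memory than the original floats, at the price of a small, bounded rounding error per weight.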

Frequently Asked Questions

Why is inference expensive?
Each inference call consumes compute resources. At scale across millions of users, these costs become substantial.

How is inference optimized?
Inference is optimized through model quantization, distillation, caching of frequent requests, batching, and inference-optimized hardware.
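Caching and batching can be sketched in a few lines. The example assumes a hypothetical `run_model` function standing in for an expensive model call. It uses Python's standard `functools.lru_cache` for the cache and a call counter to make the savings visible.

```python
from functools import lru_cache

calls = {"model": 0}  # counts how often the stand-in model actually runs


def run_model(batch):
    """Hypothetical stand-in for an expensive model call on a batch of inputs."""
    calls["model"] += 1
    return [len(text) for text in batch]


@lru_cache(maxsize=1024)
def cached_infer(text):
    # Identical repeat requests are answered from the cache, not the model.
    return run_model([text])[0]


def batched_infer(texts, batch_size=4):
    # Group requests so the model runs once per batch, not once per input.
    results = []
    for i in range(0, len(texts), batch_size):
        results.extend(run_model(texts[i:i + batch_size]))
    return results


cached_infer("hello")
cached_infer("hello")  # served from the cache
calls_after_cache = calls["model"]

outputs = batched_infer(["a", "bb", "ccc", "dddd", "eeeee"], batch_size=4)
calls_after_batch = calls["model"] - calls_after_cache
print(calls_after_cache, calls_after_batch, outputs)
```

Here two identical requests cost one model call, and five inputs cost two model calls instead of five.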