Docker Model Runner: Running Machine Learning Models with Docker
LLMs: Back to the Basics
Before moving forward, let's define some common terms related to LLMs. This section is a glossary-style overview of key concepts that will help you understand the terminology used in the context of large language models.
Language model: A language model is a type of AI system trained to understand and generate text. It predicts the next word or token based on the context it has seen so far. Inside a model, everything is represented numerically.
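To make "predicting the next token from context" concrete, here is a toy sketch: a bigram model that counts which word follows which in a tiny training text. Real LLMs are vastly more sophisticated (they use neural networks over long contexts, not word-pair counts), and the corpus and function names below are illustrative only.

```python
from collections import Counter, defaultdict

# Toy "training data": a tiny corpus of words.
corpus = "the cat sat on the mat the cat ran".split()

# "Training": count how often each word follows each other word.
following = defaultdict(Counter)
for current, nxt in zip(corpus, corpus[1:]):
    following[current][nxt] += 1

def predict_next(word):
    """Return the most frequent word seen after `word`, or None."""
    counts = following[word]
    return counts.most_common(1)[0][0] if counts else None

print(predict_next("the"))  # "cat" — it followed "the" twice, "mat" only once
```

The principle is the same in a real model: given the context so far, produce the most likely continuation, except that an LLM learns billions of weights instead of a lookup table of counts.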
Inference: Inference is the process of running the model to generate text after it has been trained. Faster inference means quicker responses.
Weights: A model is made of millions or even billions of learned numerical values called weights. Each weight is just a number, like 0.1234 or -2.5, representing a tiny piece of the knowledge the model acquired during training.
Token: A token is a piece of text that the model processes. It can be as small as a single character or as large as a whole word, depending on the model's design. Try playing around with tokenizers like OpenAI's tokenizer tool to see how different texts are broken down into tokens.
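To get an intuition for how text is split into tokens, here is a deliberately simplified tokenizer: it splits on word boundaries and punctuation. Real LLM tokenizers (such as BPE, used by OpenAI's tool mentioned above) split text into learned subword units instead, so their output will differ from this sketch.

```python
import re

def tokenize(text):
    """Naive tokenizer: words and individual punctuation marks.
    Real LLM tokenizers use learned subword vocabularies instead."""
    return re.findall(r"\w+|[^\w\s]", text)

tokens = tokenize("Docker runs models, fast!")
print(tokens)       # ['Docker', 'runs', 'models', ',', 'fast', '!']
print(len(tokens))  # 6
```

Note that the token count, not the character count, is what determines how much context a model consumes and how usage is typically billed.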
Parameters: Parameters are the internal numerical values a model learns during training. They store the model's knowledge. More parameters usually mean better capabilities, but also higher memory and compute requirements.
Model size (e.g., 135M, 360M): This indicates how many parameters the model has. For example, 135M means 135 million parameters. Larger models are generally more capable but slower and more resource-intensive.
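A rough way to see why parameter count drives memory requirements is simple arithmetic: memory is roughly parameters times bytes per parameter. The sketch below assumes FP16 (2 bytes per parameter) versus 4-bit quantization (0.5 bytes); actual runtime usage is higher once activations and the KV cache are included.

```python
def model_memory_gb(parameters, bytes_per_param):
    """Rough weight-storage estimate: parameters x bytes per parameter.
    Ignores runtime overhead such as activations and the KV cache."""
    return parameters * bytes_per_param / (1024 ** 3)

params_135m = 135_000_000  # a 135M-parameter model

print(f"FP16:  {model_memory_gb(params_135m, 2):.2f} GB")    # FP16:  0.25 GB
print(f"4-bit: {model_memory_gb(params_135m, 0.5):.2f} GB")  # 4-bit: 0.06 GB
```

This is why quantized variants of the same model are popular for local inference: a 4-bit version needs roughly a quarter of the memory of the FP16 original.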
Painless Docker - 2nd Edition
A Comprehensive Guide to Mastering Docker and its Ecosystem
