Defining Zero-Shot Generalization in LLMs
Zero-shot generalization is a key capability of large language models (LLMs): it allows them to perform tasks they were never explicitly trained on. Unlike traditional supervised learning, which requires many labeled examples for each task, an LLM can complete a new task from nothing more than a natural-language prompt describing what is needed.
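The shape such a prompt takes makes the idea concrete: a plain-language task description plus the input, with no labeled examples. A minimal sketch (the template, task, and review text below are illustrative, not tied to any particular model or API):

```python
# Sketch: a zero-shot prompt contains only a natural-language task
# description and the input -- no labeled demonstrations.
# The template and example task are illustrative.
def zero_shot_prompt(task_description: str, input_text: str) -> str:
    """Build a prompt that describes the task in natural language."""
    return f"{task_description}\n\nInput: {input_text}\nOutput:"

prompt = zero_shot_prompt(
    "Classify the sentiment of the following review as positive or negative.",
    "The battery life is fantastic and the screen is gorgeous.",
)
print(prompt)
```

The model has never seen labeled sentiment examples in this prompt; it must infer the task entirely from the description.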
This capability comes from the way LLMs are trained. They learn from vast and diverse text corpora, modeling the statistical relationships between words, phrases, and structures across many contexts. When faced with a new task — such as answering questions, summarizing text, or translating — they use their broad knowledge to infer the correct approach, even if they have never seen labeled examples for that specific task.
Zero-shot generalization represents a shift from the standard supervised learning paradigm. Rather than memorizing task-specific solutions, LLMs learn general patterns during pretraining that they can apply to new tasks at inference time, letting them handle novel instructions and domains without retraining. This ability emerges from broad, distributional learning over massive, diverse datasets: the model infers how to perform an unseen task by drawing on generalizable knowledge rather than on explicit task-specific examples.
Early machine learning models required explicit supervision: many labeled examples for each task. As models and datasets grew, researchers began to ask whether models could instead learn representations that generalize across tasks.
In zero-shot learning, the model receives no labeled examples for a new task; it must generalize from its pretraining alone. One-shot learning provides a single example, while few-shot learning gives a small set of examples. Each paradigm tests the model's ability to transfer knowledge to new domains with minimal data.
Zero-shot generalization is especially valuable for tasks where labeled data is scarce or expensive, and for enabling models to adapt quickly to new tasks or domains without retraining.