Large Language Model

/larj lang-gwij mo-del/

A neural network trained on massive text corpora to generate and understand natural language, forming the foundation of modern AI assistants.

The dominant class of AI system in 2026 — transformer-based, token-predicting, instruction-tuned. Underpins every modern AI assistant.

January 15, 2024 · updated December 20, 2024 · 2 min · ai

Also known as: LLM, Large Language Models, language model

A Large Language Model (LLM) is a type of neural network trained on vast corpora of text to predict the next token in a sequence. Through scale — both in parameters (often hundreds of billions) and training data (trillions of tokens) — these models develop emergent capabilities including reasoning, translation, code generation, and conversation.
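To make "predict the next token" concrete, here is a deliberately tiny sketch. It is not how an LLM is implemented: a real model uses a neural network over subword tokens, whereas this toy counts which word follows which in a three-sentence corpus. The function names and the corpus are illustrative assumptions.

```python
from collections import Counter, defaultdict


def train_bigram_counts(corpus):
    """Count how often each token follows each other token."""
    counts = defaultdict(Counter)
    for sentence in corpus:
        tokens = sentence.lower().split()  # whitespace split, not a real tokenizer
        for current, following in zip(tokens, tokens[1:]):
            counts[current][following] += 1
    return counts


def next_token_distribution(counts, token):
    """Return P(next | token) as a dict, or {} if the token was never seen."""
    followers = counts.get(token, Counter())
    total = sum(followers.values())
    return {tok: n / total for tok, n in followers.items()} if total else {}


def generate(counts, start, max_tokens=5):
    """Greedy decoding: repeatedly append the most probable next token."""
    output = [start]
    for _ in range(max_tokens):
        dist = next_token_distribution(counts, output[-1])
        if not dist:
            break  # no known continuation
        output.append(max(dist, key=dist.get))
    return " ".join(output)


corpus = [
    "the model predicts the next token",
    "the model reads the prompt",
    "the next token is sampled",
]
counts = train_bigram_counts(corpus)
print(next_token_distribution(counts, "the"))
print(generate(counts, "next", max_tokens=2))
```

The same loop (score the candidates, pick or sample one, append, repeat) is the shape of text generation in a full-scale model. The difference is that the distribution comes from a trained network conditioned on the whole context, not from a lookup on the previous word.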

It is a truth universally acknowledged, that a single transformer in possession of a good attention mechanism, must be in want of a context window.

— paraphrased (in the spirit of Jane Austen, about LLMs)

How they work

LLMs are built on the Transformer architecture, which uses self-attention to process sequences in parallel. Training proceeds in two broad phases:

Pre-training — the model learns statistical patterns of language by predicting the next token over a large corpus. This is self-supervised (the training signal comes from the text itself, not from human labels) and extremely compute-intensive.
Post-training — techniques like RLHF (Reinforcement Learning from Human Feedback) tune the model to follow instructions and align it with human preferences and safety requirements.
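The self-attention operation mentioned at the start of this section can be sketched in a few lines. This is a minimal illustration under two stated assumptions. First, queries, keys, and values are the input vectors themselves, whereas a real Transformer layer applies learned projections first. Second, a causal mask is applied, as is typical for next-token models. The loop handles one position at a time for readability; production implementations compute all positions at once as matrix products.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def causal_self_attention(x):
    """Scaled dot-product self-attention over a list of token vectors.

    Simplification: queries, keys, and values are the inputs themselves;
    a real Transformer layer applies learned projection matrices first.
    """
    d = len(x[0])
    outputs, weights = [], []
    for i, query in enumerate(x):
        # Causal mask: position i may only attend to positions 0..i.
        scores = [
            sum(q * k for q, k in zip(query, x[j])) / math.sqrt(d)
            for j in range(i + 1)
        ]
        attn = softmax(scores)
        weights.append(attn)
        # Output is the attention-weighted average of the visible vectors.
        outputs.append(
            [sum(a * x[j][c] for j, a in enumerate(attn)) for c in range(d)]
        )
    return outputs, weights


tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # three toy 2-d token vectors
outputs, weights = causal_self_attention(tokens)
for row in weights:
    print([round(w, 3) for w in row])
```

Each row of `weights` sums to 1 and says how much that position draws on itself and on earlier positions. The first token can only attend to itself, so its output equals its input.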

Key properties

Tokenization — text is split into discrete units (tokens) that the model processes. A token is roughly 0.75 words in English.
Context window — the maximum amount of text the model can attend to at once. Modern models range from 8K to 2M tokens. See Context Window.
Embeddings — the internal vector representations the model uses. See Embedding.
Few-shot learning — at sufficient scale, models can perform new tasks from a handful of examples provided in the prompt.
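Several of these properties interact in practice: a few-shot prompt consumes tokens, and the total has to fit in the context window. The sketch below ties them together using the 0.75 words-per-token rule of thumb from the list above. The helper names, the "Input:/Output:" prompt layout, and the window sizes are illustrative assumptions, and a real application would count tokens with the model's own tokenizer instead of estimating.

```python
import math

WORDS_PER_TOKEN = 0.75  # rule of thumb for English; real tokenizers vary


def estimate_tokens(text):
    """Estimate the token count from a whitespace word count."""
    words = len(text.split())
    return math.ceil(words / WORDS_PER_TOKEN)


def build_few_shot_prompt(examples, query):
    """Lay out worked examples followed by the new input to complete."""
    lines = []
    for source, target in examples:
        lines.append(f"Input: {source}")
        lines.append(f"Output: {target}")
    lines.append(f"Input: {query}")
    lines.append("Output:")
    return "\n".join(lines)


def fits_context(prompt, context_window, reserved_for_output=0):
    """Check the estimated prompt size plus an output budget against the window."""
    return estimate_tokens(prompt) + reserved_for_output <= context_window


examples = [("cheese", "fromage"), ("bread", "pain")]  # toy translation pairs
prompt = build_few_shot_prompt(examples, "apple")
print(prompt)
print(estimate_tokens(prompt), fits_context(prompt, 8000, reserved_for_output=256))
```

Reserving part of the window for the model's answer matters because the prompt and the generated output share the same token budget.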

Applications

LLMs power a wide range of applications:

Chat assistants — conversational interfaces like ChatGPT, Claude, Gemini.
AI Agents — systems that use LLMs as a reasoning engine to take actions in the world.
Retrieval-Augmented Generation — combining LLMs with external knowledge bases.
Code generation — tools like GitHub Copilot and Cursor.
Translation, summarization, content generation — replacing or augmenting traditional NLP pipelines.
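As one example from this list, the retrieval step of Retrieval-Augmented Generation can be sketched as follows. Everything here is a simplified assumption: `embed` is a bag-of-words stand-in for a learned embedding model, the knowledge base is three hard-coded sentences, and the final call to an LLM is omitted, since only the prompt assembly is shown.

```python
import math
from collections import Counter


def embed(text):
    """Toy embedding: word counts (a stand-in for a learned embedding model)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(question, documents, k=1):
    """Return the k documents most similar to the question."""
    q = embed(question)
    ranked = sorted(documents, key=lambda d: cosine(q, embed(d)), reverse=True)
    return ranked[:k]


def build_rag_prompt(question, documents, k=1):
    """Place the retrieved passages in the prompt ahead of the question."""
    context = "\n".join(f"- {doc}" for doc in retrieve(question, documents, k))
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\nAnswer:"
    )


documents = [
    "The context window is the maximum amount of text a model can attend to at once",
    "Tokenization splits text into discrete units called tokens",
    "Embeddings are vector representations of text",
]
print(build_rag_prompt("what is the context window", documents))
```

The design point is that the model answers from text supplied at query time, so the knowledge base can be updated without retraining the model.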

Limitations

Despite their capabilities, LLMs have well-known limitations:

Hallucination — confidently generating plausible but incorrect information.
Knowledge cutoff — no awareness of events after training.
Context limitations — forgetting or confusing information in long contexts.
Lack of grounding — no direct connection to truth or the physical world.
Computational cost — training and inference require substantial energy and hardware.
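The computational cost point can be given a rough sense of scale with back-of-envelope arithmetic. Both formulas below are approximations and not part of the original text. The first assumes weights stored at 2 bytes per parameter (16-bit floats). The second uses the commonly cited estimate of about 6 floating-point operations per parameter per training token. The one-trillion-token figure is a hypothetical round number, not a claim about any specific model.

```python
def weight_memory_gb(num_params, bytes_per_param=2):
    """Memory needed to hold the weights alone, in gigabytes (10^9 bytes).

    Assumes 2 bytes per parameter (16-bit floats) by default; activations,
    optimizer state, and caches are not counted.
    """
    return num_params * bytes_per_param / 1e9


def approx_training_flops(num_params, num_tokens):
    """Rule-of-thumb estimate: about 6 FLOPs per parameter per training token."""
    return 6 * num_params * num_tokens


params = 175e9        # a 175B-parameter model
tokens = 1e12         # hypothetical: one trillion training tokens

print(f"weights: {weight_memory_gb(params):.0f} GB")
print(f"training compute: {approx_training_flops(params, tokens):.2e} FLOPs")
```

Even before any training happens, a model of this size needs hundreds of gigabytes just to hold its weights, which is why serving typically spans multiple accelerators.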

The trajectory

The field has moved rapidly from GPT-2 (1.5B parameters, 2019) through GPT-3 (175B, 2020), ChatGPT (2022), and into a multi-model ecosystem where frontier labs release models with increasing capabilities every few months. The dominant architectural paradigm — the Transformer — has remained constant, though training techniques, post-training methods, and inference strategies have evolved dramatically.

Related articles

Transformer

Embedding

Context Window

Retrieval-Augmented Generation

AI Agent
