The Launch of GPT-1

OpenAI released GPT-1 (Generative Pre-trained Transformer 1), a 117M-parameter Transformer decoder that is pre-trained with unsupervised learning and then fine-tuned for specific tasks.

GPT-1: 117M parameters, 12-layer decoder-only Transformer, unsupervised pre-training. A foundation of the LLM era and a milestone for the pre-train + fine-tune paradigm.

Summary

OpenAI published the paper “Improving Language Understanding by Generative Pre-Training” on June 11, 2018. It introduced GPT-1 (Generative Pre-trained Transformer 1), a 117M-parameter language model based on the Transformer decoder.

Innovations

  • Unsupervised pre-training + supervised fine-tuning
  • Decoder-only Transformer (not encoder-decoder)
  • Causal language modeling: predicting the next token
  • 12 layers, 12 attention heads, 117M parameters
  • Hidden size of 768
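
The figures in the list above can be sanity-checked with a rough parameter count. The sketch below takes the 12 layers and hidden size of 768 from this page; the vocabulary size (40,478), context length (512), and feed-forward width (3,072) are assumptions not stated here, and the function name `decoder_param_count` is hypothetical. It is a back-of-the-envelope estimate, not OpenAI's own accounting.

```python
# Rough parameter count for a GPT-1-sized decoder-only Transformer.
# From this page: 12 layers, hidden size 768.
# Assumed (not stated on this page): vocab_size, n_positions, d_ff.

def decoder_param_count(n_layers, d_model, vocab_size, n_positions, d_ff):
    """Count weights and biases, with the output layer tied to the token embedding."""
    # Token embedding table plus learned position embeddings.
    embeddings = vocab_size * d_model + n_positions * d_model
    # Self-attention: query/key/value projections plus the output projection.
    # The number of heads does not change the count; heads only split d_model.
    attention = (d_model * 3 * d_model + 3 * d_model) + (d_model * d_model + d_model)
    # Position-wise feed-forward network: expand to d_ff, then project back.
    feed_forward = (d_model * d_ff + d_ff) + (d_ff * d_model + d_model)
    # Two layer norms per block, each with a gain vector and a bias vector.
    layer_norms = 2 * 2 * d_model
    per_layer = attention + feed_forward + layer_norms
    return embeddings + n_layers * per_layer


total = decoder_param_count(
    n_layers=12,        # from this page
    d_model=768,        # from this page
    vocab_size=40478,   # assumption
    n_positions=512,    # assumption
    d_ff=3072,          # assumption
)
print(f"{total / 1e6:.1f}M parameters")
```

Under these assumptions the total comes to about 116.5M, which is consistent with the 117M figure quoted above. Roughly a quarter of the weights sit in the token embedding table, so the result is sensitive to the assumed vocabulary size.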

Results

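  • State of the art on 9 of the 12 NLP tasks studied (natural language inference, question answering, semantic similarity, and text classification, including tasks from GLUE)
  • Showed that the pre-train + fine-tune paradigm works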
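
The pre-train + fine-tune recipe credited above can be written as two objectives. The block below is a sketch in the notation of the GPT-1 paper; the symbols (the corpus U, the labeled dataset C, the context window k, the parameters Θ, and the weight λ) are not defined on this page and are introduced here only for illustration.

```latex
% Stage 1: unsupervised pre-training on a token corpus U = {u_1, ..., u_n},
% with context window k and model parameters \Theta (next-token prediction).
L_1(\mathcal{U}) = \sum_i \log P(u_i \mid u_{i-k}, \ldots, u_{i-1}; \Theta)

% Stage 2: supervised fine-tuning on a labeled dataset C, where each example
% is an input token sequence x^1, ..., x^m with label y.
L_2(\mathcal{C}) = \sum_{(x, y)} \log P(y \mid x^1, \ldots, x^m)

% Fine-tuning with the language-modeling loss kept as an auxiliary term,
% weighted by \lambda.
L_3(\mathcal{C}) = L_2(\mathcal{C}) + \lambda \, L_1(\mathcal{C})
```

The first objective is the causal language modeling loss: each token is predicted only from the tokens before it. The same network is then reused in the second stage, with only a small task-specific output layer added on top.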

Legacy

GPT-1 started the era of generative pre-trained Transformers:

  • GPT-1 (2018) — 117M
  • GPT-2 (2019) — 1.5B
  • GPT-3 (2020) — 175B
  • GPT-4 (2023) — multimodal
  • GPT-5 (2025) — agentic

The generative pre-trained Transformer remains the dominant LLM paradigm as of 2026.


