Learning Rate Scheduling Strategies
Pre-training Large Language Models


A fixed learning rate is rarely optimal for LLM training. Too high at the start causes unstable updates; too high at the end prevents convergence to a good minimum. Learning rate scheduling adjusts the rate dynamically throughout training.

Linear Warmup

Start with a near-zero learning rate and increase it linearly to the target value over the first warmup_steps steps. This gives the model time to settle into a reasonable parameter space before large gradient updates begin.
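As a minimal standalone sketch (the function name `warmup_factor` is hypothetical, not from a library), the warmup multiplier applied to the target learning rate can be computed like this:

```python
def warmup_factor(step: int, warmup_steps: int) -> float:
    """Linear multiplier on the target learning rate during warmup."""
    if step >= warmup_steps:
        return 1.0  # warmup finished: use the full target rate
    return step / max(1, warmup_steps)

# With a target lr of 2e-4 and 100 warmup steps:
target_lr = 2e-4
print(warmup_factor(0, 100) * target_lr)    # 0.0
print(warmup_factor(50, 100) * target_lr)   # 0.0001
print(warmup_factor(100, 100) * target_lr)  # 0.0002
```

The `max(1, ...)` guard simply avoids division by zero if warmup is disabled.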

Cosine Decay

After warmup, decay the learning rate following a cosine curve – large updates early, fine-grained adjustments later. The rate approaches zero by the end of training. This is the most widely used schedule for LLM pre-training.
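Concretely, after warmup the multiplier follows 0.5 * (1 + cos(pi * progress)), where progress runs from 0 to 1 over the remaining steps. A small sketch (the helper `cosine_factor` is hypothetical):

```python
import math

def cosine_factor(step: int, warmup_steps: int, total_steps: int) -> float:
    """Multiplier on the target lr after warmup (assumes step >= warmup_steps)."""
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return 0.5 * (1.0 + math.cos(math.pi * progress))

# The multiplier falls smoothly from 1 toward 0 over training:
print(round(cosine_factor(100, 100, 5000), 3))   # 1.0  (right after warmup)
print(round(cosine_factor(2550, 100, 5000), 3))  # 0.5  (halfway through decay)
print(round(cosine_factor(5000, 100, 5000), 3))  # 0.0  (end of training)
```

Because the cosine is flat near both ends, the rate changes slowly just after warmup and again near the end, with the steepest decay in the middle.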

Implementation

```python
import math

import torch
import torch.nn as nn
from torch.optim import AdamW
from torch.optim.lr_scheduler import LambdaLR

model = nn.Linear(10, 10)
optimizer = AdamW(model.parameters(), lr=2e-4)

warmup_steps = 100
total_steps = 5000

def cosine_with_warmup(step):
    if step < warmup_steps:
        # Linear warmup
        return step / max(1, warmup_steps)
    # Cosine decay
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return 0.5 * (1.0 + math.cos(math.pi * progress))

scheduler = LambdaLR(optimizer, lr_lambda=cosine_with_warmup)

# Simulating a training loop
for step in range(total_steps):
    optimizer.zero_grad()
    # Forward and backward pass would go here
    loss = model(torch.randn(4, 10)).sum()
    loss.backward()
    optimizer.step()
    scheduler.step()
    if step % 1000 == 0:
        current_lr = scheduler.get_last_lr()[0]
        print(f"Step {step:05d} – lr: {current_lr:.6f}")
```
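One detail worth noting: `LambdaLR` multiplies the optimizer's base learning rate by whatever the lambda returns, so `cosine_with_warmup` yields a multiplier in [0, 1] rather than an absolute rate. The shape of the schedule can be sanity-checked without torch, using the same `warmup_steps = 100` and `total_steps = 5000`:

```python
import math

warmup_steps, total_steps = 100, 5000

def cosine_with_warmup(step):
    if step < warmup_steps:
        return step / max(1, warmup_steps)  # linear warmup
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return 0.5 * (1.0 + math.cos(math.pi * progress))  # cosine decay

base_lr = 2e-4
print(cosine_with_warmup(0) * base_lr)     # 0.0    — lr starts near zero
print(cosine_with_warmup(100) * base_lr)   # 0.0002 — peaks at the target rate
print(cosine_with_warmup(5000) * base_lr)  # 0.0    — decays back to zero
```

This also means the `lr=2e-4` passed to `AdamW` is the peak learning rate, reached exactly when warmup ends.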