Regularization and Optimization Strategies
Regularization and Optimization in Transfer Learning
When adapting pre-trained models, regularization and optimization strategies help prevent overfitting and improve generalization:
- Dropout: randomly drops units during training to prevent co-adaptation;
- Batch normalization: normalizes activations to stabilize and speed up training;
- Data augmentation: expands the training set by applying random transformations (flip, rotate, zoom, etc.);
- Differential learning rates: use lower learning rates for pre-trained layers and higher for new layers.
Data augmentation is especially important when your target dataset is small.
# Example: Adding regularization and data augmentation
from tensorflow.keras import layers, models
from tensorflow.keras.applications import VGG16
from tensorflow.keras.preprocessing.image import ImageDataGenerator

# Data augmentation pipeline
train_datagen = ImageDataGenerator(
    rotation_range=20,
    width_shift_range=0.2,
    height_shift_range=0.2,
    horizontal_flip=True,
    zoom_range=0.2
)

# Build model with dropout and batch normalization
model = models.Sequential([
    VGG16(weights='imagenet', include_top=False, input_shape=(224, 224, 3)),
    layers.Flatten(),
    layers.Dense(256, activation='relu'),
    layers.BatchNormalization(),
    layers.Dropout(0.5),
    layers.Dense(10, activation='softmax')
])

# Compile model
model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])
from tensorflow.keras.preprocessing.image import ImageDataGenerator
# Data augmentation pipeline
train_datagen = ImageDataGenerator(
    rotation_range=20,
    width_shift_range=0.2,
    height_shift_range=0.2,
    horizontal_flip=True,
    zoom_range=0.2
)
This block defines a data generator that applies random transformations to training images. Instead of manually creating more images, it produces slightly altered versions in real time during training, which helps the model become more robust.
- rotation_range=20 rotates each image by a random angle of up to 20 degrees in either direction;
- width_shift_range=0.2 and height_shift_range=0.2 shift images horizontally or vertically by up to 20% of their width or height;
- horizontal_flip=True mirrors images horizontally, which is useful when objects can appear facing either direction;
- zoom_range=0.2 randomly zooms in or out by up to 20%.
The augmented images are generated on the fly every epoch, so the model sees slightly different inputs each time, improving generalization and reducing overfitting.
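As a concrete usage sketch, a generator like this is typically pointed at a folder of images (one sub-folder per class) and then passed straight to fit() on the model built below. The directory path, batch size, and epoch count here are placeholders, not part of the original example.

# Hypothetical usage: stream augmented batches from a class-per-subfolder directory
train_generator = train_datagen.flow_from_directory(
    'data/train',               # placeholder path, one sub-folder per class
    target_size=(224, 224),     # resize to match the model's expected input
    batch_size=32,
    class_mode='categorical'    # one-hot labels for categorical_crossentropy
)

# Later, once the model below is built and compiled:
# model.fit(train_generator, epochs=10)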
# Build model with dropout and batch normalization
model = models.Sequential([
    VGG16(weights='imagenet', include_top=False, input_shape=(224, 224, 3)),
    layers.Flatten(),
    layers.Dense(256, activation='relu'),
    layers.BatchNormalization(),
    layers.Dropout(0.5),
    layers.Dense(10, activation='softmax')
])
The model combines the pre-trained VGG16 convolutional base (loaded with include_top=False, so its original classifier is removed) with a new dense head for the target task.
Flatten() turns the convolutional feature maps into a vector, BatchNormalization() stabilizes the activations of the dense layer, and Dropout(0.5) randomly disables half of its units during training to prevent overfitting.
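Note that, as written, the VGG16 base stays trainable and its pre-trained weights are updated from the very first step. A common variant freezes the convolutional base so that only the new head is trained at first; the sketch below shows this under that assumption, with base_model introduced here purely for illustration.

from tensorflow.keras import layers, models
from tensorflow.keras.applications import VGG16

# Keep a reference to the base so it can be frozen (and later unfrozen)
base_model = VGG16(weights='imagenet', include_top=False, input_shape=(224, 224, 3))
base_model.trainable = False   # freeze all pre-trained convolutional layers

model = models.Sequential([
    base_model,
    layers.Flatten(),
    layers.Dense(256, activation='relu'),
    layers.BatchNormalization(),
    layers.Dropout(0.5),
    layers.Dense(10, activation='softmax')
])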
# Compile model
model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])
The model is compiled with the Adam optimizer and categorical cross-entropy loss, which is appropriate for multi-class classification with one-hot encoded labels.
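The compile step is also where the differential learning rates from the bullet list come in. Keras has no single switch for per-layer learning rates, but a common two-phase approximation is to train the new head first and then unfreeze only the top of the base with a much smaller rate. The sketch below assumes the base_model and train_generator names from the earlier sketches; the epoch counts and learning rates are illustrative, not prescribed values.

from tensorflow.keras.optimizers import Adam

# Phase 1: train only the new head while the pre-trained base stays frozen
base_model.trainable = False
model.compile(optimizer=Adam(learning_rate=1e-3),
              loss='categorical_crossentropy',
              metrics=['accuracy'])
model.fit(train_generator, epochs=5)

# Phase 2: unfreeze the last convolutional block and fine-tune it gently
base_model.trainable = True
for layer in base_model.layers[:-4]:   # keep earlier layers frozen
    layer.trainable = False
model.compile(optimizer=Adam(learning_rate=1e-5),   # much lower rate for pre-trained weights
              loss='categorical_crossentropy',
              metrics=['accuracy'])
model.fit(train_generator, epochs=5)

Recompiling after changing trainable is required for the change to take effect, which is why each phase calls model.compile again.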