This project implements a Deep Convolutional Generative Adversarial Network (DCGAN) from scratch in TensorFlow to generate synthetic fashion images. It uses the Fashion MNIST dataset: a collection of 70,000 28×28 grayscale images across 10 fashion categories (T-shirts, trousers, pullovers, dresses, coats, sandals, shirts, sneakers, bags, and ankle boots).
The project demonstrates the complete machine learning workflow for generative modeling:
- Data preprocessing and normalization
- Neural network architecture design
- GAN training dynamics and adversarial training
- Model checkpointing and persistence
- Real-time visualization of training progress
- Synthetic image generation and evaluation
The generator is designed as a transposed convolutional network that learns to map random noise vectors from a latent space to realistic 28x28 fashion images:
Generator Architecture:
┌─────────────────────────────────────────────────────────────┐
│ Input: 100-dimensional random noise vector │
│ ↓ │
│ Dense Layer: 7*7*256 units + BatchNorm + LeakyReLU │
│ ↓ │
│ Reshape: 7×7×256 feature maps │
│ ↓ │
│ Conv2DTranspose: 128 filters (5×5), stride 1 + BatchNorm │
│ ↓ │
│ LeakyReLU activation │
│ ↓ │
│ Conv2DTranspose: 64 filters (5×5), stride 2 + BatchNorm │
│ ↓ │
│ LeakyReLU activation │
│ ↓ │
│ Conv2DTranspose: 1 filter (5×5), stride 2 + tanh activation│
│ ↓ │
│ Output: 28×28×1 generated image │
└─────────────────────────────────────────────────────────────┘
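The diagram above maps directly onto a Keras `Sequential` model. The sketch below is one plausible implementation; details such as `use_bias=False` and `padding='same'` are assumptions following common DCGAN practice, not taken from the project code:

```python
import tensorflow as tf
from tensorflow.keras import layers

def build_generator(noise_dim=100):
    """Map a latent noise vector to a 28x28x1 image, per the diagram above."""
    return tf.keras.Sequential([
        tf.keras.Input(shape=(noise_dim,)),
        layers.Dense(7 * 7 * 256, use_bias=False),
        layers.BatchNormalization(),
        layers.LeakyReLU(),
        layers.Reshape((7, 7, 256)),
        # 7x7x256 -> 7x7x128 (stride 1 preserves spatial size)
        layers.Conv2DTranspose(128, (5, 5), strides=1, padding='same', use_bias=False),
        layers.BatchNormalization(),
        layers.LeakyReLU(),
        # 7x7x128 -> 14x14x64
        layers.Conv2DTranspose(64, (5, 5), strides=2, padding='same', use_bias=False),
        layers.BatchNormalization(),
        layers.LeakyReLU(),
        # 14x14x64 -> 28x28x1; tanh keeps pixel values in [-1, 1]
        layers.Conv2DTranspose(1, (5, 5), strides=2, padding='same',
                               use_bias=False, activation='tanh'),
    ])
```

With stride-1 followed by two stride-2 transposed convolutions, the 7×7 feature maps are upsampled to exactly 28×28, matching the dataset resolution.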
Key Features:
- Uses transposed convolutions for upsampling
- Batch normalization for stable training
- LeakyReLU activations to prevent dying neurons
- Tanh activation in final layer for pixel values in range [-1, 1]
The discriminator acts as a binary classifier that distinguishes between real Fashion MNIST images and synthetic images generated by the generator:
Discriminator Architecture:
┌─────────────────────────────────────────────────────────────┐
│ Input: 28×28×1 grayscale image │
│ ↓ │
│ Conv2D: 64 filters (5×5), stride 2 + LeakyReLU │
│ ↓ │
│ Dropout: 30% rate for regularization │
│ ↓ │
│ Conv2D: 128 filters (5×5), stride 2 + LeakyReLU │
│ ↓ │
│ Dropout: 30% rate for regularization │
│ ↓ │
│ Flatten: Convert to 1D vector │
│ ↓ │
│ Dense: 1 unit (real/fake classification) │
│ ↓ │
│ Output: Binary classification logit │
└─────────────────────────────────────────────────────────────┘
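A matching sketch of the discriminator (again, layer arguments beyond those in the diagram are assumptions in the spirit of standard DCGAN implementations):

```python
import tensorflow as tf
from tensorflow.keras import layers

def build_discriminator():
    """Classify a 28x28x1 image as real or fake; returns a raw logit."""
    return tf.keras.Sequential([
        tf.keras.Input(shape=(28, 28, 1)),
        # 28x28x1 -> 14x14x64
        layers.Conv2D(64, (5, 5), strides=2, padding='same'),
        layers.LeakyReLU(),
        layers.Dropout(0.3),
        # 14x14x64 -> 7x7x128
        layers.Conv2D(128, (5, 5), strides=2, padding='same'),
        layers.LeakyReLU(),
        layers.Dropout(0.3),
        layers.Flatten(),
        layers.Dense(1),  # no sigmoid: the loss consumes logits directly
    ])
```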
Key Features:
- Convolutional layers for feature extraction
- Dropout layers to prevent overfitting
- LeakyReLU activations for gradient flow
- No final activation (uses logits for numerical stability)
The GAN training follows a minimax game between generator (G) and discriminator (D):
Training Loop:
for each training iteration:
  1. Sample a batch of real images from the Fashion MNIST dataset
  2. Sample random noise vectors from the latent space
  3. Generate fake images with the generator
  4. Train the discriminator:
     - maximize log(D(real)) + log(1 - D(G(noise)))
     - i.e., learn to distinguish real from fake images
  5. Train the generator:
     - maximize log(D(G(noise)))
     - i.e., learn to fool the discriminator
  6. Update both networks' weights with the Adam optimizer
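The per-iteration updates above can be sketched as a single TF 2.x train step using two gradient tapes. The function name and signature are illustrative; the loss terms mirror the minimax objectives listed above:

```python
import tensorflow as tf

cross_entropy = tf.keras.losses.BinaryCrossentropy(from_logits=True)
generator_optimizer = tf.keras.optimizers.Adam(1e-4)
discriminator_optimizer = tf.keras.optimizers.Adam(1e-4)
noise_dim = 100

@tf.function
def train_step(images, generator, discriminator):
    # Step 2: sample noise, one vector per real image in the batch
    noise = tf.random.normal([tf.shape(images)[0], noise_dim])
    with tf.GradientTape() as gen_tape, tf.GradientTape() as disc_tape:
        # Step 3: generate fakes and score both real and fake batches
        generated = generator(noise, training=True)
        real_output = discriminator(images, training=True)
        fake_output = discriminator(generated, training=True)
        # Steps 4-5: adversarial losses (cross-entropy form of the objectives)
        gen_loss = cross_entropy(tf.ones_like(fake_output), fake_output)
        disc_loss = (cross_entropy(tf.ones_like(real_output), real_output)
                     + cross_entropy(tf.zeros_like(fake_output), fake_output))
    # Step 6: update each network with its own optimizer
    gen_grads = gen_tape.gradient(gen_loss, generator.trainable_variables)
    disc_grads = disc_tape.gradient(disc_loss, discriminator.trainable_variables)
    generator_optimizer.apply_gradients(zip(gen_grads, generator.trainable_variables))
    discriminator_optimizer.apply_gradients(zip(disc_grads, discriminator.trainable_variables))
    return gen_loss, disc_loss
```

Two separate tapes keep each network's gradients independent, so one call updates both players of the minimax game.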
Discriminator Loss:

```python
# Binary cross-entropy computed on raw logits for numerical stability
cross_entropy = tf.keras.losses.BinaryCrossentropy(from_logits=True)

def discriminator_loss(real_output, fake_output):
    real_loss = cross_entropy(tf.ones_like(real_output), real_output)
    fake_loss = cross_entropy(tf.zeros_like(fake_output), fake_output)
    return real_loss + fake_loss
```

Generator Loss:

```python
def generator_loss(fake_output):
    # The generator wants the discriminator to label its fakes as real (1)
    return cross_entropy(tf.ones_like(fake_output), fake_output)
```

Training Hyperparameters:
- Optimizer: Adam (Adaptive Moment Estimation)
- Learning Rate: 1e-4 for both generator and discriminator
- Batch Size: 256 samples
- Epochs: 50 training cycles
- Noise Dimension: 100 (latent space size)
```python
import tensorflow as tf

batch_size = 256

# Load Fashion MNIST (labels are unused for unconditional generation)
(train_images, train_labels), _ = tf.keras.datasets.fashion_mnist.load_data()

# Add a channel dimension and normalize pixel values from [0, 255] to [-1, 1]
train_images = train_images.reshape(train_images.shape[0], 28, 28, 1).astype('float32')
train_images = (train_images - 127.5) / 127.5

# Create a TensorFlow Dataset for efficient shuffling and batching
train_ds = (tf.data.Dataset.from_tensor_slices(train_images)
            .shuffle(60000)
            .batch(batch_size))
```

Data Characteristics:
- Total Samples: 60,000 training images
- Image Dimensions: 28×28 pixels
- Color Channels: 1 (grayscale)
- Normalization: [-1, 1] range for tanh activation compatibility
- Classes: 10 fashion categories with balanced distribution
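A quick sanity check of the normalization described above, together with the inverse mapping used when displaying generated images:

```python
import numpy as np

# Forward mapping applied to the training data: [0, 255] -> [-1, 1]
pixels = np.array([0.0, 127.5, 255.0], dtype='float32')
normalized = (pixels - 127.5) / 127.5   # maps to -1, 0, 1

# Inverse mapping used for display: [-1, 1] -> [0, 255]
restored = normalized * 127.5 + 127.5   # maps back to 0, 127.5, 255
```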
The implementation includes comprehensive training monitoring:
- Progress Tracking: Real-time epoch timing and progress display
- Visualization: Automatic image generation at each epoch
- Checkpointing: Model state saved every 15 epochs for recovery
- Clear Output: Dynamic notebook cell clearing for clean visualization
```python
def generate_and_save_images(model, epoch, test_input):
    # Generate images in inference mode (BatchNorm uses its moving statistics)
    predictions = model(test_input, training=False)

    # Create a 4x4 grid of generated images
    fig = plt.figure(figsize=(4, 4))
    for i in range(predictions.shape[0]):
        plt.subplot(4, 4, i + 1)
        # Rescale pixel values from [-1, 1] back to [0, 255] for display
        plt.imshow(predictions[i, :, :, 0] * 127.5 + 127.5, cmap='gray')
        plt.axis('off')

    # Save the visualization for later comparison
    plt.savefig(f'image_at_epoch_{epoch:04d}.png')
    plt.show()
```

```python
# Configure checkpoint saving
checkpoint_dir = './training_checkpoints'
checkpoint_prefix = os.path.join(checkpoint_dir, "ckpt")
checkpoint = tf.train.Checkpoint(
    generator_optimizer=generator_optimizer,
    discriminator_optimizer=discriminator_optimizer,
    generator=generator,
    discriminator=discriminator,
)
```

```python
# Initialize models
generator = build_generator()
discriminator = build_discriminator()

# Start the training process
train(train_ds, epochs=50)
```

```python
# Generate 16 sample images from a fixed batch of noise vectors
seed = tf.random.normal([16, noise_dimension])
generated_images = generator(seed, training=False)

# Display the results (a numeric epoch keeps the {epoch:04d} filename valid)
generate_and_save_images(generator, 50, seed)
```

```python
# Restore from the latest checkpoint
checkpoint.restore(tf.train.latest_checkpoint(checkpoint_dir))
# Continue training or generate images
```

Early Epochs (1-10):
- Generator produces noisy, unrecognizable patterns
- Discriminator quickly learns to distinguish real vs fake
- Loss values fluctuate significantly
Middle Epochs (10-30):
- Generator begins to learn fashion item shapes
- Adversarial competition intensifies
- Generated images show basic structure and outlines
Late Epochs (30-50):
- Generator produces recognizable fashion items
- Training stabilizes as models reach Nash equilibrium
- High-quality synthetic images with clear features
While GANs are typically evaluated qualitatively, key observable metrics include:
- Visual quality of generated images
- Diversity across different fashion categories
- Training stability (absence of mode collapse)
- Convergence behavior of generator and discriminator losses
```python
# Experiment with different configurations
configurations = {
    'learning_rate': [1e-4, 5e-4, 1e-3],
    'batch_size': [128, 256, 512],
    'noise_dim': [50, 100, 200],
    'dropout_rate': [0.3, 0.5, 0.7],
}
```

Architecture Modifications:
- Adjust the number of convolutional filters
- Experiment with different activation functions
- Modify network depth and complexity
- Try alternative normalization techniques
- Mode Collapse: the generator produces limited variety; adjust learning rates or try a different architecture
- Vanishing Gradients: use LeakyReLU activations and proper weight initialization
- Oscillating Losses: balance how frequently the generator and discriminator are updated
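One common, low-effort stabilizer worth trying alongside the fixes above is one-sided label smoothing: soften the discriminator's targets for real images from 1.0 to ~0.9 so it stays less overconfident. This is not part of the project's baseline; the function below is a hypothetical variant of the discriminator loss:

```python
import tensorflow as tf

cross_entropy = tf.keras.losses.BinaryCrossentropy(from_logits=True)

def discriminator_loss_smoothed(real_output, fake_output, smooth=0.9):
    # Real targets become 0.9 instead of 1.0; fake targets stay at 0.0
    real_loss = cross_entropy(tf.ones_like(real_output) * smooth, real_output)
    fake_loss = cross_entropy(tf.zeros_like(fake_output), fake_output)
    return real_loss + fake_loss
```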
- Use GPU acceleration for faster training
- Implement mixed-precision training
- Optimize data pipeline with prefetching
- Monitor memory usage with large batch sizes
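The data-pipeline suggestion above amounts to adding `prefetch` to the dataset built in the preprocessing section. A minimal sketch (using random stand-in data instead of the real Fashion MNIST array):

```python
import tensorflow as tf

# Stand-in data; the real pipeline uses the normalized Fashion MNIST images
train_images = tf.random.normal([1024, 28, 28, 1])

AUTOTUNE = tf.data.AUTOTUNE
train_ds = (tf.data.Dataset.from_tensor_slices(train_images)
            .shuffle(1024)
            .batch(256)
            .prefetch(AUTOTUNE))  # overlap host-side data prep with device compute

# Optional: mixed precision on supported GPUs (keep the final tanh in float32)
# tf.keras.mixed_precision.set_global_policy('mixed_float16')
```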
This project serves as an excellent learning resource for:
- GAN fundamentals and adversarial training concepts
- TensorFlow 2.x implementation patterns
- Deep convolutional networks for image generation
- Neural network debugging and visualization techniques
- Hyperparameter tuning and experimental methodology
Potential improvements and extensions:
- Conditional GAN for category-specific generation
- Progressive growing for higher resolution images
- Wasserstein GAN with gradient penalty for improved stability
- Style-based generators for better control over image features
- Quantitative evaluation metrics (FID, Inception Score)
GAN Generative-Adversarial-Networks TensorFlow Deep-Learning Computer-Vision Image-Generation Fashion-MNIST DCGAN Machine-Learning Neural-Networks Convolutional-Networks Adversarial-Training Generative-Models Artificial-Intelligence Deep-Convolutional-GAN Image-Synthesis TensorFlow-Implementation Neural-Network-Architecture Model-Training Checkpointing
This project provides a solid foundation for understanding and implementing generative adversarial networks, with practical code that can be extended for various image generation tasks and research experiments.