
Images Generation with Generative Adversarial Networks (GANs)

Project Description

This comprehensive project implements a Deep Convolutional Generative Adversarial Network (DCGAN) from scratch using TensorFlow to generate synthetic fashion images. The implementation uses the Fashion MNIST dataset, a collection of 70,000 28×28 grayscale images across 10 fashion categories: T-shirts, trousers, pullovers, dresses, coats, sandals, shirts, sneakers, bags, and ankle boots.

The project demonstrates the complete machine learning workflow for generative modeling:

  • Data preprocessing and normalization
  • Neural network architecture design
  • GAN training dynamics and adversarial training
  • Model checkpointing and persistence
  • Real-time visualization of training progress
  • Synthetic image generation and evaluation

Technical Architecture

Generator Network

The generator is designed as a transposed convolutional network that learns to map random noise vectors from a latent space to realistic 28×28 fashion images:

Generator Architecture:
┌─────────────────────────────────────────────────────────────┐
│ Input: 100-dimensional random noise vector                  │
│ ↓                                                           │
│ Dense Layer: 7*7*256 units + BatchNorm + LeakyReLU         │
│ ↓                                                           │
│ Reshape: 7×7×256 feature maps                              │
│ ↓                                                           │
│ Conv2DTranspose: 128 filters (5×5), stride 1 + BatchNorm   │
│ ↓                                                           │
│ LeakyReLU activation                                        │
│ ↓                                                           │
│ Conv2DTranspose: 64 filters (5×5), stride 2 + BatchNorm    │
│ ↓                                                           │
│ LeakyReLU activation                                        │
│ ↓                                                           │
│ Conv2DTranspose: 1 filter (5×5), stride 2 + tanh activation│
│ ↓                                                           │
│ Output: 28×28×1 generated image                            │
└─────────────────────────────────────────────────────────────┘

Key Features:

  • Uses transposed convolutions for upsampling
  • Batch normalization for stable training
  • LeakyReLU activations to prevent dying neurons
  • Tanh activation in final layer for pixel values in range [-1, 1]

Discriminator Network

The discriminator acts as a binary classifier that distinguishes between real Fashion MNIST images and synthetic images generated by the generator:

Discriminator Architecture:
┌─────────────────────────────────────────────────────────────┐
│ Input: 28×28×1 grayscale image                             │
│ ↓                                                           │
│ Conv2D: 64 filters (5×5), stride 2 + LeakyReLU             │
│ ↓                                                           │
│ Dropout: 30% rate for regularization                       │
│ ↓                                                           │
│ Conv2D: 128 filters (5×5), stride 2 + LeakyReLU            │
│ ↓                                                           │
│ Dropout: 30% rate for regularization                       │
│ ↓                                                           │
│ Flatten: Convert to 1D vector                              │
│ ↓                                                           │
│ Dense: 1 unit (real/fake classification)                   │
│ ↓                                                           │
│ Output: Binary classification logit                        │
└─────────────────────────────────────────────────────────────┘

Key Features:

  • Convolutional layers for feature extraction
  • Dropout layers to prevent overfitting
  • LeakyReLU activations for gradient flow
  • No final activation (uses logits for numerical stability)
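
A matching Keras sketch of the discriminator (again, details such as padding are assumptions based on the layout above):

def build_discriminator():
    return tf.keras.Sequential([
        # 28x28x1 -> 14x14x64
        layers.Conv2D(64, (5, 5), strides=(2, 2), padding='same', input_shape=(28, 28, 1)),
        layers.LeakyReLU(),
        layers.Dropout(0.3),
        # 14x14x64 -> 7x7x128
        layers.Conv2D(128, (5, 5), strides=(2, 2), padding='same'),
        layers.LeakyReLU(),
        layers.Dropout(0.3),
        layers.Flatten(),
        # Single logit: positive means "real", negative means "fake"
        layers.Dense(1),
    ])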

Training Methodology

Adversarial Training Process

The GAN training follows a minimax game between generator (G) and discriminator (D):

Training Loop:
for each training iteration:
    1. Sample real images from the Fashion MNIST dataset
    2. Sample random noise vectors
    3. Generate fake images using the generator
    4. Train the discriminator:
        - Maximize log(D(real)) + log(1 - D(G(noise)))
        - Learn to distinguish real vs. fake images
    5. Train the generator:
        - Maximize log(D(G(noise)))
        - Learn to fool the discriminator
    6. Update weights using the Adam optimizer
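
In TensorFlow 2.x this iteration is typically written as a single train step with two gradient tapes. A sketch, assuming the loss functions, optimizers, and batch_size defined in the sections below:

noise_dim = 100  # latent space size

@tf.function
def train_step(images):
    # Both networks see the same batch of generated images
    noise = tf.random.normal([batch_size, noise_dim])
    with tf.GradientTape() as gen_tape, tf.GradientTape() as disc_tape:
        generated_images = generator(noise, training=True)
        real_output = discriminator(images, training=True)
        fake_output = discriminator(generated_images, training=True)
        gen_loss = generator_loss(fake_output)
        disc_loss = discriminator_loss(real_output, fake_output)
    # Each network is updated only with the gradients of its own loss
    gen_grads = gen_tape.gradient(gen_loss, generator.trainable_variables)
    disc_grads = disc_tape.gradient(disc_loss, discriminator.trainable_variables)
    generator_optimizer.apply_gradients(zip(gen_grads, generator.trainable_variables))
    discriminator_optimizer.apply_gradients(zip(disc_grads, discriminator.trainable_variables))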

Loss Functions

Discriminator Loss:

# Both losses share a single binary cross-entropy defined on raw logits
cross_entropy = tf.keras.losses.BinaryCrossentropy(from_logits=True)

def discriminator_loss(real_output, fake_output):
    # Real images should be classified as 1, generated images as 0
    real_loss = cross_entropy(tf.ones_like(real_output), real_output)
    fake_loss = cross_entropy(tf.zeros_like(fake_output), fake_output)
    return real_loss + fake_loss

Generator Loss:

def generator_loss(fake_output):
    # The generator succeeds when the discriminator labels its output as real (1)
    return cross_entropy(tf.ones_like(fake_output), fake_output)

Optimization Configuration

  • Optimizer: Adam (Adaptive Moment Estimation)
  • Learning Rate: 1e-4 for both generator and discriminator
  • Batch Size: 256 samples
  • Epochs: 50 training cycles
  • Noise Dimension: 100 (latent space size)
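
In code, this configuration amounts to two independent Adam optimizers (the variable names are assumptions carried through the rest of this README):

generator_optimizer = tf.keras.optimizers.Adam(1e-4)
discriminator_optimizer = tf.keras.optimizers.Adam(1e-4)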

Data Pipeline

Dataset Preparation

# Load and preprocess Fashion MNIST
(train_images, train_labels), _ = tf.keras.datasets.fashion_mnist.load_data()

# Normalize pixel values from [0, 255] to [-1, 1]
train_images = train_images.reshape(train_images.shape[0], 28, 28, 1).astype('float32')
train_images = (train_images - 127.5) / 127.5

# Create a TensorFlow Dataset for efficient shuffling and batching
batch_size = 256
train_ds = (tf.data.Dataset.from_tensor_slices(train_images)
            .shuffle(60000)
            .batch(batch_size))

Data Characteristics:

  • Total Samples: 60,000 training images
  • Image Dimensions: 28×28 pixels
  • Color Channels: 1 (grayscale)
  • Normalization: [-1, 1] range for tanh activation compatibility
  • Classes: 10 fashion categories with balanced distribution

Key Features & Implementation Details

Training Monitoring

The implementation includes comprehensive training monitoring:

  1. Progress Tracking: Real-time epoch timing and progress display
  2. Visualization: Automatic image generation at each epoch
  3. Checkpointing: Model state saved every 15 epochs for recovery
  4. Clear Output: Dynamic notebook cell clearing for clean visualization
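
A sketch of a training loop that ties these four pieces together (it assumes the train_step, checkpoint objects, and visualization helper shown elsewhere in this README, plus a fixed noise batch so the per-epoch visualizations are comparable):

import time
from IPython import display

seed = tf.random.normal([16, noise_dim])  # fixed noise reused every epoch

def train(dataset, epochs):
    for epoch in range(epochs):
        start = time.time()
        for image_batch in dataset:
            train_step(image_batch)
        # Clear the notebook cell and show a fresh grid of samples
        display.clear_output(wait=True)
        generate_and_save_images(generator, epoch + 1, seed)
        # Save model state every 15 epochs for recovery
        if (epoch + 1) % 15 == 0:
            checkpoint.save(file_prefix=checkpoint_prefix)
        print(f'Epoch {epoch + 1} took {time.time() - start:.2f} sec')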

Image Generation Utilities

import matplotlib.pyplot as plt

def generate_and_save_images(model, epoch, test_input):
    # Generate images in inference mode (BatchNorm uses its running statistics)
    predictions = model(test_input, training=False)

    # Create a 4x4 grid of generated images
    fig = plt.figure(figsize=(4, 4))
    for i in range(predictions.shape[0]):
        plt.subplot(4, 4, i + 1)
        # Map pixel values back from [-1, 1] to [0, 255] for display
        plt.imshow(predictions[i, :, :, 0] * 127.5 + 127.5, cmap='gray')
        plt.axis('off')

    # Save the visualization for later comparison
    plt.savefig(f'image_at_epoch_{epoch:04d}.png')
    plt.show()

Model Persistence

import os

# Configure checkpoint saving
checkpoint_dir = './training_checkpoints'
checkpoint_prefix = os.path.join(checkpoint_dir, "ckpt")
checkpoint = tf.train.Checkpoint(
    generator_optimizer=generator_optimizer,
    discriminator_optimizer=discriminator_optimizer,
    generator=generator,
    discriminator=discriminator
)

Usage Examples

Basic Training Execution

# Initialize models
generator = build_generator()
discriminator = build_discriminator()

# Start training process
train(train_ds, epochs=50)

Image Generation After Training

# Generate 16 sample images
noise_dim = 100
seed = tf.random.normal([16, noise_dim])
generated_images = generator(seed, training=False)

# Display results (pass an integer epoch so the {epoch:04d}
# filename format in the helper does not raise an error)
generate_and_save_images(generator, 50, seed)

Model Restoration

# Restore from latest checkpoint
checkpoint.restore(tf.train.latest_checkpoint(checkpoint_dir))

# Continue training or generate images

Expected Results & Performance

Training Progression

Early Epochs (1-10):

  • Generator produces noisy, unrecognizable patterns
  • Discriminator quickly learns to distinguish real vs fake
  • Loss values fluctuate significantly

Middle Epochs (10-30):

  • Generator begins to learn fashion item shapes
  • Adversarial competition intensifies
  • Generated images show basic structure and outlines

Late Epochs (30-50):

  • Generator produces recognizable fashion items
  • Training stabilizes as models reach Nash equilibrium
  • High-quality synthetic images with clear features

Performance Metrics

While GANs are typically evaluated qualitatively, key observable metrics include:

  • Visual quality of generated images
  • Diversity across different fashion categories
  • Training stability (absence of mode collapse)
  • Convergence behavior of generator and discriminator losses

Advanced Configuration Options

Hyperparameter Tuning

# Experiment with different configurations
configurations = {
    'learning_rate': [1e-4, 5e-4, 1e-3],
    'batch_size': [128, 256, 512],
    'noise_dim': [50, 100, 200],
    'dropout_rate': [0.3, 0.5, 0.7]
}
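
One hypothetical way to use this dictionary is a simple grid sweep (the training call inside the loop is a placeholder to fill in):

import itertools

# Exhaustive grid over every combination (3^4 = 81 runs here, so in
# practice a random subset is usually sampled instead)
keys, values = zip(*configurations.items())
for combo in itertools.product(*values):
    params = dict(zip(keys, combo))
    print('Training with', params)
    # rebuild the models with these settings and run a short training,
    # e.g. train(train_ds, epochs=5), then inspect the generated samples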

Architecture Modifications

  • Adjust number of convolutional filters
  • Experiment with different activation functions
  • Modify network depth and complexity
  • Try alternative normalization techniques

Troubleshooting & Common Issues

Training Instabilities

  • Mode Collapse: the generator produces only a limited variety of outputs; mitigate by adjusting learning rates or trying a different architecture
  • Vanishing Gradients: use LeakyReLU activations and proper weight initialization
  • Oscillating Losses: rebalance how often the generator and discriminator are trained

Performance Optimization

  • Use GPU acceleration for faster training
  • Implement mixed-precision training
  • Optimize data pipeline with prefetching
  • Monitor memory usage with large batch sizes
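
For instance, prefetching is a one-line change to the data pipeline, and GPU availability is easy to verify (a sketch; note that mixed-precision training with a custom loop additionally requires loss scaling via tf.keras.mixed_precision.LossScaleOptimizer):

# Overlap data preparation with training by prefetching batches
train_ds = train_ds.prefetch(tf.data.AUTOTUNE)

# Confirm that TensorFlow can see a GPU
print(tf.config.list_physical_devices('GPU'))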

Educational Value

This project serves as an excellent learning resource for:

  • GAN fundamentals and adversarial training concepts
  • TensorFlow 2.x implementation patterns
  • Deep convolutional networks for image generation
  • Neural network debugging and visualization techniques
  • Hyperparameter tuning and experimental methodology

Future Enhancements

Potential improvements and extensions:

  • Conditional GAN for category-specific generation
  • Progressive growing for higher resolution images
  • Wasserstein GAN with gradient penalty for improved stability
  • Style-based generators for better control over image features
  • Quantitative evaluation metrics (FID, Inception Score)

Project Tags

GAN Generative-Adversarial-Networks TensorFlow Deep-Learning Computer-Vision Image-Generation Fashion-MNIST DCGAN Machine-Learning Neural-Networks Convolutional-Networks Adversarial-Training Generative-Models Artificial-Intelligence Deep-Convolutional-GAN Image-Synthesis TensorFlow-Implementation Neural-Network-Architecture Model-Training Checkpointing

This project provides a solid foundation for understanding and implementing generative adversarial networks, with practical code that can be extended for various image generation tasks and research experiments.
