---
tags:
- autoencoder
- image-colorization
- pytorch
- pytorch_model_hub_mixin
license: apache-2.0
datasets:
- flwrlabs/celeba
language:
- en
metrics:
- mse
pipeline_tag: image-to-image
---

# Model Colorization Autoencoder

## Model Description

This autoencoder model performs image colorization: it takes grayscale images as input and outputs colorized versions of those images. The model uses an encoder-decoder architecture in which the encoder compresses the input image into a latent representation and the decoder reconstructs the image in color.

### Architecture

- **Encoder**: Three convolutional blocks, each a 3×3 convolution followed by 2×2 max pooling, a ReLU activation, and batch normalization, ending with a flattening layer and a fully connected layer that produces a 4000-dimensional latent vector. The flatten size of 16 × 45 × 45 implies a 360×360 grayscale input.
- **Decoder**: Mirrors the encoder: a linear layer and unflatten, then transposed convolutions with ReLU activations and batch normalization. The final transposed convolution outputs a three-channel color image through a sigmoid activation.

The architecture details are as follows:
```python
import torch.nn as nn
from huggingface_hub import PyTorchModelHubMixin


class ModelColorization(nn.Module, PyTorchModelHubMixin):
    def __init__(self):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(1, 64, kernel_size=3, stride=1, padding=1),
            nn.MaxPool2d(kernel_size=2, stride=2),
            nn.ReLU(),
            nn.BatchNorm2d(64),
            nn.Conv2d(64, 32, kernel_size=3, stride=1, padding=1),
            nn.MaxPool2d(kernel_size=2, stride=2),
            nn.ReLU(),
            nn.BatchNorm2d(32),
            nn.Conv2d(32, 16, kernel_size=3, stride=1, padding=1),
            nn.MaxPool2d(kernel_size=2, stride=2),
            nn.ReLU(),
            nn.BatchNorm2d(16),
            nn.Flatten(),
            nn.Linear(16 * 45 * 45, 4000),
        )
        self.decoder = nn.Sequential(
            nn.Linear(4000, 16 * 45 * 45),
            nn.ReLU(),
            nn.Unflatten(1, (16, 45, 45)),
            nn.ConvTranspose2d(16, 32, kernel_size=3, stride=2, padding=1, output_padding=1),
            nn.ReLU(),
            nn.BatchNorm2d(32),
            nn.ConvTranspose2d(32, 64, kernel_size=3, stride=2, padding=1, output_padding=1),
            nn.ReLU(),
            nn.BatchNorm2d(64),
            nn.ConvTranspose2d(64, 3, kernel_size=3, stride=2, padding=1, output_padding=1),
            nn.Sigmoid(),
        )

    def forward(self, x):
        x = self.encoder(x)
        x = self.decoder(x)
        return x
```
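The layer sizes above can be sanity-checked with a little arithmetic: each 3×3, stride-1, padding-1 convolution preserves spatial size, each 2×2 pooling halves it, and each stride-2 transposed convolution (with `output_padding=1`) doubles it. The sketch below traces these shapes, assuming a 360×360 input — inferred from the 16 × 45 × 45 flatten size rather than stated explicitly in this card:

```python
# Trace spatial dimensions through the encoder and decoder using the
# standard PyTorch output-size formulas for each layer type.

def conv_out(size, kernel=3, stride=1, padding=1):
    """Output size of Conv2d with the model's hyperparameters."""
    return (size + 2 * padding - kernel) // stride + 1

def pool_out(size, kernel=2, stride=2):
    """Output size of MaxPool2d with the model's hyperparameters."""
    return (size - kernel) // stride + 1

def deconv_out(size, kernel=3, stride=2, padding=1, output_padding=1):
    """Output size of ConvTranspose2d with the model's hyperparameters."""
    return (size - 1) * stride - 2 * padding + kernel + output_padding

size = 360
for _ in range(3):          # three conv + pool blocks in the encoder
    size = pool_out(conv_out(size))
print(size)                 # 45
print(16 * size * size)     # 32400, the nn.Linear input size (16*45*45)

for _ in range(3):          # three transposed-conv blocks in the decoder
    size = deconv_out(size)
print(size)                 # 360, back to the input resolution
```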

### Training Details
The model was trained with PyTorch for 5 epochs. The training and validation losses per epoch were:

| Epoch | Training Loss | Validation Loss |
|------:|--------------:|----------------:|
| 1 | 0.0063 | 0.0042 |
| 2 | 0.0036 | 0.0035 |
| 3 | 0.0032 | 0.0032 |
| 4 | 0.0030 | 0.0030 |
| 5 | 0.0029 | 0.0030 |

Training loss decreased steadily across all five epochs, while validation loss plateaued at roughly 0.0030 from epoch 4 onward.

### Usage
Since the model is published with `PyTorchModelHubMixin`, it can be loaded from the Hugging Face Hub directly through the model class (install the dependencies first with `pip install torch huggingface_hub`):

```python
# Uses the ModelColorization class defined in the Architecture section above
model = ModelColorization.from_pretrained("sebastiansarasti/AutoEncoderImageColorization")
model.eval()  # switch to inference mode
```