Autoencoders — Escape the curse of dimensionality.

Ronak Chhatbar
Apr 1, 2019

Autoencoders fall under the class of unsupervised learning: the network tries to reproduce its own input under a constraint, squeezing the input through a bottleneck so that it learns just enough significant features of the data to reconstruct it with minimal loss.

Audience

People who are new to the space of deep learning

Prerequisite

Basic understanding of convolutional neural networks

An autoencoder has three components:

First, the encoder unit

Second, the latent space

Third, the decoder unit

In the encoder part, the image progressively loses spatial dimensions while the network learns the significant parts of the underlying data.

The latent space is the bottleneck layer, where the whole image is compressed and represented in minimal dimensions. In the model summary below, conv2d_3 is the bottleneck layer.

The decoder unit tries to reconstruct the image from the representation learned in the previous layers, using upsampling.

Installing Dependencies

pip install keras

Keras is an open-source neural-network library written in Python. It is capable of running on top of TensorFlow, Microsoft Cognitive Toolkit, Theano, or PlaidML. Designed to enable fast experimentation with deep neural networks, it focuses on being user-friendly, modular, and extensible.

pip install numpy

NumPy is a library for the Python programming language, adding support for large, multi-dimensional arrays and matrices, along with a large collection of high-level mathematical functions to operate on these arrays.

pip install matplotlib

Matplotlib is a plotting library for the Python programming language and its numerical mathematics extension NumPy. It provides an object-oriented API for embedding plots into applications using general-purpose GUI toolkits like Tkinter, wxPython, Qt, or GTK+.

The actual coding starts here

Importing Libraries

from IPython.display import Image, SVG
import matplotlib.pyplot as plt
%matplotlib inline
import numpy as np
import keras
from keras.models import Model, Sequential
from keras.layers import Input, Dense, Conv2D, MaxPooling2D, UpSampling2D, Flatten, Reshape,Dropout
from keras import regularizers

Load the training and test data sets, ignoring the class labels; since we are training an autoencoder, we don't need them.

from keras.datasets import mnist
(x_train, _), (x_test, _) = mnist.load_data()

Normalize the input data to scale it between 0 and 1

max_value = float(x_train.max())
x_train = x_train.astype('float32') / max_value
x_test = x_test.astype('float32') / max_value

Dimensions of the train and test data

print(x_test.shape,x_train.shape)

Output: (10000, 28, 28) (60000, 28, 28)

Reshaping the train and test data into 4-dimensional tensors, since Keras expects a 4-dimensional tensor as input:

The first dimension is for the number of images

The Second and third is for the width and height for the image

The fourth dimension is for the number of channels

x_train = x_train[:,:,:, None]
x_test = x_test[:,:,:,None]
print(x_test.shape,x_train.shape)

Output: (10000, 28, 28, 1) (60000, 28, 28, 1)
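
As an illustrative alternative to the slicing above, the same channel axis can be added with NumPy helpers (don't run both, or you will end up with five dimensions):

# Alternative (illustrative): add the channel axis with NumPy instead of slicing
x_train = np.expand_dims(x_train, axis=-1)   # (60000, 28, 28) -> (60000, 28, 28, 1)
x_test = np.expand_dims(x_test, axis=-1)     # (10000, 28, 28) -> (10000, 28, 28, 1)
# or equivalently: x_train = x_train.reshape(-1, 28, 28, 1)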

Defining the dimensions of the input image using the Input layer from Keras

input_img = Input(shape=(28, 28, 1)) 

Architecture of Encoder

x = Conv2D(16, (3, 3), activation='relu', padding='same')(input_img)
x = MaxPooling2D((2, 2), padding='same')(x)
x = Conv2D(8, (3, 3), activation='relu', padding='same')(x)
x = MaxPooling2D((2, 2), padding='same')(x)
encoded = Conv2D(8, (3, 3), activation='relu', padding='same')(x)

The ReLU activation applies the function f(x) = max(0, x) to every element of the input tensor, without changing its spatial or depth information, and brings nonlinearity to the network.
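
As a purely illustrative aside, the same f(x) = max(0, x) operation can be reproduced with NumPy to see its element-wise effect:

# Illustrative only: ReLU applied element-wise with NumPy
x_demo = np.array([[-2.0, 0.5],
                   [ 3.0, -0.1]])
print(np.maximum(0, x_demo))
# [[0.  0.5]
#  [3.  0. ]]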

Architecture of Decoder

x = Conv2D(8, (3, 3), activation='relu', padding='same')(encoded)
x = UpSampling2D((2, 2))(x)
x = Conv2D(8, (3, 3), activation='relu', padding='same')(x)
x = UpSampling2D((2, 2))(x)
x = Conv2D(16, (3, 3), activation='relu', padding='same')(x)
decoded = Conv2D(1, (3, 3), activation='sigmoid', padding='same')(x)

For the decoder, we use UpSampling2D layers from Keras instead of pooling, as we have to reconstruct the image back to its original dimensions.
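
For intuition, UpSampling2D((2, 2)) simply repeats every row and column twice, doubling the spatial dimensions. A minimal standalone check (illustrative only, separate from the model above):

# Illustrative only: UpSampling2D doubles the spatial dimensions
from keras.layers import Input, UpSampling2D
from keras.models import Model

demo_in = Input(shape=(7, 7, 8))
demo_out = UpSampling2D((2, 2))(demo_in)
print(Model(demo_in, demo_out).output_shape)   # (None, 14, 14, 8)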

Defining the model

autoencoder = Model(input_img, decoded)
autoencoder.summary()
Output:
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
input_1 (InputLayer) (None, 28, 28, 1) 0
_________________________________________________________________
conv2d_1 (Conv2D) (None, 28, 28, 16) 160
_________________________________________________________________
max_pooling2d_1 (MaxPooling2 (None, 14, 14, 16) 0
_________________________________________________________________
conv2d_2 (Conv2D) (None, 14, 14, 8) 1160
_________________________________________________________________
max_pooling2d_2 (MaxPooling2 (None, 7, 7, 8) 0
_________________________________________________________________
conv2d_3 (Conv2D) (None, 7, 7, 8) 584
_________________________________________________________________
conv2d_4 (Conv2D) (None, 7, 7, 8) 584
_________________________________________________________________
up_sampling2d_1 (UpSampling2 (None, 14, 14, 8) 0
_________________________________________________________________
conv2d_5 (Conv2D) (None, 14, 14, 8) 584
_________________________________________________________________
up_sampling2d_2 (UpSampling2 (None, 28, 28, 8) 0
_________________________________________________________________
conv2d_6 (Conv2D) (None, 28, 28, 16) 1168
_________________________________________________________________
conv2d_7 (Conv2D) (None, 28, 28, 1) 145
=================================================================
Total params: 4,385
Trainable params: 4,385
Non-trainable params: 0
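
Since the layers are shared, you can also wrap just the encoder part in its own model to inspect the compressed representation (the codes become meaningful once the autoencoder below has been trained); a minimal sketch:

# Encoder-only model reusing the input_img and encoded tensors defined above
encoder = Model(input_img, encoded)
latent = encoder.predict(x_test)   # run after training for meaningful codes
print(latent.shape)                # (10000, 7, 7, 8): 392 values vs. 784 input pixels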

Compiling and Fitting the model

autoencoder.compile(optimizer='adam', loss='mean_squared_error')
autoencoder.fit(x_train, x_train,
                epochs=10,
                batch_size=256,
                shuffle=True,
                validation_data=(x_test, x_test))

As this is essentially a regression problem (we predict pixel values), I chose mean squared error as the loss function, and Adam as the optimizer since it is the most commonly used one.
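
To see what the network has learned, matplotlib (imported above) can be used to compare a few test digits with their reconstructions after training; a minimal sketch:

# Compare originals (top row) with reconstructions (bottom row)
decoded_imgs = autoencoder.predict(x_test)

n = 10
plt.figure(figsize=(20, 4))
for i in range(n):
    ax = plt.subplot(2, n, i + 1)              # original
    plt.imshow(x_test[i].reshape(28, 28), cmap='gray')
    ax.axis('off')
    ax = plt.subplot(2, n, i + 1 + n)          # reconstruction
    plt.imshow(decoded_imgs[i].reshape(28, 28), cmap='gray')
    ax.axis('off')
plt.show()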

In conclusion, the autoencoder is forced to form a representation at the intermediate hidden layer that has fewer variables than the input. This forces it to keep only the components that are useful for reconstructing the common features of the inputs and to reject components that are not. As a result, an autoencoder will tend to learn a representation in the hidden layer that rejects noise from the input.

About me

I am an intern at Wavelabs.ai. We at Wavelabs help you leverage Artificial Intelligence (AI) to revolutionize user experiences and reduce costs. We uniquely enhance your products using AI to reach your full market potential. We try to bring cutting-edge research into your applications. Have a look at us.

You can reach out to me on LinkedIn.
