Generative Adversarial Networks (GAN)- An AI — 'Cat and Mouse Game'

Pankaj Kishore, Dec 16, 2018

[Image: Art of Generative Adversarial Networks]

Code for all the work mentioned in this post: pankajkishore/Code (GitHub)


We had the pleasure of working on a Generative Adversarial Network project as the final project for the Business Data Science course in our curriculum.

Though we could have chosen any other subject for our final project, we took on the challenge of training a GAN to generate X-ray images, learning from a dataset of 880 X-ray images of size 28*28.

This project was accomplished by Pankaj Kishore, Jitender and Karthik.

Our initial idea was to explore GANs and, in the process, write our own TensorFlow code to train a GAN to generate X-ray images.

I will briefly walk through the journey of our project.

Data Collection

The first step was to collect enough training data for the discriminator.

We extracted our data from the website linked below:

Box (nihcc…com)

Once we had a sufficient amount of data, we explored GANs by first implementing them on existing datasets, with the aim of applying the same implementation techniques to our own dataset.

Initial Learning

1. MNIST dataset

We first explored the MNIST dataset and found enough online resources to train our first GAN model on it. MNIST is considered the easiest dataset to train on, and it was indeed easy, yet it was a great learning experience as we explored the code and understood the basic principles behind GANs.

We tried several variations on the MNIST dataset:

- Vanilla GAN
- Conditional GAN
- Wasserstein GAN
- DCGAN
- VAE

The results were of really good quality, and we only had to tweak the code a bit to achieve them.

Interestingly, we found that adding more hidden layers to the generator had no visible effect on image quality.

We also tried a Variational Autoencoder (VAE), referencing Ashish Bora's code linked below, and got good enough results.

[Images: DCGAN; VAE generated image; MNIST at different epochs through VAE]

AshishBora/csgm (GitHub): code to reproduce results from the paper "Compressed Sensing using Generative Models".


2. Fashion MNIST

We tried the same variations on the Fashion MNIST dataset and were able to generate images for the different clothing classes.

This encouraged us to take on the bigger challenge: writing our own code for X-ray images and seeing whether we could replicate the same results on our images.

We started by learning about GANs in depth and would love to share a few key concepts learnt throughout the journey.

The architecture which worked well for us is below:

Images generated by the Fashion MNIST code:

[Images: 99000 epochs; 60000 epochs; 40000 epochs; Fashion MNIST GIF at different epochs]

Generative Adversarial Networks: an Overview

Unless you have been living under a hut for the last year or so, you will know that everyone in Deep Learning, and even some people not involved in it, has heard and talked about GANs.

GANs or Generative Adversarial Networks are Deep Neural Networks that are generative models of data.

What this means is, given a set of training data, GANs can learn to estimate the underlying probability distribution of the data.

This is very useful, because apart from other things, we can now generate samples from the learnt probability distribution that may not be present in the original training set.

Generative Adversarial Networks are actually two deep networks in competition with each other.

Given a training set X (say a few thousand images of cats), the Generator network, G(z), takes a random vector z as input and tries to produce images similar to those in the training set.

A Discriminator network, D(x), is a binary classifier that tries to distinguish between the real cat images from the training set X and the fake cat images generated by the Generator.
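To make the two roles concrete, here is a minimal NumPy sketch of a forward pass through a generator and a discriminator. It is not the code from this project: the layer sizes, the tiny two-layer networks and the helper name `init_layer` are all assumptions chosen for illustration, and the weights are untrained.

```python
import numpy as np

rng = np.random.default_rng(0)

def init_layer(n_in, n_out):
    """Small random weights and zero biases for one dense layer (illustrative only)."""
    return rng.normal(0.0, 0.02, size=(n_in, n_out)), np.zeros(n_out)

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

# Assumed sizes: 64-dim noise, 128 hidden units, 28*28 flattened images.
NOISE_DIM, HIDDEN, IMG_DIM = 64, 128, 28 * 28

g_w1, g_b1 = init_layer(NOISE_DIM, HIDDEN)
g_w2, g_b2 = init_layer(HIDDEN, IMG_DIM)
d_w1, d_b1 = init_layer(IMG_DIM, HIDDEN)
d_w2, d_b2 = init_layer(HIDDEN, 1)

def generator(z):
    """G(z): random vector -> flattened image with pixel values in (0, 1)."""
    h = np.maximum(0.0, z @ g_w1 + g_b1)
    return sigmoid(h @ g_w2 + g_b2)

def discriminator(x):
    """D(x): flattened image -> probability that the image is real."""
    h = np.maximum(0.0, x @ d_w1 + d_b1)
    return sigmoid(h @ d_w2 + d_b2)

z = rng.normal(size=(5, NOISE_DIM))   # a batch of 5 random vectors
fake_images = generator(z)            # shape (5, 784)
scores = discriminator(fake_images)   # shape (5, 1), each value in (0, 1)
print(fake_images.shape, scores.shape)
```

Training would then adjust the generator weights so that `scores` moves towards 1, and the discriminator weights so that it moves towards 0 for generated images.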

As such, the job of the Generator network is to learn the distribution of the data in X, so that it can produce real looking cat images and make sure the Discriminator cannot distinguish between cat images from the training set and cat images from the Generator.

The Discriminator needs to keep up with the Generator, which tries new tricks all the time to generate fake cat images and fool the Discriminator.

Ultimately, if everything goes well, the Generator (more or less) learns the true distribution of the training data and becomes really good at generating real-looking cat images.

The Discriminator can no longer distinguish between training set cat images and generated cat images.

In this sense, the two networks are continuously trying to make sure the other does not do a good job at their task.

So then, how can this work at all?

Another way to look at the GAN setup is that the Discriminator is trying to guide the Generator by telling it what real cat images look like.

And eventually, the Generator figures it out and starts generating real-looking cat images.

The method of training GANs is similar to the Minimax algorithm from Game Theory, and the two networks try to reach what is called a Nash equilibrium with respect to each other.

What's going on between the generator and the discriminator here is a two-player zero-sum game.

In other words, in every move, the generator is trying to maximize the chance of the discriminator misclassifying the image and the discriminator is in turn trying to maximize its chances of correctly classifying the incoming image.
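The alternating moves described above can be sketched on a toy one-dimensional problem. Everything here is an assumption made for illustration: the real data is drawn from N(4, 1), the generator is just G(z) = z + b, the discriminator is a logistic function D(x) = sigmoid(w * x + c), and the gradients are written out by hand. It is a sketch of the game, not the training code used in this project.

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def train_toy_gan(steps=50, lr=0.05, batch=64, seed=0):
    """Alternate one discriminator move and one generator move per step."""
    rng = np.random.default_rng(seed)
    w, c = 0.1, 0.0   # discriminator parameters: D(x) = sigmoid(w * x + c)
    b = 0.0           # generator parameter:      G(z) = z + b
    for _ in range(steps):
        real = rng.normal(4.0, 1.0, size=batch)
        fake = rng.normal(0.0, 1.0, size=batch) + b

        # Discriminator move: gradient ascent on log D(real) + log(1 - D(fake)).
        d_real = sigmoid(w * real + c)
        d_fake = sigmoid(w * fake + c)
        grad_w = np.mean((1.0 - d_real) * real) - np.mean(d_fake * fake)
        grad_c = np.mean(1.0 - d_real) - np.mean(d_fake)
        w += lr * grad_w
        c += lr * grad_c

        # Generator move: gradient ascent on log D(fake), i.e. try to fool D.
        d_fake = sigmoid(w * fake + c)
        grad_b = np.mean((1.0 - d_fake) * w)
        b += lr * grad_b
    return w, c, b

w, c, b = train_toy_gan()
print("discriminator:", w, c, "generator shift:", b)
```

In this toy setup the generator's shift `b` starts at 0 and is pushed towards the real data, because the discriminator's feedback tells it in which direction samples look more real.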

[Images: A simple flowchart of a GAN; Discriminator; Generator]

Once we had learnt these lessons, we moved on to actually getting our hands dirty and writing our own implementation of the concepts.

Mathematics Behind The Network

Deep learning, like machine learning in general, is largely about optimizing a function, specifically minimizing the loss of the algorithm.

We use gradient descent to do that.

Let X be our true dataset and let z be noise drawn from a normal distribution p(z) over the latent space Z.

G and D are the differentiable functions represented by the generative network and the discriminative network respectively.

D(x) represents the probability that x comes from the real dataset X.

We train D to maximize log(D(x)) and train G to minimize log(1 - D(G(z))).

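In short, they play the minimax game explained above against each other, and the game has a global optimum.

GAN (loss function):

$$\min_G \max_D V(D, G) = \mathbb{E}_{x \sim X}[\log D(x)] + \mathbb{E}_{z \sim p(z)}[\log(1 - D(G(z)))]$$

This function serves as the loss function for our generative adversarial network.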
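As a small numeric illustration, the value that D maximizes and G minimizes can be estimated from a batch of discriminator outputs. This sketch assumes the standard form mean(log D(x)) + mean(log(1 - D(G(z)))); the function name `gan_value` and the sample probabilities are made up for the example.

```python
import numpy as np

def gan_value(d_real, d_fake, eps=1e-12):
    """Batch estimate of V(D, G) = mean(log D(x)) + mean(log(1 - D(G(z))))."""
    d_real = np.clip(np.asarray(d_real, dtype=float), eps, 1.0 - eps)
    d_fake = np.clip(np.asarray(d_fake, dtype=float), eps, 1.0 - eps)
    return np.mean(np.log(d_real)) + np.mean(np.log(1.0 - d_fake))

# A discriminator that separates real from fake well gets a value near 0.
confident = gan_value(d_real=[0.9, 0.95], d_fake=[0.1, 0.05])

# A discriminator that is completely fooled outputs 0.5 everywhere,
# which gives log(0.5) + log(0.5) = -2 * log(2).
fooled = gan_value(d_real=[0.5, 0.5], d_fake=[0.5, 0.5])

print(confident, fooled)
```

The discriminator tries to push this number up towards 0, while the generator tries to pull it down towards the "fooled" value.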

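Now, it is important to note that log(1 - D(G(z))) saturates early in training, when the discriminator easily rejects the generated samples, so in practice we do not minimize it; instead we train G to maximize log(D(G(z))).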
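The saturation can be seen by comparing gradients with respect to the discriminator's logit a, where D(G(z)) = sigmoid(a). The sketch below uses the closed-form derivatives of the two generator objectives; the function names are illustrative only.

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def grad_saturating(logit):
    """d/da of log(1 - sigmoid(a)): the term G minimizes in the original game."""
    return -sigmoid(logit)

def grad_non_saturating(logit):
    """d/da of -log(sigmoid(a)): the alternative loss G minimizes instead."""
    return -(1.0 - sigmoid(logit))

# A very negative logit means the discriminator confidently rejects the fake.
for a in [-8.0, -4.0, 0.0, 4.0]:
    print(a, grad_saturating(a), grad_non_saturating(a))
```

When the discriminator is confident that a sample is fake (a very negative logit), the first gradient is close to zero and the generator barely learns, while the second stays close to -1 and keeps providing a training signal.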

To show that, at the optimum, the distribution of samples produced by the generator network matches the distribution of X, we need to go deeper into the mathematics and use the Kullback-Leibler divergence and the Jensen-Shannon divergence.
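For reference, the usual outline of that argument is sketched below. The symbols p_data (the distribution of the real data) and p_g (the distribution of generated samples) are notation introduced here for the sketch and do not appear elsewhere in this post.

```latex
\begin{aligned}
D^{*}_{G}(x) &= \frac{p_{\text{data}}(x)}{p_{\text{data}}(x) + p_{g}(x)}
  && \text{(optimal discriminator for a fixed } G\text{)} \\
C(G) &= \max_{D} V(D, G)
  = \mathbb{E}_{x \sim p_{\text{data}}}\big[\log D^{*}_{G}(x)\big]
  + \mathbb{E}_{x \sim p_{g}}\big[\log\big(1 - D^{*}_{G}(x)\big)\big] \\
 &= -\log 4
  + \mathrm{KL}\!\left(p_{\text{data}} \,\Big\|\, \tfrac{p_{\text{data}} + p_{g}}{2}\right)
  + \mathrm{KL}\!\left(p_{g} \,\Big\|\, \tfrac{p_{\text{data}} + p_{g}}{2}\right) \\
 &= -\log 4 + 2\,\mathrm{JSD}\big(p_{\text{data}} \,\|\, p_{g}\big)
\end{aligned}
```

Since the Jensen-Shannon divergence is non-negative and is zero only when the two distributions are equal, C(G) reaches its minimum of -log 4 exactly when p_g = p_data.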

Source used: "Generative Adversarial Networks - A simple introduction" (medium.com)

Generating X-Ray Images

Problem Statement

High quality delineation of important features is a critical component in biomedical image interpretation for accurate diagnosis and/or assessment of a disease.

Convolutional Neural Networks (CNNs) based Deep Learning (DL) techniques have been proven highly successful in image classification and segmentation tasks by utilizing large number of training images, potentially promising higher throughput and more consistent results in biomedical image interpretation.

Computerized tools enhanced by DL are rapidly proving to be a state-of-the-art solution toward improved accuracy in biomedical image interpretation.

Furthermore, researchers have reported successful training of Generative Adversarial Network (GAN) models to generate synthetic training images as potential solutions to solve the scarcity of training sets.

The following diagram shows the basic concept of a GAN.

In the picture, the generator tries to produce synthetic images to fool the discriminator, whereas the discriminator tries to tell synthetic images from real images.

[Image: General GAN architecture]

Current Challenges

Training a CNN for Deep Learning typically requires a large amount of labeled training image data, which remains a challenge in the biomedical domain because of the expense of expert annotation.

Although researchers have reported successful training of Generative Adversarial Network (GAN) models to generate synthetic images as a potential solution to solve the training set bottleneck, training CNNs and GANs in a timely manner could require demanding programming experience and skill sets from domain experts.

Our aim is to provide end users a streamlined workflow to facilitate rapid utilization of GANs to produce synthetic radiology images for DL training.

Solution Proposed

We tried implementing our own GAN to generate the X-ray images, and we started by creating simple functions for the Generator and the Discriminator.

We initially went with three layers for both discriminator and generator.

The activation function used was leaky relu for generator and relu for discriminator.
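For readers unfamiliar with the two activations, here is a minimal NumPy sketch. The slope of 0.2 for the leaky version is an assumption borrowed from the DCGAN hyperparameters listed later in this post; the slope used in our X-ray code is not stated here.

```python
import numpy as np

def relu(x):
    """Pass positive values through and clamp negative values to zero."""
    return np.maximum(0.0, x)

def leaky_relu(x, slope=0.2):
    """Like ReLU, but negative values are scaled by `slope` instead of zeroed."""
    return np.where(x > 0, x, slope * x)

x = np.array([-2.0, -0.5, 0.0, 1.5])
print(relu(x))
print(leaky_relu(x))
```

The practical difference is that leaky ReLU still passes a small gradient for negative inputs, whereas plain ReLU passes none.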

Random noise was fed in alongside the real images, and a glorot_init function was used for initialization.

We normalize the input layer by adjusting and scaling the activations using batch normalization.
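As a rough illustration of what batch normalization does at training time, here is a NumPy sketch that normalizes each feature over the batch. The learnable scale `gamma` and shift `beta` are fixed to 1 and 0 for simplicity, and the running statistics used at inference time are left out; these simplifications are assumptions of the sketch.

```python
import numpy as np

def batch_norm(x, gamma=1.0, beta=0.0, eps=1e-5):
    """Normalize each feature (column) over the batch, then scale and shift."""
    mean = x.mean(axis=0)
    var = x.var(axis=0)
    x_hat = (x - mean) / np.sqrt(var + eps)
    return gamma * x_hat + beta

rng = np.random.default_rng(0)
# 64 examples with 10 activations each, deliberately off-centre and wide.
activations = rng.normal(loc=5.0, scale=3.0, size=(64, 10))
normalized = batch_norm(activations)

print(normalized.mean(axis=0).round(3))
print(normalized.std(axis=0).round(3))
```

After normalization every feature has a mean of approximately 0 and a standard deviation of approximately 1 across the batch.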

The loss functions for the discriminator and the generator were as follows:

    # Loss functions
    disc_loss = tf.reduce_mean(
        tf.nn.sigmoid_cross_entropy_with_logits(
            logits=r_logits, labels=tf.ones_like(r_logits))
        + tf.nn.sigmoid_cross_entropy_with_logits(
            logits=f_logits, labels=tf.zeros_like(f_logits)))
    gen_loss = tf.reduce_mean(
        tf.nn.sigmoid_cross_entropy_with_logits(
            logits=f_logits, labels=tf.ones_like(f_logits)))

The training steps used for the Discriminator and the Generator were as follows:

    gen_step = tf.….minimize(gen_loss, var_list=gen_vars)     # G train step
    disc_step = tf.….minimize(disc_loss, var_list=disc_vars)  # D train step

A simple function was also written to generate random noise (fake images), which was fed to the discriminator along with real images.
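The two loss expressions above can be checked by hand with plain NumPy. The sketch below re-implements the numerically stable formula documented for sigmoid cross-entropy with logits, max(x, 0) - x * z + log(1 + exp(-|x|)), and applies it to made-up logits; the sample values for `r_logits` and `f_logits` are assumptions for the example.

```python
import numpy as np

def sigmoid_cross_entropy_with_logits(logits, labels):
    """Stable form of -z*log(sigmoid(x)) - (1-z)*log(1-sigmoid(x))."""
    x = np.asarray(logits, dtype=float)
    z = np.asarray(labels, dtype=float)
    return np.maximum(x, 0.0) - x * z + np.log1p(np.exp(-np.abs(x)))

# Hypothetical discriminator logits for a batch of real and fake images.
r_logits = np.array([2.0, 3.0, 0.5])
f_logits = np.array([-1.0, -2.5, 0.3])

# Discriminator: real images should be labelled 1, fake images 0.
disc_loss = np.mean(
    sigmoid_cross_entropy_with_logits(r_logits, np.ones_like(r_logits))
    + sigmoid_cross_entropy_with_logits(f_logits, np.zeros_like(f_logits)))

# Generator: wants the discriminator to label its fakes as 1.
gen_loss = np.mean(
    sigmoid_cross_entropy_with_logits(f_logits, np.ones_like(f_logits)))

print(disc_loss, gen_loss)
```

Note that labelling the fakes as 1 in `gen_loss` is the "maximize log(D(G(z)))" form of the generator objective rather than the "minimize log(1 - D(G(z)))" form.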

We trained for 1000 epochs, using 851 samples the first time and 5000 samples (the downsampled dataset) the next time.

The output images we got were really noisy, and we gradually realized why we were not getting the correct output.

Screenshots from our code:

[Image: Input images to Discriminator]

[Image: Output image generated]

Several factors potentially led to the generator learning virtually nothing. To state a few:

- The low resolution of the images fed to the discriminator
- Downsampling the actual 1240*1240 images to 28*28
- Not enough data for the neural net to learn from
- The vanishing gradient problem: the discriminator gets so successful that the generator's gradient vanishes and it learns nothing
- Mode collapse: the generator collapses and produces only a limited variety of samples

The more we tried, the better the output got only at producing more whitespace in the samples above.

Even after numerous tries we were not able to generate good quality images from GAN because of limitations with the dataset we had.
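To give a feel for how much detail the downsampling step listed above throws away, here is a NumPy sketch that reduces a 1240*1240 array to 28*28 by block averaging. The method is an assumption for illustration (our actual resizing routine is not shown in this post), and the random array stands in for a real X-ray.

```python
import numpy as np

def block_average_downsample(image, out_size=28):
    """Center-crop to a multiple of out_size, then average each square block."""
    h, w = image.shape
    factor = min(h, w) // out_size           # source pixels per output pixel side
    side = factor * out_size
    top, left = (h - side) // 2, (w - side) // 2
    cropped = image[top:top + side, left:left + side]
    blocks = cropped.reshape(out_size, factor, out_size, factor)
    return blocks.mean(axis=(1, 3)), factor

rng = np.random.default_rng(0)
xray = rng.random((1240, 1240))              # stand-in for one source image
small, factor = block_average_downsample(xray)
print(small.shape, factor, factor * factor)  # (28, 28) 44 1936
```

Each of the 784 output pixels is an average over a 44*44 block, that is 1936 source pixels, so fine structures in the original image are largely lost.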

Pokemon Generation

After our failed attempts to create X-ray images from the NIH data, we thought of trying our code on Pokemon images to check whether the problem was the dataset.

The results were decent but not great.

Below are the Pokemon generated from our X-ray code:

[Image: Pokemon from X-ray code]

Better than the X-ray images for sure.

We then instead tried to run the code created by a popular machine learning Youtuber Siraj Raval (which he borrowed from moxiegushi).

llSourcell/Pokemon_GAN (github.com): the code for "Generating Pokemon with a Generative Adversarial Network" by Siraj Raval on YouTube.

The code had to be modified a bit to suit the more advanced dataset of 851 Pokemon collected from the Kaggle website.

The code was modified to process higher resolution Pokemon as well to train the discriminator.

The results were stupendous, even though it took a lot of time to run.

The architecture for the code was as below:

[Image: Architecture for Pokemon]

About Discriminator

In the DCGAN architecture, the discriminator D is a Convolutional Neural Network (CNN) that applies many filters to extract various features from an image.

The discriminator network will be trained to discriminate between the original and generated image.
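A single filter of such a network can be sketched in a few lines of NumPy. This is a plain strided convolution without padding (strictly speaking a cross-correlation, as in most deep learning libraries); the 6*6 input and the hand-made 2*2 filter are hypothetical values for the example, whereas a real discriminator learns its filters.

```python
import numpy as np

def conv2d(image, kernel, stride=1):
    """Valid (no padding) 2-D cross-correlation, as used in CNN layers."""
    kh, kw = kernel.shape
    out_h = (image.shape[0] - kh) // stride + 1
    out_w = (image.shape[1] - kw) // stride + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            patch = image[i * stride:i * stride + kh,
                          j * stride:j * stride + kw]
            out[i, j] = np.sum(patch * kernel)
    return out

image = np.arange(36, dtype=float).reshape(6, 6)      # values rise left to right
edge_filter = np.array([[1.0, -1.0], [1.0, -1.0]])    # hypothetical hand-made filter
features = conv2d(image, edge_filter, stride=2)
print(features.shape)
print(features)
```

With a stride of 2 the 6*6 input shrinks to a 3*3 feature map, which is how the discriminator reduces an image step by step to a single real-or-fake score.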

The process of convolution is shown in the illustration below:

About Generator

The generator G is trained to generate images from a random input that fool the discriminator.

In DCGAN architecture, the generator is represented by convolution networks that upsample the input.

The goal is to process the small input and make an output that is bigger than the input.

It works by expanding the input with zeros inserted between its values and then running the convolution over this expanded area.

The convolution over this area results in a larger output, which becomes the input for the next layer.

The process of upsampling is shown below:

Depending on the source, you can find various names for the upsampling process.

It is sometimes referred to as full convnets, in-network upsampling, fractionally-strided convolution, deconvolution, and so on.
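The zero-insertion view of upsampling described above can be sketched in NumPy as follows. The 2*2 input, the all-ones 3*3 kernel and the stride of 2 are assumptions chosen so the result is easy to check by hand; a real generator learns its kernels.

```python
import numpy as np

def insert_zeros(x, stride):
    """Place stride - 1 zeros between neighbouring input values."""
    h, w = x.shape
    out = np.zeros(((h - 1) * stride + 1, (w - 1) * stride + 1))
    out[::stride, ::stride] = x
    return out

def transposed_conv2d(x, kernel, stride=2):
    """Fractionally-strided convolution: zero-insert, pad, then slide the kernel."""
    kh, kw = kernel.shape
    expanded = np.pad(insert_zeros(x, stride),
                      ((kh - 1, kh - 1), (kw - 1, kw - 1)))
    flipped = kernel[::-1, ::-1]
    out_h = expanded.shape[0] - kh + 1
    out_w = expanded.shape[1] - kw + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            out[i, j] = np.sum(expanded[i:i + kh, j:j + kw] * flipped)
    return out

x = np.array([[1.0, 2.0], [3.0, 4.0]])
kernel = np.ones((3, 3))
upsampled = transposed_conv2d(x, kernel, stride=2)
print(upsampled.shape)
print(upsampled)
```

The 2*2 input becomes a 5*5 output, following the size rule (input - 1) * stride + kernel. Stacking several such layers is how a DCGAN generator grows a small random input into a full-size image.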

[Image: Overview of the network architecture for Generator]

Hyperparameters of DCGAN

One thing that everyone notices is that GANs are highly computationally expensive.

What people generally overlook is how fragile GANs are with respect to hyperparameters.

GANs work exceptionally well with certain parameters but not with others.

Currently, tuning these knobs is part of the art of designing the network architecture.

The hyperparameters that we decided to go with are as follows:

- Mini-batch size of 64
- Weights initialized from a normal distribution with std = 0.02
- LRelu slope = 0.2
- Adam optimizer with learning rate = 0.0002 and momentum = 0.5

Results

Given the limited time and the cost involved, we didn't train our model completely, but the Pokemon generated after 1300 epochs are shown below:

[Image: Pokemon after 1300 epochs]

[Image: Pokemon generation for incremental epochs]

Key Learnings from the project

It was a great experience working on this project. Even though it was really challenging, it opened our horizons to all the new possibilities of magic that neural networks can deliver.

We learnt a great deal through our mistakes and realized the modifications we need to make for a GAN to work on our X-ray image dataset.

The key takeaways were:

- It's really hard to train a GAN.
- We need to do batch normalization, along with other measures, to get better results.
- We need to keep an eye on the gradients and loss values to ensure that neither the generator loss nor the discriminator loss reaches zero.
- ReLU activation functions are a real friend.
- Convolutional networks need to be implemented; they really work for most datasets, if not all.
- We need GPUs to train a GAN; it's impossible to train on a CPU unless you can wait for years!!

Future Prospects

There are a lot of things we can try. First, we could generate images of size 64*64 instead of the 256*256 we tried generating this time, and see if that improves the image quality.

We will definitely keep working on the X-ray images and try to implement a convolutional neural network for them to see if we can generate better images.

Collecting a training set of maybe around 100,000 images might help train the CNN models better and might drastically improve the quality of the generated images.

It might get trickier and harder, but if the CNN works better, then we would love to implement the solution for higher resolution images or maybe color images and see how the GAN performs.

We would also like to try Variational Autoencoders on the X-ray images and see if that helps.

We would also like to try techniques we found online, such as:

- Minibatch discrimination: let the discriminator classify multiple images in a 'minibatch' instead of just one
- Virtual batch normalization: each example is normalized based on a reference batch of samples

Conclusion

This project has really provided the impetus for numerous implementations we can try with neural networks.

It has really garnered the attention of all and sundry and it’s hard to not be appreciative of endless possibilities GAN can lead to.

We really learnt a lot in terms of using Tensorflow or cloud while working on this project.

We realized how hard it is to actually train a GAN and even though things may work on one dataset it doesn’t necessarily mean it would work the same way on another given dataset.

The concept of GANs is not that hard to understand (i.e., Understanding Generative Adversarial Networks).

But implementing them to produce quality images can be tricky.

Considering the limitless opportunities with GANs, here are a few areas where they could potentially be used in future:

Supervised learning:

- predicting as accurately as possible subject to {some constraints}, e.g. fairness
- data augmentation in imbalanced classification
- outlier detection
- vector-2-vector predictions, e.g. multi-label prediction with arbitrary loss that does not assume labels to be independent conditional on features

Semi-supervised:

- in real-life datasets, it happens quite often that only a small subset of the data is labelled.

Unsupervised:

- matrix completion
- embeddings

Thanks everyone for reading!! Hope you enjoyed our naive attempt at GANs. We really hope to continue working and to post updates with some more awesome results for other datasets!!

Other links used for reference for the post:

- https://philparadis.…com/2017/04/24/training-gans-better-understanding-and-other-improved-techniques/
- NIPS 2016 GAN Tutorial
- Conditional GAN
