Who’s That Pokémon?
Using Python to answer that age-old question.
Yish Lim, Jan 11
So I’ve been a lifelong Pokémon fan and a recently-converted Computer Science nerd.
While my main interest lies in Data Science (plug for my last post on the Ethics of Data Science), I find myself thinking up random coding projects all the time, most of which never come to fruition.
I was working on a classification project with my friend Augustine Chang and while looking at a scatterplot of our data, we saw a familiar silhouette.
As a joke, we overlaid images of Pokémon onto our graphs in our presentation slides to look like it was some fun matplotlib function.
That’s how this mini-project was born.
Groudon and Kyogre.
If only, matplotlib.
Step 1: Interpreting (Pokémon) Images
Since my final goal was to fit Pokémon to blobs, I first needed a way to compare images.
I found a dataset on Kaggle with images of all Pokémon conveniently in 256 x 256 transparent PNGs, and this would be the set of images I’d compare my blobs to.
While writing another post on Computational Photography, I found that images can be interpreted as arrays within arrays (tensors, if you will): each pixel is represented by an array holding the RGB values of that pixel.
Using the imread function in matplotlib.image, I was able to easily get this array from an image file.
On top of that, matplotlib has the capability to output arrays of the right dimensionality as an image on a 2D graph using imshow.
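A minimal sketch of this step, not the original code: it writes one tiny RGBA PNG to a temporary folder as a stand-in for the Kaggle images (the 4 x 4 size and the file names are assumptions made so the example runs anywhere), reads it back with imread, and draws it with imshow.

```python
import os
import tempfile

import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.image as mpimg
import matplotlib.pyplot as plt
import numpy as np

# Stand-in for the dataset: one small RGBA PNG written to a temp folder.
# In the real project, `paths` would point at the 256 x 256 Pokémon PNGs.
tmp_dir = tempfile.mkdtemp()
demo = np.zeros((4, 4, 4), dtype=np.float32)  # fully transparent canvas
demo[1:3, 1:3] = [1.0, 0.0, 0.0, 1.0]         # opaque red square in the middle
paths = [os.path.join(tmp_dir, "demo.png")]
mpimg.imsave(paths[0], demo)

# imread turns each PNG into a (height, width, 4) float array of RGBA values.
images = [mpimg.imread(p) for p in paths]
print(images[0].shape)  # (4, 4, 4)
print(images[0][1, 1])  # RGBA values of one opaque red pixel

# imshow draws such an array back onto a 2D plot.
fig, ax = plt.subplots()
ax.imshow(images[0])
ax.axis("off")
fig.savefig(os.path.join(tmp_dir, "preview.png"))
plt.close(fig)
```

For PNG files, imread returns floats between 0 and 1, with the fourth value of each pixel being its transparency (alpha).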
‘paths’ is the list of file paths where the images are saved.
Step 2: Comparing Images Within Dataset
To compare images, I decided that the simplest way was to compare the RGB values at every corresponding pixel of two images.
More specifically, I took the Euclidean distance between every pair of matching pixels across the two images, and used that to find the average pixel distance across the two images.
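As a rough illustration of that metric (a sketch under my own naming, not the original function; average_pixel_distance is a hypothetical name), the per-pixel Euclidean distance can be taken over the last axis of the arrays and then averaged:

```python
import numpy as np

def average_pixel_distance(img_a, img_b):
    """Mean Euclidean distance between matching pixels of two same-shaped images."""
    a = np.asarray(img_a, dtype=float)
    b = np.asarray(img_b, dtype=float)
    if a.shape != b.shape:
        raise ValueError(f"shape mismatch: {a.shape} vs {b.shape}")
    # The norm over the last axis gives one distance per pixel; then average them.
    per_pixel = np.linalg.norm(a - b, axis=-1)
    return per_pixel.mean()

# Two 1 x 2 RGB images: the first pixel is identical, the second differs by (3, 4, 0).
img_a = np.array([[[0, 0, 0], [0, 0, 0]]])
img_b = np.array([[[0, 0, 0], [3, 4, 0]]])
print(average_pixel_distance(img_a, img_b))  # (0 + 5) / 2 = 2.5
```

A distance of 0 means the two images are identical pixel for pixel; the larger the value, the less alike they are.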
To start, so as not to have to deal with processing other images, I wanted to see what the most similar Pokémon would be, given a single Pokémon.
In my code, I wanted to be able to input the Pokémon ID number and return its best match.
And so I set it up:
A function that would return the original and best-matching image, side by side.
A function that would return the average pixel distance between two images.
A function that would return a list of distances between the input image and images of every Pokémon.
A function, nesting the three aforementioned functions, that would output the intended result.
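The pieces above could be wired together roughly as follows. This is a sketch, not the original code: the function names are hypothetical, the images are assumed to be already loaded into a dict keyed by Pokémon ID, and the side-by-side display function is left out to keep it short.

```python
import numpy as np

def distances_to_all(target, images):
    """Return {pokemon_id: average pixel distance to target} for every image."""
    return {
        pid: float(np.linalg.norm(target - img, axis=-1).mean())
        for pid, img in images.items()
    }

def best_match(pokemon_id, images):
    """Return the ID of the image closest to images[pokemon_id], excluding itself."""
    dists = distances_to_all(images[pokemon_id], images)
    dists.pop(pokemon_id)  # an image always matches itself with distance 0
    return min(dists, key=dists.get)

# Toy stand-ins for the dataset: three 2 x 2 RGBA "images" of flat color.
images = {
    4: np.full((2, 2, 4), 0.8),
    5: np.full((2, 2, 4), 0.7),
    100: np.full((2, 2, 4), 0.1),
}
print(best_match(4, images))  # 5
```

Excluding the query image matters here, since otherwise every Pokémon would be its own best match.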
Here are some of the results at this point:
Charmander (ID: 4) is most similar to Charmeleon!
Voltorb (ID: 100) is most similar to Electrode!
Pretty good! Because the metric I’m using is the average pixel distance, and because the images have transparent backgrounds, both the color and the shape of the Pokémon are taken into consideration.
Step 3: Using Any Input Image
Here, all I had to do was make a few tweaks to my existing functions so they could take in any image.
A limitation was that I could only use images that have the same dimensions as the images in the Pokémon dataset.
So, I first tested my updated functions on other 256 x 256 images.
Cool! But what if the image is not 256 x 256? Python’s Pillow package handles this quite easily, with help from the aptly named resizeimage module (from the python-resize-image package, which builds on Pillow), and I wrote a function to take in the file name of the image and resize it to the dimensions I need.
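A resizing helper along these lines might look like the sketch below. Note the swap: it uses Pillow's own Image.resize rather than resizeimage, and the function name load_resized and the demo file are assumptions for illustration.

```python
import os
import tempfile

from PIL import Image

def load_resized(filename, size=(256, 256)):
    """Open an image file and return a copy resized to `size` (width, height)."""
    with Image.open(filename) as img:
        # resize() returns a new image, so it stays usable after the file closes.
        return img.resize(size)

# Demo input: a 300 x 120 opaque red PNG written to a temp folder.
tmp_path = os.path.join(tempfile.mkdtemp(), "blob.png")
Image.new("RGBA", (300, 120), (255, 0, 0, 255)).save(tmp_path)

resized = load_resized(tmp_path)
print(resized.size)  # (256, 256)
```

Image.resize stretches the picture to the target size without preserving the aspect ratio, which is acceptable for a pixel-by-pixel comparison but will distort shapes that are far from square.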
Step 4: Putting It All Together
Here’s all the code used for my function, bestfitpokemon()!
The Entire Code!
A caveat: I found that JPEG files did not work, because each pixel in the Pokémon PNG dataset has an additional ‘alpha’ value that JPEGs do not have.
The function would break when finding the Euclidean distance between an array with three values per pixel (RGB) and one with four (RGBA).
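One possible workaround, sketched here as an assumption rather than something the original code does, is to pad a three-channel array with a fully opaque alpha channel before comparing. The helper name ensure_rgba and its opaque parameter are hypothetical.

```python
import numpy as np

def ensure_rgba(pixels, opaque=1.0):
    """Return a (height, width, 4) array, adding an opaque alpha channel if missing.

    `opaque` is the alpha value meaning "not transparent": 1.0 for data scaled
    0 to 1 (the default), or 255 for data scaled 0 to 255.
    """
    pixels = np.asarray(pixels, dtype=float)
    if pixels.shape[-1] == 4:
        return pixels  # already RGBA, nothing to do
    if pixels.shape[-1] != 3:
        raise ValueError(f"expected 3 or 4 channels, got {pixels.shape[-1]}")
    alpha = np.full(pixels.shape[:-1] + (1,), opaque)
    return np.concatenate([pixels, alpha], axis=-1)

rgb = np.zeros((256, 256, 3))     # shaped like a JPEG: no alpha channel
rgba = ensure_rgba(rgb)
print(rgb.shape, "->", rgba.shape)  # (256, 256, 3) -> (256, 256, 4)
```

The two kinds of array also need to be on the same scale before they are compared, so a 0 to 255 image would have to be rescaled to 0 to 1 (or the other way around) as well.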
I took our original scatterplots and, with some shoddy Photoshop work, extracted just the red and blue blobs into transparent PNG images to see what the best-fit Pokémon would be.
Best-Fit Pokémon
To Conclude…
There are some obvious next steps for me here.
I could crop and rescale, or find some other way to deal with white space so that colors would be weighted higher.
I could also potentially turn this into a categorical model by feeding in multiple images of each Pokémon.
So my blobs didn’t turn out to be Groudon or Kyogre.
Inspecting these results, it seems that the shapes of the Pokémon returned are pretty similar to the blob shapes! So by my model, Psyduck and Marowak are the best-fit Pokémon for my blobs.