London Design Festival (Part 3): Computer Vision

Part 3: Analysing 3K images from Twitter using computer vision

Vishal Kumar · Jan 16

Introduction

In this final blog post of the series, I apply computer vision techniques to understand 3,300 images about the London Design Festival 2018, a design festival that ran from 15 to 23 September 2018.
London Design Festival 2018 (LDF18) had a very active events programme spanning 11 different ‘Design Districts’, five ‘Design Destinations’, and three ‘Design Routes’ across London.
It’s another fantastic example of London’s flexibility as a built environment to act as a canvas to display creative ideas.
In parts 1 and 2 of this series, I presented an exploratory data analysis and natural language processing of 11,000 tweets about the festival.
However, only 3,300 of those tweets contained media data (images), so the aim of this article is to use computer vision to understand and contextualise the images I streamed from Twitter.
Please scroll down to view the analysis!

An image of LDF18 at the V&A Museum. Source: Flickr

Data and Methods

Using the Twitter API, I collected tweets about LDF18 that contained the hashtag #LDF18.
In total, there were 11,000 tweets but only 3,300 tweets had media data (images).
Read part 2 for more.
Then, labels for each image were extracted using Google Cloud’s Vision API.
The Cloud Vision API leverages “Google’s vast network of machine learning expertise” (great article by Sara Robinson) to detect features and labels in images.
In total, 1,045 different labels were given to the 3,300 images.
Feature extraction and reverse image search were then performed, using Gene Kogan’s code, to find images based on visual similarity.
First, a pre-trained convolutional neural network was used to extract “features” for each image, then, the cosine similarity of those features was computed to “search” for a handful of images similar to a query image.
The main role of features in computer vision is to “transform visual information into vector space”.
Similar images should produce similar features, which we can exploit to do information retrieval.
Based on these features, we can also cluster images by similarity using a method called t-SNE.
Analysis of images

In this section, I present the findings of my computer vision analysis. Below, I report on the following three metrics:

1. Label detection for images;
2. Image search based on visual similarity;
3. Image clustering based on visual similarity.
Label detection

Labels for each photo were generated using the Google Cloud Vision API.
The idea behind this was to categorise the images so that I could identify similar ones.
The bar graph below shows the top 10 labels for the 3,300 images.
We see that “product”, “font”, “furniture”, “table” and “design” appeared the most.
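Tallying the top labels is straightforward once the Vision API responses are collected. Below is a minimal sketch, assuming the labels have already been gathered into one list per image (the example lists here are hypothetical stand-ins, not the real API output):

```python
from collections import Counter

# Hypothetical per-image label lists, as one might store Vision API results
image_labels = [
    ["product", "font", "design"],
    ["furniture", "table", "product"],
    ["font", "product", "design"],
]

# Tally every label across all images, then keep the most frequent ten
counts = Counter(label for labels in image_labels for label in labels)
top_10 = counts.most_common(10)
print(top_10)
```

The same `Counter` is what I would feed into a bar chart of the top-10 labels.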
These labels make sense because it was a design festival! This is good news: it demonstrates that the Cloud Vision API has done a good job of tagging images from a design festival.
However, these tags don’t explicitly describe the artworks themselves — I am interested in a slightly more detailed contextual understanding — which highlights a drawback of some label detection techniques.
Image Search — visual similarity

Instead of using labels to understand the images, we can program the computer to learn visual similarities between the images.
A technique called feature extraction and reverse image search does exactly this.
Using a Keras VGG16 neural network model running on a TensorFlow backend, I first extracted a feature for each image in the dataset.
A feature is a 4096-element array of numbers for each image.
Our expectations are that “the feature forms a very good representation of the image such that similar images will have a similar feature” (Gene Kogan, 2018).
The feature’s dimensions were then reduced using principal component analysis (PCA) to create an embedding, and then, the distance — cosine distance — of one image’s PCA embedding to another was computed.
I was finally able to send the computer a random query image, and it selected and returned five other images in the dataset that had similar feature vectors.
Three examples are below:

A reverse image search for Dazzle by Pentagram and 14–18 NOW at LDF18

This technique can be really helpful when trying to find similar images in an album of many images, which is in fact what I was doing!

Image Clustering — similarity

Now that we have an embedding for each image in a vector space, we can use a popular machine learning visualisation algorithm called t-SNE to cluster and then visualise that vector space in two dimensions.
“The aim of t-SNE is to cluster small ‘neighbourhoods’ of similar data points while also reducing the overall dimensionality of the data so it is more easily visualized” (Google AI Blog, 2018).

Below, we see clusters forming based on visual similarity.
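Using scikit-learn, the projection is a one-liner. This is a minimal sketch in which random vectors stand in for the image embeddings (the sizes and perplexity are illustrative assumptions); the resulting 2-D coordinates are what gets plotted:

```python
import numpy as np
from sklearn.manifold import TSNE

rng = np.random.default_rng(0)

# Stand-in for the PCA embeddings of the images (50 images, 100 dims)
embeddings = rng.normal(size=(50, 100))

# Project to 2-D; perplexity must be smaller than the number of samples
coords = TSNE(n_components=2, perplexity=10, random_state=0).fit_transform(embeddings)
print(coords.shape)
```

Each row of `coords` is then an (x, y) position for one image in the scatter plot, so visually similar images land near each other.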
In the image below, I highlight three artworks — Multiply by Waugh Thistleton Architects, Please Feed the Lions by Es Devlin, and Dazzle by Pentagram — and their cluster profiles.
The clustering of images of three art installations at LDF18. Source: Twitter

Conclusion

So there you have it! I’ve only really dipped my toe into the wonderful world of computer vision.
There is still so much that I need to learn, but this was a great first step for me.
My findings showed that it is possible to use machine learning and computer vision techniques to understand and contextualise images of LDF18.
One obvious next step would be for me to count how many art installations appeared within the data set to measure “popularity”.
I’m going to continue playing around with this dataset.
The end

This is the end of my blog series on LDF18! This series has been part of a longer discussion I am having about using data science to understand and measure the impact of culture in cities. Stay tuned!

Thanks for reading!

Vishal

Vishal is a Research Student at The Bartlett, UCL, in London.
He is interested in the economic and social impact of culture in cities.