This is where we use the word-to-index map.
Suppose you want the embedding vector for the word “although”. According to the word-to-index map, this word is represented by the number 2511.
Next, create a one-hot-encoded vector of size 18339 (the number of words in the dataset), in which every entry is 0 except the 2511th, which is 1.
Multiplying the embedding matrix by this one-hot-encoded vector returns the 2511th column of the matrix, which is the embedding vector for the word “although”.
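The lookup described above can be checked with a few lines of NumPy. This is only a minimal sketch, not the code from the repo: the embedding size of 8 and the random matrix are placeholder assumptions, and the matrix is stored with one column per word so that the product picks out a column.

```python
import numpy as np

vocab_size = 18339   # number of words in the dataset (from the text)
embedding_dim = 8    # assumed for illustration only
word_index = 2511    # integer for "although" in the word-to-index map

# Placeholder embedding matrix with one column per word.
rng = np.random.default_rng(0)
embedding_matrix = rng.normal(size=(embedding_dim, vocab_size))

# One-hot vector: all zeros except a 1 at the word's index.
one_hot = np.zeros(vocab_size)
one_hot[word_index] = 1.0

# The matrix-vector product selects exactly one column.
embedding_vector = embedding_matrix @ one_hot

# Slicing the column directly gives the same vector.
same_as_slice = np.allclose(embedding_vector, embedding_matrix[:, word_index])
```

Since the product only ever selects one column, slicing that column directly gives the same vector without building the 18339-entry one-hot vector.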
In this way, we can feed whole paragraphs of text, such as Netflix reviews, into an LSTM.
For each word, we look up its integer value in the word-to-index map, create the corresponding one-hot-encoded vector, and multiply it with the embedding matrix.
The review is then fed word by word (vector by vector) into the LSTM network.
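For a whole review, the same steps turn a string into a sequence of vectors. The sketch below uses a toy five-word map and a random matrix, both of which are assumptions for illustration; `embed_review` is a hypothetical helper name, not a function from the repo.

```python
import numpy as np

# Toy word-to-index map; the real one covers every word in the dataset.
word_to_index = {"the": 0, "film": 1, "is": 2, "a": 3, "hoot": 4}
embedding_dim = 4  # assumed for illustration only

rng = np.random.default_rng(0)
embedding_matrix = rng.normal(size=(embedding_dim, len(word_to_index)))

def embed_review(review):
    """Return one embedding vector per word, in reading order."""
    indices = [word_to_index[word] for word in review.lower().split()]
    # Selecting columns is equivalent to multiplying by one-hot vectors.
    return embedding_matrix[:, indices].T  # shape: (n_words, embedding_dim)

vectors = embed_review("the film is a hoot")
```

Each row of `vectors` is the input for one time step of the LSTM. A word missing from the map raises a `KeyError` in this sketch; a real pipeline needs some policy for unknown words.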
Obtain the Sentiment of the Review

So far, you have seen how to preprocess the data and how to feed the reviews into the LSTM network.
Now, let's discuss how we can finally get the sentiment of a given review.
For each time step t, the LSTM network receives an input vector x(t) which results in the output vector y(t).
This process is repeated until x(n), n being the number of words in the review.
Let's say n=20 words.
After the last input x(n), the LSTM network has produced n output vectors, one per time step.
Each of these 20 vectors represents something, but not the sentiment we are looking for.
Rather, the vectors y are an encoded representation of the features of the review that (according to the neural network) are important for determining the sentiment.
y(8) represents the features the neural network recognized in the first 8 words of the review.
y(20), on the other hand, represents the features for the whole review.
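To make the per-step outputs concrete, here is a minimal NumPy sketch of a single-layer LSTM that returns the output vector at every time step. It uses the standard LSTM gate equations with random, untrained weights; the sizes and the function name `lstm_outputs` are assumptions for illustration and do not reflect the repo's actual model.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_outputs(inputs, W, U, b):
    """Run an LSTM over `inputs` and return the output of every time step.

    inputs: (n_steps, input_dim), W: (4*hidden, input_dim),
    U: (4*hidden, hidden), b: (4*hidden,)
    """
    hidden = U.shape[1]
    h = np.zeros(hidden)  # output vector y(t), also fed back into the next step
    c = np.zeros(hidden)  # internal cell state
    outputs = []
    for x in inputs:
        z = W @ x + U @ h + b
        i = sigmoid(z[:hidden])                 # input gate
        f = sigmoid(z[hidden:2 * hidden])       # forget gate
        o = sigmoid(z[2 * hidden:3 * hidden])   # output gate
        g = np.tanh(z[3 * hidden:])             # candidate cell values
        c = f * c + i * g
        h = o * np.tanh(c)
        outputs.append(h)
    return np.stack(outputs)

rng = np.random.default_rng(0)
n_words, input_dim, hidden_dim = 20, 4, 3  # sizes assumed for illustration
x = rng.normal(size=(n_words, input_dim))
W = rng.normal(size=(4 * hidden_dim, input_dim))
U = rng.normal(size=(4 * hidden_dim, hidden_dim))
b = np.zeros(4 * hidden_dim)

y = lstm_outputs(x, W, U, b)  # one row per word; y[7] has seen only the first 8 words
```

Because each step only looks at the current input and the previous state, changing later words cannot change earlier outputs, which is why the last row is the only one that summarizes the whole review.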
Although it is sufficient in practice to use only the last output vector y(20), I have found that using all 20 vectors, y(1) through y(20), leads to more accurate results when determining the sentiment.
This can be achieved by computing the mean value over all vectors.
Let's call this mean value vector y_mean.
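In NumPy terms, the two options differ by one line. The array below is a random stand-in for the 20 LSTM output vectors, and the output size of 3 is an assumption for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
# Stand-in for the LSTM outputs: one row per time step.
y = rng.normal(size=(20, 3))

y_last = y[-1]           # option 1: use only the final output vector
y_mean = y.mean(axis=0)  # option 2: average over all time steps
```

Both results have the same shape, so the rest of the model does not need to change when switching between them.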
Finally, the feature representation of the review that is encoded in y_mean can be used to classify the review as positive or negative.
To do so, we need to add a final classification layer, which is nothing more than the dot product between y_mean and another weight matrix W.
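A minimal sketch of this last layer follows. The text only specifies the dot product with W; the softmax that turns the two scores into probabilities, the class order, the sizes, and the random untrained weights are all assumptions for illustration.

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())  # subtract the max for numerical stability
    return e / e.sum()

rng = np.random.default_rng(0)
hidden_dim, n_classes = 3, 2  # sizes assumed for illustration
y_mean = rng.normal(size=hidden_dim)
W = rng.normal(size=(hidden_dim, n_classes))  # untrained placeholder weights

logits = y_mean @ W      # the final classification layer from the text
probs = softmax(logits)  # assumption: softmax converts scores to probabilities
label = ["negative", "positive"][int(np.argmax(probs))]  # class order is assumed
```

With trained weights, `probs` would give the model's confidence that the review is negative or positive.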
The sentiment analysis process I just described is implemented as a deep learning model in my GitHub repo.
You are welcome to check it out and try it for yourself.
After the model is trained, it can perform sentiment analysis on previously unseen reviews:

Test Samples:

Review: "the film is a hoot and is just as good if not better than much of whats on saturday morning tv especially the pseudo educational stuff we all cant stand"
pos. 04 %

Review: "the things this movie tries to get the audience to buy just wont fly with most intelligent viewers"
pos. 89 %

Review: "although life or something like it is very much in the mold of feel good movies the cast and director stephen hereks polished direction pour delightfully piquant wine from aged bottles"
pos. 03 %

Review: "this is the case of a pregnant premise being wasted by a script that takes few chances and manages to insult the intelligence of everyone in the audience"
pos. 98 %

Special Announcement: There is an online course coming!

We’re close to wrapping up our long-awaited course “Deep Learning for Predictive Analytics”.
The course emphasizes building Deep Learning applications in the field of Predictive Analytics and making them work in a production environment.
This skill set is usually not covered by other online courses, but it is crucial for those who want to work in this field professionally.
If you are interested in being notified when the course is released, or in receiving further details, you can subscribe to the newsletter below.

PS: By subscribing, you also secure one of the limited places and a 50% discount on release.