Relaxed? At peace? (Lord knows we need these things, haha.)
These emotions largely convey the intent of the author.
Can we interpret feeling or sentiment from a poem using a non-feeling computer? Well, if this post exists, definitely :).
For this task, we'll be using tidytext, a lovely R package built to simplify text analytics.
Let's get to it!

Data Preparation

First, we'll store the poem in a variable called text.
The c() function combines the lines into a single character vector.
```r
text <- c("The wind blows",
          "I breathe deeply",
          "I close my eyes so tight",
          "I can barely see light",
          "It feels as if I float in thin air",
          "It's relaxing, it's refreshing",
          "It's peace, it's nature",
          "It's as if I can fly",
          "And never die")
```

Running text will give us this:

```
[1] "The wind blows"                     "I breathe deeply"
[3] "I close my eyes so tight"           "I can barely see light"
[5] "It feels as if I float in thin air" "It's relaxing, it's refreshing"
[7] "It's peace, it's nature"            "It's as if I can fly"
[9] "And never die"
```

As we can see, the lines of the poem are now elements of a single vector, indexed by their position.
We will then pass this vector to a data structure called a tibble, which converts it into a more parsable table (data frame) format.
```r
library(dplyr)
text_df <- tibble(line = 1:9, text = text)
```

Running text_df will give us this:

```
> text_df
# A tibble: 9 x 2
   line text
  <int> <chr>
1     1 The wind blows
2     2 I breathe deeply
3     3 I close my eyes so tight
4     4 I can barely see light
5     5 It feels as if I float in thin air
6     6 It's relaxing, it's refreshing
7     7 It's peace, it's nature
8     8 It's as if I can fly
9     9 And never die
```

To fully utilize tidytext and its sentiment analysis capabilities, we'll have to break the text down into tokens: units of a sentence, such as a word or phrase, from which meaning can be extracted.
In our case, we'll break it down into individual words and try to derive meaning from them.
For this, we will use the unnest_tokens() function from the tidytext library.
```r
library(tidytext)
tokenised <- text_df %>% unnest_tokens(word, text)
```

The %>% (pipe) operator indicates we're passing text_df as input to unnest_tokens().
Running tokenised, we get the following output:
```
# A tibble: 43 x 2
    line word
   <int> <chr>
 1     1 the
 2     1 wind
 3     1 blows
 4     2 i
 5     2 breathe
 6     2 deeply
 7     3 i
 8     3 close
 9     3 my
10     3 eyes
# ... with 33 more rows
```

As we can see, each word now occupies its own row.
One step closer (insert dance emoji).
We do, however, have a problem.
Some words, such as "the", "an" and "which", hold no sentiment; these are referred to as stop words.
We will remove them by performing an anti_join (which returns the rows with no match) against a data frame containing these stop words.
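As a quick, self-contained sketch of what anti_join() does (using made-up toy tables, not the poem's data):

```r
library(dplyr)

# anti_join() keeps only the rows of the left table whose "word"
# has no match in the right table.
words <- tibble(word = c("the", "wind", "blows"))
stops <- tibble(word = c("the", "a", "an"))

anti_join(words, stops, by = "word")
# "the" is dropped; "wind" and "blows" remain
```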
```r
tokenised <- tokenised %>% anti_join(stop_words)
```

Our new tokenised table now has fewer values but more meaning.
```
> tokenised
# A tibble: 19 x 2
    line word
   <int> <chr>
 1     1 wind
 2     1 blows
 3     2 breathe
 4     2 deeply
 5     3 close
 6     3 eyes
 7     3 tight
 8     4 barely
 9     4 light
10     5 feels
11     5 float
12     5 thin
13     5 air
14     6 relaxing
15     6 refreshing
16     7 peace
17     7 nature
18     8 fly
19     9 die
```

Analysis

Now that we have meaningful words extracted from the poem, it's time to compare them to a sentiment library/lexicon and see if we can get any matches out of it.
For this, we will use the "nrc" lexicon, loaded through the get_sentiments() function, as it can return values such as joy, anticipation and anger, which I feel will serve our case better.
```r
feelings <- tokenised %>%
  inner_join(get_sentiments("nrc")) %>%
  count(word, sort = TRUE)
```

Running feelings gives us:

```
  word       n
  <chr>      <int>
1 peace      4
2 die        3
3 refreshing 1
```

The main theme of the poem, as per our analysis, is peace, echoing the first conclusion we drew earlier in the article.
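One thing worth noting about these counts: the NRC lexicon can list a single word under several sentiment categories, so inner_join() produces one row per matching category, and count() tallies those rows rather than raw word occurrences. A minimal sketch with a made-up two-entry lexicon (illustrative values only, not the real NRC data):

```r
library(dplyr)

# Mock mini-lexicon (made-up values, not the real NRC lexicon):
# "peace" is listed under two categories, "die" under one.
tokens  <- tibble(line = c(7, 9), word = c("peace", "die"))
lexicon <- tibble(word      = c("peace", "peace", "die"),
                  sentiment = c("joy", "positive", "negative"))

# inner_join() duplicates "peace" once per matching category,
# so count() reports n = 2 even though "peace" appears once.
tokens %>%
  inner_join(lexicon, by = "word") %>%
  count(word, sort = TRUE)
```

So an n of 4 for "peace" above reflects matches across sentiment categories, not four occurrences of the word in the poem.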
“Die” or death as a theme is picked up too, which is rather interesting.
Give it another read, perhaps?

Visualization

The last part of this will be trying to visualize this information.
For this, I'll use a word cloud (they look really cool and represent textual data well) from the wordcloud2 library.
```r
library(wordcloud2)
wordcloud2(feelings)
```

There we go! A word cloud showing the most popular themes, with word size representing frequency.
The code used can be obtained from GitHub here.
Bryson Mwamburi is a curious, thought-loving soul exploring Data Science.