Let’s Read A Story: a study on storytelling for children using machine learning tools

I was drawn to the Aesop's Fables texts because of their concise yet rich story lines, the use of animals as metaphors, and the strong morals embedded in each story.

Aesop's Fables for Kids, Project Gutenberg

Each original Aesop fable contains:

- A short title, usually very descriptive of the story's content and characters.
- The story itself, usually no more than 30 sentences.
- The moral of the story, which usually contains a metaphor built on the inherent nature or traits of the animals in the story.

✨ Cleaning the dataset

For the analysis of the content I compiled a JSON file holding all stories broken down into individual sentences, along with their titles, characters, and animals. This file is key to generating the experiment's new stories, as it holds all sentences and acts as the 'database' for the experiment. Furthermore, it serves as the source for fetching the seed sentence from which each new generated story grows.

⚙️ Analyzing the sentences

Using Google's Universal Sentence Encoder, a machine learning model that encodes text into high-dimensional vectors that can be used for text classification, semantic similarity, clustering, and other natural language tasks, I analyzed all sentences derived from the fables (~1,500 sentences). This yields a JSON file containing an embedding for each sentence in a 512-dimensional space; this is the similarity map I use to compare sentences and generate new adjacencies. Example line from the file:

{"message": "There was once a little Kid whose growing horns made him think he was a grown-up Billy Goat and able to take care of himself.", "message_embedding": [0.06475523114204407, -0.026618603616952896, -0.05429006740450859, 0.003563014790415764, ..., 0.06475523194004407]}

For processing and retrieval of similarities, averages, and distances between sentences, I used the ML5 Word2Vec class, changed a bit to work with the Universal Sentence Encoder scheme. The two sketches below illustrate the general idea.
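To make the encoding step concrete, here is a minimal Node.js sketch using the @tensorflow-models/universal-sentence-encoder package. The file names and the shape of the input JSON are assumptions for illustration, not the project's actual code.

```javascript
// Minimal sketch: embed every fable sentence with the Universal Sentence
// Encoder and write the embeddings to a JSON file.
// 'sentences.json' and 'embeddings.json' are hypothetical file names.
const fs = require('fs');
require('@tensorflow/tfjs-node'); // native TensorFlow backend for Node.js
const use = require('@tensorflow-models/universal-sentence-encoder');

async function embedSentences() {
  // Assumed input: a plain array of sentence strings.
  const sentences = JSON.parse(fs.readFileSync('sentences.json', 'utf8'));

  const model = await use.load();
  // model.embed() returns a tensor of shape [numSentences, 512].
  const tensor = await model.embed(sentences);
  const vectors = await tensor.array();

  const out = sentences.map((message, i) => ({
    message,
    message_embedding: vectors[i],
  }));
  fs.writeFileSync('embeddings.json', JSON.stringify(out));
}

embedSentences();
```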
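And here is one plausible way to 'grow' a story from a seed sentence using those embeddings: repeatedly append the nearest unused sentence by cosine similarity. This is a simplified stand-in for the modified ML5 Word2Vec logic, not the project's exact algorithm.

```javascript
// Cosine similarity between two embedding vectors.
function cosineSimilarity(a, b) {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Grow a story: start from a random seed sentence, then repeatedly
// append the most similar sentence that has not been used yet.
// 'entries' is the embeddings.json content from the previous sketch.
function generateStory(entries, length = 10) {
  let current = entries[Math.floor(Math.random() * entries.length)];
  const used = new Set([current.message]);
  const story = [current.message];

  while (story.length < length) {
    let best = null;
    let bestScore = -Infinity;
    for (const candidate of entries) {
      if (used.has(candidate.message)) continue;
      const score = cosineSimilarity(
        current.message_embedding,
        candidate.message_embedding
      );
      if (score > bestScore) {
        bestScore = score;
        best = candidate;
      }
    }
    story.push(best.message);
    used.add(best.message);
    current = best;
  }
  return story;
}
```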
First results

Starting by generating 10 sentences, the very first result was pretty surprising: it made sense and was fairly convincing (for a machine), though a bit dark for my taste.

First test in generating output text from the Universal Sentence Encoder (10 lines from a random seed):

They cried.
He cried.
Oho, what have we here? they cried.
They had done me no harm.
It was their feast, they thought.
It was now the Ass's turn to confess.
Vexed and disappointed he began to cry.
He was all alone.
A passerby heard his cries and asked what had happened.
And that was all the sympathy the poor Mouse got.

Another try yielded different results, very grim but still interesting.

Second test in generating output text from the Universal Sentence Encoder (10 lines from a random seed):

No matter how hard he tries, he cannot leave home.
He cannot fool us again, they said.
The kid knew there was little hope for him.
But the best he could do was to get his head out.
My sons, he said, heed what I have to say to you.
For the moment he forgot just where he was.
No, no! Daddy is watching and Daddy will kill him if he should come near!
Please let me go, he pleaded.
It would take so little to break me, he said. You know how fragile I am.
He was all alone.

Building the web application

For the first version of this project I thought it would be best if it lived in the web browser, so it would be accessible to almost everybody. I chose to build the first version with Node.js on the server side (similarity calculations, sentiment analysis, and serving the content) and JavaScript for the front-end functionality (everything else).

Adding Illustrations & Musical phrases to the story

To enrich the stories, I chose to use Google Magenta's sketch-rnn model, a generative model for vector drawings, to reconstruct illustrations from a pre-trained model to accompany the generated stories. The Quick Draw Dataset is a collection of 50 million drawings across 345 categories, contributed by players of the game Quick, Draw! The drawings were captured as timestamped vectors, tagged with metadata including what the player was asked to draw and in which country the player was located.

The smart people at Google Magenta trained a publicly available recurrent neural network model called sketch-rnn. They taught this neural net to draw by training it on millions of doodles collected from the Quick, Draw! game. While I'm using it simply to reconstruct animal and other general illustrations for the stories, there are many other creative applications for this insanely big dataset and network.

For Let's Read A Story, I chose to use this model while implementing a simple RegEx search on the resulting sentences. The JavaScript functionality determines which animal appears in the generated story and then reconstructs an illustration from the trained sketch-rnn model using P5.js. If a sentence contains an animal that does not exist in the model, another function 'enriches' the model's keywords and matches a similar animal to the one specified in the sentence (see the first sketch after this section).

These illustrations then become musical phrases based on some predetermined 'musical' rules:

Lion illustration and sound

With the help of an AFINN-based sentiment analysis library, I analyze each sentence and determine whether it carries a positive or negative sentiment. Based on that sentiment (a score between -5 and 5), I map the illustration's X and Y coordinates to musical notes on a B major or B minor scale: positive scores get the major scale and negative scores get the minor scale. Depending on the animal appearing in the sentence, I choose a different Tone.js synthesizer and a different number of musical notes (see the second sketch below).
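Here is a hedged sketch of what that animal-matching step might look like. The category list and the 'similar animal' fallback map are hypothetical stand-ins for the project's actual tables.

```javascript
// Animal categories assumed to be available as pre-trained sketch-rnn models.
// Both lists below are illustrative examples, not the project's real data.
const sketchRnnAnimals = ['cat', 'dog', 'lion', 'owl', 'rabbit', 'sheep', 'horse'];

// Fable animals with no sketch-rnn model, mapped to a similar animal that has one.
const similarAnimals = { kid: 'sheep', ass: 'horse', wolf: 'dog', mouse: 'rabbit' };

// Find the first drawable (or mappable) animal mentioned in a sentence.
function findAnimal(sentence) {
  const words = sentence.toLowerCase().match(/[a-z]+/g) || [];
  for (const word of words) {
    if (sketchRnnAnimals.includes(word)) return word;      // direct match
    if (similarAnimals[word]) return similarAnimals[word]; // enriched match
  }
  return null; // no drawable animal in this sentence
}
```

The returned category name would then select which pre-trained sketch-rnn model to load for the illustration.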
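And a minimal browser-side sketch of the sentiment-to-music mapping, assuming the npm 'sentiment' package (AFINN-based) and Tone.js. The exact scales, coordinate mapping, and synthesizer choices here are assumptions; the project's real rules may differ.

```javascript
// Browser-side sketch (assumes a bundler such as webpack or Vite).
import * as Tone from 'tone';
import Sentiment from 'sentiment'; // AFINN-based sentiment analysis

const sentiment = new Sentiment();

// One octave of B major and B natural minor (assumed note choices).
const bMajor = ['B3', 'C#4', 'D#4', 'E4', 'F#4', 'G#4', 'A#4'];
const bMinor = ['B3', 'C#4', 'D4', 'E4', 'F#4', 'G4', 'A4'];

const synth = new Tone.Synth().toDestination();

// Sentiment picks the scale; the point's X coordinate picks the scale
// degree and its Y coordinate picks the note duration (assumed mapping).
function playPoint(sentence, x, y, width, height) {
  const score = sentiment.analyze(sentence).score; // roughly -5 .. 5
  const scale = score >= 0 ? bMajor : bMinor;
  const index = Math.min(
    Math.floor((x / width) * scale.length),
    scale.length - 1
  );
  const duration = y / height < 0.5 ? '8n' : '4n';
  synth.triggerAttackRelease(scale[index], duration);
}
```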
