Introduction to Uber’s Ludwig

Create deep learning models without writing code

Gilbert Tanner
Mar 2

Figure 1: Ludwig Logo

Uber’s AI Lab continues open-sourcing its deep learning tools with its newest release, Ludwig, a toolbox built on top of TensorFlow that allows users to create and train models without writing code.
Finding the right model architecture and hyperparameters is a difficult aspect of the deep learning pipeline.
As a data scientist, you can spend hours experimenting with different hyperparameters and architectures to find the right fit for your specific problem.
This procedure is not only time-consuming and code-intensive; it also requires knowledge of the algorithms involved and of the state-of-the-art techniques used to squeeze out the last percent of performance.
Ludwig tries to provide a toolbox that allows you to train and test your deep learning model without writing code.
This helps domain experts without much deep learning knowledge build their own high-performing models.

Ludwig

Uber has developed Ludwig internally over the past two years to streamline and simplify the use of deep learning models.
They have seen its value in several of their own projects, such as information extraction from driver licenses, identification of points of interest during conversations between driver-partners and riders, and many more.
For this reason they decided to release it as open source, so that everybody can benefit from the flexibility and ease of use Ludwig provides.
Ludwig was built with the following core principles:

- No coding required: no coding skills are required to train a model and use it for obtaining predictions.
- Generality: a new data type-based approach to deep learning model design makes the tool usable across many different use cases.
- Flexibility: experienced users have extensive control over model building and training, while newcomers will find it easy to use.
- Extensibility: it is easy to add new model architectures and new feature data types.
- Understandability: deep learning model internals are often considered black boxes, but Ludwig provides standard visualizations to understand their performance and compare their predictions.

Ludwig allows us to train a deep learning model by providing only two things: a file containing the data, such as a CSV, and a YAML configuration file that describes the features in that file, including which columns are inputs (independent variables) and which are outputs (dependent variables).
If more than one dependent/output variable is specified, Ludwig will learn to predict all of the outputs simultaneously.
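
To make the shape of such a configuration concrete, here is a minimal Python sketch that builds the same structure as a dictionary with one input and two outputs. The helper `build_model_definition` and the column names (`review_text`, `sentiment`, `rating`) are hypothetical and only serve as illustration; the feature types are assumed to be among those Ludwig supports.

```python
# Hypothetical helper: builds a dict with the same shape as the YAML
# model definition (a list of input features and a list of output features).
def build_model_definition(inputs, outputs):
    """inputs/outputs: lists of (column_name, data_type) tuples."""
    return {
        "input_features": [{"name": n, "type": t} for n, t in inputs],
        "output_features": [{"name": n, "type": t} for n, t in outputs],
    }


# Column names are made up for illustration; two outputs are declared,
# so both would be predicted from the same input.
definition = build_model_definition(
    inputs=[("review_text", "text")],
    outputs=[("sentiment", "category"), ("rating", "numerical")],
)
print(definition)
```

Every feature is described only by a column name and a data type; that pair is all the configuration strictly needs to state.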
The main new idea behind Ludwig is the notion of data-type-specific encoders and decoders.
These type-specific encoders and decoders can be set in the configuration file and give us a highly modular, extensible architecture with specific preprocessing steps for each type of data.

Figure 2: Different input and output features

This design gives the user access to many functions and options that allow them to build cutting-edge models for their specific domain without demanding a lot of deep learning knowledge.
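
The idea of choosing an encoder or decoder from the declared data type can be pictured with a small conceptual sketch. None of the functions below are Ludwig's real components and nothing is learned here; the toy encoders, the concatenation step, and the `delay_hours` column are assumptions made purely to show the type-based dispatch.

```python
# Conceptual sketch only: these are toy functions, not Ludwig's real encoders.

def encode_text(value):
    # Toy text encoder: represent the text by its token count.
    return [float(len(value.split()))]


def encode_numerical(value):
    # Toy numerical encoder: pass the number through as a float.
    return [float(value)]


def decode_category(hidden, labels):
    # Toy category decoder: map the combined vector to one of the labels.
    return labels[int(sum(hidden)) % len(labels)]


# The declared feature type selects the encoder or decoder.
ENCODERS = {"text": encode_text, "numerical": encode_numerical}
DECODERS = {"category": decode_category}


def predict(row, input_features, output_feature, labels):
    hidden = []
    for feature in input_features:
        encoder = ENCODERS[feature["type"]]
        hidden.extend(encoder(row[feature["name"]]))  # combine by concatenation
    decoder = DECODERS[output_feature["type"]]
    return decoder(hidden, labels)


inputs = [
    {"name": "content", "type": "text"},
    {"name": "delay_hours", "type": "numerical"},  # hypothetical column
]
output = {"name": "airline_sentiment", "type": "category"}
row = {"content": "Great flight today", "delay_hours": 2}
print(predict(row, inputs, output, ["negative", "neutral", "positive"]))
```

Supporting a new data type in this picture means adding one entry to a registry, which is the kind of extensibility the type-based design aims for.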

Using Ludwig

To use Ludwig we first need to install it, which can be done with the following commands:

    pip install git+https://github.com/uber/ludwig
    python -m spacy download en

The next step is to create our model definition YAML file, which specifies our input and output features as well as any additional information about the specific preprocessing steps we want to take.
But before we can create this file we need to decide which dataset we want to use.
For this article I decided to use the Twitter US Airline Sentiment dataset, which is freely available for download.
Now that we have our dataset we can start writing our model definition.

    input_features:
        -
            name: content
            type: text

    output_features:
        -
            name: airline_sentiment
            type: category

With our YAML configuration file ready, we can start training our model using the following command:

    ludwig train --data_csv Tweets.csv --model_definition_file model_definition.yaml

Ludwig now performs a random split of the data into training, validation and test sets, preprocesses them, and then builds a model with the specified encoders and decoders.
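
A random three-way split of this kind can be sketched in a few lines of plain Python. This is an illustration of the general technique, not Ludwig's implementation; the function name `random_split`, the 70/10/20 proportions, and the seed are assumptions chosen for the example.

```python
import random


def random_split(rows, probabilities=(0.7, 0.1, 0.2), seed=42):
    """Assign each row to training, validation or test at random.

    The proportions and the seed are illustrative assumptions.
    """
    rng = random.Random(seed)  # local generator, so the split is reproducible
    train_p, validation_p, _ = probabilities
    splits = {"training": [], "validation": [], "test": []}
    for row in rows:
        draw = rng.random()
        if draw < train_p:
            splits["training"].append(row)
        elif draw < train_p + validation_p:
            splits["validation"].append(row)
        else:
            splits["test"].append(row)
    return splits


splits = random_split(list(range(1000)))
print({name: len(part) for name, part in splits.items()})
```

Because each row is assigned independently, the resulting set sizes only approximate the requested proportions, and every row ends up in exactly one set.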
It also displays the training progress in the console and provides TensorBoard support.
After training, Ludwig creates a results directory containing the trained model with its hyperparameters, as well as some summary statistics that can be used to visualize the training process.
One of these visualizations can be produced with the following command:

    ludwig visualize --visualization learning_curves --training_stats results/training_stats.json

This will display a graph that shows the loss and accuracy as functions of the number of epochs.
After training, we can use the model to make predictions by typing:

    ludwig predict --data_csv path/to/data.csv --model_path /path/to/model

Ludwig’s programmatic API

Ludwig also provides a Python programmatic API that allows us to train or load a model from Python.
The same sentiment problem can be implemented using the programmatic API.
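
A minimal sketch of what such a script could look like follows. It assumes the API of the early Ludwig releases (0.1.x), where `LudwigModel` is constructed from a model definition dictionary and `train` and `predict` accept a `data_csv` argument; later versions may use different parameter names. The wrapper function `train_and_predict` is hypothetical, and Ludwig is imported inside it so that the definition can be inspected without the library installed.

```python
# Same model definition as the YAML file, expressed as a Python dict.
model_definition = {
    "input_features": [{"name": "content", "type": "text"}],
    "output_features": [{"name": "airline_sentiment", "type": "category"}],
}


def train_and_predict(csv_path):
    """Hypothetical wrapper; assumes the Ludwig 0.1.x programmatic API."""
    from ludwig.api import LudwigModel  # requires Ludwig to be installed

    model = LudwigModel(model_definition)
    try:
        # Assumed 0.1.x signatures: train(data_csv=...), predict(data_csv=...)
        train_stats = model.train(data_csv=csv_path)
        predictions = model.predict(data_csv=csv_path)
    finally:
        model.close()  # release the underlying TensorFlow session
    return train_stats, predictions


if __name__ == "__main__":
    stats, predictions = train_and_predict("Tweets.csv")
    print(predictions.head())
```

Predicting on the training file is done here only to keep the sketch short; in practice the predictions would be made on held-out data.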

Recommended Readings

Introduction to Deep Learning with Keras: How to use the Keras Deep Learning library (towardsdatascience.com)

Conclusion

Ludwig is a toolbox built on top of TensorFlow that allows users to create and train models without writing code.
It provides us with many different functions and options, such as data-type-specific encoders and decoders, that allow us to build cutting-edge deep learning models.
If you liked this article, consider subscribing to my YouTube channel and following me on social media.
The code covered in this article is available as a GitHub repository.
If you have any questions, recommendations or critiques, I can be reached via Twitter or the comment section.