Let’s demonstrate using a simple case — a DL image classification problem with the fashion MNIST dataset.
Case illustration with a DL classification task

The approach

The detailed notebook is given here in my GitHub repo.
You are encouraged to go through it and fork it for your own use and extension.
Code is essential for building great software, but it is not necessarily suitable for a Medium article, which you are reading to gain insights, not to practice a debugging or refactoring exercise.
Therefore, I will just pick out selected code snippets and point out how I tried to encode, in this notebook, some of the principles detailed earlier.
The core ML task and the higher order business problem

The core ML task is simple: building a deep learning classifier for the fashion MNIST dataset, which is a fun spin on the original famous MNIST hand-written digit dataset.
Fashion MNIST consists of 60,000 training images of 28 x 28 pixel size, showing objects related to fashion, e.g. hats, shoes, trousers, t-shirts, dresses, etc.
It also consists of 10,000 test images for model validation and testing.
Fashion MNIST (https://github.com/zalandoresearch/fashion-mnist)

But what if there is a higher-order optimization or visual analytics question around this core ML task: how does the complexity of the model architecture impact the minimum number of epochs it takes to reach the desired accuracy?

It should be clear to the reader why we even bother with such a question: it is tied directly to the overall business optimization.
Training a neural net is not a trivial computational matter.
Therefore, it makes sense to investigate what minimum training effort must be spent to achieve a target performance metric and how the choice of architecture impacts that.
In this example, we will not even use a convolutional net, as a simple densely connected neural net can achieve reasonably high accuracy. In fact, somewhat sub-optimal performance is required to illustrate the main point of the higher-order optimization question we posed above.
Our solution

So, we have to solve two problems:

- How do we determine the minimum number of epochs needed to reach the desired accuracy target?
- How does the specific architecture of the model impact this number, or the training behavior in general?

To achieve these goals, we will use two simple OOP principles:

- Create an inherited class from a base class
- Create utility functions and call them from a compact code block that can be presented to an external user for higher-order optimization and analytics

Code snippets to show the good practices

Here we show some code snippets to illustrate how simple OOP principles have been utilized to achieve our solution.
The snippets are marked with comments for easy understanding.
First, we inherit from a Keras class and write our own subclass, adding a method that checks the training accuracy and takes an action based on its value.
This simple callback results in dynamic control of the epochs — the training stops automatically when the accuracy reaches the desired threshold.
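The gist embedded in the original article is not reproduced here, but such a callback might look like the following sketch. The class name `AccuracyThresholdCallback` and the argument name `acc_threshold` are my assumptions; `print_msg` mirrors the argument the article mentions later:

```python
import tensorflow as tf

class AccuracyThresholdCallback(tf.keras.callbacks.Callback):
    """Stop training once the training accuracy reaches a target threshold."""

    def __init__(self, acc_threshold=0.9, print_msg=True):
        super().__init__()
        self.acc_threshold = acc_threshold
        self.print_msg = print_msg  # silence status messages for batch runs

    def on_epoch_end(self, epoch, logs=None):
        logs = logs or {}
        acc = logs.get("accuracy")
        if acc is not None and acc >= self.acc_threshold:
            if self.print_msg:
                print(f"\nReached {self.acc_threshold:.0%} accuracy, stopping training")
            # Keras checks this flag after every epoch
            self.model.stop_training = True
```

Passing an instance of this class in the `callbacks` list of `model.fit()` is all that is needed for the dynamic epoch control described above.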
We put the Keras model construction code in a utility function so that a model with an arbitrary number of layers and an arbitrary architecture (as long as the layers are densely connected) can be generated from simple user input in the form of a few function arguments.
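A model builder along these lines might look like the following sketch; the name `build_model` and its signature are illustrative, not the author's exact code:

```python
import tensorflow as tf

def build_model(layer_sizes, num_classes=10, input_shape=(28, 28)):
    """Build a densely connected classifier with one hidden layer per
    entry in `layer_sizes` (e.g. [128, 64] gives two hidden layers)."""
    model = tf.keras.Sequential([
        tf.keras.Input(shape=input_shape),
        tf.keras.layers.Flatten(),
    ])
    for units in layer_sizes:
        model.add(tf.keras.layers.Dense(units, activation="relu"))
    model.add(tf.keras.layers.Dense(num_classes, activation="softmax"))
    return model
```

A single list argument now controls the depth and width of the network, which is exactly what the higher-order sweep needs.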
We can even put the compilation and training code into a utility function to use those hyperparameters in a higher-order optimization loop conveniently.
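A hedged sketch of such a training wrapper is shown below; the function name `compile_and_train` and its defaults are assumptions made for illustration:

```python
import tensorflow as tf

def compile_and_train(model, x_train, y_train, callbacks=None,
                      optimizer="adam", batch_size=32, epochs=10, verbose=0):
    """Compile and fit the model; the hyperparameters are exposed as
    arguments so that a higher-order loop can sweep over them."""
    model.compile(optimizer=optimizer,
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    history = model.fit(x_train, y_train,
                        batch_size=batch_size, epochs=epochs,
                        callbacks=callbacks or [], verbose=verbose)
    return history
```

The returned `History` object records per-epoch loss and accuracy, which the plotting utility below can consume.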
Next, it’s time for visualization.
Again here, we go by the practice of functionalization.
Generic plot functions take raw data as input.
However, if we have a specific purpose of plotting the evolution of training set accuracy and showing how it compares to the target, then our plot function should just take the deep learning model as the input and generate the desired plot.
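Such a purpose-built plot function might be sketched as follows; the name `plot_training_accuracy` and the `title` argument (which becomes important later) are assumptions:

```python
import matplotlib.pyplot as plt

def plot_training_accuracy(model, target_acc, title=None):
    """Plot per-epoch training accuracy from a fitted Keras model,
    with a horizontal line marking the target accuracy."""
    acc = model.history.history["accuracy"]
    epochs = range(1, len(acc) + 1)
    fig, ax = plt.subplots()
    ax.plot(epochs, acc, marker="o", label="training accuracy")
    ax.axhline(target_acc, linestyle="--", color="red",
               label=f"target = {target_acc:.0%}")
    ax.set_xlabel("Epoch")
    ax.set_ylabel("Accuracy")
    # A custom title per plot makes the generated analytics self-describing
    ax.set_title(title or f"Training accuracy vs. target {target_acc:.0%}")
    ax.legend()
    return fig
```

Note that the caller passes only the model and the target; all the data extraction happens inside the function.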
A typical result looks like the following.

Final analytics code — super compact and simple

And now we can take advantage of all the functions and classes we defined earlier and bring them all together to accomplish the higher-order task.
Consequently, our final code will be super compact, but it will generate the same interesting plots of loss and accuracy over epochs, shown above, for a variety of accuracy threshold values and neural network architectures.
This will give a user the ability to produce, with a minimal amount of code, visual analytics about the choice of performance metric (accuracy in this case) and neural network architecture.
This is the first step towards building an optimized machine learning system.
We generate a few cases for investigation. Our final analytics/optimization code is succinct and easy to follow for a high-level user who does not need to know the complexity of Keras model building or callback classes.
This is the core principle behind OOP — the abstraction of the layers of complexity, which we are able to accomplish for our deep learning task.
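The shape of that final, compact loop might look like the following sketch. To keep the example self-contained, the build/train/plot work is injected as a single `train_fn` callable (the function names here are illustrative, not the notebook's exact code):

```python
def run_sweep(architectures, acc_thresholds, train_fn):
    """Sweep over candidate architectures and accuracy targets.

    `train_fn(layer_sizes, target_acc)` is expected to build a model,
    train it with the early-stopping callback, plot the result, and
    return the number of epochs it took to reach the target."""
    results = {}
    for layer_sizes in architectures:
        for target_acc in acc_thresholds:
            epochs_taken = train_fn(layer_sizes, target_acc)
            results[(tuple(layer_sizes), target_acc)] = epochs_taken
    return results
```

The high-level user only sees a short loop and a results table; all the Keras machinery stays hidden behind `train_fn`.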
Note how we pass print_msg=False to the class instance.
While we needed basic printing of status for initial check/debug, we should execute the analysis silently for the optimization task.
If we did not have this argument in our class definition, we would have no way to silence the debugging messages.
We show some of the representative results which are automatically generated from executing the code block above.
It clearly shows how, with a minimal amount of high-level code, we are able to generate visual analytics to judge the relative performance of various neural architectures at various levels of the performance metric.
This lets a user, without tweaking the lower-level functions, easily make a judgment on the choice of a model as per their performance demands.
Also, note the custom titles for each plot.
These titles clearly state the target performance and the complexity of the neural net, thereby making the analysis easy to interpret.
It was a small addition to the plotting utility function but this shows the need for careful planning while creating such functions.
If we had not planned for such an argument to the function, it would not have been possible to generate a custom title for each plot.
This careful planning of API (application program interface) is part and parcel of good OOP.
Finally, turn the scripts into a simple Python module

So far, you may have been working with a Jupyter notebook, but you may want to turn this exercise into a neat Python module that you can import any time you want.
Just like you write “from matplotlib import pyplot”, you can import these utility functions (Keras model build, train, and plotting) anywhere.
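As a minimal illustration of the mechanics (the file name `fmnist_utils.py` is hypothetical), a module saved to disk becomes importable like any library:

```python
import pathlib
import sys
import tempfile

# In practice you would simply save your notebook's utility functions
# into a file, e.g. fmnist_utils.py, next to your scripts. Here we
# simulate that by writing a tiny stand-in module to a temp directory.
MODULE_SRC = '''
def describe():
    """One-liner docstring: a stand-in for the real utility functions."""
    return "model-building, training, and plotting utilities"
'''

tmp_dir = tempfile.mkdtemp()
pathlib.Path(tmp_dir, "fmnist_utils.py").write_text(MODULE_SRC)
sys.path.insert(0, tmp_dir)

# Now it imports just like `from matplotlib import pyplot`
import fmnist_utils
print(fmnist_utils.describe())
```

Once the file lives on your import path (or is installed as a package), every notebook and script can share the same tested utilities.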
Summary and conclusions

We showed some simple good practices, borrowed from OOP, to apply to a DL analysis task.
Almost all of them may seem trivial to seasoned software developers but this post is for budding data scientists who may not have that background but should understand the importance of imbuing these good practices in their machine learning workflow.
At the risk of repeating myself one too many times, let me summarize them again here:

- Whenever you get a chance, turn repetitive code blocks into utility functions
- Think very carefully about the API of a function, i.e. what minimal set of arguments is required and how they will serve a purpose for a higher-level programming task
- Don't forget to write a docstring for a function, even if it is a one-liner description
- If you start accumulating many utility functions related to the same object, consider turning that object into a class and making the utility functions its methods
- Extend class functionality using inheritance whenever you get a chance to accomplish a complex analysis
- Don't stop at Jupyter notebooks; turn them into executable scripts and put them in a small module
Build the habit of modularizing your work so that it can be easily re-used and extended by anyone, anywhere.
Who knows, you may be able to release a utility package on the Python package repository (the PyPI server) when you accumulate enough useful classes and sub-modules.
You will then have the bragging rights of releasing an original open-source package :-)

If you have any questions or ideas to share, please contact the author at tirthajyoti[AT]gmail.com.
Also, you can check the author’s GitHub repositories for other fun code snippets in Python, R, or MATLAB and machine learning resources.
If you are, like me, passionate about machine learning/data science, please feel free to add me on LinkedIn or follow me on Twitter.