Introducing TensorWatch: Microsoft Research’s New Tool for Debugging Deep Learning Programs

That’s a question that data scientists need to answer in every deep learning scenario.

Many deep learning techniques are complex in nature and, although they prove very accurate in many scenarios, they can become incredibly difficult to interpret.

If we plot some of the best-known deep learning models on a chart that correlates accuracy and interpretability, we will get something like the following:

[Figure: accuracy versus interpretability of well-known deep learning models]

Extrapolating the accuracy-interpretability friction to debugging means that accurate models are, very often, next to impossible to debug.

What You See is What You Log Dilemma

Given the structural complexity of deep learning models, most debugging tools follow a what-you-see-is-what-you-log (WYSIWYL) approach that records every single action in the model and represents the results using a number of predefined visualizations.

The challenge with WYSIWYL techniques is that we are using predetermined visuals to interpret constantly-changing models.

Very often, data scientists need to evaluate the debugging data in order to formulate a solid thesis about the behavior of the model.

Enter TensorWatch

TensorWatch tries to address some of the challenges outlined in the previous section by moving debugging to the data scientist’s favorite environment: Jupyter Notebooks.

Instead of relying on prepackaged visualizations, TensorWatch leverages Jupyter Notebooks to provide interactive debugging of real-time training processes using either the composable UI in Jupyter Notebooks or the live shareable dashboards in Jupyter Lab.

One of the central premises of TensorWatch is that everything is treated as a stream.

That abstraction encompasses diverse data sources such as files, console, sockets, cloud storage, and even visualizations themselves.

With a common interface, TensorWatch streams can listen to other streams, which enables the creation of custom data flow graphs.
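
The idea of streams listening to other streams can be sketched in a few lines of plain Python. This is not TensorWatch's implementation or API: the `Stream` class and its `subscribe`, `write`, and `map` methods are hypothetical names chosen only to illustrate how a common interface lets small pieces be chained into a data flow graph.

```python
from typing import Any, Callable, List


class Stream:
    """Hypothetical minimal stream: forwards every written value to its listeners."""

    def __init__(self) -> None:
        self._listeners: List[Callable[[Any], None]] = []

    def subscribe(self, listener: Callable[[Any], None]) -> None:
        # A listener can be any callable, including another stream's write().
        self._listeners.append(listener)

    def write(self, value: Any) -> None:
        for listener in self._listeners:
            listener(value)

    def map(self, fn: Callable[[Any], Any]) -> "Stream":
        # Create a downstream stream that receives fn(value) for each value.
        out = Stream()
        self.subscribe(lambda value: out.write(fn(value)))
        return out


# Build a tiny data flow graph: raw losses -> rounded losses -> collected list.
raw_loss = Stream()
rounded = raw_loss.map(lambda v: round(v, 2))
collected: List[float] = []
rounded.subscribe(collected.append)

for loss in (0.9134, 0.5271, 0.3008):
    raw_loss.write(loss)
```

Because every node exposes the same `write` and `subscribe` pair, a file writer, a plot, or another transformation could be attached to `rounded` without touching the code that produces `raw_loss`.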

When you write to a TensorWatch stream, the values get serialized and sent to a TCP/IP socket as well as the file you specified.

From a Jupyter Notebook, TensorWatch loads the previously logged values from the file and then listens to that TCP/IP socket for any future values.

The visualizer listens to the stream and renders the values as they are processed.
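
The "replay the file, then listen for new values" pattern described above can be illustrated with a small, self-contained sketch. It makes several simplifying assumptions: the `LoggedStream` class and its `attach` method are hypothetical, JSON lines stand in for whatever serialization TensorWatch actually uses, and in-process callbacks stand in for the TCP/IP socket.

```python
import json
import os
import tempfile
from typing import Any, Callable, List


class LoggedStream:
    """Hypothetical stream that persists each value to a file and notifies live listeners."""

    def __init__(self, path: str) -> None:
        self.path = path
        self._listeners: List[Callable[[Any], None]] = []

    def write(self, value: Any) -> None:
        # Serialize the value, append it to the log file, then push it to live listeners.
        with open(self.path, "a", encoding="utf-8") as f:
            f.write(json.dumps(value) + "\n")
        for listener in self._listeners:
            listener(value)

    def attach(self, listener: Callable[[Any], None]) -> None:
        # Replay everything logged so far, then receive future values live.
        if os.path.exists(self.path):
            with open(self.path, encoding="utf-8") as f:
                for line in f:
                    listener(json.loads(line))
        self._listeners.append(listener)


log_path = os.path.join(tempfile.mkdtemp(), "train_log.jsonl")
stream = LoggedStream(log_path)

stream.write([0, 1.25])      # logged before any viewer is attached
seen: List[Any] = []
stream.attach(seen.append)   # replays [0, 1.25] from the file
stream.write([1, 0.75])      # delivered live to the attached viewer
```

A viewer that attaches late still sees the full history, which is what lets a notebook be opened in the middle of a long training run.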

TensorWatch includes many types of visualizations which can be leveraged natively from the Jupyter Notebook environment.

The TensorWatch programming model includes two fundamental constructs: Watcher and WatcherClient.

The Watcher allows you to create TensorWatch streams, and it listens for incoming requests for those streams.

The WatcherClient can run on the same machine or a different one; it connects to the Watcher, requests these streams, and feeds them to visualizers.

For example, two WatcherClient instances, here one for training and one for test data, can be initialized using the following code:

    import tensorwatch as tw

    train = tw.WatcherClient(port=0)
    test = tw.WatcherClient(port=1)

Creating debugging visualizations with TensorWatch is extremely simple.

Let’s take an example in which we would like to plot the epoch and batch loss during training.

That can be accomplished in the following three lines of code:

    # Request a stream of (epoch, batch loss) pairs, evaluated on each 'batch' event
    loss_stream = train.create_stream(expr='lambda d: (d.metrics.epochf, d.metrics.batch_loss)', event_name='batch')
    # Render the stream as a live line chart
    loss_plot = tw.Visualizer(loss_stream, vis_type='line', xtitle='Epoch', ytitle='Train Loss')
    loss_plot.show()

One of the unique capabilities of TensorWatch is known as lazy logging mode, which refers to a configuration in which the engine doesn’t require all the explicit logging information beforehand.

Instead, TensorWatch can observe the relevant variables related to the debugging process.

A data scientist can then use TensorWatch to perform interactive queries that run in the context of these variables and return streams as a result.

These streams can then be visualized, saved, or processed as needed.
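
To make the lazy logging idea concrete, here is a minimal sketch in plain Python. It is not TensorWatch's code: `LazyObserver` and its methods are hypothetical stand-ins that only show the mechanism of observing variables first and deciding later, through a lambda expression passed as a string, what to compute from them.

```python
from types import SimpleNamespace
from typing import Any, Callable, List


class LazyObserver:
    """Hypothetical stand-in for the training side: holds the observed variables."""

    def __init__(self) -> None:
        self._observed = SimpleNamespace()
        self._queries: List[Callable[[Any], None]] = []

    def observe(self, **variables: Any) -> None:
        # Update the observed variables, then run every registered query on them.
        for name, value in variables.items():
            setattr(self._observed, name, value)
        for run_query in self._queries:
            run_query(self._observed)

    def create_stream(self, expr: str) -> List[Any]:
        # eval() turns the expression string into a callable. Only do this with
        # trusted input, since eval() executes arbitrary code.
        fn = eval(expr)
        results: List[Any] = []
        self._queries.append(lambda d: results.append(fn(d)))
        return results


observer = LazyObserver()
# The query is registered separately from the training loop, which contains
# no dedicated logging call for "sum of weights".
weight_sums = observer.create_stream("lambda d: sum(d.weights)")

for step in range(3):
    weights = [0.5 * step, 1.0, 2.0]   # pretend these change during training
    observer.observe(weights=weights)
```

The training loop only exposes its variables; what gets computed from them is decided by whoever writes the query. A plain list plays the role of the resulting stream here.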

The WatcherClient class is a key component of the lazy logging mode.

Functionally, the WatcherClient allows you to connect to a Watcher and have it execute a Python lambda expression.

The following code illustrates that concept using a simple lambda expression that sums values in a weights array.

    import tensorwatch as tw

    client = tw.WatcherClient()
    stream = client.create_stream(expr='lambda d: np.sum(d.weights)')
    line_plot = tw.Visualizer(stream, vis_type='line')
    line_plot.show()

TensorWatch brings several unique contributions to the debugging of deep learning models, but it doesn’t do that all by itself.

The tool leverages several popular libraries, including hiddenlayer, torchstat, and Visual Attribution, to support the usual debugging and analysis activities in one consistent package and interface.

TensorWatch is certainly a unique approach to improving the debugging experience in deep learning programs.

The open source release of TensorWatch opens the door to more contributions in this important area of the deep learning lifecycle.

Certainly, I would like to see TensorWatch merged into some of the existing deep learning frameworks and platforms in the market.
