Remaining Life Estimation with Keras

From Time Series to Images… Asking a CNN 'when does the next fault occur?'

Marco Cerliani
Apr 23

Predicting rare events is becoming an important topic of research and development in many Artificial Intelligence solutions.

Survival Analysis, customer churn prediction, Predictive Maintenance and Anomaly Detection are some examples of the most popular fields of application that deal with rare events.

Given these scenarios, we can imagine a rare event as a particular state that occurs under specific conditions, diverging from normal behaviour, but which plays a key role in terms of economic interest.

In this post I’ve developed a Machine Learning solution to predict the Remaining Useful Life (RUL) of a particular engine component.

This kind of problem plays a key role in the field of Predictive Maintenance, where the purpose is to answer the question 'How much time is left before the next fault?'.

To achieve this goal I developed a convolutional neural network (CNN) in Keras that deals with time series in the form of images.

THE DATASET

For a Data Scientist, the most important problem when dealing with this kind of task is the scarcity of rare events among the available observations.

So the first step towards good performance is to have at our disposal the richest possible dataset, one that covers every kind of possible scenario.

The Turbofan Engine Degradation Simulation Dataset, provided by NASA, is becoming an important benchmark in Remaining Useful Life (RUL) estimation for a fleet of engines of the same type (100 in total).

Data are available in the form of time series: 3 operational settings, 21 sensor measurements and the cycle, i.e. the observations over the engine's working life.

The engine is operating normally at the start of each time series, and develops a fault at some point during the series.

In the training set, the fault grows in magnitude until system failure.

In the test set, the time series ends some time prior to system failure.

The objective is to predict the number of remaining operational cycles before failure in the test set, i.e., the number of operational cycles after the last cycle that the engine will continue to operate.

To better understand this explanation, let's have a look at the data:

```python
train_df.id.value_counts().plot.bar()
```

Engines have different life durations.

The average working time in the train data is 206 cycles, with a minimum of 128 and a maximum of 362.
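These figures can be reproduced with a simple count of cycles per engine id; a minimal sketch, where the toy frame below stands in for the real train_df (one row per engine cycle):

```python
import pandas as pd

# toy stand-in for train_df: engine 1 runs 3 cycles, engine 2 runs 5
train_df = pd.DataFrame({'id': [1] * 3 + [2] * 5})

# number of cycles per engine
cycles_per_engine = train_df.id.value_counts()
print(cycles_per_engine.mean(), cycles_per_engine.min(), cycles_per_engine.max())
# → 4.0 3 5
```

On the real dataset the same three calls yield the 206 / 128 / 362 figures quoted above.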

The operational settings and sensor measurements for a single engine in the train set are plotted below:

```python
engine_id = train_df[train_df['id'] == 1]
engine_id[train_df.columns[2:]].plot(subplots=True, sharex=True, figsize=(20,30))
```

[Figures: settings 1–3 and sensors 1–9 for engine1; sensors 10–21 for engine1]

Plotting is always a good idea… this way we get a nice general overview of the data at our disposal.

At the end of the majority of the series we can observe a divergent behaviour, which announces a future failure.

PREPARE THE DATA

In order to predict the RUL of each engine we pursued a classification approach, generating the labels ourselves in this way: from 0 (fault) to 15 remaining cycles we labeled as 2, from 16 to 45 cycles we labeled as 1, and the rest (>45) as 0.

It is clear that in a realistic scenario the category labeled as 2 is the most economically valuable. Predicting this class with good performance permits an adequate maintenance program, avoiding future faults and saving money.
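As a sketch, the three classes described above can be derived from a remaining-useful-life count per cycle (the RUL values below are made up; the post does not show this step explicitly):

```python
import numpy as np
import pandas as pd

# hypothetical remaining-useful-life values for a few cycles of one engine
rul = pd.Series([120, 60, 45, 30, 16, 15, 8, 0])

# 0-15 remaining cycles -> class 2, 16-45 -> class 1, >45 -> class 0
label = np.where(rul <= 15, 2, np.where(rul <= 45, 1, 0))
print(label.tolist())  # → [0, 0, 1, 1, 1, 2, 2, 2]
```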

In order to have the maximum amount of training data at our disposal, we split each series with a fixed-length window sliding 1 step at a time.

For example, engine1 has 192 cycles in train; with a window length of 50 we extract 142 time series of length 50: window1 -> from cycle0 to cycle49, window2 -> from cycle1 to cycle50, … , window142 -> from cycle141 to cycle190.

Each window is labeled with the corresponding label of the final cycle taken into account by the window.

```python
sequence_length = 50

def gen_sequence(id_df, seq_len, seq_cols):
    # yield sliding windows of length seq_len over one engine's series
    data_matrix = id_df[seq_cols].values
    n_elem = data_matrix.shape[0]
    for a, b in zip(range(0, n_elem - seq_len), range(seq_len, n_elem)):
        yield data_matrix[a:b, :]

def gen_labels(id_df, seq_len, lab):
    # label of the cycle each window ends at
    data_matrix = id_df[lab].values
    n_elem = data_matrix.shape[0]
    return data_matrix[seq_len:n_elem, :]
```

FROM TIME SERIES TO IMAGES

To make things more interesting, I decided to transform the series at our disposal into images, in order to feed our classification model with them.

I’ve created the images following this amazing resource.

The concept is simple… when transforming time series into images, the usual choice is a spectrogram.

This choice is clever but not always the best one (as you can read here).

In that post the author explains his justified perplexity about dealing with audio series through a spectrogram representation.

He talks about sound, but the reasoning can be translated to our scenario.

Spectrograms are powerful, but their usage may result in a loss of information, particularly if we approach the problem in a computer vision way.

To be effective, a 2D CNN relies on spatial invariance; this builds on the assumption that the features of a classical image (like a photo) carry the same meaning regardless of their location.

On the other hand, a spectrogram is a two-dimensional representation made of two different units (frequency and time).

For these reasons I decided to transform my time series windows (of length 50 cycles) making use of Recurrence Plots.

They are easy to implement in Python with a few lines of code, making use of SciPy.

```python
import numpy as np
from scipy.spatial.distance import pdist, squareform

def rec_plot(s, eps=0.10, steps=10):
    # pairwise distances between every pair of points in the series
    d = pdist(s[:, None])
    # discretize the distances into bins of width eps, capped at 'steps'
    d = np.floor(d / eps)
    d[d > steps] = steps
    # back to a square len(s) x len(s) matrix
    Z = squareform(d)
    return Z
```

With this function we are able to generate a 50×50 image for every time series at our disposal (I excluded the constant time series with 0 variance).

So every single observation is made of an array of images of size 50x50x17 (17 is the number of time series with non-zero variance), like below.
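A minimal sketch of how one such observation can be assembled: one recurrence plot per sensor channel, stacked along the last axis (the random window below is only a stand-in for a real 50-cycle window):

```python
import numpy as np
from scipy.spatial.distance import pdist, squareform

def rec_plot(s, eps=0.10, steps=10):
    # recurrence plot of a 1-D series, as defined earlier
    d = np.floor(pdist(s[:, None]) / eps)
    d[d > steps] = steps
    return squareform(d)

# hypothetical window: 50 cycles x 17 non-constant channels
window = np.random.rand(50, 17)

# one 50x50 recurrence plot per channel, stacked into a 50x50x17 "image"
obs = np.stack([rec_plot(window[:, i]) for i in range(window.shape[1])], axis=-1)
print(obs.shape)  # → (50, 50, 17)
```

Stacking every windowed observation this way produces the 4-D tensor (n_samples, 50, 50, 17) that the CNN below consumes.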

[Figure: example of a train observation]

THE MODEL

At this point we are ready to build our model. I adopted a classical 2D CNN architecture:

```python
from keras.models import Sequential
from keras.layers import Conv2D, MaxPooling2D, Dropout, Flatten, Dense
from keras.callbacks import ModelCheckpoint

# save the best model (by validation accuracy) to 'filepath'
checkpoint = ModelCheckpoint(filepath, monitor='val_acc', verbose=2,
                             save_best_only=True, mode='max',
                             save_weights_only=False)

model = Sequential()
model.add(Conv2D(32, (3, 3), activation='relu', input_shape=(50, 50, 17)))
model.add(Conv2D(32, (3, 3), activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.25))
model.add(Conv2D(64, (3, 3), activation='relu'))
model.add(Conv2D(64, (3, 3), activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.25))
model.add(Flatten())
model.add(Dense(256, activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(3, activation='softmax'))
model.compile(loss='categorical_crossentropy', optimizer='adam',
              metrics=['accuracy'])

# x_train_img: recurrence-plot images, y_train: one-hot encoded labels
model.fit(x_train_img, y_train, batch_size=200, epochs=10,
          callbacks=[checkpoint], validation_split=0.2, verbose=1)
```

I fit for only 10 epochs and achieved an accuracy of 0.832.
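Note that categorical_crossentropy expects one-hot encoded targets, so the integer labels 0/1/2 must be converted first; a sketch of the encoding (the NumPy trick below is equivalent to keras.utils.to_categorical):

```python
import numpy as np

# integer class labels (0 = healthy, 1 = intermediate, 2 = near failure)
y = np.array([0, 2, 1, 2])

# one-hot encoding over 3 classes: pick rows of the identity matrix
y_onehot = np.eye(3)[y]
print(y_onehot.shape)  # → (4, 3)
```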

[Figure: confusion matrix, plotted following the sklearn documentation]

From the confusion matrix we can see that our model discriminates well when an engine is close to failure (label 2: <16 cycles remaining) or when it works normally (label 0: >45 cycles). A little noise is present in the intermediate class (label 1: 16–45 cycles).
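The confusion matrix itself can be computed with scikit-learn; a minimal sketch on made-up predictions (y_true and y_pred below are illustrative, not the model's actual outputs):

```python
import numpy as np
from sklearn.metrics import confusion_matrix

# hypothetical true classes and model predictions for six windows
y_true = np.array([0, 0, 1, 1, 2, 2])
y_pred = np.array([0, 0, 1, 2, 2, 2])

# rows = true class, columns = predicted class
cm = confusion_matrix(y_true, y_pred)
print(cm)
# → [[2 0 0]
#    [0 1 1]
#    [0 0 2]]
```

Here the single off-diagonal entry shows one intermediate window (class 1) confused with class 2, the kind of noise mentioned above.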

We are satisfied to achieve a great and clear result for the prediction of class 2, i.e. near to failure.

SUMMARY

In this post we tried to solve a Predictive Maintenance problem. Estimating the RUL of engines, we are conscious of dealing with rare events, due to the difficulty of collecting this kind of data.

We proposed an interesting solution, transforming time series into images making use of Recurrence Plots. In this way we are able to discriminate well the engines which are at the end of their working life.

CHECK MY GITHUB REPO

Keep in touch: Linkedin.