Anomaly Detection

Analytics Zoo provides pre-defined LSTM-based models to detect anomalies in time series data. A sequence of values leading up to the current time (e.g., the last 50 hours) is used as input to the model, which then tries to predict the next data point. A data point is flagged as an anomaly when its actual value is far from the model's prediction.


  1. The models are Keras-style models, so they can be trained with the Keras-style APIs (compile and fit), as well as with NNFrames or the BigDL Optimizer.
  2. The models are based on LSTM.

Build an AnomalyDetector model

You can call the following APIs in Scala and Python, respectively, to create an AnomalyDetector model:


val model = AnomalyDetector(featureShape, hiddenLayers, dropouts)


from zoo.models.anomalydetection import AnomalyDetector
model = AnomalyDetector(feature_shape=(10, 3), hidden_layers=[8, 32, 15], dropouts=[0.2, 0.2, 0.2])

Unroll features

To prepare input for an AnomalyDetector model, you can unroll time series data with a given unroll length.


val unrolled = AnomalyDetector.unroll(dataRdd, unrollLength, predictStep)


unrolled = AnomalyDetector.unroll(data_rdd, unroll_length, predict_step)
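To make the idea of unrolling concrete, here is a minimal pure-Python sketch that does not use Analytics Zoo. The helper name unroll_sketch is hypothetical, and the sketch assumes that each label is the value predict_step steps after the end of its window; the library's own unroll works on RDDs and may differ in details such as the return type.

```python
def unroll_sketch(values, unroll_length, predict_step=1):
    """Return (window, label) pairs built from a list of values.

    Assumption: the label is the value `predict_step` steps after
    the last element of the window.
    """
    samples = []
    # The last usable start index must leave room for the window and its label.
    last_start = len(values) - unroll_length - predict_step
    for start in range(last_start + 1):
        window = values[start:start + unroll_length]
        label = values[start + unroll_length + predict_step - 1]
        samples.append((window, label))
    return samples


series = [1, 2, 3, 4, 5, 6]
pairs = unroll_sketch(series, unroll_length=3, predict_step=1)
for window, label in pairs:
    print(window, "->", label)
```

With an unroll length of 3, each window of three consecutive values becomes one training sample, and the value that follows it becomes the target.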

Detect anomalies

After training, the model can predict values from previous data, and those predictions can then be used to detect anomalies. Anomalies are found by comparing predictions with actual values: the absolute differences between predictions and actual values are ranked in descending order, and the top anomalySize data points are reported as anomalies.


val anomalies = AnomalyDetector.detectAnomalies(yTruth, yPredict, anomalySize)


anomalies = AnomalyDetector.detect_anomalies(y_truth, y_predict, anomaly_size)
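The ranking rule described above can be illustrated with a short pure-Python sketch. The function name detect_anomalies_sketch and the choice to return positions in the series are assumptions made for illustration; the library's detect_anomalies operates on distributed data and its return format is not shown here.

```python
def detect_anomalies_sketch(y_truth, y_predict, anomaly_size):
    """Return the indices of the `anomaly_size` largest absolute errors.

    Assumption: returning indices (in ascending order) is an
    illustrative choice, not the library's return format.
    """
    # Absolute difference between each actual value and its prediction.
    errors = [abs(t - p) for t, p in zip(y_truth, y_predict)]
    # Rank indices by error, largest first.
    ranked = sorted(range(len(errors)), key=lambda i: errors[i], reverse=True)
    # Keep the top `anomaly_size` indices and report them in time order.
    return sorted(ranked[:anomaly_size])


y_truth = [10.0, 11.0, 30.0, 12.0, 2.0]
y_predict = [10.5, 11.0, 12.0, 12.5, 11.0]
anomaly_indices = detect_anomalies_sketch(y_truth, y_predict, anomaly_size=2)
print(anomaly_indices)
```

Because the rule always selects a fixed number of points, anomaly_size acts as a budget rather than an error threshold: the top-ranked points are reported even when every prediction is close to the actual value.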

Save Model

After building and training an AnomalyDetector model, you can save it for future use.


model.saveModel(path, weightPath = null, overWrite = false)


model.save_model(path, weight_path=None, over_write=False)

Load Model

To load an AnomalyDetector model (with weights) that was saved as described above:


AnomalyDetector.loadModel[Float](path, weightPath = null)


AnomalyDetector.load_model(path, weight_path=None)