Inference


Inference Model is a package in Analytics Zoo that provides high-level APIs to speed up development. It allows users to conveniently use pre-trained models from Analytics Zoo, Caffe, TensorFlow and OpenVINO Intermediate Representation (IR). Inference Model provides Java, Scala and Python interfaces.

Highlights

  1. Easy-to-use APIs for loading and prediction with deep learning models from Analytics Zoo, Caffe, TensorFlow and OpenVINO Intermediate Representation (IR).
  2. Support for transforming various input data types, so different data sources can be fed into prediction tasks.
  3. Transparent support for the OpenVINO toolkit, which delivers a significant boost in inference speed (up to 19.9x).

Basic usage of Inference Model:

  1. Directly use InferenceModel or write a subclass that extends InferenceModel (AbstractInferenceModel in Java).
  2. Load pre-trained models with the corresponding load methods, e.g., doLoadBigDL for Analytics Zoo models and doLoadTensorflow for TensorFlow models.
  3. Make predictions with the predict method (see the end-to-end sketch after this list).
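
Putting the three steps together, here is a minimal end-to-end Scala sketch. The file paths are placeholders, the input shape is chosen only for illustration, and JTensor's (data, shape) constructor plus the doPredict overload taking nested lists of JTensor are assumptions about the inference API:

import java.util.{Arrays => JArrays}
import com.intel.analytics.zoo.pipeline.inference.{InferenceModel, JTensor}

// 1. Directly use InferenceModel (or a subclass).
val model = new InferenceModel()
// 2. Load a pre-trained Analytics Zoo (BigDL) model; the paths are placeholders.
model.doLoadBigDL("/path/to/model.model", "/path/to/weights.bin")
// 3. Predict on a single zero-filled tensor (shape is illustrative only).
val input = new JTensor(new Array[Float](3 * 224 * 224), Array(3, 224, 224))
val predictOutput = model.doPredict(JArrays.asList(JArrays.asList(input)))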

OpenVINO requirements:

System requirements:

Ubuntu 16.04.3 LTS or higher (64 bit)
CentOS 7.6 or higher (64 bit)
macOS 10.14 or higher (64 bit)

Python requirements:

tensorflow>=1.2.0,<2.0.0
networkx>=1.11
numpy>=1.12.0
defusedxml>=0.5.0
test-generator>=0.1.1

Supported models:

  1. Analytics Zoo Models
  2. Caffe Models
  3. TensorFlow Models
  4. OpenVINO models

Load pre-trained model

Load pre-trained Analytics Zoo model

Load an Analytics Zoo model with the corresponding load methods (loadBigDL for Java, doLoadBigDL for Scala and load_bigdl for Python).

Java

// An empty subclass of AbstractInferenceModel is sufficient.
public class ExtendedInferenceModel extends AbstractInferenceModel {
}
ExtendedInferenceModel model = new ExtendedInferenceModel();
model.loadBigDL(modelPath, weightPath);

Scala

val model = new InferenceModel()
model.doLoadBigDL(modelPath, weightPath)

Python

model = InferenceModel()
model.load_bigdl(modelPath, weightPath)

Load pre-trained Caffe model

Load a Caffe model with the corresponding load methods (loadCaffe for Java, doLoadCaffe for Scala and load_caffe for Python).

Java

public class ExtendedInferenceModel extends AbstractInferenceModel {
}
ExtendedInferenceModel model = new ExtendedInferenceModel();
model.loadCaffe(modelPath, weightPath);

Scala

val model = new InferenceModel()
model.doLoadCaffe(modelPath, weightPath)

Python

model = InferenceModel()
model.load_caffe(modelPath, weightPath)
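
As a concrete illustration, a Caffe model usually ships as a network definition (.prototxt) and trained weights (.caffemodel); mapping these to modelPath and weightPath is an assumption shown in this hedged Scala sketch with placeholder paths:

val model = new InferenceModel()
// deploy.prototxt describes the network; model.caffemodel holds the trained weights.
model.doLoadCaffe("/path/to/deploy.prototxt", "/path/to/model.caffemodel")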

Load pre-trained TensorFlow model

Load a TensorFlow model into TFNet with the corresponding load methods (loadTensorflow for Java, doLoadTensorflow for Scala and load_tensorflow for Python).

Note that loadTensorflow has several overloads; the simplest, loadTensorflow(modelPath, modelType), loads a frozen model, as shown below.

Java

public class ExtendedInferenceModel extends AbstractInferenceModel {
}
ExtendedInferenceModel model = new ExtendedInferenceModel();
model.loadTensorflow(modelPath, modelType);

Scala

val model = new InferenceModel()
model.doLoadTensorflow(modelPath, modelType)

Python

model = InferenceModel()
model.load_tensorflow(modelPath, modelType)
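
For example, loading a frozen graph uses the "frozenModel" model type mentioned above; a minimal Scala sketch with a placeholder path:

val model = new InferenceModel()
// "frozenModel" is the modelType for a frozen graph (.pb); the path is a placeholder.
model.doLoadTensorflow("/path/to/frozen_inference_graph.pb", "frozenModel")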

Load OpenVINO model

Load an OpenVINO model with the corresponding load methods (loadOpenVINO for Java, doLoadOpenVINO for Scala and load_openvino for Python).

Java

public class ExtendedInferenceModel extends AbstractInferenceModel {
}
ExtendedInferenceModel model = new ExtendedInferenceModel();
model.loadOpenVINO(modelPath, weightPath);

Scala

val model = new InferenceModel()
model.doLoadOpenVINO(modelPath, weightPath)

Python

model = InferenceModel()
model.load_openvino(modelPath, weightPath)
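
An OpenVINO IR consists of a topology file (.xml) and a binary weights file (.bin); assuming these map to modelPath and weightPath respectively, a Scala sketch with placeholder paths:

val model = new InferenceModel()
// The .xml file describes the network topology; the .bin file holds the weights.
model.doLoadOpenVINO("/path/to/model.xml", "/path/to/model.bin")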

Predict with loaded model

After loading a pre-trained model with the load methods above, we can make predictions with the unified predict methods (predict for Java and Python, doPredict for Scala).

Java

List<List<JTensor>> predictOutput = model.predict(predictInput);

Scala

val predictOutput = model.doPredict(predictInput)

Python

predict_output = model.predict(predict_input)
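
To make the expected shapes concrete, the following hedged Scala sketch builds predictInput and reads back the result. JTensor's (data, shape) constructor and the nested JList[JList[JTensor]] input type are assumptions about the inference API, and the tensor shape is illustrative only:

import java.util.{Arrays => JArrays, List => JList}
import com.intel.analytics.zoo.pipeline.inference.JTensor

// `model` is an InferenceModel loaded as shown above.
// Each input is a list of tensors; predictInput is a batch of such inputs.
val data = new Array[Float](3 * 224 * 224) // fill with preprocessed values in practice
val input = new JTensor(data, Array(3, 224, 224))
val predictInput: JList[JList[JTensor]] = JArrays.asList(JArrays.asList(input))

val predictOutput = model.doPredict(predictInput)
// The output mirrors the nesting: one list of output tensors per input.
val firstOutput = predictOutput.get(0).get(0)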