Inference


Inference Model is a package in Analytics Zoo that provides high-level APIs to speed up development. It allows users to conveniently use pre-trained models from Analytics Zoo, Caffe, TensorFlow and OpenVINO Intermediate Representation (IR). Inference Model provides Java, Scala and Python interfaces.

Highlights

  1. Easy-to-use APIs for loading and prediction with deep learning models from Analytics Zoo, Caffe, TensorFlow and OpenVINO Intermediate Representation (IR).
  2. Support for transforming various input data types, so different data sources can be fed into prediction tasks.
  3. Transparent support for the OpenVINO toolkit, which delivers a significant boost in inference speed (up to 19.9x).

Basic usage of Inference Model:

  1. Directly use InferenceModel or write a subclass that extends InferenceModel (AbstractInferenceModel in Java).
  2. Load pre-trained models with the corresponding load methods, e.g., doLoadBigDL for Analytics Zoo models and doLoadTensorflow for TensorFlow models.
  3. Make predictions with the predict method (see the end-to-end sketch after this list).
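
Putting the three steps together, here is a minimal end-to-end Scala sketch. The file paths are placeholders, the input shape is chosen only for illustration, and JTensor's (data, shape) constructor plus the doPredict overload taking nested lists of JTensor are assumptions about the inference API:

import java.util.{Arrays => JArrays}
import com.intel.analytics.zoo.pipeline.inference.{InferenceModel, JTensor}

// 1. Directly use InferenceModel (or a subclass).
val model = new InferenceModel()
// 2. Load a pre-trained Analytics Zoo (BigDL) model; the paths are placeholders.
model.doLoadBigDL("/path/to/model.model", "/path/to/weights.bin")
// 3. Predict on a single zero-filled tensor (shape is illustrative only).
val input = new JTensor(new Array[Float](3 * 224 * 224), Array(3, 224, 224))
val predictOutput = model.doPredict(JArrays.asList(JArrays.asList(input)))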

OpenVINO requirements:

System requirements:

Ubuntu 16.04.3 LTS or higher (64 bit)
CentOS 7.6 or higher (64 bit)
macOS 10.14 or higher (64 bit)

Python requirements:

tensorflow>=1.2.0,<2.0.0
networkx>=1.11
numpy>=1.12.0
defusedxml>=0.5.0
test-generator>=0.1.1

Supported models:

  1. Analytics Zoo Models
  2. Caffe Models
  3. TensorFlow Models
  4. OpenVINO models

Load pre-trained model

Load pre-trained Analytics Zoo model

Load an Analytics Zoo model with the corresponding load methods (loadBigDL for Java, doLoadBigDL for Scala and load_bigdl for Python).

Java

// An empty subclass of AbstractInferenceModel is sufficient.
public class ExtendedInferenceModel extends AbstractInferenceModel {
}
ExtendedInferenceModel model = new ExtendedInferenceModel();
model.loadBigDL(modelPath, weightPath);

Scala

val model = new InferenceModel()
model.doLoadBigDL(modelPath, weightPath)

Python

model = InferenceModel()
model.load_bigdl(modelPath, weightPath)

Load pre-trained Caffe model

Load a Caffe model with the corresponding load methods (loadCaffe for Java, doLoadCaffe for Scala and load_caffe for Python).

Java

public class ExtendedInferenceModel extends AbstractInferenceModel {
}
ExtendedInferenceModel model = new ExtendedInferenceModel();
model.loadCaffe(modelPath, weightPath);

Scala

val model = new InferenceModel()
model.doLoadCaffe(modelPath, weightPath)

Python

model = InferenceModel()
model.load_caffe(modelPath, weightPath)
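
As a concrete illustration, a Caffe model usually ships as a network definition (.prototxt) and trained weights (.caffemodel); mapping these to modelPath and weightPath is an assumption shown in this hedged Scala sketch with placeholder paths:

val model = new InferenceModel()
// deploy.prototxt describes the network; model.caffemodel holds the trained weights.
model.doLoadCaffe("/path/to/deploy.prototxt", "/path/to/model.caffemodel")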

Load pre-trained TensorFlow model

Load a TensorFlow model into TFNet with the corresponding load methods (loadTensorflow for Java, doLoadTensorflow for Scala and load_tensorflow for Python).

Note that loadTensorflow has several overloads; the simplest, loadTensorflow(modelPath, modelType), loads a frozen model, as shown below.

Java

public class ExtendedInferenceModel extends AbstractInferenceModel {
}
ExtendedInferenceModel model = new ExtendedInferenceModel();
model.loadTensorflow(modelPath, modelType);

Scala

val model = new InferenceModel()
model.doLoadTensorflow(modelPath, modelType)

Python

model = InferenceModel()
model.load_tensorflow(modelPath, modelType)
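
For example, loading a frozen graph uses the "frozenModel" model type mentioned above; a minimal Scala sketch with a placeholder path:

val model = new InferenceModel()
// "frozenModel" is the modelType for a frozen graph (.pb); the path is a placeholder.
model.doLoadTensorflow("/path/to/frozen_inference_graph.pb", "frozenModel")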

Load OpenVINO model

Load an OpenVINO model with the corresponding load methods (loadOpenVINO for Java, doLoadOpenVINO for Scala and load_openvino for Python).

Java

public class ExtendedInferenceModel extends AbstractInferenceModel {
}
ExtendedInferenceModel model = new ExtendedInferenceModel();
model.loadOpenVINO(modelPath, weightPath);

Scala

val model = new InferenceModel()
model.doLoadOpenVINO(modelPath, weightPath)

Python

model = InferenceModel()
model.load_openvino(modelPath, weightPath)
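
An OpenVINO IR consists of a topology file (.xml) and a binary weights file (.bin); assuming these map to modelPath and weightPath respectively, a Scala sketch with placeholder paths:

val model = new InferenceModel()
// The .xml file describes the network topology; the .bin file holds the weights.
model.doLoadOpenVINO("/path/to/model.xml", "/path/to/model.bin")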

Predict with loaded model

After loading a pre-trained model with the load methods above, we can make predictions with the unified predict methods (predict for Java and Python, doPredict for Scala).

Java

List<List<JTensor>> predictOutput = model.predict(predictInput);

Scala

val predictOutput = model.doPredict(predictInput)

Python

predict_output = model.predict(predict_input)
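
To make the expected shapes concrete, the following hedged Scala sketch builds predictInput and reads back the result. JTensor's (data, shape) constructor and the nested JList[JList[JTensor]] input type are assumptions about the inference API, and the tensor shape is illustrative only:

import java.util.{Arrays => JArrays, List => JList}
import com.intel.analytics.zoo.pipeline.inference.JTensor

// `model` is an InferenceModel loaded as shown above.
// Each input is a list of tensors; predictInput is a batch of such inputs.
val data = new Array[Float](3 * 224 * 224) // fill with preprocessed values in practice
val input = new JTensor(data, Array(3, 224, 224))
val predictInput: JList[JList[JTensor]] = JArrays.asList(JArrays.asList(input))

val predictOutput = model.doPredict(predictInput)
// The output mirrors the nesting: one list of output tensors per input.
val firstOutput = predictOutput.get(0).get(0)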