Inference
Inference Model is a package in Analytics Zoo that provides high-level APIs to speed up development. It allows users to conveniently load pre-trained models from Analytics Zoo, Caffe, TensorFlow and OpenVINO Intermediate Representation (IR). Inference Model provides Java, Scala and Python interfaces.
Highlights
- Easy-to-use APIs for loading and prediction with deep learning models from Analytics Zoo, Caffe, TensorFlow and OpenVINO Intermediate Representation (IR).
- Support for transformation of various input data types for prediction tasks.
- Transparent support for the OpenVINO toolkit, which delivers a significant boost in inference speed (up to 19.9x).
Basic usage of Inference Model:
- Directly use InferenceModel, or write a subclass that extends InferenceModel (AbstractInferenceModel in Java).
- Load pre-trained models with the corresponding load methods, e.g., doLoadBigDL for Analytics Zoo and doLoadTensorflow for TensorFlow.
- Do prediction with the predict method (a minimal end-to-end sketch follows this list).
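As an illustration, here is a minimal end-to-end sketch in Scala, assuming an Analytics Zoo (BigDL) model saved on disk; the model and weight paths are placeholders, and the prediction input is prepared as described in "Predict with loaded model" below.
Scala
import com.intel.analytics.zoo.pipeline.inference.InferenceModel

// Placeholder paths: replace with your own pre-trained model and weight files.
val modelPath = "/path/to/model.model"
val weightPath = "/path/to/model.weight"

// Load the pre-trained Analytics Zoo (BigDL) model into an InferenceModel.
val model = new InferenceModel()
model.doLoadBigDL(modelPath, weightPath)

// Run prediction once predictInput (JList[JList[JTensor]] or Tensor) is built;
// see "Predict with loaded model" below for how to construct it.
// val predictOutput = model.doPredict(predictInput)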
OpenVINO requirements:
- Ubuntu 16.04.3 LTS or higher (64 bit)
- CentOS 7.6 or higher (64 bit)
- macOS 10.14 or higher (64 bit)
Python requirements:
- tensorflow>=1.2.0,<2.0.0
- networkx>=1.11
- numpy>=1.12.0
- defusedxml>=0.5.0
- test-generator>=0.1.1
Supported models:
Load pre-trained model
Load pre-trained Analytics Zoo model
Load Analytics Zoo model with the corresponding load methods (loadBigDL for Java, doLoadBigDL for Scala and load_bigdl for Python).
Java
public class ExtendedInferenceModel extends AbstractInferenceModel {
}
ExtendedInferenceModel model = new ExtendedInferenceModel();
model.loadBigDL(modelPath, weightPath);
Scala
val model = new InferenceModel()
model.doLoadBigDL(modelPath, weightPath)
Python
model = InferenceModel()
model.load_bigdl(modelPath, weightPath)
- modelPath: String. Path of the pre-trained model.
- weightPath: String. Path of the pre-trained model weight. Default is null.
Load pre-trained Caffe model
Load Caffe model with the loadCaffe methods (loadCaffe for Java, doLoadCaffe for Scala and load_caffe for Python).
Java
public class ExtendedInferenceModel extends AbstractInferenceModel {
}
ExtendedInferenceModel model = new ExtendedInferenceModel();
model.loadCaffe(modelPath, weightPath);
Scala
val model = new InferenceModel()
model.doLoadCaffe(modelPath, weightPath)
Python
model = InferenceModel()
model.load_caffe(modelPath, weightPath)
- modelPath: String. Path of the pre-trained model.
- weightPath: String. Path of the pre-trained model weight.
Load pre-trained TensorFlow model
Load the model into a TFNet with the corresponding loadTensorflow methods (loadTensorflow for Java, doLoadTensorflow for Scala and load_tensorflow for Python).
We provide loadTensorflow with the following parameters:
- modelPath: String. Path of the pre-trained model.
- modelType: String. Type of the pre-trained model file.
- inputs: Array[String]. The inputs of the model.
- outputs: Array[String]. The outputs of the model.
- intraOpParallelismThreads: Int. The number of intraOpParallelismThreads.
- interOpParallelismThreads: Int. The number of interOpParallelismThreads.
- usePerSessionThreads: Boolean. Whether to use per-session threads.
Note that we provide several variants with fewer parameters based on this method, e.g., loadTensorflow(modelPath, modelType) for a frozen model; a sketch using the full parameter list follows the basic examples below.
Java
public class ExtendedInferenceModel extends AbstractInferenceModel {
}
ExtendedInferenceModel model = new ExtendedInferenceModel();
model.loadTensorflow(modelPath, modelType);
Scala
val model = new InferenceModel()
model.doLoadTensorflow(modelPath, modelType)
Python
model = InferenceModel()
model.load_tensorflow(modelPath, modelType)
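As a sketch of the full-parameter variant in Scala: the graph path, the tensor names, the thread counts and the "frozenModel" model type below are illustrative assumptions, not values taken from a specific model.
Scala
import com.intel.analytics.zoo.pipeline.inference.InferenceModel

// Sketch: load a frozen TensorFlow graph with explicit input/output tensor
// names and threading options; all concrete values are placeholders.
val model = new InferenceModel()
model.doLoadTensorflow(
  "/path/to/frozen_inference_graph.pb", // modelPath
  "frozenModel",                         // modelType (assumed value for a frozen graph)
  Array("image_tensor:0"),               // inputs (illustrative tensor name)
  Array("detection_boxes:0"),            // outputs (illustrative tensor name)
  1,                                     // intraOpParallelismThreads
  1,                                     // interOpParallelismThreads
  true                                   // usePerSessionThreads
)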
Load OpenVINO model
Load OpenVINO model with the loadOpenVINO methods (loadOpenVINO for Java, doLoadOpenVINO for Scala and load_openvino for Python).
Java
public class ExtendedInferenceModel extends AbstractInferenceModel {
}
ExtendedInferenceModel model = new ExtendedInferenceModel();
model.loadOpenVINO(modelPath, weightPath);
Scala
val model = new InferenceModel()
model.doLoadOpenVINO(modelPath, weightPath)
Python
model = InferenceModel()
model.load_openvino(modelPath, weightPath)
- modelPath: String. Path of the pre-trained OpenVINO model (typically the .xml file).
- weightPath: String. Path of the pre-trained OpenVINO model weight (typically the .bin file).
Predict with loaded model
After loading pre-trained models with the load methods, we can make predictions with the unified predict method.
- predictInput: JList[JList[JTensor]] or Tensor for Scala and Java, Numpy array for Python. Input data for prediction. A JTensor holds the data as a flat 1D list together with an Array[Int] shape.
- predictOutput: JList[JList[JTensor]] or Tensor for Scala and Java, Numpy array for Python. Prediction result.
Do prediction with the predict methods (predict for Java and Python, doPredict for Scala).
Java
List<List<JTensor>> predictOutput = model.predict(predictInput);
Scala
val predictOutput = model.doPredict(predictInput)
Python
predict_output = model.predict(predict_input)
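For instance, a minimal Scala sketch for building a JList[JList[JTensor]] input and running prediction is shown below; the shape [1, 3, 224, 224] and the zero-filled data are placeholders, the nesting layout (outer list per batch item, inner list per sample) is an assumption, and the model is assumed to have been loaded with one of the load methods above.
Scala
import java.util.{Arrays => JArrays, List => JList}
import com.intel.analytics.zoo.pipeline.inference.{InferenceModel, JTensor}

// Assumed: an InferenceModel already loaded with one of the load methods above.
val model: InferenceModel = new InferenceModel()
// model.doLoadBigDL(modelPath, weightPath)

// Build one input tensor: a flat Float array plus its Array[Int] shape
// (placeholder shape and zero-filled data).
val shape = Array(1, 3, 224, 224)
val data = new Array[Float](shape.product)
val tensor = new JTensor(data, shape)

// Wrap it as JList[JList[JTensor]] (assumed layout: outer list per batch item,
// inner list holding that item's tensors).
val predictInput: JList[JList[JTensor]] = JArrays.asList(JArrays.asList(tensor))

// Run prediction; the result follows the same nested JList[JList[JTensor]] layout.
val predictOutput = model.doPredict(predictInput)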