Model Serving
Inference Model is a package in Analytics Zoo that provides high-level APIs to speed up development. It allows users to conveniently load and serve pre-trained models from Analytics Zoo, Caffe, TensorFlow and OpenVINO Intermediate Representation (IR). Inference Model provides Java, Scala and Python interfaces.
Highlights
- Easy-to-use APIs for loading and predicting with deep learning models from Analytics Zoo, Caffe, TensorFlow and OpenVINO Intermediate Representation (IR).
- Support for transforming various input data types, enabling a wide range of prediction tasks.
- Transparent support for the OpenVINO toolkit, which delivers a significant boost in inference speed (up to 19.9x).
Load and predict with pre-trained model
Basic usage of Inference Model:
- Directly use InferenceModel, or write a subclass that extends InferenceModel (AbstractInferenceModel in Java).
- Load pre-trained models with the corresponding load methods, e.g., doLoadBigDL for Analytics Zoo and doLoadTensorflow for TensorFlow.
- Do prediction with the predict method.
Supported models: Analytics Zoo models, Caffe models, TensorFlow models and OpenVINO IR models.
Predict input and output
- predictInput: JList[JList[JTensor]] or Tensor for Scala and Java, Numpy for Python. Input data for prediction. JTensor is a 1D List holding the flattened data, together with an Array[Int] shape.
- predictOutput: JList[JList[JTensor]] or Tensor for Scala and Java, Numpy for Python. Prediction result.
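As a quick illustration of the input layout, the following sketch builds a single 1x3 input for the Scala API. It is a minimal example, assuming the JTensor(data, shape) constructor that takes a flat float array plus a shape array; the tensor values are made up.

import java.util.{Arrays, List => JList}
import com.intel.analytics.zoo.pipeline.inference.JTensor

// Flat data plus shape: a single 1x3 float tensor (values are illustrative).
val tensor = new JTensor(Array(0.1f, 0.2f, 0.3f), Array(1, 3))
// predictInput is a batch of inputs, each of which may contain several
// tensors, hence the nested list structure.
val input: JList[JList[JTensor]] = Arrays.asList(Arrays.asList(tensor))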
OpenVINO requirements:
- Ubuntu 18.04 LTS (64 bit)
- CentOS 7.4 (64 bit)
- macOS 10.13, 10.14 (64 bit)
Python requirements:
- tensorflow>=1.2.0
- networkx>=1.11
- numpy>=1.12.0
- protobuf==3.6.1
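For example, these dependencies can be installed in one step with pip (version pins copied from the list above):

pip install "tensorflow>=1.2.0" "networkx>=1.11" "numpy>=1.12.0" "protobuf==3.6.1"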
Java
Write a subclass that extends AbstractInferenceModel, and implement or override methods as needed. Then, load a model with the corresponding load method (loadBigDL, loadCaffe, loadOpenVINO and loadTensorflow for Analytics Zoo, Caffe, OpenVINO and TensorFlow models respectively), and do prediction with the predict method.
import com.intel.analytics.zoo.pipeline.inference.AbstractInferenceModel;
import com.intel.analytics.zoo.pipeline.inference.JTensor;
public class ExtendedInferenceModel extends AbstractInferenceModel {
    public ExtendedInferenceModel() {
        super();
    }
}
ExtendedInferenceModel model = new ExtendedInferenceModel();
// Load Analytics Zoo model
model.loadBigDL(modelPath, weightPath);
// Predict
List<List<JTensor>> result = model.predict(inputList);
Scala
Create an instance of InferenceModel, and load a model with the corresponding load method (doLoadBigDL, doLoadCaffe, doLoadOpenVINO and doLoadTensorflow for Analytics Zoo, Caffe, OpenVINO and TensorFlow models respectively), then do prediction with the doPredict method.
import com.intel.analytics.zoo.pipeline.inference.InferenceModel
val model = new InferenceModel()
// Load Analytics Zoo model
model.doLoadBigDL(modelPath, weightPath)
// Predict
val result = model.doPredict(inputList)
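The other load methods follow the same pattern. As a sketch (the exact overloads may vary by version, and the path variables here are placeholders), loading an OpenVINO IR or a frozen TensorFlow graph might look like:

// OpenVINO IR: pass the .xml model definition and the .bin weights
// (assuming a two-argument doLoadOpenVINO overload).
model.doLoadOpenVINO(irModelPath, irWeightPath)
// Frozen TensorFlow graph (assuming a doLoadTensorflow overload that takes
// the model path and a model-type string).
model.doLoadTensorflow(frozenModelPath, "frozenModel")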
In some cases, you may want to write a subclass that extends InferenceModel and implement or override methods. Then, load a model with the corresponding load method, and do prediction with the doPredict method.
import com.intel.analytics.zoo.pipeline.inference.InferenceModel
class ExtendedInferenceModel extends InferenceModel {
}
val model = new ExtendedInferenceModel()
// Load Analytics Zoo model
model.doLoadBigDL(modelPath, weightPath)
// Predict
val result = model.doPredict(inputList)
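One design note: InferenceModel can hold several copies of the loaded model so that predictions can run concurrently. Assuming the constructor accepts the supported concurrency as its argument, a sketch looks like:

// An InferenceModel backed by up to 4 model copies, so multiple doPredict
// calls can be served in parallel (the pool size of 4 is an assumption here).
val concurrentModel = new InferenceModel(4)
concurrentModel.doLoadBigDL(modelPath, weightPath)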
Python
Create an instance of InferenceModel, and load a model with the corresponding load method (load_bigdl, load_caffe, load_openvino and load_tensorflow for Analytics Zoo, Caffe, OpenVINO and TensorFlow models respectively), then do prediction with the predict method.
from zoo.pipeline.inference import InferenceModel
model = InferenceModel()
# Load Analytics Zoo model
model.load_bigdl(model_path, weight_path)
# Predict
result = model.predict(input_list)
In some cases, you may want to write a subclass that extends InferenceModel and implement or override methods. Then, load a model with the corresponding load method, and do prediction with the predict method.
from zoo.pipeline.inference import InferenceModel
class ExtendedInferenceModel(InferenceModel):
    def __init__(self):
        super(ExtendedInferenceModel, self).__init__()
model = ExtendedInferenceModel()
# Load Analytics Zoo model
model.load_bigdl(model_path, weight_path)
# Predict
result = model.predict(input_list)
Examples
We provide examples based on InferenceModel.
See here for the Java example.
See here for the Scala example.
InferenceModel described on this page allows users to do inference without Spark. See this example for usage without Spark dependencies.