Model Serving
Inference Model is a package in Analytics Zoo that provides high-level APIs to speed up development. It allows users to conveniently load and serve pre-trained models from Analytics Zoo, Caffe, TensorFlow and OpenVINO Intermediate Representation (IR). Inference Model provides Java, Scala and Python interfaces.
Highlights
- Easy-to-use APIs for loading and predicting with deep learning models from Analytics Zoo, Caffe, TensorFlow and OpenVINO Intermediate Representation (IR).
- Support for transforming various input data types, enabling a wide range of prediction tasks.
- Transparent support for the OpenVINO toolkit, which delivers a significant boost in inference speed (up to 19.9x).
Load and predict with pre-trained model
Basic usage of Inference Model:
- Directly use InferenceModel, or write a subclass that extends InferenceModel (AbstractInferenceModel in Java).
- Load pre-trained models with the corresponding load methods, e.g., doLoadBigDL for Analytics Zoo and doLoadTensorflow for TensorFlow.
- Do prediction with the predict method.
Supported models: Analytics Zoo models, Caffe models, TensorFlow models and OpenVINO IR models.
Predict input and output
- predictInput: JList[JList[JTensor]] or Tensor for Scala and Java, Numpy for Python. Input data for prediction. JTensor is a 1D List holding the flattened data, together with an Array[Int] shape.
- predictOutput: JList[JList[JTensor]] or Tensor for Scala and Java, Numpy for Python. Prediction result.
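As a quick illustration of the input layout, the following sketch builds a single 1x3 input for the Scala API. It is a minimal example, assuming the JTensor(data, shape) constructor that takes a flat float array plus a shape array; the tensor values are made up.

import java.util.{Arrays, List => JList}
import com.intel.analytics.zoo.pipeline.inference.JTensor

// Flat data plus shape: a single 1x3 float tensor (values are illustrative).
val tensor = new JTensor(Array(0.1f, 0.2f, 0.3f), Array(1, 3))
// predictInput is a batch of inputs, each of which may contain several
// tensors, hence the nested list structure.
val input: JList[JList[JTensor]] = Arrays.asList(Arrays.asList(tensor))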
OpenVINO requirements:
- Ubuntu 18.04 LTS (64 bit)
- CentOS 7.4 (64 bit)
- macOS 10.13, 10.14 (64 bit)
Python requirements:
- tensorflow>=1.2.0
- networkx>=1.11
- numpy>=1.12.0
- protobuf==3.6.1
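For example, these dependencies can be installed in one step with pip (version pins copied from the list above):

pip install "tensorflow>=1.2.0" "networkx>=1.11" "numpy>=1.12.0" "protobuf==3.6.1"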
Java
Write a subclass that extends AbstractInferenceModel, and implement or override methods as needed. Then, load a model with the corresponding load method (loadBigDL, loadCaffe, loadOpenVINO and loadTensorflow for Analytics Zoo, Caffe, OpenVINO and TensorFlow models respectively), and do prediction with the predict method.
import com.intel.analytics.zoo.pipeline.inference.AbstractInferenceModel;
import com.intel.analytics.zoo.pipeline.inference.JTensor;
public class ExtendedInferenceModel extends AbstractInferenceModel {
    public ExtendedInferenceModel() {
        super();
    }
}
ExtendedInferenceModel model = new ExtendedInferenceModel();
// Load Analytics Zoo model
model.loadBigDL(modelPath, weightPath);
// Predict
List<List<JTensor>> result = model.predict(inputList);
Scala
Create an instance of InferenceModel, and load a model with the corresponding load method (doLoadBigDL, doLoadCaffe, doLoadOpenVINO and doLoadTensorflow for Analytics Zoo, Caffe, OpenVINO and TensorFlow models respectively), then do prediction with the doPredict method.
import com.intel.analytics.zoo.pipeline.inference.InferenceModel
val model = new InferenceModel()
// Load Analytics Zoo model
model.doLoadBigDL(modelPath, weightPath)
// Predict
val result = model.doPredict(inputList)
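The other load methods follow the same pattern. As a sketch (the exact overloads may vary by version, and the path variables here are placeholders), loading an OpenVINO IR or a frozen TensorFlow graph might look like:

// OpenVINO IR: pass the .xml model definition and the .bin weights
// (assuming a two-argument doLoadOpenVINO overload).
model.doLoadOpenVINO(irModelPath, irWeightPath)
// Frozen TensorFlow graph (assuming a doLoadTensorflow overload that takes
// the model path and a model-type string).
model.doLoadTensorflow(frozenModelPath, "frozenModel")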
In some cases, you may want to write a subclass that extends InferenceModel and implement or override methods. Then, load a model with the corresponding load method, and do prediction with the doPredict method.
import com.intel.analytics.zoo.pipeline.inference.InferenceModel
class ExtendedInferenceModel extends InferenceModel {
}
val model = new ExtendedInferenceModel()
// Load Analytics Zoo model
model.doLoadBigDL(modelPath, weightPath)
// Predict
val result = model.doPredict(inputList)
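One design note: InferenceModel can hold several copies of the loaded model so that predictions can run concurrently. Assuming the constructor accepts the supported concurrency as its argument, a sketch looks like:

// An InferenceModel backed by up to 4 model copies, so multiple doPredict
// calls can be served in parallel (the pool size of 4 is an assumption here).
val concurrentModel = new InferenceModel(4)
concurrentModel.doLoadBigDL(modelPath, weightPath)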
Python
Create an instance of InferenceModel, and load a model with the corresponding load method (load_bigdl, load_caffe, load_openvino and load_tensorflow for Analytics Zoo, Caffe, OpenVINO and TensorFlow models respectively), then do prediction with the predict method.
from zoo.pipeline.inference import InferenceModel
model = InferenceModel()
# Load Analytics Zoo model
model.load_bigdl(model_path, weight_path)
# Predict
result = model.predict(input_list)
In some cases, you may want to write a subclass that extends InferenceModel and implement or override methods. Then, load a model with the corresponding load method, and do prediction with the predict method.
from zoo.pipeline.inference import InferenceModel
class ExtendedInferenceModel(InferenceModel):
    def __init__(self):
        super(ExtendedInferenceModel, self).__init__()
model = ExtendedInferenceModel()
# Load Analytics Zoo model
model.load_bigdl(model_path, weight_path)
# Predict
result = model.predict(input_list)
Examples
We provide examples based on InferenceModel.
See here for the Java example.
See here for the Scala example.
InferenceModel described on this page allows users to do inference without Spark. See this example for usage without Spark dependencies.