Image
Analytics Zoo provides a series of Image APIs for an end-to-end image processing pipeline, including image loading, pre-processing, inference/training, and utilities for different formats.
Load Image
Analytics Zoo provides APIs to read images into different formats:
Load to Data Frame
Scala:
package com.intel.analytics.zoo.pipeline.nnframes
object NNImageReader {
def readImages(path: String, sc: SparkContext, minPartitions: Int = 1, resizeH: Int = -1, resizeW: Int = -1): DataFrame
}
Read a directory of images from a local or remote source and return a DataFrame with a single column "image" of images.
- path: Directory of the input data files. The path can be a comma-separated list of input paths. Wildcard paths are supported, similarly to sc.binaryFiles(path).
- sc: SparkContext to be used.
- minPartitions: Number of DataFrame partitions; if omitted, defaultParallelism is used instead.
- resizeH: height after resize. The default of -1 leaves the image unresized.
- resizeW: width after resize. The default of -1 leaves the image unresized.
Python:
class zoo.pipeline.nnframes.NNImageReader
static readImages(path, sc=None, minPartitions=1, resizeH=-1, resizeW=-1, bigdl_type="float")
ImageSet
ImageSet is a collection of ImageFeature. It can be a DistributedImageSet for a distributed image RDD or a LocalImageSet for a local image array.
You can read image data as an ImageSet from a local or distributed image path, or construct an ImageSet directly from an RDD[ImageFeature] or Array[ImageFeature].
Scala APIs:
object com.intel.analytics.zoo.feature.image.ImageSet
def array(data: Array[ImageFeature]): LocalImageSet
Create a LocalImageSet from an array of ImageFeature
- data: array of ImageFeature
def rdd(data: RDD[ImageFeature]): DistributedImageSet
Create a DistributedImageSet from an RDD of ImageFeature
- data: RDD of ImageFeature
def read(path: String, sc: SparkContext = null, minPartitions: Int = 1, resizeH: Int = -1, resizeW: Int = -1, imageCodec: Int = Imgcodecs.CV_LOAD_IMAGE_UNCHANGED, withLabel: Boolean = false, oneBasedLabel: Boolean = true): ImageSet
Read images as an ImageSet. If sc is defined, read images as a DistributedImageSet from the local file system or HDFS. If sc is null, read images as a LocalImageSet from the local file system.
- path: path to read images from. If sc is defined, the path can be local or on HDFS, and wildcard characters are supported. If sc is null, the path is a local directory, an image file, or an image file pattern with wildcard characters.
- sc: SparkContext
- minPartitions: a suggested minimum number of partitions for the input data.
- resizeH: height after resize. The default of -1 leaves the image unresized.
- resizeW: width after resize. The default of -1 leaves the image unresized.
- imageCodec: specifies the color type of a loaded image, same as in OpenCV.imread. The default is Imgcodecs.CV_LOAD_IMAGE_UNCHANGED.
- withLabel: whether to treat folders in the path as image classification labels and read the labels into the ImageSet.
- oneBasedLabel: whether the labels start from 1. If true, the labels start from 1; otherwise they start from 0.
Example:
// create LocalImageSet from an image folder
val localImageSet = ImageSet.read("/tmp/image/")
// create DistributedImageSet from an image folder
val distributedImageSet2 = ImageSet.read("/tmp/image/", sc, 2)
Python APIs:
class zoo.feature.image.ImageSet
read(path, sc=None, min_partitions=1, resize_height=-1, resize_width=-1, image_codec=-1, with_label=False, one_based_label=True, bigdl_type="float")
Read images as an ImageSet. If sc is defined, read images as a DistributedImageSet from the local file system or HDFS. If sc is None, read images as a LocalImageSet from the local file system.
- path: path to read images from. If sc is defined, the path can be local or on HDFS, and wildcard characters are supported. If sc is None, the path is a local directory, an image file, or an image file pattern with wildcard characters.
- sc: SparkContext
- min_partitions: a suggested minimum number of partitions for the input data.
- resize_height: height after resize. The default of -1 leaves the image unresized.
- resize_width: width after resize. The default of -1 leaves the image unresized.
- image_codec: specifies the color type of a loaded image, same as in OpenCV.imread. The default is -1 (Imgcodecs.CV_LOAD_IMAGE_UNCHANGED).
- with_label: whether to treat folders in the path as image classification labels and read the labels into the ImageSet.
- one_based_label: whether the labels start from 1. The default is True; if False, the labels start from 0.
Python example:
# create LocalImageSet from an image folder
local_image_set2 = ImageSet.read("/tmp/image/")
# create DistributedImageSet from an image folder
distributed_image_set = ImageSet.read("/tmp/image/", sc, 2)
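To make the with_label and one_based_label options more concrete, the following standalone sketch shows one way class folders could be turned into numeric labels. It does not call Analytics Zoo: labels_from_folders is a hypothetical helper, and numbering the folders in sorted order is an assumption rather than documented library behavior.

```python
import tempfile
from pathlib import Path


def labels_from_folders(root, one_based_label=True):
    """Map each file under root/<class_folder>/ to a numeric label.

    Assumption: class folders are numbered in sorted (alphabetical) order.
    """
    root = Path(root)
    class_dirs = sorted(p for p in root.iterdir() if p.is_dir())
    offset = 1 if one_based_label else 0
    labels = {}
    for index, class_dir in enumerate(class_dirs):
        for image_path in sorted(class_dir.iterdir()):
            if image_path.is_file():
                labels[image_path.relative_to(root).as_posix()] = index + offset
    return labels


# Build a throwaway layout: <root>/cat/a.jpg and <root>/dog/b.jpg
with tempfile.TemporaryDirectory() as tmp:
    for class_name, file_name in [("cat", "a.jpg"), ("dog", "b.jpg")]:
        folder = Path(tmp) / class_name
        folder.mkdir()
        (folder / file_name).write_bytes(b"")  # empty placeholder, not a real image
    one_based = labels_from_folders(tmp, one_based_label=True)
    zero_based = labels_from_folders(tmp, one_based_label=False)

print(one_based)
print(zero_based)
```

The only difference between the two results is the offset added to each folder index, which is what the one_based_label flag controls.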
Image Transformer
Analytics Zoo provides many pre-defined image processing transformers built on top of OpenCV. After creating a transformer, call transform with an ImageSet to get the transformed ImageSet, or pass the transformer to NNEstimator/NNClassifier to preprocess images before training.
Scala APIs:
package com.intel.analytics.zoo.feature.image
object ImageBrightness
def apply(deltaLow: Double, deltaHigh: Double): ImageBrightness
Adjust the image brightness.
- deltaLow: low bound of brightness parameter
- deltaHigh: high bound of brightness parameter
Example:
val transformer = ImageBrightness(0.0, 32.0)
val transformed = imageSet.transform(transformer)
Python APIs:
class zoo.feature.image.imagePreprocessing.ImageBrightness
def __init__(delta_low, delta_high, bigdl_type="float")
Adjust the image brightness.
- delta_low: low bound of brightness parameter
- delta_high: high bound of brightness parameter
Example:
transformer = ImageBrightness(0.0, 32.0)
transformed = imageSet.transform(transformer)
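As a rough illustration of what the low and high bounds mean, the NumPy sketch below samples one brightness delta uniformly between the two bounds and adds it to every pixel. This is an assumption about the semantics, not the library's implementation; adjust_brightness is a hypothetical helper, and clipping to [0, 255] is assumed here for 8-bit style pixel ranges.

```python
import numpy as np


def adjust_brightness(image, delta_low, delta_high, rng):
    """Add one uniformly sampled delta to every pixel, then clip to [0, 255]."""
    delta = rng.uniform(delta_low, delta_high)
    shifted = image.astype(np.float32) + delta
    return np.clip(shifted, 0.0, 255.0), delta


rng = np.random.default_rng(seed=0)
# Synthetic height x width x channel image with every pixel set to 100
image = np.full((4, 4, 3), 100.0, dtype=np.float32)
brighter, delta = adjust_brightness(image, 0.0, 32.0, rng)
print(delta, brighter[0, 0])
```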
Scala APIs:
package com.intel.analytics.zoo.feature.image
object ImageBytesToMat
def apply(byteKey: String = ImageFeature.bytes,
imageCodec: Int = Imgcodecs.CV_LOAD_IMAGE_UNCHANGED): ImageBytesToMat
Transform a byte array (the original image file in bytes) to an OpenCVMat.
- byteKey: key that maps to the byte array. The default value is ImageFeature.bytes.
- imageCodec: specifies the color type of a loaded image, same as in OpenCV.imread. The default value is Imgcodecs.CV_LOAD_IMAGE_UNCHANGED. Accepted values:
  1. CV_LOAD_IMAGE_ANYDEPTH: if set, return a 16-bit/32-bit image when the input has the corresponding depth; otherwise convert it to 8-bit.
  2. CV_LOAD_IMAGE_COLOR: if set, always convert the image to a color one.
  3. CV_LOAD_IMAGE_GRAYSCALE: if set, always convert the image to a grayscale one.
  4. >0: return a 3-channel color image. Note that the alpha channel is stripped from the output image; use a negative value if you need the alpha channel.
  5. =0: return a grayscale image.
  6. <0: return the loaded image as is (with alpha channel).
Example:
val imageSet = ImageSet.read(path, sc)
imageSet -> ImageBytesToMat()
3D Image Support
Create ImageSet for 3D Images
For 3D images, you can still use ImageSet as the collection of ImageFeature3D. You can create an ImageSet for 3D images in a similar way as for 2D images. Analytics Zoo does not provide a 3D image reader, so the examples below assume you have already read your 3D images into a tensor (Scala) or a numpy array (Python).
Scala example:
val image = ImageFeature3D(tensor)
// create local imageset for 3D images
val arr = Array[ImageFeature](image)
val localImageSet = ImageSet.array(arr)
// create distributed imageset for 3D images
val rdd = sc.parallelize(Seq[ImageFeature](image))
val imageSet = ImageSet.rdd(rdd)
Python example:
# 3D image as a numpy array; loading it is left to the reader
img_np = ...
# create local imageset for 3D images
local_imageset = LocalImageSet(image_list=[img_np])
# create distributed imageset for 3D images
rdd = sc.parallelize([img_np])
dist_imageSet = DistributedImageSet(image_rdd=rdd)
3D Image Transformers
Analytics Zoo also provides several image transformers for 3D images.
The usage is similar to that of the 2D image transformers. After creating a transformer, call transform with an ImageSet to get the transformed ImageSet.
Currently we support three kinds of 3D image transformers: Crop, Rotation and Affine Transformation.
Crop transformers
Crop3D
Scala:
import com.intel.analytics.zoo.feature.image3d.Crop3D
// create Crop3D transformer
val cropper = Crop3D(start, patchSize)
val outputImageSet = imageset.transform(cropper)
Crop a patch of the given patch size from a 3D image, starting at 'start'. The patch size should be less than the image size.
- start: start point array (depth, height, width) for cropping
- patchSize: patch size array (depth, height, width)
Python:
from zoo.feature.image3d.transformation import Crop3D
crop = Crop3D(start, patch_size)
transformed_image = crop(image_set)
- start: start point list [depth, height, width] for cropping
- patch_size: patch size list [depth, height, width]
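The effect of start and patch_size can be sketched with plain NumPy slicing. This is an illustration of the cropping idea, not the library code: crop_3d is a hypothetical helper, and treating start as a 0-based (depth, height, width) index is an assumption.

```python
import numpy as np


def crop_3d(volume, start, patch_size):
    """Return the patch of size `patch_size` whose first voxel is at `start`."""
    for axis, (begin, size) in enumerate(zip(start, patch_size)):
        if begin < 0 or size <= 0 or begin + size > volume.shape[axis]:
            raise ValueError("patch does not fit inside the volume on axis %d" % axis)
    d, h, w = start
    pd, ph, pw = patch_size
    return volume[d:d + pd, h:h + ph, w:w + pw]


# Synthetic (depth, height, width) volume with distinct voxel values
volume = np.arange(4 * 5 * 6).reshape(4, 5, 6)
patch = crop_3d(volume, start=[1, 2, 3], patch_size=[2, 2, 2])
print(patch.shape)
```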
RandomCrop3D
Scala:
import com.intel.analytics.zoo.feature.image3d.RandomCrop3D
// create RandomCrop3D transformer
val cropper = RandomCrop3D(cropDepth, cropHeight, cropWidth)
val outputImageSet = imageset.transform(cropper)
Crop a random patch from a 3D image with the specified patch size. The patch size should be less than the image size.
- cropDepth: depth after crop
- cropHeight: height after crop
- cropWidth: width after crop
Python:
from zoo.feature.image3d.transformation import RandomCrop3D
crop = RandomCrop3D(crop_depth, crop_height, crop_width)
transformed_image = crop(image_set)
- crop_depth: depth after crop
- crop_height: height after crop
- crop_width: width after crop
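A random crop can be thought of as a fixed-size crop whose start point is drawn at random from the positions where the patch still fits. The NumPy sketch below illustrates that idea under the assumption of a uniform choice of start point; random_crop_3d is a hypothetical helper, not part of Analytics Zoo.

```python
import numpy as np


def random_crop_3d(volume, crop_depth, crop_height, crop_width, rng):
    """Crop a patch of the given size at a uniformly random valid position."""
    crop = (crop_depth, crop_height, crop_width)
    # The largest valid start index on each axis is size - crop
    start = [int(rng.integers(0, size - c + 1)) for size, c in zip(volume.shape, crop)]
    d, h, w = start
    patch = volume[d:d + crop_depth, h:h + crop_height, w:w + crop_width]
    return patch, start


rng = np.random.default_rng(seed=42)
volume = np.arange(4 * 6 * 8).reshape(4, 6, 8)  # synthetic (depth, height, width) volume
patch, start = random_crop_3d(volume, 2, 3, 4, rng)
print(start, patch.shape)
```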
CenterCrop3D
Scala:
import com.intel.analytics.zoo.feature.image3d.CenterCrop3D
// create CenterCrop3D transformer
val cropper = CenterCrop3D(cropDepth, cropHeight, cropWidth)
val outputImageSet = imageset.transform(cropper)
Crop a cropDepth x cropWidth x cropHeight patch from the center of the image. The patch size should be less than the image size.
- cropDepth: depth after crop
- cropHeight: height after crop
- cropWidth: width after crop
Python:
from zoo.feature.image3d.transformation import CenterCrop3D
crop = CenterCrop3D(crop_depth, crop_height, crop_width)
transformed_image = crop(image_set)
- crop_depth: depth after crop
- crop_height: height after crop
- crop_width: width after crop
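For a center crop, the start point is determined by the image and patch sizes. The sketch below assumes the start on each axis is the floor of half the leftover size; the exact rounding used by the library may differ, and center_crop_3d is a hypothetical helper.

```python
import numpy as np


def center_crop_3d(volume, crop_depth, crop_height, crop_width):
    """Crop a patch of the given size around the center of the volume."""
    crop = (crop_depth, crop_height, crop_width)
    if any(c > size for c, size in zip(crop, volume.shape)):
        raise ValueError("crop size must not exceed the volume size")
    # Assumed rounding: floor of half the leftover voxels on each axis
    d, h, w = [(size - c) // 2 for size, c in zip(volume.shape, crop)]
    return volume[d:d + crop_depth, h:h + crop_height, w:w + crop_width]


volume = np.arange(4 * 6 * 8).reshape(4, 6, 8)  # synthetic (depth, height, width) volume
patch = center_crop_3d(volume, 2, 2, 4)
print(patch.shape)
```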
Rotation
Scala:
import com.intel.analytics.zoo.feature.image3d.Rotate3D
// create Rotate3D transformer
val rotAngles = Array[Double](yaw, pitch, roll)
val rot = Rotate3D(rotAngles)
val outputImageSet = imageset.transform(rot)
Rotate a 3D image by the specified angles.
- rotationAngles: the angles for rotation, given as yaw (a counterclockwise rotation angle about the z-axis), pitch (a counterclockwise rotation angle about the y-axis), and roll (a counterclockwise rotation angle about the x-axis).
Python:
from zoo.feature.image3d.transformation import Rotate3D
rot = Rotate3D(rotation_angles)
transformed_image = rot(image_set)
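The yaw, pitch and roll angles correspond to the standard rotation matrices about the z, y and x axes. The sketch below composes them as Rz(yaw) Ry(pitch) Rx(roll); that composition order, the use of radians, and the (x, y, z) vector layout are assumptions made for illustration, and rotation_matrix is a hypothetical helper rather than a library function.

```python
import numpy as np


def rotation_matrix(yaw, pitch, roll):
    """Compose R = Rz(yaw) @ Ry(pitch) @ Rx(roll), with angles in radians."""
    cz, sz = np.cos(yaw), np.sin(yaw)
    cy, sy = np.cos(pitch), np.sin(pitch)
    cx, sx = np.cos(roll), np.sin(roll)
    rz = np.array([[cz, -sz, 0.0], [sz, cz, 0.0], [0.0, 0.0, 1.0]])  # about z
    ry = np.array([[cy, 0.0, sy], [0.0, 1.0, 0.0], [-sy, 0.0, cy]])  # about y
    rx = np.array([[1.0, 0.0, 0.0], [0.0, cx, -sx], [0.0, sx, cx]])  # about x
    return rz @ ry @ rx


# A quarter-turn yaw should carry the x-axis onto the y-axis
rot = rotation_matrix(yaw=np.pi / 2, pitch=0.0, roll=0.0)
x_axis = np.array([1.0, 0.0, 0.0])  # vector in (x, y, z) order
rotated = rot @ x_axis
print(np.round(rotated, 6))
```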
Affine Transformation
Scala:
import com.intel.analytics.zoo.feature.image3d.AffineTransform3D
import com.intel.analytics.bigdl.tensor.Tensor
// create AffineTransform3D transformer
val matArray = Array[Double](1, 0, 0, 0, 1.5, 1.2, 0, 1.3, 1.4)
val matTensor = Tensor[Double](matArray, Array[Int](3, 3))
val trans = Tensor[Double](3)
trans(1) = 0
trans(2) = 1.8
trans(3) = 1.1
val aff = AffineTransform3D(mat=matTensor, translation = trans, clampMode = "clamp", padVal = 0)
val outputImageSet = imageset.transform(aff)
The affine transformer applies an affine transformation to a given tensor. To avoid defects in resampling, the mapping runs from destination to source: dst(z,y,x) = src(f(z),f(y),f(x)), where f: dst -> src.
- mat: [Tensor[Double], dim: 3x3] defines the affine transformation from dst to src.
- translation: [Tensor[Double], dim: 3, default: (0,0,0)] defines the translation along each axis.
- clampMode: [String, "clamp" (default) or "padding"] defines how to handle interpolation outside the input image.
- padVal: [Double, default: 0] defines the padding value when clampMode="padding". Setting this value when clampMode="clamp" will cause an error.
Python:
from zoo.feature.image3d.transformation import AffineTransform3D
affine = AffineTransform3D(affine_mat, translation, clamp_mode, pad_val)
transformed_image = affine(image_set)
- affine_mat: numpy array of shape 3x3. Defines the affine transformation from dst to src.
- translation: numpy array with 3 elements. The default value is np.zeros(3). Defines the translation along each axis.
- clamp_mode: str, default value is "clamp". Defines how to handle interpolation outside the input image.
- pad_val: float, default is 0.0. Defines the padding value when clamp_mode="padding". Setting this value when clamp_mode="clamp" will cause an error.
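The destination-to-source mapping and the two clamp modes can be illustrated with a simplified NumPy sketch. It assumes nearest-neighbour sampling and un-centered (z, y, x) index coordinates, which may differ from the interpolation and coordinate conventions the library actually uses, and it does not reproduce the error raised for pad_val in "clamp" mode. affine_transform_3d is a hypothetical helper.

```python
import numpy as np


def affine_transform_3d(src, mat, translation, clamp_mode="clamp", pad_val=0.0):
    """Nearest-neighbour resampling with a destination-to-source mapping."""
    if clamp_mode not in ("clamp", "padding"):
        raise ValueError("clamp_mode must be 'clamp' or 'padding'")
    bounds = np.array(src.shape).reshape(3, 1)
    # Every destination index as a column of (z, y, x) coordinates
    dst_idx = np.indices(src.shape).reshape(3, -1)
    # Map each destination voxel back to a position in the source
    offset = np.asarray(translation, dtype=float).reshape(3, 1)
    src_idx = np.rint(np.asarray(mat, dtype=float) @ dst_idx + offset).astype(int)
    inside = np.all((src_idx >= 0) & (src_idx < bounds), axis=0)
    # "clamp": out-of-range positions reuse the nearest border voxel
    clamped = np.clip(src_idx, 0, bounds - 1)
    values = src[clamped[0], clamped[1], clamped[2]].astype(float)
    if clamp_mode == "padding":
        # "padding": out-of-range positions take the padding value instead
        values[~inside] = pad_val
    return values.reshape(src.shape)


src = np.arange(27, dtype=float).reshape(3, 3, 3)
identity = affine_transform_3d(src, np.eye(3), np.zeros(3))
shift = np.array([0.0, 0.0, 1.0])  # dst(z, y, x) = src(z, y, x + 1)
padded = affine_transform_3d(src, np.eye(3), shift, clamp_mode="padding", pad_val=-1.0)
clamped = affine_transform_3d(src, np.eye(3), shift, clamp_mode="clamp")
print(padded[0, 0], clamped[0, 0])
```

Because each destination voxel looks up its own source position, every output voxel receives exactly one value, which is why the mapping direction avoids holes in the resampled image.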