Relation


A Relation represents the relationship between two items.

Scala/Python

relation = Relation(id1, id2, label)
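
To make the three fields concrete, here is a minimal stand-in written in plain Python. It is not the library's Relation class: the name `RelationSketch`, the string ids, and the integer label are assumptions made only for illustration.

```python
from typing import NamedTuple


class RelationSketch(NamedTuple):
    """Hypothetical stand-in for Relation; not the library class."""
    id1: str    # id of the first item (string type is an assumption)
    id2: str    # id of the second item (string type is an assumption)
    label: int  # label of the relationship (integer type is an assumption)


# Fields are given in the same order as Relation(id1, id2, label).
relation = RelationSketch("Q1", "A1", 1)
print(relation.id1, relation.id2, relation.label)
```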

A RelationPair is made up of two relations of the same id1, namely:


Read Relations

From csv or txt file

Each record must contain id1, id2 and label, as described above, in that exact order.

A csv file must not have a header row.

In a txt file, each line should contain one record, with fields separated by commas.
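
The record layout described above can be illustrated with a small parser built on Python's standard `csv` module. This is only a sketch of the expected file format, not how `Relations.read` is implemented; the helper name `parse_relation_records`, the sample ids, and the integer label are assumptions.

```python
import csv
import io


def parse_relation_records(text):
    """Parse header-less, comma-separated records into (id1, id2, label) tuples."""
    records = []
    for row in csv.reader(io.StringIO(text)):
        if not row:  # skip blank lines
            continue
        # Fields must appear in the exact order id1, id2, label.
        id1, id2, label = (field.strip() for field in row)
        records.append((id1, id2, int(label)))  # integer label is an assumption
    return records


# Hypothetical file content: one record per line, no header.
sample = "Q1,A1,1\nQ1,A2,0\nQ2,A3,1\n"
records = parse_relation_records(sample)
print(records)
```

A line with more or fewer than three fields raises a `ValueError` in this sketch, which makes malformed records easy to spot before they are loaded.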

Scala

val relationsRDD = Relations.read(path, sc, minPartitions = 1)
val relationsArray = Relations.read(path)

Python

relations_rdd = Relations.read(path, sc, min_partitions=1)
relations_list = Relations.read(path)

From parquet file

Read relations from a parquet file whose schema exactly matches that of Relation. Returns an RDD of Relation.

Scala

val relationsRDD = Relations.readParquet(path, sqlContext)

Python

relations_rdd = Relations.read_parquet(path, sc)