Spark RDD fullOuterJoin [Pair]


fullOuterJoin [Pair]

Performs a full outer join, by key, on two pair RDDs.

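Note: a key that appears in only one of the two RDDs is still kept in the result; the missing side is None, which is why both values are wrapped in Option. When a key occurs several times, every left value is paired with every right value for that key.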

Function prototypes:

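def fullOuterJoin[W](other: RDD[(K, W)]): RDD[(K, (Option[V], Option[W]))]
def fullOuterJoin[W](other: RDD[(K, W)], numPartitions: Int): RDD[(K, (Option[V], Option[W]))]
def fullOuterJoin[W](other: RDD[(K, W)], partitioner: Partitioner): RDD[(K, (Option[V], Option[W]))]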
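Besides the single-argument form, fullOuterJoin also accepts a partition count or a Partitioner as a second argument. The sketch below is a minimal illustration of those two forms, assuming a spark-shell session where `sc` is already defined; the RDD names `a` and `b` and their contents are made up for the example.

```scala
// Assumes spark-shell, where `sc` (the SparkContext) is predefined.
import org.apache.spark.HashPartitioner

// Illustrative data, not taken from the original post.
val a = sc.parallelize(List(("cat", 2), ("book", 4)))
val b = sc.parallelize(List(("cat", 12), ("cup", 5)))

// Pass a partition count: the result is hash-partitioned into 4 partitions.
val byCount = a.fullOuterJoin(b, 4)
byCount.getNumPartitions        // 4

// Pass a Partitioner explicitly.
val byPartitioner = a.fullOuterJoin(b, new HashPartitioner(2))
byPartitioner.getNumPartitions  // 2
```

Controlling the partitioning mainly matters when the joined RDD feeds further key-based operations, since those may be able to reuse the same partitioner.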

Example:


val pairRDD1 = sc.parallelize(List( ("cat",2), ("cat", 5), ("book", 4),("cat", 12)))
val pairRDD2 = sc.parallelize(List( ("cat",2), ("cup", 5), ("mouse", 4),("cat", 12)))
pairRDD1.fullOuterJoin(pairRDD2).collect

res5: Array[(String, (Option[Int], Option[Int]))] = Array((book,(Some(4),None)), (mouse,(None,Some(4))), (cup,(None,Some(5))), (cat,(Some(2),Some(2))), (cat,(Some(2),Some(12))), (cat,(Some(5),Some(2))), (cat,(Some(5),Some(12))), (cat,(Some(12),Some(2))), (cat,(Some(12),Some(12))))
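Because each side of the result is an Option, downstream code usually has to decide what a missing value means. As a rough sketch (again assuming spark-shell with `sc` available, with made-up RDDs `left` and `right`, and assuming that 0 is an acceptable default for a missing side), the pairs can be collapsed with getOrElse:

```scala
// Assumes spark-shell, where `sc` (the SparkContext) is predefined.
// Illustrative data, not taken from the original post.
val left  = sc.parallelize(List(("cat", 2), ("book", 4)))
val right = sc.parallelize(List(("cat", 12), ("cup", 5)))

// Treat a missing side as 0 (an assumption for this example) and add both sides.
val summed = left.fullOuterJoin(right).mapValues {
  case (l, r) => l.getOrElse(0) + r.getOrElse(0)
}

// Sort on the driver so the printed order is stable.
summed.collect().sortBy(_._1)
// Array((book,4), (cat,14), (cup,5))
```

Pattern matching on `(Some(x), None)`, `(None, Some(y))` and `(Some(x), Some(y))` works just as well when the three cases need different handling.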