Shuffle in Spark | Session-10 | Apache Spark Series from A-Z

Published: 21 December 2019
Channel: GK Codelabs

Hi Friends
Apache Spark is a distributed computing framework, which means the data being processed is distributed among the nodes of the cluster. When a computation needs related records in the same place, for example when grouping or joining by key, that distributed data often has to be shuffled across partitions.
In this video I explain Spark Shuffle and why it is an important and inevitable part of Apache Spark.
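
To make the idea concrete, here is a minimal sketch in plain Python (not Spark code, and not taken from the video) of what a shuffle does conceptually: records with the same key start out scattered over several input partitions and are moved so that each key ends up in exactly one output partition. The helper names `target_partition`, `shuffle` and `sum_by_key`, the sample fruit data, and the CRC32-based partitioning are all assumptions made for illustration; Spark's own partitioner and shuffle machinery are more involved.

```python
import zlib
from collections import defaultdict


def target_partition(key, num_partitions):
    # Hypothetical partitioner: a deterministic hash of the key, modulo the
    # number of output partitions, so equal keys always map to the same one.
    return zlib.crc32(key.encode("utf-8")) % num_partitions


def shuffle(input_partitions, num_partitions):
    # Move every (key, value) record to the output partition chosen for its key.
    output = [[] for _ in range(num_partitions)]
    for partition in input_partitions:
        for key, value in partition:
            output[target_partition(key, num_partitions)].append((key, value))
    return output


def sum_by_key(partition):
    # After the shuffle, each partition can aggregate its keys locally.
    totals = defaultdict(int)
    for key, value in partition:
        totals[key] += value
    return dict(totals)


# Made-up sample data: the key "apple" is spread over all three input partitions.
input_partitions = [
    [("apple", 1), ("pear", 2)],
    [("apple", 3), ("plum", 4)],
    [("pear", 5), ("apple", 6)],
]

shuffled = shuffle(input_partitions, num_partitions=2)
totals_per_partition = [sum_by_key(p) for p in shuffled]
print(totals_per_partition)
```

Without the shuffle step, no single partition could compute the total for "apple", because its records live in three different places. That is roughly why key-based operations in Spark trigger a shuffle, and why it tends to be costly: the records have to be moved between partitions, which on a real cluster usually means across the network.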