When should an RDD be persisted, and when should it be unpersisted?

Updated: 2022-06-22 22:29:26

Spark automatically monitors cache usage on each node and drops old data partitions in a least-recently-used (LRU) fashion. If you would like to remove an RDD manually instead of waiting for it to fall out of the cache, use the RDD.unpersist() method.
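To make the two removal paths concrete, here is a minimal pure-Python sketch of an LRU partition cache. It is only a toy model of the behaviour described above, not Spark's actual implementation: the PartitionCache class, its capacity parameter, and the rdd_a / rdd_b / rdd_c identifiers are all hypothetical names chosen for illustration.

```python
from collections import OrderedDict


class PartitionCache:
    """Toy LRU cache keyed by (rdd_id, partition). Hypothetical, not Spark code."""

    def __init__(self, capacity):
        self.capacity = capacity          # max number of cached partitions (assumed unit)
        self._blocks = OrderedDict()      # oldest entry first, most recent last

    def put(self, rdd_id, partition, data):
        key = (rdd_id, partition)
        self._blocks[key] = data
        self._blocks.move_to_end(key)     # mark as most recently used
        while len(self._blocks) > self.capacity:
            self._blocks.popitem(last=False)  # automatic path: drop the LRU partition

    def get(self, rdd_id, partition):
        key = (rdd_id, partition)
        if key not in self._blocks:
            return None                   # cache miss
        self._blocks.move_to_end(key)     # a read also counts as a use
        return self._blocks[key]

    def unpersist(self, rdd_id):
        # Manual path: remove every cached partition of one RDD immediately.
        for key in [k for k in self._blocks if k[0] == rdd_id]:
            del self._blocks[key]

    def cached_keys(self):
        return list(self._blocks)


cache = PartitionCache(capacity=3)
cache.put("rdd_a", 0, [1, 2])
cache.put("rdd_a", 1, [3, 4])
cache.put("rdd_b", 0, [5, 6])
cache.get("rdd_a", 0)            # touch rdd_a/0 so it becomes most recent
cache.put("rdd_c", 0, [7, 8])    # over capacity: rdd_a/1 is now the LRU entry
after_eviction = cache.cached_keys()

cache.unpersist("rdd_a")         # manual removal, no waiting for eviction
after_unpersist = cache.cached_keys()

print(after_eviction)
print(after_unpersist)
```

In this sketch, the partition that was used least recently is dropped automatically once the cache is full, while unpersist clears all of an RDD's partitions at once. In Spark itself the manual path is simply a call to unpersist() on an RDD that was previously persisted.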