3 events
when what by license comment
Jan 29, 2020 at 12:17 vote accept mccarthyj
Feb 18, 2016 at 20:06 comment added Tzach Zohar Note that this solution is limited to cases where peopleById is small enough to fit into driver memory (a single machine), in which case you don't really need Spark at all. If this collection gets larger, you're likely to get an OutOfMemoryError on the second line, which collects all data from the cluster to the driver machine.
Feb 18, 2016 at 16:38 history answered mccarthyj CC BY-SA 3.0