a) Keep the number of partitions below a few thousand
b) Process multiple partitions in parallel to get the most out of your hardware and maximize the amount of data processed
c) Avoid spilling to disk when processing indexes and aggregations (the ProcessIndex step of the processing stage)
d) Target 20 million rows or 250 MB per partition
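
The sizing targets in (a) and (d) can be turned into a quick back-of-the-envelope calculation. The Python sketch below is illustrative only and not part of any Analysis Services tooling. The function names are hypothetical. It assumes 1 MB = 1024 * 1024 bytes, reads "a few thousand" as a limit of 2,000, and takes the larger of the row-based and size-based counts so that neither target is exceeded. Adjust all of these to your environment.

```python
# Targets from guideline (d); the MB-to-bytes conversion is an assumption.
TARGET_ROWS_PER_PARTITION = 20_000_000
TARGET_BYTES_PER_PARTITION = 250 * 1024 * 1024

# Assumption: "a few thousand" in guideline (a) is read here as 2,000.
MAX_PARTITIONS = 2_000


def ceil_div(numerator: int, denominator: int) -> int:
    """Integer ceiling division without floating-point rounding."""
    return -(-numerator // denominator)


def suggest_partition_count(total_rows: int, total_bytes: int) -> int:
    """Return a partition count that keeps each partition at or below
    both the row target and the size target (hypothetical helper)."""
    by_rows = ceil_div(total_rows, TARGET_ROWS_PER_PARTITION)
    by_size = ceil_div(total_bytes, TARGET_BYTES_PER_PARTITION)
    # Always use at least one partition, even for an empty table.
    return max(1, by_rows, by_size)


def within_partition_limit(count: int, limit: int = MAX_PARTITIONS) -> bool:
    """Check the suggested count against guideline (a)."""
    return count < limit


if __name__ == "__main__":
    # Example: a fact table with 1 billion rows occupying 10 GB.
    count = suggest_partition_count(1_000_000_000, 10 * 1024**3)
    print(count, within_partition_limit(count))
```

If the suggested count exceeds the limit, the two guidelines conflict for that table, and larger partitions are probably the better trade-off than an unmanageable number of them.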