I have a data warehouse in Redshift. The cluster has 2 ra3.xlplus nodes (4 vCPU, 32 GB memory per node).

My dimensions are relatively small: the largest one has 1M records. The fact tables will contain around 10M records.

Based on the blogs, answers, and videos that I have checked so far, would the following be the right combination of DISTKEY and SORTKEY?

For all dimensions - DISTSTYLE ALL (since the tables are small)

SORTKEY - the dimension's surrogate key

For all fact tables - DISTSTYLE KEY

DISTKEY - the surrogate key of the most important dimension table, i.e. the one most frequently joined in my BI queries.

SORTKEY - Dim_Date_ID, since it is used in WHERE clauses.

Can someone please confirm whether this is a sensible combination?

Reference links that I have checked - This and This

Thank you!

Sanket

1 Answer

You are correct. In general:

  • Set the DISTKEY to the column most commonly used in JOIN
  • Set the SORTKEY to the column most used in WHERE
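
To illustrate why the WHERE column is a good SORTKEY candidate: Redshift keeps min/max metadata for each storage block (zone maps), so when rows are sorted on the filtered column, blocks outside the predicate's range can be skipped. The toy simulation below is only a sketch of that idea. The block size in rows, the integer date IDs, and the function names are assumptions made for illustration, not Redshift internals.

```python
import random

BLOCK_ROWS = 1_000  # hypothetical block size in rows (real blocks are sized in bytes)

def build_zone_map(values, block_rows=BLOCK_ROWS):
    """Return a (min, max) pair per block, mimicking per-block metadata."""
    return [
        (min(values[i:i + block_rows]), max(values[i:i + block_rows]))
        for i in range(0, len(values), block_rows)
    ]

def blocks_to_scan(zone_map, lo, hi):
    """Count blocks whose [min, max] range overlaps the predicate lo <= x <= hi."""
    return sum(1 for bmin, bmax in zone_map if bmax >= lo and bmin <= hi)

random.seed(0)
# 100,000 made-up fact rows, each with a date ID between 1 and 365
date_ids = [random.randint(1, 365) for _ in range(100_000)]

# Same predicate (one week of date IDs) against unsorted and sorted storage
unsorted_blocks = blocks_to_scan(build_zone_map(date_ids), 100, 106)
sorted_blocks = blocks_to_scan(build_zone_map(sorted(date_ids)), 100, 106)

print("blocks scanned, unsorted:", unsorted_blocks)
print("blocks scanned, sorted by date ID:", sorted_blocks)
```

With unsorted data nearly every block contains some row in the requested week, so almost nothing can be skipped. Once the rows are sorted on the date ID, only the few blocks covering that week need to be read.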

If the tables are small, then DISTSTYLE ALL is fine. It stores a full copy of the table on every node, which avoids cross-node data transfer when the table is joined.

Preferably, use the same DISTKEY on all tables that are JOINed. That way, rows with matching key values are stored on the same node, and the join can run without redistributing data.
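
The co-location argument can be sketched with a small simulation. This is a toy model, not Redshift's actual hashing or slice layout: the `orders` and `shipments` tables, the `order_id` column, and the use of Python's `hash` are all assumptions made for illustration.

```python
import random
from collections import Counter, defaultdict

NUM_NODES = 2  # matches the 2-node cluster in the question

def distribute_by_key(rows, key_col):
    """DISTSTYLE KEY stand-in: the same key value always lands on the same node."""
    nodes = defaultdict(list)
    for row in rows:
        nodes[hash(row[key_col]) % NUM_NODES].append(row)
    return nodes

def distribute_even(rows):
    """DISTSTYLE EVEN stand-in: round-robin placement, ignoring column values."""
    nodes = defaultdict(list)
    for i, row in enumerate(rows):
        nodes[i % NUM_NODES].append(row)
    return nodes

def distribute_all(rows):
    """DISTSTYLE ALL stand-in: every node holds a full copy of the table."""
    return {node: list(rows) for node in range(NUM_NODES)}

def local_join_matches(left_nodes, right_nodes, key_col):
    """Count join matches that can be found without moving rows between nodes."""
    matches = 0
    for node, left_rows in left_nodes.items():
        right_counts = Counter(r[key_col] for r in right_nodes.get(node, []))
        matches += sum(right_counts[l[key_col]] for l in left_rows)
    return matches

# Two hypothetical fact tables with one shipment per order
orders = [{"order_id": i} for i in range(1000)]
shipments = [{"order_id": i} for i in range(1000)]
random.Random(42).shuffle(shipments)  # load order differs between the tables

key_key = local_join_matches(
    distribute_by_key(orders, "order_id"),
    distribute_by_key(shipments, "order_id"),
    "order_id",
)
even_even = local_join_matches(
    distribute_even(orders), distribute_even(shipments), "order_id"
)
even_all = local_join_matches(
    distribute_even(orders), distribute_all(shipments), "order_id"
)

print("local matches, KEY joined to KEY:", key_key)
print("local matches, EVEN joined to EVEN:", even_even)
print("local matches, EVEN joined to ALL:", even_all)
```

In this model, two tables distributed on the same key find every match locally, and so does any table joined to a replicated (ALL) table. Two EVEN tables find only part of the matches locally, so the rest would require moving rows between nodes.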

  • John is spot on, as usual. I'd just add that when joining a DISTSTYLE ALL dimension table to a fact table, it doesn't matter what the DISTKEY on the fact table is: every join target is local. It does matter when joining two tables that both have DISTKEYs, so focus on the fact-to-fact joins when selecting the DISTKEY. Commented Nov 23, 2022 at 11:28
  • Hey @BillWeiner -- does the latest incarnation of Redshift automatically figure out the best DISTKEY and SORTKEY these days? Commented Nov 23, 2022 at 20:35
  • Redshift does have an auto distribution mode, but that is far from choosing the best DISTKEY. It rarely chooses a key, picking ALL or EVEN instead. EVEN is equivalent to random, which is a join and GROUP BY performance killer. Unless there is a fall-off-a-log obvious DISTKEY, I haven't seen Redshift make a good choice. Is your experience different? Commented Nov 23, 2022 at 20:45
