I have a data warehouse in Redshift. The cluster has 2 ra3.xlplus nodes (4 vCPU, 32 GB memory per node).
My dimensions are relatively small: the largest one has 1M records. The fact tables would contain around 10M records.
Based on the blogs, answers, and videos that I have checked so far, could the following be the right combination of DISTKEY and SORTKEY?
For all dimensions:
DISTSTYLE: ALL (since the tables are small)
SORTKEY: the surrogate key of the dimension
For all fact tables:
DISTSTYLE: KEY
DISTKEY: the surrogate key of the most important dimension table, i.e. the one most frequently joined in my BI queries
SORTKEY: Dim_Date_ID, since this is used in WHERE clauses
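To make the proposal concrete, here is a minimal sketch of how I would express it as DDL. The table names (dim_customer, fact_sales) and all columns other than Dim_Date_ID are made up for illustration, and I am assuming the customer dimension is the "most important" one:

```sql
-- Hypothetical dimension: replicated to every node, sorted on its surrogate key
CREATE TABLE dim_customer (
    customer_id   BIGINT       NOT NULL,  -- surrogate key (illustrative name)
    customer_name VARCHAR(256)
)
DISTSTYLE ALL
SORTKEY (customer_id);

-- Hypothetical fact table: distributed on the most frequently joined
-- dimension key, sorted on the date key used in WHERE clauses
CREATE TABLE fact_sales (
    dim_date_id  INTEGER       NOT NULL,
    customer_id  BIGINT        NOT NULL,
    sales_amount DECIMAL(18,2)
)
DISTSTYLE KEY
DISTKEY (customer_id)
SORTKEY (dim_date_id);
```

After loading, I assume the diststyle, sortkey1, and skew_rows columns of the SVV_TABLE_INFO system view could be used to check that the settings were applied and that the chosen DISTKEY does not skew the data across slices.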
Could someone please confirm whether this is a sensible combination?
Reference links that I have checked - This and This
Thank you!
Sanket