I would like to sum the values of one column conditional on the values of one or more other columns, as efficiently as possible. I am not sure whether the summarize command can do this. Here is an example data set:
Cancer1 Cancer2 Cancer3 Disease1
1 0 1 1
0 1 0 0
1 0 0 1
In this case I want to sum Disease1 among the people who have a given cancer. The output I am looking for would say that the total number of people who have both Cancer1 and Disease1 is 2, the total for Cancer2 and Disease1 is 0, and the total for Cancer3 and Disease1 is 1.
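To make the desired result concrete, here is a minimal sketch of the computation in plain Python. It is not the summarize command asked about, only an illustration of the arithmetic: because every column is a 0/1 indicator, multiplying a cancer column by Disease1 row by row and summing gives the number of people with both. The names `rows`, `cancer_columns`, and `count_with_disease` are hypothetical and chosen for this example.

```python
# Example data from the question, one dict per person (0/1 indicators).
rows = [
    {"Cancer1": 1, "Cancer2": 0, "Cancer3": 1, "Disease1": 1},
    {"Cancer1": 0, "Cancer2": 1, "Cancer3": 0, "Disease1": 0},
    {"Cancer1": 1, "Cancer2": 0, "Cancer3": 0, "Disease1": 1},
]
cancer_columns = ["Cancer1", "Cancer2", "Cancer3"]


def count_with_disease(rows, cancer_columns, disease_column="Disease1"):
    """Count, for each cancer column, the rows where both the cancer
    indicator and the disease indicator equal 1.

    Assumes every value is 0 or 1, so the product is 1 only when both are 1.
    """
    return {
        cancer: sum(row[cancer] * row[disease_column] for row in rows)
        for cancer in cancer_columns
    }


totals = count_with_disease(rows, cancer_columns)
print(totals)  # {'Cancer1': 2, 'Cancer2': 0, 'Cancer3': 1}
```

The same idea should carry over to any tool that can sum a product of two indicator columns, or sum Disease1 over the rows where a given cancer column equals 1.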