I have 8k samples, each with a pair of score lists:
sample 1:
Score A: [0.419, 0.348, 0.271, 0.12, 0.25, 0.145, 0.375, 0.172, 0.082]
Score B: [0.997, 0.802, 0.62, 0.67, 0.72, 0.64, 0.91, 0.65, 0.55]
Each sample can have a different length (up to 12 elements). I would like to check how correlated score A is with score B in terms of order/rank (i.e., the indexes with a high score A also have a high score B, and vice versa). I transformed the scores into argsort index arrays (min to max, using numpy.argsort(scoreA)):
Score A order: [9 4 6 8 5 3 2 7 1] (indexes shown 1-based)
Score B order: [9 3 6 8 4 5 2 7 1]
and calculated the Spearman rank coefficient:
SpearmanrResult(correlation=0.9500000000000001, pvalue=8.762523965086177e-05)
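For completeness, a minimal version of what I'm doing for this one sample. One caveat I noticed: scipy's spearmanr ranks its inputs internally, and np.argsort returns the inverse permutation of the ranks, so correlating the argsort arrays (which reproduces the 0.95 above) is not the same as calling spearmanr on the raw scores:

```python
import numpy as np
from scipy.stats import spearmanr

score_a = np.array([0.419, 0.348, 0.271, 0.12, 0.25, 0.145, 0.375, 0.172, 0.082])
score_b = np.array([0.997, 0.802, 0.62, 0.67, 0.72, 0.64, 0.91, 0.65, 0.55])

# Correlate the argsort index arrays, as described above.
# Note: spearmanr(score_a, score_b) on the raw scores would rank them
# internally and give ~0.783 for this sample, not 0.95.
rho, pvalue = spearmanr(np.argsort(score_a), np.argsort(score_b))
print(rho, pvalue)  # 0.95, 8.76e-05
```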
Now, as mentioned, I have 8k samples. Should I just run 8k tests like that and average the coefficients to see what the overall correlation is?
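Concretely, the naive aggregation I have in mind looks like this (samples here is a placeholder for my real data, a list of 8k (score_a, score_b) pairs of equal-length lists):

```python
import numpy as np
from scipy.stats import spearmanr

# Placeholder for the real data: 8k pairs of equal-length score lists.
samples = [
    ([0.419, 0.348, 0.271, 0.12], [0.997, 0.802, 0.62, 0.67]),
    ([0.419, 0.420, 0.271, 0.1], [0.997, 0.996, 0.62, 0.37]),
]

# One Spearman coefficient per sample, then a plain average.
rhos = [spearmanr(a, b).correlation for a, b in samples]
overall = np.mean(rhos)
print(overall)
```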
Also, is there any other method that is more sensitive to the numbers themselves? E.g., for a case like:
sample X:
Score A : [0.419, 0.420, 0.271, 0.1]
Score B : [0.997, 0.996, 0.62, 0.37]
where the difference between the first two elements is very small but their order is swapped between the two lists. In that case I would still like to get a high correlation.
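To illustrate the behavior I'm after (not necessarily the method I should use), plain Pearson correlation on the raw scores is one value-aware option: it barely penalizes the swapped-but-nearly-equal top pair, while Spearman only sees the swapped ranks:

```python
from scipy.stats import pearsonr, spearmanr

score_a = [0.419, 0.420, 0.271, 0.1]
score_b = [0.997, 0.996, 0.62, 0.37]

# Spearman only sees that the ranks of the first two elements are swapped.
print(spearmanr(score_a, score_b).correlation)  # 0.8
# Pearson uses the values themselves, so the tiny 0.419 vs 0.420
# difference barely lowers the score.
print(pearsonr(score_a, score_b)[0])  # ~0.99
```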