
Pattern Recognition Assignment 1 - Feedback

Dimitris Vogiatzopoulos, Jakub Kováč and Pavlos Sakellaridis


January 2023

General tip: Say you test a model with values x1 , x2 , . . . , xn for which
xi < xi+1 (so x1 < x2 , etc.), and suppose x2 performs best. Then it is a
good idea to search again over a few values i with x1 < i < x3 .
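The tip above can be sketched as a small helper. This is a minimal illustration, not code from the assignment: refine_grid is a hypothetical function, and for a parameter like C a logarithmic grid may be more natural than the linear refinement used here.

```python
def refine_grid(values, scores, n=5):
    """Given tried values (sorted ascending) and their scores,
    return a finer grid between the neighbours of the best value."""
    best = max(range(len(values)), key=lambda i: scores[i])
    lo = values[best - 1] if best > 0 else values[best] / 2
    hi = values[best + 1] if best < len(values) - 1 else values[best] * 2
    step = (hi - lo) / (n + 1)
    return [lo + step * (i + 1) for i in range(n)]

# e.g. tried C in {0.1, 1, 10}, best score at C = 1,
# so search again between 0.1 and 10:
print(refine_grid([0.1, 1, 10], [0.80, 0.91, 0.85]))
```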
In general, your SVM should perform better than logistic regression: most
groups end up with a slightly less accurate logistic regression model and a
slightly more accurate SVM model.
These feedback points are sorted from most important to least important,
so point 1 is the most important.

1. Parameter tuning SVM: what you missed. You left out some parameters
   entirely, and what you did tune was not handled very rigorously. You
   say you achieved the best score with the polynomial kernel, but you
   do not say which kernels you tested. You also did not tune any kernel
   parameters (degree, gamma, etc.).
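A sketch of how such a search could look with scikit-learn. This is an illustration only: the parameter values are placeholders, and scikit-learn's digits dataset stands in for the assignment's data. The point is that every kernel is listed explicitly, and each kernel gets the parameters that belong to it.

```python
from sklearn.datasets import load_digits
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.svm import SVC

X, y = load_digits(return_X_y=True)
X_train, X_val, y_train, y_val = train_test_split(X, y, random_state=0)

# List every kernel you try, and tune the parameters that belong
# to each kernel (degree for poly, gamma for rbf, ...).
param_grid = [
    {"kernel": ["linear"], "C": [0.1, 1, 10]},
    {"kernel": ["poly"], "C": [0.1, 1, 10], "degree": [2, 3, 4]},
    {"kernel": ["rbf"], "C": [0.1, 1, 10], "gamma": ["scale", 0.001, 0.01]},
]
search = GridSearchCV(SVC(), param_grid, cv=3)
search.fit(X_train, y_train)
print(search.best_params_, search.best_score_)
```

Reporting then becomes straightforward: the kernels tested are the ones in the grid, and the winner is whatever `best_params_` contains.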
2. Parameter tuning SVM: feedback on the parameters you did tune. First
   of all, you tune the tolerance, which is unnecessary: tolerance is an
   early-stopping criterion and does not need to be treated as a hyper-
   parameter. The values you choose for C are very far apart, and you
   test very few of them; testing a larger C would also be good. The
   general tip applies here as well: search again around the optimal
   value you just found.
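One way to cover a wide range of C values with only a few candidates is a log-spaced grid; a minimal sketch (the endpoints here are arbitrary, chosen only to show the idea of also including larger C values):

```python
import numpy as np

# A log-spaced grid spans several orders of magnitude with few values,
# and includes larger C values than a handful of small, far-apart ones.
C_grid = np.logspace(-2, 3, num=6)  # 0.01, 0.1, 1, 10, 100, 1000
print(C_grid)
```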
3. Own feature. The explanation is vague: the feature counts how many
   pixels are greater than 0, it does not sum those pixels. The feature
   is a variation of ink and not really complementary, and the report
   does not discuss this either. It only gives a 6% performance increase
   over ink (and ink only gives a 3% performance increase over your
   feature). Do note that the performance with ink alone does not
   necessarily have to be very high, since you are only using one
   feature; what matters is that there is a larger performance increase
   when the two are used together.
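To make the distinction concrete, a tiny sketch of the two features side by side (the 2x2 array is just a stand-in for a digit image):

```python
import numpy as np

img = np.array([[0, 100],
                [200, 0]])  # tiny stand-in for a digit image

def ink(image):
    """The ink feature: sum of all pixel intensities."""
    return int(image.sum())

def active_pixels(image):
    """The report's own feature: it counts how many pixels are
    greater than 0; it does not sum their values."""
    return int((image > 0).sum())

print(ink(img), active_pixels(img))  # 300 2
```

Both features grow with the amount of writing in the image, which is why they tend to be strongly correlated rather than complementary.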
4. Analysis. You did not mention that there are useless features (also
   report which features are useless and why; some features give the
   model no information at all). You did not give any other interesting
   statistics (apart from the mean and standard deviation of ink and
   the class distribution).
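One way such useless features can be found: a pixel that takes the same value in every image has zero variance and cannot inform the model. A minimal sketch with a toy design matrix (rows are flattened images, columns are pixel features):

```python
import numpy as np

# Toy design matrix standing in for the real data.
X = np.array([
    [0, 12, 0, 255],
    [0, 80, 0, 130],
    [0, 41, 0, 200],
])

# A feature with zero variance is constant across all samples and
# gives the model no information; on digit data these are typically
# the border pixels, which are 0 in every image.
useless = np.where(X.std(axis=0) == 0)[0]
print(useless)  # columns 0 and 2
```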

5. In the last sentence of the conclusion you suddenly mention McNemar.
   You do not give a contingency matrix, you do not specify alpha, you
   do not report the actual p-value, and you do not explain the test
   you use. You also write "strong evidence" against the null hypothesis,
   while you could make a stronger claim here (null hypothesis rejected,
   alternative hypothesis accepted).
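For reference, everything the report should state fits in a few lines. A self-contained sketch of the exact McNemar test; the counts b and c here are hypothetical, chosen only to illustrate the reporting (b and c are the off-diagonal cells of the contingency matrix):

```python
from math import comb

def mcnemar_exact(b, c):
    """Exact (binomial) McNemar test.
    b = samples only model A gets wrong,
    c = samples only model B gets wrong."""
    n = b + c
    k = min(b, c)
    # Two-sided exact p-value under H0: both models err equally often.
    p = 2 * sum(comb(n, i) for i in range(k + 1)) * 0.5 ** n
    return min(p, 1.0)

alpha = 0.05          # state the significance level
p = mcnemar_exact(b=10, c=30)   # hypothetical counts
print(f"p = {p:.4f}; reject H0 at alpha = {alpha}: {p < alpha}")
```

A complete report would show the contingency matrix, name the test variant, give alpha, give the p-value, and state the decision on the null hypothesis.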
6. Logistic regression. You use very small C values; please test more
   values, with a larger maximum C. Also, why do you report a mean? Do
   you not test on a validation set here? You should evaluate on a
   validation set and report that accuracy; do not test on the data you
   trained on.
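A minimal sketch of the intended setup, with scikit-learn's digits dataset standing in for the assignment's data and placeholder C values. Accuracy is measured on a held-out split, never on the training data:

```python
from sklearn.datasets import load_digits
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

X, y = load_digits(return_X_y=True)
X_train, X_val, y_train, y_val = train_test_split(X, y, random_state=0)

best = None
for C in [0.01, 0.1, 1, 10, 100]:  # include larger C values too
    clf = LogisticRegression(C=C, max_iter=1000).fit(X_train, y_train)
    acc = clf.score(X_val, y_val)  # held-out accuracy, not training accuracy
    if best is None or acc > best[1]:
        best = (C, acc)
print(best)
```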
7. Your analysis of the model using both features (ink and your own) is
   very short. You do not discuss what information the features add to
   each other.

8. Layout looks OK (although all confusion matrices stick out of the
   margins and use a very large font). You already announce the results
   in chapter 3, which is really confusing if you have forgotten it is
   there while reading chapter 4; it really should be in chapter 4
   instead.
