
When learning a linear regression model, is there any difference between minimizing the sum of squared errors and minimizing the mean of the squared errors, apart from the math being easier when taking the derivative of the error function?

The formula I am talking about is:

$$Error_{\theta} = \sum_{i=1}^{N} \left(y_i - \theta^T x_i\right)^2$$

The minimization problem is usually stated as:

$$\underset{\theta}{\min} \; \frac{1}{N} \sum_{i=1}^{N} \left(y_i - \theta^T x_i\right)^2$$

or with an extra factor of $\frac{1}{2}$ so that differentiation is easier:

$$\underset{\theta}{\min} \; \frac{1}{2N} \sum_{i=1}^{N} \left(y_i - \theta^T x_i\right)^2$$

Wouldn't

$$\underset{\theta}{\min} \; \sum_{i=1}^{N} \left(y_i - \theta^T x_i\right)^2$$

work just as well?
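
For what it's worth, here is a quick numerical sanity check of what I mean, on made-up data (plain NumPy/SciPy, nothing specific to any library's regression API): minimizing either objective gives the same $\theta$; only the objective value differs by the factor $N$.

```python
import numpy as np
from scipy.optimize import minimize

# Synthetic data, purely for illustration.
rng = np.random.default_rng(0)
N, d = 200, 3
X = rng.normal(size=(N, d))
theta_true = np.array([1.5, -2.0, 0.5])
y = X @ theta_true + rng.normal(scale=0.1, size=N)

def sse(theta):
    # Sum of squared errors.
    r = y - X @ theta
    return r @ r

def mse(theta):
    # The same objective, scaled by 1/N.
    return sse(theta) / N

theta0 = np.zeros(d)
theta_sse = minimize(sse, theta0).x   # minimize the sum
theta_mse = minimize(mse, theta0).x   # minimize the mean

print(theta_sse)
print(theta_mse)
# True: same minimizer, up to solver tolerance.
print(np.allclose(theta_sse, theta_mse, atol=1e-4))
```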

  • Yes, minimising a function is the same as minimising that function multiplied by a positive constant. Or in fact any strictly increasing transformation of the function.
    – Denziloe, Aug 25, 2018 at 14:32
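
To make the comment's argument explicit: scaling the objective by a positive constant $c$ (such as $\frac{1}{N}$ or $\frac{1}{2N}$) rescales the gradient but does not move its zeros, so all three problems above share the same minimizer:

$$\nabla_\theta \left( c \sum_{i=1}^{N} \left(y_i - \theta^T x_i\right)^2 \right) = c \, \nabla_\theta \sum_{i=1}^{N} \left(y_i - \theta^T x_i\right)^2 = 0 \quad\Longleftrightarrow\quad \nabla_\theta \sum_{i=1}^{N} \left(y_i - \theta^T x_i\right)^2 = 0$$

More generally, $c\,f(\theta^*) \le c\,f(\theta)$ holds for all $\theta$ exactly when $f(\theta^*) \le f(\theta)$ does, which is why any strictly increasing transformation of the loss also leaves the minimizer unchanged.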
