Icann 2023
Backpropagation
related to the degree of correlation between the variables in the data: if the
determinant is close to zero, some of the variables are highly correlated, which
implies that the dataset has redundant information. This can be formalized as follows.
Let the linear layer weight matrix $W = (a_{ij})_{1 \le i \le m,\, 1 \le j \le n}$ be of size $m \times n$.
If the columns of $W$ are linearly dependent, then there exists a nonzero vector $\vec{x} \in \mathbb{R}^n$
such that $W\vec{x} = 0$. Then if $m > n$:
CIFAR10
  1    0.204 · 10^6    57.85    62.72     8.4
  2    0.496 · 10^6    62.37    66.87     7.2
  3    1.37 · 10^6     65.16    68.33     4.9
CIFAR100
  1    0.204 · 10^6    28.40    31.76    11.8
  2    0.496 · 10^6    31.24    33.92    11.7
  3    1.37 · 10^6     33.65    35.52     5.6
Table 1. Results of the models on the CIFAR10 and CIFAR100 datasets
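The linear-dependence argument for the weight matrix $W$ can be checked numerically. The following is a minimal sketch, not the authors' code: the matrix sizes, the random seed, and the use of the Gram matrix $W^\top W$ as the matrix whose determinant is inspected are all assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical tall weight matrix with m > n (sizes chosen only for illustration).
m, n = 6, 3
W = rng.standard_normal((m, n))

# Force the third column to be a linear combination of the first two,
# so the columns of W are linearly dependent.
W[:, 2] = 2.0 * W[:, 0] - W[:, 1]

# A nonzero vector x with W x = 0: 2*col0 - col1 - col2 = 0.
x = np.array([2.0, -1.0, -1.0])
residual = np.linalg.norm(W @ x)

# Assumption: the determinant of interest is that of the n x n Gram matrix W^T W.
gram_det = np.linalg.det(W.T @ W)
rank = np.linalg.matrix_rank(W)

# For comparison, a random Gaussian matrix of the same shape has
# linearly independent columns with probability one.
W_full = rng.standard_normal((m, n))
full_det = np.linalg.det(W_full.T @ W_full)

print(f"||W x||              = {residual:.2e}")
print(f"rank(W)              = {rank}")
print(f"det(W^T W), dependent   = {gram_det:.2e}")
print(f"det(W^T W), independent = {full_det:.2e}")
```

Under these assumptions, the dependent case should give a residual and a Gram determinant that are zero up to floating-point error, while the independent case gives a clearly nonzero determinant.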