Timeline for alternating gradient descent on quadratic function

Current License: CC BY-SA 4.0

13 events

when toggle format	what		by	license	comment
Dec 6, 2023 at 14:53	history	edited	Ki Chao	CC BY-SA 4.0	added 9 characters in body
Dec 6, 2023 at 7:54	comment	added	P. Quinton		For proving that $f(x_{k+1}, y_{k+1})\leq f(x_{k+1}, y_k)$ yes. The problem here is that your problem is biconvex and you are doing alternating biconvex optimization, this is not guarantied to converge and I think it may not in general have a minimum (but maybe your does, I don't know).
Dec 5, 2023 at 19:55	comment	added	Ki Chao		@P.Quinton: thanks for the further explanation. My issue is would it be fine that the Lipschitz constant in this case depend on the other variable $x$?
Dec 5, 2023 at 14:04	comment	added	P. Quinton		I also believe your function is smooth in $y$ for a fixed $x$, so you only need to prove it for $y$. You can do that by observing that $x^T y^T y x=\operatorname{Tr}(y x x^T y^T)$ and therefore using a bit of matrix calculus $\nabla_y f(x, y) = 2 y x x^T$, it is then easy to show that $\\| \nabla_y f(x, a) -\nabla_y f(x, b) \\|_{\text{F}} = 2\\| (a-b) x x^T \\|_{\text{F}}\leq 2 \\| x x^T \\|_{\text{F}} \cdot \\| a-b \\|_{\text F}=2 x^T x \\| a-b\\|_{\text F}$. Therefore your function is actually $2 x^T x$-smooth.
Dec 5, 2023 at 10:57	comment	added	Ki Chao		@P.Quinton: thanks for the reference and explanation. At least in my case, I think the function is smooth and convex in $y$. Do you imply I need it to be smooth also in $x$?
Dec 5, 2023 at 9:29	comment	added	P. Quinton		you can read this : epfl.ch/labs/anchp/wp-content/uploads/2018/05/part3a.pdf The thing is that your descent step might overshoot bring you to a higher point, think of the $f(x)=\|x\|$ function if you are at the point $x_0=-\eta/4$, then doing the gradient step $x_0 - \eta c\dot \partial f(x_0)$, then you get at $x_1 = -3/4 \eta$ and $f(x_0) < f(x_1)$. Smoothness avoids that by saying that the function "looks similar enough" to a quadratic function.
Dec 5, 2023 at 9:08	comment	added	Ki Chao		@P.Quinton: $y$ here is a matrix. Could you explain more about the smoothness assumptions?
Dec 5, 2023 at 8:48	comment	added	P. Quinton		$y^T y$ is a value, did you mean $x^T y y^T x$ ? If so then you just need to show that $f(x^{k+1}, y)$ is convex in $y$ and you also have to be careful about $\eta_{y}$, you would need some assumption such as smoothness to prove that.
Dec 5, 2023 at 8:33	history	edited	Ki Chao	CC BY-SA 4.0	added condition on $\eta_y$
Dec 4, 2023 at 19:14	comment	added	Ki Chao		@copper.hat: I updated the question as I seem to forgot a term in the definition of $f$.
Dec 4, 2023 at 19:13	history	edited	Ki Chao	CC BY-SA 4.0	updated the definition of $f$
Dec 4, 2023 at 18:26	comment	added	copper.hat		It has no minimum, you have $f >0$ but $\inf f = 0$.
Dec 4, 2023 at 16:30	history	asked	Ki Chao	CC BY-SA 4.0