Embed these “worst case” constructions using a random orthogonal map. If the dimension is sufficiently high, an initialization drawn from any fixed distribution will, with high probability, be nearly orthogonal to all directions of interest, so the analysis works as if you initialize at 0.
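A quick numerical sanity check of the near-orthogonality claim, as a minimal sketch rather than the actual construction: the helper names `random_orthonormal_frame` and `max_overlap`, the choice of k = 5 "directions of interest", and the all-ones initialization are all assumptions made for illustration. It draws the first k columns of a random orthogonal matrix (Gram-Schmidt on Gaussian vectors) and measures the largest overlap between a fixed unit-norm initialization and those embedded directions.

```python
import math
import random


def random_orthonormal_frame(d, k, rng):
    """Return k orthonormal vectors in R^d: the images of k fixed
    orthonormal directions under a random orthogonal map
    (Gram-Schmidt applied to i.i.d. Gaussian vectors)."""
    frame = []
    for _ in range(k):
        v = [rng.gauss(0.0, 1.0) for _ in range(d)]
        # Remove the components along the vectors already in the frame.
        for q in frame:
            dot = sum(a * b for a, b in zip(v, q))
            v = [a - dot * b for a, b in zip(v, q)]
        norm = math.sqrt(sum(a * a for a in v))
        frame.append([a / norm for a in v])
    return frame


def max_overlap(d, k, seed=0):
    """Largest |<x0, q_i>| between a fixed unit-norm initialization x0
    and the k randomly embedded directions q_i."""
    rng = random.Random(seed)
    frame = random_orthonormal_frame(d, k, rng)
    # Hypothetical initialization: any fixed unit vector would do,
    # since it is chosen independently of the random map.
    x0 = [1.0 / math.sqrt(d)] * d
    return max(abs(sum(a * b for a, b in zip(x0, q))) for q in frame)


if __name__ == "__main__":
    for d in (10, 100, 1000, 10000):
        print(d, max_overlap(d, k=5))
```

Each overlap has typical size about 1/sqrt(d), so the printed values should shrink as d grows, which is the sense in which the initialization behaves like 0 along the directions that matter.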
btw, according to the cash prize proposed at COLT, we now owe Zihan (1.03-1)*50 = $1.50 😂