References

Abernethy, Jacob, et al. "Optimal strategies and minimax lower bounds for online convex games." Proceedings of the 21st annual conference on learning theory, 2008.
Auer, Peter. "Using confidence bounds for exploitation-exploration trade-offs." Journal of Machine Learning Research, 3.Nov (2002): 397-422.
Bouneffouf, Djallel. "Finite-time analysis of the multi-armed bandit problem with known trend." 2016 IEEE Congress on Evolutionary Computation (CEC), IEEE, 2016.
Boyd, Stephen, and Lieven Vandenberghe. Convex optimization. Cambridge university press, 2004.
Bubeck, Sébastien, Ronen Eldan, and Yin Tat Lee. "Kernel-based methods for bandit convex optimization." Journal of the ACM (JACM), 68.4 (2021): 1-35.
Devroye, Luc, László Györfi, and Gábor Lugosi. A probabilistic theory of pattern recognition. Vol. 31, Springer Science & Business Media, 2013.
Feller, William. An introduction to probability theory and its applications. 1971.
Flaxman, Abraham D., Adam Tauman Kalai, and H. Brendan McMahan. "Online convex optimization in the bandit setting: gradient descent without a gradient." arXiv preprint cs/0408007, 2004.
Hazan, Elad, Amit Agarwal, and Satyen Kale. "Logarithmic regret algorithms for online convex optimization." Machine Learning, 69.2 (2007): 169-192.
Ismailov, Vugar E. "A three layer neural network can represent any multivariate function." Journal of Mathematical Analysis and Applications, 523.1 (2023): 127096.
Kearns, Michael J., and Umesh Vazirani. An introduction to computational learning theory. MIT press, 1994.
Lai, Tze Leung, and Herbert Robbins. "Asymptotically efficient adaptive allocation rules." Advances in applied mathematics, 6.1 (1985): 4-22.
McAllester, David A. "PAC-Bayesian stochastic model selection." Machine Learning, 51.1 (2003): 5-21.
Mohri, Mehryar, Afshin Rostamizadeh, and Ameet Talwalkar. Foundations of machine learning. MIT press, 2018.
Nakkiran, Preetum, et al. "Deep double descent: Where bigger models and more data hurt." Journal of Statistical Mechanics: Theory and Experiment, 2021.12 (2021): 124003.
Penot, Jean-Paul. "On regularity conditions in mathematical programming." Optimality and Stability in Mathematical Programming (1982): 167-199.
Robbins, Herbert. "Some aspects of the sequential design of experiments." Bulletin of the American Mathematical Society, 58.5 (1952): 527-535.
Thompson, William R. "On the likelihood that one unknown probability exceeds another in view of the evidence of two samples." Biometrika, 25.3-4 (1933): 285-294.
Wainwright, Martin J. High-dimensional statistics: A non-asymptotic viewpoint. Vol. 48, Cambridge university press, 2019.
Wang, Guanghui, Shiyin Lu, and Lijun Zhang. "Adaptivity and optimality: A universal algorithm for online convex optimization." Uncertainty in Artificial Intelligence, PMLR, 2020.
Yun, Chulhee, et al. "Are Transformers universal approximators of sequence-to-sequence functions?" International Conference on Learning Representations, 2020.
Zhang, Lijun, Shiyin Lu, and Zhi-Hua Zhou. "Adaptive online learning in dynamic environments." Advances in neural information processing systems, 31 (2018).
Zhang, Lijun, Tie-Yan Liu, and Zhi-Hua Zhou. "Adaptive regret of convex and smooth functions." International Conference on Machine Learning, PMLR, 2019.
Zinkevich, Martin. "Online convex programming and generalized infinitesimal gradient ascent." Proceedings of the 20th international conference on machine learning (ICML-03), 2003.
