Abstract
Quadratic programming (QP) seeks to solve optimization problems involving quadratic functions that can include complex boundary constraints. QP in the unrestricted form is NP-hard; but when restricted to the convex case, it becomes tractable. Active set and interior point methods are used to solve convex problems, and in the nonconvex case various heuristics or relaxations are used to produce high-quality solutions in finite time. Learning to optimize (L2O) is an emerging approach to design solvers for optimization problems. We develop an L2O approach that uses reinforcement learning to learn a stochastic policy to switch between pre-existing optimization algorithms to solve QP problem instances. In particular, our agent switches between three simple optimizers: Adam, gradient descent, and random search. Our experiments show that the learned optimizer minimizes quadratic functions faster and finds better-quality solutions in the long term than do any of the possible optimizers switched between. We also compare our solver with the standard QP algorithms in MATLAB and find better performance in fewer function evaluations.
Original language | English |
---|---|
Pages (from-to) | 1553-1568 |
Number of pages | 16 |
Journal | Proceedings of Machine Learning Research |
Volume | 157 |
State | Published - 2021 |
Externally published | Yes |
Event | 13th Asian Conference on Machine Learning, ACML 2021 - Virtual, Online Duration: Nov 17 2021 → Nov 19 2021 |
Funding
This material is based upon work supported by the U.S. Department of Energy (DOE), Office of Science, Office of Advanced Scientific Computing Research, under Contract DE- AC02-06CH11357. We gratefully acknowledge the computing resources provided by the Joint Laboratory for System Evaluation (JLSE) at Argonne National Laboratory. This material is based upon work supported by the U.S. Department of Energy (DOE), Office of Science, Office of Advanced Scientific Computing Research, under Contract DEAC02-06CH11357. We gratefully acknowledge the computing resources provided by the Joint Laboratory for System Evaluation (JLSE) at Argonne National Laboratory.
Funders | Funder number |
---|---|
U.S. Department of Energy | |
Office of Science | |
Advanced Scientific Computing Research | DE- AC02-06CH11357 |
Argonne National Laboratory |
Keywords
- Learning to Optimize
- Quadratic Programming
- Reinforcement Learning