TY - GEN
T1 - Deep Q-Learning for Dry Stacking Irregular Objects
AU - Liu, Yifang
AU - Shamsi, Seyed Mahdi
AU - Fang, Le
AU - Chen, Changyou
AU - Napp, Nils
N1 - Publisher Copyright:
© 2018 IEEE.
PY - 2018/12/27
Y1 - 2018/12/27
N2 - We propose a reinforcement learning approach for automatically building dry-stacked (i.e., no mortar) structures with irregular objects. Stacking irregular objects is a challenging problem since each assembly action can be drawn from a continuous space of poses for an object, and several local geometric and physical considerations strongly affect the stability. To tackle this challenge, we concentrate on a simplified 2D version of the problem. We present a reinforcement learning algorithm based on deep $Q$-learning, where the learned $Q$-function, which maps state-action pairs into expected long-term rewards, is represented by a deep neural network. As the action space is continuous, the $Q$-network is trained by sampling a finite number of actions that consider both geometric and physical constraints to approximate the target $Q$-values. Experiments show that the proposed method outperforms previous heuristics-based planning, leading to superior construction with objects containing a significant amount of variation. We validate the generated stacking plans by executing them using a robot arm and manufactured, irregular objects.
AB - We propose a reinforcement learning approach for automatically building dry-stacked (i.e., no mortar) structures with irregular objects. Stacking irregular objects is a challenging problem since each assembly action can be drawn from a continuous space of poses for an object, and several local geometric and physical considerations strongly affect the stability. To tackle this challenge, we concentrate on a simplified 2D version of the problem. We present a reinforcement learning algorithm based on deep $Q$-learning, where the learned $Q$-function, which maps state-action pairs into expected long-term rewards, is represented by a deep neural network. As the action space is continuous, the $Q$-network is trained by sampling a finite number of actions that consider both geometric and physical constraints to approximate the target $Q$-values. Experiments show that the proposed method outperforms previous heuristics-based planning, leading to superior construction with objects containing a significant amount of variation. We validate the generated stacking plans by executing them using a robot arm and manufactured, irregular objects.
UR - https://www.scopus.com/pages/publications/85062962132
U2 - 10.1109/IROS.2018.8593619
DO - 10.1109/IROS.2018.8593619
M3 - Conference contribution
AN - SCOPUS:85062962132
T3 - IEEE International Conference on Intelligent Robots and Systems
SP - 1569
EP - 1576
BT - 2018 IEEE/RSJ International Conference on Intelligent Robots and Systems, IROS 2018
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 2018 IEEE/RSJ International Conference on Intelligent Robots and Systems, IROS 2018
Y2 - 1 October 2018 through 5 October 2018
ER -