TY - GEN
T1 - TOWARDS ENABLING DEEP LEARNING-BASED QUESTION-ANSWERING FOR 3D LIDAR POINT CLOUDS
AU - Shinde, Rajat C.
AU - Durbha, Surya S.
AU - Potnis, Abhishek V.
AU - Talreja, Pratyush
AU - Singh, Gaganpreet
N1 - Publisher Copyright:
© 2021 IEEE
PY - 2021
Y1 - 2021
N2 - Remote sensing lidar point cloud datasets embed inherent 3D topological, topographical, and complex geometrical information, which holds immense potential for applications involving machine-understandable 3D perception. Lidar point clouds are unstructured, unlike images, and are hence challenging to process. In our work, we explore the possibility of deep learning-based question-answering on 3D lidar point clouds. We propose a deep parallel CNN-RNN architecture that learns lidar point cloud features and word embeddings from the questions, and fuses them to form a feature mapping for generating answers. We restrict our experiments to the urban domain and present preliminary results of binary (yes/no) question-answering on urban lidar point clouds, using perplexity, edit distance, evaluation loss, and sequence accuracy as performance metrics. To the best of our knowledge, our proposed lidar question-answering is the first such attempt, and we envisage that this novel work could serve as a foundation for using lidar point clouds for enhanced 3D perception in urban environments. We further envisage that the proposed lidar question-answering could be extended to machine comprehension-based applications such as rendering lidar scene descriptions and content-based 3D scene retrieval.
AB - Remote sensing lidar point cloud datasets embed inherent 3D topological, topographical, and complex geometrical information, which holds immense potential for applications involving machine-understandable 3D perception. Lidar point clouds are unstructured, unlike images, and are hence challenging to process. In our work, we explore the possibility of deep learning-based question-answering on 3D lidar point clouds. We propose a deep parallel CNN-RNN architecture that learns lidar point cloud features and word embeddings from the questions, and fuses them to form a feature mapping for generating answers. We restrict our experiments to the urban domain and present preliminary results of binary (yes/no) question-answering on urban lidar point clouds, using perplexity, edit distance, evaluation loss, and sequence accuracy as performance metrics. To the best of our knowledge, our proposed lidar question-answering is the first such attempt, and we envisage that this novel work could serve as a foundation for using lidar point clouds for enhanced 3D perception in urban environments. We further envisage that the proposed lidar question-answering could be extended to machine comprehension-based applications such as rendering lidar scene descriptions and content-based 3D scene retrieval.
KW - 3D urban perception
KW - Deep learning
KW - Lidar question-answering
KW - Towards scene retrieval
UR - http://www.scopus.com/inward/record.url?scp=85126021972&partnerID=8YFLogxK
U2 - 10.1109/IGARSS47720.2021.9553785
DO - 10.1109/IGARSS47720.2021.9553785
M3 - Conference contribution
AN - SCOPUS:85126021972
T3 - International Geoscience and Remote Sensing Symposium (IGARSS)
SP - 6936
EP - 6939
BT - IGARSS 2021 - 2021 IEEE International Geoscience and Remote Sensing Symposium, Proceedings
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 2021 IEEE International Geoscience and Remote Sensing Symposium, IGARSS 2021
Y2 - 12 July 2021 through 16 July 2021
ER -