Abstract
Effective assisted living environments must be able to perform inferences on how their occupants interact with one another as well as with surrounding objects. To accomplish this goal using a vision-based automated approach, multiple tasks such as pose estimation, object segmentation and gaze estimation must be addressed. Gaze direction provides some of the strongest indications of how a person interacts with the environment. In this paper, we propose a simple neural network regressor that estimates the gaze direction of individuals in a multi-camera assisted living scenario, relying only on the relative positions of facial keypoints collected from a single pose estimation model. To handle cases of keypoint occlusion, our model exploits a novel confidence gated unit in its input layer. In addition to the gaze direction, our model also outputs an estimate of its own prediction uncertainty. Experimental results on a public benchmark demonstrate that our approach performs on par with a complex, dataset-specific baseline, while its uncertainty predictions are highly correlated with the actual angular error of the corresponding estimates. Finally, experiments on images from a real assisted living environment demonstrate that our model is better suited to its intended application.
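The abstract does not give the form of the confidence gated unit or of the angular error, so the following is only a minimal sketch under stated assumptions, not the authors' implementation. It assumes that each keypoint coordinate passes through a ReLU and is scaled by a sigmoid of the keypoint's detection confidence, and that gaze is a 2D direction vector. The names `confidence_gate` and `angular_error_deg` and the weights `w_x`, `b_x`, `w_c`, `b_c` are hypothetical placeholders; in a trained network the weights would be learned.

```python
import math


def sigmoid(x):
    """Logistic function, used here as the gate activation."""
    return 1.0 / (1.0 + math.exp(-x))


def confidence_gate(coord, conf, w_x=1.0, b_x=0.0, w_c=1.0, b_c=0.0):
    """Gate one keypoint coordinate by its detection confidence.

    Assumed (hypothetical) form:
        ReLU(w_x * coord + b_x) * sigmoid(w_c * conf + b_c)
    A low-confidence (e.g. occluded) keypoint can then contribute less
    to the next layer, depending on the gate weights.
    """
    feature = max(0.0, w_x * coord + b_x)   # ReLU on the coordinate
    gate = sigmoid(w_c * conf + b_c)        # value in (0, 1)
    return feature * gate


def angular_error_deg(pred, target):
    """Angle in degrees between two non-zero 2D gaze direction vectors."""
    dot = pred[0] * target[0] + pred[1] * target[1]
    norm = math.hypot(*pred) * math.hypot(*target)
    cos = max(-1.0, min(1.0, dot / norm))   # clamp against rounding error
    return math.degrees(math.acos(cos))


if __name__ == "__main__":
    # Same coordinate, high versus zero confidence (placeholder gate weights).
    print(confidence_gate(0.8, 0.95, w_c=8.0, b_c=-4.0))
    print(confidence_gate(0.8, 0.0, w_c=8.0, b_c=-4.0))
    # Error between a predicted and a reference gaze direction.
    print(angular_error_deg((1.0, 0.0), (0.0, 1.0)))
```

With the placeholder gate weights in the usage lines, the zero-confidence keypoint is strongly attenuated relative to the high-confidence one, which is the behavior a gate of this kind is meant to provide for occluded keypoints.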
| Original language | English |
|---|---|
| Title of host publication | Proceedings - 2020 IEEE Winter Conference on Applications of Computer Vision, WACV 2020 |
| Publisher | Institute of Electrical and Electronics Engineers Inc. |
| Pages | 279-288 |
| Number of pages | 10 |
| ISBN (Electronic) | 9781728165530 |
| State | Published - Mar 2020 |
| Externally published | Yes |
| Event | 2020 IEEE/CVF Winter Conference on Applications of Computer Vision, WACV 2020 - Snowmass Village, United States. Duration: Mar 1 2020 → Mar 5 2020 |
Publication series
| Name | Proceedings - 2020 IEEE Winter Conference on Applications of Computer Vision, WACV 2020 |
|---|---|
Conference
| Conference | 2020 IEEE/CVF Winter Conference on Applications of Computer Vision, WACV 2020 |
|---|---|
| Country/Territory | United States |
| City | Snowmass Village |
| Period | 03/1/20 → 03/5/20 |
Funding
Finally, evaluation on frames collected from a real assisted living facility demonstrates that our model is better suited to the analysis of instrumental activities of daily living (IADL) in realistic scenarios, where images cover wider areas and subjects are visible at different scales and poses.
Acknowledgements
Part of this work was carried out at the Machine Learning Genoa (MaLGa) center, Università di Genova (IT), thanks to the student mobility supported by Erasmus+ K107. We acknowledge the NVIDIA Corporation for the donation of a GPU used for this research.