BIROn - Birkbeck Institutional Research Online

    Infants’ developing environment: integration of computer vision and human annotation to quantify where infants go, what they touch, and what they see

    Han, D. and Aziere, N. and Wang, T. and Ossmy, O. and Krishna, A. and Wang, H. and Shen, R. and Todorovic, S. and Adolph, K. (2024) Infants’ developing environment: integration of computer vision and human annotation to quantify where infants go, what they touch, and what they see. In: 2024 IEEE International Conference on Development and Learning (ICDL). IEEE. ISBN 9798350348569.

    Han_IEEE_2024.pdf - Published Version of Record
    Restricted to Repository staff only

    Abstract

    Infants learn through interactions with the environment. Thus, to understand infants’ early learning experiences, it is critical to quantify their natural learning input—where infants go, what they touch, and what they see. Wearable sensors can record locomotor and hand movements, but cannot recover the context that prompted the behaviors. Egocentric views from head cameras and eye trackers require annotation to process the videos and miss much of the surrounding context. Third-person video captures infant behavior in the entire scene but may misrepresent the egocentric view. Moreover, third-person video requires machine or human annotation to make sense of the behaviors, and either method alone is sorely lacking. Computer vision is not sufficiently reliable to quantify much of infants’ complex, variable behavior, and human annotation cannot reliably quantify 3D coordinates of behavior without laborious hand digitization. Thus, we pioneered a new system of behavior detection from third-person video that capitalizes on the integrated power of computer vision and human annotation to quantify infants’ locomotor, manual, and egocentric visual interactions with the environment. Our system estimates a real infant’s interaction with a physical environment during free play by projecting a “virtual” infant in a “virtual” 3D environment with known coordinates of all furniture, objects, and surfaces. Our methods for using human-in-the-loop computer vision have broad applications for reliable quantification of locomotor, manual, and visual behaviors outside the purview of standard algorithms or human annotation alone.
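
    The record does not include the authors’ implementation, but the core idea described above—projecting estimated infant body positions into a “virtual” 3D scene whose furniture, object, and surface coordinates are known—can be illustrated with a minimal sketch. The object representation (axis-aligned bounding boxes), function names, and thresholds below are illustrative assumptions, not the paper’s method.

```python
# Hypothetical sketch: given estimated 3D positions for an infant's hand and head,
# and a "virtual" environment whose furniture/objects are stored as axis-aligned
# bounding boxes with known coordinates, report which objects the hand is touching
# and which objects fall inside a simple view cone around the gaze direction.
# All names, thresholds, and data structures here are illustrative assumptions.

from dataclasses import dataclass
import numpy as np


@dataclass
class SceneObject:
    name: str
    box_min: np.ndarray  # (3,) lower corner of bounding box, in metres
    box_max: np.ndarray  # (3,) upper corner


def distance_to_box(point: np.ndarray, obj: SceneObject) -> float:
    """Euclidean distance from a 3D point to the closest point on the box."""
    closest = np.clip(point, obj.box_min, obj.box_max)
    return float(np.linalg.norm(point - closest))


def touched_objects(hand_xyz: np.ndarray, scene: list[SceneObject],
                    contact_threshold_m: float = 0.03) -> list[str]:
    """Objects whose surface lies within a small contact threshold of the hand."""
    return [o.name for o in scene
            if distance_to_box(hand_xyz, o) <= contact_threshold_m]


def visible_objects(head_xyz: np.ndarray, gaze_dir: np.ndarray,
                    scene: list[SceneObject],
                    half_angle_deg: float = 30.0) -> list[str]:
    """Objects whose centre falls inside a crude view cone around the gaze."""
    gaze_dir = gaze_dir / np.linalg.norm(gaze_dir)
    cos_limit = np.cos(np.radians(half_angle_deg))
    in_view = []
    for o in scene:
        to_obj = (o.box_min + o.box_max) / 2 - head_xyz
        to_obj = to_obj / np.linalg.norm(to_obj)
        if float(np.dot(gaze_dir, to_obj)) >= cos_limit:
            in_view.append(o.name)
    return in_view


if __name__ == "__main__":
    # Toy scene with two objects at known coordinates.
    scene = [
        SceneObject("toy_bin", np.array([1.0, 0.0, 0.0]), np.array([1.4, 0.4, 0.3])),
        SceneObject("sofa", np.array([2.0, 0.0, 0.0]), np.array([3.5, 0.9, 0.8])),
    ]
    hand = np.array([1.05, 0.2, 0.15])
    head = np.array([1.0, 0.3, 0.5])
    gaze = np.array([1.0, 0.0, -0.3])
    print("touching:", touched_objects(hand, scene))
    print("in view cone:", visible_objects(head, gaze, scene))
```

    In practice, per-frame queries like these—run against human-verified pose estimates—would yield time series of locations visited, objects touched, and objects in view; the specific geometry tests above are only one simple way to operationalize those queries.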

    Metadata

    Item Type: Book Section
    School: Birkbeck Faculties and Schools > Faculty of Science > School of Psychological Sciences
    Research Centres and Institutes: Brain and Cognitive Development, Centre for (CBCD)
    Depositing User: Ori Ossmy
    Date Deposited: 17 Jun 2025 12:56
    Last Modified: 01 Sep 2025 00:30
    URI: https://eprints.bbk.ac.uk/id/eprint/55742

    Statistics

    Activity overview (6-month trend): 1 download, 62 hits.

    Additional statistics are available via IRStats2.
