BIROn - Birkbeck Institutional Research Online

    Infants’ developing environment: integration of computer vision and human annotation to quantify where infants go, what they touch, and what they see

    Han, D. and Aziere, N. and Wang, T. and Ossmy, O. and Krishna, A. and Wang, H. and Shen, R. and Todorovic, S. and Adolph, K. (2024) Infants’ developing environment: integration of computer vision and human annotation to quantify where infants go, what they touch, and what they see. In: 2024 IEEE International Conference on Development and Learning (ICDL). IEEE. ISBN 9798350348569.

    Han_IEEE_2024.pdf - Published Version of Record
    Restricted to Repository staff only

    Abstract

    Infants learn through interactions with the environment. Thus, to understand infants’ early learning experiences, it is critical to quantify their natural learning input—where infants go, what they touch, and what they see. Wearable sensors can record locomotor and hand movements, but cannot recover the context that prompted the behaviors. Egocentric views from head cameras and eye trackers require annotation to process the videos and miss much of the surrounding context. Third-person video captures infant behavior in the entire scene but may misrepresent the egocentric view. Moreover, third-person video requires machine or human annotation to make sense of the behaviors, and either method alone is sorely lacking. Computer vision is not sufficiently reliable to quantify much of infants’ complex, variable behavior, and human annotation cannot reliably quantify 3D coordinates of behavior without laborious hand digitization. Thus, we pioneered a new system of behavior detection from third-person video that capitalizes on the integrated power of computer vision and human annotation to quantify infants’ locomotor, manual, and egocentric visual interactions with the environment. Our system estimates a real infant’s interaction with a physical environment during free play by projecting a “virtual” infant in a “virtual” 3D environment with known coordinates of all furniture, objects, and surfaces. Our methods for using human-in-the-loop computer vision have broad applications for reliable quantification of locomotor, manual, and visual behaviors outside the purview of standard algorithms or human annotation alone.
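
    The record does not include the authors’ implementation, but the core idea described above—projecting estimated infant body positions into a “virtual” 3D scene whose furniture, object, and surface coordinates are known—can be illustrated with a minimal sketch. The object representation (axis-aligned bounding boxes), function names, and thresholds below are illustrative assumptions, not the paper’s method.

```python
# Hypothetical sketch: given estimated 3D positions for an infant's hand and head,
# and a "virtual" environment whose furniture/objects are stored as axis-aligned
# bounding boxes with known coordinates, report which objects the hand is touching
# and which objects fall inside a simple view cone around the gaze direction.
# All names, thresholds, and data structures here are illustrative assumptions.

from dataclasses import dataclass
import numpy as np


@dataclass
class SceneObject:
    name: str
    box_min: np.ndarray  # (3,) lower corner of bounding box, in metres
    box_max: np.ndarray  # (3,) upper corner


def distance_to_box(point: np.ndarray, obj: SceneObject) -> float:
    """Euclidean distance from a 3D point to the closest point on the box."""
    closest = np.clip(point, obj.box_min, obj.box_max)
    return float(np.linalg.norm(point - closest))


def touched_objects(hand_xyz: np.ndarray, scene: list[SceneObject],
                    contact_threshold_m: float = 0.03) -> list[str]:
    """Objects whose surface lies within a small contact threshold of the hand."""
    return [o.name for o in scene
            if distance_to_box(hand_xyz, o) <= contact_threshold_m]


def visible_objects(head_xyz: np.ndarray, gaze_dir: np.ndarray,
                    scene: list[SceneObject],
                    half_angle_deg: float = 30.0) -> list[str]:
    """Objects whose centre falls inside a crude view cone around the gaze."""
    gaze_dir = gaze_dir / np.linalg.norm(gaze_dir)
    cos_limit = np.cos(np.radians(half_angle_deg))
    in_view = []
    for o in scene:
        to_obj = (o.box_min + o.box_max) / 2 - head_xyz
        to_obj = to_obj / np.linalg.norm(to_obj)
        if float(np.dot(gaze_dir, to_obj)) >= cos_limit:
            in_view.append(o.name)
    return in_view


if __name__ == "__main__":
    # Toy scene with two objects at known coordinates.
    scene = [
        SceneObject("toy_bin", np.array([1.0, 0.0, 0.0]), np.array([1.4, 0.4, 0.3])),
        SceneObject("sofa", np.array([2.0, 0.0, 0.0]), np.array([3.5, 0.9, 0.8])),
    ]
    hand = np.array([1.05, 0.2, 0.15])
    head = np.array([1.0, 0.3, 0.5])
    gaze = np.array([1.0, 0.0, -0.3])
    print("touching:", touched_objects(hand, scene))
    print("in view cone:", visible_objects(head, gaze, scene))
```

    In practice, per-frame queries like these—run against human-verified pose estimates—would yield time series of locations visited, objects touched, and objects in view; the specific geometry tests above are only one simple way to operationalize those queries.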

    Metadata

    Item Type: Book Section
    School: Birkbeck Faculties and Schools > Faculty of Science > School of Psychological Sciences
    Research Centres and Institutes: Brain and Cognitive Development, Centre for (CBCD)
    Depositing User: Ori Ossmy
    Date Deposited: 17 Jun 2025 12:56
    Last Modified: 01 Sep 2025 00:30
    URI: https://eprints.bbk.ac.uk/id/eprint/55742

    Statistics

    Activity overview (6-month trend): 1 download, 62 hits.

    Additional statistics are available via IRStats2.
