Addressee

Addressee is a

What BabyHuBERT is not

  • Not the voice type classifier: it is a general-purpose model that extracts richer representations from long-form recordings. To extract voice type segments, refer to VTC2.0.

What BabyHuBERT is

TL;DR

BabyHuBERT is a model trained on 13,000 hours of adult and child speech from child-centered long-form audio recordings across 40 languages. It uses the HuBERT training recipe and its base architecture.
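To give a feel for what "representations from long-form recordings" means in practice, the sketch below estimates how many feature frames a HuBERT-base style encoder emits for a given amount of audio. It assumes the standard HuBERT-base convolutional front end (seven layers with the kernel sizes and strides listed in the code) and 16 kHz input; this page does not confirm that BabyHuBERT keeps these exact values, so treat them as assumptions. The names `CONV_LAYERS`, `SAMPLE_RATE`, `num_frames`, and `frames_for_duration` are illustrative only.

```python
# Assumed HuBERT-base convolutional front end: (kernel size, stride) per layer.
# These values come from the original HuBERT base configuration, not from this page.
CONV_LAYERS = [(10, 5), (3, 2), (3, 2), (3, 2), (3, 2), (2, 2), (2, 2)]
SAMPLE_RATE = 16_000  # Hz, assumed input sampling rate


def num_frames(num_samples: int) -> int:
    """Return the number of output frames for a waveform of num_samples samples."""
    length = num_samples
    for kernel, stride in CONV_LAYERS:
        if length < kernel:
            # The input is too short to pass through this layer.
            return 0
        # Output length of an unpadded 1-D convolution.
        length = (length - kernel) // stride + 1
    return length


def frames_for_duration(seconds: float) -> int:
    """Convenience wrapper: number of frames for a clip lasting `seconds`."""
    return num_frames(int(round(seconds * SAMPLE_RATE)))


if __name__ == "__main__":
    for seconds in (1, 30, 3600):
        print(f"{seconds:>5} s of audio -> {frames_for_duration(seconds)} frames")
```

Under these assumptions, one second of audio yields 49 frames, that is, roughly one frame every 20 ms, which is why hour-long recordings are usually processed in chunks.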

Description

Addressee is a

Advanced description

Addressee is a

Ethics statement

Ethics surrounding BabyHuBERT data

For more information on the ethics of the BabyHuBERT data, please refer to the BabyHuBERT page.

How to access model

  1. Read the License
  2. Go to the BabyHuBERT repo: coml/BabyHuBERT
  3. Fill in the required fields and accept the license
  4. Download the model
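The download step can also be scripted. The sketch below assumes that `coml/BabyHuBERT` is a gated repository on the Hugging Face Hub and that the `huggingface_hub` package is installed; neither is stated on this page. The helper `download_babyhubert` and its parameters are hypothetical names introduced for illustration. The license must still be accepted on the repository page first, otherwise the request is refused.

```python
from typing import Optional

# Repository name as given in the steps above.
# Assumption: it is hosted on the Hugging Face Hub.
REPO_ID = "coml/BabyHuBERT"


def download_babyhubert(local_dir: str = "BabyHuBERT", token: Optional[str] = None) -> str:
    """Download every file of the repository and return the local folder path.

    `token` is a personal access token for an account that has already
    accepted the license; if None, the token cached by a previous login
    is used.
    """
    # Imported lazily so this module can be loaded without the dependency.
    from huggingface_hub import snapshot_download

    return snapshot_download(repo_id=REPO_ID, local_dir=local_dir, token=token)
```

Calling `download_babyhubert()` after access has been granted places the files in a local `BabyHuBERT` folder and returns its path.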

Derived models

Model      Task
VTC2.0     Voice Type Classification
Addressee  Addressee classification
BabAR      Phoneme recognition

How to cite

@misc{charlot2026babyhubertmultilingualselfsupervisedlearning,
    title={BabyHuBERT: Multilingual Self-Supervised Learning for Segmenting Speakers in Child-Centered Long-Form Recordings}, 
    author={Théo Charlot and Tarek Kunze and Maxime Poli and Alejandrina Cristia and Emmanuel Dupoux and Marvin Lavechin},
    year={2026},
    eprint={2509.15001},
    archivePrefix={arXiv},
    primaryClass={eess.AS},
    url={https://arxiv.org/abs/2509.15001}, 
}