Addressee¶

Addressee is a

What BabyHuBERT is not¶

BabyHuBERT is not the voice type classifier. It is a general-purpose model that extracts richer representations from long-form recordings. To extract voice type segments, refer to VTC2.0.

What BabyHuBERT is¶

TL;DR

BabyHuBERT is a model trained on 13,000 hours of adult and child speech from child-centered long-form audio recordings across 40 languages, using the HuBERT training recipe and its base architecture.
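
To give a feel for what "representations from long-form recordings" means in terms of sequence length, the sketch below computes how many frames a HuBERT-base-style encoder would emit for a waveform. It assumes that BabyHuBERT keeps the standard HuBERT base convolutional front end (seven unpadded 1-D convolutions with the kernel sizes and strides listed) and takes 16 kHz audio as input; neither detail is stated on this page, and the names `CONV_LAYERS`, `SAMPLE_RATE`, `num_frames` and `hop_seconds` are illustrative only.

```python
# Convolutional front end of the standard HuBERT base configuration,
# as (kernel_size, stride) pairs. ASSUMPTION: unchanged in BabyHuBERT.
CONV_LAYERS = [(10, 5), (3, 2), (3, 2), (3, 2), (3, 2), (2, 2), (2, 2)]
SAMPLE_RATE = 16_000  # Hz. ASSUMPTION: input sampling rate.


def num_frames(num_samples: int) -> int:
    """Number of representation frames for a waveform of `num_samples` samples."""
    length = num_samples
    for kernel, stride in CONV_LAYERS:
        if length < kernel:
            return 0  # waveform too short to pass through this layer
        # Output length of an unpadded 1-D convolution.
        length = (length - kernel) // stride + 1
    return length


def hop_seconds() -> float:
    """Time step between consecutive frames (product of strides / sample rate)."""
    hop = 1
    for _, stride in CONV_LAYERS:
        hop *= stride
    return hop / SAMPLE_RATE


print(num_frames(SAMPLE_RATE))  # 49 frames for one second of audio
print(hop_seconds())            # 0.02, i.e. one frame every 20 ms
```

Under these assumptions, a day-long recording yields millions of frames, so long-form audio would typically be processed in chunks rather than in a single forward pass.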

Description¶

Addressee is a

Advanced description¶

Addressee is a

Ethics statement¶

Ethics surrounding BabyHuBERT data

To learn more about BabyHuBERT ethics, please refer to the BabyHuBERT page.

How to access model¶

Read the License
Go to the BabyHuBERT repo: coml/BabyHuBERT
Fill in the required fields and accept the license
Download the model
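
Once the license has been accepted, the download step could be scripted roughly as follows. This is a minimal sketch that assumes the `coml/BabyHuBERT` repo is a gated repository on the Hugging Face Hub and that an access token is available in the `HF_TOKEN` environment variable; the page does not confirm either point. The helper names `resolve_token` and `download_babyhubert` and the default `local_dir` are hypothetical.

```python
import os
from pathlib import Path

# Repo name as given in the steps above.
# ASSUMPTION: it is hosted on the Hugging Face Hub as a gated repository.
REPO_ID = "coml/BabyHuBERT"


def resolve_token(env_var: str = "HF_TOKEN"):
    """Return the access token stored in `env_var`, or None if it is unset or blank."""
    token = os.environ.get(env_var, "").strip()
    return token or None


def download_babyhubert(local_dir: str = "models/BabyHuBERT", token=None) -> Path:
    """Download all files of the repo into `local_dir` and return that folder.

    Expected to fail with an authorization error if the license has not
    been accepted for the account that owns the token.
    """
    # Imported lazily so this module loads even without huggingface_hub installed.
    from huggingface_hub import snapshot_download

    path = snapshot_download(
        repo_id=REPO_ID,
        local_dir=local_dir,
        token=token or resolve_token(),
    )
    return Path(path)
```

Calling `download_babyhubert()` from an environment where `huggingface_hub` is installed and `HF_TOKEN` is set should return the local folder containing the model files. The file layout inside that folder is not described here, so any loading code would need to be adapted to what the repo actually ships.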

Derived models¶

Model	Task
VTC2.0	Voice Type Classification
Addressee	Addressee classification
BabAR	Phoneme recognition

How to cite¶

@misc{charlot2026babyhubertmultilingualselfsupervisedlearning,
    title={BabyHuBERT: Multilingual Self-Supervised Learning for Segmenting Speakers in Child-Centered Long-Form Recordings}, 
    author={Théo Charlot and Tarek Kunze and Maxime Poli and Alejandrina Cristia and Emmanuel Dupoux and Marvin Lavechin},
    year={2026},
    eprint={2509.15001},
    archivePrefix={arXiv},
    primaryClass={eess.AS},
    url={https://arxiv.org/abs/2509.15001}, 
}