Addressee¶
Addressee is a
What BabyHuBERT is not¶
- Not the voice type classifier: it is a general-purpose model that extracts richer representations from long-form recordings. To extract voice type segments, refer to VTC2.0.
What BabyHuBERT is¶
TL;DR
BabyHuBERT is a model trained on 13,000 hours of adult and child speech from child-centered long-form audio recordings across 40 languages, using the HuBERT training recipe and its base architecture.
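Because the model reuses the HuBERT base architecture, the length of the representation sequence it produces for a recording can be estimated ahead of time. The sketch below assumes the standard HuBERT base convolutional front end (seven layers with the kernel sizes and strides listed in the code) and 16 kHz input; neither value is stated on this page, and `num_frames` is a hypothetical helper name, so check both assumptions against the released configuration.

```python
# Assumption: standard HuBERT base convolutional front end, unchanged in BabyHuBERT.
CONV_KERNELS = (10, 3, 3, 3, 3, 2, 2)
CONV_STRIDES = (5, 2, 2, 2, 2, 2, 2)
# Assumption: audio is sampled at 16 kHz, as in the original HuBERT setup.
SAMPLE_RATE = 16_000


def num_frames(num_samples: int) -> int:
    """Estimate how many feature frames the encoder emits for `num_samples` audio samples."""
    length = num_samples
    for kernel, stride in zip(CONV_KERNELS, CONV_STRIDES):
        if length < kernel:
            # Input is shorter than the receptive field of this layer: no frame is produced.
            return 0
        # Output length of an unpadded 1-D convolution.
        length = (length - kernel) // stride + 1
    return length


one_second = num_frames(1 * SAMPLE_RATE)    # 49 frames, roughly one frame every 20 ms
ten_seconds = num_frames(10 * SAMPLE_RATE)  # 499 frames
```

Under these assumptions, one hour of long-form audio corresponds to roughly 180,000 frames, which is why such recordings are usually processed in chunks.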
Description¶
Addressee is a
Advanced description¶
Addressee is a
Ethics statement¶
Ethics surrounding BabyHuBERT data
To learn more about the ethics of the BabyHuBERT data, please refer to the BabyHuBERT page.
How to access the model¶
- Read the license
- Go to the BabyHuBERT repo: coml/BabyHuBERT
- Fill in the required fields and accept the license
- Download the model
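Once the license has been accepted, the download step can also be scripted. The following is a minimal sketch that assumes `coml/BabyHuBERT` is a gated repository on the Hugging Face Hub (this page does not name the hosting platform) and that an access token is available in the `HF_TOKEN` environment variable. `build_download_kwargs` and `download_babyhubert` are hypothetical helper names; `snapshot_download` is the `huggingface_hub` function for fetching a whole repository.

```python
import os
from typing import Optional

# Repo name taken from the steps above; hosting on the Hugging Face Hub is an assumption.
REPO_ID = "coml/BabyHuBERT"


def build_download_kwargs(token: Optional[str] = None,
                          local_dir: Optional[str] = None) -> dict:
    """Assemble keyword arguments for huggingface_hub.snapshot_download."""
    kwargs = {"repo_id": REPO_ID}
    # Gated repos require a token from an account that has accepted the license.
    token = token or os.environ.get("HF_TOKEN")
    if token:
        kwargs["token"] = token
    # Optional: copy the files to a specific folder instead of the default cache.
    if local_dir:
        kwargs["local_dir"] = local_dir
    return kwargs


def download_babyhubert(token: Optional[str] = None,
                        local_dir: Optional[str] = None) -> str:
    """Download the repository and return the local path of the snapshot."""
    # Imported lazily so the helper above works without the dependency installed.
    from huggingface_hub import snapshot_download

    return snapshot_download(**build_download_kwargs(token, local_dir))

# Example (needs network access and an accepted license):
# path = download_babyhubert(local_dir="models/babyhubert")
```

If the token belongs to an account that has not accepted the license, the download is expected to fail with an authorization error, so complete the steps above first.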
Derived models¶
| Model | Task |
|---|---|
| VTC2.0 | Voice Type Classification |
| Addressee | Addressee classification |
| BabAR | Phoneme recognition |
How to cite¶
@misc{charlot2026babyhubertmultilingualselfsupervisedlearning,
title={BabyHuBERT: Multilingual Self-Supervised Learning for Segmenting Speakers in Child-Centered Long-Form Recordings},
author={Théo Charlot and Tarek Kunze and Maxime Poli and Alejandrina Cristia and Emmanuel Dupoux and Marvin Lavechin},
year={2026},
eprint={2509.15001},
archivePrefix={arXiv},
primaryClass={eess.AS},
url={https://arxiv.org/abs/2509.15001},
}