Pantagruel: Unified Self-Supervised Encoders for French Text and Speech

arXiv:2601.05911v2

We release Pantagruel, a new family of self-supervised encoder models for French text and speech. Instead of predicting modality-tailored targets such as textual tokens or speech units, Pantagruel learns contextualized target representations in the feature space, allowing modality-specific encoders to capture linguistic and acoustic regularities effectively.
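
To make the idea of predicting targets in feature space concrete, the following is a minimal sketch of one common way such objectives are set up: a student encoder regresses, at masked positions, onto feature vectors produced by a slowly updated teacher copy. The abstract does not describe the training recipe, so the teacher-student arrangement, the exponential moving average update, the mean squared error loss, and every name here (`ema_update`, `masked_feature_loss`, `decay`) are assumptions made for illustration, not the authors' implementation.

```python
# Illustrative sketch only. ASSUMPTION: a teacher-student setup with an EMA
# teacher and an MSE regression loss; the abstract does not specify any of this.
from typing import List

Vector = List[float]


def ema_update(teacher: Vector, student: Vector, decay: float) -> Vector:
    """Move teacher weights toward student weights (exponential moving average)."""
    return [decay * t + (1.0 - decay) * s for t, s in zip(teacher, student)]


def masked_feature_loss(
    predictions: List[Vector], targets: List[Vector], mask: List[bool]
) -> float:
    """Mean squared error between predicted and target feature vectors,
    averaged over masked positions only."""
    total, count = 0.0, 0
    for pred, tgt, is_masked in zip(predictions, targets, mask):
        if not is_masked:
            continue  # unmasked positions do not contribute to the loss
        total += sum((p - t) ** 2 for p, t in zip(pred, tgt)) / len(pred)
        count += 1
    return total / count if count else 0.0


# Toy example: three positions (tokens or speech frames), two feature dimensions.
targets = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]      # hypothetical teacher features
predictions = [[1.0, 0.0], [0.0, 0.0], [0.0, 0.0]]  # hypothetical student outputs
mask = [False, True, True]                          # positions hidden from the student

loss = masked_feature_loss(predictions, targets, mask)
new_teacher = ema_update([1.0, 2.0], [3.0, 4.0], decay=0.5)
```

Because the loss is defined on feature vectors rather than on a token vocabulary or a set of discrete speech units, the same function applies unchanged to a text encoder and a speech encoder; only the encoders that produce `predictions` and `targets` differ.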