[D] Offering licensed Indian language speech datasets (with explicit contributor consent)

Hi everyone,

I run a small data initiative where we collect speech datasets in multiple Indian languages directly from contributors who provide explicit consent for their recordings to be used and licensed.

We can provide datasets with either exclusive or non-exclusive rights depending on the use case. The goal is to make ethically sourced speech data available for teams working on ASR, TTS, voice AI, or related research.

If you are working on speech models and need Indian language audio data, feel free to reach out. I'm happy to share more details about the datasets and the collection process.

— Divyam
Founder, DataCatalyst
datacatalyst.in

submitted by /u/Trick-Praline6688