

My friend has volunteered to record the dataset. Why don't we use the Common Voice sentences? There are nearly 7000 sentences in Farsi, which have been reviewed and contain no numbers. I just opened a pull request to add Farsi to the num2words library. This can be done by multiple people in parallel at least, but it's an important step.
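
For anyone who wants to double-check that a sentence list really has no numbers in it, here is a minimal sketch. The function names are made up for illustration, and it only looks for digit characters, so numbers already written out as words are not detected.

```python
import re

# In Python 3, \d on a str pattern matches any Unicode decimal digit,
# so this covers ASCII (0-9), Persian and Arabic-Indic digits alike.
DIGIT = re.compile(r"\d")


def has_digits(sentence: str) -> bool:
    """Return True if the sentence contains at least one digit character."""
    return DIGIT.search(sentence) is not None


def split_by_digits(sentences):
    """Split sentences into (digit-free, needs number-to-word conversion)."""
    clean = [s for s in sentences if not has_digits(s)]
    needs_conversion = [s for s in sentences if has_digits(s)]
    return clean, needs_conversion
```

Sentences in the second list are the ones that would need something like num2words before they can be phonemized.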

Sorry I haven't been very responsive; I should have some more time now during the holidays. I've made some progress, but I need help now (see below).

I was able to find some Farsi speech data: I contacted the author of the MirasVoice corpus and got the full set. Unfortunately, it doesn't have enough data from a single speaker. I might be able to use it in the future for Farsi speech to text in Rhasspy, but it'll need a lot of pre-processing.

So I will need to collect recordings from a volunteer. But first, I need to develop a set of sentences that have good phoneme coverage. I've already added Farsi phonemes to my gruut-ipa library, and I've located a large corpus of sentences in OSCAR.

I use the num2words library to convert digits into words (1 -> one), and it doesn't support Farsi yet. Would either of you be able to help me add support? Once I can convert numbers to words, I can filter the OSCAR Farsi sentences and find a small set of sentences (usually < 2000) that will provide good phoneme pair examples. After that, we'll need to find a volunteer with a good microphone and a lot of patience. With this dataset, I'll be happy to train models for both MozillaTTS and my Larynx fork.

EDIT: Forgot to add one more step: filtering sentences. I usually start with a set of 2000-5000 phonetically rich sentences, and then ask volunteers to help filter out ones that don't make sense, are offensive somehow, or are something that a real native speaker would never say.
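
To make the "good phoneme pair examples" step concrete, here is a rough sketch of one way to pick a small sentence set from a large corpus. It is a plain greedy selection, not necessarily what my actual scripts do. It assumes every candidate sentence has already been converted to a list of phonemes by some phonemizer, and the function names are made up for this example.

```python
from typing import Dict, List, Set, Tuple

Pair = Tuple[str, str]


def phoneme_pairs(phonemes: List[str]) -> Set[Pair]:
    """Return the set of adjacent phoneme pairs in one sentence."""
    return set(zip(phonemes, phonemes[1:]))


def select_sentences(
    candidates: Dict[str, List[str]], max_sentences: int = 2000
) -> List[str]:
    """Greedily pick the sentences that add the most uncovered phoneme pairs.

    `candidates` maps sentence text to its phoneme sequence.
    """
    remaining = {text: phoneme_pairs(ph) for text, ph in candidates.items()}
    covered: Set[Pair] = set()
    chosen: List[str] = []

    while remaining and len(chosen) < max_sentences:
        # The sentence contributing the largest number of new pairs wins.
        best = max(remaining, key=lambda text: len(remaining[text] - covered))
        if not remaining[best] - covered:
            break  # no remaining sentence adds anything new
        covered |= remaining.pop(best)
        chosen.append(best)

    return chosen
```

The result is ordered by how much new coverage each sentence contributed, so the list can be cut off at any length. Volunteers would still need to review the selected sentences by hand afterwards.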
