similar problems with /ʒ/ ("genre" comes out as ?ehnuh where <?> is a consonant described as a "voiced alveolar fricative stop" with a hint of velar thrown in). probably because these sounds combine be less than 1% of all sounds and might not be present more than a handful of times in the training set. I might have to think about partitioning differently or augmenting the data set to even out the distribution
@aparrish I tried to make that noise but it was a challenge.
Follow friends and discover new ones. Publish anything you want: links, pictures, text, video. This server is run by the main developers of the Mastodon project. Everyone is welcome as long as you follow our code of conduct!