After developing the individual voices for each of the characters, we thought about how to bring them to life, beyond the animations with idle behavior that we already had. We wanted to ensure our characters were lively, engaging study buddies for our learners!

First: That's a lot of mouth movements to animate!

We teach more than 40 languages across more than 100 courses, each containing thousands of sentences and lessons, so manually animating the lip movements for our ten World Characters was completely out of the question. We needed something scalable enough to account for any possible combination of mouth shapes for each character to correspond with the sounds, while keeping the file size small enough to run on Android, iOS, and the Web. Plus, we wanted to ensure that our animation quality was not compromised in the process! We thought the answer might lie in an alternative to a game engine: something that could help us take a limited number of assets and turn them into a virtually unlimited number of combinations. This is how we learned about Rive!

What is Rive?

Rive is a web-based tool for making real-time interactive animations and designs, similar to a game engine. It seemed to solve so many of our problems: the file sizes were compact and plugged in neatly with Duolingo's app architecture, and the handoff from animator to engineer was seamless. But what stood out to us was Rive's State Machine: a visual representation of the logic that connects the animations ("states") together. It allowed us to programmatically control which animation states are called, how they are called, and how they transition and blend together. The State Machine's powerful system is what allowed this project to be feasible on a grand scale. We knew Rive was the right tool to bring lip syncing to life!

A peek inside Rive!

The magic of speech technology

To make the mouth movements, we need to know what is happening in the speech in fine detail. When we built the voices for text-to-speech, the solution we used didn't give us pronunciations and timing for what was being said, but we have a rich speech technology ecosystem that we set up for language learning. To create accurate animations, we generate the speech, run it through our in-house speech recognition and pronunciation models, and get the timing for each word and phoneme (speech sound). Each sound is mapped onto a visual representation, or viseme, from a set we designed based on linguistic features. With these capabilities, we built a factory to generate all the viseme timings we needed for our course content. Of course, we also needed to make sure the output was correct and got to the right place for millions of users, so we built up the tools and processes not only to generate the material, but to audit and correct it when we need to.
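The mapping step, taking timed phonemes from the recognition models and turning them into timed visemes, can be sketched roughly as below. This is a minimal illustration under stated assumptions: the phoneme labels, the viseme names, and the mapping itself are all made up for the example; Duolingo's actual viseme set and its linguistically motivated mapping are not described in detail here.

```python
# Hypothetical phoneme-to-viseme table, grouped by mouth shape.
# These labels and groupings are illustrative, not Duolingo's real set.
PHONEME_TO_VISEME = {
    "p": "M", "b": "M", "m": "M",   # lips pressed together
    "f": "F", "v": "F",             # lower lip against teeth
    "ae": "A", "aa": "A",           # open mouth
    "iy": "E", "ih": "E",           # spread lips
    "uw": "O", "ow": "O",           # rounded lips
}

def phonemes_to_visemes(phoneme_timings):
    """Convert (phoneme, start, end) tuples into viseme events,
    merging consecutive phonemes that share the same mouth shape."""
    events = []
    for phoneme, start, end in phoneme_timings:
        viseme = PHONEME_TO_VISEME.get(phoneme, "REST")  # default: neutral mouth
        if events and events[-1][0] == viseme:
            # Same mouth shape as the previous event: extend it
            # instead of emitting a duplicate keyframe.
            prev_viseme, prev_start, _ = events[-1]
            events[-1] = (prev_viseme, prev_start, end)
        else:
            events.append((viseme, start, end))
    return events

# Example: phoneme timings (in seconds) for the word "map".
timings = [("m", 0.00, 0.08), ("ae", 0.08, 0.22), ("p", 0.22, 0.30)]
print(phonemes_to_visemes(timings))
# → [('M', 0.0, 0.08), ('A', 0.08, 0.22), ('M', 0.22, 0.3)]
```

Merging adjacent phonemes that share a viseme keeps the animation timeline sparse, which matters when the same event stream has to drive mouth shapes on Android, iOS, and the Web.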