DeepVocal Voicebank Creation Tutorial - Information
So you want to make a DeepVocal voicebank. Lovely! This guide is specifically for Japanese voicebanks, though a lot of the information here can most likely be used for other languages. I made this tutorial because the documentation that I could find online was significantly lacking.
Some things to say first:
I will sometimes refer to DeepVocal (both the application and the technology) as DV, and to DeepVocal ToolBox as DV TB/DVTB.
DeepVocal is a discontinued product. The development team disappeared (I don’t really know why), and there are no more updates. If you need help, the first thing to do is (of course) to look it up on the internet, but you might not find anything helpful. I don’t know if there are any DeepVocal-specific Discord servers, but I do know that a lot of servers (for example, Idoloid) have DV-specific channels. There is an official PDF manual here (hosted on Google Drive), but I don’t find it especially helpful since it doesn’t have that much information. Additionally, the DeepVocal homepage is here (it doesn’t have much information, just the downloads).
Using DeepVocal and DeepVocal ToolBox is a challenge. Sometimes it feels like the software is fighting you, sometimes it doesn’t. It is a very awkward process, but I find it very rewarding when you finally hear your voicebank. Both DeepVocal and DeepVocal ToolBox only run on Windows.
Also, please note that this tutorial assumes that you have (very basic) knowledge of how UTAU voicebanks are made. If you are confused by a term, please use Salem Wasteland’s UTAU Vocabulary page. This tutorial may also have some typos.
This is very important. If you find any errors or need help, please email me (kouga-p@proton.me) so that I can try to help you and then update this tutorial accordingly. I want this to be as followable as possible, and you telling me if there are inaccuracies is incredibly helpful!
Next Step: Install Programs