Could anyone on the Vital team (or power users in the know) provide some insight into the encoding methods used for the samples and wave_data values in the presets?
I am building an AI model to design sounds based on user input, but am blocked at this step. Once I know the encoding method, I can build a data pipeline that lets users describe the sounds they want in natural language and have them generated as Vital presets. Any help would be appreciated!
I’ve been trying to figure this out for a while too, with no luck. It’s probably somewhere in the source code, but I’m not C++ savvy enough to find it, and Vital has no documentation at all.
I’m guessing it’s an audio file stripped of all the extra metadata, just the raw audio content, but I haven’t put in the effort of figuring out how to extract that into a usable format.
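If that guess is right, a quick way to test it would be something like the sketch below. This assumes the `wave_data` string is base64-wrapped raw little-endian 32-bit float samples; that's purely a hypothesis on my part, not something I've confirmed against Vital's source. The preset field path in the comment is also just a guess based on poking at the JSON.

```python
import base64
import struct

def decode_wave_data(b64_text):
    """Decode a base64 string into a list of floats, ASSUMING the payload
    is raw little-endian float32 samples with no header (unconfirmed)."""
    raw = base64.b64decode(b64_text)
    count = len(raw) // 4  # 4 bytes per float32
    return list(struct.unpack("<%df" % count, raw[:count * 4]))

# Hypothetical usage -- .vital files are JSON, so you could try pulling a
# wave_data field out and decoding it (exact path may differ per preset):
#
#   import json
#   preset = json.load(open("some_preset.vital"))
#   samples = decode_wave_data(some_wave_data_string)
#
# If the decoded values mostly sit in [-1.0, 1.0] and plot like a
# waveform, the guess is probably right; garbage values would mean a
# different sample format (e.g. int16, or compressed data).
```

A sanity check: if the guess holds, re-encoding known floats and decoding them should round-trip exactly, since 32-bit floats survive base64 untouched.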
Good luck in your search!