Hey, @sunsnail, I’m not a Surge user, but I see what you’re dealing with. (Before I launch into being cranky, thank you for your patronage.)
My wavetable .wav files are 32-bit integer .wav files. Pretty much every other commercial wavetable synth accepts them just fine. I see that Surge is expecting 32-bit floating-point files. This isn’t really a ME problem, but a THEM problem, if you know what I mean. If you’re serious about wavetable synthesis, I’d encourage you to get something more professional. But I digress…
Anyway, here’s a test. Go get the zip file here and let me know if you can import it properly:
Let me know what you find.
I have, in fact, converted the entire KRC Mathwaves product (as it exists right now) to a floating point version, but it’s 14 Gigabytes spread across 84 files. Try the example file linked above and let me know your results.
Do the files inside work? I really want you to be satisfied as a user, but this is a format that seems kind of wonky. If there are any other “Surge” users here, speak up if you’d like official support for this.
@keith Thank you so much for taking the time to help and potentially offer a solution! I know that this is more than I could expect from most.
Surge is a synth that I really enjoy using - it’s FOSS, which is something I appreciate in a piece of software, and I have enjoyed the community. It is unfortunate that your files do not work natively! (Which is no fault of yours!)
I have downloaded the provided zip and I am attempting to import ‘RAIM 00010’. And… it works!.. kind of.
I am able to import the file into Surge as you would expect. However, the morph function does not change smoothly between the ‘frames’ of the wavetable and instead jumps abruptly. The file also just plays back as a one-shot - from the documentation, I believe this means there are no loop points present in the file. I was reading about there needing to be correct tagging in the wav file itself, here:
It looks like this is all possible when starting from single-cycle waveforms, which you do have available in the collection files. Or, alternatively, by using the provided GitHub Python script to tag the files with the frame size so that Surge can interpret them correctly.
I would be happy to try and do this myself - but I would gladly welcome your help and computer-science background in figuring out how to batch this out more efficiently!
I know Surge is not the most popular synth, but I do believe drag-and-drop compatibility would be a great addition to your product portfolio.
I was able to use the script add-surge-metadata.py from the Surge GitHub to modify one of the RAIM wavetables that you provided. This allows Surge to interpolate through the table - however, the wavetable does not seem to be correct. I need to do some more digging to see why. It looks like it is just repeating two halves of the first ‘frame’ over and over again. If they were to be considered A and B:
Frame 1 - A
Frame 2 - B
Frame 3 - A
Frame 4 - B
. . .
Last Frame - C
Here is the resulting wavetable
This is awesome! However, now I have to ask, how would I go about creating a Python script that will loop through all files in a directory and apply the metadata script to them? I believe that I will have to make use of the os and os.path modules.
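Roughly, I’m picturing something like this (an untested sketch, assuming the tagging script can simply be invoked once per file with the same arguments I used on the single RAIM table):

```python
import os
import subprocess
import sys

# Hypothetical batch wrapper: walk a directory tree and run the Surge tagging
# script on every .wav file found. Adjust SCRIPT and its arguments to match
# however you invoke add-surge-metadata.py on a single file.
SCRIPT = "add-surge-metadata.py"
FRAME_SIZE = "2048"  # samples per frame (assumption based on the RAIM tables)

def tag_directory(root):
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if name.lower().endswith(".wav"):
                path = os.path.join(dirpath, name)
                print("tagging", path)
                subprocess.run(
                    [sys.executable, SCRIPT, "-s", FRAME_SIZE, "-i", path],
                    check=True,
                )

if __name__ == "__main__":
    tag_directory(sys.argv[1] if len(sys.argv) > 1 else ".")
```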
Hey @sunsnail, thanks for the pointers to the documentation. That’s a little bit whack how they’ve implemented wavetables, but good to see that you’re able to run their Python scripts!
Scripting was going to be my suggestion for how to best do this. I can create a script that can process an entire folder hierarchy of files, converting them to Surge’s funky format. (Might not be today as I’ll be wrestling with a Thanksgiving turkey, but soon!)
Here is the approach that I have been able to get to work. This creates a .wav wavetable that will import into Surge, Vital, Serum, and Modwave and play back identically (as closely as the DSP coding allows):
1. Start from your single-cycle waveforms (SCWF) and place the ones that you will use to create the final wavetable in their own directory.
1a. Using a terminal, I confirmed with soxi *.wav that all SCWF in the directory were the same length (2048 samples).
2. In the terminal, append the SCWF together with sox to create one larger file.
2a. For example: sox w_1.wav w_2.wav w_3.wav final_w.wav
2b. This creates the file final_w.wav in the same directory.
3. Run the script normalize-i16-to-i15.py -n half final_w.wav final_w_norm.wav to normalize the wavetable and prevent clipping in Surge (possibly others?).
4. Run add-surge-metadata.py -s 2048 -i final_w_norm.wav to produce the final wavetable that can be loaded into Surge and others with proper playback and interpolation!
Curiously, this resulting table will not work in Modwave.
I don’t know if it is preferable to have a .wav as the final table, or to just simplify the process (as can be done with one script from the Surge GitHub) and create a .wt as the final format.
That’s where I am at, at least! Looks like it is possible and indeed quite scriptable to create tables that will behave themselves in Surge.
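Putting those steps together, something like this could batch the whole thing (an untested sketch; it assumes sox, normalize-i16-to-i15.py, and add-surge-metadata.py are all reachable from the working directory and behave exactly as in the manual steps above):

```python
import glob
import os
import subprocess
import sys

FRAME_SIZE = 2048  # samples per single-cycle waveform, as verified with soxi

def build_wavetable(scwf_dir, out_name="final_w.wav"):
    """Concatenate the SCWFs in scwf_dir, normalize, and tag for Surge."""
    waves = sorted(
        p for p in glob.glob(os.path.join(scwf_dir, "*.wav"))
        if not os.path.basename(p).startswith("final_w")  # skip previous output
    )
    if not waves:
        raise SystemExit("no .wav files found in " + scwf_dir)

    # Step 2: append the single-cycle waveforms into one file with sox.
    combined = os.path.join(scwf_dir, out_name)
    subprocess.run(["sox"] + waves + [combined], check=True)

    # Step 3: normalize to prevent clipping in Surge.
    normalized = combined.replace(".wav", "_norm.wav")
    subprocess.run(
        [sys.executable, "normalize-i16-to-i15.py", "-n", "half", combined, normalized],
        check=True,
    )

    # Step 4: add the Surge metadata so the frame size is known.
    subprocess.run(
        [sys.executable, "add-surge-metadata.py", "-s", str(FRAME_SIZE), "-i", normalized],
        check=True,
    )
    return normalized

if __name__ == "__main__":
    print("wrote", build_wavetable(sys.argv[1]))
```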
You should def license that AI tech to a synth company like Kilohearts or Arturia at some point if it’s ready for big time stuff. It would be cool in Vital too, but getting Tytel is a hard thing to do and that would prolly take like ten years to implement. It only costs the time to send one email to try though.
Heck, why not have a mathematical wt generator in a synth at the same time? Dunno why synth makers don’t do that atm.
Yeah, while my particular models aren’t designed for “real-time” waveform generation (that is, I haven’t attempted to adapt them to work as an “oscillator”), one could do that. That goal was the premise of a research paper from a couple years ago, so people have had that idea for a while.
While it takes a large amount of memory to train my models (I’ve been training them in Google Colab Pro on an A100 GPU and they utilize nearly all available system and GPU RAM), VAEs are a form of compression in a sense. So, while the training data is around 500,000 32-bit, 2048-sample waveforms, along with a similar number of bytes for their frequency-domain representations (FFTs), the resulting model weights are somewhere between just 60 and 90 megabytes.
And though it’s handy (and time efficient) to train on a giant GPU, sampling from the model does happen faster than real-time on a modern CPU and probably wouldn’t take much longer on, say, a compute unit like the Raspberry Pi (as found in the Korg modwave).
So this sort of technology will surely be used in software or hardware synths at some point.
@keith I have some questions about the tables in Polygonal Series Variations.
It appears that all tables in the same ‘category’ and of the same ‘offset’ in the directory - except for the random_offset_polygonal_polygon category - are identical to one another. I noticed this by ear as I was auditioning through them, and took the files into Audacity to check it out.
Sure enough, if you take two files from the same category and offset - differing only in polygon ‘number’ - and invert one of them, you will achieve a perfect null.
For example:
negative_polygonal_polygon_6_offset_3_1 and negative_polygonal_polygon_3_offset_3_1 are identical. This would hold true for any *offset_3_1 in the negative polygonal category.
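(For anyone who wants to check without opening Audacity: assuming the numpy and soundfile packages are installed, a quick diff shows the same thing.)

```python
import numpy as np
import soundfile as sf

# Load both wavetables (assuming the .wav extension) and compare sample-for-sample.
a, _ = sf.read("negative_polygonal_polygon_6_offset_3_1.wav")
b, _ = sf.read("negative_polygonal_polygon_3_offset_3_1.wav")

n = min(len(a), len(b))  # guard in case the files differ in length
print("max absolute difference:", np.max(np.abs(a[:n] - b[:n])))
# 0.0 (or something vanishingly small) means they null perfectly when one
# is inverted, i.e. the two files are duplicates.
```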
Is this by design or is this something gone awry in computation? Not nitpicking - just curious!
Yeah, @sunsnail, there are some duplicates in the polygonal series. There was an error in the original generation, and so there’s at least one duplicate set of zero-offset polygonal waveforms. In fact, one set is (I believe) 16-bit and the other is 32-bit. If you’re able to cancel them by phase inversion, that just shows you that the diff between 32-bit and 16-bit audio isn’t relevant at these sorts of scales.
(Note: all the wavetables are 32-bit wav files as this is what modwave expects, but the source waveforms were actually computed once as 16-bit and then again as 32-bit. They don’t sound audibly different, as I believe you’ve discovered.)
Other than that, there are very few duplicate waveforms and likely zero other duplicate wavetables, unless I accidentally copied some folder to an incorrect location. There are a small number of fully zero or otherwise DC-line waveforms, as some of the functions yield those results at certain iterations. That’s not a bug; silence is, in fact, one possible waveform, right?
Interesting! Thank you for sharing. Would you mind explaining a little more about the naming terminology (or pointing me to an existing resource) regarding how you went about applying polygonal numbers in the audio domain? Looking at the SCWFs, I see that within fractional polygonal polygon 3 offset 0 there are renders by base and then by harmonic. I can hear what the harmonic changes (naturally), but what exactly does the base signify? Additionally, what does the offset signify, and how is it brought into the audio domain, both in this directory and in others (centered_ngonal, etc.)?
Hey @sunsnail, I mention this a bit in the intro video. The various “series” type wavetables (such as the polygonal series, harmonic series, fib series, powers-of-two, etc.) are constructed by summing partials (harmonics) with a 1/n amplitude relationship, with the maximum number of harmonics increasing from one waveform to the next. And then, for variety, we advance the base frequency of the waveform to 2 and 3 and do the same computation again.
So, “base 2” (versus “base 1”) simply means that the waveform is shifted an octave up.
That is, we’re departing from the world of single-cycle waveforms into the realm of double-cycle waveforms (and triple-cycle in the case of “base 3”). Please note that at this resolution, shifting the base waveform frequency up is perceived as a timbral change, not as a pitch change.
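Very roughly, in sketch form (this is simplified illustrative Python, not my actual generation code, and the exact series and amplitude choices vary from collection to collection), the idea is:

```python
import numpy as np

TABLE_SIZE = 2048  # samples per single-cycle waveform

def series_waveform(harmonics, base=1):
    """Sum sine partials with 1/n amplitudes; base multiplies every harmonic."""
    t = np.arange(TABLE_SIZE) / TABLE_SIZE  # one cycle, 0..1
    wave = np.zeros(TABLE_SIZE)
    for n in harmonics:
        h = n * base  # base 2 doubles every harmonic number: one octave up
        wave += (1.0 / n) * np.sin(2 * np.pi * h * t)
    return wave / np.max(np.abs(wave))  # normalize so the table doesn't clip

# e.g. the first few triangular (3-gonal) numbers used as harmonic numbers
triangular = [1, 3, 6, 10, 15, 21]
base1 = series_waveform(triangular, base=1)
base2 = series_waveform(triangular, base=2)  # same recipe, shifted an octave up
```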
I will provide you with a script to generate “useful” versions of the wavetables for Surge, but I’d encourage you once again to get a modern wavetable synth and just enjoy the wavetables I’ve created as they’re meant to be explored. (Like Korg modwave which is the undisputed champion of wavetable softsynths today.)
As for “offset”: It’s literally just a harmonic shift parameter. A given harmonic is simply shifted up or down (say from harmonic 2 to 3 for a shift of 1). And then we listen to hear what the result is. We’re not sciencing the shit out of stuff here; we are arting the shit out of stuff here, if you get me.
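In the same sketchy terms as above, the offset is just an addition to each harmonic number before the sum (using the hypothetical series_waveform helper from the earlier sketch):

```python
# Offset 1 shifts every harmonic up by one, e.g. harmonic 2 becomes harmonic 3.
def offset_series(harmonics, offset):
    return [n + offset for n in harmonics]

shifted = series_waveform(offset_series(triangular, 1))
```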
As for “centered” polygonal numbers versus “regular” polygonal numbers: Here’s what centered polygonal numbers are: Centered polygonal number - Wikipedia
Waveforms/wavetables based on those numbers have a similar sound to the “regular” polygonal numbers. It’s really the relationships between partials that govern the “sound” of the waveforms, not so much the absolute values of the partials themselves. The same goes for the “pyramidal” (and similar) numbers. All of these abstract concepts that relate to “how many shapes of shape n can we pack into shape m” sound similar. They are also very specific, and so they’re not particularly useful for subtractive synthesis. They do have a characteristic sound, however, and I love that particular sound.
Ah yes! I can hear how each of those processes is being applied now. I noticed the 3-part appearance of each of the tables when loading them into Vital - neat to see the reason behind it being that way.
I can hear how harmonics are introduced as the ‘sides’ of the polygon are increased. What is so neat to me is that by taking polygon 7 - which has this beautiful harmonic 7th sound to it - and then selecting offset 7 as the wavetable, you get this neat, ringing, hollow-from-the-inside-out harmonic seventh/dominant quality with only one note being played! Really neat stuff.
What is it mathematically that causes such a drastic shift in timbre when going from offset 0 to offset 1 within a given polygonal group? For instance, polygon 3 offset 0 vs. offset 1: offset zero has this kind of purity to it that none of the other offsets do. It is such a neat timbre shift, and my ears can’t quite make out what is causing it - I’m sure it’s something quite simple mathematically!
I probably haven’t explained “base frequency” enough: If we are creating single-cycle waveforms (as we should when constructing waveforms for use with a wavetable synthesizer), what is the frequency of the waveform? Well, BY DEFINITION, the waveform has a frequency of 1 Hz. (One waveform cycle per second.) That is, the waveform represents the shape of ONE CYCLE, regardless of what musical note frequency we are trying to obtain (e.g., A440, or 440 hertz). So, for most computations, we should use a base frequency of 1. But we can make the timbre brighter by shifting everything up some number of octaves (to base frequency 2 or 3). Going higher than this shifts the harmonics out of the range of human hearing into the hypersonic, and so it doesn’t help. Note that naive synthesizers WILL render audible audio in such cases, but what you’re hearing is just foldover. Serum is guilty of this, as are most other softsynths (not modwave native, however – it simply won’t render the audio, as is appropriate).
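For a concrete back-of-the-envelope check (the numbers here are illustrative, not pulled from any particular synth):

```python
SAMPLE_RATE = 48000        # a typical project sample rate
NYQUIST = SAMPLE_RATE / 2  # 24 kHz: nothing above this can be rendered honestly
AUDIBLE_LIMIT = 20000      # rough upper limit of human hearing

note_hz = 440.0  # A440
for base in (1, 2, 3, 4):
    for harmonic in (1, 8, 16, 32):
        f = note_hz * base * harmonic  # frequency of this partial for this note
        status = ("audible" if f <= AUDIBLE_LIMIT
                  else "hypersonic" if f <= NYQUIST
                  else "folds over in a naive renderer")
        print(f"base {base}, harmonic {harmonic}: {f:8.0f} Hz -> {status}")
```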
Interesting! Is this why the waves for a given polygon at harmonics 12 and 18 sound the same as at 9? Because the sounds that they would be producing are inaudible?
Yeah, this sort of thing is why I’m obsessed with the polygonal waveforms. They have a certain something about them that isn’t quantifiable (at present). These are simply mysterious. But the original creator of those was just like, “Oh, here’s a thing. And here’s some variations,” and thought nothing of it. I can tell you that it’s something to do with even harmonics, but that’s about it.
Yes, basically. At some point you’ll just be creating the same waveforms, with overtones that nobody can hear. But they are quite special waveforms, don’t you think?
Simply mysterious… I like it. As you said, we don’t have to science it to death - we can be happy just doing art shit every now and then.
I have just begun to dig into the VAE tables - incredible work here, Keith! I can’t wait to explore the sounds and to learn more about your processes! These tables are all very musical. I must say I appreciate that compared to some other commercial packs. I don’t know if it is the numeric basis behind it or what, but they are all just so usable. You’ve nailed it with this.
I also want to share more of my conversion process for Surge - as I do think that it would be beneficial and allow for greater reach in terms of audience. But that’ll be a topic for later. I’m happy to discuss it in the forum thread - but if you feel it clutters things, feel free to shoot me a PM and we can link up!