Here is the approach that I have been able to get to work. This creates a .wav wavetable that will import into Surge, Vital, Serum, and Modwave and play back identically (as close as the DSP coding allows).
Start from your single-cycle waveforms (SCWFs) and place the ones that you will use to create the final wavetable in their own directory.
1a. Using a terminal, I confirmed with soxi *.wav that all SCWFs in the directory were the same length (2048 samples).
In the terminal, append the SCWFs together with sox to create one larger file.
2a. For example: sox w_1.wav w_2.wav w_3.wav final_w.wav
2b. This creates the file final_w.wav in the same directory.
Run the script normalize-i16-to-i15.py -n half final_w.wav final_w_norm.wav to normalize the wavetable and prevent clipping in Surge (possibly in others?).
Run add-surge-metadata.py -s 2048 -i final_w_norm.wav to produce the final wavetable, which can be loaded into Surge and others with proper playback and interpolation!
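For anyone who'd rather script the length check and concatenation (steps 1a-2b) without sox, here's a minimal sketch using only Python's standard-library wave module. This is a hypothetical helper, not one of the scripts above, and the normalization and Surge-metadata steps still need those scripts:

```python
import wave

def concat_scwfs(paths, out_path, expected_frames=2048):
    """Append equal-length single-cycle .wav files into one wavetable file,
    refusing to proceed if any file is not exactly expected_frames long."""
    frames = []
    params = None
    for p in paths:
        with wave.open(p, "rb") as w:
            if w.getnframes() != expected_frames:
                raise ValueError(
                    f"{p}: expected {expected_frames} frames, got {w.getnframes()}")
            if params is None:
                params = w.getparams()  # copy format from the first file
            frames.append(w.readframes(w.getnframes()))
    with wave.open(out_path, "wb") as out:
        out.setparams(params)  # header frame count is patched on close
        for f in frames:
            out.writeframes(f)
```

Calling concat_scwfs(["w_1.wav", "w_2.wav", "w_3.wav"], "final_w.wav") is the rough equivalent of the soxi check plus the sox append above.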
Curiously, the resulting table will not work in Modwave.
I don’t know whether it is preferable to have a .wav as the final table, or to simplify the process (as can be done with one script from the Surge GitHub) and create a .wt as the final format.
That’s where I’m at, at least! It looks like it is possible, and indeed quite scriptable, to create tables that will behave themselves in Surge.
You should definitely license that AI tech to a synth company like Kilohearts or Arturia at some point if it’s ready for the big time. It would be cool in Vital too, but getting Tytel on board is a hard thing to do, and that would probably take ten years to implement. It only costs the time to send one email to try, though.
Heck, why not have a mathematical wavetable generator in a synth while we’re at it? I don’t know why synth makers don’t do that at the moment.
Yeah, while my particular models aren’t designed for “real-time” waveform generation (that is, I haven’t attempted to adapt them to work as an “oscillator”), one could do that. That goal was the premise of a research paper from a couple years ago, so people have had that idea for a while.
While it takes a large amount of memory to train my models (I’ve been training them in Google Colab Pro on an A100 GPU, and they utilize nearly all available system and GPU RAM), VAEs are a form of compression in a sense. So, while the training data is around 500,000 32-bit, 2048-sample waveforms, along with a similar number of bytes for their frequency-domain representations (FFTs), the resulting model weights are somewhere between just 60 and 90 megabytes.
And though it’s handy (and time efficient) to train on a giant GPU, sampling from the model does happen faster than real-time on a modern CPU and probably wouldn’t take much longer on, say, a compute unit like the Raspberry Pi (as found in the Korg modwave).
So this sort of technology will surely be used in software or hardware synths at some point.
@keith I have some questions about the tables in Polygonal Series Variations.
It appears that all tables in the same ‘category’ and of the same ‘offset’ in the directory, except for the random_offset_polygonal_polygon category, are identical to one another. I noticed this by ear as I was auditioning through, and I took the files into Audacity to check it out.
Sure enough, if you take two files from the same category and offset, differing only in polygon ‘number’, and invert one of them, you will achieve a perfect null.
For example: negative_polygonal_polygon_6_offset_3_1 and negative_polygonal_polygon_3_offset_3_1 are identical. This holds true for any *offset_3_1 in the negative polygonal category.
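That null test is easy to script, too. Here's a minimal sketch of the idea (a hypothetical helper operating on decoded sample values rather than .wav files): mix one waveform against the phase-inverted other and check that every sample cancels.

```python
def perfect_null(a, b):
    """True if mixing b phase-inverted against a cancels every sample,
    i.e. the two waveforms are sample-identical."""
    return len(a) == len(b) and all(x + (-y) == 0 for x, y in zip(a, b))
```

Anything short of sample-identical data (even a 1-LSB difference from dither or re-quantization) breaks the perfect null, which is what makes this such a strict duplicate detector.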
Is this by design or is this something gone awry in computation? Not nitpicking - just curious!
Yeah, @sunsnail, there are some duplicates in the polygonal series. There was an error in the original generation, so there’s at least one duplicate set of zero-offset polygonal waveforms. In fact, one set is (I believe) 16-bit and the other is 32-bit. If you’re able to cancel them by phase inversion, that just shows you that the difference between 32-bit and 16-bit audio isn’t relevant at these sorts of scales.
(Note: all the wavetables are 32-bit wav files, as this is what modwave expects, but the source waveforms were actually computed once as 16-bit and then again as 32-bit. They don’t sound audibly different, as I believe you’ve discovered.)
Other than that, there are very few duplicate waveforms and likely zero other duplicate wavetables, unless I accidentally copied some folder to an incorrect location. There are a small number of fully zero or otherwise DC-line waveforms, as some of the functions yield those results at certain iterations. That’s not a bug; silence is, in fact, one possible waveform, right?
Interesting! Thank you for sharing. Would you mind explaining a little more about the naming terminology (or pointing me to an existing resource) regarding how you went about applying polygonal numbers in the audio domain? Looking at the SCWFs, I see that within fractional polygonal polygon 3 offset 0 there are renders by base and then by harmonic. I can hear what the harmonic changes (naturally), but what exactly does the base signify? Additionally, what does the offset signify, and how is it brought into the audio domain, both in this directory and in others (centered_ngonal, etc.)?
Hey @sunsnail, I mention this a bit in the intro video. The various “series” wavetables (such as the polygonal series, harmonic series, fib series, powers-of-two, etc.) are constructed by summing partials (harmonics) with a 1/n amplitude relationship, with the maximum number of harmonics increasing from frame to frame. Then, for variety, we advance the base frequency of the waveform to 2 and 3 and do the same computation again.
So, “base 2” (versus “base 1”) simply means that the waveform is shifted an octave up.
That is, we’re departing from the world of single-cycle waveforms into the realm of double-cycle (and triple-cycle in the case of “base 3” waveforms). Please note that at this resolution shifting the base waveform frequency up is perceived as a timbral change, not as a pitch change.
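As I read that description, one frame of such a series wavetable could be sketched like this (my own approximation of the recipe, not the actual generation code; partial numbers and the 1/k amplitude law are taken from the post above):

```python
import math

def series_frame(n_harmonics, base=1, length=2048):
    """One wavetable frame: sum of partials k = 1..n_harmonics at amplitude
    1/k, with every partial multiplied by the integer base frequency."""
    frame = []
    for i in range(length):
        t = i / length  # the whole table spans one cycle at base 1
        s = sum(math.sin(2 * math.pi * base * k * t) / k
                for k in range(1, n_harmonics + 1))
        frame.append(s)
    return frame
```

At base 2 every partial's frequency doubles, so the 2048-sample table holds two identical cycles: that's the octave shift, which (as noted above) reads as a timbral change, not a pitch change.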
I will provide you with a script to generate “useful” versions of the wavetables for Surge, but I’d encourage you once again to get a modern wavetable synth and just enjoy the wavetables I’ve created as they’re meant to be explored. (Like the Korg modwave, which is the undisputed champion of wavetable synths today.)
As for “offset”: it’s literally just a harmonic shift parameter. A given harmonic is simply shifted up or down (say, from harmonic 2 to 3 for a shift of 1). And then we listen to hear what the result is. We’re not sciencing the shit out of stuff here, we are arting the shit out of stuff here, if you get me.
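In other words, an offset of m moves each partial from slot k to slot k + m while keeping its amplitude. A toy sketch (hypothetical helper; shown here with the 1/k amplitude law from the series description, whereas the polygonal tables place partials by polygonal number rather than consecutive k):

```python
def shifted_partials(n_harmonics, offset):
    """Return (harmonic_number, amplitude) pairs after shifting each
    partial's slot by the given offset, keeping its 1/k amplitude."""
    return [(k + offset, 1.0 / k) for k in range(1, n_harmonics + 1)]
```

One plausible reason offset 0 sounds "purer" than the other offsets (per the question later in the thread) is that it is the only variant that leaves a partial sitting on the fundamental; any nonzero shift removes it and rearranges the whole set of partial ratios, though that's my speculation rather than Keith's stated explanation.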
As for “centered” polygonal numbers versus “regular” polygonal numbers: Here’s what centered polygonal numbers are: Centered polygonal number - Wikipedia
Waveforms/wavetables based on those numbers have a similar sound to the “regular” polygonal numbers. It’s really the relationships between partials that govern the “sound” of the waveforms, not so much the absolute values of the partials themselves. The same goes for the “pyramidal” (and similar) numbers. All of these abstract concepts that relate to “how many shapes of shape n can we pack into shape m” sound similar. They are also very specific, and so they’re not particularly useful for subtractive synthesis. They do have a characteristic sound, however, and I love that particular sound.
Ah yes! I can hear how each of those processes are being applied now. I noticed the 3-part appearance to each of the tables when loading them into Vital - neat to see the reason behind it being that way.
I can hear how harmonics are introduced as the ‘sides’ of the polygon are increased. What is so neat to me is that by taking polygon 7 - which has this beautiful harmonic-seventh sound to it - and then selecting offset 7 as the wavetable, you get this neat ringing, hollow-from-the-inside-out, harmonic seventh/dominant quality with only one note being played! Really neat stuff.
What is it, mathematically, that causes such a drastic shift in timbre when going from offset 0 to offset 1 within a given polygonal group? For instance, polygon 3 offset 0 vs. offset 1: offset zero has this kind of purity to it that none of the other offsets do. It is such a neat timbre shift, and my ears can’t quite make out what is causing it - I’m sure it’s something quite simple mathematically!
I probably haven’t explained “base frequency” enough: if we are creating single-cycle waveforms (as we should when constructing waveforms for use with a wavetable synthesizer), what is the frequency of the waveform? Well, BY DEFINITION, the waveform has a frequency of 1 Hz (one waveform cycle per second). That is, the waveform represents the shape of ONE CYCLE, regardless of what musical note frequency we are trying to obtain (e.g., A440, or 440 hertz). So, for most computations, we should use a base frequency of 1. But we can make the timbre brighter by shifting everything up some number of octaves (to base frequency 2 or 3). Going higher than this shifts the harmonics out of the range of human hearing into the ultrasonic, so it doesn’t help. Note that naive synthesizers WILL render audible audio in such cases, but what you hear is just foldover. Serum is guilty of this, as are most other softsynths (not modwave native, however - it simply won’t render the audio, as is appropriate).
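The arithmetic behind that is simple: at a played fundamental of f0 Hz and sample rate sr, only harmonics k with k·f0 strictly below the Nyquist frequency sr/2 can be rendered honestly; everything above either folds over (naive rendering) or should be dropped. A quick sketch (hypothetical helper):

```python
def renderable_harmonics(f0_hz, sample_rate):
    """Highest harmonic number whose frequency stays strictly below the
    Nyquist frequency (sample_rate / 2) at the played fundamental f0_hz."""
    nyquist = sample_rate / 2
    k = int(nyquist // f0_hz)
    if k * f0_hz == nyquist:  # a partial exactly at Nyquist is not renderable
        k -= 1
    return k
```

For example, at a 2 kHz fundamental and 44.1 kHz sample rate, only the first 11 harmonics survive; two tables that differ only above that limit will sound identical at that pitch, which is exactly the effect asked about below.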
Interesting! Is this why the waves for a given polygon at harmonics 12 and 18 sound the same as at 9? Because the sounds that they would be producing are inaudible?
Yeah, this sort of thing is why I’m obsessed with the polygonal waveforms. They have a certain something about them that isn’t quantifiable (at present). They are simply mysterious. But the original creator of those was just like, “Oh, here’s a thing. And here’s some variations.” and thought nothing of it. I can tell you that it’s something to do with even harmonics, but that’s about it.
Yes, basically. At some point you’ll just be creating the same waveforms, with overtones that nobody can hear. But they are quite special waveforms, don’t you think?
Simply mysterious… I like it. As you said, we don’t have to science it to death - we can be happy just doing art shit every now and then.
I have just begun to dig into the VAE tables - incredible work here, Keith! I can’t wait to explore the sounds and to learn more about your processes! These tables are all very musical. I must say I appreciate that, compared to some other commercial packs. I don’t know if it is the numeric basis behind it or what, but they are all just so useable. You’ve nailed it with this.
I also want to share more of my conversion process for Surge, as I do think it would be beneficial and allow for greater reach in terms of audience. But that’ll be a topic for later. I’m happy to discuss it in the forum thread - but if you feel it clutters things, feel free to shoot me a PM and we can link up!
NOW: One thing that’s very interesting is that machine learning can learn those waveforms and spit novel variations back to us. That’s shown in the various “VAE” wavetables. Browse those and you’ll discover the same ringy tones in the dataset. Again, it’s still confusing as to why these are so compelling, but even just a hint of them is interesting.
It’s a sonic mystery at the moment. At some point one just has to accept that “these waveforms sound cool” and leave it at that.
Howdy, @sunsnail, I haven’t forgotten about your request for help with bulk conversion of wavetables to a Surge format. I just haven’t had the time to dig into their Python scripts but I’m going to try and spend some time on that tomorrow. Sorry for the delay.
Here’s a new video where I talk about the new AI-generated wavetables. Basically, I’ve been building various variational autoencoder (VAE) models that are trained on a superset of the Mathwaves waveforms:
Oh, and there’s still a discount on the full collection, at least until tomorrow using code CYBER23 (or by following this link).
Very nice to see another demo with a little more delving into how each collection came about!
Have you considered, for a future orbit set (28:11), including a fitness marker in the model that looks for a certain amount of deviation or difference between the two selected dimensions? I noticed it too: some orbits are essentially just a single cycle because of the lack of dimensional change. It might be neat to guarantee/codify a model that will always exhibit audible change!