I work at a game studio - we’ve seen other studios come under fire for using AI voice generators, but seeing as Vital is text-to-wavetable, could it reasonably be argued that it isn’t AI?
I presume the answer is no, given the following (scraped from Wikipedia and tech sites):
Hey Adrien, thanks - I suppose the answer then depends on whether the algorithm can do any problem solving, and whether its use of a pre-set accent and pronunciation to turn text into speech is in any limited way dynamic or otherwise intelligent.
The service renders one text string to audio. Its only parameter is the text language (English, French, etc.) - not even a male or female voice option. There is no dynamism, no intelligence, no problem solving. It is a plain old text-to-speech algorithm.
If I recall correctly, Vital actually makes an API call to Google’s Text-to-Speech engine to generate an audio file and then returns it. Whether this counts as AI is a slippery slope, but the technology was already around in the 80s, if not earlier. One of the best-known examples is SAM (Software Automatic Mouth), a program developed for the Commodore 64.
In this case it entirely depends on Google’s implementation, i.e. whether it actually runs an LLM in the background or not, but the classic/old-school approach is purely rule-based software with no machine learning involved.
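I haven’t seen Vital’s source, so treat this as a sketch rather than how it actually works, but a minimal request to Google’s Cloud Text-to-Speech API (Python client) looks roughly like this - the voice and encoding parameters here are my assumptions, not anything Vital has confirmed:

```python
# Minimal sketch of a Google Cloud Text-to-Speech request (Python client).
# Assumption: Vital does something roughly like this server-side; the exact
# parameters it passes are not public.
from google.cloud import texttospeech

client = texttospeech.TextToSpeechClient()

# The one input the user controls: a text string.
synthesis_input = texttospeech.SynthesisInput(text="Hello, world")

# Language is essentially the only knob, matching what was described above.
voice = texttospeech.VoiceSelectionParams(
    language_code="en-US",
    ssml_gender=texttospeech.SsmlVoiceGender.NEUTRAL,
)

# Raw PCM output, so the result could be sliced up as a wavetable source.
audio_config = texttospeech.AudioConfig(
    audio_encoding=texttospeech.AudioEncoding.LINEAR16
)

response = client.synthesize_speech(
    input=synthesis_input, voice=voice, audio_config=audio_config
)

# The response is just audio bytes: no dialogue, no state, no reasoning.
with open("speech.wav", "wb") as out:
    out.write(response.audio_content)
```

Whatever model Google runs behind that endpoint, from the caller’s side it’s a stateless string-in, audio-out function.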
It’s an interesting question, and I think you’re touching on a broader debate surrounding the ethical and legal implications of using AI-generated content in various industries, including gaming.
When it comes to AI voice generators and whether something like Vital’s text-to-wavetable feature could reasonably be argued to be a “non-AI” solution, it does depend on how you define “AI” in the context you’re working with.
If you consider strong AI (i.e., autonomous systems with general intelligence), then tools like Vital wouldn’t fall into that category because they don’t exhibit autonomous problem-solving or decision-making. It’s more about mimicking or generating outputs based on pre-programmed algorithms, which aligns more with what’s considered weak AI.
While weak AI doesn’t have the same level of autonomy as strong AI, it’s still capable of imitating human-like behaviors (like creating voices or music) within narrow parameters. In that sense, it might be reasonable to argue that using Vital or similar tools doesn’t fall under the same ethical concerns as fully autonomous AI systems, especially when it’s clearly being used to replicate or synthesize human-like outputs rather than create them entirely independently.
However, the ethical issues surrounding AI-generated content—especially in industries like gaming—aren’t just about whether the AI is “strong” or “weak.” There are deeper concerns about intellectual property, the value of human labor, and transparency in how these technologies are being used. If AI is used to replace human voice actors, for example, that raises questions about fair compensation, consent, and the value of human artistry.
Ultimately, it’s all about transparency and how the technology is integrated into the creative process. If AI-generated content is clearly credited and ethically implemented (e.g., not replacing human creators unfairly), it may avoid some of the controversies. But if it’s being used to bypass fair compensation or obscure its use, that’s where the problem lies.
I think your instinct is correct in that it’s not just about whether it’s “strong” or “weak” AI, but how it’s used and the implications of that use.
I’d argue it’s not AI, since we only started calling this kind of stuff “AI” in the last couple of years. Text-to-speech has been around since at least the 00s - I remember getting it on my flip phone lol