Zakar
zakar.bsky.social
25 followers 60 following 370 posts
RPG + Adventure game fan || He/Him || Opinions are my own
L2D Art + Rig: 从小就能吃_from Bilibili
http://youtube.com/@Zakarith
http://twitch.tv/zakarith
Pinned
I'm Zakarith! A little fox spirit trying to figure out the whole "streaming" thing. I hope you'll be able to help me.

L2D Art + Rig: 从小就能吃_from Bilibili

#vtuber #envtuber #vtuberen #vtubers #vtuberuprising #envtubers
It'll have the audio file I downloaded from the TTS I just did with the Fenrir voice.

The plosive that's noticeable happens at the 40 second mark.

The script is the script from this video: www.youtube.com/watch?v=qkV5...

I will say, plosives are NOT common with the TTS voice it seems like.
Creating My First Wonderland Component 3. Building An Interactive Interface
YouTube video by Genshin Impact - Miliastra Wonderland
www.youtube.com
So this isn't a video example or anything like that, but the Fenrir voice for Gemini's TTS model can have noticeable plosives. I noticed it most when it would say "point"; the plosives didn't happen at other points.

aistudio.google.com/app/generate...

Do you mind if I DM you a Gdrive link? 1/2
I've got none on hand, I'm sorry to say, which I know is the worst answer to give. If I remember to take a look after I'm done working and dealing with some personal life stuff later, I'll see if I can find one.

It's hard to find one quickly when all the results are about removing said pops and plosives tbh
This can happen with TTS models because of the vocoder model they use.

I'm pretty sure we're talking about the same thing with TTS and AI audio. Modern TTS relies on two trained AI models to work, so the voices are AI generated.

TTS nowadays is AI generated audio.
Modern TTS does not use concatenative or parametric synthesis anymore, which are the older approaches.

Modern TTS typically uses 2 models: an acoustic model that turns text into a mel spectrogram, and then a vocoder that takes that and creates audio.
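
A rough sketch of that two-stage shape in Python, just to show what gets passed between the stages. Neither function is a real model: `toy_acoustic_model` and `toy_vocoder` are made-up stand-ins, the constants are assumed values, and the intermediate representation is shown as a mel spectrogram, which is also an assumption here.

```python
import numpy as np

N_MELS = 80          # mel bands per frame (assumed, a common choice)
HOP_LENGTH = 256     # audio samples generated per frame (assumed)
SAMPLE_RATE = 22050  # output sample rate in Hz (assumed)


def toy_acoustic_model(text: str, frames_per_char: int = 5) -> np.ndarray:
    """Stand-in for stage 1: text -> array of shape (frames, N_MELS).

    A real acoustic model is a trained neural network. This placeholder
    only fills in seeded random values so the shapes can be inspected.
    """
    n_frames = len(text) * frames_per_char
    rng = np.random.default_rng(seed=0)
    return rng.random((n_frames, N_MELS))


def toy_vocoder(mel: np.ndarray) -> np.ndarray:
    """Stand-in for stage 2: (frames, N_MELS) -> waveform samples.

    A real vocoder is also a trained network. This placeholder emits a
    220 Hz sine whose loudness in each frame follows that frame's mean.
    """
    n_frames = mel.shape[0]
    t = np.arange(n_frames * HOP_LENGTH) / SAMPLE_RATE
    carrier = np.sin(2 * np.pi * 220.0 * t)
    gain = np.repeat(mel.mean(axis=1), HOP_LENGTH)
    return (carrier * gain).astype(np.float32)


mel = toy_acoustic_model("point")
audio = toy_vocoder(mel)
print(mel.shape, audio.shape)  # (25, 80) (6400,)
```

The point is only that the vocoder is its own step that turns frames into samples, which is where artifacts like pops can get introduced.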
For what it's worth, some TTS models do include the ability to add stutters, even on a specific word if you'd like. It sounds like one of the newer "higher end" models you can get on HuggingFace.

They really should have just hired someone for the VO.
Basically all modern TTS are AI models already, so the "it's just Text to Speech" point doesn't work anyway.
Finally have hot water in my house again. Haven't really had access since Friday.
Dual wield fists.
I have now found an unrelated, cast iron water pipe leak as well. So two water issues.

We always make sure we have enough $$$ on hand for emergencies like this, but we just bought a new driveway so this sucks 😭
17th game beaten this year: Final Fantasy Tactics
Our water heater has officially broken and I gotta deal with that 🙃
He's just so sweet and loves affection 🥺
Maybe it runs on Eastern Time.
I got a leak for Legends ZA on Facebook when I opened it up to message some extended family. 😭
My neighbor has begun loud construction on their house and it shakes my house sometimes ;-;
Can my dog not wander in and rip the stankiest fart known to exist 😭🤢😭
Probably my favorite brand of chocolate bar, top tier choice.
The description already says what the games are. MH Stories 3, PRAGMATA, Onimusha, and RE9.
WOOOO causing lifelong trauma!
Ever since I got back from my trip and have been sick, my cat Tyke has been all over me with snuggles. She's forced her way onto my arm daily.
I checked that person's profile, it just reads like a shitpost bit they did. They've done a lot of shitposts.
What she said is definitely right in some contexts, but not at all in the discussion about Generative AI itself which makes it a baffling comparison.
Alright let's see what all the fuss is about.