Microsoft’s VASA-1 can deepfake a person with one photo and one audio track

simplejack , 1 month ago

Microsoft’s research teams always makes some pretty crazy stuff. The problem with Microsoft is that they absolutely suck at translating their lab work into consumer products. Their labs publications are an amazing archive of shit that MS couldn’t get out the door properly or on time. Example - multitouch gesture UIs.

As interesting as this is, I’ll bet MS just ends up using some tech that Open AI launches before MS’s bureaucratic product team can get their shit together.

Reply

Report

Activity

Open original URL

Copy original URL

Copy Mbin URL

Loading...

werefreeatlast , 1 month ago

Freddie, this is your mom. Look all I want for my birthday is for you to please start using teams new. It's so much better than teams classic. I alread... Microsoft already installed it for you. Okay honey? And could you also start using a microsoft.com account so you can get financially hooked like all the Gmail users? It's pretty smart. Don't you want to be smart like Jonny? Tata!

Reply

Report

Activity

Open original URL

Copy original URL

Copy Mbin URL

Loading...

antlion , 1 month ago

Since it’s trained on celebrities, can it do ugly people or would it try to make them prettier in animation?

The teeth change sizes, which is kinda weird, but probably fixable.

It’s not too hard to notice for an up close face shot, but if it was farther away it might be hard - the intonation and facial expressions are spot on. They should use this to re-do all the digital faces in Star Wars.

Reply

Report

Activity

Open original URL

Copy original URL

Copy Mbin URL

Loading...

venusaur , 1 month ago

One photo? That’s incredible.

Reply

Report

Activity

Open original URL

Copy original URL

Copy Mbin URL

Loading...

clark , 1 month ago

Yeah. Incredibly horrific.

Reply

Report

Activity

Open original URL

Copy original URL

Copy Mbin URL

Loading...

T00l_shed , 1 month ago

Yes I hate what AI is becoming capable of. Last year everyone was laughing at the shitty fingers, but were quickly moving past that. I'm concerned that in the near future it will be hard to tell truth from fiction.

Reply

Report

Activity

Open original URL

Copy original URL

Copy Mbin URL

Loading...

redcalcium , 1 month ago

Combine this with an LLM with speech-to-text input and we could create a talking paintings like in harry potter movies. Heck, hang it on a door and hook it with smart lock to recreate the dorm doors in harry potter and see if people can trick it to open the door.

Reply

Report

Activity

Open original URL

Copy original URL

Copy Mbin URL

Loading...

NotMyOldRedditName , 1 month ago

Any sufficiently advanced technology is indistinguishable from magic.

Harry Potter wasn't a fantasy movie, it was a SciFi and we just didn't know it.

Reply

Report

Activity

Open original URL

Copy original URL

Copy Mbin URL

Loading...

venoft , 1 month ago

It was midichlorians all along.

Reply

Report

Activity

Open original URL

Copy original URL

Copy Mbin URL

Loading...

dumbass , 1 month ago

You're a Jedi 'Arry!

Reply

Report

Activity

Open original URL

Copy original URL

Copy Mbin URL

Loading...

popekingjoe , 1 month ago

Imma wot?

Reply

Report

Activity

Open original URL

Copy original URL

Copy Mbin URL

Loading...

FlyingSquid , 1 month ago

I like your optimism where this doesn't result in making everything worse.

Reply

Report

Activity

Open original URL

Copy original URL

Copy Mbin URL

Loading...

chatokun , 1 month ago

I was actually discussing this very idea with my brother, who went to the Wizarding World of Harry Potter at Universal Studios, Orrrlandooooo recently and while he enjoyed himself, said it felt like not much is new in theme parks nowadays. Adding in AI driven pictures you could actually talk to might spice it up.

Reply

Report

Activity

Open original URL

Copy original URL

Copy Mbin URL

Loading...

slaacaa , 1 month ago

“At long last, we have created the Torment Nexus from classic sci-fi novel Don't Create The Torment Nexus”

Reply

Report

Activity

Open original URL

Copy original URL

Copy Mbin URL

Loading...

Dasus , 1 month ago

One use of this I'm in favour of is recreating Majel Barret's voice as an AI for computer systems.

Reply

Report

Activity

Open original URL

Copy original URL

Copy Mbin URL

Loading...

kromem , 1 month ago

This project doesn't recreate or simulate voices at all.

It takes a still photograph and created a lip synched video of that person saying the paired full audio clip.

There's other projects that simulate voices.

Reply

Report

Activity

Open original URL

Copy original URL

Copy Mbin URL

Loading...

trolololol , 1 month ago

Yep it's part of it to generate the sound track

One of the videos show the voice changing in mid sentence

Reply

Report

Activity

Open original URL

Copy original URL

Copy Mbin URL

Loading...

kromem , 1 month ago

No, it isn't. In that clip they are taking two different sound clips as they are switching faces. It's not changing the 'voice' of saying some phrase on the fly. It's two separate pre-recorded clips.

Literally from the article:

It does not clone or simulate voices (like other Microsoft research) but relies on an existing audio input that could be specially recorded or spoken for a particular purpose.

Reply

Report

Activity

Open original URL

Copy original URL

Copy Mbin URL

Loading...

dhork , 1 month ago

Vasa? Like, the Swedish ship that sank 10 minutes after it was launched? Who named that project?

Reply

Report

Activity

Open original URL

Copy original URL

Copy Mbin URL

Loading...

Jimmycakes , 1 month ago

They developed an ai to name all future ai. Ironically it is unnamed.

Reply

Report

Activity

Open original URL

Copy original URL

Copy Mbin URL

Loading...

dumbass , 1 month ago

There are a lot of flying vehicles named after birds who famously plummet to the ground at breakneck speeds.

Reply

Report

Activity

Open original URL

Copy original URL

Copy Mbin URL

Loading...

hakunawazo , 1 month ago

No, like the crispbread.

Reply

Report

Activity

Open original URL

Copy original URL

Copy Mbin URL

Loading...

AnAnonymous , 1 month ago

Paranoia vibes starting in 3, 2, 1..

Reply

Report

Activity

Open original URL

Copy original URL

Copy Mbin URL

Loading...

ArmoredThirteen , 1 month ago

These vids are just off enough that I think doing a bunch of mushrooms and watching them would be a deeply haunting experience

Reply

Report

Activity

Open original URL

Copy original URL

Copy Mbin URL

Loading...

return2ozma OP , 1 month ago

The first video her bottom teeth shift around.

Reply

Report

Activity

Open original URL

Copy original URL

Copy Mbin URL

Loading...

Dozzi92 , 1 month ago (edited 1 month ago)

So essentially the music video for Drugs by Ratatat.

Reply

Report

Activity

Open original URL

Copy original URL

Copy Mbin URL

Loading...

BetaDoggo_ , 1 month ago

The "why would they make this" people don't understand how important this type of research is. It's important to show what's possible so that we can be ready for it. There are many bad actors already pursuing similar tools if they don't have them already. The worst case is being blindsided by something not seen before.

Reply

Report

Activity

Open original URL

Copy original URL

Copy Mbin URL

Loading...

spiderman , 1 month ago

how important this type of research

I hope they also figure a way to find the bad actors who might use their tools for harmful purposes. You can't just create something for "research" purposes like this and not find a way to stop bad actors using these for harmful purposes.

Reply

Report

Activity

Open original URL

Copy original URL

Copy Mbin URL

Loading...

ReallyActuallyFrankenstein , 1 month ago

I mean, I know it's scary, but I'll admit it is impressive, even when I watched it with jaded "every day is another AI breakthrough" exhaustion.

The subtle face movements, eyebrow expression, everything seems to correctly infer how the face would articulate those specific words. When you think of how many decades something like this would be in the uncanny valley even with a team of trained people hand -tweaking the image and video, and this is doing it better in nearly every way, automatically, with just an image? Insane.

Reply

Report

Activity

Open original URL

Copy original URL

Copy Mbin URL

Loading...

Speculater , 1 month ago

And you can run it on a single 4090, that's crazy.

Reply

Report

Activity

Open original URL

Copy original URL

Copy Mbin URL

Loading...

mPony , 1 month ago

uh, are graphics cards supposed to be 2500 bucks? (I play boardgames)

Reply

Report

Activity

Open original URL

Copy original URL

Copy Mbin URL

Loading...

techt , 1 month ago

Crypto did unfortunate things to the space.

Reply

Report

Activity

Open original URL

Copy original URL

Copy Mbin URL

Loading...

Speculater , 1 month ago

What I don't understand is why they didn't go back down when crypto moved to proof of stake. Fuck AMD and Nvidia for price fixing.

Reply

Report

Activity

Open original URL

Copy original URL

Copy Mbin URL

Loading...

kromem , 1 month ago

It's pretty wild that this is the tech being produced by the trillion dollar company who has already been granted a patent on creating digital resurrections of dead people from the data they left behind.

So we now already have LLMs that could take what you said and say new things that seem like what you would have said, take a voice sample of you and create new voice synthesis of that text where it sounds a lot like you were actually saying it, and can take a photo of you and make a video where you legit look like you are saying that voice sample with facial expressions and all.

And this could be done for anyone who has a social media profile with a few dozen text posts, a profile photo, and a 15 second sample of their voice.

I really don't get how every single person isn't just having a daily existential crisis questioning the nature of their present reality given what's coming.

Do people just think the current trends aren't going to continue, or just don't think about the notion that what happens in the future could in fact have been their own nonlocal past?

It reminds me of a millennia old saying by a group that were claiming we were copies in the images of original humans: "you do not know how to examine the present moment."

Edit - bonus saying on the topic: "When you see your likeness, you are happy. But when you see your images that came into being before you and that neither die nor become visible, how much you will have to bear!"

Reply

Report

Activity

Open original URL

Copy original URL

Copy Mbin URL

Loading...

Maeve , 1 month ago

A long time ago, someone from a not free country wrote a white paper on why we should care about privacy, because written words can be edited to level false accusations (charges) with false evidence. This chills me to the bone.

Reply

Report

Activity

Open original URL

Copy original URL

Copy Mbin URL

Loading...

Sanctus , 1 month ago

"You shot that man, citizen. Here is video evidence. Put your hands against the wall."
- and more coming to you soon!

Reply

Report

Activity

Open original URL

Copy original URL

Copy Mbin URL

Loading...

tal , 1 month ago

I'd be less-concerned about the impact on not-free countries than free countries. Dictator Bob doesn't need evidence to have the justice system get rid of you, because he controls the justice system.

Reply

Report

Activity

Open original URL

Copy original URL

Copy Mbin URL

Loading...

Kiosade , 1 month ago

This is turning into some Mistborn shit. “Don’t trust writing not written on metal”

Reply

Report

Activity

Open original URL

Copy original URL

Copy Mbin URL

Loading...

I_Miss_Daniel , 1 month ago

Feed it Microsoft Merlin. What will happen?

Reply

Report

Activity

Open original URL

Copy original URL

Copy Mbin URL

Loading...

thefartographer , 1 month ago

The pores don't stretch, but the teeth and irises sure do!

Reply

Report

Activity

Open original URL

Copy original URL

Copy Mbin URL

Loading...

T00l_shed , 1 month ago

I'm sure they will fix that before you know it.

Reply

Report

Activity

Open original URL

Copy original URL

Copy Mbin URL

Loading...