
Court Bans Use of 'AI-Enhanced' Video Evidence Because That's Not How AI Works

A judge in Washington state has blocked video evidence that’s been “AI-enhanced” from being submitted in a triple murder trial. And that’s a good thing, given the fact that too many people seem to think applying an AI filter can give them access to secret visual data.

rustyfish ,
@rustyfish@lemmy.world avatar

For example, there was a widespread conspiracy theory that Chris Rock was wearing some kind of face pad when he was slapped by Will Smith at the Academy Awards in 2022. The theory started because people started running screenshots of the slap through image upscalers, believing they could get a better look at what was happening.

Sometimes I think our ancestors shouldn’t have made it out of the ocean.

someguy3 ,

Enhance. [Click click click.]

Enhance. [Click click click.]

...Enhance. [Click...click...click...]

emptyother ,
@emptyother@programming.dev avatar

How long until we get upscalers of various sorts built into tech that shouldn't have them, for bandwidth reduction, storage compression, or cost savings? Can we trust what we capture with a digital camera when companies replace a low-quality image of the moon with a professionally taken picture at capture time? Can sports replays be trusted when the ball is upscaled on the judges' screens? Cheap security cams with "enhanced night vision" might get somebody jailed.

I love the AI tech. But its future worries me.

Jimmycakes ,

It will run wild for the foreseeable future, until the masses stop falling for its gimmicks; then it will be reserved for the actual use cases where it's beneficial, once the bullshit AI stops making money.

pearsaltchocolatebar ,

Lol, you think the masses will stop falling for it in gimmicks? Just look at the state of the world.

someguy3 ,

Dehance! [Click click click.]

meco03211 ,

Just print the damn thing!

aeronmelon ,

That scene gets replayed in my mind three or four times a month.

MudMan ,
@MudMan@fedia.io avatar

Not all of those are the same thing. AI upscaling for compression in online video may not be any worse than "dumb" compression in terms of loss of data or detail, but you don't want to treat a simple upscale of an image as a photographic image for evidence in a trial. Sport replays and Hawk-Eye technology don't really rely on upscaling; we have ways to track things in an enclosed volume very accurately now that are demonstrably more precise than a human ref looking at them. Whether that's better or worse for the game's pace and excitement is a different question.

The thing is, ML tech isn't a single thing. The tech itself can be used very rigorously; pretty much every scientific study you see these days uses ML to compile or process images or data, and that's not a problem if done correctly. The issue is that everybody assumes generative AI chatbots, upscalers and image processors are all that ML is, and people keep trying to apply those things directly in the dumbest possible way, thinking they're basically magic.

I'm not particularly afraid of "AI tech", but I sure am increasingly annoyed at the stupidity and greed of some of the people peddling it, criticising it and using it.

GenderNeutralBro ,

AI-based video codecs are on the way. This isn't necessarily a bad thing because it could be designed to be lossless or at least less lossy than modern codecs. But compression artifacts will likely be harder to identify as such. That's a good thing for film and TV, but a bad thing for, say, security cameras.

The devil's in the details and "AI" is way too broad a term. There are a lot of ways this could be implemented.

DarkenLM ,

I don't think AI codecs will be anything revolutionary. There are plenty of lossless codecs already, but if you want more detail, you'll need a better physical sensor, and I doubt there's anything that can be done to get around that (that actually represents what exists, not a hallucination).

foggenbooty ,

It's an interesting thought experiment, but we don't actually see what really exists; our brains essentially run AI-style vision, filling in things we don't actually perceive. Examples are movement while we're blinking, objects and colors in our peripheral vision, the state of objects when our eyes dart around, etc.

The difference is we can't go back frame by frame and analyze these "hallucinations" since they're not recorded. I think AI enhanced video will actually bring us closer to what humans see even if some of the data doesn't "exist", but the article is correct that it should never be used as evidence.

GenderNeutralBro ,

There are plenty of lossless codecs already

It remains to be seen, of course, but I expect to be able to get lossless (or nearly-lossless) video at a much lower bitrate, at the expense of a much larger and more compute/memory-intensive codec.

The way I see it working is that the codec would include a general-purpose model, and video files would be encoded for that model + a file-level plugin model (like a LoRA) that's fitted for that specific video.

Hexarei ,
@Hexarei@programming.dev avatar

Nvidia's rtx video upscaling is trying to be just that: DLSS but you run it on a video stream instead of a game running on your own hardware. They've posited the idea of game streaming becoming lower bit rate just so you can upscale it locally, which to me sounds like complete garbage

Natanael ,

I think there's a possibility for long format video of stable scenes to use ML for higher compression ratios by deriving a video specific model of the objects in the frame and then describing their movements (essentially reducing the actual frames to wire frame models instead of image frames, then painting them in from the model).

But that's a very specific thing that probably only works well for certain types of video content (think animated stuff).

jeeva ,

I don't think loss is what people are worried about, really - more injecting details that fit the training data but don't exist in the source.

Given the hoopla Hollywood and directors made about frame-interpolation, do you think generated frames will be any better/more popular?

GenderNeutralBro ,

In the context of video encoding, any manufactured/hallucinated detail would count as "loss". Loss is anything that's not in the original source. The loss you see in e.g. MPEG4 video usually looks like squiggly lines, blocky noise, or smearing. But if an AI encoder inserts a bear on a tricycle in the background, that would also be a lossy compression artifact in context.

As for frame interpolation, it could definitely be better, because the current algorithms out there are not good. It will not likely be more popular, since this is generally viewed as an artistic matter rather than a technical matter. For example, a lot of people hated the high frame rate in the Hobbit films despite the fact that it was a naturally high frame rate, filmed with high-frame-rate cameras. It was not the product of a kind-of-shitty algorithm applied after the fact.

Mango ,

Han shot first.

elephantium ,
@elephantium@lemmy.world avatar

Over Greedo's dead body.

Mango ,

Correct!

Buelldozer ,
@Buelldozer@lemmy.today avatar

AI-based video codecs are on the way.

Arguably already here.

Look at this description of Samsungs mobile AI for their S24 phone and newer tablets:

AI-powered image and video editing

Galaxy AI also features various image and video editing features. If you have an image that is not level (horizontally or vertically) with respect to the object, scene, or subject, you can correct its angle without losing other parts of the image. The blank parts of that angle-corrected image are filled with Generative AI-powered content. The image editor tries to fill in the blank parts of the image with AI-generated content that suits the best. You can also erase objects or subjects in an image. Another feature lets you select an object/subject in an image and change its position, angle, or size.

It can also turn normal videos into slow-motion videos. While a video is playing, you need to hold the screen for the duration of the video that you want to be converted into slow-motion, and AI will generate frames and insert them between real frames to create a slow-motion effect.

dojan ,
@dojan@lemmy.world avatar

Probably not far. NVidia has had machine learning enhanced upscaling of video games for years at this point, and now they've also implemented similar tech but for frame interpolation. The rendered output might be 720p at 20FPS but will be presented at 1080p 60FPS.

It's not a stretch to assume you could apply similar tech elsewhere. Non-ML enhanced, yet still decently sophisticated frame interpolation and upscaling has been around for ages.

MrPoopbutt ,

Nvidia's game upscaling has access to game data, and also to training data generated by gameplay, to make footage that is appealing to the gamer's eye and not necessarily accurate. Security (or other) cameras don't have access to this extra data, and the use case for video in courts is to be accurate, not pleasing.

Your comparison is apples to oranges.

dojan ,
@dojan@lemmy.world avatar

No, I think you misunderstood what I'm trying to say. We already have tech that uses machine learning to upscale stuff in real time, but I'm not saying it's accurate enough for things like court videos. I don't think we'll ever get to a point where it can be accurate as evidence, because by the very nature of the tech it's making up detail, not enhancing it. You can't enhance what isn't there. It's not turning nothing into accurate data; it's guessing based on input and what it's been trained on.

Prime example right here, this is the objectively best version of Alice in Wonderland, produced by BBC in 1999, and released on VHS. As far as I can tell there was never a high quality version available. Someone used machine learning to upscale it, and overall it looks great, but there are scenes (such as the one that's linked) where you can clearly see the flaws. Tina Majorino has no face, because in the original data, there wasn't enough detail to discern a face.

Now we could obviously train a model to recognise "criminal activity", like stabbing, shooting, what have you. Then, however, you end up with models that mistake one thing for another, like scratching your temple turning into driving while on the phone. If, instead of detecting something, the model's job is to fill in missing data, we have a recipe for disaster.

Any evidence that has had machine learning involved should be treated with at least as much scrutiny as a forensic sketch, which, while useful in investigations, generally doesn't carry much weight as evidence. That said, a forensic sketch is created through collaboration between an artist and a witness, so there is intent behind it. Machine-generated artwork lacks intent; you can tweak the parameters until it generates roughly what you want, but it's honestly better to just hire an artist and get exactly what you want.

Buelldozer ,
@Buelldozer@lemmy.today avatar

Security (or other) cameras don’t have access to this extra data

Samsung's AI on their latest phones and tablets does EXACTLY what @MrPoopbutt is describing. It will literally create data including parts of scenes and even full frames, in order to make video look better.

So while a true security camera may not be able to do it, there are now widely available consumer products that WILL. You're also forgetting that even security camera footage can be processed through software, so footage from those isn't immune to AI fiddling either.

MrPoopbutt ,

Would that not fall under the "enhanced" evidence that is banned by this court decision?

Bread ,

The real question is could we ever really trust photographs before AI? Image manipulation has been a thing long before the digital camera and Photoshop. What makes these images we see actually real? Cameras have been miscapturing image data for as long as they have existed. Do the light levels in a photo match what was actually there according to the human eye? Usually not. What makes a photo real?

emptyother ,
@emptyother@programming.dev avatar

They can. But there's a reasonable level of trust that a security feed has been kept secure and not tampered with by the owner, if he doesn't have a motive. But what if not even the owner knows that somewhere in their tech chain (maybe the camera, maybe the screen, maybe the storage device, maybe all three) the image was "improved"? No evidence of tampering. We'll have the police blaming Count Rugen for a bank robbery he didn't do, but the camera clearly shows a six-fingered man!

CileTheSane ,
@CileTheSane@lemmy.ca avatar

It's already being used for things it shouldn't be.

elephantium ,
@elephantium@lemmy.world avatar

Cheap security cams with “enhanced night vision” might get somebody jailed.

Might? We've been arresting the wrong people based on shitty facial recognition for at least 5 years now. This article has examples from 2019.

On one hand, the potential of this type of technology is impressive. OTOH, the failures are super disturbing.

Downcount , (edited )

If you ever encountered an AI hallucinating stuff that just does not exist at all, you know how bad the idea of AI enhanced evidence actually is.

turkalino ,
@turkalino@lemmy.yachts avatar

Everyone uses the word "hallucinate" when describing visual AI because it's normie-friendly and cool sounding, but the results are a product of math. Very complex math, yes, but computers aren't taking drugs and randomly pooping out images because computers can't do anything truly random.

You know what else uses math? Basically every image modification algorithm, including resizing. I wonder how this judge would feel about viewing a 720p video on a 4k courtroom TV because "hallucination" takes place in that case too.

Downcount ,

There is a huge difference between interpolating pixels and inserting whole objects into pictures.

turkalino ,
@turkalino@lemmy.yachts avatar

Both insert pixels that didn't exist before, so where do we draw the line of how much of that is acceptable?

Downcount ,

Look at it this way: if you have an unreadable licence plate because of low resolution, interpolating won't make it readable (as long as we didn't switch to a CSI universe). An AI, on the other hand, could just "invent" (I know, I know, normie speak in your eyes) a readable one.

You'll draw the line yourself when you get your first speeding ticket for a car that wasn't yours.

turkalino ,
@turkalino@lemmy.yachts avatar

Interesting example, because tickets issued by automated cameras aren't enforced in most places in the US. You can safely ignore those tickets and the police won't do anything about it because they know how faulty these systems are and most of the cameras are owned by private companies anyway.

"Readable" is a subjective matter of interpretation, so again, I'm confused on how exactly you're distinguishing good & pure fictional pixels from bad & evil fictional pixels

Downcount ,

Whether tickets are enforced or not doesn't change my argument, nor invalidate it.

You are acting stubborn and childish. Everything there was to say has been said. If you still think you are right, so be it, as you are not able or willing to understand. Let me be clear: I think you are trolling, and I'm not in any mood to participate in this anymore.

turkalino ,
@turkalino@lemmy.yachts avatar

Sorry, it's just that I work in a field where making distinctions is based on math and/or logic, while you're making a distinction between AI- and non-AI-based image interpolation based on opinion and subjective observation

pm_me_your_thoughts ,

Okay, I'm not disagreeing with you about the fact that its all math.

However, interpolation of pixels is simple math. AI generation is complex math and is only as good as its training data.

The licence example is a good one. Interpolation will just find some average, midpoint, etc. and fill the pixel. With AI generation, if the training set had your number plate 999 times in a set of 1000, it will generate your number plate no matter whose plate you input. To use it as evidence, it would need to be far more deterministic than the probabilistic nature of AI-generated content allows.
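To make the distinction concrete, here is a minimal pure-Python sketch (no real codec, just the arithmetic): in bilinear upscaling, every new pixel is a weighted average of real neighbouring pixels, so the output can never contain a value that wasn't bounded by the input.

```python
# Deterministic upscaling: every new pixel is a weighted average of
# real neighbouring pixels, so no information is invented.
def bilinear_upscale(img, factor):
    h, w = len(img), len(img[0])
    out_h, out_w = h * factor, w * factor
    out = [[0.0] * out_w for _ in range(out_h)]
    for y in range(out_h):
        for x in range(out_w):
            # Map output coordinates back into the source grid.
            sy = y * (h - 1) / (out_h - 1) if out_h > 1 else 0
            sx = x * (w - 1) / (out_w - 1) if out_w > 1 else 0
            y0, x0 = int(sy), int(sx)
            y1, x1 = min(y0 + 1, h - 1), min(x0 + 1, w - 1)
            fy, fx = sy - y0, sx - x0
            # Weighted average of the four surrounding source pixels.
            out[y][x] = (img[y0][x0] * (1 - fy) * (1 - fx)
                         + img[y0][x1] * (1 - fy) * fx
                         + img[y1][x0] * fy * (1 - fx)
                         + img[y1][x1] * fy * fx)
    return out

src = [[0.0, 100.0],
       [100.0, 200.0]]
up = bilinear_upscale(src, 2)
# Every value in `up` lies between min(src) and max(src):
# interpolation can blur a licence plate, but never "read" it.
```

Note the contrast with a generative model: here you can point at the exact line where every output value came from.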

abhibeckert , (edited )

You can safely ignore those tickets and the police won’t do anything

Wait what? No.

It's entirely possible if you ignore the ticket, a human might review it and find there's insufficient evidence. But if, for example, you ran a red light and they have a photo that shows your number plate and your face... then you don't want to ignore that ticket. And they generally take multiple photos, so even if the one you received on the ticket doesn't identify you, that doesn't mean you're safe.

When automated infringement systems were brand new the cameras were low quality / poorly installed / didn't gather evidence necessary to win a court challenge... getting tickets overturned was so easy they didn't even bother taking it to court. But it's not that easy now, they have picked up their game and are continuing to improve the technology.

Also - if you claim someone else was driving your car, and then they prove in court that you were driving... congratulations, your slap on the wrist fine is now a much more serious matter.

Natanael ,

License plates are an interesting case, because with a known set of visual symbols (known fonts used by approved plate issuers) you can often accurately deblur even very, very blurry text (not with AI algorithms, but by modeling the blur of the cameras and the unique blur gradients this results in for each letter). It does require a certain minimum pixel resolution of the letters to guarantee unambiguity, though.
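A toy 1-D sketch of that approach (the glyph shapes and blur kernel here are made up for illustration; real systems would use 2-D glyph images from the issuer's actual font): rather than "enhancing" the image, you blur each candidate glyph with the modelled camera blur and keep whichever best matches the observation.

```python
# Plate deblurring by forward-modelling: blur every known glyph
# with the camera's estimated blur kernel and pick the best match,
# instead of trying to sharpen the blurry observation.
def blur(signal, kernel):
    # Simple 1-D convolution ("same" length, zero-padded edges).
    k = len(kernel) // 2
    out = []
    for i in range(len(signal)):
        acc = 0.0
        for j, w in enumerate(kernel):
            idx = i + j - k
            if 0 <= idx < len(signal):
                acc += signal[idx] * w
        out.append(acc)
    return out

# Hypothetical 1-D "glyphs" standing in for a known plate font.
GLYPHS = {
    "1": [0, 1, 1, 0, 0, 0],
    "7": [1, 1, 1, 0, 1, 0],
    "I": [0, 1, 1, 0, 0, 1],
}
KERNEL = [0.25, 0.5, 0.25]  # estimated camera blur

def identify(observed):
    # Squared error between the modelled blur and the observation.
    def err(name):
        model = blur(GLYPHS[name], KERNEL)
        return sum((a - b) ** 2 for a, b in zip(model, observed))
    return min(GLYPHS, key=err)

observed = blur(GLYPHS["7"], KERNEL)  # what the camera recorded
print(identify(observed))  # prints 7
```

Nothing is generated here: the method can only ever output one of the glyphs known to exist, and it fails honestly (ambiguous error scores) when resolution is too low.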

Blackmist ,

I mean we "invent" pixels anyway for pretty much all digital photography based on Bayer filters.

But the answer is linear interpolation. That's where we draw the line. We have to be able to point to a line of code and say where the data came from, rather than a giant blob of image data that could contain anything.
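The Bayer point is easy to demonstrate: a colour camera records only one channel per sensor site, and the other two are interpolated from neighbours at capture time. A toy sketch for recovering green at a red site in an RGGB mosaic (the pixel values are made up):

```python
# Demosaicing sketch: in an RGGB Bayer mosaic, a red or blue site
# has green sites directly above, below, left and right, so the
# missing green value is estimated as their average.
def green_at(mosaic, y, x):
    vals = []
    for dy, dx in ((-1, 0), (1, 0), (0, -1), (0, 1)):
        ny, nx = y + dy, x + dx
        if 0 <= ny < len(mosaic) and 0 <= nx < len(mosaic[0]):
            vals.append(mosaic[ny][nx])
    return sum(vals) / len(vals)

mosaic = [
    [12, 20],  # R G
    [30, 44],  # G B
]
print(green_at(mosaic, 0, 0))  # estimated green at the red site: 25.0
```

Crucially, like the linear interpolation endorsed above, this "invented" pixel is a fixed, auditable average, not a learned guess.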

Catoblepas ,

What’s your bank account information? I’m either going to add or subtract a lot of money from it. Both alter your account balance so you should be fine with either right?

becausechemistry ,

It’s not AI, it’s PISS. Plagiarized information synthesis software.

recapitated ,

Just like us!

Catoblepas ,

Has this argument ever worked on anyone who has ever touched a digital camera? “Resizing video is just like running it through AI to invent details that didn’t exist in the original image”?

“It uses math” isn’t the complaint and I’m pretty sure you know that.

FlyingSquid ,
@FlyingSquid@lemmy.world avatar

normie-friendly

Whenever people say things like this, I wonder why that person thinks they're so much better than everyone else.

turkalino ,
@turkalino@lemmy.yachts avatar

Normie, layman... as you've pointed out, it's difficult to use these words without sounding condescending (which I didn't mean to be). The media using words like "hallucinate" to describe linear algebra is necessary because most people just don't know enough math to understand the fundamentals of deep learning - which is completely fine, people can't know everything and everyone has their own specialties. But any time you simplify science so that it can be digestible by the masses, you lose critical information in the process, which can sometimes be harmfully misleading.

Krauerking ,

Or sometimes the colloquial term people have picked up is a simplified tool for getting the right point across.

Just because it's guessing using math doesn't mean it isn't, in a sense, hallucinating the additional data. The data didn't exist before, and the model willed it into existence, much like a hallucination, and the word makes it easy for people to quickly catch on that the output isn't trustworthy, thanks to their existing understanding of it.

Part of language is finding the right words so that people can quickly understand topics, even if it means giving up nuance; the goal should be getting them to the right conclusion, even in simplified form, which doesn't always happen when there is bias. I think this one works just fine.

Hackerman_uwu ,

LLMs (the models "hallucinate" is most often used in conjunction with) are not deep learning, normie.

turkalino ,
@turkalino@lemmy.yachts avatar

https://en.m.wikipedia.org/wiki/Large_language_model

LLMs are artificial neural networks

https://en.m.wikipedia.org/wiki/Neural_network_(machine_learning)

A network is typically called a deep neural network if it has at least 2 hidden layers
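Taking the quoted definition literally, a "deep" network is nothing exotic; a minimal sketch in plain Python of an MLP with two hidden layers (weights chosen arbitrarily for illustration), which is the same repeated matrix-vector-plus-nonlinearity idea an LLM uses at scale:

```python
# An MLP with two hidden layers: by the quoted definition, "deep".
def relu(v):
    return [max(0.0, x) for x in v]

def layer(weights, v):
    # One dense layer: each output is a dot product of a weight row
    # with the input vector.
    return [sum(w * x for w, x in zip(row, v)) for row in weights]

def forward(x, layers):
    for i, w in enumerate(layers):
        x = layer(w, x)
        if i < len(layers) - 1:  # nonlinearity between layers
            x = relu(x)
    return x

# input -> hidden1 -> hidden2 -> output: two hidden layers = "deep"
w1 = [[1.0, -1.0], [0.5, 0.5]]
w2 = [[1.0, 1.0], [-1.0, 1.0]]
w3 = [[1.0, 1.0]]
print(forward([1.0, 2.0], [w1, w2, w3]))  # prints [3.0]
```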

Hackerman_uwu ,

I’m not going to bother arguing with you but for anyone reading this: the poster above is making a bad faith semantic argument.

In the strictest technical terms, AI, ML and deep learning are distinct, and they have specific applications.

This insufferable asshat is arguing that since they all use fuel, fire and air, they are all engines. Which isn't wrong, but it's also not the argument we are having.

@OP good day.

turkalino ,
@turkalino@lemmy.yachts avatar

When you want to cite sources like me instead of making personal attacks, I’ll be here 🙂

Hackerman_uwu ,

I said good day.

turkalino ,
@turkalino@lemmy.yachts avatar

Ok but before you go, just want to make sure you know that this statement of yours is incorrect:

In the strictest technical terms AI, ML and Deep Learning are district, and they have specific applications

Actually, they are not the distinct, mutually exclusive fields you claim they are. ML is a subset of AI, and Deep Learning is a subset of ML. AI is a very broad term for programs that emulate human perception and learning. As you can see in the last intro paragraph of the AI wikipedia page (whoa, another source! aren't these cool?), some examples of AI tools are listed:

including search and mathematical optimization, formal logic, artificial neural networks, and methods based on statistics, operations research, and economics

Some of these - mathematical optimization, formal logic, statistics, and artificial neural networks - comprise the field known as machine learning. If you'll remember from my earlier citation about artificial neural networks, "deep learning" is when artificial neural networks have more than one hidden layer. Thus, DL is a subset of ML is a subset of AI (wow, sources are even cooler when there's multiple of them that you can logically chain together! knowledge is fun).

Anyways, good day :)

cucumberbob ,

It’s not just the media who uses this term. According to this study which I’ve had a very brief skim of, the term “hallucination” was used in literature as early as 2000, and in Table 1, you can see hundreds of studies from various databases which they then go on to analyse the use of “hallucination” in.

It’s worth saying that this study is focused on showing how vague the term is, and how many different and conflicting definitions of “hallucination” there are in the literature, so I for sure agree it’s a confusing term. Just it is used by researchers as well as laypeople.

Hackerman_uwu ,

Tangentially related: the more people seem to support AI all the things the less it turns out they understand it.

I work in the field. I had to explain to a CIO that his beloved “ChatPPT” was just autocomplete. He became enraged. We implemented a 2015 chatbot instead, and he got his bonus.

We have reached the winter of my discontent. Modern life is rubbish.

Kedly ,

Bud, "hallucinate" is a perfect term for the shit AI creates, because it doesn't understand reality, regardless of whether math is creating that hallucination or not.

Malfeasant ,

computers can't do anything truly random.

Technically incorrect - computers can be supplied with sources of entropy, so while it's true that they will produce the same output given identical inputs, it is in practice quite possible to ensure that they do not receive identical inputs if you don't want them to.
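A quick Python illustration of that point, using the OS entropy pool as the outside input:

```python
import os
import random

# Same seed in, same sequence out: the generator is deterministic.
a = random.Random(42).random()
b = random.Random(42).random()
assert a == b

# Seeding from the OS entropy pool makes runs differ in practice,
# even though the algorithm itself is still deterministic.
c = random.Random(os.urandom(16)).random()
d = random.Random(os.urandom(16)).random()
print(c != d)  # almost certainly True
```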

Hackerman_uwu ,

IIRC there was a random number generator website where the machine was hooked up to a potato or some shit.

abhibeckert ,

computers aren’t taking drugs and randomly pooping out images

Sure, no drugs involved, but they are running a statistically robust pseudo-random number generator and using that (along with non-random data) to generate the image.

The result is this - ask for the same image, get two different images — similar, but clearly not the same person - sisters or cousins perhaps... but nowhere near usable as evidence in court:

https://lemmy.world/pictrs/image/f15e3a04-5f33-4104-a011-ecb27fbb8ca0.png

https://lemmy.world/pictrs/image/57d4f726-6c07-4734-82dd-175101168a5b.png

Gabu ,

Tell me you don't know shit about AI without telling me you don't know shit. You can easily reproduce the exact same image by defining the starting seed and constraining the network to a specific sequence of operations.
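A tiny sketch of the reproducibility claim (`render` here is a trivial hypothetical stand-in for an actual diffusion pipeline, not a real model): the output is a deterministic function of (prompt, seed), so pinning the seed pins the image, while leaving it unpinned gives the "sisters or cousins" effect above.

```python
import random

# A generative pipeline as a deterministic function of (prompt, seed).
def render(prompt, seed):
    rng = random.Random(f"{prompt}|{seed}")  # fully determined state
    return [rng.random() for _ in range(8)]  # stand-in "image"

fixed_a = render("two sisters", seed=1234)
fixed_b = render("two sisters", seed=1234)
print(fixed_a == fixed_b)  # True: same seed, bit-identical output

free_a = render("two sisters", seed=random.randrange(2**32))
free_b = render("two sisters", seed=random.randrange(2**32))
print(free_a == free_b)  # almost certainly False: unpinned seeds
```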

Natanael ,

But if you don't do that then the ML engine doesn't have the introspective capability to realize it failed to recreate an image

Gabu ,

And if you take your eyes off of their sockets you can no longer see. That's a meaningless statement.

blind3rdeye ,

The point is that the AI 'enhanced' photos have nice clear details that are randomly produced, and thus should not be relied on. Are you suggesting that we can work around that problem by choosing a random seed manually? Do you think that solves the problem?

autotldr Bot ,

This is the best summary I could come up with:


A judge in Washington state has blocked video evidence that’s been “AI-enhanced” from being submitted in a triple murder trial.

And that’s a good thing, given the fact that too many people seem to think applying an AI filter can give them access to secret visual data.

Lawyers for Puloka wanted to introduce cellphone video captured by a bystander that’s been AI-enhanced, though it’s not clear what they believe could be gleaned from the altered footage.

For example, there was a widespread conspiracy theory that Chris Rock was wearing some kind of face pad when he was slapped by Will Smith at the Academy Awards in 2022.

Using the slider below, you can see the pixelated image that went viral before people started feeding it through AI programs and “discovered” things that simply weren’t there in the original broadcast.

Large language models like ChatGPT have convinced otherwise intelligent people that these chatbots are capable of complex reasoning when that’s simply not what’s happening under the hood.


The original article contains 730 words, the summary contains 166 words. Saved 77%. I'm a bot and I'm open source!
