Google is embedding inaudible watermarks right into its AI generated music

L4sBot@lemmy.world · 8 months ago

Stern@lemmy.world · 8 months ago

People are listening to AI generated music? Someone on Bluesky put it best (paraphrased slightly):

If they couldn’t put time into creating it I’m not going to put time into listening to it.

tahoe@lemmy.world · 8 months ago

I think I’d rather listen to some custom AI generated music than the same royalty free music over and over again.

In both cases they’re just meant to be used in videos and stuff like that, you’re not supposed to actually listen to them.

interceder270@lemmy.world · edit-2 8 months ago

Fun fact: Steve1989MREInfo uses all of his original music for his videos.

tahoe@lemmy.world · 8 months ago

This is the ultimate YouTuber power move. Exurb1a and RetroGamingNow do it too!

ChunkMcHorkle@lemmy.world · edit-2 8 months ago

A number of Youtubers do . . . and some of it’s even good, lol. John at Plainly Difficult and Ahti at AT Restorations are two that use their own music that I can think of off the top of my head.

Marin_Rider@aussie.zone · 8 months ago

Sam with Geowizard. Actually, quite a few “big” channels do, which is awesome.

t0fr@lemmy.ca · 8 months ago

People are using AI tools to do crazy stuff with music right now. It’s pretty great

Human performance but AI voice: https://www.youtube.com/watch?v=gbbUWU-0GGE

Carl Wheezer covers: https://www.youtube.com/watch?v=65BrEZxZIVQ

PipedLinkBot@feddit.rocks · 8 months ago

Here is an alternative Piped link(s):

https://www.piped.video/watch?v=gbbUWU-0GGE

https://www.piped.video/watch?v=65BrEZxZIVQ

Piped is a privacy-respecting open-source alternative frontend to YouTube.

I’m open-source; check me out at GitHub.

Piecemakers@lemmy.world · 8 months ago

You tell 'em, bot. 🙌🏽

ikidd@lemmy.world · 8 months ago

Can it be much different from the mass-market auto-tuned pap that gets put out today?

Dizzy Devil Ducky@lemm.ee · 8 months ago

The singers of that music actually have to use their voice to sing into a mic compared to someone on a computer typing in a prompt.

As much as I dislike modern pop music, I will definitely say they put in more work than the people who rely solely on an AI that will do all the work based on a prompt.

SweatyFireBalls@lemmy.world · 8 months ago

My own feelings on the matter aside (fuck google and all that), this is something that has been chased after for a long time. The famous composer Raymond Scott dedicated the back end of his life to trying to create a machine that did exactly this. Many famous musical creators such as Michael Jackson were fascinated by the machine and wanted to use it. The problem was that he was never “finished”. The machine worked and it could generate music; it’s immensely fascinating in my opinion.

If you want more information in podcast format, check out episode 542 of 99% Invisible or here: https://www.thelastarchive.com/season-4/episode-one-piano-player

They go into the people who opposed Scott and why they did, and also talk about the emotion behind music and the artists, and if it would even work. Because the most fascinating part of it all was that the machine was kind of forgotten and it no longer works. Some currently famous musicians are trying to work together to restore it.

The question then is, if someone created their life’s work and modern musicians spend an immense amount of time restoring the machine, when the machine creates music does that mean no one spent time on it? I enjoy debating the philosophy behind the idea in my head, especially since I have a much more negative view when a modern version of this is done by Google.

WillFord27@lemmy.world · 8 months ago

I feel like the machine itself would be the art in that case, not necessarily what it creates. Like if someone spent a decade making a machine that could cook FLAWLESS BEEF WELLINGTON, the machine would be far more impressive and artistic than the products it made

daltotron@lemmy.world · 8 months ago

i mean, where do you draw the line necessarily between the machine and what it creates? the machine itself is totally useless without inputs and outputs, not to say art needs utility. the beef wellington machine is only notable on its ability to conjure beef wellington, otherwise it’s just a nothing machine. which is still kind of cool, I guess, but the beef wellington machine not making beef wellington is kind of a disregard for the core part of the machine, no?

Corhen@lemmy.world · 8 months ago

That was a great episode of 99PI. Would love the machine restored.

IIRC, it’s not so much that it made music, but that it would create loops through iteration to inspire people. He wanted it to make full music but it was never close to that.

SweatyFireBalls@lemmy.world · 8 months ago

Yeah I think you’re right, and it was apparently actually random. The longer it would play a loop the more it would iterate. Such a cool thing to exist

emberwit@feddit.de · 8 months ago

You will still listen to it, watching movies, advertisements, playing video games…

Queen HawlSera@lemm.ee · 8 months ago

This is the worst timeline

Lemminary@lemmy.world · 8 months ago

This is the worst time line so far.

Sanity_in_Moderation@lemmy.world · 8 months ago

Not yet.

Piecemakers@lemmy.world · 8 months ago

Ok, boomer.

How’s that microwave dinner taste? Like an A for effort? Yeah, I bet.

LittleHermiT@lemmus.org · 8 months ago

A spectrum analysis and bandpass filter should take care of that.

Agent641@lemmy.world · 8 months ago

chuckles contemptfully in Audacity

uriel238@lemmy.blahaj.zone · 8 months ago

So we’ll just need another AI to remove the watermarks… which I think already exists.

sviper@programming.dev · 8 months ago

Don’t even need AI. Basic audio editing works.

some pirate@lemmy.dbzer0.com · 8 months ago

Lately on YouTube I’m constantly being bombarded with AI garbage music passed off as normal unknown bands and it’s getting really annoying. What will happen when there’s an actual new band but everyone ignores them because they think it’s just AI?

Shayeta@feddit.de · 8 months ago

ai garbage music

actual new band but everyone ignores them because you would think it’s just ai

I think you answered your own question.

Squizzy@lemmy.world · 8 months ago

Omg the AI voice describing a short is infuriating.

“This man was minding his own business not knowing he was about to change this child’s life…Watch how his interaction is measured…”

Dots Do not recommend this channel again

mx_smith@lemmy.world · 8 months ago

This raises the question: will AI style be the next big trend? Imagine if real painters started painting oil paintings that look uncanny and surreal like AI-generated art, with weird hands or weird eyes. Imagine if a real quartet decided to play an AI-generated piece of music.

crispy_kilt@feddit.de · 8 months ago

The Audacity!

Hehe.

AutoTL;DR@lemmings.world · 8 months ago

This is the best summary I could come up with:

Audio created using Google DeepMind’s AI Lyria model, such as tracks made with YouTube’s new audio generation features, will be watermarked with SynthID to let people identify their AI-generated origins after the fact.

In a blog post, DeepMind said the watermark shouldn’t be detectable by the human ear and “doesn’t compromise the listening experience,” and added that it should still be detectable even if an audio track is compressed, sped up or down, or has extra noise added.

President Joe Biden’s executive order on artificial intelligence, for example, calls for a new set of government-led standards for watermarking AI-generated content.

According to DeepMind, SynthID’s audio implementation works by “converting the audio wave into a two-dimensional visualization that shows how the spectrum of frequencies in a sound evolves over time.” It claims the approach is “unlike anything that exists today.”

The news that Google is embedding the watermarking feature into AI-generated audio comes just a few short months after the company released SynthID in beta for images created by Imagen on Google Cloud’s Vertex AI.

The watermark is resistant to editing like cropping or resizing, although DeepMind cautioned that it’s not foolproof against “extreme image manipulations.”

The original article contains 230 words, the summary contains 195 words. Saved 15%. I’m a bot and I’m open source!

SuckMyWang@lemmy.world · edit-2 8 months ago

it does this by converting the audio into a 2d visualisation that shows how the spectrum of frequencies evolves in a sound over time

Old school windows media player has entered the chat

Seriously fuck off with this jargon, it doesn’t explain anything

Terminarchs@slrpnk.net · 8 months ago

That’s actually an accurate description of what is happening: an audio file turned into a 2d image with the x axis being time, the y axis being frequency and color being amplitude.
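For anyone who wants to see that picture built from scratch, here is a minimal sketch of a spectrogram using only NumPy. The function name `spectrogram`, the frame size, and the hop length are illustrative choices, not anything from DeepMind's post; the test tone is just a 1 kHz sine.

```python
import numpy as np

def spectrogram(signal, sample_rate, frame_size=1024, hop=512):
    """Return (times, freqs, magnitudes) for a mono signal.

    magnitudes[i, j] is the amplitude at frequency freqs[i] and time times[j],
    i.e. the "color" at row i (y axis) and column j (x axis).
    """
    window = np.hanning(frame_size)
    n_frames = 1 + (len(signal) - frame_size) // hop
    # Slice the signal into overlapping, windowed frames.
    frames = np.stack([
        signal[k * hop : k * hop + frame_size] * window
        for k in range(n_frames)
    ])
    # One FFT per frame; transpose so rows are frequency and columns are time.
    magnitudes = np.abs(np.fft.rfft(frames, axis=1)).T
    freqs = np.fft.rfftfreq(frame_size, d=1.0 / sample_rate)
    # Time stamp of each frame's center, in seconds.
    times = (np.arange(n_frames) * hop + frame_size / 2) / sample_rate
    return times, freqs, magnitudes

# Hypothetical test signal: one second of a 1 kHz sine sampled at 8 kHz.
sample_rate = 8000
t = np.arange(sample_rate) / sample_rate
tone = np.sin(2 * np.pi * 1000 * t)

times, freqs, mags = spectrogram(tone, sample_rate)
peak_freq = freqs[mags[:, 0].argmax()]  # brightest row in the first column
```

Plotting `mags` as an image (for example with matplotlib's `pcolormesh`) gives the familiar picture: a single bright horizontal line at the tone's frequency.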

RufusLoacker@feddit.it · 8 months ago

That’s literally a spectrograph

Terminarchs@slrpnk.net · 8 months ago

Spectrogram*

Viking_Hippie@lemmy.world · 8 months ago

Your mom’s literally a spectrograph.

SuckMyWang@lemmy.world · 8 months ago

I know, it’s like the old windows media player visualisations.

FishFace@lemmy.world · 8 months ago

Sounds like a bad journalist hasn’t understood the explanation. A spectrogram contains all the same data as was originally encoded. I guess all it means is that the watermark is applied in the frequency domain.
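A small sketch of what "applied in the frequency domain" can mean, with one caveat: the round trip is lossless only if you keep the complex spectrum (magnitude and phase). A magnitude-only picture drops the phase. The bin index and the 1% nudge below are arbitrary stand-ins, and this is a toy, not SynthID's actual method, which the article does not describe.

```python
import numpy as np

rng = np.random.default_rng(0)
audio = rng.standard_normal(2048)  # stand-in for a short mono clip

# Forward transform: complex values, so magnitude and phase are both kept.
spectrum = np.fft.rfft(audio)

# Inverse transform recovers the original samples (up to float rounding).
restored = np.fft.irfft(spectrum, n=len(audio))
lossless = np.allclose(audio, restored)

# Toy "mark": scale one frequency bin by 1%, then go back to samples.
marked_spectrum = spectrum.copy()
marked_spectrum[100] *= 1.01
marked = np.fft.irfft(marked_spectrum, n=len(audio))

# Largest per-sample difference the tweak introduced.
max_change = np.max(np.abs(marked - audio))
```

The edit is made on the spectrum, but what gets saved is still an ordinary waveform that differs only slightly from the original.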

datavoid@lemmy.ml · edit-2 8 months ago

Also this isn’t new by any stretch… Aphex Twin would like a word

FishFace@lemmy.world · 8 months ago

Well, encoding stuff in the spectrogram isn’t new, sure. But encoding stuff into an audio file that is inaudible but robust to incidental modifications to the file is much harder. Aphex Twin’s stuff is audible!

SuckMyWang@lemmy.world · edit-2 8 months ago

I would like to know what it is that makes it so robust. The article explains very little. Is it in the high frequencies? Higher than the human ear can hear? Compression will affect that, plus that’s going to piss dogs off. Could be something with the phasing too. Filters and effects might be able to get rid of the watermark.

FishFace@lemmy.world · 8 months ago

I don’t know what frequencies are annoying for dogs but I’m guessing it’s above 24kHz so no sound file or sound system is going to be able to store or produce it anyway.

There will certainly be some way to get rid of the watermark. But it might nevertheless persist through common filters.
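For context on the 24 kHz guess: the highest frequency a digital file can represent is half its sample rate (the Nyquist limit). A quick sketch, where `nyquist_hz` is just an illustrative helper name:

```python
def nyquist_hz(sample_rate_hz: float) -> float:
    """Highest frequency representable at a given sample rate."""
    return sample_rate_hz / 2

# Common sample rates: CD audio, typical video/streaming audio, hi-res audio.
for rate in (44_100, 48_000, 96_000):
    print(f"{rate} Hz sampling -> content up to {nyquist_hz(rate):.0f} Hz")
```

So 24 kHz is the ceiling for 48 kHz audio and 22.05 kHz for CD-quality audio. Higher-rate files such as 96 kHz do exist, though most consumer playback chains and lossy codecs cut off well below that.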

Napain@lemmy.ml · 8 months ago

That’s like putting a watermark beside the bill. If it is inaudible then you can just delete it.

reksas@sopuli.xyz · 8 months ago

I wonder if being able to generate music will make people less interested in actually bothering to learn how to do it themselves. Having an AI tool makes many things so much easier, and you only need a rudimentary understanding of the subject.

Meowoem@sh.itjust.works · 8 months ago

Yeah, like most people don’t realise but until about 1900 most piano music was played by humans, of course there were no pianists after the invention of the pianola with its perforated rolls of notes and mechanical keys.

It’s sad, drums were things you hit with a stick once but Mr Theremin ensured you never see a drummer anymore, while Mr Moog effectively ended bass and rhythm guitars with the synthesizer…

It’s a shame it would be fun to go see a four piece band performing live but that’s impossible now no one plays instruments anymore.

People are never going to stop learning to play instruments, if anything they’ll get inspired by using AI to make music and it’ll get them interested in learning to play, they’ll then use ai tools to help them learn and when they get to be truly skilled with their instrument they’ll meet up with some awesomely talented friends to form a band which creates painfully boring and indulgent branded rock.

WillFord27@lemmy.world · 8 months ago

Those are somewhat false equivalencies, because all of them still required human input to work. AI generated music can be entirely automated, just put in a prompt and tell it to generate 10 and it’ll do the rest for you. Set up enough servers and write enough prompts and you can have hundreds of distinct and unique pieces of AI music put online every minute.

Realistically, putting aside sentimental value, there isn’t a single piece of music that humans have made that an AI couldn’t make. But I hope your optimism turns out to be right :/

daltotron@lemmy.world · 8 months ago

I sort of think this is looking at it wrong. That’s looking at music more like a product to be consumed, rather than one which is to be engaged with on the basis that it engenders human creativity and meaning. That’s sort of why this whole debate is bad at conception, imo. We shouldn’t be looking at AI as a thing we can use just to discard music from human hands, or art, or whatever, we should be looking at it as a nice tool that can potentially increase creativity by getting rid of the shit I don’t wanna deal with, or making some part of the process easier. This is less applicable to music, because you can literally just burn a CD of riffs, riffs, and more riffs (buckethead?), but for art, what if you don’t wanna do lineart and just wanna do shading? Bad example because you can actually just download lineart online, or just paint normal style, lineless or whatever. But what if you wanna do lineart without shading and making “real” or “whole” art? Bad example actually you can just sketch shit out and then post it, plenty of people do. But you get the point, anyways.

Actually, you don’t get the point because I haven’t made one. The example I always think of is Klaus. They used AI, or neural networks, or deep learning or matrix calculation or whatever who cares, to automate the 3 dimensional shading of the 2d art, something that would be pretty hard to do by hand and pretty hard to automate without AI. To do it well, at least. That’s an easy example of a good use. It’s a stylistic choice, it’s very intentional, it distinguishes the work, and it does something you couldn’t otherwise just do, for this production, so it has increased the capacity of the studio. It added something and otherwise didn’t really replace anyone. It enabled the creation of an art that otherwise wouldn’t have been, and it was an intentional choice that didn’t add like bullshit, it allowed them to retain their artistic integrity. You could do this with like any piece of art, if you so desired. I think this could probably be the case for music as well, just as T-Pain uses autotune (or pitch correction, I forget the difference) to great effect.

WillFord27@lemmy.world · 8 months ago

I like these examples. Taken to the extreme, I would still consider a piece of ai generated sheet music played by a human musician to be art, but I guess it’s all subjective in the end. For music specifically, I’ve always been more into the emotional side of it, so as long as the artist is feeling then I can appreciate it.

daltotron@lemmy.world · 8 months ago

For sure it would be art, there are a bunch of ways to interpret what’s going on there. Maybe the human adds something through the expression of the timing of how they play the piece, so maybe it’s about how a human expresses freedom in the smallest of ways even when dictated to by some relatively arbitrary set of rules. Maybe it’s about how both can come together to create a piece of music harmoniously. Maybe it’s about the inversion of the conventional structure of how you would compose music and then it would be spread on like, hole punched paper to automated pianos, how now the pianos write the songs and the humans play them. Maybe it’s about how humans are oppressed by the technology they have created. Maybe it’s about all of that, maybe it’s about none of that, maybe some guy just wanted to do it cause it was cool.

I think that’s kind of why I don’t dislike AI stuff, but I think people think about it wrong. Art is about communication, to me. A photo can be of purely nature, and in that way, it is just natural, but the photographer makes choices when they frame the picture. What perspective are they showing you? How is the shot lit? What lens? yadda yadda. Someone shows you a rock on the beach. Why that rock specifically? With AI, I can try to intuit what someone typed in, in order to get the output of a picture from the engine, I can try to deliberate what the inputs were into the engine, I can even guess which outputs they rejected, and why they went with this one over those. But ultimately I get something that is more of a woo woo product meant to impress venture capital than something that’s made with intention, or presented with intention. I get something that is just an engine for more fucking internet spam that we’re going to have to use the same technology to try and filter out so I can get real meaning and real communication, instead of the shadows of it.

WillFord27@lemmy.world · 8 months ago

I believe it will depend on a couple different factors. Putting keywords into a generator isn’t the same as laying your hands on an instrument, being able to physically play it yourself. However, if the result is so perfect and beautiful that a person could have never possibly come up with it on their own, it might be discouraging (but I can’t really see that happening)

RagingRobot@lemmy.world · 8 months ago

Maybe but people who are good at things already can use it as a tool to be better. You can combine the skills you do have with ai for the skills you don’t have to make something you never could have before.

I like to make games and for me this means I could make my own game music. I just don’t have the skills to do that on my own and make it sound good. But with ai I could get music that matches the quality of my other work.

Draconic NEO@lemmy.world · 8 months ago

So basically it’s security through obscurity, since once people know they can and will edit it out, especially those who want to use it for deception.