General AI Discussion


A heads up, there is now talk from OpenAI and Google on the idea of their image generation algorithms applying a watermark to all generated work.

The rationale is that it will crack down on bogus news stories, but at the same time it will actually act as another safeguard for concerned artists and designers (as a visible mark on the image will make it useless for commercial purposes).

1 Like

AI-generated images are already non-copyrightable, so they’re already useless for commercial use. Also, watermarks have a very low success rate; they were already easy to remove 20 years ago.

The dataset is the data used for the training. The weights that get modified are the neural network, i.e. the model. What was described sounds more like reinforcement learning from human feedback, but that is just a guess.

What you are describing is pretty accurate (just with the wrong terms). And that’s not just for TensorFlow; that is how all neural networks are trained (also in PyTorch, which is another popular framework).
Of course, when it comes to generative neural networks, there is usually quite a bit more to it, but overall, that is still how they are trained.
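For anyone curious, here is a minimal sketch (plain Python, no framework, toy numbers invented for illustration) of the training loop that TensorFlow and PyTorch both run under the hood. The point it demonstrates is the one above: the dataset is only read, while the model’s weights are what actually get modified:

```python
# Toy dataset of (input, target) pairs; the true rule is y = 2x.
dataset = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]

w = 0.0    # the "model": a single trainable weight
lr = 0.05  # learning rate

for epoch in range(200):
    for x, y in dataset:
        pred = w * x               # forward pass
        grad = 2 * (pred - y) * x  # gradient of squared error w.r.t. w
        w -= lr * grad             # update the weight; the data never changes

print(round(w, 3))  # w converges toward 2.0
```

Real frameworks do the same thing with millions of weights and automatic differentiation, but the shape of the loop (forward pass, loss gradient, weight update) is identical.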

Isn’t that just for images which were generated based on text prompts (in the US)?

1 Like

It is, at the moment, really unclear. The case you describe is for sure true; there’s definitely a wider scope, but it’s quite nebulous what that wider scope is right now.

1 Like

There is so much going on that I wasn’t sure whether I missed something (which would not be unusual for me :smiley: )

1 Like

Given what a mess things are right now, it would probably be wisest for companies to steer clear of AI generation in commercial settings until things are clearer. But IANAL, and I only use AI generation in the context of GitHub Copilot to take the drudgery out of writing boilerplate, so nothing I say means much of anything :slight_smile:

When they add watermarks, there are going to be people training neural networks to remove them (which is easier than training a generative neural network from scratch).
It’s a pretty dumb idea that could only convince politicians that it has any value, plus people who just want something to be done. Everyone who wants to work around it will simply do so.

3 Likes

Yet it doesn’t stop big* companies, like Netflix, Gearbox, Microsoft, Adobe, or Unity (to name just a few), from jumping in head first and deploying those kinds of technologies in the wild…

(*and a lot of small ones too)

Edit: though some are more cautious too! Like Valve/Steam, mentioned recently above (though idk how widespread that is, as there are tons of games using AI-generated images on Steam already)

1 Like

I doubt that they can make a watermark robust enough to survive even the Camera Raw filter in PS…

Yea, I think that’s basically it: make politicians feel that they did something about this “new and potentially dangerous technology”. Though, tbf, even simple metadata-based marking of images will probably be too hard for 90% of the population to circumvent…
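To illustrate how thin a metadata-based mark is, here’s a sketch in plain Python (stdlib only) that strips all textual metadata chunks from a PNG. The `Source: ai-generated` tag below is a hypothetical label made up for this example, not any real standard:

```python
import struct
import zlib

def png_chunk(ctype, data):
    """Build one PNG chunk: length, type, data, CRC."""
    return (struct.pack(">I", len(data)) + ctype + data
            + struct.pack(">I", zlib.crc32(ctype + data) & 0xFFFFFFFF))

def strip_text_chunks(png_bytes):
    """Return the PNG with all tEXt/iTXt/zTXt metadata chunks removed."""
    sig = b"\x89PNG\r\n\x1a\n"
    out, pos = bytearray(sig), len(sig)
    while pos < len(png_bytes):
        (length,) = struct.unpack(">I", png_bytes[pos:pos + 4])
        ctype = png_bytes[pos + 4:pos + 8]
        chunk = png_bytes[pos:pos + 12 + length]
        pos += 12 + length
        if ctype not in (b"tEXt", b"iTXt", b"zTXt"):  # drop textual metadata
            out += chunk
    return bytes(out)

# Build a tiny 1x1 grayscale PNG carrying a hypothetical AI label in a tEXt chunk.
ihdr = png_chunk(b"IHDR", struct.pack(">IIBBBBB", 1, 1, 8, 0, 0, 0, 0))
text = png_chunk(b"tEXt", b"Source\x00ai-generated")
idat = png_chunk(b"IDAT", zlib.compress(b"\x00\x00"))  # filter byte + 1 pixel
iend = png_chunk(b"IEND", b"")
png = b"\x89PNG\r\n\x1a\n" + ihdr + text + idat + iend

clean = strip_text_chunks(png)
print(b"ai-generated" in png, b"ai-generated" in clean)  # True False
```

Twenty lines of stdlib code and the label is gone, with the image itself untouched; that’s why metadata marking only stops people who don’t care enough to look it up.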

3 Likes

Sure, but the goal is to make the technology safer…

1 Like

For an AI to make anything, it needs exact literal copies of artwork.
One generated picture could be made from a million pictures.
If one of those million pictures is a picture of Mickey Mouse, using that picture would require paying thousands for the Mickey Mouse image alone.
In my example, every single copy of every single artist’s work would need to be bought to be legal, and I think that makes sense.

Copying the style of an artist is one thing.
Building an image out of hundreds of literal copies of an artist’s pictures is another.

1 Like

Well, I do think artists should credit others who influenced them, but of course it is not mandatory.

Getting inspired is one thing.
Making a literal copy of Mickey Mouse was illegal last time I checked.
Why should using literal copies to make a remix of them be any different? It’s even worse.

Also, I don’t think anyone but AI proponents would say an image interpreted through a human’s sensory filter is the same as a computer exactly copying the 1’s and 0’s of a JPEG.
I think AI is cool, but people generating stuff from copyrighted images should pay astronomical amounts of money to the original artists whose work was used by the AI model.

You use Snoop Dogg’s copyrighted songs to make your song? No problem.
But Snoop has to give consent, or you get copyright-struck.

1 Like

In my world, and the one I hope we’ll have in the near future, you can freely use literal copies of copyrighted art, but you will need to pay for the copies used to generate your stuff.

If the model you use is built on pre-paid art from consenting artists, I guess you just have to pay the AI generator, if they charge.
But that AI-generator company will have to acquire a lot of art to build its model, and I doubt many artists will be very enthusiastic about selling their art for such a purpose.

It isn’t, though, in the case of what we are talking about. With diffusion models there is no copying involved, as the model doesn’t store the original images.

It’s the output that matters in many copyright-related cases, btw. That was the gist of what the judge recently said in dismissing the case against Stability AI.

Also, there’s generative fill in the Photoshop beta now, where Adobe claims it has all the rights to the training data. OpenAI claims the same about DALL·E 2.

Plus: I was talking about copyrighting a style. An image generator can produce art in the style of X without being trained on X directly (look up what CLIP guidance is; also, one can acquire a license to images in the style of X made by someone else).

3 Likes

When neural networks have been trained on images, it is really difficult to make them generate an exact copy of those images (with very few exceptions).

You may be glad to hear that neural networks are not copying ones and zeros.

But those networks are not capable (with exceptions) of generating exact copies…

2 Likes

I thought diffusion models used training data based on images, 1’s and 0’s.

Yes, but they don’t store those 1’s and 0’s to spit them back out on the other side. Those 1’s and 0’s are instead used (over time, with many, many inputs) to modify how the neural net’s internals are routed, and those internals are then used to come up with a string of 1’s and 0’s based on all the inputs received over training, oftentimes with a feedback loop of its own output as well (that’s how they continue to “learn”).
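A rough back-of-envelope check on that (the figures below are approximate public numbers for Stable Diffusion v1 and a LAION-scale dataset, treated here as assumptions): the trained weights are simply too small a container to store the training images:

```python
# Approximate public figures, used as assumptions for this arithmetic:
params = 860e6          # roughly the parameter count of Stable Diffusion v1
bytes_per_param = 2     # fp16 weights
model_bytes = params * bytes_per_param  # ~1.7 GB of weights

images = 2.3e9          # roughly the number of training images (LAION-scale)
bytes_per_image = model_bytes / images
print(bytes_per_image)  # well under 1 byte of weight capacity per training image
```

So even if the model wanted to memorize, there’s less than a byte of weight budget per training image; what survives training is statistical structure, not copies.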

1 Like

But the model is indeed using a dataset of literal copies.
No human-error filter; exact copies.
Even the most brilliant human can’t copy a JPEG 100% exactly.
And if a human could do it, that would be considered plagiarism.

Before AI, no one really cared about an artist mixing Mickey Mouse with Mario Bros., because it takes a crazy amount of work and training to do that.
Now that AI does it in milliseconds, maybe yes, maybe we should put humans who copy others’ styles or remix their work on trial too.
But why do so?
People in general seem to appreciate something being remixed by a human being, and fellow artists in general don’t care as long as it is not plagiarism or copyright infringement.
There are existing laws for that already.

If humans collectively decide that an AI using a literal copy in a dataset is the same as « inspiration », well, so be it…
But if 99% of artists believe this is insane and unfair, I hope non-artists get the message, especially lawmakers.

As for references, you need to buy them to use them in production. I don’t know about personal use or use as inspiration, but as for myself, I try to use copyright-free stuff or ask my company to buy games/books/movies every time I need something, and I think this is the correct way of doing things. If not, you are stealing.

The thing that personally bugs me is that the data being used in an AI model is 100% a literal copy. Idc if the end product is 0% similar.
100% of the original data was used to generate the 0% remix.