Jikes! AI experiments

Well, I’m doing the unthinkable: I’m starting another scrapbook, one where I’ll share my AI (yes: A-f*ckin-I!!) experiments, and possibly my thoughts on the results.

Why am I doing this, despite all my arguing against AI?
Well, first, I don’t argue against AI, only against AI in corporate hands, in state hands, or, in general, as a tool to gain power - or as a tool for these f*ckin early adopters, opportunity hunters and other *ssholes who hope to gain an advantage over others, become rich quicker on the backs of others, and in general love their game of musical chairs.

Second: because those who say it is here to stay are right. There’s no point ignoring it; I (we artists, and ideally everyone else too, I think) have to find out what’s possible, get familiar with it, and maybe find uses for it.

I have always seen myself as a narrator primarily. Unfortunately, I got lost in the endless technicalities of 3D and in economic necessities, but maybe, maybe, a tool that lets me shift focus away from endless technical detail and back to story and narration could even turn out to be a good thing for me in the end, who knows?

So I was thinking about potential workflows - workflows that would preserve the parts of digital movie making I like, while cutting short those I don’t. One such possibility, I believe, could be art-directed animation, where I could leave the meticulous visual detail to the AI while still being able to art direct character appearance, acting and movement.

I’m using a local Stable Diffusion (XL) installation and ComfyUI (a node-based UI for SD), and here’s one of my attempts to direct SDXL by feeding it hand-drawn animation:

To be very honest with you guys - and let’s not be too nitpicky about the number of tails and paws, or the moving trees - it’s strange to admit, but I like what it does. :slight_smile:
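For anyone wanting to try something similar outside ComfyUI: the core of it boils down to per-frame img2img over the drawn frames. Here’s a minimal sketch assuming the diffusers library - paths, prompt and strength are placeholders, and unlike a proper video workflow it does nothing for temporal consistency:

```python
# Minimal sketch: frame-by-frame SDXL img2img over hand-drawn input
# frames, assuming the diffusers library. Paths, prompt and strength
# are placeholders, not the actual ComfyUI workflow used above.
import torch
from pathlib import Path
from PIL import Image
from diffusers import StableDiffusionXLImg2ImgPipeline

pipe = StableDiffusionXLImg2ImgPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0", torch_dtype=torch.float16
).to("cuda")

prompt = "A wolf in the forest, vibrant colors, animated movie"
Path("out").mkdir(exist_ok=True)
for frame_path in sorted(Path("drawn_frames").glob("*.png")):
    frame = Image.open(frame_path).convert("RGB").resize((1024, 1024))
    # strength decides how far SDXL may stray from the drawing:
    # low = faithful tracing, high = loosely inspired repaint
    result = pipe(prompt, image=frame, strength=0.5).images[0]
    result.save(Path("out") / frame_path.name)
```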

PS:
I’m open to AI-related discussion here; however, I’d like to restrict it to the pros and cons of genAI regarding the artistic/movie-making side. For all the rest there’s the other thread … :slight_smile:

8 Likes

I started playing with SD (A1111) about five days ago. I found it relatively easy to get started making good pics. They’re not necessarily the pics I want to make, which is the biggest difference between SD and Blender, but they’re good pics. Better than any pics I’ve made in Blender. (Check out the “Blender” checkpoints on Civitai and cringe, and realize that’s what we’ve mostly been making in Blender.)

Every job I’ve ever had, I’ve always had the opinion that if my job goes away, that’s a good thing, because my job fucking sucks. Playing with Blender is no different. I like playing with Blender, but no matter what happens, I can still play with Blender. But Blender jobs still suck, because clients suck. If professional art goes away, which of course it won’t, I really don’t mind, because there’s no reason people can’t indulge in amateur art.

Especially with 3D art, it seems to me that it has always been a moving target anyways. Pro 3D is about adapting to new tools. AI art is another new tool.

I originally started playing with SD in the hopes of using some of its tools in Blender. But it turns out that a lot of the tools around AI are really, really poor. Here’s an OpenPose interpretation (img->pose) of an A-pose:

It’s just a rest pose, but the OpenPose result is totally asymmetrical, the face is lopsided, etc. No good.

Depth and normal maps are similar. They are weak approximations, only good when you’re ignoring them half the time (as ControlNet is built to do).
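For context, the pipeline in question looks roughly like this - a hedged sketch assuming diffusers plus the controlnet_aux annotators, with a made-up input file name. The conditioning scale at the end is exactly that built-in “ignore it half the time” knob:

```python
import torch
from controlnet_aux import OpenposeDetector
from diffusers import ControlNetModel, StableDiffusionControlNetPipeline
from diffusers.utils import load_image

# img -> pose: the step that produced the lopsided skeleton above
detector = OpenposeDetector.from_pretrained("lllyasviel/Annotators")
pose_image = detector(load_image("a_pose_render.png"))  # placeholder input

controlnet = ControlNetModel.from_pretrained(
    "lllyasviel/sd-controlnet-openpose", torch_dtype=torch.float16
)
pipe = StableDiffusionControlNetPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",
    controlnet=controlnet,
    torch_dtype=torch.float16,
).to("cuda")

# scale < 1.0 lets the sampler partially ignore the (unreliable) pose
image = pipe(
    "a person standing in a rest pose",
    image=pose_image,
    controlnet_conditioning_scale=0.7,
).images[0]
```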

And although I’ve seen some use of SD for texturing, it has been really, really bad texturing. Looks good in SD; doesn’t look good in Blender.

Now I’m thinking, maybe I’m better off using Blender to make better SD art instead of vice versa. While there are indications that a few people are doing this (Civitai has two different Blender->OpenPose rigs), Googling really doesn’t get me much.
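One concrete version of that direction: instead of letting an AI estimator guess a depth map from a flat render, export an exact one straight out of Blender and hand it to a depth ControlNet. A hedged bpy sketch - the view layer name and output path are assumptions:

```python
# Sketch (Blender Python API): render a normalized depth map suitable
# as ControlNet depth input. "ViewLayer" and the file path are assumptions.
import bpy

scene = bpy.context.scene
scene.view_layers["ViewLayer"].use_pass_z = True

scene.use_nodes = True
tree = scene.node_tree
tree.nodes.clear()

rl = tree.nodes.new("CompositorNodeRLayers")
norm = tree.nodes.new("CompositorNodeNormalize")  # squash raw Z into 0..1
inv = tree.nodes.new("CompositorNodeInvert")      # near = bright, far = dark
comp = tree.nodes.new("CompositorNodeComposite")

tree.links.new(rl.outputs["Depth"], norm.inputs[0])
tree.links.new(norm.outputs[0], inv.inputs["Color"])
tree.links.new(inv.outputs["Color"], comp.inputs["Image"])

scene.render.filepath = "//depth_for_controlnet.png"
bpy.ops.render.render(write_still=True)
```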

But, I can also see that AI art isn’t all the way there yet. There are a lot of really beautiful pics out there; there are a few good videos; there are no long videos. Animation capability of AI art is still lacking.

Not to mention the cultural revulsion to AI art. So while you’ll see people using AI art, you won’t see them admitting to it - for the next ten years or so, I think. That’s how strong the kneejerk has been. So in-house concept art that no member of the public will ever see? That’s all AI now, and one good concept artist, with good AI technique, is taking all the jobs. But front-facing art? Nope, we’ll just recreate AI art by hand using skilled artists to avoid any controversy. Artists are cheaper than controversy.

And the idea that AI art requires zero expertise seems, immediately, like BS to me. People without any experience or education make crap AI art. But!-- it sure takes a lot less experience than Blender does. And, it’s in a totally different domain. I feel like everything I’ve ever learned about 3D is useless for AI art. It’s a bit more like learning how to communicate with an alien who doesn’t understand adjectives.

3 Likes

I am excited to see where this goes!

Hey @bandages, welcome to the game. :slight_smile:

Confirmed, at least if we’re talking about non-random results that have to match certain specifications.

As of now, AI is able to bypass some of 3D’s quirks and super-complex workflows, but on the other hand it introduces its own, utterly different quirks. 3D trouble is usually tech related - some specific detail you can’t achieve because the respective feature is missing or buggy: shadows in Eevee, hair sim, fluids, tissue, true displacement.

AI animation, on the other hand, seems to have content-related quirks. Whatever guidance you attempt, it can only handle what it has been trained on; the sort of abstraction and generalization a human artist would be capable of developing … just doesn’t happen.

Like, characters can’t do a pirouette (because characters obviously must always have a face, even if it ends up on the back of their head or breaks their neck), can’t appear turned upside down, and can’t enter or leave the stage. Basically, what SD(XL) video seems to handle well is frontal shots of people looking into the camera - the sort of thing AI “artists” share on YT … hence the sort of thing that goes into the training data.

In a way, AI (video) is just a continuation of all the digital stuff before it: it promises unbelievable results, but once you start trying, it presents you with random, maddening weaknesses, forcing you into a dense jungle of hacks and workarounds, devising strategies for cheating it into doing what you want it to do.

I wonder what Sora will be capable of … but then, Sora is fully controlled by OpenAI, and I don’t intend to hand over control of what I make to them.

Anyway, here are some more examples (I actually did these before the wolf experiment posted above); I’m posting them to give a better impression of what works … and what doesn’t.

Here’s the first attempt at the wolf. For those interested in the details, I did it using an LCM LoRA to speed up animation rendering:

"A wolf in the forest, leaves, flowers, light rays, vibrant colors, 8k, rich detail, Disney, Pixar, animated movie, modisn disney",
        "stylized, Pixar, Disney, 3D animated movie, modisn disney"

For the second attempt, I disabled the LCM LoRA and switched to Euler ancestral sampling (a rough sketch of both setups follows after the prompt). Also, after some research, I changed the prompt in an attempt to pin the trees to their spots. :slight_smile:

"A wolf in the forest, leaves, flowers, light rays, (no background noise, no movement), vibrant colors, 8k, rich detail, Disney, Pixar, animated movie, modisn disney",
        "stylized, Pixar, Disney, 3D animated movie, modisn disney"

Notice how SD creates the character in jumpy, random poses whenever there’s no wolf present in the input video. This seems to be a consistent problem; I have briefly looked into prompt travelling (= animated prompts changing over time) as a remedy, but unsuccessfully so far.
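For reference, prompt travelling boils down to a frame-indexed prompt map that the tool blends between. A toy illustration - frame numbers and wording are made up, and the real config format depends on the tool (e.g. animatediff-cli-prompt-travel):

```python
# Toy prompt-travel map: keys are the frames where a prompt takes over;
# the tool interpolates between neighbouring entries.
prompt_map = {
    0:  "forest clearing, leaves, light rays, modisn disney",   # no wolf yet
    24: "a wolf entering the frame from the left, modisn disney",
    48: "a wolf in the forest, vibrant colors, modisn disney",
}
```

The idea being that explicitly prompting the wolf-free stretches should stop SD from hallucinating the character - as said, no luck with it so far.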

The clip below shows what SDXL is already capable of in terms of detail.

Nude woman shot 1, crouching in the jungle (NSFW): http://thalion-graphics.de/video/sdxl-inthejungle.mp4

Try rigging a character to achieve that quality of skin deformation, wrinkles, muscles and all. And try setting up that hair!
Proportions are somewhat morphing, but that’s partly in the input video.
Still, it also shows SD can’t deal with body parts dropping out of visibility even briefly; there are issues with her hand as she reaches over.

This one shows both the strengths and the weaknesses: high quality as long as the movement works and SD maintains a grasp on the character’s body; however, it completely fails to extrapolate, in space and time, what it can’t see in the input:

Nude woman shot 2, struggling in the swamp (NSFW): http://thalion-graphics.de/video/sdxl-swampshot1.mp4

Also, notice that in both of these clips SD refuses to adopt the character’s acting as delivered in the input, and instead tones it down into something in line with the model’s training data - which, in all the cases I’ve shown, is an SDXL checkpoint trained for Disney/Pixar-style characters. In many cases the character blatantly refuses to look anywhere but directly at the camera …

4 Likes

Yeah, I’m slowly coming to the conclusion that AI art isn’t necessarily easier than traditional art, but what is easy in traditional art is hard with AI art, and what is hard in traditional art is easy with AI art. I think there has to be some way to use AI for what it’s good at and use something else for what it’s bad at.

I think, when we have that figured out, we’ll have the potential to make a high quality short animation in, let’s say, a week. A single person doing that. To me, that potential is absolutely huge.

I’m still learning; I’ve done a few animation tests (my computer isn’t great, I have to let it chug overnight). AnimateDiff sounds like a promising technique that I haven’t examined personally yet. I’m also interested in animating in Blender with some weird models and doing img2img techniques, generating masks from unusual Blender renders - hopefully there’s some way to get SD to fix the Blender animation problems you want fixed, without “fixing” the components you don’t want fixed (like the direction the eyes are pointing).
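To make the mask idea concrete, here’s a hedged sketch of the direction - a Blender-rendered mask (e.g. from an object index pass) telling an SDXL inpainting model which regions it may repaint; file names, prompt and checkpoint are placeholders, not a tested setup:

```python
import torch
from diffusers import StableDiffusionXLInpaintPipeline
from diffusers.utils import load_image

pipe = StableDiffusionXLInpaintPipeline.from_pretrained(
    "diffusers/stable-diffusion-xl-1.0-inpainting-0.1", torch_dtype=torch.float16
).to("cuda")

render = load_image("blender_render.png")    # the full Blender frame
mask = load_image("object_index_mask.png")   # white = repaint, black = keep

result = pipe(
    "detailed fur, cinematic lighting",
    image=render,
    mask_image=mask,
    strength=0.8,  # how aggressively SD may repaint the masked region
).images[0]
result.save("fixed_frame.png")
```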

As far as leaving the frame goes, could you just keep everything in frame and then crop (and upscale) afterwards?
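Mechanically the crop-and-upscale step itself is trivial - a Pillow sketch, with made-up crop coordinates:

```python
from PIL import Image

frame = Image.open("generated_frame.png")
cropped = frame.crop((256, 128, 768, 640))  # left, top, right, bottom
upscaled = cropped.resize((1024, 1024), Image.LANCZOS)
upscaled.save("framed_shot.png")
```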

3 Likes