Remixing your art with AI
bytescapes
One of the interesting features of AI image generators like Stable Diffusion, Midjourney and DALL-E is that they can take other images as inputs, and generate new images that are more or less based on those input images. I've been playing with Stable Diffusion to 'remix' some of my 3D art, and the results are quite intriguing. Attached are some examples: the original of the armored soldier was made with Carrara, the woman with DAZ Studio, and the city with Vue.
One amusing thing is that Stable Diffusion is quite careful not to create anything risqué: notice how the Stable Diffusion woman has modestly covered up, compared to her DAZ Studio original.
Share your own remixes?
[Attached images: Trooper12-01.jpg, LastBattleRemix01-01-02.png, CyberRain15-01.jpg, Riders01-01-01.png, Canian18-01.jpg, DesertLakeCity01-01-02.png]
Post edited by Richard Haseltine on
Comments
Very interesting stuff for sure! Good thing fingers are not very prominent in those images ;)
Care to share your settings/prompts and specs?
I haven't made much that I liked so far (even though my wife liked some of the generated stuff, much more than my original art I suspect lol). I'm def gonna try to remix some 3D images soon.
I've played with MidJourney, but the lack of fine control doesn't really suit the kind of work that I'm doing. On the other hand, I've been playing around a lot with the Neural Filters in Photoshop, and they really make an ideal complement for DS if you're working in still images. Here's a before and after of a render of the Albert Mansion set using elements created with the Landscape Mixer. While I did have to use some masking to keep the windows and some of the roof details from being completely swallowed up, the difference it made in creating a more decayed and abandoned version was well worth the effort.
In most cases, the prompts were descriptions of the scene (fun trick: start with a source image and write a description of it, then remove the image but use the same description, and see what you get). For the image with the woman, I actually specified that I wanted a Latina woman instead of an Asian woman, because I was trying to illustrate a specific scene. If I remember correctly, the 'image influence' for the scene with the woman and the soldier was 50% -- resulting in a quite 'faithful' result -- whereas for the city scene I dropped it to 10% or 20%, allowing the AI much more latitude. But notice how the AI reproduces the shape of the scene, even as the details change.
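Out of curiosity, here's a toy sketch of what that 'image influence' control roughly does (plain NumPy, hypothetical helper name, not the actual Stable Diffusion code): the source image gets blended with noise in proportion to the strength, and only a matching fraction of the denoising steps actually run, which is why low values stay so faithful to the composition.

```python
import numpy as np

def img2img_setup(source, strength, steps=50, seed=0):
    """Toy model of the img2img 'image influence' control.

    (Hypothetical helper -- NOT real Stable Diffusion internals.)
    The idea: noise the source in proportion to `strength`, then run
    only the last round(steps * strength) denoising steps, so a low
    strength keeps most of the source composition intact.
    """
    rng = np.random.default_rng(seed)
    noise = rng.standard_normal(source.shape)
    # Blend: strength 0 keeps the source untouched, 1 is pure noise.
    noised_start = (1.0 - strength) * source + strength * noise
    # Fraction of the schedule that actually gets denoised.
    denoise_steps = int(round(steps * strength))
    return noised_start, denoise_steps
```

At 50% you start from a half-noised source with half the steps running; at 10-20% almost nothing of the source gets repainted, which matches what I saw with the city scene.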
Pro tip: Stable Diffusion and Midjourney really like making pictures of jungles and forests. They really, REALLY like making jungles and forests. And the results are pretty great (in my experience). So phrases like "thick jungle" and "covered with vegetation" will often get you some quite nice results. Stable Diffusion does well with "covered with ice and snow" too. For my sci-fi images, I've found that "a Chris Foss spaceship" works quite nicely (if you like the particular look of Chris Foss's sci-fi art, that is). Throwing "steampunk" into the mix often yields nice stuff too. Also "rubble" and "ruins". And "alien" is often productive.
After you've played around for a while, you get a sense of where the particular engine does well, and where it fails badly.
In the case of Stable Diffusion, adding "Trending on Artstation" to your prompt gets you that particular ArtStation concept art/matte painting look which I quite like, so I've been adding it to a lot of my prompts. The alternative is to add "Unreal Engine", which gets you a different look that often works well. I haven't tried "Made with DAZ Studio", but maybe I should ...
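Those style suffixes are easy to experiment with systematically. A trivial helper (purely illustrative string glue, nothing model-specific) for tacking them on:

```python
def build_prompt(subject, style_tags=("trending on artstation",)):
    """Append style tags (like the ones discussed above) to a subject.

    Hypothetical convenience function -- it just joins strings, so you
    can swap in "unreal engine", "steampunk", etc. and compare results.
    """
    return ", ".join((subject,) + tuple(style_tags))
```

For example, `build_prompt("a Chris Foss spaceship, covered with vegetation")` gives you the subject with the ArtStation suffix already attached, so you can batch-compare the same subject across styles.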
I don't know... I'm not thrilled with the results...
My render of "The Fall of Saint Potatocus", based on the 1821 painting by Jean Paul Baguette...
And Night Cafe's interpretation and remixification...
It has clearly lost most of its foreboding color, looking almost as if Potatocus is ascending to heaven, as opposed to being cast out for his arrogance.
As you can see, instead of being enhanced it bears no resemblance to the original as seen at the Metropolitan Museum of Art in NYC...
Hmm. Maybe try "Trending on Potatotown" or "Crank the potatoness to eleven"?
The thick vegetation point is good to know (and great since I like pictures like that, too)... Kinda related, one of the only prompts I kinda liked some results (and especially my wife liked it) was something like "car made out of flowers"...
This is Stable Diffusion doing "thick vegetation". The full prompt (no source image) was:
I think you'll agree that it pretty much nailed the vegetation.
Yeah for sure, the plants are partly a bit alien or distorted but it looks pretty dense and awesome!
Edit: Just saw that it had "alien jungle" in the prompt, so that makes even more sense ^^
I think the phrase y'all are looking for is to make it more spudly.
First thing I noticed is that the AI still seems to have severe problems with eye details.
In each of your examples I would prefer the original (non-AI-"enhanced") version, as they look crisper and less wishy-washy to me. The last one might work as a background, but even there I'd prefer the render.
I think it looks neat, like some old Elton John album
This one is good, compared to some of the horrors I've seen.
Several of the AI-based tools struggle with aspects of human figures (people don't have three legs? Really?) Hands are reportedly a particular weak point for Stable Diffusion, while Midjourney does terrifying things to people's necks. And I've seen Stable Diffusion do dreadful things to horses.
It's always a matter of taste, and the AI-generated images tend often to be impressionistic: the AI produces an image that has a reasonable overall 'look', but if you look at the details, it falls apart. It doesn't have any real understanding of what's in the image. As I understand it, it's essentially making statistical decisions on what color one pixel should be based on the colors of the adjacent pixels (with the statistical choices determined by the vast library of images that it has been trained on).
The "wishy-washy" look is partly due to the way the AI works, and partly because I used the phrase "Trending on Artstation" in the prompt to push the model to give more weight to a particular part of the training set. The images that do well on ArtStation tend to have a particular look, because I think they're mostly created by digital overpainting of rendered models, sometimes very rapidly ("speed-painting"). It's possible that if I wrote my prompts differently, you'd get images that you found more appealing. It's also possible, though, that they'd fall apart catastrophically. One of the reasons why I tend to use "Trending on Artstation" heavily is because the resulting images mostly look sort of OK (if you like the impressionistic ArtStation look), whereas if you try for photorealism things can get nightmarish real fast.
I've been having a ball playing with Stable Diffusion. Local install using my GPU, not one of the bajillion online iterations popping up this month. The Automatic1111 webui version is the one I've tried and favor so far. Can't really post any works here, though; too much skin. Supposedly the 1.5 model for Stable Diffusion is gonna drop this weekend, and has improved a lot in the hands and face area. Will have to wait and see if it's just hype or not.
video with AI generated props
how I did it
(ZRemeshed and applied a planar UV for the DAZ video, as it was 4 million polygons)
Indeed. I'm starting to question the motives of our AI overlords...
Prompts were
“A macro photograph of a tiny woman riding a red ant over the forest floor”
“A photograph of a llama riding a unicycle in Trafalgar Square”
“photograph of a cat, riding a shark, through a bright colourful nebula whilst drinking a pint of lager”
Dunno... the pictures show exactly what I would expect a "so-called AI" to come out with nowadays, considering their proven abilities, like face recognition (as a security app for smartphones) not seeing any difference between Asian people's faces, and suchlike...
Right now AI is just a buzzword for "semi-smart software products". IMHO the computing power and database capacity needed to produce AI software that can compare with the human brain doesn't exist yet. So AI will have problems with human imagination and its millions of ways of producing different results from the same starting point for quite a while longer.
The cat sitting in that beer glass wearing that shark-hat-ish thing looks cool, though...
I wouldn't have expected "unicycle" to be the part that would give it the most trouble.
So just for fun I did a little prompt "DAZ3D on holiday, Victoria 8.1 and Michael 8.1" , plus some prompts to get a style in :)
Great examples.
Did you run Stable Diffusion on your own computer or on line?
I have heard that to run Stable Diffusion on your own computer, you need at least 12 GB of VRAM on the graphics card.
I run it locally. More is always better, but you can get going from 6 GB onward (there are a few settings you can use for low VRAM).
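For example, the Automatic1111 webui mentioned above has launch flags for smaller cards (flag names as of the builds I've seen; check your version's docs, they change between releases):

```shell
# Automatic1111 stable-diffusion-webui launch flags for lower-VRAM cards
python launch.py --medvram    # split the model between VRAM and system RAM
python launch.py --lowvram    # more aggressive splitting, noticeably slower
```

The trade-off is speed: the lower-VRAM modes shuffle model pieces in and out of the card, so generation takes longer.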
I think it says an awful lot about how DAZ products are used in the wild that the AI's default assumption for Victoria seems to be naked with high heels, while the assumption for Michael is fully clothed.
Ok, thanks. My graphics card has 8 GB of VRAM.
I have a few that came out pretty cool
Holy crap, Stable Diffusion is just amazing. I'm constantly in awe of what it can produce from a combination of descriptors and art styles. I'm so glad I picked up a 3090; you really need a lot of VRAM.
Very nice examples.
How did you get such a big resolution? Have you scaled up the images?
I thought that Stable Diffusion produced images at 512 x 512 pixels, or at most 1024 x 1024 pixels, depending on the amount of VRAM on the video card.
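If it's upscaling, I guess even a naive upscaler changes the dimensions like this (toy nearest-neighbour sketch; the real AI upscalers people pair with Stable Diffusion, like ESRGAN, invent plausible detail rather than just repeating pixels):

```python
import numpy as np

def upscale_nearest(img, factor):
    """Naive nearest-neighbour upscale: each pixel becomes a
    factor x factor block. Real AI upscalers (e.g. ESRGAN) hallucinate
    extra detail instead of simply repeating pixels like this does."""
    return np.repeat(np.repeat(img, factor, axis=0), factor, axis=1)
```

So a 512 x 512 generation upscaled 4x would come out at 2048 x 2048, which would explain resolutions bigger than the model's native output.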
Here's something new: Cybertenko, over at Renderosity, has a new product which is a set of steampunk-themed 2D backgrounds called "Steam Be Praised". To my eye, these look very much like the output of an AI art generator, most likely Stable Diffusion. The notes on the product say "Original Design/AI assist", which suggests that they may have used rendered scenes as input to the generator.
So if you render a scene and create an AI-generated image off that, and then use that image as a background for another rendered scene, and ...
Where does it end?!?!???!
It just works, do it!
How about some grammar training? For the at-home version, not for me...
(1) Lady Guinevere could have covered her breasts before slaying the dragon.
(2) With a mild smile on his cheeks, the Orc Mage tossed me out of the window, not realising the already-primed bundle of TNT I had attached to his hat just a split second ago, knowing the umbrella I had previously stolen from the squid king would save me this time...
Variation: alter the result not by telling it what to render, but instead by adding bits of story at the front or the end. Not sure what happens if you make it "the anonymous Orc Mage". Needn't be (1) or (2). Just ideas until I've registered/set up something to fool around with myself.
(Image cross-posted from a thread about the UFO model)
This image was generated with MidJourney, and the UFO was 3D-rendered and added in with Photoshop. I LOVE mixing A.I. images and 3D!
Nazca Visitor - A.I. and 3D by Erik Pedersen
Some of the end results I've seen look great. I have always wanted to draw and/or paint well, but I just don't have the talent. That is why I use DAZ to help bring my ideas to life. I would use these, but there are some major drawbacks. These AI companies are charging monthly subscriptions or micro-transactions for one render. Not to mention some also make your artwork communal property. If a company offered a stand-alone product for a flat price, I would buy it. I'm not interested in spending $30 a month on a hobby.