Model used: ICBINP
Sampling method: Euler a
Steps: 32
1. Study
You need to study the girl you are trying to generate:
This girl has the following features:
1. Purple, medium length hair
2. Purple top, with cleavage
3. Purple shorts
4. Black knee high socks
5. Purple shoes
Then comes the hard part: experimenting to find the best way to express those features in words the AI model can understand.
2. Experiment with prompts and generate using text2image
The following is the prompt I've got after a bunch of experiments:
Positive:
a beautiful girl with medium hair, purple hair, (purple strapless tube top), (arm sleeves:1.5), strapless, cleavage, full body, large breast, midriff, purple denim shorts, belt, standing in a harbor, black socks, sneakers, sea view , looking at viewer, 24mm, 4k textures, soft cinematic light
Negative:
nsfw, nudity, plastic, Deformed, blurry, bad anatomy, bad eyes, crossed eyes, disfigured, poorly drawn face, mutation, mutated, extra limb, ugly, poorly drawn hands, missing limb, blurry, floating limbs, disconnected limbs, malformed hands, blur, out of focus, long neck, long body, ((((mutated hands and fingers)))), (((out of frame))), blender, doll, cropped, low-res, close-up, poorly-drawn face, out of frame double, two heads, blurred, ugly, disfigured, too many fingers, deformed, repetitive, black and white, grainy, extra limbs, bad anatomy, umbrella
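If you would rather script the generations than click through the UI, the settings above could be packed into a request body like the sketch below. It assumes the AUTOMATIC1111 web UI with its API enabled (this guide does not say which UI was used), the prompts are shortened for readability, and the batch size is a made-up value.

```python
import json

# Settings listed at the top of this guide.
SAMPLER = "Euler a"
STEPS = 32


def build_txt2img_payload(positive: str, negative: str, batch_size: int = 4) -> dict:
    """Build a JSON body for the (assumed) /sdapi/v1/txt2img endpoint."""
    return {
        "prompt": positive,
        "negative_prompt": negative,
        "sampler_name": SAMPLER,
        "steps": STEPS,
        # Hypothetical value: several images per call, since it takes many tries.
        "batch_size": batch_size,
    }


# Shortened versions of the prompts above, for readability only.
positive = "a beautiful girl with medium hair, purple hair, (arm sleeves:1.5), purple denim shorts"
negative = "nsfw, nudity, blurry, bad anatomy, ((((mutated hands and fingers))))"

payload = build_txt2img_payload(positive, negative)
print(json.dumps(payload, indent=2))
```

Nothing is sent anywhere here, so the snippet runs without a server. With the API enabled, a body like this would be POSTed to the web UI's txt2img endpoint.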
And after countless generations, I have something like this:
(The purple denim shorts were particularly hard because the model usually makes blue ones.)
This would serve as a base image to work with as it is close enough to the ideal image.
Note:
In the above prompt, you might notice things like (prompt:1.5) or ((prompt)). These control how strongly the model emphasizes that particular feature: a higher number or more brackets means more emphasis.
Only use them if the feature doesn't show up without the brackets.
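To get a feel for how strong each form of emphasis is, here is a rough sketch. It assumes the AUTOMATIC1111-style convention, where (text:1.5) sets the weight to 1.5 directly and each plain pair of brackets multiplies the weight by 1.1. The function name is made up for illustration, and it only handles the simple forms used in this guide.

```python
import re


def emphasis_weight(token: str) -> float:
    """Return the attention multiplier for one prompt token.

    Assumed convention (AUTOMATIC1111-style):
      (text:1.5) -> explicit weight 1.5
      (text)     -> 1.1, and every extra pair of brackets multiplies by 1.1 again
    """
    token = token.strip()
    explicit = re.fullmatch(r"\((.+):([0-9]*\.?[0-9]+)\)", token)
    if explicit:
        return float(explicit.group(2))
    depth = 0
    while token.startswith("(") and token.endswith(")"):
        token = token[1:-1]
        depth += 1
    return round(1.1 ** depth, 4)


for token in ["cleavage",
              "(purple strapless tube top)",
              "(arm sleeves:1.5)",
              "(((out of frame)))",
              "((((mutated hands and fingers))))"]:
    print(f"{token!r}: x{emphasis_weight(token)}")
```

Under that assumption, four pairs of brackets come out slightly weaker than a single (text:1.5), which is one reason the explicit number is easier to reason about.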
3. Further polish using img2img
Now back to the good old spot the difference.
Ask yourself, what is missing or off in this image?
For me, it was:
1. The weird-looking sleeves
2. The white stripes on the top and sleeves
3. The black socks
To let the model understand what I needed, I simply took a basic brush tool and painted over those parts (I used Krita, but honestly Paint would work too).
The end result was like this:
Now, using img2img, you can generate another image based on your "doodle".
To explain further, I need to mention a very important concept: denoising strength.
Denoising strength
It is a number ranging from 0 to 1: at 0 the output is identical to the original image, while at 1 the original image is completely overwritten.
See an example of generation based on the above image:
| Denoising strength: 0.3 | Denoising strength: 0.6 | Denoising strength: 0.9 |
As you can see, 0.3 leaves behind some ugly paint marks from the original image,
while 0.9 completely overwrites the original pose.
You need to adjust this parameter to get the best results.
As a general guide, if you need things to change a lot, go for 0.75.
Otherwise use a lower value, but don't go below 0.4.
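One way to picture denoising strength is as the share of the sampling steps that actually get re-run on top of your input image. The sketch below is a simplified model and an assumption, not a description of any specific UI: some img2img implementations scale the step count by the strength like this, but yours may differ. The function name is made up for illustration.

```python
def effective_steps(total_steps: int, denoising_strength: float) -> int:
    """Approximate how many sampling steps run for a given denoising strength.

    Simplified assumption: the image is noised part of the way and only
    that part of the schedule is denoised again.
    """
    if not 0.0 <= denoising_strength <= 1.0:
        raise ValueError("denoising strength must be between 0 and 1")
    return min(int(total_steps * denoising_strength), total_steps)


# 32 is the step count used in this guide.
for strength in (0.3, 0.6, 0.9):
    print(strength, "->", effective_steps(32, strength), "of 32 steps")
```

Seen this way, a low value gives the model very few steps to repaint your doodle, and a high value leaves almost nothing of the original untouched.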
img2img vs inpaint
Now let's get into the img2img part.
There are two main ways to do img2img:
i) Simple img2img
The concept is simple. It takes one image as input and changes the whole image based on your prompt:
| Original image | Denoising strength: 0.5 |
As you can see, it changes not only the girl but the entire image too.
ii) Inpaint
Inpaint means you only want part of the image to be changed.
It is achieved by highlighting the area you wish to "write over":
And below is the example result:
| Original | Denoising strength: 0.5 |
As you can see, only the area you highlighted was changed, not the other parts.
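The difference between the two modes can be pictured as a per-pixel choice between the old image and the newly generated one. This is only a conceptual sketch with tiny made-up grids of numbers standing in for images; real inpainting works on much larger data and usually softens the mask edge.

```python
def composite(original, generated, mask):
    """Pick the generated pixel where mask is 1, the original pixel where it is 0."""
    return [
        [g if m else o for o, g, m in zip(orig_row, gen_row, mask_row)]
        for orig_row, gen_row, mask_row in zip(original, generated, mask)
    ]


original = [[1, 1, 1],
            [1, 1, 1]]   # stand-in for the input image
generated = [[9, 9, 9],
             [9, 9, 9]]  # stand-in for what the model produced

full_mask = [[1, 1, 1],
             [1, 1, 1]]  # simple img2img: everything may change
hand_mask = [[0, 0, 0],
             [0, 1, 1]]  # inpaint: only the highlighted area may change

print(composite(original, generated, full_mask))
print(composite(original, generated, hand_mask))
```

With the full mask the whole grid is replaced, while with the smaller mask only the highlighted cells change.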
Img2Img Prompt
In general, it is completely OK to reuse your text2img prompt.
You only need to add new prompts if you need to generate something completely new in the image.
I'm over-simplifying here, because in fact it took me hours, but the following is the resulting image I generated:
Now, this image is 90% to my liking, but one thing is missing...
That's right,
Daggers!
Generate an item using inpaint
One thing to note here:
prepare for hell if you try to generate something like whips.
I know a lot of zako hold whips, and I know how tempting it is to generate them.
But take it from me,
DON'T TRY IT!
It looks super ugly, and no AI model can paint a complete one properly.
Normally I avoid picking a girl with a weapon as much as possible, as AI models, well, suck at generating hand-held weapons.
But since the weapon for this girl is simple enough, a dagger, I decided to give it a go.
Open the painting tool again, and let's draw an ugly-looking dagger in her hand:
(Surprisingly rough, right?)
Here comes the real work. Now use only this prompt:
holding a dagger
Then paint over her hand,
and choose "Only masked" below.
A denoising strength of about 0.5 is OK. You will start seeing images like the following in your generations:
Just do a bunch more and here is the end result:
Pretty neat, right? (I fixed her sock in the end too)
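For anyone scripting this step, the dagger inpaint settings could be expressed roughly as below. This is a sketch that assumes the AUTOMATIC1111 web UI API, and the mapping of "Only masked" to the `inpaint_full_res` field is my assumption. The image and mask bytes are placeholders, not real files.

```python
import base64


def to_base64(data: bytes) -> str:
    """Encode raw image bytes the way a JSON request body needs them."""
    return base64.b64encode(data).decode("ascii")


def build_inpaint_payload(image_bytes: bytes, mask_bytes: bytes) -> dict:
    """Build a JSON body for the (assumed) /sdapi/v1/img2img endpoint."""
    return {
        "init_images": [to_base64(image_bytes)],  # the image with the doodled dagger
        "mask": to_base64(mask_bytes),            # white where the hand was painted over
        "prompt": "holding a dagger",
        "denoising_strength": 0.5,
        "inpaint_full_res": True,                 # assumed equivalent of "Only masked"
        "sampler_name": "Euler a",
        "steps": 32,
    }


# Placeholder bytes; in practice these would be the PNG contents of image and mask.
payload = build_inpaint_payload(b"<image png bytes>", b"<mask png bytes>")
print(sorted(payload.keys()))
```

Since it takes a bunch of tries anyway, a loop around a request like this saves a lot of clicking.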
I hope this little guide gets you started on generating a zako using Stable Diffusion.
Let's get painting, boys!