Image tricks with Stable Diffusion and clip-interrogator

Once I had Stable Diffusion working, I wanted to see what the AI would do on its own: let it describe a given picture, then re-create the picture from that description. How detailed can or should such a description be?

First, install "clip-interrogator":

python3 -m pip install --upgrade pip setuptools wheel
sudo apt install -y rustc cargo
pip install clip-interrogator
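
Before running the longer script, it can help to check that the packages are actually importable. The following is a minimal sketch, not part of the original setup: the helper name missing_modules is made up for illustration, and the module list is simply taken from the imports of the sample file.

```python
import importlib.util

# Import names used by the sample script; adjust the list if your script differs.
REQUIRED = ["torch", "diffusers", "PIL", "clip_interrogator"]


def missing_modules(names):
    """Return the top-level module names that cannot be found on the current Python path."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(REQUIRED)
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All required modules are importable.")
```

This only checks that the modules can be located; it does not verify that a CUDA-capable GPU is available.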

And here's the sample Python file. I got an error when I didn't reassign or clear the "pipe" variable after generating the images.

import torch
from torch import autocast
from diffusers import StableDiffusionPipeline
from PIL import Image
from clip_interrogator import Config, Interrogator

print("### Starting Stable Diffusion Pipeline")
pipe = StableDiffusionPipeline.from_pretrained("CompVis/stable-diffusion-v1-4")
pipe.to("cuda")

prompt = "a dark environment, two warriors standing on a chess field with swords drawn"
# prompt = "a laptop sitting on top of a wooden table"
steps = 50
width = 512
height = 512

print("### Creating images with Stable Diffusion")
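# autocast runs the pipeline in mixed precision on the GPU
with autocast("cuda"):
    for i in range(1):  # increase the range to generate more images
        output = pipe(prompt, width=width, height=height, num_inference_steps=steps)
        image = output["images"][0]
        # Derive the file name from the prompt: spaces become underscores, commas are dropped
        file = prompt.replace(" ", "_").replace(",", "")
        image.save(f"{file}-{i}.png")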

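# Drop the pipeline reference so its memory can be released before the interrogator loads
del pipe
torch.cuda.empty_cache()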
print("### Initializing Interrogator")
ci = Interrogator(Config(clip_model_name="ViT-L-14/openai"))
file = prompt.replace(" ", "_").replace(",", "")
for i in range(1):
    print("Loading file", i)
    image = Image.open(f"{file}-{i}.png").convert("RGB")
    print(ci.interrogate(image))
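
The script above does one pass: prompt to image to description. To explore how the description drifts when it is fed back in, the loop could be sketched as below. This is an illustration under assumptions, not the code I ran: round_trip is a hypothetical helper, and the two callables are stand-ins for the Stable Diffusion pipeline and the interrogator so the loop can be tried without a GPU.

```python
def round_trip(prompt, generate, describe, rounds=1):
    """Alternate generate -> describe and return every prompt seen, starting with the input.

    generate: callable taking a prompt string and returning an image
    describe: callable taking an image and returning a description string
    """
    prompts = [prompt]
    for _ in range(rounds):
        image = generate(prompts[-1])    # text -> image
        prompts.append(describe(image))  # image -> text
    return prompts


if __name__ == "__main__":
    # Stand-ins for illustration only; they do not call any model.
    def fake_generate(p):
        return {"rendered_from": p}

    def fake_describe(img):
        return img["rendered_from"] + ", highly detailed"

    for step, text in enumerate(round_trip("a laptop on a table", fake_generate, fake_describe, rounds=2)):
        print(step, text)
```

With the real objects, generate would wrap the call to pipe and describe would be ci.interrogate. Note that this needs both models loaded at the same time, which may run into the same error I hit when I didn't clear "pipe".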

