import torch
from diffusers import FluxPipeline
pipe = FluxPipeline.from_pretrained('C:\\Python\\Projects\\test1\\flux1dev', torch_dtype=torch.bfloat16)
pipe.enable_sequential_cpu_offload()
prompt = "beach ball"
image = pipe(
    prompt,
    height=1024,
    width=1024,
    guidance_scale=3.5,
    num_inference_steps=50,
    max_sequence_length=512,
    generator=torch.Generator("cpu").manual_seed(0)
).images[0]
image.save("beach ball.png")
I ran into an issue running this simple test of FLUX.1.
Every time I tried to run the code, it would simply stop after loading some of the pipeline components.
No exception or error code was thrown and no output was produced; the program just stopped.
I really have no idea what I'm doing and was just messing around with Flux to see what it could do.
There's also this really weird issue in the terminal where the last thing being loaded gets pushed onto the next line of the terminal.
I run FLUX.1-dev using CPU alone, with NO GPU, on an old rack server at home over SSH via PuTTY — an obsolete, elderly potato of a machine — and generating images works just fine for me.
CAVEAT. SADLY, YOU MAY NOT BE ABLE TO ADD 192GB OF RAM TO YOUR PC as I have in my old rack server, but that's life. I have heard, unconfirmed, that any home desktop machine with 64GB is usable.
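To put rough numbers on that RAM claim, here is a back-of-envelope sketch. The parameter counts are approximate public figures for the FLUX.1 components (assumptions on my part, not measurements), so treat the totals as ballpark only:

```python
# Rough RAM estimate for holding the FLUX.1 weights fully in CPU memory.
# Parameter counts below are approximate public figures (assumptions):
# ~12B for the FLUX transformer, ~4.7B for the T5-XXL text encoder.
PARAMS = {
    "flux_transformer": 12e9,   # approx.
    "t5_xxl_encoder": 4.7e9,    # approx.
    "clip_encoder": 0.12e9,     # approx.
    "vae": 0.08e9,              # approx.
}

def footprint_gb(bytes_per_param):
    """Total weight memory in GiB for a given dtype size (4 = float32, 2 = bfloat16)."""
    total_params = sum(PARAMS.values())
    return total_params * bytes_per_param / 2**30

print(f"float32 weights: ~{footprint_gb(4):.0f} GiB")
print(f"bfloat16 weights: ~{footprint_gb(2):.0f} GiB")
```

With float32, the weights alone land somewhere around the 60GiB mark, which is why 64GB is borderline and my 192GB server is comfortable: activations, the Python process, and the OS all need headroom on top.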
NOTE. There are TRICKS to getting FLUX, both -schnell and -dev, to work on ANY PC using the CPU alone.
That 'weird issue in the terminal' is A PROGRESS BAR, and it is supposed to look like that. If your CPUs and system are slow like mine, it will not change for a long time.
If you had left it alone, it might have produced an image in a few days.
Changes to your Python program that might help:
REMOVE the pipe.enable_sequential_cpu_offload() line. It is a workaround for a hardware limitation of CUDA graphics cards used as math accelerators — they lack the VRAM to do real work, and the 80GB cards are, as of 2024, out of the reach of mere mortals. On a machine with no GPU there is nothing to offload from, so the call serves no purpose.
CHANGE torch_dtype=torch.bfloat16 to torch_dtype=torch.float32. This is counter-intuitive, as 32-bit floats would seem slower than all-new-and-shiny bfloat16s or even float16, BUT on old potatoes geared up for float64 and float32, not every CPU has F16C or AVX-512 capabilities. That means EVERY SINGLE ONE OF THOSE bfloat16 calculations must be converted (time-consuming) into a floating-point format your system DOES recognize, THEN back again into bfloat16, which kills performance. Check your CPU capabilities and you'll likely find bfloat16 and even float16 aren't for you. If a CPU upgrade is not possible, then just use float32 and you'll find it works at least at visible human speeds. My elderly Xeon CPUs, for instance, don't even have AVX or AVX2, and FLUX.1-dev and FLUX.1-schnell work with float32 just fine, albeit slowly: about 5 minutes per image on my system, with a visibly moving text progress bar.
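To check those CPU capabilities on Linux, you can read the flags line out of /proc/cpuinfo yourself; a minimal sketch (I believe recent PyTorch builds also expose torch.backends.cpu.get_cpu_capability() if you'd rather ask torch, but check your version):

```python
# Minimal sketch: list the relevant SIMD/float features from /proc/cpuinfo
# (Linux only). If a flag is missing, that dtype is emulated in software
# and will be slow.
def cpu_flags(path="/proc/cpuinfo"):
    try:
        with open(path) as f:
            for line in f:
                if line.startswith("flags"):
                    return set(line.split(":", 1)[1].split())
    except OSError:
        pass
    return set()  # non-Linux or unreadable: no information

flags = cpu_flags()
for feature in ("avx", "avx2", "f16c", "avx512f", "avx512bf16"):
    print(f"{feature}: {'yes' if feature in flags else 'no (expect slow emulation)'}")
```

If f16c and avx512bf16 both come back "no", half-precision dtypes are being faked in software on your box, and float32 is the honest choice.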
DO NOT GO MAD WITH IMAGE SIZES AND INFERENCE STEPS: 1024x768 in 6 to 8 steps is often usable.
Be reasonable, don't ask for the Earth, and FLUX.1-dev and FLUX.1-schnell are more than usable on home systems.
It would appear the developers of FLUX.1 assumed that everyone has a brand new CPU in their tricked-out home supercomputer and a brand new $100,000 accelerator card plugged in as an afterthought, likely next to their Ferrari and their private yacht, which in 2024 is likely not the case.
So the code to get FLUX.1-schnell to run on JUST CPUs becomes:
import torch
from diffusers import FluxPipeline
DEVICE = "cpu"
print("Creating Pipeline...")
#
# FOR WINDOWS USERS
#
pipe = FluxPipeline.from_pretrained("C:\\Python\\Projects\\test\\flux1schnell", torch_dtype=torch.float32).to(torch.device(DEVICE))
#
# FOR LINUX USERS
#
#pipe = FluxPipeline.from_pretrained("black-forest-labs/FLUX.1-schnell", torch_dtype=torch.float32).to(torch.device(DEVICE))
prompt = f"beach ball with a sign saying {DEVICE}-ONLY"
print("Generating Images...")
image = pipe(
    prompt=prompt,
    height=512, width=512,
    guidance_scale=3.5, num_inference_steps=5, max_sequence_length=256,
    output_type="pil", num_images_per_prompt=1,
    generator=torch.Generator(DEVICE).manual_seed(0)
).images
print("Output Images...")
for i, img in enumerate(image):
    print(f"Saving image {i}...")
    img.save(f"{prompt}_on_{DEVICE}_{i}.png")
print("Done.")
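If the stock progress bar moves too slowly to be reassuring, recent diffusers versions accept a callback_on_step_end argument on the pipeline call (check that your diffusers version supports it); here is a sketch of a per-step timer that also estimates the time remaining, so you know the run is alive:

```python
import time

# Sketch of a per-step timing callback for diffusers pipelines that support
# `callback_on_step_end`. Prints seconds per step and a rough ETA, so a slow
# CPU run looks alive instead of frozen.
def make_timing_callback(total_steps):
    start = time.monotonic()
    def on_step_end(pipe, step, timestep, callback_kwargs):
        elapsed = time.monotonic() - start
        per_step = elapsed / (step + 1)
        remaining = per_step * (total_steps - step - 1)
        print(f"step {step + 1}/{total_steps}: "
              f"{per_step:.1f}s/step, ~{remaining:.0f}s left")
        return callback_kwargs  # must return the kwargs dict unchanged
    return on_step_end

# usage (hypothetical): pipe(prompt, num_inference_steps=5, ...,
#                            callback_on_step_end=make_timing_callback(5))
```

After the first step or two you can extrapolate the whole run — on my hardware that's how I know to go make coffee rather than kill the process.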
Luckily, the AI revolution is open to everyone — if you know the way to get it to work.