Any of you have a self-hosted AI "hub"? (e.g. for LLM, stable-diffusion, ...)

robber · 7 months ago

JackGreenEarth@lemm.ee · 7 months ago

Is that 128GB of VRAM? Because normal RAM doesn’t matter unless you want to run the model on the CPU, which is much slower.

Greg Clarke@lemmy.ca · 7 months ago

That’s 128GB of RAM; the GPU has 24GB of VRAM. Ollama has gotten pretty smart about resource allocation. Smaller models fit entirely in VRAM, and I can still run larger models by using system RAM as well.
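
To make the VRAM/RAM split concrete, here is a rough back-of-the-envelope sketch in Python. It is not Ollama's actual allocation logic. It assumes all layers are the same size and that a fixed amount of VRAM is reserved for context and overhead. The function name `split_layers` and the `reserve_gb` default are made up for illustration.

```python
def split_layers(model_gb, n_layers, vram_gb, reserve_gb=1.5):
    """Estimate how many layers fit on the GPU; the rest run from system RAM.

    Assumptions (illustrative only): every layer is the same size, and
    reserve_gb of VRAM is kept free for the context cache and overhead.
    """
    if model_gb <= 0 or n_layers <= 0:
        raise ValueError("model_gb and n_layers must be positive")
    per_layer_gb = model_gb / n_layers
    usable_gb = max(vram_gb - reserve_gb, 0.0)
    gpu_layers = min(n_layers, int(usable_gb // per_layer_gb))
    return gpu_layers, n_layers - gpu_layers


if __name__ == "__main__":
    # Hypothetical sizes: a 4 GB model with 32 layers and a 40 GB model with 80 layers.
    for model_gb, n_layers in [(4.0, 32), (40.0, 80)]:
        gpu, cpu = split_layers(model_gb, n_layers, vram_gb=24.0)
        print(f"{model_gb} GB model: {gpu} layers on GPU, {cpu} layers on CPU/RAM")
```

With these made-up numbers, the small model lands entirely on a 24GB card, while the large one ends up split. The layers left in RAM run on the CPU, which is why the bigger models are noticeably slower.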

JackGreenEarth@lemm.ee · 7 months ago

Any tips on how to get Stable Diffusion to do that? I’m running it through Krita’s AI Image Generation plugin with 6GB of VRAM and 16GB of RAM. VRAM is the bottleneck when I want to inpaint larger images, and I keep getting ‘out of VRAM’ errors. How do I make it switch to RAM when VRAM is full? And for Jan, for that matter, how can I get it to use RAM and VRAM together so I can run models larger than 7B?