Everyone is so thrilled with llama.cpp, but I want to do GPU-accelerated text generation and interactive writing. What’s the state of the art here? Will KoboldAI now download LLaMA for me?