I’m currently shopping around for something a bit faster than Ollama, mainly because I could not get it to use a different context and output length, which seems to be a known and long-ignored issue. Everything I’ve tried so far has been missing one or more critical features, such as:

  • “Hot” model replacement, i.e. loading and unloading models on demand
  • Function calling
  • Support for most models
  • OpenAI API compatibility (to work well with Open WebUI)

I’d be happy about any recommendations!
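
For reference, this is roughly the kind of request where I can’t get the context and output settings to stick. It’s only a minimal sketch: the model name is a placeholder, and num_ctx / num_predict are the Ollama options for context window and output length.

    import requests

    # Ask Ollama for a larger context window and a capped output length
    # via the "options" field of /api/generate.
    resp = requests.post(
        "http://localhost:11434/api/generate",
        json={
            "model": "llama3",        # placeholder model name
            "prompt": "Hello",
            "stream": False,
            "options": {
                "num_ctx": 8192,      # requested context window (tokens)
                "num_predict": 1024,  # max tokens to generate
            },
        },
    )
    print(resp.json()["response"])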

  • Arehandoro · 3 hours ago

    I don’t think it’s OpenAI compatible, but DeepSeek is faster.

    • hendrik@palaver.p3x.de · 9 minutes ago

      Btw, Ollama is software for running AI models. DeepSeek is a company, or a model file, or a service. But that’s not what OP is looking for. They want to run a model, and that needs software like Ollama.