Vicuna-33B-1-3-SuperHOT-8K-GPTQ

notfromhere@lemmy.one · 1 year ago

I read the author’s blog post on SuperHOT, and it sounded like it didn’t increase perplexity and kept it very low even with large contexts. I could have read it wrong, but I thought it wasn’t supposed to increase perplexity at all.

simple@lemmy.mywire.xyz · 1 year ago

The increase in perplexity is very small, but there is still some with 8K context. With 2K context, though, it seems to be much larger. I could be misunderstanding something myself, but my little test with 2K context does suggest there’s something going on with 2K contexts on SuperHOT models.
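
For anyone who wants to repeat this kind of comparison, here is a minimal sketch of how perplexity is computed from per-token log-probabilities. It is not the test I ran. The `perplexity` helper and the sample numbers are made up for illustration, and real log-probabilities would have to come from evaluating the model on the same text at each context length.

```python
import math

def perplexity(token_logprobs):
    """Perplexity = exp(-mean natural-log probability per token)."""
    if not token_logprobs:
        raise ValueError("need at least one token log-probability")
    return math.exp(-sum(token_logprobs) / len(token_logprobs))

# Hypothetical log-probabilities, NOT measured on any real model.
logprobs_run_a = [math.log(0.25)] * 4   # every token predicted with p = 0.25
logprobs_run_b = [math.log(0.5)] * 4    # every token predicted with p = 0.5

print(perplexity(logprobs_run_a))
print(perplexity(logprobs_run_b))
```

Lower is better: a model that assigns higher probability to each actual token ends up with a lower perplexity, so comparing the same text at 2K and 8K context would show whether the shorter context really is worse.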

TheBloke/Vicuna-33B-1-3-SuperHOT-8K-GPTQ · Hugging Face