llama.cpp is a project on GitHub that implements inference of a LLaMA model in pure C/C++. The performance is pretty amazing given the limited hardware it can run on (even a Raspberry Pi, if you have patience), and the author gives an explanation of how that’s even possible (hint: memory bandwidth).
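To see why memory bandwidth (rather than raw compute) sets the ceiling, a rough back-of-envelope estimate helps: generating each token has to stream essentially the whole set of weights through the CPU, so tokens per second is roughly bounded by memory bandwidth divided by model size. The sketch below is purely illustrative; the model size and bandwidth figures are my own assumptions, not numbers from the project.

```c
/* Back-of-envelope sketch: tokens/sec ~ memory bandwidth / model size.
 * All constants below are illustrative assumptions, not measurements. */
#include <stdio.h>

int main(void) {
    /* Assumed model size: a 7B-parameter model quantized to ~4 bits
     * per weight comes out on the order of 4 GB. */
    double model_bytes = 4.0e9;

    /* Assumed memory bandwidths (bytes/sec) for two example machines. */
    double pi_bandwidth  = 4.0e9;   /* roughly Raspberry Pi 4 class */
    double mac_bandwidth = 200.0e9; /* roughly Apple M-series laptop class */

    printf("Pi-class estimate:  ~%.1f tokens/sec\n", pi_bandwidth / model_bytes);
    printf("Mac-class estimate: ~%.0f tokens/sec\n", mac_bandwidth / model_bytes);
    return 0;
}
```

Under those assumptions a Pi-class board lands around a token per second, which is why running it there takes patience, while a machine with much faster memory gets proportionally faster generation.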