Our first test involves running the LLaMA payload discussed in LXF304, which creates a ChatGPT-like bot that runs locally. Due to the extremely large size of the model, which has to be obtained via BitTorrent, this task cannot be completed on the Raspberry Pi alone – a workstation is required to handle some of the more computationally intensive preparation tasks.
Furthermore, be aware that the quantisation steps must be repeated even if you have performed them in the past. The LLaMA framework receives frequent updates, and trying to use a recent version with an ancient quantisation leads to errors.
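As a rough sketch of the workstation-side preparation – assuming the llama.cpp tooling and using placeholder model paths, not necessarily the exact files or script names in your checkout – the conversion and quantisation steps take approximately this shape:

```shell
# Convert the downloaded LLaMA weights into llama.cpp's own format.
# Script name and model directory are illustrative; check your checkout.
python3 convert.py models/7B/

# Quantise the converted model down to 4 bits so it fits into the
# Raspberry Pi's limited RAM. Run on the workstation, not the Pi.
./quantize models/7B/ggml-model-f16.bin \
           models/7B/ggml-model-q4_0.bin q4_0
```

Only the small quantised file needs to be copied over to the Raspberry Pi afterwards. Because the on-disk format changes between llama.cpp versions, these steps are exactly what has to be redone after every framework upgrade.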
Excursus: swap it
Since advanced operating systems became available on all kinds of workstations, expanding the working memory