LLaMa makes it possible to run Large Language Models (of comparable quality to GPT, the Generative Pre-trained Transformer) on commodity hardware, i.e. on a single GPU. Even more impressively, I believe pared-down versions such as LLaMa 7B have been run on a Raspberry Pi and on a smartphone.
The Hackaday article is here, and more informative than anything I’d write.
I presume we’re all proponents of having some modest degree of autonomy when it comes to domestic software & hardware, and although it was developed by Meta/Facebook, this is a model trained on publicly available text, for which both the code and the model weights can be obtained (see the Hackaday article for details).
As with all processing of inputs, there are significant security advantages to doing as much of it locally as practicable.
As with anything a lay reader might think of as AI, I’d like to append a note to explain why I’d describe this as machine learning rather than AI (and, at that, the word ‘learning’ is something of a lazy anthropomorphisation. I know, I know, I probably sound like my grandmother telling me ‘car’ is a vulgar contraction of ‘motor-car’…). The following text is from this web page:
How LLMs Work:
LLMs like GPT-3 are deep neural networks—that is, neural networks with many layers of “neurons” connected by billions of weighted links. Given an input text “prompt”, at essence what these systems do is compute a probability distribution over a “vocabulary”—the list of all words (or actually parts of words, or tokens) that the system knows about. The vocabulary is given to the system by the human designers. GPT-3, for example, has a vocabulary of about 50,000 tokens.
For simplicity, let’s forget about “tokens” and assume that the vocabulary consists of exactly 50,000 English words. Then, given a prompt, such as “To be or not to be, that is the”, the system encodes the words of the prompt as real-valued vectors, and then does a layer-by-layer series of computations, whose penultimate result is 50,000 real numbers, one for each vocabulary word. These numbers are (for obscure reasons) called “logits”. The system then turns these numbers into a probability distribution with 50,000 probabilities—each represents the probability that the corresponding word is the next one to come in the text. For the prompt “To be or not to be, that is the”, presumably the word “question” would have a high probability. That is because LLMs have learned to compute these probabilities by being shown massive amounts of human-generated text. Once the LLM has generated the next word—say, “question”—it then adds that word to its initial prompt, and recomputes all the probabilities over the vocabulary. At this point, the word “Whether” would have very high probability, assuming that Hamlet, along with all quotes and references to that speech, was part of the LLM’s training data.
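To make the quoted description concrete, here is a minimal sketch of the first step: encoding a prompt and computing one logit per vocabulary token. It assumes the Hugging Face transformers library and the freely downloadable GPT-2 model (whose BPE vocabulary of 50,257 tokens is the same style as GPT-3’s); this is an illustrative sketch, not the quoted author’s code.

```python
# Sketch: one forward pass yields a vector of logits, one per vocabulary
# token. Assumes `pip install torch transformers` and downloads GPT-2.
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
model.eval()

prompt = "To be or not to be, that is the"
inputs = tokenizer(prompt, return_tensors="pt")  # text -> token IDs

with torch.no_grad():
    out = model(**inputs)

next_token_logits = out.logits[0, -1]  # scores for the *next* token
print(tokenizer.vocab_size)            # 50257: GPT-2's vocabulary size
print(next_token_logits.shape)         # torch.Size([50257])
```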
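The generation loop the quoted paragraph describes (softmax the logits into probabilities, pick a word, append it, recompute) can be sketched in the same hedged fashion. I’ve used greedy argmax here for simplicity; real systems usually sample from the distribution instead.

```python
# Sketch of greedy autoregressive decoding: turn logits into a
# probability distribution, take the most probable token, append it
# to the prompt, and repeat. Same assumptions as the sketch above.
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
model.eval()

ids = tokenizer("To be or not to be, that is the", return_tensors="pt").input_ids

for _ in range(5):                             # extend the text by five tokens
    with torch.no_grad():
        logits = model(ids).logits[0, -1]      # logits for the next position
    probs = torch.softmax(logits, dim=-1)      # logits -> probability distribution
    next_id = probs.argmax()                   # greedy: most probable token
    ids = torch.cat([ids, next_id.view(1, 1)], dim=1)

print(tokenizer.decode(ids[0]))
```

If things go as the quoted example suggests, the first generated token should be “question”; sampling from the distribution rather than taking the argmax is what gives the less deterministic output people associate with these systems.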