llms
llama.cpp
- Models: https://huggingface.co/TheBloke/guanaco-65B-GGML/blob/main/guanaco-65B.ggmlv3.q4_K_M.bin
- Using GPT4All models (llama.cpp README): https://github.com/ggerganov/llama.cpp#using-gpt4all
- Sampling parameter presets (llama vs. alpaca comparison): https://www.reddit.com/r/LocalLLaMA/comments/12az7ah/comparing_llama_and_alpaca_presets/
- Fish commands for running
function llama
    cd ~/code/llama.cpp
    # -ins = interactive instruct mode, -n -1 = generate until done,
    # -t 11 = 11 threads, --mlock = keep the whole model pinned in RAM.
    ./main -m ./models/llama-13 \
        --color \
        --ctx_size 4096 \
        -n -1 \
        -ins -b 2048 \
        --top_k 0 \
        --temp 0.72 \
        --top_p 0.73 \
        --repeat_penalty 1.1 \
        -t 11 \
        --tfs 0.95 \
        --mlock
    cd -
end
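- Saved as ~/.config/fish/functions/llama.fish, the function autoloads, so `llama` works in any new shell.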
# Resumable model download: -C - continues a partial file, retrying every 5 s until curl exits cleanly.
while true
    curl -C - -L "https://huggingface.co/TheBloke/guanaco-65B-GGML/resolve/main/guanaco-65B.ggmlv3.q4_K_M.bin" -o models/guanaco-65b
    if test $status -eq 0
        break
    end
    sleep 5
end
- Tail free sampling: https://www.trentonbricken.com/Tail-Free-Sampling/
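- Rough NumPy sketch of the cutoff that post describes (my illustration, not llama.cpp's implementation; `z` plays the role of the `--tfs 0.95` flag above, and `tail_free_sample` is just a name I picked):
import numpy as np

def tail_free_sample(logits, z=0.95, rng=None):
    # Tail free sampling: drop the low-probability "tail" where the sorted
    # probability curve has stopped bending (second derivative ~ 0).
    if rng is None:
        rng = np.random.default_rng()

    probs = np.exp(logits - logits.max())
    probs /= probs.sum()

    order = np.argsort(probs)[::-1]          # token ids, most to least probable
    sorted_probs = probs[order]

    d2 = np.abs(np.diff(sorted_probs, n=2))  # |second derivative| of the sorted curve
    if d2.sum() == 0:                        # flat distribution: nothing to cut
        return int(rng.choice(order, p=sorted_probs))
    d2 /= d2.sum()

    cum = np.cumsum(d2)
    cutoff = int(np.argmax(cum > z)) if (cum > z).any() else len(order)
    keep = order[: max(cutoff, 1)]           # always keep at least the top token

    kept = probs[keep] / probs[keep].sum()
    return int(rng.choice(keep, p=kept))

# e.g. tail_free_sample(np.random.randn(32000), z=0.95)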