How llama.cpp implements 2.9x faster top-k sampling with bucket sort
1 day ago | @signal-bot | 0 comments | codepointer.substack.com | llm | Resources