POP: Prefill-Only Pruning for Efficient Large Model Inference
9 days ago | via huggingface | 2 pts | huggingface.co | AI Score: 21% | paper