Post-transformer inference: 224× compression of Llama-70B with improved accuracy
December 10, 2025 by kamal