Continuous batching enables 23x throughput in LLM inference and reduces p50 latency
August 15, 2023