SkipDecode: Autoregressive Skip Decoding with Batching and Caching for Efficient LLM Inference

Category
Year/Month: 2023-07
Status
Publications
Code