Papers tagged “efficiency”
4 papers
QLoRA: Efficient Finetuning of Quantized LLMs
2023
NeurIPS
FlashAttention: Fast and Memory-Efficient Exact Attention with IO-Awareness
2022
NeurIPS
LoRA: Low-Rank Adaptation of Large Language Models
2021
ICLR
EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks
2019
ICML