Cost-Effective Hyperparameter Optimization for Large Language Model Generation Inference
Length:
33 minutes
Released:
Oct 27, 2023
Format:
Podcast episode
Description
Large Language Models (LLMs) have sparked significant interest in their generative capabilities, leading to the development of various commercial applications. The high cost of using these models drives application builders to maximize the value of generation under a limited inference budget. This paper presents a study of optimizing inference hyperparameters, such as the number of responses, temperature, and max tokens, which significantly affect the utility and cost of text generation. The authors design a framework named EcoOptiGen, which leverages economical hyperparameter optimization and cost-based pruning. Experiments with the GPT-3.5/GPT-4 models on a variety of tasks verify its effectiveness. EcoOptiGen is implemented in the 'autogen' package of the FLAML library: https://aka.ms/autogen.
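To make the idea concrete, here is a minimal sketch of budget-constrained search over inference hyperparameters with cost-based pruning. The price model, the `evaluate` utility function, and all parameter grids are illustrative stand-ins, not the paper's actual EcoOptiGen implementation or the FLAML API.

```python
# Hypothetical sketch: grid search over inference hyperparameters
# (n, temperature, max_tokens) under a dollar budget, pruning any
# configuration whose worst-case cost would exceed the remaining budget.

PRICE_PER_1K_TOKENS = 0.002  # assumed flat token price (illustrative)

def estimated_cost(config, prompt_tokens=200):
    # Worst case: each of the n responses uses the full max_tokens.
    total_tokens = prompt_tokens + config["n"] * config["max_tokens"]
    return total_tokens / 1000 * PRICE_PER_1K_TOKENS

def evaluate(config):
    # Placeholder utility: rewards more samples and a moderate temperature.
    return config["n"] * (1.0 - abs(config["temperature"] - 0.7))

def search(budget):
    grid = [
        {"n": n, "temperature": t, "max_tokens": m}
        for n in (1, 2, 4)
        for t in (0.0, 0.7, 1.0)
        for m in (128, 256)
    ]
    spent, best, best_score = 0.0, None, float("-inf")
    for config in grid:
        cost = estimated_cost(config)
        if spent + cost > budget:
            continue  # cost-based pruning: skip configs we cannot afford
        spent += cost
        score = evaluate(config)
        if score > best_score:
            best, best_score = config, score
    return best, best_score, spent
```

In a real setting, `evaluate` would call the model on validation examples and `spent` would track actual token usage; the pruning step keeps total spend within the inference budget.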
2023: Chi Wang, Susan Liu, A. Awadallah
https://arxiv.org/pdf/2303.04673.pdf