Stack More Layers Differently: High-Rank Training Through Low-Rank Updates

From Papers Read on AI

Length: 21 minutes

Released: Jul 19, 2023

Format: Podcast episode

Description

Despite the dominance and effectiveness of scaling, resulting in large networks with hundreds of billions of parameters, the necessity to train overparametrized models remains poorly understood, and alternative approaches do not necessarily make it cheaper to train high-performance models. In this paper, we explore low-rank training techniques as an alternative approach to training large neural networks. We introduce a novel method called ReLoRA, which utilizes low-rank updates to train high-rank networks. We apply ReLoRA to pre-training transformer language models with up to 350M parameters and demonstrate comparable performance to regular neural network training. Furthermore, we observe that the efficiency of ReLoRA increases with model size, making it a promising approach for training multi-billion-parameter networks efficiently. Our findings shed light on the potential of low-rank training techniques and their implications for scaling laws.

2023: Vladislav Lialin, Namrata Shivagunde, Sherin Muckatira, Anna Rumshisky

https://arxiv.org/pdf/2307.05695v2.pdf
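As a rough illustration of the idea in the title, the sketch below uses NumPy to check a linear-algebra fact: a product of a d×r matrix and an r×d matrix has rank at most r, yet a sum of several such products can reach a much higher rank. The random factors, the function name `merged_rank_demo`, and the merge loop are assumptions made for this sketch. They stand in for trained low-rank updates and are not the authors' ReLoRA implementation, which this page does not detail.

```python
import numpy as np


def merged_rank_demo(d=64, r=4, restarts=8, seed=0):
    """Accumulate several rank-r updates and track the rank of their sum.

    Hypothetical helper for illustration only: random factors replace
    the low-rank updates that a real training run would learn.
    """
    rng = np.random.default_rng(seed)
    total = np.zeros((d, d))  # accumulated change to a d x d weight matrix
    ranks = []
    for _ in range(restarts):
        B = rng.standard_normal((d, r))  # d x r factor
        A = rng.standard_normal((r, d))  # r x d factor
        total += B @ A  # each individual update has rank at most r
        ranks.append(int(np.linalg.matrix_rank(total)))
    return ranks


ranks = merged_rank_demo()
print(ranks)  # rank of the accumulated matrix after each merge
```

With independent random factors, the rank of the accumulated matrix typically grows by r with each merge until it is capped at d, which is the sense in which many low-rank updates can add up to a high-rank change.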

Titles in the series (100)

Keeping you up to date with the latest trends and best-performing architectures in this fast-evolving field of computer science. Selecting papers by comparative results, citations, and influence, we educate you on the latest research. Consider supporting us on Patreon.com/PapersRead for feedback and ideas.
