Warning: mkdir(): No space left on device in /var/www/hottg/post.php on line 59

Warning: file_put_contents(aCache/aDaily/2024-05-29/post/opendatascience/--): Failed to open stream: No such file or directory in /var/www/hottg/post.php on line 72
​​Giraffe: Adventures in Expanding Context Lengths in LLMs @Data Science by ODS.ai 🦜
TG Telegram Group & Channel
Data Science by ODS.ai 🦜 | United States America (US)
Create: Update:

​​Giraffe: Adventures in Expanding Context Lengths in LLMs

Modern Large Language Models (LLMs) have revolutionized our ability to process and understand vast amounts of textual data. Yet, these models, like LLaMA and LLaMA2, often come with a caveat: they're constrained by fixed context lengths, which means they're limited in handling longer sequences of input data at evaluation. This paper tackles that constraint by investigating a variety of methods for "context length extrapolation," which essentially enables these models to understand and work with longer text sequences. Among the techniques explored, the paper introduces an innovative "truncated basis" strategy for altering positional encodings within the attention mechanism, promising a more scalable future for LLMs.

The researchers put their theories to the test with three brand-new evaluation tasks—FreeFormQA, AlteredNumericQA, and LongChat-Lines—providing a more nuanced measure of model performance than the traditionally used metric of perplexity. Their findings? Linear scaling came out on top as the most effective way to extend the context length, but the truncated basis method showed potential for future exploration. To propel the research community even further, the paper releases three game-changing long-context models, named Giraffe, with context lengths ranging from 4k to an astonishing 32k.

Paper link: https://arxiv.org/abs/2308.10882
Code link: https://github.com/abacusai/Long-Context

A detailed unofficial overview of the paper:
https://andlukyane.com/blog/paper-review-giraffe

#deeplearning #cv #nlp #largelanguagemodel #opensource #largecontext

​​Giraffe: Adventures in Expanding Context Lengths in LLMs

Modern Large Language Models (LLMs) have revolutionized our ability to process and understand vast amounts of textual data. Yet, these models, like LLaMA and LLaMA2, often come with a caveat: they're constrained by fixed context lengths, which means they're limited in handling longer sequences of input data at evaluation. This paper tackles that constraint by investigating a variety of methods for "context length extrapolation," which essentially enables these models to understand and work with longer text sequences. Among the techniques explored, the paper introduces an innovative "truncated basis" strategy for altering positional encodings within the attention mechanism, promising a more scalable future for LLMs.

The researchers put their theories to the test with three brand-new evaluation tasks—FreeFormQA, AlteredNumericQA, and LongChat-Lines—providing a more nuanced measure of model performance than the traditionally used metric of perplexity. Their findings? Linear scaling came out on top as the most effective way to extend the context length, but the truncated basis method showed potential for future exploration. To propel the research community even further, the paper releases three game-changing long-context models, named Giraffe, with context lengths ranging from 4k to an astonishing 32k.

Paper link: https://arxiv.org/abs/2308.10882
Code link: https://github.com/abacusai/Long-Context

A detailed unofficial overview of the paper:
https://andlukyane.com/blog/paper-review-giraffe

#deeplearning #cv #nlp #largelanguagemodel #opensource #largecontext


>>Click here to continue<<

Data Science by ODS.ai 🦜






Share with your best friend
VIEW MORE

United States America Popular Telegram Group (US)


Fatal error: Uncaught TypeError: shuffle(): Argument #1 ($array) must be of type array, null given in /var/www/hottg/post.php:344 Stack trace: #0 /var/www/hottg/post.php(344): shuffle() #1 /var/www/hottg/route.php(63): include_once('...') #2 {main} thrown in /var/www/hottg/post.php on line 344