data-science - Provide.ai

data-science, deep-learning, llm, Machine Learning

The NeurIPS 2023 LLM Efficiency Challenge Starter Guide

Sebastian Raschka, PhD / August 10, 2023

Large language models (LLMs) offer one of the most interesting opportunities for developing more efficient training methods. A few weeks ago, the NeurIPS…

data-science, deep-learning, llm, Machine Learning

Optimizing Memory Usage for Training LLMs and Vision Transformers in PyTorch

Sebastian Raschka, PhD / July 1, 2023

Peak memory consumption is a common bottleneck when training deep learning models such as vision transformers and LLMs. This article provides a series of…

data-science, deep-learning, llm, Machine Learning

Finetuning Falcon LLMs More Efficiently With LoRA and Adapters

Sebastian Raschka, PhD / June 14, 2023

Finetuning allows us to adapt pretrained LLMs in a cost-efficient manner. But which method should we use? This article compares different…

data-science, deep-learning, llm, Machine Learning

Accelerating Large Language Models with Mixed-Precision Techniques

Sebastian Raschka, PhD / May 11, 2023

Training and using large language models (LLMs) is expensive due to their large compute requirements and memory footprints. This article will explore how…

data-science, deep-learning, llm, Machine Learning

Parameter-Efficient LLM Finetuning With Low-Rank Adaptation (LoRA)

Sebastian Raschka, PhD / April 26, 2023

Pretrained large language models are often referred to as foundation models for a good reason: they perform well on various tasks, and we can use them as a…

data-science, deep-learning, Machine Learning

Understanding Parameter-Efficient Finetuning of Large Language Models: From Prefix Tuning to LLaMA-Adapters

Sebastian Raschka, PhD / April 12, 2023

In the rapidly evolving field of artificial intelligence, using large language models efficiently and effectively has become increasingly…

data-science, deep-learning, Machine Learning

Finetuning Large Language Models On A Single GPU Using Gradient Accumulation

Sebastian Raschka, PhD / March 28, 2023

Previously, I shared an article on using multi-GPU training strategies to speed up the finetuning of large language models. Several of these strategies include…

data-science, deep-learning, Machine Learning

Keeping Up With AI Research And News

Sebastian Raschka, PhD / March 23, 2023

When it comes to productivity workflows, there are a lot of things I’d love to share. However, the one topic many people ask me about is how I keep up with…

data-science, deep-learning, Machine Learning

Some Techniques To Make Your PyTorch Models Train (Much) Faster

Sebastian Raschka, PhD / February 23, 2023

This blog post outlines techniques for improving the training performance of your PyTorch model without compromising its accuracy. To do so, we will wrap a…

data-science, deep-learning, Machine Learning

Understanding and Coding the Self-Attention Mechanism of Large Language Models From Scratch

Sebastian Raschka, PhD / February 9, 2023

In this article, we are going to understand how self-attention works from scratch. This means we will code it ourselves one step at a time. Since its…