Advanced Machine Learning

11. LLM

18 min · 17 Nov 2024

Description

This lecture slideshow explores the world of Large Language Models (LLMs), detailing their architecture, training, and application. It begins by explaining foundational concepts like recurrent neural networks (RNNs) and Long Short-Term Memory (LSTM) before moving on to Transformers, the architecture behind modern LLMs. The presentation then discusses pre-training, fine-tuning, and various parameter-efficient techniques for adapting LLMs to downstream tasks. Finally, the slideshow addresses critical challenges facing LLMs, including safety concerns, bias, outdated knowledge, and evaluation difficulties.

All episodes

11 episodes

10. Time Series

Forecasting, the process of predicting future events, is a fundamental element of many disciplines, including economics, meteorology, and social sciences. This text provides an overview of time series analysis, a powerful technique for understanding and forecasting data that evolves over time. The document explores the components of a time series, including trend, cyclical, seasonal, and irregular components. It also outlines quantitative forecasting methods, such as moving averages, exponential smoothing, and autoregressive models, which utilize historical data to make predictions. Finally, the text delves into stationarity, a crucial property for time series data, and discusses the ARIMA model, which is widely used for forecasting non-stationary time series.
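
The quantitative methods named in this description (moving averages, exponential smoothing, and the differencing step that ARIMA uses to handle non-stationary series) can be illustrated with a minimal sketch. The function names and the toy series are assumptions made for illustration; they do not come from the episode.

```python
def moving_average_forecast(series, window):
    """Forecast the next value as the mean of the last `window` observations."""
    if window <= 0 or window > len(series):
        raise ValueError("window must be between 1 and len(series)")
    return sum(series[-window:]) / window


def exponential_smoothing_forecast(series, alpha):
    """Simple exponential smoothing: level = alpha * y + (1 - alpha) * level."""
    if not 0 < alpha <= 1:
        raise ValueError("alpha must be in (0, 1]")
    level = series[0]  # initialise the level with the first observation
    for y in series[1:]:
        level = alpha * y + (1 - alpha) * level
    return level  # the final level is the one-step-ahead forecast


def difference(series):
    """First differencing, the 'I' step in ARIMA, which removes a linear trend."""
    return [b - a for a, b in zip(series, series[1:])]


if __name__ == "__main__":
    toy_series = [10, 12, 14, 16]  # hypothetical data with a linear trend
    print(moving_average_forecast(toy_series, window=2))
    print(exponential_smoothing_forecast(toy_series, alpha=0.5))
    print(difference(toy_series))
```

On a trending series like this one, both smoothing forecasts lag behind the trend, while the differenced series is constant. This is one way to see why differencing helps before fitting an autoregressive model.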

17 Nov 2024 · 23 min

09. Seq to Seq

This source is a lecture on sequence-to-sequence learning (Seq2Seq), a technique for training models to transform sequences from one domain to another. The lecture explores various examples of Seq2Seq problems, including machine translation, image captioning, and speech recognition. It then delves into different types of Seq2Seq problems based on input and output sequence lengths and data types. The presentation continues by introducing various sequence models and their applications, and then focuses on data encoding techniques used for sequence data. Finally, the lecture presents a specific Seq2Seq problem – reversing a sequence – and explores different solutions using multi-layer perceptrons and recurrent neural networks (RNNs), including LSTM models. It concludes by acknowledging the scalability limitations of these approaches and proposing an encoder-decoder model as a potential solution.

Suggested questions

  • What are the main types of sequence-to-sequence problems, and how do they differ in terms of input and output sequence lengths and data types?
  • How do different RNN architectures (e.g., simple RNN, GRU, LSTM) address the challenges of processing sequential data, and what are their strengths and weaknesses in handling varying sequence lengths?
  • How does the encoder-decoder architecture overcome the limitations of traditional RNN models in handling long sequences, and how does it contribute to improved performance in sequence-to-sequence tasks?
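
The sequence-reversal task mentioned in this description can be set up as a small data-generation sketch. It uses one-hot vectors as the encoding, which is one common choice and not necessarily the one used in the lecture. The helper names `one_hot` and `make_reversal_example` are hypothetical.

```python
import random


def one_hot(index, vocab_size):
    """Encode an integer token as a one-hot vector of length `vocab_size`."""
    vec = [0] * vocab_size
    vec[index] = 1
    return vec


def make_reversal_example(length, vocab_size, rng):
    """Build one (input, target) pair where the target is the reversed input."""
    source = [rng.randrange(vocab_size) for _ in range(length)]
    target = source[::-1]
    x = [one_hot(token, vocab_size) for token in source]
    y = [one_hot(token, vocab_size) for token in target]
    return source, target, x, y


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the example is repeatable
    source, target, x, y = make_reversal_example(length=5, vocab_size=10, rng=rng)
    print("source:", source)
    print("target:", target)
```

Pairs like these can be fed to an MLP (flattened, fixed length) or to an RNN/LSTM (one vector per time step). The fixed-length constraint of the MLP input is one instance of the scalability limitation the description refers to.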

17 Nov 2024 · 29 min

07. - Generative Adversarial Networks (GANs)

The source is a series of lecture notes on Generative Adversarial Networks (GANs). It begins with an introduction to generative models, comparing and contrasting them with discriminative models, and then introduces the concept of adversarial training, explaining how GANs work. The notes then cover the theory and training procedures behind GANs, including maximum likelihood estimation, KL divergence, and the minimax game formulation. They explain why GANs are so powerful for generating realistic data and describe common training problems, such as mode collapse and non-convergence, along with their solutions. Finally, the notes discuss several GAN extensions, including conditional GANs, InfoGANs, CycleGANs, and LAPGANs, demonstrating their various applications in areas like image-to-image translation, text-to-image synthesis, and face aging.
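
For reference, the minimax game mentioned in this description is usually written as follows. This is the standard formulation from the original GAN paper, and the notation in the lecture notes may differ. Here $G$ is the generator, $D$ the discriminator, $p_{\text{data}}$ the data distribution, and $p_z$ the noise prior.

```latex
\min_{G} \max_{D} \; V(D, G)
  = \mathbb{E}_{x \sim p_{\text{data}}(x)} \big[ \log D(x) \big]
  + \mathbb{E}_{z \sim p_{z}(z)} \big[ \log \big( 1 - D(G(z)) \big) \big]
```

The discriminator maximizes $V$ by assigning high probability to real samples and low probability to generated ones, while the generator minimizes it by producing samples that the discriminator classifies as real.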

17 Nov 2024 · 32 min