14 February 2023
Dear curious minds & travelers, let’s meet, discuss, and learn about language models and power laws. If you find yourself in Kraków, Poland, in these post-pandemic times, please consider yourself invited to the meetup.
“Large language models based on transformers and trained on nearly internet-sized corpora of text, such as OpenAI's ChatGPT, are revolutionizing natural language processing and reviving the dream of artificial general intelligence. They have made huge progress within a few years, leaving us largely intellectually unprepared for their arrival. In my talk, I will attack the topic of language models from a mathematician's perspective. I will speak about empirical power laws of learning in these models, and I will present a simplistic model of language and learning that exhibits such laws” - Łukasz Dębowski, Instytut Podstaw Informatyki PAN (Institute of Computer Science, Polish Academy of Sciences).
Łukasz Dębowski, PhD (home.ipipan.waw.pl/l.debowski), focuses on information theory, complex systems, and discrete stochastic processes, extending this work to statistical and neural language models. His latest book, "Information Theory Meets Power Laws: Stochastic Processes and Language Models" (Wiley), received the prize of the Committee on Informatics of the Polish Academy of Sciences.
For a warm and entertaining introduction to power laws, try Michael Stevens’ (Vsauce) video “The Zipf Mystery”.
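Zipf’s law itself is easy to probe before the talk: in a long text, the r-th most frequent word appears roughly c/r times, so rank times count should hover around a constant. Here is a minimal Python sketch of that check; the filename `corpus.txt` is a placeholder for any long English text, e.g. a novel from Project Gutenberg.

```python
import re
from collections import Counter

# Zipf's law: the r-th most frequent word occurs with frequency roughly
# proportional to 1/r, so rank * count should stay roughly constant.

def zipf_check(text: str, top: int = 20) -> None:
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(words)
    for rank, (word, count) in enumerate(counts.most_common(top), start=1):
        print(f"{rank:>4}  {word:<12} {count:>7}  rank*count = {rank * count}")

if __name__ == "__main__":
    # "corpus.txt" is a placeholder: substitute any long plain-text corpus.
    with open("corpus.txt", encoding="utf-8") as f:
        zipf_check(f.read())
```

For mid-range ranks the rank*count column stays surprisingly flat, which is exactly the power-law behavior the talk takes as its starting point.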
- Łukasz Dębowski’s slides: language_models_and_power_laws.pdf
- Accompanying paper: "A Simplistic Model of Neural Scaling Laws: Multiperiodic Santa Fe Processes", https://arxiv.org/pdf/2302.09049.pdf
- Introductory slides with some quotations on language models
- Michael Stevens (Vsauce): The Zipf Mystery