Representational Capacity of Neural Language Models

Tutorial Description

Language models are currently at the forefront of NLP research due to their remarkable versatility across diverse tasks. However, a large gap exists between their observed capabilities and the explanations proposed by established formal machinery. To motivate a better theoretical characterization of LMs' abilities and limitations, this tutorial provides a comprehensive introduction to a specific framework for the formal analysis of modern LMs using tools from formal language theory (FLT). We show how tools from FLT can help in understanding the inner workings and predicting the capabilities of modern neural LM architectures. We will cover recent results that use FLT to make precise and practically relevant statements about LMs based on recurrent neural networks and transformers by relating them to formal devices such as finite-state automata, Turing machines, and analog circuits. Altogether, the results covered in this tutorial will allow us to precisely state and explain both the observed and the predicted behaviors of LMs, and to offer theoretically motivated suggestions about which aspects of the architectures could be improved.

Slides

Slides are available here, and will be continually updated.

Syllabus

Module 1: Why Study Language Models with Formal Languages?
Module 2: Background
Module 3: Analyzing RNNs with Formal Language Theory
Module 4: Analyzing Transformers with Formal Language Theory
Module 5: Looking Ahead

Tutorial Organizers

Alexandra Butoi

PhD Student

ETH Zürich

Robin Chan

PhD Student

ETH Zürich

Ryan Cotterell

Assistant Professor

ETH Zürich

Will Merrill

PhD Student

NYU

Franz Nowak

PhD Student

ETH Zürich

Clemente Pasti

PhD Student

ETH Zürich

Lena Strobel

PhD Student

Umeå University

Anej Svete

PhD Student

ETH Zürich