Explaining Transformers Using Model-Based Stochastic Signal Processing

Created on February 12, 2025

2025 · ongoing-work · talks

Carey Bunks will present their ongoing work on analysing Transformers.

Abstract

This talk presents a novel analysis of Transformer components based on techniques drawn from model-based stochastic signal processing. The proposed framework offers a theoretical foundation useful for mechanistic interpretability, and also suggests practical strategies for improving the performance of Transformer architectures. Some preliminary experimental results will be presented.
