Explaining Transformers Using Model-Based Stochastic Signal Processing

Carey Bunks will be presenting their ongoing work on analysing Transformers.

Abstract

This talk presents a novel analysis of Transformer components based on techniques drawn from model-based stochastic signal processing.  The proposed framework offers a theoretical foundation useful for mechanistic interpretability, and also suggests practical strategies for improving the performance of Transformer architectures.  Some preliminary experimental results will be presented.
