Neural Network Compression - The Functional Perspective (+ Extensions)
Israel Mason-Williams will present and discuss the paper Neural Network Compression: The Functional Perspective (+ Extensions) (Mason-Williams, 2024; Mason-Williams et al., 2024).
Abstract
Compression techniques, such as Knowledge distillation, Pruning, and Quantization reduce the computational costs of model inference and enable on-edge machine learning. The efficacy of compression methods is often evaluated through the proxy of accuracy and loss to understand similarity of the compressed model. This study aims to explore the functional divergence between compressed and uncompressed models. The results indicate that Quantization and Pruning create models that are functionally similar to the original model. In contrast, Knowledge distillation creates models that do not functionally approximate their teacher models. The compressed model resembles the dissimilarity of function observed in independently trained models. Therefore, it is verified, via a functional understanding, that Knowledge distillation is not a compression method. Thus, leading to the definition of Knowledge distillation as a training regulariser given that no knowledge is distilled from a teacher to a student.
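To make "functional divergence" concrete, the sketch below compares two models by their outputs on the same inputs instead of by accuracy alone. The abstract does not say which metrics the paper uses, so the two chosen here (top-1 prediction disagreement and mean Jensen-Shannon divergence between softmax outputs) are assumptions for illustration, and the function names are hypothetical.

```python
import math


def softmax(logits):
    """Convert a list of logits into a probability distribution."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q):
    """KL(p || q); terms with p_i == 0 contribute nothing."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def js_divergence(p, q):
    """Jensen-Shannon divergence: symmetric, bounded above by ln(2)."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)


def functional_comparison(logits_a, logits_b):
    """Compare two models given their logits on the same inputs.

    Assumed metrics (not taken from the paper):
      - disagreement: fraction of inputs where the top-1 classes differ
      - mean_js: average JS divergence between the softmax outputs
    """
    n = len(logits_a)
    disagreements = 0
    js_total = 0.0
    for la, lb in zip(logits_a, logits_b):
        pa, pb = softmax(la), softmax(lb)
        if pa.index(max(pa)) != pb.index(max(pb)):
            disagreements += 1
        js_total += js_divergence(pa, pb)
    return {"disagreement": disagreements / n, "mean_js": js_total / n}


# Toy logits standing in for an original model and a compressed one.
original = [[2.0, 0.0], [0.0, 2.0]]
compressed = [[2.0, 0.0], [2.0, 0.0]]
result = functional_comparison(original, compressed)
print(result["disagreement"], result["mean_js"])
```

Under these assumed metrics, two models can have the same accuracy and still disagree on many individual inputs, which is the kind of gap that accuracy and loss alone would hide.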
References
- Neural Network Compression: The Functional Perspective. In 5th Workshop on Practical ML for Limited/Low Resource Settings, 2024.
- Knowledge Distillation: The Functional Perspective. In NeurIPS 2024 Workshop on Scientific Methods for Understanding Deep Learning, 2024.