Interpreting Deep Learning Models (PhD Thesis Proposal)

  • Dalhousie University, Mona Campbell Building, Room 2110


Model interpretability is a requirement in many applications in which users make crucial decisions based on a model's outputs. The recent movement for "algorithmic fairness" also stipulates explainability, and therefore interpretability, of learning models. The most notable example is the "right to explanation" enforced in the widely discussed provision of the European Union General Data Protection Regulation (GDPR). And yet the most successful contemporary Machine Learning approaches, Deep Neural Networks, produce models that are highly non-interpretable. Deep Neural Networks have achieved huge success across a wide spectrum of applications, from language modeling and computer vision to speech recognition. However, good performance alone is no longer sufficient for practical deployment, where interpretability is demanded in cases involving ethics and mission-critical applications. The complexity of Deep Neural Network models makes it hard to understand and reason about their predictions, which hinders their further progress.

In this thesis proposal, we attempt to address this challenge by presenting two methodologies that demonstrate superior interpretability results on experimental data.

The first methodology, named CNN-INTE, interprets deep Convolutional Neural Networks (CNNs) via meta-learning. In this work, we interpret a specific hidden layer of a deep CNN model trained on the MNIST image dataset. We use a clustering algorithm in a two-level structure to find the meta-level training data, and Random Forest as the base learning algorithm to generate the meta-level test data. The interpretation results are displayed visually via diagrams that clearly indicate how a specific test instance is classified. Our method achieves global interpretability for all test instances on the hidden layers without sacrificing the accuracy obtained by the original deep CNN model. This means our model is faithful to the original deep CNN model, which leads to reliable interpretations.
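The abstract does not spell out the CNN-INTE procedure, so the following is only a loose illustration of one ingredient it mentions: clustering hidden-layer activations and using the cluster id as a meta-level feature. It is not the authors' implementation. The activation values, the labels, the plain k-means routine, and the `explain` helper are all assumptions made for this sketch, and the two-level structure and Random Forest step are omitted.

```python
def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def nearest(centroids, point):
    """Index of the centroid closest to `point`."""
    return min(range(len(centroids)),
               key=lambda c: squared_distance(centroids[c], point))

def kmeans(points, centroids, iterations=10):
    """Plain Lloyd's k-means, started from the given centroids."""
    for _ in range(iterations):
        groups = [[] for _ in centroids]
        for p in points:
            groups[nearest(centroids, p)].append(p)
        # Move each centroid to the mean of its group (keep it if the group is empty).
        centroids = [
            [sum(col) / len(g) for col in zip(*g)] if g else c
            for g, c in zip(groups, centroids)
        ]
    return centroids

# Hypothetical activations of a 2-unit hidden layer, with the true class of each instance.
activations = [[0.9, 0.1], [0.8, 0.2], [0.1, 0.9], [0.2, 0.8]]
labels = [3, 3, 7, 7]

centroids = kmeans(activations, centroids=[activations[0], activations[2]])

# Meta-level feature (assumed): the cluster id of each instance's activation.
cluster_ids = [nearest(centroids, a) for a in activations]

# Class counts per cluster, used to describe where an instance lands.
profile = {}
for cid, label in zip(cluster_ids, labels):
    counts = profile.setdefault(cid, {})
    counts[label] = counts.get(label, 0) + 1

def explain(activation):
    """Return (cluster id, most common class in that cluster) for one activation."""
    cid = nearest(centroids, activation)
    counts = profile[cid]
    return cid, max(counts, key=counts.get)

print(cluster_ids)
print(explain([0.7, 0.3]))
```

Reading off the cluster that a test instance's activation falls into, together with the classes that dominate that cluster, gives a simple account of how the hidden layer groups inputs. The diagrams described in the abstract presumably convey richer information than this.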

In the second methodology, we apply the Knowledge Distillation technique to distill Deep Neural Networks into decision trees in order to attain good performance and interpretability simultaneously. We formulate the problem as a multi-output regression problem, and the experiments demonstrate that the student model achieves significantly better accuracy (by about 1% to 5%) than vanilla decision trees at the same tree depth. The experiments are implemented on the TensorFlow platform so that the approach scales to big datasets. To the best of our knowledge, we are the first to distill Deep Neural Networks into vanilla decision trees on multi-class datasets.
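To make the multi-output regression formulation concrete, here is a minimal pure-Python sketch, not the TensorFlow implementation described above. It assumes the teacher's logits are softened with a temperature softmax and that the student is a small regression tree fitted to the resulting probability vectors by minimizing squared error summed over all outputs. The toy inputs, the logits, the temperature value, and every function name are hypothetical.

```python
import math

def softmax(logits, temperature=1.0):
    """Turn teacher logits into probabilities; a higher temperature softens them."""
    scaled = [z / temperature for z in logits]
    top = max(scaled)
    exps = [math.exp(z - top) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def mean_vector(targets):
    return [sum(col) / len(targets) for col in zip(*targets)]

def sse(targets):
    """Squared error around the mean vector, summed over all outputs."""
    if not targets:
        return 0.0
    mu = mean_vector(targets)
    return sum((t[k] - mu[k]) ** 2 for t in targets for k in range(len(mu)))

def fit_tree(X, Y, depth):
    """Greedy multi-output regression tree with at most `depth` split levels."""
    if depth == 0 or len(X) < 2:
        return {"leaf": mean_vector(Y)}
    best = None
    for j in range(len(X[0])):
        for thr in sorted({x[j] for x in X})[:-1]:
            left = [i for i, x in enumerate(X) if x[j] <= thr]
            right = [i for i, x in enumerate(X) if x[j] > thr]
            cost = sse([Y[i] for i in left]) + sse([Y[i] for i in right])
            if best is None or cost < best[0]:
                best = (cost, j, thr, left, right)
    if best is None or best[0] >= sse(Y):  # no split reduces the error
        return {"leaf": mean_vector(Y)}
    _, j, thr, left, right = best
    return {
        "feature": j,
        "threshold": thr,
        "left": fit_tree([X[i] for i in left], [Y[i] for i in left], depth - 1),
        "right": fit_tree([X[i] for i in right], [Y[i] for i in right], depth - 1),
    }

def predict(tree, x):
    """Follow the splits to a leaf and return the class with the largest output."""
    while "leaf" not in tree:
        branch = "left" if x[tree["feature"]] <= tree["threshold"] else "right"
        tree = tree[branch]
    leaf = tree["leaf"]
    return leaf.index(max(leaf))

# Hypothetical one-feature inputs and teacher logits for three classes.
X = [[0.0], [0.1], [1.0], [1.1], [2.0], [2.1]]
teacher_logits = [[4, 0, 0], [4, 0, 0], [0, 4, 0], [0, 4, 0], [0, 0, 4], [0, 0, 4]]

soft_targets = [softmax(z, temperature=2.0) for z in teacher_logits]
tree = fit_tree(X, soft_targets, depth=2)
student_preds = [predict(tree, x) for x in X]
print(student_preds)
```

Because each leaf stores a full probability vector rather than a single label, the student is trained on how the teacher distributes probability across all classes and not only on its top choice. That extra signal is the usual motivation for distilling on soft targets.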

Finally, we propose a visualization technique as future work.

Examining Committee:

Dr. Stan Matwin - Faculty of Computer Science (Supervisor)
Dr. Thomas Trappenberg - Faculty of Computer Science (Reader)
Dr. Sageev Oore - Faculty of Computer Science (Reader)
Dr. Fernando Paulovich - Faculty of Computer Science (External Examiner)