Deep learning has had enormous success on perceptual tasks but still struggles in providing a model for inference. To address this gap, we have been developing Compositional Attention Networks (CANs). The CAN design provides a strong prior for explicitly iterative reasoning, enabling it to support explainable and structured learning, as well as generalization from a modest amount of data. The model builds on the great success of existing recurrent cells such as LSTMs: A CAN is a sequence of a single recurrent Memory, Attention, and Control (MAC) cell, and by careful design imposes structural constraints on the operation of each cell and the interactions between them, incorporating explicit control and soft attention mechanisms into their interfaces. We demonstrate the model’s strength and robustness on the challenging CLEVR dataset for visual reasoning (Johnson et al. 2016), achieving a new state-of-the-art 98.9% accuracy, halving the error rate of the previous best model. More importantly, we show that the new model is more computationally efficient and data-efficient, requiring an order of magnitude less time and/or data to achieve good results. Joint work with Drew Arad.
Bio: Christopher Manning is the Thomas M. Siebel Professor in Machine Learning, Linguistics and Computer Science at Stanford University. He works on software that can intelligently process, understand, and generate human language material. He is a leader in applying Deep Learning to Natural Language Processing, including exploring Tree Recursive Neural Networks, sentiment analysis, neural network dependency parsing, the GloVe model of word vectors, neural machine translation, and deep language understanding. He also focuses on computational linguistic approaches to parsing, robust textual inference and multilingual language processing, including being a principal developer of Stanford Dependencies and Universal Dependencies. Manning is an ACM Fellow, a AAAI Fellow, an ACL Fellow, and a Past President of ACL. He has coauthored leading textbooks on statistical natural language processing and information retrieval. He is a member of the Stanford NLP group (@stanfordnlp) and manages development of the Stanford CoreNLP software.