Lifelong Machine Learning: Second Edition
The purpose of writing this second edition is to extend the definition of lifelong learning, to update the content of several chapters, and to add a new chapter about continual learning in deep neural networks, which has been actively researched for the past two to three years. A few chapters are also reorganized to make each of them more coherent.
The project of writing this book started with a tutorial on lifelong machine learning that we gave at the 24th International Joint Conference on Artificial Intelligence (IJCAI) in 2015. At that time, we had worked on the topic for a while and had published several papers at ICML, KDD, and ACL. When Morgan & Claypool Publishers contacted us about the possibility of developing a book on the topic, we were excited. We strongly believe that lifelong machine learning (or simply lifelong learning) is very important for the future of machine learning and artificial intelligence (AI). Note that lifelong learning is sometimes also called continual learning or continuous learning in the literature.
Our original research interest in the topic stemmed from extensive application experience in sentiment analysis (SA) at a start-up company several years ago. A typical SA project starts with a client who is interested in consumer opinions expressed in social media about their products or services and those of their competitors. An SA system needs to perform two main analysis tasks: (1) discover the entities (e.g., iPhone) and entity attributes/features (e.g., battery life) that people talk about in opinion documents such as online reviews and (2) determine whether the opinion about each entity or entity attribute is positive, negative, or neutral [Liu, 2012, 2015]. For example, from the sentence “iPhone is really cool, but its battery life sucks,” an SA system should discover that the author is (1) positive about iPhone and (2) negative about iPhone’s battery life.
After working on many projects in many domains (that is, types of products or services) for clients, we realized that a great deal of information is shared across domains and projects. The more domains we saw, the fewer new things we encountered. It is easy to see that sentiment words and expressions (such as good, bad, poor, terrible, and cost an arm and a leg) are shared across domains. Entities and attributes are also widely shared. For example, every product has the attribute of price, most electronic products have a battery, and many of them also have a screen. It would be silly not to exploit such sharing to make SA significantly more accurate than it can be when each project and its data are handled in isolation. Yet the classic machine learning paradigm learns in exactly this isolated way: given a dataset, a learning algorithm runs on the data to produce a model. The algorithm has no memory and thus cannot use previously learned knowledge. To exploit knowledge sharing, an SA system has to retain and accumulate the knowledge learned in the past and use it to help future learning and problem solving, which is exactly what lifelong learning aims to do.
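The contrast between isolated learning and learning with retained knowledge can be illustrated with a small sketch. It is a toy under stated assumptions, not a method from the book: the word-counting "learner", the names `learn_lexicon`, `classify`, and `KnowledgeBase`, and the two tiny review sets are all hypothetical and chosen only to show how knowledge from one domain can help in another.

```python
from collections import Counter


def learn_lexicon(labeled_reviews):
    """Isolated learning: count word/label co-occurrences in ONE dataset."""
    lexicon = {}
    for text, label in labeled_reviews:
        for word in text.lower().split():
            lexicon.setdefault(word, Counter())[label] += 1
    return lexicon


def classify(text, lexicon):
    """Sum (positive - negative) counts over the known words of the text."""
    score = 0
    for word in text.lower().split():
        counts = lexicon.get(word)
        if counts is not None:
            score += counts["pos"] - counts["neg"]
    if score > 0:
        return "pos"
    if score < 0:
        return "neg"
    return "neutral"


class KnowledgeBase:
    """Hypothetical store that retains and accumulates past knowledge."""

    def __init__(self):
        self.lexicon = {}

    def learn_task(self, labeled_reviews):
        # Learn from the new domain, then merge into what is already known.
        for word, counts in learn_lexicon(labeled_reviews).items():
            self.lexicon.setdefault(word, Counter()).update(counts)
        return self.lexicon


# Made-up data for two domains (assumption, for illustration only).
phone_reviews = [("great screen", "pos"), ("terrible battery", "neg")]
camera_reviews = [("good lens", "pos"), ("poor lens", "neg")]

# Isolated: a model trained only on camera data has never seen "terrible".
isolated = learn_lexicon(camera_reviews)
print(classify("terrible lens", isolated))

# Lifelong: the knowledge base has already learned "terrible" from phones.
kb = KnowledgeBase()
kb.learn_task(phone_reviews)
kb.learn_task(camera_reviews)
print(classify("terrible lens", kb.lexicon))
```

In this sketch the isolated camera model cannot judge "terrible lens" because the sentiment word never appeared in its own data, whereas the accumulated lexicon carries the word over from the phone domain. A real lifelong learner would also need to decide which past knowledge is reliable and applicable to the new task, which simple count merging does not address.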
It is not hard to imagine that this sharing of information or knowledge across domains and tasks holds in virtually every field. It is particularly obvious in natural language processing because the meanings of words and phrases are basically the same across domains and tasks, and so is the sentence syntax. No matter what subject matter we talk about, we use the same language, although each subject may use only a small subset of the words and phrases in that language. If that were not the case, it is doubtful that natural language would ever have been developed by humans. Thus, lifelong learning is generally applicable and is not restricted to sentiment analysis.
The goal of this book is to introduce this emerging machine learning paradigm and to present a comprehensive survey and review of the important research results and latest ideas in the area. We also want to propose a unified framework for the research area. Currently, several research topics in machine learning are closely related to lifelong learning, most notably multi-task learning and transfer learning, because they also employ the idea of knowledge sharing and transfer. This book brings all these topics under one roof and discusses their similarities and differences. We see lifelong learning as an extension of these related paradigms. Through this book, we would also like to motivate and encourage researchers to work on lifelong learning. We believe it represents a major research direction for both machine learning and artificial intelligence for years to come. Without the capability of retaining and accumulating knowledge learned in the past, making inferences about it, and using the knowledge to help future learning and problem solving, achieving artificial general intelligence (AGI) is unlikely.
Two main principles have guided the writing of this book. First, it should contain strong motivations for conducting research in lifelong learning in order to encourage graduate students and researchers to work on lifelong learning problems. Second, the writing should be accessible to practitioners and upper-level undergraduate students who have basic knowledge of machine learning and data mining, yet there should be sufficient in-depth material for graduate students who plan to pursue Ph.D. degrees in the machine learning and/or data mining fields.
This book is thus suitable for students, researchers, and practitioners who are interested in machine learning, data mining, natural language processing, or pattern recognition. Lecturers can readily use the book in class for courses in any of these related fields.