Deep Learning for Natural Language Processing
Ebook · 580 pages · 6 hours


About this ebook

Explore the most challenging issues of natural language processing, and learn how to solve them with cutting-edge deep learning!

Inside Deep Learning for Natural Language Processing you’ll find a wealth of NLP insights, including:

    An overview of NLP and deep learning
    One-hot text representations
    Word embeddings
    Models for textual similarity
    Sequential NLP
    Semantic role labeling
    Deep memory-based NLP
    Linguistic structure
    Hyperparameters for deep NLP

Deep learning has advanced natural language processing to exciting new levels and powerful new applications! For the first time, computer systems can achieve "human" levels of summarizing, making connections, and other tasks that require comprehension and context. Deep Learning for Natural Language Processing reveals the groundbreaking techniques that make these innovations possible. Stephan Raaijmakers distills his extensive knowledge into useful best practices, real-world applications, and the inner workings of top NLP algorithms.

About the technology
Deep learning has transformed the field of natural language processing. Neural networks recognize not just words and phrases, but also patterns. Models infer meaning from context, and determine emotional tone. Powerful deep learning-based NLP models open up a goldmine of potential uses.

About the book
Deep Learning for Natural Language Processing teaches you how to create advanced NLP applications using Python and the Keras deep learning library. You’ll learn to use state-of-the-art tools and techniques including BERT and XLNet, multitask learning, and deep memory-based NLP. Fascinating examples give you hands-on experience with a variety of real-world NLP applications. Plus, the detailed code discussions show you exactly how to adapt each example to your own uses!

What's inside

    Improve question answering with sequential NLP
    Boost performance with linguistic multitask learning
    Accurately interpret linguistic structure
    Master multiple word embedding techniques

About the reader
For readers with intermediate Python skills and a general knowledge of NLP. No experience with deep learning is required.

About the author
Stephan Raaijmakers is professor of Communicative AI at Leiden University and a senior scientist at The Netherlands Organization for Applied Scientific Research (TNO).

Table of Contents
PART 1 INTRODUCTION
1 Deep learning for NLP
2 Deep learning and language: The basics
3 Text embeddings
PART 2 DEEP NLP
4 Textual similarity
5 Sequential NLP
6 Episodic memory for NLP
PART 3 ADVANCED TOPICS
7 Attention
8 Multitask learning
9 Transformers
10 Applications of Transformers: Hands-on with BERT
Language: English
Publisher: Manning
Release date: Dec 20, 2022
ISBN: 9781638353997



    Deep Learning for Natural Language Processing

    Stephan Raaijmakers

    To comment go to liveBook

    Manning

    Shelter Island

    For more information on this and other Manning titles go to

    www.manning.com

    Copyright

    For online information and ordering of these and other Manning books, please visit www.manning.com. The publisher offers discounts on these books when ordered in quantity.

    For more information, please contact

    Special Sales Department

    Manning Publications Co.

    20 Baldwin Road

    PO Box 761

    Shelter Island, NY 11964

    Email: orders@manning.com

    ©2022 by Manning Publications Co. All rights reserved.

    No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form or by means electronic, mechanical, photocopying, or otherwise, without prior written permission of the publisher.

    Many of the designations used by manufacturers and sellers to distinguish their products are claimed as trademarks. Where those designations appear in the book, and Manning Publications was aware of a trademark claim, the designations have been printed in initial caps or all caps.

    ♾ Recognizing the importance of preserving what has been written, it is Manning’s policy to have the books we publish printed on acid-free paper, and we exert our best efforts to that end. Recognizing also our responsibility to conserve the resources of our planet, Manning books are printed on paper that is at least 15 percent recycled and processed without the use of elemental chlorine.

    ISBN: 9781617295447

    brief contents

    Part 1. Introduction

      1   Deep learning for NLP

      2   Deep learning and language: The basics

      3   Text embeddings

    Part 2. Deep NLP   

      4   Textual similarity

      5   Sequential NLP

      6   Episodic memory for NLP

    Part 3. Advanced topics

      7   Attention

      8   Multitask learning

      9   Transformers

    10   Applications of Transformers: Hands-on with BERT

    Bibliography

    contents

    Front matter

    preface

    acknowledgments

    about this book

    about the author

    about the cover illustration

    Part 1. Introduction

      1   Deep learning for NLP

    1.1  A selection of machine learning methods for NLP

    The perceptron

    Support vector machines

    Memory-based learning

    1.2  Deep learning

    1.3  Vector representations of language

    Representational vectors

    Operational vectors

    1.4  Vector sanitization

    The hashing trick

    Vector normalization

      2   Deep learning and language: The basics

    2.1  Basic architectures of deep learning

    Deep multilayer perceptrons

    Two basic operators: Spatial and temporal

    2.2  Deep learning and NLP: A new paradigm

      3   Text embeddings

    3.1  Embeddings

    Embedding by direct computation: Representational embeddings

    Learning to embed: Procedural embeddings

    3.2  From words to vectors: Word2Vec

    3.3  From documents to vectors: Doc2Vec

    Part 2. Deep NLP   

      4   Textual similarity

    4.1  The problem

    4.2  The data

    Authorship attribution and verification data

    4.3  Data representation

    Segmenting documents

    Word-level information

    Subword-level information

    4.4  Models for measuring similarity

    Authorship attribution

    Verifying authorship

      5   Sequential NLP

    5.1  Memory and language

    The problem: Question Answering

    5.2  Data and data processing

    5.3  Question Answering with sequential models

    RNNs for Question Answering

    LSTMs for Question Answering

    End-to-end memory networks for Question Answering

      6   Episodic memory for NLP

    6.1  Memory networks for sequential NLP

    6.2  Data and data processing

    PP-attachment data

    Dutch diminutive data

    Spanish part-of-speech data

    6.3  Strongly supervised memory networks: Experiments and results

    PP-attachment

    Dutch diminutives

    Spanish part-of-speech tagging

    6.4  Semi-supervised memory networks

    Semi-supervised memory networks: Experiments and results

    Part 3. Advanced topics

      7   Attention

    7.1  Neural attention

    7.2  Data

    7.3  Static attention: MLP

    7.4  Temporal attention: LSTM

    7.5  Experiments

    MLP

    LSTM

      8   Multitask learning

    8.1  Introduction to multitask learning

    8.2  Multitask learning

    8.3  Multitask learning for consumer reviews: Yelp and Amazon

    Data handling

    Hard parameter sharing

    Soft parameter sharing

    Mixed parameter sharing

    8.4  Multitask learning for Reuters topic classification

    Data handling

    Hard parameter sharing

    Soft parameter sharing

    Mixed parameter sharing

    8.5  Multitask learning for part-of-speech tagging and named-entity recognition

    Data handling

    Hard parameter sharing

    Soft parameter sharing

    Mixed parameter sharing

      9   Transformers

    9.1  BERT up close: Transformers

    9.2  Transformer encoders

    Positional encoding

    9.3  Transformer decoders

    9.4  BERT: Masked language modeling

    Training BERT

    Fine-tuning BERT

    Beyond BERT

    10   Applications of Transformers: Hands-on with BERT

    10.1  Introduction: Working with BERT in practice

    10.2  A BERT layer

    10.3  Training BERT on your data

    10.4  Fine-tuning BERT

    10.5  Inspecting BERT

    Homonyms in BERT

    10.6  Applying BERT

    Bibliography

    index

    front matter

    preface

    Computers have been trying hard to make sense of language in recent decades. Supported by disciplines like linguistics, computer science, statistics, and machine learning, the field of computational linguistics or natural language processing (NLP) has come into full bloom, supported by numerous scientific journals, conferences, and active industry participation. Big tech companies like Google, Facebook, IBM, and Microsoft appear to have prioritized their efforts in natural language analysis and understanding, and progressively offer datasets and helpful open source software for the natural language processing community. Currently, deep learning is increasingly dominating the NLP field.

    To someone who is eager to join this exciting field, the high pace at which new developments take place in the deep learning–oriented NLP community may seem daunting. There seems to be a large gap between descriptive, statistical, and more traditional machine learning approaches to NLP on the one hand, and the highly technical, procedural approach of deep learning neural networks on the other hand. This book aims to bridge this gap a bit, through a gentle introduction to deep learning for NLP. It targets students, linguists, computer scientists, practitioners, and all other people interested in artificial intelligence. Let’s refer to these groups of people as NLP engineers. When I was a student, lacking a systematic computational linguistics program in those days, I pretty much pieced together a personal—and necessarily incomplete—NLP curriculum. It was a tough job. My motivation for writing this book has been to make this journey a bit easier for aspiring NLP engineers, and to give you a head start by introducing you to the fundamentals of deep learning–based NLP.

    I sincerely believe that to become an NLP engineer with the ambition to produce innovative solutions, you need to possess advanced software development and machine learning skills. You need to fiddle with algorithms and come up with new variants yourself. Much like the 17th-century Dutch scientist Antonie van Leeuwenhoek, who designed and produced his own microscopes for experimentation, the modern-day NLP engineer creates their own digital instruments for studying and analyzing language. Whenever an NLP engineer succeeds in building a model of natural language that adheres to the facts, that is, is observationally adequate, not only industrial (that is, practical) but also scientific progress has been made. I invite you to adopt this mindset, to continuously observe how humans process language, and to contribute to the wonderful field of NLP, where, in spite of algorithmic progress, so many topics are still open!

    acknowledgments

    I wish to thank my employer, TNO (the Netherlands Organisation for Applied Scientific Research) for supporting the realization of this book. My thanks go to students from the faculties of Humanities and Science from Leiden University and assorted readers of the book for your feedback on the various MEAP versions, including correcting typos and other errors. I would also like to thank the Manning staff—in particular, development editor Dustin Archibald, production editor Keri Hales, and proofreader Katie Tennant, for their enduring support, encouragement and, above all, patience.

    At my request, Manning transfers all author fees to UNICEF. Through your purchase of this book, you contribute to a better future for children in need, and that need is even more acute in 2022. UNICEF is committed to ensuring special protection for the most disadvantaged children—victims of war, disasters, extreme poverty, all forms of violence and exploitation, and those with disabilities (www.unicef.org/about-us/mission-statement). Many thanks for your help.

    To all the reviewers: Alejandro Alcalde Barros, Amlan Chatterjee, Chetan Mehra, Deborah Mesquita, Eremey Vladimirovich Valetov, Erik Sapper, Giuliano Araujo Bertoti, Grzegorz Mika, Harald Kuhn, Jagan Mohan, Jorge Ezequiel Bo, Kelum Senanayake, Ken W. Alger, Kim Falk Jørgensen, Manish Jain, Mike F. Cuddy, Mortaza Doulaty, Ninoslav Čerkez, Philippe Van Bergen, Prabhuti Prakash, Ritwik Dubey, Rohit Agarwal, Shashank Polasa Venkata, Sowmya Vajjala, Thomas Peklak, Vamsi Sistla, and Vlad Navitski, thank you—your suggestions helped make this a better book.

    about this book

    This book will give you a thorough introduction to deep learning applied to a variety of language analysis tasks, supported by actual hands-on code. Explicitly linking the evergreens of computational linguistics (such as part-of-speech tagging, textual similarity, topic labeling, and Question Answering) to deep learning will help you become proficient in deep learning–based natural language processing (NLP). Beyond this, the book covers state-of-the-art approaches to challenging new problems.

    Who should read this book

    The intended audience for this book is anyone working in NLP: computational linguists, software engineers, and students. The field of machine learning–based NLP is vast and comprises a daunting number of formalisms and approaches. With deep learning entering the stage, many are eager to get their feet wet but may shy away from the highly technical nature of deep learning and the fast pace of this field—new approaches, software, and papers emerge on a daily basis. This book will bring you up to speed.

    This book is not for those who wish to become proficient in deep learning in a general manner, readers in need of an introduction to NLP, or anyone desiring to master Keras, the deep learning Python library we use. Manning offers two books that fill these gaps and can be read as companions to this book: Natural Language Processing in Action (Hobson Lane, Cole Howard, and Hannes Hapke, 2019; www.manning.com/books/natural-language-processing-in-action) and Deep Learning with Python (François Chollet, 2021; www.manning.com/books/deep-learning-with-python-second-edition). If you want a quick and thorough introduction to Keras, visit https://keras.io/getting_started/intro_to_keras_for_engineers.

    How this book is organized: A road map

    Part 1, consisting of chapters 1, 2, and 3, introduces the history of deep learning, the basic architectures of deep learning for NLP and their implementation in Keras, and how to represent text for deep learning using embeddings and popular embedding strategies.

    Part 2, consisting of chapters 4, 5, and 6, focuses on assessing textual similarity with deep learning, processing long sequences with memory-equipped models for Question Answering, and then applying such memory models to other NLP tasks.

    Part 3, consisting of chapters 7, 8, 9, and 10, starts by introducing neural attention, then moves on to multitask learning and Transformers, and finally gets hands-on with BERT, inspecting the embeddings it produces.

    About the code

    The code we develop in this book is somewhat generic. Keras is a dynamic library, and while I was writing the book, some things changed, including the now-exclusive dependency of Keras on TensorFlow as a backend (a Keras backend is low-level code for performing efficient neural network computations). The changes are limited, but occasionally you may need to adapt the syntax of your code if you're using the latest Keras version (version 2.0 and above).
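    For example, here is a minimal sketch (not taken from the book) of the kind of adaptation involved: with Keras now shipped inside TensorFlow, imports that older code wrote against the standalone keras package are typically written against tensorflow.keras instead.

    # Older, standalone-Keras style (may fail if only TensorFlow is installed):
    # from keras.models import Sequential
    # from keras.layers import Dense

    # TensorFlow-backed style used by current Keras 2.x releases:
    from tensorflow.keras.models import Sequential
    from tensorflow.keras.layers import Dense

    model = Sequential([
        Dense(16, activation='relu', input_shape=(100,)),   # toy network, purely illustrative
        Dense(1, activation='sigmoid')
    ])
    model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])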

    In the book, we draw pragmatic inspiration from public domain, open source code and reuse code snippets that are handy. Specific sources include the following:

    The Keras source code base, which contains many examples addressing NLP

    The code accompanying the companion book Deep Learning with Python

    Popular and excellent open source websites like https://adventuresinmachinelearning.com and https://machinelearningmastery.com

    Blogs like http://karpathy.github.io

    Coder communities like Stack Overflow

    The emphasis of the book is more on outlining algorithms and code and less on achieving academic state-of-the-art results. However, starting from the basic solutions and approaches outlined throughout the book, and backed up by the many practical code examples, you will be empowered to reach better results.

    This book contains many examples of source code both in numbered listings and in line with normal text. In both cases, source code is formatted in a fixed-width font like this to separate it from ordinary text.

    In many cases, the original source code has been reformatted; we’ve added line breaks and reworked indentation to accommodate the available page space in the book. In some cases, even this was not enough, and listings include line-continuation markers (➥). Code annotations accompany many of the listings, highlighting important concepts.

    You can get executable snippets of code from the liveBook (online) version of this book at https://livebook.manning.com/book/deep-learning-for-natural-language-processing. The complete code for the examples in the book is available for download from the Manning website at https://www.manning.com/books/deep-learning-for-natural-language-processing, and from GitHub at https://github.com/stephanraaijmakers/deeplearningfornlp.

    liveBook discussion forum

    Purchase of Deep Learning for Natural Language Processing includes free access to liveBook, Manning’s online reading platform. Using liveBook’s exclusive discussion features, you can attach comments to the book globally or to specific sections or paragraphs. It’s a snap to make notes for yourself, ask and answer technical questions, and receive help from the author and other users. To access the forum, go to https://livebook.manning.com/book/deep-learning-for-natural-language-processing/discussion. You can also learn more about Manning's forums and the rules of conduct at https://livebook.manning.com/discussion.

    Manning’s commitment to our readers is to provide a venue where a meaningful dialogue between individual readers and between readers and the author can take place. It is not a commitment to any specific amount of participation on the part of the author, whose contribution to the forum remains voluntary (and unpaid). We suggest you try asking him some challenging questions lest his interest stray! The forum and the archives of previous discussions will be accessible from the publisher’s website as long as the book is in print.

    about the author

    Stephan Raaijmakers

    received his education as a computational linguist at Leiden University, the Netherlands. He obtained his PhD on machine learning–based NLP from Tilburg University. He has been working since 2000 at TNO, the Netherlands Organisation for Applied Scientific Research, an independent organization founded by law in 1932, aimed at enabling business and government to apply scientific knowledge, contributing to industrial innovation and societal welfare. Within TNO, he has worked on many machine learning–intensive projects dealing with language. Stephan is also a professor of communicative AI at Leiden University (LUCL, Leiden University Centre for Linguistics). His chair focuses on deep learning–based approaches to human-machine dialogue.

    about the cover illustration

    The figure on the cover of Deep Learning for Natural Language Processing, titled Paisan de dalecarlie, or Peasant, Dalecarlia, is from an image held by the New York Public Library in the Miriam and Ira D. Wallach Division of Art, Prints and Photographs: Picture Collection. Each illustration is finely drawn and colored by hand.

    In those days, it was easy to identify where people lived and what their trade or station in life was just by their dress. Manning celebrates the inventiveness and initiative of the computer business with book covers based on the rich diversity of regional culture centuries ago, brought back to life by pictures from collections such as this one.

    Part 1. Introduction

    Part 1 introduces the history of deep learning, relating it to other forms of machine learning–based natural language processing (NLP; chapter 1). Chapter 2 discusses the basic architectures of deep learning for NLP and their implementation in Keras. Chapter 3 discusses how to represent text for deep learning using embeddings and focuses on Word2Vec and Doc2Vec, two popular embedding strategies.

    1 Deep learning for NLP

    This chapter covers

    Taking a short road trip through machine learning applied to NLP

    Learning about the historical roots of deep learning

    Introducing vector-based representations of language

    Language comes naturally to humans but has historically been hard for computers to grasp. This book addresses the application of recent, cutting-edge deep learning techniques to automated language analysis. In the last decade, deep learning has emerged as the vehicle of the latest wave in artificial intelligence (AI). Results have consistently redefined the state of the art for a plethora of data analysis tasks in a variety of domains. For an increasing number of deep learning algorithms, better-than-human (human-parity or superhuman) performance has been reported: for instance, speech recognition in noisy conditions and medical diagnosis based on images. Current deep learning–based natural language processing (NLP) outperforms all pre-existing approaches by a large margin. What exactly makes deep learning so suitable for these intricate analysis tasks, in particular language processing? This chapter presents some of the background necessary to answer this question and guides you through a selection of important topics in machine learning for NLP.

    We first examine a few main approaches to machine learning: the neural perceptron, support vector machines, and memory-based learning. After that, we look at historical developments leading to deep learning and address vector representations: encoding data (notably, textual) with numerical representations suitable for processing by neural networks.

    Let’s start by discussing a few well-known machine learning–based NLP algorithms in some detail, illustrated with a handful of practical examples to whet your appetite. After that, we present the case for deep learning–based NLP.

    1.1 A selection of machine learning methods for NLP

    Let’s start with a quick (and necessarily incomplete) tour of machine learning–based NLP (see figure 1.1). Current natural language processing heavily relies on machine learning. Machine learning has its roots in statistics, building, among other sources, on the seminal work of Thomas Bayes and Pierre-Simon Laplace in the 18th and 19th centuries and the least-squares methods for curve approximation by Legendre in 1812. The field of neural computing started with the work of McCulloch and Pitts in 1943, who put forward a formal theory (and logical calculus) of neural networks. It would take until 1950 before learning machines were proposed by Alan Turing.

    Figure 1.1 Machine learning for NLP. A first look at neural machine learning, plus background on support vector machines and memory-based learning.

    All machine learning algorithms that perform classification (labeling) share a single goal: to arrive at linear separability of data that is labeled with classes: labels that indicate a (usually exclusive) category to which a data point belongs. Data points presented to a machine learning algorithm typically consist of vector representations of descriptive traits. These representations constitute a so-called input space. The subsequent processing, manipulation, and abstraction of the input space during the learning stage of a self-learning algorithm yields a feature space. Some of this processing can be done external to the algorithm: raw data can be converted to features as part of a preprocessing stage, which technically creates an input space consisting of features. The output space consists of class labels that separate the various data points in a dataset based on the class boundaries. The essence of deep learning, as we will see, is to learn abstract representations in the feature space. Figure 1.2 illustrates how deep learning mediates between inputs and outputs: through abstract representations derived from the input data.

    Figure 1.2 From input space to output space (labels). Deep learning constructs intermediate, abstract representations of input data, mapping an input space to a feature space. Through this mapping, it learns to relate input to output: to map the input space to an output space (encoding class labels or other interpretations of the input data).
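    As a minimal, hypothetical illustration (the traits and values below are invented for this sketch, not taken from the book), each data point in the input space is a vector of descriptive traits, and the output space holds the corresponding class labels:

    import numpy as np

    # Input space: each row is one data point described by numerical traits
    # (here, hypothetical counts of the words "goal", "election", and "vote").
    X = np.array([[5, 0, 0],    # a sports article
                  [0, 3, 4]])   # a politics article

    # Output space: one class label per data point (0 = sports, 1 = politics).
    y = np.array([0, 1])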

    Training a machine learning component involves learning boundaries between classes, which may depend on complex functions. The burden of learning class separability can be alleviated by smart feature preprocessing. Learning the class boundaries occurs by performing implicit or explicit transformations on linearly inseparable input spaces. Figure 1.3 shows a non-linear class boundary: a line separating objects in two classes that cannot be modeled by a linear function f(x) = ax + b. The function corresponding to this line is a non-linear classifier. A real-world example would be a bowl of multicolored marbles mixed in such a way that they cannot be separated from each other by means of a straight plate (like a flat scoop).

    Figure 1.3 Non-linear classifier. The two classes (indicated with circles and triangles) cannot be separated with a straight line.

    A linear function that separates classes with a straight line is a linear classifier and would produce a picture like figure 1.4.

    Figure 1.4 Linear classifier. The two classes (indicated with circles and triangles) can be separated with a straight line.

    We now briefly address three types of machine learning approaches that have had major uptake in NLP:

    The single-layer perceptron and its generalization to the multilayer perceptron

    Support vector machines

    Memory-based learning

    While there is a lot more to the story, these three types embody, respectively, the neural or cognitive, eager, and lazy types of machine learning. All of these approaches relate naturally to the deep learning approach to natural language analysis, which is the main topic of this book.

    1.1.1 The perceptron

    In 1957, the first implementation of a biologically inspired machine learning component was realized: Rosenblatt’s perceptron. This device, implemented on physical hardware, allowed the processing of visual stimuli represented by a square 20-by-20 array of 400 photosensitive cells. The weights of this network were set by electromotors driving potentiometers. The learning part of this perceptron was based on a simple one-layer neural network, which effectively became the archetype of neural networks (see figure 1.5).

    Figure 1.5 Rosenblatt’s perceptron: the fruit fly of neural machine learning. It represents a single neuron receiving several inputs and generating (by applying a threshold) a single output value.

    Suppose you have a vector of features that describe aspects of a certain object of interest, like the words in a document, and you want to create a function from these features to a binary label (for instance, you want to decide if the document conveys a positive or negative sentiment). The single-layer perceptron is capable of doing this. It produces a binary output y (0 or 1) from a weighted combination of input values x1...xn, based on a threshold θ and a bias b:
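    In its standard form, this rule reads: y = 1 if w1x1 + w2x2 + ... + wnxn + b > θ, and y = 0 otherwise.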

    The weights w1, ..., wn are learned from annotated training data consisting of input vectors labeled with output labels. The thresholded unit is called a neuron. It receives the summed and weighted input v. So, assume we have the set of weights and associated inputs shown in table 1.1.

    Table 1.1 Weighted input

    Then their summed and weighted output would be v = w1x1 + w2x2 + ... + wnxn.
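    For a purely hypothetical set of values (not those of table 1.1), say weights (0.5, 0.2, 0.9) and inputs (1, 0, 1), this gives v = 0.5·1 + 0.2·0 + 0.9·1 = 1.4, which the neuron then compares against the threshold θ to produce its binary output.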

    This simplistic network is able to learn a specific set of functions that address the class of linearly separable problems: problems that are separable in input space with a linear function. Usually, these are the easier problems in classification. It is quite common for data to be heavily entangled. Consider undoing a knot in two separate ropes. Some knots are easy and can be undone in one step. Other knots need many more steps. This is the business of machine learning algorithms: undoing the intertwining of data objects living in different classes. For NLP, the single-layer perceptron nowadays plays a marginal role, but it underlies several derived algorithms that strive for simplicity, such as online learning (Bottou 1998).
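    A minimal sketch (not from the book) makes this limitation concrete: the logical AND of two binary inputs is linearly separable, whereas XOR is not, so a perceptron can typically fit AND perfectly, while no linear classifier can get more than three of the four XOR cases right.

    import numpy as np
    from sklearn.linear_model import Perceptron

    X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
    y_and = np.array([0, 0, 0, 1])   # linearly separable
    y_xor = np.array([0, 1, 1, 0])   # not linearly separable

    for name, y in [('AND', y_and), ('XOR', y_xor)]:
        clf = Perceptron(max_iter=1000).fit(X, y)
        # AND is typically classified perfectly; any linear model is capped at 0.75 on XOR.
        print(name, clf.score(X, y))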

    A practical example of a perceptron classifier is the following. We set out to build a document classifier that categorizes raw texts as being broadly about either atheism or medical topics. The popular 20 newsgroups dataset (http://qwone.com/~jason/20Newsgroups), one of the most widely used datasets for building and evaluating document classifiers, consists of newsgroup (Usenet) texts distributed over 20 hand-assigned topics. Here is what we do:

    Make a subselection for two newsgroups of interest: alt.atheism and sci.med.

    Train a simple perceptron on a vector representation of the documents in these two classes. A vector is nothing more than a container (an ordered list of a finite dimension) for numerical values.

    The vector representation is based on a statistical representation of words called TF.IDF, which we discuss in section 1.3.2. For now, just assume TF.IDF is a magic trick that turns documents into vectors that can be fed to a machine learning algorithm (a rough sketch of the weighting follows this list).
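    As a rough sketch of that trick (one common formulation, not necessarily the exact one used later in the book): a word t in document d receives the weight tf(t, d) · log(N / df(t)), where tf(t, d) is how often t occurs in d, df(t) is the number of documents containing t, and N is the total number of documents. Words that are frequent within a document but rare across the collection therefore score highest.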

    Don’t worry if you don’t completely understand the following listing right now. It’s here to give you an idea of what the code looks like for a basic perceptron.

    Listing 1.1 A simple perceptron-based document classifier

    from sklearn.linear_model import Perceptron                      ①
    from sklearn.datasets import fetch_20newsgroups                  ②
    categories = ['alt.atheism', 'sci.med']                          ③
    train = fetch_20newsgroups(subset='train', categories=categories,
                               shuffle=True)                         ④
    perceptron = Perceptron(max_iter=100)                            ⑤
    from sklearn.feature_extraction.text import CountVectorizer      ⑥
    cv = CountVectorizer()
    X_train_counts = cv.fit_transform(train.data)
    from sklearn.feature_extraction.text import TfidfTransformer     ⑦
    tfidf_tf = TfidfTransformer()
    X_train_tfidf = tfidf_tf.fit_transform(X_train_counts)
    perceptron.fit(X_train_tfidf, train.target)                      ⑧
    test_docs = ['Religion is widespread, even in modern times',
                 'His kidney failed',
                 'The pope is a controversial leader',
                 'White blood cells fight off infections',
                 'The reverend had a heart attack in church']        ⑨
    X_test_counts = cv.transform(test_docs)                          ⑩
    X_test_tfidf = tfidf_tf.transform(X_test_counts)
    pred = perceptron.predict(X_test_tfidf)                          ⑪
    for doc, category in zip(test_docs, pred):                       ⑫
        print('%r => %s' % (doc, train.target_names[category]))

    ① Import a basic perceptron classifier from sklearn.

    ② Import a routine for fetching the 20 newsgroups dataset from sklearn.

    ③ Limit the categories of the dataset.

    ④ Obtain documents for our category selection.

    ⑤ Our perceptron is defined. It will be trained for 100 iterations.

    ⑥ The familiar CountVectorizer is fit on our training data.

    ⑦ Load, fit, and deploy a TF.IDF transformer from sklearn. It computes TF.IDF representations of our count vectors.

    ⑧ The perceptron is trained on the TF.IDF vectors.

    ⑨ Our test documents.
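    ⑩ Transform the test documents into count vectors.

    ⑪ The trained perceptron predicts a category for each TF.IDF test vector.

    ⑫ Print each test document together with its predicted category.

    As a follow-up sketch that is not part of listing 1.1 (it reuses cv, tfidf_tf, perceptron, and categories from the listing, assumes the dataset's held-out 'test' split, and introduces the illustrative names X_heldout and acc), the trained classifier can also be evaluated on unseen newsgroup posts:

    from sklearn.metrics import accuracy_score

    test = fetch_20newsgroups(subset='test', categories=categories, shuffle=True)
    X_heldout = tfidf_tf.transform(cv.transform(test.data))
    acc = accuracy_score(test.target, perceptron.predict(X_heldout))
    print('Accuracy on held-out test documents: %.3f' % acc)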
