Discover millions of ebooks, audiobooks, and so much more with a free trial

Only $11.99/month after trial. Cancel anytime.

State of the Art on Grammatical Inference Using Evolutionary Method
State of the Art on Grammatical Inference Using Evolutionary Method
State of the Art on Grammatical Inference Using Evolutionary Method
Ebook408 pages3 hours

State of the Art on Grammatical Inference Using Evolutionary Method

Rating: 0 out of 5 stars

()

Read preview

About this ebook

State of the Art on Grammatical Inference Using Evolutionary Method presents an approach for grammatical inference (GI) using evolutionary algorithms. Grammatical inference deals with the standard learning procedure to acquire grammars based on evidence about the language. It has been extensively studied due to its high importance in various fields of engineering and science. The book's prime purpose is to enhance the current state-of-the-art of grammatical inference methods and present new evolutionary algorithms-based approaches for context free grammar induction. The book's focus lies in the development of robust genetic algorithms for context free grammar induction.

The new algorithms discussed in this book incorporate Boolean-based operators during offspring generation within the execution of the genetic algorithm. Hence, the user has no limitation on utilizing the evolutionary methods for grammatical inference.

  • Discusses and summarizes the latest developments in Grammatical Inference, with a focus on Evolutionary Methods
  • Provides an understanding of premature convergence as well as genetic algorithms
  • Presents a performance analysis of genetic algorithms as well as a complete look into the wide range of applications of Grammatical Inference methods
  • Demonstrates how to develop a robust experimental environment to conduct experiments using evolutionary methods and algorithms
LanguageEnglish
Release dateNov 13, 2021
ISBN9780128221549
State of the Art on Grammatical Inference Using Evolutionary Method
Author

Hari Mohan Pandey

Dr. Hari Mohan Pandey is Lecturer in Computer Science at Edge Hill University, UK. He is specialized in Computer Science & Engineering. His research area includes artificial intelligence, soft computing techniques, natural language processing, language acquisition and machine learning algorithms. He is author of various books in computer science engineering (algorithms, programming and evolutionary algorithms). He has published over 50 scientific papers in reputed journals and conferences, served as session chair, leading guest editor and delivered keynotes. He has been given the prestigious award “The Global Award for the Best Computer Science Faculty of the Year 2015” award for completing INDO-US project “GENTLE”, award (Certificate of Exceptionalism) from the Prime Minister of India and award for developing innovative teaching and learning models for higher-education. Previously, he worked as a research fellow in machine learning at Middlesex University, London where he worked on a European Commission project- DREAM4CAR. His role was to research and develop advanced machine learning techniques relevant to the project goals and to evaluate these on both project and reference data sets, to lead and manage relevant work packages in support of the Project, ensuring appropriate interfacing with partners.

Related to State of the Art on Grammatical Inference Using Evolutionary Method

Related ebooks

Science & Mathematics For You

View More

Related articles

Related categories

Reviews for State of the Art on Grammatical Inference Using Evolutionary Method

Rating: 0 out of 5 stars
0 ratings

0 ratings0 reviews

What did you think?

Tap to rate

Review must be at least 10 words

    Book preview

    State of the Art on Grammatical Inference Using Evolutionary Method - Hari Mohan Pandey

    State of the Art on Grammatical Inference Using Evolutionary Method

    Hari Mohan Pandey

    Department of Computer Science, Edge Hill University, United Kingdom

    Table of Contents

    Cover image

    Title page

    Copyright

    Dedication

    Foreword

    Preface

    Acknowledgment

    Abbreviations

    1. Introduction and scientific goals

    1.1. Introduction

    1.2. Why grammatical inference is popular

    1.3. Scientific goals: why this book?

    2. State of the art: grammatical inference

    2.1. Introduction

    2.2. Part 1. Preliminary definitions

    2.3. Part 2. Introduction to learning algorithms

    2.4. Comparison and discussion

    2.5. What are the challenges with grammatical inference algorithms?

    2.6. Summary

    3. State of the art: genetic algorithms and premature convergence

    3.1. Introduction

    3.2. Factors affecting genetic algorithms

    3.3. Theoretical framework

    3.4. Approaches to preventing premature convergence

    3.5. Classifications and analyses

    3.6. Challenges with the genetic algorithm

    3.7. Summary

    4. Genetic algorithms and grammatical inference

    4.1. Introduction

    4.2. Bit-mask oriented genetic algorithm

    4.3. Bit-masking oriented data structure

    4.4. Reproduction operators: crossover and mutation mask fill

    4.5. New offspring generation

    4.6. Genetic algorithm implemented for grammar induction

    4.7. Maintaining regularity and generalization and minimum description length principle

    4.8. Grammatical inference and minimum description length principle

    4.9. Summary

    5. Performance analysis of genetic algorithm for grammatical inference

    5.1. Introduction

    5.2. Simulation model and test languages

    5.3. Parameter selection and tuning

    5.4. Performance analysis of proposed bit masking-oriented genetic algorithm

    5.5. Summary

    6. Applications of grammatical inference methods and future development

    6.1. Introduction

    6.2. Application of grammatical inference method

    6.3. Opportunities for future research

    Subject index

    Author index

    Copyright

    Academic Press is an imprint of Elsevier

    125 London Wall, London EC2Y 5AS, United Kingdom

    525 B Street, Suite 1650, San Diego, CA 92101, United States

    50 Hampshire Street, 5th Floor, Cambridge, MA 02139, United States

    The Boulevard, Langford Lane, Kidlington, Oxford OX5 1GB, United Kingdom

    Copyright © 2022 Elsevier Inc. All rights reserved.

    No part of this publication may be reproduced or transmitted in any form or by any means, electronic or mechanical, including photocopying, recording, or any information storage and retrieval system, without permission in writing from the publisher. Details on how to seek permission, further information about the Publisher’s permissions policies and our arrangements with organizations such as the Copyright Clearance Center and the Copyright Licensing Agency, can be found at our website: www.elsevier.com/permissions.

    This book and the individual contributions contained in it are protected under copyright by the Publisher (other than as may be noted herein).

    Notices

    Knowledge and best practice in this field are constantly changing. As new research and experience broaden our understanding, changes in research methods, professional practices, or medical treatment may become necessary.

    Practitioners and researchers must always rely on their own experience and knowledge in evaluating and using any information, methods, compounds, or experiments described herein. In using such information or methods they should be mindful of their own safety and the safety of others, including parties for whom they have a professional responsibility.

    To the fullest extent of the law, neither the Publisher nor the authors, contributors, or editors, assume any liability for any injury and/or damage to persons or property as a matter of products liability, negligence or otherwise, or from any use or operation of any methods, products, instructions, or ideas contained in the material herein.

    Library of Congress Cataloging-in-Publication Data

    A catalog record for this book is available from the Library of Congress

    British Library Cataloguing-in-Publication Data

    A catalogue record for this book is available from the British Library

    ISBN: 978-0-12-822116-7

    For information on all Academic Press publications visit our website at https://www.elsevier.com/books-and-journals

    Publisher: Mara Conner

    Acquisitions Editor: Chris Katsaropoulos

    Editorial Project Manager: Franchezca A. Cabural

    Production Project Manager: Omer Mukthar

    Cover Designer: Victoria Pearson

    Typeset by TNQ Technologies

    Dedication

    To my mom and dad for motivating me and making my life sweeter.

    To my brother and sisters for helping and supporting me.

    To my wife for giving time and understanding my vision.

    To sweetest Anant for making me smile all the time.

    Foreword

    Grammatical inference, that is, learning a formal grammar from a set of observations, has many practical applications. These arise in natural language processing, computational biology, and many other areas. As with many challenges in machine learning, grammatical inference remains unsolved. Yet, over the years, a variety of machine approaches have been applied to these problems, with varying degrees of success depending on the application and the implementation. In this book, Prof. Pandey provides a very helpful overview of some of these methods, and also offers methods and results based on his own approaches within the field of evolutionary computation. Mainly, the techniques focus on derivatives of canonical genetic algorithms and swarm optimization, but the reader will be able to take any of Prof. Pandey's efforts and investigate and extend them for their own purposes using any other approach that is worthy of exploration. To get the most from the content, readers should give themselves the patience to understand the algorithms, and their acronyms, so as to be able to become familiar with the approaches. There will continue to be a need for improving our ability to perform grammatical inference reliably, and Prof. Pandey's contributions outlined in this book help to solidify the foundation for continuing this process of improvement.

    David Fogel

    Natural Selection, Inc.

    San Diego, California

    Preface

    This book is a comprehensive, hands-on guide to formal grammatical inference using evolutionary algorithms. This book is designed and Chaptalize for researchers, academicians, and students for all levels. The contents of this book assume that you have no previous knowledge or background information about grammatical inference and evolutionary algorithms, especially genetic algorithms. People familiar with formal languages, compiler construction, and parser design, and a basic understanding of algorithm design and analysis will, of course, have an easier time and move through the first few chapters quickly.

    The first three chapters of this book will allow you to take full advantage of existing learning or grammatical inference algorithms along with a comprehensive overview of both grammatical inference and genetic algorithms. You will start at the beginning, and when you have finished this book, you will have moved far along the road to grammatical inference algorithms. I have tried hard to cover most learning algorithms, starting from the very first learning algorithm proposed by Gold in 1950.

    I have also tried to stress new ways of thinking required to master learning and evolutionary algorithms, so that even experts in grammatical inference or evolutionary algorithms can benefit from this book. I have used this approach to force evolutionary algorithms or genetic algorithms into the framework of older grammatical inference algorithms with evolutionary and metaheuristic algorithms.

    To make this book even more useful, there are extensive discussions of important topics. The strengths and weakness of each method are given, which are left out by most other introductory books. Unlike many introductory books, I want to introduce you to a topic, but I also go into it in enough depth that you can actually use the techniques to develop new algorithms.

    The subject matter of this book is divided into six chapters, each of which has been written and developed with immensely simplified theoretical and practical concepts (except Chapter 1, which is the foundation chapter to introduce scientific goals). These chapters clearly provide the core concepts of grammatical inference and genetic algorithms. State of the Art on Grammatical Inference Using Evolutionary Method was written especially for those who are novices to the field of grammatical inference and evolutionary algorithms.

    As mentioned, this book is divided into six chapters. A short summary of each chapter is presented next for a quick look.

    Chapter 1 (Introduction and Scientific Goals): This chapter presents an introductory text with the motivation and scope of the book. It shows research problems and main contributions and briefly describes grammatical inference and its effectiveness across domains. The basics of formal grammars and various existing grammatical inference methods are discussed in Chapter 2.

    Chapter 2 (State of the Art: Grammatical Inference): This chapter presents the current state of the art within the context of grammatical inference methods. I have divided this chapter into two parts. Part 1 covers some preliminary definitions such as Backus-Naur form, grammars, and Chomsky hierarchy. The focus of Part 2 is mainly on the different grammatical inference methods. A comprehensive discussion of each method is presented with their strengths and weaknesses. Grammatical inference methods are classified based on various factors and learning techniques. At the end, challenges are presented with grammatical inference methods and a summary.

    Chapter 3 (State of the Art: Genetic Algorithms and Premature Convergence): The purpose of this chapter is to discuss the genetic algorithm, which is a popular algorithm in the evolutionary algorithm family. An introduction of genetic algorithm is provided with factors affecting its behavior with theoretical frameworks. In addition, challenges in executing genetic algorithms are presented. The state of the art within the context of premature convergence in genetic algorithms is presented comprehensively. A detailed summary and analysis of different methods are given for quick review. A comparative analysis is provided based on different parameters. The motivation for this chapter is to identify methods that allow the development of new strategies to prevent premature convergence, and then to apply evolutionary algorithms to solve grammatical inference problems.

    Chapter 4 (Genetic Algorithms and Grammatical Inference): The focus of this chapter is on evolutionary algorithms used for grammatical inference. For this book, I have considered genetic algorithms such as bit-masking oriented genetic algorithms. I discuss the role of bit-masking oriented data structure and its formation. The roles of the crossover mask (CM) and mutation mask (MM) are also shown with three crossover operators and an MM operator. In addition, the role of Boolean-based procedure is discussed with examples for the offspring generation. Finally, algorithms are shown that use the CM and MM with a Boolean-based procedure for grammatical inference. A detailed flowchart is presented highlighting the applicability of the minimum description length principle.

    Chapter 5 (Performance Analysis of Genetic Algorithm for Grammatical Inference): The primary aim of this chapter is to report computational and statistical test results by implementing algorithms discussed in Chapter 4. This chapter shows the detailed method for developing a robust experimental setup to conduct the experiments. This chapter is also presents a comparison with and analysis of other algorithms.

    Chapter 6 (Applications of Grammatical Inference Methods and Future Development): This chapter discusses the wide range of applications of grammatical inference methods and the possibilities of future investigation in this area.

    Now, a confession: My original goal was to make this book a one-stop resource, but realistically, grammatical inference and evolutionary algorithms have gotten far too big and powerful for any one book to do this. Nonetheless, if you finish this book, I truly believe that you will be in a position to begin writing or implementing both grammatical inference and evolutionary algorithms for commercial-quality applications! True mastery will take longer. I have tried to explain the advantages, disadvantages, and suggestions that can take you to the next level.

    Acknowledgment

    First of all, I would like to thank Baba Visvnath for constant blessings throughout the development of this book. Because an understanding of a study such as this is never the outcome of efforts of a single person, it bears the imprint of a number of people who have directly or indirectly helped me complete this book, State of the Art on Grammatical Inference Using Evolutionary Method. I would be failing in my duty if I did not thank all those whose sincere advice helped me to make this book truly educative, effective, and pleasurable. I would like to acknowledge my family: Dr. Vijay Nath Pandey, Smt. Madhuri Pandey, Anjana Pandey and Ranjana Pandey, Man Mohan Pandey, Rachana Pandey, and Anant (my sweet babu).

    I have immense pleasure in expressing wholehearted gratitude to my supervisors and mentors, Dr. Deepti Mehrotra (Amity University), Dr. Ankit Chaudhary (University of Missouri–St. Louis), and Prof. Abhay Bansal (Amity University). I am also very thankful to my friends and mentors, Prof. Arun Prakash Agarwal (Sharda University), Prof. Ankur Choudhary (Sharda University), Prof. Gaurav Raj (Sharda University), Prof. Neha Agarwal (Amity University), Prof. Neetu Narayan (Amity University), Prof. Anchal Garg (Amity University), Prof. Ranjeet Rout (NIT Srinagar), Shruti Gupta (Amity University), Prof. Graham Kendall (University of Nottingham), Prof. David Windridge (Middlesex University), Prof. Nik Bessis (Edge Hill University), Prof. David Fogel (Natural Selection), and Prof. Francesco Masulli (University of Genova) for supporting and guiding me during the preparation of this book. Also, I am truly thankful to my students, whose conceptual queries have always helped me dig more deeply into the subject matter.

    Last but not least, I am thankful to Elsevier ERC Editorial USA for feedback, support, and guidance for writing and publishing this book.

    Dr. Hari Mohan Pandey (Author)

    Abbreviations

    ABL   Alignment-Based Learning

    ADIOS   Automatic Distillation of Structure

    AGA   Abstract Genetic Algorithm

    AHCF   Adaptive Hierarchical Fair Competition

    ALLiS   Architecture for Learning Linguistic Structure

    ALPS   Age-Layered Population Structure

    AMR   Adaptive Mutation Rate

    ATIS   Air Travel Information System

    BBP   Boolean-Based Procedure

    BFWA   Broadband Fixed Wireless Access

    BMODS   Bit-Mask Oriented Data Structure

    BMOGA   Bit-Mask Oriented Genetic Algorithm

    BMU   Best Matching Unit

    BNF   Backus–Naur Form

    BS   Base Station

    CA   Clustering Approach

    CDC   Context Distribution Clustering

    CF   Crowding Factor

    CFBS   Correlative Family-Based Selection

    CFG   Context-Free Grammar

    CFL   Context-Free Language

    CG   Categorical Grammar

    CGA   Conventional Genetic Algorithm

    CHFC   Continuous Hierarchical Fair Competition

    CLL   Computational Learning of Natural Language

    CM   Crossmask/Crossover Mask

    CMOC   Crossover-Mutation Operator Combination

    CNF   Conjunctive Normal Form

    CTS   Correlative Tournament Selection

    DARO   Dynamic Application Reproduction Operator

    DCGA   Diversity Control-Oriented Genetic Algorithm

    DE   Differential Evolution

    DE-MC   Differential Evolution Markov Chain

    DFA   Deterministic Finite Automata

    DNF   Disjunctive Normal Form

    DSL   Domain-Specific Language

    EA   Evolutionary Algorithm

    EMPGA   Elite Mating Pool Genetic Algorithm

    EP   Evolutionary Programming

    ES   Evolutionary Strategy

    FC   Frequency Crossover

    FKB   Fuzzy Knowledge Base

    FUS   Fitness Uniform Selection

    GA   Genetic Algorithm

    GAC   Genetic Algorithm with Compliments

    GASOM   Genetic Algorithm Using Self-organizing Maps

    GAVaPS   Genetic Algorithm with Varying Population Size

    GAWMDL   Genetic Algorithm With Minimum Description Length Principle

    GAWOMDL   Genetic Algorithm Without Minimum Description Length Principle

    GCG   Generalized Categorial Grammar

    GI   Grammatical Inference

    GP   Genetic Programming

    GRIDS   Grammar Induction Drive by Simplicity

    GSM   Gready Search Method

    HFC   Hierarchical Fair Competition

    HM   Heuristic Method

    IFS   Individual Fitness Score

    IPA   Incest Prevention Algorithm

    ITBL   Improved Tabular Representation Algorithm

    KL   Kullback–Leibler

    LAgent   Language Agent

    LHDC   Long Hamming Distance Crossover

    LOC   Local Optimum Convergence

    MA   Memetic Algorithm

    MCMC   Markov Chain Monte-Carlo

    MCMP   Multicrossover on Multiple Parents

    MCMPIP   Multiple Crossover on Multiple Parents with Incest Prevention

    MCPC   Multiple Crossovers per Couple

    MDL   Minimum Description Length

    MM   Mutmask/Mutation Mask

    MOGA   Matrix-Oriented Genetic Algorithm

    NN   Neural Network

    NSGA   Nondominated Sorting Genetic Algorithm

    NSGA-DAR   Nondominated Sorting Genetic Algorithm–Dynamic Application of Reproduction Operators

    OEGA   Odd–Even Genetic Algorithm

    OVIS   Openbaar Vervoer Informatie System

    PA   Pygmy Algorithm

    PAC   Probably Approximately Correct

    PAHCF   Parallel Adaptive Hierarchal Fair Competition

    PDT   Population Distribution Table

    PRA   Population Replacement Algorithm

    PSO   Particle Swarm Optimization

    QPSO   Quantum-Behaved Particle Swarm Optimization

    RBCGA   Real/Binary-like Coded Genetic Algorithm

    RE   Resultant Effectivity

    RL   Regular Language

    RNN   Recurrent Neural Network

    RO   Random offspring

    ROGGA   Random Offspring Generation Genetic Algorithm

    RPF   Racial Preference Factor

    RTS   Restricted Tournament Selection

    SA   Simulated Annealing

    SASEGA   Self-adaptive Segregative Genetic Algorithm

    SASEGASA   Self-adaptive Segregative Genetic Algorithm with Simulated Annealing aspects

    SBGA   Shifting Balance Genetic Algorithm

    SBT   Shifting Balance Theory

    SDT   Social Disaster Techniques

    SEGA   Segregative Genetic Algorithm

    SHDC   Short Hamming Distance Crossover

    SHT   Search History Table

    SINR   Signal to Interference and Noise Ratio

    SM   Statistical Method

    SNR   Signal-to-Noise Ratio

    SOM   Self-organizing Map

    TBL   Tabular Representation Algorithm

    TPC   Two-Point Crossover

    TPCIS   Two-Point Crossover with Internal Swapping

    TS   Terminal Station

    TSP   Traveling Salesman Problem

    UC   Uniform Crossover

    1: Introduction and scientific goals

    Abstract

    This chapter explains the rationale and importance of studying grammatical inference. Section 1.2 highlights the application areas where grammatical inference has been used successfully. Scientific goals and different objectives are listed in Section 1.3. The purpose of this chapter is to set a road map for readers so they can develop an understanding of what this book is about.

    Keywords

    Evolutionary algorithms; Genetic algorithm; Genetic programming; Grammatical inference; Languages; Multilingual data; Programming languages

    1.1. Introduction

    Multilingual text data are increasing daily. This creates opportunities and challenges in the field of computational linguistics. This book will enrich the domains of language processing with performance improvement. This book is a resource presenting an approach to grammatical inference (GI) using the evolutionary algorithm (EA). GI deals with the standard learning procedure to acquire grammar based on evidence about language. It was extensively studied owing to its high importance in various fields of science and engineering. Typically, GI is a research domain that uses an unsupervised learning procedure to extract grammatical descriptions from a corpus or set of textual data. Several approaches have been proposed for GI that had their own merits and demerits. It was noticed that some approaches showed good results whereas others had limitations in dealing with negative samples. Hence, this book provides a basic but comprehensive overview of different GI algorithms. The book can be used as a reference study material for different domain. It focuses on formal grammar learning and applications, language acquisition, language processing, and information retrieval.

    1.2. Why grammatical inference is popular

    The field of GI is transversal to several research domains. There are few applications based purely on GI, but many uses of GI techniques exist; hence, there is plenty of room to find tasks in which GI techniques have performed far better than other machine learning or pattern recognition methods. GI methods have been used successfully in many domains, including robotics and control systems, structural pattern recognition, computational linguistics, automatic translation, computational biology, inductive logic programming, document management, compression, applications to time series, data mining, and music. GI methods have attracted researchers who are working in the software engineering domain. Some application areas in which GI techniques have been heavily used are (1) inference of general-purpose programming languages; (2) inference of grammar rules from programming languages; (3) inference of domain-specific languages; (4) inference of graph grammar and visual languages; and (5) software testing and test case generation. This discussion reveals that GI has a wide spectrum of applications, but there are few powerful GI algorithms. Where they exist, they deal only with the positive corpus. Some GI algorithms have been proposed to deal with both the positive and negative corpus. These GI algorithms use EAs or other optimization methods such as the greedy method, minimum description length principle, and statistical methods. Most GI algorithms are based on EAs whose genetic algorithm and genetic programming have been employed on a larger scale. There is no doubt about the exploratory power of EAs, but they suffer with key challenges known as premature convergence and slow finishing. Hence, a serious investigation is required to withstand premature convergence. This book presents a comprehensive study of approaches that were proposed to handle premature convergence.

    1.3. Scientific goals: why this book?

    The aims of this book are to present the state of the art on GI algorithms and discuss the most recent evolutionary algorithms proposed for GI from the positive and negative corpus.

    To achieve these aims, these objectives were formed in the preparation of this book:

    Objective 1: Explore the current state of the art for GI algorithms.

    Objective 2: Offer a comprehensive discussion on EAs. Here, the focus will be on the genetic algorithm.

    Objective 3: Explore the current state of the art on premature

    Enjoying the preview?
    Page 1 of 1