State of the Art on Grammatical Inference Using Evolutionary Method
()
About this ebook
State of the Art on Grammatical Inference Using Evolutionary Method presents an approach for grammatical inference (GI) using evolutionary algorithms. Grammatical inference deals with the standard learning procedure to acquire grammars based on evidence about the language. It has been extensively studied due to its high importance in various fields of engineering and science. The book's prime purpose is to enhance the current state-of-the-art of grammatical inference methods and present new evolutionary algorithms-based approaches for context free grammar induction. The book's focus lies in the development of robust genetic algorithms for context free grammar induction.
The new algorithms discussed in this book incorporate Boolean-based operators during offspring generation within the execution of the genetic algorithm. Hence, the user has no limitation on utilizing the evolutionary methods for grammatical inference.
- Discusses and summarizes the latest developments in Grammatical Inference, with a focus on Evolutionary Methods
- Provides an understanding of premature convergence as well as genetic algorithms
- Presents a performance analysis of genetic algorithms as well as a complete look into the wide range of applications of Grammatical Inference methods
- Demonstrates how to develop a robust experimental environment to conduct experiments using evolutionary methods and algorithms
Hari Mohan Pandey
Dr. Hari Mohan Pandey is Lecturer in Computer Science at Edge Hill University, UK. He is specialized in Computer Science & Engineering. His research area includes artificial intelligence, soft computing techniques, natural language processing, language acquisition and machine learning algorithms. He is author of various books in computer science engineering (algorithms, programming and evolutionary algorithms). He has published over 50 scientific papers in reputed journals and conferences, served as session chair, leading guest editor and delivered keynotes. He has been given the prestigious award “The Global Award for the Best Computer Science Faculty of the Year 2015 award for completing INDO-US project “GENTLE, award (Certificate of Exceptionalism) from the Prime Minister of India and award for developing innovative teaching and learning models for higher-education. Previously, he worked as a research fellow in machine learning at Middlesex University, London where he worked on a European Commission project- DREAM4CAR. His role was to research and develop advanced machine learning techniques relevant to the project goals and to evaluate these on both project and reference data sets, to lead and manage relevant work packages in support of the Project, ensuring appropriate interfacing with partners.
Related to State of the Art on Grammatical Inference Using Evolutionary Method
Related ebooks
Organic Chemistry Concepts: An EFL Approach Rating: 0 out of 5 stars0 ratingsSensory Evaluation Practices Rating: 0 out of 5 stars0 ratingsMachine Learning with Noisy Labels: Definitions, Theory, Techniques and Solutions Rating: 0 out of 5 stars0 ratingsPedigree Analysis in R Rating: 0 out of 5 stars0 ratingsAn Introduction to Practical Formal Methods Using Temporal Logic Rating: 0 out of 5 stars0 ratingsModern Deep Learning Design and Application Development: Versatile Tools to Solve Deep Learning Problems Rating: 0 out of 5 stars0 ratingsMathematical Optimization Terminology: A Comprehensive Glossary of Terms Rating: 0 out of 5 stars0 ratingsCognitive Information Systems in Management Sciences Rating: 0 out of 5 stars0 ratingsMachine Reading Comprehension: Algorithms and Practice Rating: 0 out of 5 stars0 ratingsObject-oriented Programming with Smalltalk Rating: 0 out of 5 stars0 ratingsIntroduction to Quantitative Macroeconomics Using Julia: From Basic to State-of-the-Art Computational Techniques Rating: 0 out of 5 stars0 ratingsIntroduction to Algorithms for Data Mining and Machine Learning Rating: 0 out of 5 stars0 ratingsCognitive Approach to Natural Language Processing Rating: 0 out of 5 stars0 ratingsIntroduction to Genetics: Science of Heredity Rating: 0 out of 5 stars0 ratingsEfficient Computation of Argumentation Semantics Rating: 0 out of 5 stars0 ratingsTechniques of Functional Analysis for Differential and Integral Equations Rating: 0 out of 5 stars0 ratingsExploring Mathematical Modeling in Biology Through Case Studies and Experimental Activities Rating: 0 out of 5 stars0 ratingsMulti-criteria Decision Analysis: Methods and Software Rating: 0 out of 5 stars0 ratingsGuidelines for Effective Research to Publication: A Concise Approach Rating: 0 out of 5 stars0 ratingsAn Elementary Introduction to Statistical Learning Theory Rating: 0 out of 5 stars0 ratingsFoundations of Biomaterials Engineering Rating: 0 out of 5 stars0 ratingsSyntax: A Generative Introduction Rating: 3 out of 5 stars3/5Handbook of Regression Analysis Rating: 0 out of 5 stars0 ratingsShape-Memory Polymer Device Design Rating: 5 out of 5 stars5/5Sensory Evaluation Practices Rating: 5 out of 5 stars5/5Introduction to the Constraints-Led Approach: Application in Football Rating: 5 out of 5 stars5/5Learn the Secrets of Writing Thesis: for High School, Undergraduate, Graduate Students and Academicians Rating: 0 out of 5 stars0 ratingsNature-Inspired Optimization Algorithms Rating: 4 out of 5 stars4/5Milestone Moments in Getting your PhD in Qualitative Research Rating: 0 out of 5 stars0 ratingsThe Models of Skill Acquisition and Expertise Development: A Quick Reference of Summaries Rating: 0 out of 5 stars0 ratings
Science & Mathematics For You
The Joy of Gay Sex: Fully revised and expanded third edition Rating: 4 out of 5 stars4/5The Big Book of Hacks: 264 Amazing DIY Tech Projects Rating: 4 out of 5 stars4/5Activate Your Brain: How Understanding Your Brain Can Improve Your Work - and Your Life Rating: 4 out of 5 stars4/5Ultralearning: Master Hard Skills, Outsmart the Competition, and Accelerate Your Career Rating: 4 out of 5 stars4/5Becoming Cliterate: Why Orgasm Equality Matters--And How to Get It Rating: 4 out of 5 stars4/5Homo Deus: A Brief History of Tomorrow Rating: 4 out of 5 stars4/5How Emotions Are Made: The Secret Life of the Brain Rating: 4 out of 5 stars4/5Feeling Good: The New Mood Therapy Rating: 4 out of 5 stars4/5Alchemy: The Dark Art and Curious Science of Creating Magic in Brands, Business, and Life Rating: 4 out of 5 stars4/5Outsmart Your Brain: Why Learning is Hard and How You Can Make It Easy Rating: 4 out of 5 stars4/5On Food and Cooking: The Science and Lore of the Kitchen Rating: 5 out of 5 stars5/5Memory Craft: Improve Your Memory with the Most Powerful Methods in History Rating: 3 out of 5 stars3/52084: Artificial Intelligence and the Future of Humanity Rating: 4 out of 5 stars4/5The Systems Thinker: Essential Thinking Skills For Solving Problems, Managing Chaos, Rating: 4 out of 5 stars4/5No-Drama Discipline: the bestselling parenting guide to nurturing your child's developing mind Rating: 4 out of 5 stars4/5Bad Science: Quacks, Hacks, and Big Pharma Flacks Rating: 4 out of 5 stars4/5Metaphors We Live By Rating: 4 out of 5 stars4/5109 East Palace: Robert Oppenheimer and the Secret City of Los Alamos Rating: 5 out of 5 stars5/5The Psychology of Totalitarianism Rating: 5 out of 5 stars5/5Other Minds: The Octopus, the Sea, and the Deep Origins of Consciousness Rating: 4 out of 5 stars4/5The Invisible Rainbow: A History of Electricity and Life Rating: 4 out of 5 stars4/5Free Will Rating: 4 out of 5 stars4/5The Gulag Archipelago [Volume 1]: An Experiment in Literary Investigation Rating: 4 out of 5 stars4/5The Woman Who Changed Her Brain: And Other Inspiring Stories of Pioneering Brain Transformation Rating: 4 out of 5 stars4/5A Crack In Creation: Gene Editing and the Unthinkable Power to Control Evolution Rating: 4 out of 5 stars4/5Why People Believe Weird Things: Pseudoscience, Superstition, and Other Confusions of Our Time Rating: 4 out of 5 stars4/5The Gulag Archipelago: The Authorized Abridgement Rating: 4 out of 5 stars4/5The Truth About COVID-19: Exposing The Great Reset, Lockdowns, Vaccine Passports, and the New Normal Rating: 3 out of 5 stars3/5The Big Fat Surprise: Why Butter, Meat and Cheese Belong in a Healthy Diet Rating: 4 out of 5 stars4/5The Wisdom of Psychopaths: What Saints, Spies, and Serial Killers Can Teach Us About Success Rating: 4 out of 5 stars4/5
Related categories
Reviews for State of the Art on Grammatical Inference Using Evolutionary Method
0 ratings0 reviews
Book preview
State of the Art on Grammatical Inference Using Evolutionary Method - Hari Mohan Pandey
State of the Art on Grammatical Inference Using Evolutionary Method
Hari Mohan Pandey
Department of Computer Science, Edge Hill University, United Kingdom
Table of Contents
Cover image
Title page
Copyright
Dedication
Foreword
Preface
Acknowledgment
Abbreviations
1. Introduction and scientific goals
1.1. Introduction
1.2. Why grammatical inference is popular
1.3. Scientific goals: why this book?
2. State of the art: grammatical inference
2.1. Introduction
2.2. Part 1. Preliminary definitions
2.3. Part 2. Introduction to learning algorithms
2.4. Comparison and discussion
2.5. What are the challenges with grammatical inference algorithms?
2.6. Summary
3. State of the art: genetic algorithms and premature convergence
3.1. Introduction
3.2. Factors affecting genetic algorithms
3.3. Theoretical framework
3.4. Approaches to preventing premature convergence
3.5. Classifications and analyses
3.6. Challenges with the genetic algorithm
3.7. Summary
4. Genetic algorithms and grammatical inference
4.1. Introduction
4.2. Bit-mask oriented genetic algorithm
4.3. Bit-masking oriented data structure
4.4. Reproduction operators: crossover and mutation mask fill
4.5. New offspring generation
4.6. Genetic algorithm implemented for grammar induction
4.7. Maintaining regularity and generalization and minimum description length principle
4.8. Grammatical inference and minimum description length principle
4.9. Summary
5. Performance analysis of genetic algorithm for grammatical inference
5.1. Introduction
5.2. Simulation model and test languages
5.3. Parameter selection and tuning
5.4. Performance analysis of proposed bit masking-oriented genetic algorithm
5.5. Summary
6. Applications of grammatical inference methods and future development
6.1. Introduction
6.2. Application of grammatical inference method
6.3. Opportunities for future research
Subject index
Author index
Copyright
Academic Press is an imprint of Elsevier
125 London Wall, London EC2Y 5AS, United Kingdom
525 B Street, Suite 1650, San Diego, CA 92101, United States
50 Hampshire Street, 5th Floor, Cambridge, MA 02139, United States
The Boulevard, Langford Lane, Kidlington, Oxford OX5 1GB, United Kingdom
Copyright © 2022 Elsevier Inc. All rights reserved.
No part of this publication may be reproduced or transmitted in any form or by any means, electronic or mechanical, including photocopying, recording, or any information storage and retrieval system, without permission in writing from the publisher. Details on how to seek permission, further information about the Publisher’s permissions policies and our arrangements with organizations such as the Copyright Clearance Center and the Copyright Licensing Agency, can be found at our website: www.elsevier.com/permissions.
This book and the individual contributions contained in it are protected under copyright by the Publisher (other than as may be noted herein).
Notices
Knowledge and best practice in this field are constantly changing. As new research and experience broaden our understanding, changes in research methods, professional practices, or medical treatment may become necessary.
Practitioners and researchers must always rely on their own experience and knowledge in evaluating and using any information, methods, compounds, or experiments described herein. In using such information or methods they should be mindful of their own safety and the safety of others, including parties for whom they have a professional responsibility.
To the fullest extent of the law, neither the Publisher nor the authors, contributors, or editors, assume any liability for any injury and/or damage to persons or property as a matter of products liability, negligence or otherwise, or from any use or operation of any methods, products, instructions, or ideas contained in the material herein.
Library of Congress Cataloging-in-Publication Data
A catalog record for this book is available from the Library of Congress
British Library Cataloguing-in-Publication Data
A catalogue record for this book is available from the British Library
ISBN: 978-0-12-822116-7
For information on all Academic Press publications visit our website at https://www.elsevier.com/books-and-journals
Publisher: Mara Conner
Acquisitions Editor: Chris Katsaropoulos
Editorial Project Manager: Franchezca A. Cabural
Production Project Manager: Omer Mukthar
Cover Designer: Victoria Pearson
Typeset by TNQ Technologies
Dedication
To my mom and dad for motivating me and making my life sweeter.
To my brother and sisters for helping and supporting me.
To my wife for giving time and understanding my vision.
To sweetest Anant for making me smile all the time.
Foreword
Grammatical inference, that is, learning a formal grammar from a set of observations, has many practical applications. These arise in natural language processing, computational biology, and many other areas. As with many challenges in machine learning, grammatical inference remains unsolved.
Yet, over the years, a variety of machine approaches have been applied to these problems, with varying degrees of success depending on the application and the implementation. In this book, Prof. Pandey provides a very helpful overview of some of these methods, and also offers methods and results based on his own approaches within the field of evolutionary computation. Mainly, the techniques focus on derivatives of canonical genetic algorithms and swarm optimization, but the reader will be able to take any of Prof. Pandey's efforts and investigate and extend them for their own purposes using any other approach that is worthy of exploration. To get the most from the content, readers should give themselves the patience to understand the algorithms, and their acronyms, so as to be able to become familiar with the approaches. There will continue to be a need for improving our ability to perform grammatical inference reliably, and Prof. Pandey's contributions outlined in this book help to solidify the foundation for continuing this process of improvement.
David Fogel
Natural Selection, Inc.
San Diego, California
Preface
This book is a comprehensive, hands-on guide to formal grammatical inference using evolutionary algorithms. This book is designed and Chaptalize for researchers, academicians, and students for all levels. The contents of this book assume that you have no previous knowledge or background information about grammatical inference and evolutionary algorithms, especially genetic algorithms. People familiar with formal languages, compiler construction, and parser design, and a basic understanding of algorithm design and analysis will, of course, have an easier time and move through the first few chapters quickly.
The first three chapters of this book will allow you to take full advantage of existing learning or grammatical inference algorithms along with a comprehensive overview of both grammatical inference and genetic algorithms. You will start at the beginning, and when you have finished this book, you will have moved far along the road to grammatical inference algorithms. I have tried hard to cover most learning algorithms, starting from the very first learning algorithm proposed by Gold in 1950.
I have also tried to stress new ways of thinking required to master learning and evolutionary algorithms, so that even experts in grammatical inference or evolutionary algorithms can benefit from this book. I have used this approach to force evolutionary algorithms or genetic algorithms into the framework of older grammatical inference algorithms with evolutionary and metaheuristic algorithms.
To make this book even more useful, there are extensive discussions of important topics. The strengths and weakness of each method are given, which are left out by most other introductory books. Unlike many introductory books, I want to introduce you to a topic, but I also go into it in enough depth that you can actually use the techniques to develop new algorithms.
The subject matter of this book is divided into six chapters, each of which has been written and developed with immensely simplified theoretical and practical concepts (except Chapter 1, which is the foundation chapter to introduce scientific goals). These chapters clearly provide the core concepts of grammatical inference and genetic algorithms. State of the Art on Grammatical Inference Using Evolutionary Method was written especially for those who are novices to the field of grammatical inference and evolutionary algorithms.
As mentioned, this book is divided into six chapters. A short summary of each chapter is presented next for a quick look.
Chapter 1 (Introduction and Scientific Goals): This chapter presents an introductory text with the motivation and scope of the book. It shows research problems and main contributions and briefly describes grammatical inference and its effectiveness across domains. The basics of formal grammars and various existing grammatical inference methods are discussed in Chapter 2.
Chapter 2 (State of the Art: Grammatical Inference): This chapter presents the current state of the art within the context of grammatical inference methods. I have divided this chapter into two parts. Part 1 covers some preliminary definitions such as Backus-Naur form, grammars, and Chomsky hierarchy. The focus of Part 2 is mainly on the different grammatical inference methods. A comprehensive discussion of each method is presented with their strengths and weaknesses. Grammatical inference methods are classified based on various factors and learning techniques. At the end, challenges are presented with grammatical inference methods and a summary.
Chapter 3 (State of the Art: Genetic Algorithms and Premature Convergence): The purpose of this chapter is to discuss the genetic algorithm, which is a popular algorithm in the evolutionary algorithm family. An introduction of genetic algorithm is provided with factors affecting its behavior with theoretical frameworks. In addition, challenges in executing genetic algorithms are presented. The state of the art within the context of premature convergence in genetic algorithms is presented comprehensively. A detailed summary and analysis of different methods are given for quick review. A comparative analysis is provided based on different parameters. The motivation for this chapter is to identify methods that allow the development of new strategies to prevent premature convergence, and then to apply evolutionary algorithms to solve grammatical inference problems.
Chapter 4 (Genetic Algorithms and Grammatical Inference): The focus of this chapter is on evolutionary algorithms used for grammatical inference. For this book, I have considered genetic algorithms such as bit-masking oriented genetic algorithms. I discuss the role of bit-masking oriented data structure and its formation. The roles of the crossover mask (CM) and mutation mask (MM) are also shown with three crossover operators and an MM operator. In addition, the role of Boolean-based procedure is discussed with examples for the offspring generation. Finally, algorithms are shown that use the CM and MM with a Boolean-based procedure for grammatical inference. A detailed flowchart is presented highlighting the applicability of the minimum description length principle.
Chapter 5 (Performance Analysis of Genetic Algorithm for Grammatical Inference): The primary aim of this chapter is to report computational and statistical test results by implementing algorithms discussed in Chapter 4. This chapter shows the detailed method for developing a robust experimental setup to conduct the experiments. This chapter is also presents a comparison with and analysis of other algorithms.
Chapter 6 (Applications of Grammatical Inference Methods and Future Development): This chapter discusses the wide range of applications of grammatical inference methods and the possibilities of future investigation in this area.
Now, a confession: My original goal was to make this book a one-stop resource, but realistically, grammatical inference and evolutionary algorithms have gotten far too big and powerful for any one book to do this. Nonetheless, if you finish this book, I truly believe that you will be in a position to begin writing or implementing both grammatical inference and evolutionary algorithms for commercial-quality applications! True mastery will take longer. I have tried to explain the advantages, disadvantages, and suggestions that can take you to the next level.
Acknowledgment
First of all, I would like to thank Baba Visvnath for constant blessings throughout the development of this book. Because an understanding of a study such as this is never the outcome of efforts of a single person, it bears the imprint of a number of people who have directly or indirectly helped me complete this book, State of the Art on Grammatical Inference Using Evolutionary Method. I would be failing in my duty if I did not thank all those whose sincere advice helped me to make this book truly educative, effective, and pleasurable. I would like to acknowledge my family: Dr. Vijay Nath Pandey, Smt. Madhuri Pandey, Anjana Pandey and Ranjana Pandey, Man Mohan Pandey, Rachana Pandey, and Anant (my sweet babu).
I have immense pleasure in expressing wholehearted gratitude to my supervisors and mentors, Dr. Deepti Mehrotra (Amity University), Dr. Ankit Chaudhary (University of Missouri–St. Louis), and Prof. Abhay Bansal (Amity University). I am also very thankful to my friends and mentors, Prof. Arun Prakash Agarwal (Sharda University), Prof. Ankur Choudhary (Sharda University), Prof. Gaurav Raj (Sharda University), Prof. Neha Agarwal (Amity University), Prof. Neetu Narayan (Amity University), Prof. Anchal Garg (Amity University), Prof. Ranjeet Rout (NIT Srinagar), Shruti Gupta (Amity University), Prof. Graham Kendall (University of Nottingham), Prof. David Windridge (Middlesex University), Prof. Nik Bessis (Edge Hill University), Prof. David Fogel (Natural Selection), and Prof. Francesco Masulli (University of Genova) for supporting and guiding me during the preparation of this book. Also, I am truly thankful to my students, whose conceptual queries have always helped me dig more deeply into the subject matter.
Last but not least, I am thankful to Elsevier ERC Editorial USA for feedback, support, and guidance for writing and publishing this book.
Dr. Hari Mohan Pandey (Author)
Abbreviations
ABL Alignment-Based Learning
ADIOS Automatic Distillation of Structure
AGA Abstract Genetic Algorithm
AHCF Adaptive Hierarchical Fair Competition
ALLiS Architecture for Learning Linguistic Structure
ALPS Age-Layered Population Structure
AMR Adaptive Mutation Rate
ATIS Air Travel Information System
BBP Boolean-Based Procedure
BFWA Broadband Fixed Wireless Access
BMODS Bit-Mask Oriented Data Structure
BMOGA Bit-Mask Oriented Genetic Algorithm
BMU Best Matching Unit
BNF Backus–Naur Form
BS Base Station
CA Clustering Approach
CDC Context Distribution Clustering
CF Crowding Factor
CFBS Correlative Family-Based Selection
CFG Context-Free Grammar
CFL Context-Free Language
CG Categorical Grammar
CGA Conventional Genetic Algorithm
CHFC Continuous Hierarchical Fair Competition
CLL Computational Learning of Natural Language
CM Crossmask/Crossover Mask
CMOC Crossover-Mutation Operator Combination
CNF Conjunctive Normal Form
CTS Correlative Tournament Selection
DARO Dynamic Application Reproduction Operator
DCGA Diversity Control-Oriented Genetic Algorithm
DE Differential Evolution
DE-MC Differential Evolution Markov Chain
DFA Deterministic Finite Automata
DNF Disjunctive Normal Form
DSL Domain-Specific Language
EA Evolutionary Algorithm
EMPGA Elite Mating Pool Genetic Algorithm
EP Evolutionary Programming
ES Evolutionary Strategy
FC Frequency Crossover
FKB Fuzzy Knowledge Base
FUS Fitness Uniform Selection
GA Genetic Algorithm
GAC Genetic Algorithm with Compliments
GASOM Genetic Algorithm Using Self-organizing Maps
GAVaPS Genetic Algorithm with Varying Population Size
GAWMDL Genetic Algorithm With Minimum Description Length Principle
GAWOMDL Genetic Algorithm Without Minimum Description Length Principle
GCG Generalized Categorial Grammar
GI Grammatical Inference
GP Genetic Programming
GRIDS Grammar Induction Drive by Simplicity
GSM Gready Search Method
HFC Hierarchical Fair Competition
HM Heuristic Method
IFS Individual Fitness Score
IPA Incest Prevention Algorithm
ITBL Improved Tabular Representation Algorithm
KL Kullback–Leibler
LAgent Language Agent
LHDC Long Hamming Distance Crossover
LOC Local Optimum Convergence
MA Memetic Algorithm
MCMC Markov Chain Monte-Carlo
MCMP Multicrossover on Multiple Parents
MCMPIP Multiple Crossover on Multiple Parents with Incest Prevention
MCPC Multiple Crossovers per Couple
MDL Minimum Description Length
MM Mutmask/Mutation Mask
MOGA Matrix-Oriented Genetic Algorithm
NN Neural Network
NSGA Nondominated Sorting Genetic Algorithm
NSGA-DAR Nondominated Sorting Genetic Algorithm–Dynamic Application of Reproduction Operators
OEGA Odd–Even Genetic Algorithm
OVIS Openbaar Vervoer Informatie System
PA Pygmy Algorithm
PAC Probably Approximately Correct
PAHCF Parallel Adaptive Hierarchal Fair Competition
PDT Population Distribution Table
PRA Population Replacement Algorithm
PSO Particle Swarm Optimization
QPSO Quantum-Behaved Particle Swarm Optimization
RBCGA Real/Binary-like Coded Genetic Algorithm
RE Resultant Effectivity
RL Regular Language
RNN Recurrent Neural Network
RO Random offspring
ROGGA Random Offspring Generation Genetic Algorithm
RPF Racial Preference Factor
RTS Restricted Tournament Selection
SA Simulated Annealing
SASEGA Self-adaptive Segregative Genetic Algorithm
SASEGASA Self-adaptive Segregative Genetic Algorithm with Simulated Annealing aspects
SBGA Shifting Balance Genetic Algorithm
SBT Shifting Balance Theory
SDT Social Disaster Techniques
SEGA Segregative Genetic Algorithm
SHDC Short Hamming Distance Crossover
SHT Search History Table
SINR Signal to Interference and Noise Ratio
SM Statistical Method
SNR Signal-to-Noise Ratio
SOM Self-organizing Map
TBL Tabular Representation Algorithm
TPC Two-Point Crossover
TPCIS Two-Point Crossover with Internal Swapping
TS Terminal Station
TSP Traveling Salesman Problem
UC Uniform Crossover
1: Introduction and scientific goals
Abstract
This chapter explains the rationale and importance of studying grammatical inference. Section 1.2 highlights the application areas where grammatical inference has been used successfully. Scientific goals and different objectives are listed in Section 1.3. The purpose of this chapter is to set a road map for readers so they can develop an understanding of what this book is about.
Keywords
Evolutionary algorithms; Genetic algorithm; Genetic programming; Grammatical inference; Languages; Multilingual data; Programming languages
1.1. Introduction
Multilingual text data are increasing daily. This creates opportunities and challenges in the field of computational linguistics. This book will enrich the domains of language processing with performance improvement. This book is a resource presenting an approach to grammatical inference (GI) using the evolutionary algorithm (EA). GI deals with the standard learning procedure to acquire grammar based on evidence about language. It was extensively studied owing to its high importance in various fields of science and engineering. Typically, GI is a research domain that uses an unsupervised learning procedure to extract grammatical descriptions from a corpus or set of textual data. Several approaches have been proposed for GI that had their own merits and demerits. It was noticed that some approaches showed good results whereas others had limitations in dealing with negative samples. Hence, this book provides a basic but comprehensive overview of different GI algorithms. The book can be used as a reference study material for different domain. It focuses on formal grammar learning and applications, language acquisition, language processing, and information retrieval.
1.2. Why grammatical inference is popular
The field of GI is transversal to several research domains. There are few applications based purely on GI, but many uses of GI techniques exist; hence, there is plenty of room to find tasks in which GI techniques have performed far better than other machine learning or pattern recognition methods. GI methods have been used successfully in many domains, including robotics and control systems, structural pattern recognition, computational linguistics, automatic translation, computational biology, inductive logic programming, document management, compression, applications to time series, data mining, and music. GI methods have attracted researchers who are working in the software engineering domain. Some application areas in which GI techniques have been heavily used are (1) inference of general-purpose programming languages; (2) inference of grammar rules from programming languages; (3) inference of domain-specific languages; (4) inference of graph grammar and visual languages; and (5) software testing and test case generation. This discussion reveals that GI has a wide spectrum of applications, but there are few powerful GI algorithms. Where they exist, they deal only with the positive corpus. Some GI algorithms have been proposed to deal with both the positive and negative corpus. These GI algorithms use EAs or other optimization methods such as the greedy method, minimum description length principle, and statistical methods. Most GI algorithms are based on EAs whose genetic algorithm and genetic programming have been employed on a larger scale. There is no doubt about the exploratory power of EAs, but they suffer with key challenges known as premature convergence and slow finishing. Hence, a serious investigation is required to withstand premature convergence. This book presents a comprehensive study of approaches that were proposed to handle premature convergence.
1.3. Scientific goals: why this book?
The aims of this book are to present the state of the art on GI algorithms and discuss the most recent evolutionary algorithms proposed for GI from the positive and negative corpus.
To achieve these aims, these objectives were formed in the preparation of this book:
Objective 1: Explore the current state of the art for GI algorithms.
Objective 2: Offer a comprehensive discussion on EAs. Here, the focus will be on the genetic algorithm.
Objective 3: Explore the current state of the art on premature