Recent Advances in Learning Automata
About this ebook

This book collects recent theoretical advances and concrete applications of learning automata (LAs) in various areas of computer science, presenting a broad treatment of the computer science field in a survey style. Learning automata (LAs) have proven to be effective decision-making agents, especially within unknown stochastic environments. The book starts with a brief explanation of LAs and their baseline variations. It subsequently introduces readers to a number of recently developed, complex structures used to supplement LAs, and describes their steady-state behaviors. These complex structures have been developed because, by design, LAs are simple units used to perform simple tasks; their full potential can only be tapped when several interconnected LAs cooperate to produce a group synergy.
In turn, the next part of the book highlights a range of LA-based applications in diverse computer science domains, from wireless sensor networks, to peer-to-peer networks, to complex social networks, and finally to Petri nets. The book accompanies the reader on a comprehensive journey, starting from basic concepts, continuing to recent theoretical findings, and ending in the applications of LAs in problems from numerous research domains. As such, the book offers a valuable resource for all computer engineers, scientists, and students, especially those whose work involves the reinforcement learning and artificial intelligence domains.
Language: English
Publisher: Springer
Release date: Jan 17, 2018
ISBN: 9783319724287
    Book preview

    Recent Advances in Learning Automata - Alireza Rezvanian

    Part IModels

    © Springer International Publishing AG 2018

    Alireza Rezvanian, Ali Mohammad Saghiri, Seyed Mehdi Vahidipour, Mehdi Esnaashari and Mohammad Reza Meybodi, Recent Advances in Learning Automata, Studies in Computational Intelligence 754, https://doi.org/10.1007/978-3-319-72428-7_1

    1. Learning Automata Theory

    Alireza Rezvanian¹  , Ali Mohammad Saghiri²  , Seyed Mehdi Vahidipour³  , Mehdi Esnaashari⁴   and Mohammad Reza Meybodi⁵  

    (1)

    Department of Computer Engineering and Information Technology, Institute for Research in Fundamental Sciences (IPM), School of Computer Science, Amirkabir University of Technology (Tehran Polytechnic), Tehran, Iran

    (2)

    Department of Computer Engineering and Information Technology, Amirkabir University of Technology (Tehran Polytechnic), Tehran, Iran

    (3)

    Faculty of Electrical and Computer Engineering, Department of Computer Engineering, University of Kashan, Kashan, Iran

    (4)

    Faculty of Computer Engineering, K.N.Toosi University of Technology, Tehran, Iran

    (5)

    Soft Computing Laboratory, Amirkabir University of Technology (Tehran Polytechnic), Tehran, Iran

    Alireza Rezvanian (Corresponding author)

    Email: a.rezvanian@aut.ac.ir

    Ali Mohammad Saghiri

    Email: a_m_saghiri@aut.ac.ir

    Seyed Mehdi Vahidipour

    Email: vahidipour@kashanu.ac.ir

    Mehdi Esnaashari

    Email: esnaashari@kntu.ac.ir

    Mohammad Reza Meybodi

    Email: mmeybodi@aut.ac.ir

    1.1 Learning Automata

    Reinforcement learning, or learning with a critic, is defined by characterizing a learning problem rather than by characterizing learning methods (Sutton and Barto 1998). Any method that is well suited to solving that problem is considered a reinforcement learning method. The reinforcement learning problem is the problem of learning from interactions in order to achieve a certain goal. Given this specification, there must be an agent capable of learning, called the learner or the decision-maker. The learner must interact with a so-called environment or teacher, comprising everything outside of the learner, which provides evaluative responses to the actions performed by the learner. The learner and the environment interact continually; the learner selects actions and the environment responds to the selected actions, presenting new situations to the learner. In short, reinforcement learning is a framework for learning problems in which the environment does not indicate the correct action but provides only a scalar evaluative response to the selection of an action by the learner.

    The learning automaton (LA), as a computational intelligence technique, has been found to be a very useful tool for solving many complex and real-world problems in networks where a large amount of uncertainty exists or information about the environment is lacking (Torkestani 2012; Mousavian et al. 2013, 2014; Mahdaviani et al. 2015; Zhao et al. 2015; Khomami et al. 2016b; Mirsaleh and Meybodi 2016a). A learning automaton is a stochastic model operating in the framework of reinforcement learning (Narendra and Thathachar 1989; Thathachar and Sastry 2004). Among reinforcement learning schemes, this model can be classified in the category of temporal-difference (TD) learning methods. TD learning is a combination of Monte Carlo ideas and dynamic programming ideas. Like Monte Carlo methods, TD methods can learn directly from raw experience without a model of the environment's dynamics. Like dynamic programming, TD methods update estimates based in part on other learned estimates, without waiting for a final outcome (Sutton and Barto 1998). Sarsa (Rummery and Niranjan 1994), Q-learning (Watkins 1989), Actor-Critic methods (Barto et al. 1983) and R-learning (Schwartz 1993) are other examples of TD methods. Learning automata differ from other TD methods in two ways: the representation of the internal states and the method used to update them.

    The automata approach to learning can be considered as the determination of an optimal action from a set of actions. A learning automaton can be regarded as an abstract object that has a finite number of actions. It selects an action from its finite set of actions and applies it to an environment. The environment evaluates the applied action and sends a reinforcement signal to the learning automaton (Fig. ‎1.1). The reinforcement signal provided by the environment is used to update the internal state of the learning automaton. By continuing this process, the learning automaton gradually learns to select the optimal action, which leads to favorable responses from the environment.

    Fig. 1.1  The interaction of a learning automaton and its environment

    A learning automaton is a quintuple

    $$ \left\langle {\alpha ,\,\varPhi ,\;\beta,\,F,\;G} \right\rangle $$

    where

    $$ \alpha = \left\{ {\alpha_{1} ,\,\alpha_{2} , \ldots \alpha_{r} } \right\} $$

    is the set of actions that it must choose from,

    $$ \varPhi = (\varPhi_{1} ,\;\varPhi_{2} , \ldots ,\varPhi_{s} ) $$

    is the set of states,

    $$ \beta = \left\{ {\beta_{1} ,\beta_{2} , \ldots ,\beta_{q} } \right\} $$

    is the set of inputs, G: Φ → α is the output map, which determines the action taken by the learning automaton when it is in state Φ j , and

    $$ F{:}\,\varPhi \times \beta \to \varPhi $$

    is the transition map that defines the transition of the state of the learning automaton upon receiving an input from the environment.

    The action selected at time instant k, denoted by α(k), serves as the input to the environment, which in turn emits a stochastic response β(k) at time instant k; this is considered the response of the environment to the learning automaton. Based upon the nature of β, environments can be classified into three classes: P-, Q-, and S-models. The output of a P-model environment is binary, success or failure. Usually in P-model environments, a failure (or unfavorable response) is denoted by 1 while a success (or favorable response) is denoted by 0. In Q-model environments, β can take a finite number of values in the interval [0, 1], while in S-model environments, β lies anywhere in the interval [0, 1]. Based on the response β(k), the state of the learning automaton Φ(k) is updated and a new action is chosen at time instant (k + 1). Learning automata can be classified into two main classes: fixed and variable structure learning automata (VSLA) (Narendra and Thathachar 1989).

    A simple pseudo-code for the behavior of an r-action learning automaton within a stationary environment with $$ \beta \in \left\{ {0,1} \right\} $$ is presented in Fig. ‎1.2.

    Fig. 1.2  Pseudo-code of the behavior of a learning automaton
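
    Since the figure itself is not reproduced here, the following minimal Python sketch illustrates the loop that such pseudo-code describes: an r-action learning automaton interacting with a stationary P-model environment (β ∈ {0, 1}). The environment's reward probabilities and the use of the reward-inaction update (introduced in Sect. 1.1.2) are illustrative assumptions, not taken from the book.

    import random

    def p_model_environment(action, reward_probs):
        # Return 0 (favorable response) with the chosen action's reward probability, else 1.
        return 0 if random.random() < reward_probs[action] else 1

    def run_la(reward_probs, a=0.05, steps=5000):
        r = len(reward_probs)
        p = [1.0 / r] * r                                    # start from a uniform action probability vector
        for _ in range(steps):
            i = random.choices(range(r), weights=p)[0]       # select an action according to p
            beta = p_model_environment(i, reward_probs)      # apply it and observe the response
            if beta == 0:                                    # on reward, shift probability mass toward action i
                p = [pj + a * (1 - pj) if j == i else (1 - a) * pj for j, pj in enumerate(p)]
        return p

    print(run_la([0.2, 0.8, 0.5]))                           # mass should concentrate on the second action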

    Learning automata have several desirable features, which make them suitable for use in many applications. The main features of learning automata are stated below.

    1.

    Learning automata can be used without any a priori information about the underlying application.

    2.

    Learning automata are very useful for applications with a large amount of uncertainty.

    3.

    Unlike traditional hill-climbing algorithms, hill-climbing in learning automata is performed in an expected sense in a probability space.

    4.

    Learning automata require only very simple feedback from their environment.

    5.

    Learning automata are very useful in multi-agent and distributed systems with limited intercommunication and incomplete information.

    6.

    Learning automata are very simple in structure and can be implemented easily in software or hardware.

    7.

    The action set of learning automata can be a set of symbolic or numeric values.

    8.

    Optimization algorithms based on learning automata do not require the objective function to be an analytical function of the adjustable parameters.

    9.

    Learning automata can be analyzed by powerful mathematical methodologies. It has been shown that learning automata are optimal in single, hierarchical, and distributed structures.

    10.

    Learning automata require only a few mathematical operations at each iteration, so they can be used in real-time applications.

    11.

    Learning automata have the flexibility and analytical tractability needed for most applications.

    1.1.1 Fixed-Structure Learning Automata

    A learning automaton is called fixed-structure if the following are fixed: (1) the probability of the transition from one state to another and (2) the action probability of any action in any state. Examples of fixed-structure LAs are the L2N2, G2N2, Krylov, and Krinsky automata (Narendra and Thathachar 1989), which have been used in many applications such as adapting the parameters of the back-propagation algorithm.

    1.1.2 Variable-Structure Learning Automata (VSLA)

    A VSLA is represented by a 6-tuple 〈β, ϕ, α, P, G, T〉, where β denotes the set of inputs, ϕ is the set of internal states, α is the set of actions (outputs), P is the state probability vector governing the choice of the state at each instant k, G is the output mapping, and T is the learning algorithm (also known as the learning scheme). The learning algorithm T is a recurrence relation used to update the action probability vector. Let α i (k) ∈ α be the action chosen by the learning automaton and p(k) the probability vector defined over the set of actions at instant k. At each instant k, the action probability vector p(k) is updated by the linear learning algorithm given in Eq. (1.1) if the chosen action α i (k) is rewarded by the random environment (β = 0), and it is updated according to Eq. (1.2) if the chosen action is penalized (β = 1).

    $$ p_{j} \left( {k + 1} \right) = \left\{ {\begin{array}{*{20}l} {p_{j} \left( k \right) + a\left[ {1 - p_{j} \left( k \right)} \right]} \hfill & {j = i} \hfill \\ {\left( {1 - a} \right)p_{j} \left( k \right)} \hfill & {\forall j \ne i} \hfill \\ \end{array} } \right. $$

    (1.1)

    $$ p_{j} \left( {k + 1} \right) = \left\{ {\begin{array}{*{20}l} {\left( {1 - b} \right)p_{j} \left( k \right)} \hfill & {j = i} \hfill \\ {\left( {\frac{b}{r - 1}} \right) + \left( {1 - b} \right)p_{j} \left( k \right)} \hfill & {\forall j \ne i} \hfill \\ \end{array} } \right. $$

    (1.2)

    where a denotes the reward parameter, which determines the amount by which the action probability values increase; b is the penalty parameter, which determines the amount by which the action probability values decrease; and r is the number of actions the learning automaton can take. If a = b, the learning algorithm is called the linear reward-penalty (L RP ) algorithm; if b = εa with 0 < ε < 1, it is called the linear reward-ε-penalty (L RεP ) algorithm; and finally, if b = 0, it is called the linear reward-inaction (L RI ) algorithm, which is perhaps the earliest scheme considered in mathematical psychology.
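
    For concreteness, the two updating rules can be transcribed directly into code. The sketch below assumes a P-model response β ∈ {0, 1}, and the parameter values are illustrative; with a = b it is the L RP scheme and with b = 0 the L RI scheme.

    def update_on_reward(p, i, a):
        # Eq. (1.1): beta = 0, increase p_i and scale the other probabilities down.
        return [pj + a * (1 - pj) if j == i else (1 - a) * pj for j, pj in enumerate(p)]

    def update_on_penalty(p, i, b):
        # Eq. (1.2): beta = 1, decrease p_i and redistribute mass over the other r - 1 actions.
        r = len(p)
        return [(1 - b) * pj if j == i else b / (r - 1) + (1 - b) * pj for j, pj in enumerate(p)]

    p = [0.25, 0.25, 0.25, 0.25]          # uniform initial vector for a 4-action LA
    p = update_on_reward(p, 0, a=0.1)     # action 0 was rewarded
    p = update_on_penalty(p, 0, b=0.1)    # action 0 was penalized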

    1.1.3 Learning Automata with Variable Number of Actions

    In a variable action-set learning automaton (also called a learning automaton with a variable number of actions), the number of actions available to the learning automaton may change over time (Thathachar 1987). At each instant k, such a learning automaton selects an action based on the scaled action probability vector $$ \hat{p}\left( k \right) $$ defined as

    $$ \hat{p}_{i} \left( k \right) = \frac{{p_{i} \left( k \right)}}{K\left( k \right)} $$

    (1.3)

    where $$ K\left( k \right) $$ is the sum of the probabilities of the available actions at instant k, and p i (k) = Prob[α(k) = α i ]. The selected action is then performed on the environment. After receiving the reinforcement signal, the LA updates the probability vector of the available actions using the learning algorithm given in Eqs. (1.1) and (1.2). Finally, the probability vector of the available actions $$ p\left( {k + 1} \right) $$ is rescaled as follows

    $$ p_{i} \left( {k + 1} \right) = \hat{p}_{i} \left( {k + 1} \right).K\left( k \right) $$

    (1.4)

    The pseudo-code of the behavior of a variable action set learning automaton is shown in Fig. ‎1.3.

    Fig. 1.3  Pseudo-code of the behavior of a variable action-set learning automaton
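
    The following Python sketch shows one iteration of this behavior under stated assumptions: available is the index set of actions enabled at instant k, the environment is a caller-supplied function returning β ∈ {0, 1}, and the scaled vector is updated with the linear scheme of Eqs. (1.1) and (1.2) before being rescaled by K(k). Names and parameter values are illustrative.

    import random

    def variable_action_set_step(p, available, environment, a=0.1, b=0.1):
        K = sum(p[j] for j in available)                     # K(k): total probability of the available actions
        p_hat = {j: p[j] / K for j in available}             # Eq. (1.3): scaled action probabilities
        i = random.choices(list(p_hat), weights=list(p_hat.values()))[0]
        beta = environment(i)                                # apply the chosen action, observe the response
        r = len(available)
        for j in available:                                  # update the scaled vector via Eq. (1.1) or (1.2)
            if beta == 0:
                p_hat[j] = p_hat[j] + a * (1 - p_hat[j]) if j == i else (1 - a) * p_hat[j]
            else:
                p_hat[j] = (1 - b) * p_hat[j] if j == i else b / (r - 1) + (1 - b) * p_hat[j]
        new_p = list(p)
        for j in available:                                  # Eq. (1.4): rescale back by K(k)
            new_p[j] = p_hat[j] * K
        return i, new_p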

    1.1.4 Estimator Algorithms

    The slow convergence speed of the LA is the most important problem with this learning model. Estimator learning algorithms (Thathachar and Sastry 1985) are a class of learning schemes that improve the convergence speed of the LA. In such algorithms, additional information about the characteristics of the environment is maintained and used to increase the speed of convergence of the LA. Unlike other learning algorithms, in which the action probability vector is updated based solely on the current response received from the environment, in estimator learning algorithms an estimate of the reward strength is maintained for each action of the LA and is used along with the current response of the environment to update the action probability vector. The estimate of the reward strength of any action α i is computed as the ratio of the number of times α i has been rewarded to the number of times α i has been selected. Different estimator learning algorithms have been proposed in the literature thus far, such as stochastic (Papadimitriou 1994), absorbing stochastic (Papadimitriou et al. 2002), Discretized TS (Lanctot and Oommen 1992), relative strength (Simha and Kurose 1989), and S-model ergodic discretized (Paximadis and Paximadis 1994).
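
    The bookkeeping shared by these algorithms can be sketched as follows: alongside the action probability vector, a running estimate of each action's reward strength is kept as the ratio of rewards received to times selected. How this estimate enters the probability update differs between the algorithms cited above; the class below, an illustrative sketch, only maintains the estimates.

    class RewardEstimates:
        # Reward-strength estimates d_i = (#times action i was rewarded) / (#times action i was selected).
        def __init__(self, r):
            self.selected = [0] * r
            self.rewarded = [0] * r

        def record(self, i, beta):
            # Update the counts after action i received response beta (0 = reward, 1 = penalty).
            self.selected[i] += 1
            if beta == 0:
                self.rewarded[i] += 1

        def estimate(self, i):
            # Estimated reward strength of action i (0 if it has never been selected).
            return self.rewarded[i] / self.selected[i] if self.selected[i] else 0.0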

    1.1.5 Pursuit Algorithms

    A pursuit learning algorithm is a simplified version of the estimator learning algorithms, in which the action probability vector chases the currently optimal action. At each iteration of a pursuit learning algorithm, the action probability of the action with the maximum reward estimate is increased. With this updating method, the learning algorithm always pursues the estimated optimal action. Several different pursuit algorithms have been proposed in the literature, namely reward-penalty (Thathachar and Sastry 1986), discretized reward-penalty (Oommen and Lanctot 1990), reward-inaction (Agache and Oommen 2001), discretized reward-inaction (Agache and Oommen 2001), generalized (Agache and Oommen 2002), and discretized generalized (Agache and Oommen 2002).
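
    A generic form of the pursuit step can be sketched as follows: after the reward estimates are updated, the probability vector is moved a small step toward the unit vector of the action with the highest current estimate. The cited variants differ in when this step is taken and in whether the vector is discretized; the step size lam below is an illustrative assumption.

    def pursuit_update(p, reward_estimates, lam=0.05):
        # Move p toward e_m, where m is the action with the largest estimated reward strength.
        m = max(range(len(p)), key=lambda j: reward_estimates[j])
        return [(1 - lam) * pj + (lam if j == m else 0.0) for j, pj in enumerate(p)]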

    1.1.6 Continuous Action-Set Learning Automata

    In a continuous action-set learning automaton (CALA), the action-set is defined as a continuous interval over the real numbers. This means that each automaton chooses its actions from the real line. In such a learning automaton, the action probability of the possible actions is defined as a probability distribution function. All actions are initially selected with the same probability, that is, the probability distribution function under which the actions are initially selected has a uniform distribution. The probability distribution function is updated depending upon the responses received from the environment.

    A continuous action-set learning automaton was proposed by Santharam et al. (1994). In this CALA, at any time instant k, the probability distribution function $$ \xi $$ is a normal distribution with mean $$ \mu_{k} $$ and standard deviation $$ \sigma_{k} $$ . The action α k and the mean $$ \mu_{k} $$ are the two parameters through which the CALA interacts with its environment. The CALA applies both of them to the environment and receives two reinforcement signals, one for each, denoted β(α) and β(μ). Based on the supplied reinforcement signals, the CALA updates the probability distribution function of its actions using the following equations.

    $$ \begin{aligned} \mu_{k + 1} & = \mu_{k} + af_{1} \left[ {\mu_{k} ,\sigma {}_{k},\alpha_{k} ,\beta \left( \alpha \right),\beta \left( \mu \right)} \right] \\ \sigma_{k + 1} & = \sigma_{k} + af_{2} \left[ {\mu_{k} ,\sigma {}_{k},\alpha_{k} ,\beta \left( \alpha \right),\beta \left( \mu \right)} \right] - Ca\left[ {\sigma_{k} - \sigma_{L} } \right] \\ \end{aligned} $$

    (1.5)

    where C, σ L , and a are the parameters of the CALA and f 1 and f 2 are defined as given below.

    $$ \begin{aligned} f_{1} \left[ {\mu ,\sigma ,\alpha ,\beta \left( \alpha \right),\beta \left( \mu \right)} \right] & = \left[ {\frac{\beta \left( \alpha \right) - \beta \left( \mu \right)}{\phi \left( \sigma \right)}} \right]\left[ {\frac{\alpha - \mu }{\phi \left( \sigma \right)}} \right] \\ f_{2} \left[ {\mu ,\sigma ,\alpha ,\beta \left( \alpha \right),\beta \left( \mu \right)} \right] & = \left[ {\frac{\beta \left( \alpha \right) - \beta \left( \mu \right)}{\phi \left( \sigma \right)}} \right]\left[ {\left( {\frac{\alpha - \mu }{\phi \left( \sigma \right)}} \right)^{2} - 1} \right] \\ \phi \left( \sigma \right) & = \left( {\sigma - \sigma_{L} } \right)I\left\{ {\sigma > \sigma_{L} } \right\} + \sigma_{L} . \\ \end{aligned} $$

    (1.6)

    By using the above updating equations, the CALA converges to a normal distribution N(α *, σ L ), where α * is the optimum action. Another continuous action-set learning automaton is given by Howell et al. (1997).
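
    A sketch of one CALA iteration following Eqs. (1.5) and (1.6) is given below. The environment is assumed to return a scalar reinforcement for any real-valued action; sampling the action with spread φ(σ_k) rather than σ_k, and the parameter values, are illustrative assumptions.

    import random

    def phi(sigma, sigma_L):
        # phi(sigma) = (sigma - sigma_L) * I{sigma > sigma_L} + sigma_L, as in Eq. (1.6).
        return max(sigma - sigma_L, 0.0) + sigma_L

    def cala_step(mu, sigma, environment, a=0.05, C=5.0, sigma_L=0.01):
        s = phi(sigma, sigma_L)
        alpha = random.gauss(mu, s)                       # sample an action around the current mean
        beta_alpha = environment(alpha)                   # reinforcement for the sampled action
        beta_mu = environment(mu)                         # reinforcement for the current mean
        z = (alpha - mu) / s
        f1 = (beta_alpha - beta_mu) / s * z               # f_1 of Eq. (1.6)
        f2 = (beta_alpha - beta_mu) / s * (z * z - 1.0)   # f_2 of Eq. (1.6)
        mu_new = mu + a * f1                              # Eq. (1.5)
        sigma_new = sigma + a * f2 - C * a * (sigma - sigma_L)
        return mu_new, sigma_new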

    1.2 Interconnected Learning Automata

    The full potential of learning automata is realized when multiple automata interact with each other. It has been shown that a set of interconnected learning automata is able to describe the behavior of an ant colony capable of finding the shortest path from its nest to food sources and back (Verbeeck and Nowe 2002). In this section, we study interconnected learning automata. Based on the way the learning automata are activated to take an action, interconnected learning automata can be classified into three classes, synchronous, sequential, and asynchronous, as follows.

    Synchronous Model of Interconnected Automata: In the synchronous model of interconnected automata, at any time instant all automata are activated simultaneously, choose their actions, then apply their chosen actions to the environment, and finally update their states. Two models of synchronous interconnected learning automata have been reported in the literature: the game of automata and the synchronous cellular learning automata.

    Asynchronous Model of Interconnected Automata: In the asynchronous model of interconnected automata, at any time instant only a group of automata is activated, independently. The only model proposed for the asynchronous case is the asynchronous cellular learning automata. An asynchronous cellular learning automaton is a cellular automaton in which an automaton (or multiple automata) is assigned to every cell. The learning automata residing in a particular cell determine its state (action) on the basis of their action probability vectors. As in cellular automata, there is a rule under which the cellular learning automata operate. The rule of the cellular learning automata and the states of the neighboring cells of any particular cell determine the reinforcement signal to the learning automata residing in that cell. In cellular learning automata, the neighboring cells of any particular cell constitute its environment, because they produce the reinforcement signal to the learning automata residing in that cell. This environment is nonstationary because it varies as the action probability vectors of the cells vary, and it is called a local environment because it is local to every cell. Krishna proposed an asynchronous cellular learning automaton in which the order in which the learning automata are activated is imposed by the environment (Krishna 1993).

    Sequential Model of Interconnected Automata: In the sequential model of interconnected automata, at any time instant only a group of automata is activated, and the actions chosen by the currently activated automata determine the next automata to be activated. The hierarchical structure learning automata, network of learning automata, distributed learning automata, and extended distributed learning automata are examples of the sequential model.

    In the following subsections, we focus on sequential interconnected learning automata.

    1.2.1 Hierarchical Structure Learning Automata

    When the number of actions of a learning automaton becomes large (more than 10 actions), the time taken for the action probability vector to converge also increases. Under such circumstances, a hierarchical structure of learning automata (HSLA) can be used. A hierarchical system of automata is a tree structure of depth M in which each node corresponds to an automaton and the arcs emanating from that node correspond to the actions of that automaton. In an HSLA, an automaton with r actions is at the first level (the root of the tree), and the kth level has r^(k−1) automata, each with r actions. The root node corresponds to an automaton which will be referred to as the first-level or top-level automaton. Each action selected by this automaton activates an automaton at the second level. In this way, the structure can be extended to an arbitrary number of levels. A three-level hierarchy with three actions per automaton is shown in Fig. 1.4.

    Fig. 1.4  Hierarchical structure learning automata

    The operation of hierarchical structure learning automata can be described as follows: initially, the root automaton selects one action, say action $$ \alpha_{{i_{1} }} $$ . Then the $$ i_{1}^{th} $$ automaton at the second level is activated. The action selected by the $$ i_{1}^{th} $$ automaton at the second level (say $$ \alpha_{{i_{1} i_{2} }} $$ ) activates an automaton at the third level. This process continues until a leaf automaton is activated. The action of this automaton is applied to the environment. The response from the environment is used to update the action probability vectors of the activated automata on the path from the root to the selected leaf node. The basic idea of the learning algorithm is to increase the probability of selecting a good action and to decrease the probabilities of selecting the other actions. In an HSLA, the set of actions

    $$ \left\{ {\alpha_{{i_{1} }} ,\, \alpha_{{i_{1} i_{2} }} , \ldots , \alpha_{{i_{1} i_{2} \ldots i_{M} }} } \right\} $$

    is said to be on the optimal path if the product of their respective reward probabilities is maximum. HSLAs can be classified into three types, I, II and III (Thathachar and Sastry 1997). An HSLA is said to be of type I if the actions constituting the optimal path are also individually optimal at their respective levels. An HSLA is said to be of type II if the actions constituting the optimal path are also individually optimal at their respective automata. Any general hierarchy is said to be of type III.
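
    One cycle of this operation can be sketched as follows on a complete r-ary tree of depth M: the activated automata select a root-to-leaf path, the leaf's action is applied to the environment, and every automaton on the path is updated with the same response (here with the reward-inaction rule of Eq. (1.1)). The tree representation, the environment interface, and the parameter values are illustrative assumptions.

    import random

    def hsla_cycle(prob_vectors, environment, M, a=0.1):
        # prob_vectors maps a node id (tuple of child indices chosen from the root) to its action probabilities.
        node, path = (), []
        for _ in range(M):
            p = prob_vectors[node]
            i = random.choices(range(len(p)), weights=p)[0]   # the activated automaton selects an action
            path.append((node, i))
            node = node + (i,)                                # which activates an automaton at the next level
        beta = environment(node)                              # the leaf action is applied to the environment
        if beta == 0:                                         # reward every automaton on the root-to-leaf path
            for node_id, i in path:
                p = prob_vectors[node_id]
                prob_vectors[node_id] = [pj + a * (1 - pj) if j == i else (1 - a) * pj
                                         for j, pj in enumerate(p)]
        return node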

    1.2.2 Multi-Level Game of Learning Automata

    A multi-level game of LAs can be used effectively to solve multi-criteria optimization problems in which several objective functions must be optimized. In a multi-level game of LAs, multiple games are played at different levels (Billard 1994, 1996). The game played at each level of the multi-level game of LAs decides the game to be played at the next level. Figure 1.5 shows a two-level game of LAs with four players.

    Fig. 1.5  Multi-level game of learning automata

    1.2.3 Network of Learning Automata

    A network of LAs (Williams 1988) is a collection of LAs connected together in a hierarchical feed-forward layered structure. In this structure, the outgoing links of the LAs in the preceding layers are the inputs of the LAs in the succeeding layers. In this model, the LAs (and consequently the layers) are classified into three separate groups. The first group includes the LAs located at the first level of the network, called the input LAs (input layer). The second group is composed of the LAs located at the last level of the network, called the output LAs (output layer). The third group includes the LAs located between the first and the last layers, called the hidden LAs (hidden layer). In a network of LAs (NLA), the input LAs receive the context vectors as external inputs from the environment, and the output LAs apply the output of the network to the environment. The difference between feed-forward neural networks and an NLA is that the units of neural networks are deterministic whereas the units of an NLA are stochastic, and the learning algorithms used in the two networks are different. Since the units are stochastic, the output of a particular unit $$ i $$ is drawn from a distribution depending on its input weight vector and the outputs of the units in the preceding layers. This model operates as follows. The context vector is applied to the input LAs. Each input LA selects one of its possible actions on the basis of its action probability vector and the input signals it receives from the environment. The chosen action activates the LAs of the next level that are connected to this LA. Each activated LA selects one of its actions as stated before. The actions selected by the output LAs are applied to the random environment. The environment evaluates the output action against the desired output and generates the reinforcement signal. This reinforcement signal is then used by all LAs to update their states. The structure of such a network is shown in Fig. 1.6.

    Fig. 1.6  Network of learning automata

    1.2.4 Distributed Learning Automata (DLA)

    The hierarchical structure learning automata has a tree structure, in which there exists a unique path between the root of the tree and each of its leaves. However, in some applications, such as routing in computer networks, there may be multiple paths between the source and destination nodes. A generalization of the HSLA that handles this case is the distributed learning automata (DLA). A distributed learning automata (DLA) (Beigy and Meybodi 2006), shown in Fig. 1.7, is a network of interconnected learning automata that collectively cooperate to solve a particular problem. The number of actions of a particular LA in a DLA is equal to the number of LAs connected to it. Selection of an action by an LA in the DLA activates the LA corresponding to that action. Formally, a DLA can be defined by a quadruple 〈A, E, T, A 0〉, where A = {A 1, A 2, …, A n } is the set of learning automata, E ⊂ A × A is the set of edges, where edge (v i , v j ) corresponds to action $$ \alpha_{i}^{j} $$ of automaton A i , T is the set of learning algorithms with which the learning automata update their action probability vectors, and A 0 is the root automaton of the DLA at which the activation of the DLA starts.

    Fig. 1.7  Structure of distributed learning automata

    The operation of a DLA can be described as follows. At first, the root automaton A 0 randomly chooses one of its outgoing edges (actions) according to its action probabilities and activates the learning automaton at the other end of the selected edge. The activated automaton also randomly selects an action, which results in the activation of another automaton. The process of choosing actions and activating automata continues until a leaf automaton (an automaton that interacts with the environment) is reached. The chosen actions along the path induced by the activated automata are applied to the random environment. The environment evaluates the applied actions and emits a reinforcement signal to the DLA. The activated learning automata along the chosen path update their action probability vectors on the basis of the reinforcement signal according to their learning algorithms. Paths from the unique root automaton to the leaf automata are selected repeatedly until the probability of choosing one of the paths becomes close enough to unity. Each DLA has exactly one root automaton, which is always activated, and at least one leaf automaton, which is activated probabilistically. For example, in Fig. 1.8, every automaton has two actions. If automaton LA 1 selects $$\upalpha_{2}^{1}$$ from its action set, then it will activate automaton LA 2 . Afterward, automaton LA 2 chooses one of its possible actions, and so on. At any time, only one LA in the network is active.
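
    A minimal sketch of one such episode is given below, assuming the underlying graph is a DAG so that the walk terminates (cf. the restriction discussed next) and using the reward-inaction rule of Eq. (1.1) for all activated automata. The graph representation, the environment interface, and the parameter values are illustrative assumptions.

    import random

    def dla_episode(actions, prob, root, environment, a=0.1):
        # actions[v] lists the automata reachable from v; prob[v] is v's action probability vector.
        path, node = [], root
        while actions[node]:                                  # an automaton with no outgoing edges is a leaf
            p = prob[node]
            i = random.choices(range(len(p)), weights=p)[0]
            path.append((node, i))
            node = actions[node][i]                           # activate the automaton chosen by this action
        beta = environment([v for v, _ in path] + [node])     # the environment evaluates the induced path
        if beta == 0:                                         # reward the activated automata along the path
            for v, i in path:
                prob[v] = [pj + a * (1 - pj) if j == i else (1 - a) * pj
                           for j, pj in enumerate(prob[v])]
        return path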

    In (Sato 1999), a restricted version of the above DLA was introduced. In this model, the underlying graph in which the DLA is embedded is a finite directed acyclic graph (DAG). Sato used the LR-I learning algorithm with a decaying reward parameter. It is shown that every edge of the DAG is selected infinitely often, and thus every learning automaton is activated infinitely often. It is also shown that, when every learning automaton has a unique best action, the DLA converges to its best action with probability 1 (Sato 1999). Meybodi and Beigy introduced a DLA in which the underlying graph is not restricted to be a DAG (Beigy and Meybodi 2006); to prevent a learning automaton from appearing more than once in any path, learning automata with a changing number of actions are used. The DLA has been used for solving several stochastic graph problems (Torkestani and Meybodi 2010a, 2012a; Mollakhalili Meybodi and Meybodi 2014; Rezvanian and Meybodi 2015a, b).

    Fig. 1.8  Example of distributed learning automata

    1.2.5 Extended Distributed Learning Automata (eDLA)

    An extended distributed learning automata (eDLA) (Mollakhalili Meybodi and Meybodi 2014) is an extension of the DLA supervised by a set of rules governing the operation of the LAs. Mollakhalili-Meybodi et al. presented a framework based on the eDLA for solving stochastic graph optimization problems such as the stochastic shortest path problem and the stochastic minimum spanning tree problem. Here, we provide a brief introduction to the eDLA. In general, in an eDLA, the ability of a DLA is improved by adding communication rules; the activity level of each LA changes depending on the problem the eDLA is designed to solve. Like a DLA, an eDLA can be modeled by a directed graph in which the node set constitutes the set of LAs and the number of actions of each LA equals the number of LAs connected to it. In an eDLA, at any time, each LA is in exactly one activity level, and only an LA at the highest activity level can perform an action on the random environment according to its action probabilities. Formally, an eDLA can be described by a 7-tuple 〈A, E, S, P, S ⁰ , F, C〉, where A is the set of LAs, E ⊆ A × A is the edge set of the communication graph G = 〈V, E〉, and

    $$ S = \left\{ {s_{1} ,s_{2} , \ldots ,s_{n} } \right\} $$

    is the set of activity levels of the LAs in the eDLA; in particular, s i indicates the activity level of learning automaton A i , where

    $$ s_{i} \in \left\{ {Pa, \, Ac, \, Fi, \, Of} \right\} $$

    takes one of the following activity levels: Passive (the initial level of each LA, which can be changed to Active), Active (the level of the set of available LAs, which can be upgraded to Fire), Fire (the highest level of activity, at which the LA performs an action, after which its level changes to Off), and Off (the lowest level of activity, at which the LA is disabled and its level stays unchanged), represented briefly by Pa, Ac, Fi, and Of, respectively. As mentioned, at any time only one LA in the eDLA can be at the Fi activity level; it is determined by the fire function C, which randomly selects an LA from the set of LAs with activity level Ac. The governing rule P is the finite set of rules that governs the activity levels of the LAs; P is defined according to the current activity level of each LA and its adjacent LAs, or depending on the particular problem for which the eDLA is designed.

    $$ S^{0} = \left( {s_{1}^{0} ,s_{2}^{0} , \ldots ,s_{n}^{0} } \right) $$

    and

    $$ F = \left\{ {S^{F} |S^{F} = \left( {s_{1}^{F} ,s_{2}^{F} , \ldots ,s_{n}^{F} } \right)} \right\} $$

    are the initial state and the final condition of the eDLA, respectively.

    The operation of an eDLA can be described as follows. Starting from the initial state S ⁰ , a starting LA is randomly selected by the fire function C to fire; it selects one of its outgoing edges (actions) according to its action probabilities and performs it on the random environment, and at the same time the activity levels of the fired LA and of its neighboring LAs are changed to Of and Ac, respectively. Changing the activity levels of the LAs transfers the state of the eDLA from state S k to state S k+1 at instant k according to the governing rule P. The fire function C then fires one LA from the set of LAs with activity level Ac, which selects an action, after which its activity level and those of its neighboring LAs are changed. This process of firing one LA by the fire function, performing an action by the fired LA, and changing the activity levels of the fired LA and its neighbors by the governing rule P continues until the final condition F of the eDLA is reached. F can be defined as a set of criteria in terms of the activity levels of the LAs such that, if one of them is satisfied, the final condition of the eDLA is realized. The environment evaluates the actions performed by the fired LAs and generates a reinforcement signal to the eDLA. The action probabilities of the fired LAs along the visited nodes, or of the LAs of the nodes that are part of a solution to the graph problem, are then updated on the basis of the reinforcement signal according to the learning algorithm. Firing LAs of the eDLA, starting from a randomly chosen LA, is repeated a predefined number of times until the solution of the problem for which the eDLA is designed is obtained.

    1.3 Cellular Learning Automata

    Cellular learning automaton (CLA) (Meybodi et al. 2003) is a combination of cellular automaton (CA) (Packard and Wolfram 1985) and learning automaton (LA) (Narendra and Thathachar 1989). The basic idea of the CLA is to use LAs to adjust the state transition probabilities of a stochastic CA. This model, which opens a new learning paradigm, is superior to a CA because of its ability to learn, and is also superior to a single LA because it consists of a collection of LAs interacting with each other (Beigy and Meybodi 2004). A CLA is a CA in which one or more LAs are assigned to every cell. Each LA residing in a particular cell determines its action (state) on the basis of its action probability vector. As in a CA, there is a local rule under which the CLA operates. The local rule of the CLA and the actions selected by the neighboring LAs of any particular LA determine the reinforcement signal to that LA. The neighboring LAs (cells) of any particular LA (cell) constitute the local environment of that LA (cell). The local environment of an LA (cell) is non-stationary because the action probability vectors of the neighboring LAs vary during the evolution of the CLA. The operation of a CLA can be described by the following steps (Fig. 1.9). In the first step, the internal state of every cell is determined on the basis of the action probability vectors of the LAs residing in that cell. In the second step, the local rule of the CLA determines the reinforcement signal to each LA residing in each cell. Finally, each LA updates its action probability vector based on the supplied reinforcement signal and the chosen action. This process continues until the desired result is obtained.

    Fig. 1.9  Operation of the cellular learning automaton
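
    As an illustration of these three steps, the sketch below runs one synchronous iteration of a CLA on a ring of N cells, with one two-action LA per cell. The local rule used here, rewarding a cell whose action agrees with the majority of its three-cell neighborhood (itself and its two neighbors), is only an example; in practice the rule is chosen per application, as are the parameter values.

    import random

    def cla_iteration(probs, a=0.05):
        # probs[c] is the probability that the LA of cell c selects action 1.
        N = len(probs)
        actions = [random.choices([0, 1], weights=[1 - p, p])[0] for p in probs]   # step 1: every LA selects an action
        new_probs = []
        for c in range(N):
            neigh = [actions[(c - 1) % N], actions[c], actions[(c + 1) % N]]
            beta = 0 if neigh.count(actions[c]) >= 2 else 1                        # step 2: local rule gives the signal
            p = probs[c]
            if beta == 0:                                                          # step 3: reward-inaction update
                p = p + a * (1 - p) if actions[c] == 1 else (1 - a) * p
            new_probs.append(p)
        return new_probs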

    A CLA can be either synchronous or asynchronous. In a synchronous CLA (Meybodi et al. 2003), the LAs in all cells are activated at the same time, synchronously, using a global clock, whereas in an asynchronous CLA (ACLA) (Beigy and Meybodi 2008), the LAs in different cells are activated asynchronously. The LAs may be activated in either a time-driven or a step-driven manner. In a time-driven ACLA, each cell is assumed to have an internal clock which wakes up the LAs associated with that cell. In a step-driven ACLA, a cell is selected for activation in either a random or a fixed sequence. From another point of view, a CLA can be either closed or open. In a closed CLA, the action selection of any particular LA in the next iteration of its evolution depends only on the state of the local environment of that LA (the actions of its neighboring LAs), whereas in an open CLA (Beigy and Meybodi 2007), it depends not only on the local environment but also on external environments. In (Beigy and Meybodi 2010), a new type of CLA, called CLA with multiple LAs in each cell, has been introduced. This model is suitable for applications such as channel assignment in cellular networks, in which each cell needs to be equipped with multiple LAs. In (Beigy and Meybodi 2004), a mathematical framework for studying the behavior of the CLA has been introduced. It was shown that, for a class of local rules called commutative rules, different models of CLA converge to a globally stable state (Beigy and Meybodi 2004, 2007, 2008, 2010).

    1.3.1 CLA-EC: CLA-Based Evolutionary Computing

    The CLA-EC model, proposed in (Rastegar et al. 2004), is a combination of the CLA and the evolutionary computing model (Fig. 1.10). In this model, each cell of the CLA is equipped with an m-bit binary genome. Each genome has two components: a model genome and a string genome. The model genome is composed of m learning automata, each with two actions, 0 and 1. The actions selected by the LAs of a particular cell are concatenated to form the second component of the genome, i.e., the string genome. The operation of a CLA-EC in any particular cell c i takes place as follows. Each LA residing within the cell c i randomly selects one of its actions according to its action probability vector. The actions selected by the LAs of the cell c i are concatenated to form a new string genome for that cell. The fitness of this newly generated genome is then evaluated. If the newly generated genome is better than the current genome of the cell, then the current genome of the cell c i is replaced by the newly generated genome. Next, a number of the neighboring cells of the cell c i are selected for mating according to the fitness of their corresponding genomes. Note that mating in this context is not reciprocal, i.e., a cell selects another cell for mating but not necessarily vice versa.

    Fig. 1.10  The structure of the CLA-EC model

    The results of the mating process in the cell c i are a number of reinforcement signals, one for each LA of the cell. The process of computing the reinforcement signal for each LA is described in Fig. ‎1.11. Each LA updates its action probability vector based on the supplied reinforcement signal and its selected action. This process continues until a termination condition is satisfied.

    Fig. 1.11  The process of computing the reinforcement signal for each LA
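
    Since the figure is not reproduced here, the sketch below covers only the part of the per-cell step that is described in the text: the cell's m two-action LAs each select a bit, the bits are concatenated into a candidate string genome, and the cell keeps the better of the current and candidate genomes under a caller-supplied fitness function. The computation of the reinforcement signals from mating is left to a caller-supplied function, and all names and parameters are illustrative assumptions.

    import random

    def cla_ec_cell_step(bit_probs, current_genome, fitness, reinforce, a=0.05):
        # bit_probs[j] is the probability that the j-th LA of the cell selects bit 1.
        candidate = [1 if random.random() < p else 0 for p in bit_probs]   # LAs select the new string genome
        if fitness(candidate) > fitness(current_genome):                   # keep the better genome
            current_genome = candidate
        betas = reinforce(candidate)                                       # one reinforcement signal per LA (cf. Fig. 1.11)
        for j, (p, beta) in enumerate(zip(bit_probs, betas)):              # reward-inaction update per bit position
            if beta == 0:
                bit_probs[j] = p + a * (1 - p) if candidate[j] == 1 else (1 - a) * p
        return bit_probs, current_genome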

    1.4 Applications of Learning Automata

    In recent years, learning automata have been used as optimization tools in complex and dynamic environments where a large amount of uncertainty exists or information about the environment is lacking. They have been successfully applied to a wide variety of applications such as optimization (Rezvanian and Meybodi 2010a, b; Hasanzadeh et al. 2013; Moradabadi and Beigy 2014; Mahdaviani et al. 2015; Mirsaleh and Meybodi 2015; Moradabadi et al. 2016; Vafashoar and Meybodi 2016), image processing (Mofrad et al. 2015; Damerchilu et al. 2016), graph problems (Soleimani-Pouri et al. 2012; Mousavian et al. 2013, 2014; Khomami et al. 2016a; Mirsaleh and Meybodi 2016b; Vahidipour et al. 2017a), data clustering (Ahangaran et al. 2017; Hasanzadeh-Mofrad and Rezvanian 2017), community detection (Amiri et al. 2013; Khomami et al. 2016b, 2017b; Mirsaleh and Meybodi 2016a; Elyasi et al. 2016), link prediction (Moradabadi and Meybodi 2016, 2017), grid computing (Hasanzadeh and Meybodi 2014, 2015; Mofrad et al. 2016), stochastic social networks (Rezvanian and Meybodi 2015a, b, 2016a, c, 2017a), network sampling (Rezvanian et al. 2014; Jalali et al. 2015, 2016; Rezvanian and Meybodi 2015c, 2016b; Ghavipour and Meybodi 2017a), information diffusion (Daliri Khomami et al. 2014; Khomami et al. 2017a, c), recommender systems (Ghavipour and Meybodi 2016), wireless sensor networks (Safavi et al. 2014; Nicopolitidis 2015), WiMAX networks (Misra et al. 2015), network security (Krishna et al. 2014), wireless mesh networks (Kumar and Lee 2015), mobile video surveillance (Kumar et al. 2015a), vehicular environments (Kumar et al. 2014, 2015b), peer-to-peer networks (Saghiri and Meybodi 2016a, 2017a), and cloud computing (Morshedlou and Meybodi 2014, 2017).

    © Springer International Publishing AG 2018

    Alireza Rezvanian, Ali Mohammad Saghiri, Seyed Mehdi Vahidipour, Mehdi Esnaashari and Mohammad Reza Meybodi, Recent Advances in Learning Automata, Studies in Computational Intelligence 754, https://doi.org/10.1007/978-3-319-72428-7_2

    2. Cellular Learning Automata

    Alireza Rezvanian¹  , Ali Mohammad Saghiri²  , Seyed Mehdi Vahidipour³  , Mehdi Esnaashari⁴   and Mohammad Reza Meybodi⁵  

    (1)

    Department of Computer Engineering and Information Technology, Institute for Research in Fundamental Sciences (IPM), School of Computer Science, Amirkabir University of Technology (Tehran Polytechnic), Tehran, Iran

    (2)

    Department of Computer Engineering and Information Technology, Amirkabir University of Technology (Tehran Polytechnic), Tehran, Iran

    (3)

    Faculty of Electrical and Computer Engineering, Department of Computer Engineering, University of Kashan, Kashan, Iran

    (4)

    Faculty of Computer Engineering, K.N.Toosi University of Technology, Tehran, Iran

    (5)

    Soft Computing Laboratory, Amirkabir University of Technology (Tehran Polytechnic), Tehran, Iran

    Alireza Rezvanian (Corresponding author)

    Email: a.rezvanian@aut.ac.ir

    Ali Mohammad Saghiri

    Email: a_m_saghiri@aut.ac.ir

    Seyed Mehdi Vahidipour

    Email: vahidipour@kashanu.ac.ir

    Mehdi Esnaashari

    Email: esnaashari@kntu.ac.ir

    Mohammad Reza Meybodi

    Email: mmeybodi@aut.ac.ir

    2.1 Introduction

    In recent years, cellular learning automata (CLAs), which are hybrid models based on cellular automata (CAs) and learning automata (LAs), have received much attention (Thathachar and Sastry 2004; Beigy and Meybodi 2010; Esnaashari and Meybodi 2015). CLAs are distributed learning models which bring together the computational power of CAs and the learning capability of LAs in unknown environments. These models are superior to CAs because of their ability to learn, and are also superior to a single LA because they consist of a collection of LAs interacting with each other. CLAs have been used in a wide range of applications such as computer networks (Esnaashari and Meybodi 2008, 2010a, 2011, 2013; Beigy and Meybodi 2010; Saghiri and Meybodi 2016a, b, 2017a), social networks (Zhao et al. 2015; Ghavipour and Meybodi 2017b; Khomami et al. 2017c; Rezvanian and Meybodi 2017b), evolutionary computing (Rastegar and Meybodi 2004; Rastegar et al. 2006), and optimization (Mozafari et al. 2015). CLAs can be classified from different perspectives, some of which are described below.

    Closed versus Open CLAs: In closed CLAs, states of neighboring cells of each cell, called the local environment, affect the action selection process of the LA of that cell, whereas in open CLAs, the local environment of each cell, a global environment, and an exclusive environment affect the action selection process of the LA of that cell. In an open CLA, each cell has its own exclusive environment and one global environment is defined for the whole CLA (Beigy and Meybodi 2007; Saghiri and Meybodi 2017b).

    Regular versus Irregular CLAs: Depending on their structures, CLAs can also be classified as regular (Beigy and Meybodi 2004) or irregular (Esnaashari and Meybodi 2015). In irregular CLAs, the structure regularity assumption is relaxed.

    Synchronous versus Asynchronous CLAs: In synchronous CLAs, all LAs in different cells are activated synchronously, whereas in asynchronous CLAs, LAs in different cells are activated asynchronously (Beigy and Meybodi 2007, 2008).

    Static versus Dynamic CLAs: In a static CLA, the structure of the cells remains fixed during the evolution of the CLA (Beigy and Meybodi 2007, 2008, 2010; Esnaashari and Meybodi 2015; Mozafari et al. 2015; Zhao et al. 2015). Due to the dynamic nature of many real world problems, their structural properties are time varying and for this reason, using fixed graphs for modeling them is too restrictive. In a dynamic CLA, one aspect of the model such as structure, local rule, or neighborhood radius may change over time (Esnaashari and Meybodi 2011, 2013; Saghiri and Meybodi 2016a). Dynamic CLAs can be used to solve problems that can be modeled as dynamic graphs. Dynamic CLAs have higher capability than static CLAs in adapting themselves to the environment under which they operate. These models are used to solve complicated learning problems, which arise in many real world situations.

    CLAs with a fixed number of LAs in each cell versus CLAs with a varying number of LAs in each cell: In CLAs with a varying number of LAs in each cell, the number of LAs in each cell changes over time (Saghiri and Meybodi 2017b).

    CLAs with one LA in each cell versus CLAs with multiple LAs in each cell: In CLAs with multiple LAs in each cell, each cell is equipped with multiple LAs (Beigy and Meybodi 2010).

    CLAs with fixed-structure LAs versus CLAs with variable-structure LAs: LAs can be classified into two main families, fixed and variable structure (Narendra and Thathachar 1989; Thathachar and Sastry 2004). In CLAs with fixed-structure LAs, the constituent LAs are of the fixed-structure type, whereas in CLAs with variable-structure LAs, they are of the variable-structure type.

    In the rest of this section, we focus on irregular and dynamic models of CLAs.

    2.2 Irregular Cellular Learning Automata

    The irregular cellular learning automaton (ICLA) (Fig. 2.1) (Esnaashari and Meybodi 2015) is a generalization of the CLA in which the restriction of a regular structure is removed. An ICLA is defined as an undirected graph in which each vertex represents a cell and is equipped with an LA, and each edge induces an adjacency relation between two cells (two LAs). The LA residing in a particular cell determines its state (action) according to its action probability vector. Like the CLA, there is a rule under which the ICLA operates. The local rule of the ICLA and the actions selected by the neighboring LAs of any particular LA determine the reinforcement signal to that LA. The neighboring LAs of any particular LA constitute the local environment of that LA. The local environment of an LA is non-stationary because the action probability vectors of the neighboring LAs vary during the evolution of the ICLA.

    Fig. 2.1  Irregular cellular learning automaton

    The operation of the ICLA is similar to the operation of the CLA. At the first step, the internal state of each cell is specified on the basis of the action probability vector of the LA residing in that cell. In the second step, the rule of the ICLA determines the reinforcement signal to the LA residing in each cell. Finally, each LA updates its action probability vector on the basis of the supplied reinforcement signal and the internal state of the cell. This process continues until the desired result is obtained. Formally, an ICLA is defined as given below.

    Definition 2.1

    Irregular cellular learning automaton is a structure

    $$ {\mathsf{A}} = (G\left\langle {E,\;V} \right\rangle ,\;\Phi ,\;A,\;{\mathsf{F}}) $$

    , where:

    G is an undirected graph, with V as the set of vertices (cells) and E as the set of edges (adjacency relations).

    $$ \Phi $$ is a finite set of states and $$ \Phi _{i} $$ represents the state of the cell ci.

    A is the set of LAs each of which is assigned to one cell of the ICLA.

    $$ \mathsf{F} {:}\; \underline{\varphi }_{i} \to \underline{\beta } $$

    is the local rule of the ICLA in each cell ci, where

    $$ \underline{\varphi }_{i} = \left\{ {\varphi_{j} \left| {\left\{ {i,j} \right\} \in E} \right.} \right\} \cup \left\{ {\varphi_{i} } \right\} $$

    is the set of states of all neighbors of ci and $$ \underline{\beta } $$ is the set of values that the reinforcement signal can take. The local rule gives the reinforcement signal to each LA based on the current actions selected by the neighboring LAs of that LA.

    Note that in the definition of the ICLA, no explicit definition is given for the neighborhood of each cell. It is implicitly defined in the definition of the graph G.

    In what follows, we consider an ICLA with N cells. The learning automaton LA i , which has a finite action set $$ \underline{\alpha }_{i} $$ , is associated with cell c i (for i = 1, 2, …, N) of the ICLA. Let the cardinality of $$ \underline{\alpha }_{i} $$ be m i .

    The operation of the ICLA proceeds in iterations as follows. At iteration k, each learning automaton selects an action. Let $$ \alpha_{i} \in \underline{\alpha }_{i} $$ be the action selected by LA i . Then all learning automata receive a reinforcement signal. Let $$ \beta_{i} \in \underline{\beta } $$ be the reinforcement signal received by LA i . This reinforcement signal is produced by the application of the local rule

    $$ \mathsf{F}^{i} \left( {\underline{\varphi }_{i} } \right) \to \underline{\beta } $$

    . Higher values of β i mean that the selected action of LA i will receive higher penalties. Then, each LA i updates its action probability vector on the basis of the supplied reinforcement signal and its selected action $$ \alpha_{i} $$ .

    Like CLA, ICLA can be either synchronous or asynchronous and an asynchronous ICLA can be either time-driven or step-driven.

    2.2.1 Definitions and Notations

    In this section, we give some definitions, which will be used later for the analysis of the ICLA.

    Definition 2.2

    A configuration of the ICLA at step k is denoted by

    $$ \underline{p} \left( k \right) = \left( {\underline{p}_{1} ,\,\underline{p}_{2} , \,\ldots ,\,\underline{p}_{N} } \right)^{T} $$

    , where $$ \underline{p}_{i} $$ is the action probability vector of the learning automaton LA i and T denotes the transpose operator.

    Definition 2.3

    A configuration $$ \underline{p} $$ is called deterministic if the action probability vector of each learning automaton is a unit vector, otherwise it is called probabilistic. Hence, the set of all deterministic configurations, $$ \mathsf{K}^{ *} $$ , and the set of probabilistic configurations, $$ \mathsf{K} $$ , in ICLA are

    $$ \mathsf{K}^{ *} = \left\{ {\underline{p} \left| {\begin{array}{*{20}r} \hfill {\underline{p} = \left( {\underline{p}_{1} ,\,\underline{p}_{2} ,\, \ldots ,\,\underline{p}_{N} } \right)^{T} ,\;\;\underline{p}_{i} = \left( {p_{i1} ,\, \ldots ,\,p_{{im_{i} }} } \right)^{T} ,\quad \quad \quad \quad } \\ \hfill {p_{iy} = 0\,or\,1\,\forall y,\;i,\;\sum\limits_{y} {p_{iy} } = 1\,\forall i} \\ \end{array} } \right.} \right\}, $$

    (2.1)

    and

    $$ \mathsf{K} = \left\{ {\underline{p} \left| {\begin{array}{*{20}r} \hfill {\underline{p} = \left( {\underline{p}_{1} ,\,\underline{p}_{2} ,\, \ldots ,\,\underline{p}_{N} } \right)^{T} ,\;\underline{p}_{i} = \left( {p_{i1} ,\, \ldots ,\,p_{{im_{i} }} } \right)^{T} ,\quad \quad \quad \quad } \\ \hfill {0 \le p_{iy} \le 1\,\forall y,\,i,\,\sum\limits_{y} {p_{iy} } = 1\,\forall i} \\ \end{array} } \right.} \right\}, $$

    (2.2)

    respectively.

    Lemma 2.1

    $$ \mathsf{K} $$ is the convex hull of $$ \mathsf{K}^{ *} $$ .

    Proof

    Proof of this lemma is given in (Beigy and Meybodi 2004). ■

    The application of the local rule to every cell transforms a configuration into a new one.

    Definition 2.4

    The global behavior of an ICLA is a mapping G: K → K that describes the dynamics of the ICLA. The evolution of the ICLA from a given initial configuration $$ \underline{p} \left( 0 \right) \in \mathsf{K} $$ is a sequence of configurations $$ \left\{ {\underline{p} \left( k \right)} \right\}_{k \ge 0} $$ , such that

    $$ \underline{p} \left( {k + 1} \right) = \mathsf{G} \left( {\underline{p} \left( k \right)} \right) $$

    .

    Definition 2.5

    Neighborhood set of any particular LA i , denoted by N(i), is defined as the set of all learning automata residing in the adjacent cells of the cell c i , that is,

    $$ N\left( i \right) = \left\{ {LA_{j} \left| {\left\{ {i,j} \right\} \in E} \right.} \right\}. $$

    (2.3)

    Let $$ \mathsf{N}_{i} $$ be the cardinality of N(i).

    Definition 2.6

    The average penalty for action r of learning automaton LA i for configuration $$ \underline{p} \in \mathsf{K} $$ is defined as

    $$ d_{ir} \left( {\underline{p} } \right) = {\rm E}\left[ {\beta_{i} \left| {\underline{p} } \right.,\alpha_{i} = r} \right] = \sum\limits_{{y_{{j_{1} }} , \ldots ,\,y_{{j_{{\mathsf{N}_{i} }} }} }} {\mathsf{F}^{i} \left( {y_{{j_{1} }} , \ldots ,y_{{j_{{\mathsf{N}_{i} }} }} ,r} \right)\prod\limits_{{LA_{l} \in N\left( i \right)}} {p_{{ly_{{j_{l} }} }} } } , $$

    (2.4)

    and the average penalty for the learning automaton LA i is defined as

    $$ \begin{aligned} D_{i} \left( {\underline{p} } \right) & = {\rm E}\left[ {\beta_{i} \left| {\underline{p} } \right.} \right] \\ & = \sum\limits_{y} {d_{iy} \left( {\underline{p} } \right)p_{iy} } . \\ \end{aligned} $$

    (2.5)

    The above definition implies that if the learning automaton LA j is not a neighboring learning automaton for LA i , then
