Discover millions of ebooks, audiobooks, and so much more with a free trial

Only $11.99/month after trial. Cancel anytime.

Formal Languages, Automata and Numeration Systems 2: Applications to Recognizability and Decidability
Formal Languages, Automata and Numeration Systems 2: Applications to Recognizability and Decidability
Formal Languages, Automata and Numeration Systems 2: Applications to Recognizability and Decidability
Ebook309 pages3 hours

Formal Languages, Automata and Numeration Systems 2: Applications to Recognizability and Decidability

Rating: 0 out of 5 stars

()

Read preview

About this ebook

The interplay between words, computability, algebra and arithmetic has now proved its relevance and fruitfulness. Indeed, the cross-fertilization between formal logic and finite automata (such as that initiated by J.R. Büchi) or between combinatorics on words and number theory has paved the way to recent dramatic developments, for example, the transcendence results for the real numbers having a “simple” binary expansion, by B. Adamczewski and Y. Bugeaud.

This book is at the heart of this interplay through a unified exposition. Objects are considered with a perspective that comes both from theoretical computer science and mathematics. Theoretical computer science offers here topics such as decision problems and recognizability issues, whereas mathematics offers concepts such as discrete dynamical systems.

The main goal is to give a quick access, for students and researchers in mathematics or computer science, to actual research topics at the intersection between automata and formal language theory, number theory and combinatorics on words.

The second of two volumes on this subject, this book covers regular languages, numeration systems, formal methods applied to decidability issues about infinite words and sets of numbers.

LanguageEnglish
PublisherWiley
Release dateSep 10, 2014
ISBN9781119042860
Formal Languages, Automata and Numeration Systems 2: Applications to Recognizability and Decidability

Related to Formal Languages, Automata and Numeration Systems 2

Related ebooks

Technology & Engineering For You

View More

Related articles

Reviews for Formal Languages, Automata and Numeration Systems 2

Rating: 0 out of 5 stars
0 ratings

0 ratings0 reviews

What did you think?

Tap to rate

Review must be at least 10 words

    Book preview

    Formal Languages, Automata and Numeration Systems 2 - Michel Rigo

    2014

    Introduction

    This book, comprised of two volumes, is a somewhat extended version of lectures basically dedicated to combinatorics on words and numeration systems that I am giving at the University of Liège. The course is usually (but not necessarily) followed by students interested in discrete mathematics or theoretical computer science. The chosen level of abstraction should allow undergraduate students to study the exposed topics.

    I.1. What this book is or is not about

    In the long process of writing this book, I have expanded my initial notes with many examples and many extra concepts to acquire a consistent overview of the field. Nevertheless, this book is not intended to serve as an encyclopedic reference.

    I have picked some of my favorite topics in the area and I have also decided to shorten the presentation of some items (not because there are less interesting but choices had to be made to keep this book reasonably short). Indeed, the book most probably reflects what I myself prefer: I am always more interested in the combinatorics and the underlying discrete structures arising from a problem.

    When preparing this book, I chose to present a fairly large variety of basic notions and important tools and results. Sometimes, I only give an overview of a subject and proofs are, therefore, omitted. For the reader wanting to study a specific topic further, many pointers to the relevant bibliography are given and each chapter ends with notes and comments. Indeed, the main goal of this book is to give quick access to actual research topics at the intersection between automata and formal language theory, number theory and combinatorics on words.

    I.2. A few words about what you will find

    The notion of a word, i.e. a (finite or infinite) sequence of symbols belonging to a finite set, is central throughout this book. It has connections with many branches of mathematics and computer science: number theory, combinatorics, formal language theory, mathematical logic, symbolic dynamics, coding theory, computational complexity, discrete geometry, stringology, etc.

    Combinatorics on words. We can be interested in the combinatorial properties of finite or infinite sequences of symbols over a finite alphabet: what the possible arrangements are, how many such configurations can be achieved and so on. As a trivial example, over a binary alphabet any word of a length of at least 4 contains a repeated factor of the kind uu (try to prove it). Therefore, we can look at patterns that are unavoidable in sufficiently long sequences or count the number of patterns or configurations that may appear in a particular context. These are some of the general questions that will be considered in Volume 1, [RIG 14]. In particular, we will concentrate on infinite words that can be obtained by a simple procedure consisting of the iteration of a morphism over a free monoid. We will mostly deal with a large class of self-similar words: the so-called morphic words and, in particular, and with automatic sequences that are generated by a constant-length morphism.

    Formal language theory. A language is merely a set of words. In this book, we will mostly encounter languages of finite words. One exception is a short incursion into symbolic dynamical systems with the language of the β-expansions of the real numbers in the interval [0, 1). Chomsky’s hierarchy introduced in the theory of formal languages provides a classification depending on the machine needed to recognize an infinite language of finite words. From a computational perspective, the simplest languages are the regular languages. They are accepted (or recognized) by finite automata, and described by regular expressions. Chapter 1, is a short chapter presenting the main properties of these languages. We will constantly see connections existing between regular languages, automatic sequences and numeration systems. For instance, quite often we associate a finite automaton with a morphism.

    Number theory. A finite word can also be used to represent an integer in a given numeration system (e.g. integer base expansions and many other non-standard systems are discussed in depth in several chapters of this book). To quote A. Fraenkel: There are many ways of representing an integer uniquely! [FRA 85]. Similarly, an infinite word can represent a real number or the characteristic sequence of a set of integers. With that respect, a natural question is to study links existing between arithmetical properties of numbers (or sets of numbers) and syntactical properties of their expansions. Chapter 2, is dedicated to numeration systems with a particular emphasis on words representing numbers. Indeed, the chosen numeration system has a strong influence on the syntactical properties of the corresponding representations. A cornerstone is the notion of a recognizable set of numbers whose elements, when represented within a given numeration system, are recognized by a finite automaton.

    Formal methods applied to infinite words and sets of numbers. In Chapter 3 of this Volume, I describe a recent trend in combinatorics on words. Due to automata theory and Büchi’s theorem, we will see how formal methods enter the frame regarding decision problems, or automatic theorem-proving, relevant in combinatorics on words. If a property about some infinite words can be described by a well-written logical formula, then this property can be tested automatically. Such a procedure holds for a large class of infinite words generated by iterated morphisms (for automatic sequences and those stemming from Pisot numeration systems as presented in this book). The expressiveness of Presburger arithmetic (with an extra predicate) provides an interesting alternative to dealing with a sufficiently large class of problems about infinite morphic words. We can imagine automated certificates for several families of combinatorial properties. But the price to pay is that we would have to deal with fairly large automata. It is a field of research where combinatorists and computer scientists can work together fruitfully: on the one hand, it is well-known that, in the worst-case, the obtained decision procedures can be super-exponential, but on the other hand, the considered problems about words seem to be of relatively small complexity.

    I.3. How to read this book

    The goal is that, after reading this book (or at least parts of this book), the reader should be able to fruitfully attend a conference or a seminar in the field. I hope that the many examples presented along the text will help the reader to get some feeling about the presented topics even though we are not going too far in the technical aspects of the proofs. Also, prerequisites are minimal. We will not explore topics requiring measure theory or advanced linear algebra (we have avoided results related to Jordan normal form of matrices) or non-elementary number theory. Two sections are devoted to results in algebraic number theory and formal series. Sections 1.1.2 and 1.2.2 of Volume 1 serve as references that the reader may consult when needed. Sections 3.1 and 3.2 give a self-contained presentation of the concepts of mathematical logic needed in this book. Those rigorous and technical sections should not discourage the reader to pursue his/her study. Most of the material can be accessed without much background.

    My initial aim was to quickly get to the point but it seems that the stories I wanted to tell were indeed quite longer than I initially thought. I have to confess that writing this book was a quite unexpected adventure (I was perpetually trying to meet the deadlines and also dealing with my other duties at the University and at home).

    There are several paths that the reader can follow through this book. Some are quite long, some are shorter.

    – For a basic introduction, I propose reading parts of Chapter 1, Volume 1 (skipping the reference sections), Chapter 2, again Volume 1, up to and including section 2.4. If the reader already has some knowledge about automata, then we can conclude with Chapter 3 of this volume, concentrating on results about integer base systems.

    – For a one-semester course in combinatorics on words, I propose a reading of Volume 1, not sacrificing the rigorous presentation of section 1.2.1, Volume 1.

    – For a numeration system oriented reading, again organized over one semester: browse through the first chapter (with a careful reading of the examples related to numeration systems), then go to section 2.3, of Volume 1, and conclude with the last two chapters of this volume.

    – For a course oriented toward interaction between automata, logic and numeration systems, we can focus on Chapters 1 and 3 of this volume.

    About other sources treating similar subjects, an excellent companion for this book is definitely Automatic Sequences: Theory, Applications, Generalizations [ALL 03a] written by Allouche and Shallit. I do hope that the two books can be read independently and can benefit from each other. There is also a non-zero intersection with several chapters of the Lothaire’s book Algebraic Combinatorics on Words (namely those about Sturmian words written by Berstel and Séébold and the one on numeration systems written by Frougny) [LOT 02]. Some chapters of the volume Combinatorics, Automata and Number Theory [BER 10] as well as [PYT 02] can also serve as a follow up for the present book. In particular, Cassaigne and Nicolas’s chapter on factor complexity is a natural continuation for our Chapter 2, Volume 1. I should finish by mentioning two papers that were very influential in my work: [BRU 94] and [BRU 95]. With this book, I hope that the reader would learn as much material as found in these two papers.

    Tags of bibliographic entries are based on the first three letters of the last name of the first author and then the year of publication. In the bibliography, entries are sorted in alphabetical order using these tags.

    I intend to make a page:

    www.discmath.ulg.ac.be/flans/

    for errata and comments about this book.

    I.4. Acknowledgments

    I would like to express my gratitude to Valérie Berthé for her constant and enthusiastic support, for the many projects we run together and finally, for her valuable comments on a draft of this book.

    Several researchers have spent some precious time reading a first draft of this book, their careful reading, their feedback and expert comments were very useful and valuable: Anna Frid, Julien Leroy, Aline Parreau, Narad Rampersad, Eric Rowland, Aleksi Saarela and Jeffrey Shallit. They proposed many clever improvements of the text. I warmly thank them all. I would like to give a special thank to Véronique Bruyère for comments on the last chapter.

    I also sincerely thank Jean-Paul Allouche, Émilie Charlier, Fabien Durand and Victor Marsault for their feedback.

    Even though he was not directly involved in the writing process of this book, the first half of the book has greatly benefited from the many discussions I had with Pavel Salimov when he was a postdoctoral fellow in Liège. Naturally, all the discussions and interactions I could have had along the years with students, colleagues and researchers worldwide had some great influence on me (but such a list would be too long) and I thank them all.

    Michel RIGO

    July 2014

    1

    Crash Course on Regular Languages

    The theory of finite automata has preserved from its origins a great diversity of aspects. From one point of view, it is a branch of mathematics connected with the algebraic theory of semigroups and associative algebras. From another point of view, it is a branch of algorithm design concerned with string manipulation and sequence processing. It is perhaps this diversity that has enriched the field to make it presently one with both interesting applications and significant mathematical problems.

    Dominique Perrin [PER 90]

    This chapter is intended to be short. Automata theory and formal language theory have been developed for more than 50 years. There are many textbooks devoted to these theories (and we can easily find series of exercises). To cite just a few (all these books cover in particular what is nowadays considered as classical material): [HOP 79] or [SUD 06] where the focus is oriented toward the computational models and the corresponding algorithms, the comprehensive [SAK 09] where a wider perspective, a more general algebraic framework and emphasis on the underlying structures are provided. I personally like the presentation and the material covered in [SHA 08]. I can also mention [LAW 04] (for a light introduction to the syntactic monoid of a language) or the classic [EIL 74]. See also, the survey [YU 97] or [PER 90] for a condensed exposition.

    We have already encountered the notion of automaton in several sections of this book: the formal presentation of deterministic automaton first appeared in definition 2.23. Then the extended model with output was given in definition 2.27, Volume 1, to deal with automatic sequences. We will work only with these types of finite machines¹.

    First, we will provide summary of some basic results about finite automata and regular languages. Then, we will select some particular topics related to the main themes of the book: recognizable sets of numbers and morphic or automatic words. The lastsection will present regular languages of polynomial growth. Thus, their characterization is applied to growing letters in a morphic word. Also, this chapter will serve as a preparation for the last chapter of the book where automata will play an important role when dealing with decision procedures. For an easy understanding of Chapter 3, the readers familiar with automata theory should consult sections 1.3 and 1.7.

    1.1. Automata and regular languages

    We begin with a definition that we have already described in definition 2.23, Volume 1. But, here the focus is put on the words, and thus on the language that is accepted by such a machine.

    DEFINITION 1.1.– A deterministic finite automaton, or DFA for short, over an alphabet B = (Q, q0,B, δ, F), where Q is a finite set of states, q0 ∈ Q is the initial state, δ : Q × B → Q is the transition function and F Q is the set of final states. The map δ can be extended to Q × B* by setting δ(q, ε) = q and δ(q, wa) = δ(δ(q, w),a) for all q Q, a B and w B*. The language accepted or recognized is

    A regular language is a language accepted by some DFA.

    Given a DFA, we have a partition of B* = L) ∪ (B* \L)). A word w is trivially either accepted whenever δ(q0,w) ∈ F . Since the automaton is deterministic², exactly one of the two situations occurs. So to speak, given a word w, we start reading this word from the initial state, one letter at a time from left to right. Following the transitions of δ, the current state is updated until the whole word has been read. At that time, we test if the reached state is final or not.

    EXAMPLE 1.2.– Consider the DFA over {a, b} having {0, 1, 2} as set of states. The transition function is defined by δ(0, a) = 0, δ(0, b) = 1, δ(1, a) = 2, δ(1, b) = 1, δ(2, a) = 2, δ(2, b) = 2. The initial state (represented with an in-coming arrow) is 0, and the final states are 0, 1 (represented with an out-going arrow). This automaton is depicted in Figure 1.1. For a word w = w0 … wℓ, we can consider the behavior (or state transition sequence) of the DFA, that is the sequence of + 2 states

    Figure 1.1. A DFA accepting {aibj | i, j ≥ 0}

    Reading aabb or ababb gives, respectively,

    The word aabb is accepted because δ(q0, aabb) = 1 ∈ F . It is easy to see that the language accepted by this DFA is {aibj | i, j ≥ 0}.

    As we will see soon (when dealing with operations on regular languages), it is also interesting to introduce a more general model of machines: the non-deterministic automata.

    DEFINITION 1.3.– A non-deterministic finite automaton, or NFA for short, over an alphabet B = (Q, I, B, Δ,F), where Q is a finite set of states, I Q is the set of initial states, Δ ⊂ Q × B* × Q is the (finite) transition relation and F Q is the set of final states. A word w is accepted if there exists an integer i, some (possibly empty) words v1,…,vi and a sequence of states q1,…,qi+1 such that w = v1 … vi and

    Otherwise stated, there is at least one accepting path from an initial state to some final state with label w. The language accepted is the set of accepted words. We can assume that (q, ε, q) belongs to Δ for all q Q.

    . Let w be a finite word. We let

    to denote the set of states defined as follows. The state r Q belongs to R.w if and only if there exists an integer i, some (possibly empty) words v1, … , vi and a sequence of states q1, … , qi such that w = v1 … vi,

    Otherwise stated, there is a path of label w starting in a state belonging to R and leading to ris

    EXAMPLE 1.4.– We have depicted an NFA in Figure 1.2. The readers may notice that we have two initial states: 0 and 3. There are several ε-transitions, i.e. transitions labeled by the empty word, for instance, from 1 to 4. Reading the symbol a from state 0 can lead to any of the four states 0, 1, 3, 4. With the notation introduced above

    and {5}.b = {1, 2, 4, 5}. Indeed we have, for instance, that

    There are several ways to accept the word aa (but also some non-accepting paths, although the existence of one accepting path is enough).

    Figure 1.2. A non-deterministic automaton over {a, b}

    REMARK 1.5.–

    Enjoying the preview?
    Page 1 of 1