Metaheuristic Applications in Structures and Infrastructures
About this ebook

Due to an ever-decreasing supply of raw materials and stringent constraints on conventional energy sources, demand for lightweight, efficient, and low-cost structures has become crucially important in modern engineering design. Engineers must therefore search for optimal and robust design options to address design problems that are commonly large in scale and highly nonlinear, which makes finding solutions challenging. In the past two decades, metaheuristic algorithms have shown promising power, efficiency, and versatility in solving these difficult optimization problems.

This book examines the latest developments in metaheuristics and their applications in structural engineering, construction engineering, and earthquake engineering, offering practical case studies as examples to demonstrate real-world applications. Topics cover a range of areas within engineering, including the big bang-big crunch approach, genetic algorithms, genetic programming, harmony search, swarm intelligence, and other metaheuristic methods. Case studies include structural identification, vibration analysis and control, topology optimization, transport infrastructure design, design of reinforced concrete, performance-based design of structures, and smart pavement management. With its wide range of everyday problems and solutions, Metaheuristic Applications in Structures and Infrastructures can serve as a supplementary text for design courses and computation in engineering, as well as a reference for researchers and engineers in metaheuristics, optimization in civil engineering, and computational intelligence.

  • Reviews the latest developments of metaheuristics in engineering.
  • Provides detailed algorithm descriptions with a focus on practical implementation.
  • Uses practical case studies as examples and applications.
Language: English
Release date: January 31, 2013
ISBN: 9780123983794
    Book preview

    Metaheuristic Applications in Structures and Infrastructures - Xin-She Yang

    1

    Metaheuristic Algorithms in Modeling and Optimization

    Amir Hossein Gandomi¹, Xin-She Yang², Siamak Talatahari³ and Amir Hossein Alavi⁴, ¹Department of Civil Engineering, The University of Akron, Akron, OH, USA, ²School of Science and Technology, Middlesex University, London, UK, ³Marand Faculty of Engineering, University of Tabriz, Tabriz, Iran, ⁴School of Civil Engineering, Iran University of Science and Technology, Tehran, Iran

    1.1 Introduction

    In metaheuristic algorithms, meta- means beyond or higher level; metaheuristics generally perform better than simple heuristics. All metaheuristic algorithms use some trade-off between local search and global exploration, and the variety of solutions is often realized via randomization. Despite the popularity of metaheuristics, there is no agreed-upon definition of heuristics and metaheuristics in the literature, and some researchers use the two terms interchangeably. The recent trend, however, is to name all stochastic algorithms with randomization and global exploration metaheuristics. Randomization provides a good way to move away from local search toward search on the global scale. Therefore, almost all metaheuristic algorithms are suitable for nonlinear modeling and global optimization.

    Metaheuristics can be an efficient way to use trial and error to produce acceptable solutions to a complex problem in a reasonably practical time. The complexity of the problem of interest makes it impossible to search every possible solution or combination; the aim is to find good, feasible solutions in an acceptable timescale. There is no guarantee that the best solutions can be found; we do not even know whether an algorithm will work, and if it does work, why (Yang, 2008, 2010a). The idea is to have an efficient and practical algorithm that will work most of the time and is able to produce good-quality solutions. Among the quality solutions found, it can be expected that some of them are nearly optimal, though there is no guarantee for such optimality.

    The main components of any metaheuristic algorithm are intensification and diversification, or exploitation and exploration (Blum and Roli, 2003). Diversification means generating diverse solutions so as to explore the search space on the global scale, while intensification means focusing the search on a local region by exploiting the information that a current good solution has been found in this region. This is combined with the selection of the best solutions (Yang, 2011a). Selection of the best ensures that the solutions converge toward optimality, while diversification via randomization increases the diversity of the solutions and keeps them from being trapped at local optima. A good combination of these two major components usually ensures that the global optimum is reachable.

    Metaheuristic algorithms can be classified in many ways. One way is to classify them as either population-based or trajectory-based (Yang, 2010a). For example, genetic algorithms (GAs) and genetic programming (GP) are population-based as they use a set of strings; the particle swarm optimization (PSO) algorithm, which uses multiple agents or particles, is also population-based (Kennedy and Eberhart, 1995). On the other hand, simulated annealing (SA) (Kirkpatrick et al., 1983) uses a single solution that moves through the design space or search space, while artificial neural networks (ANNs) use a different approach.

    Modeling and optimization may have a different emphasis, but for solving real-world problems, we often have to use both modeling and optimization because modeling makes sure that the objective functions are evaluated using the correct mathematical/numerical model for the problem of interest, while optimization can achieve the optimal settings of the design parameters. For optimization, the essential parts are the optimization algorithms. For this reason, we will focus on the algorithms, especially metaheuristic algorithms.

    1.2 Metaheuristic Algorithms

    1.2.1 Characteristics of Metaheuristics

    Throughout history, especially during the early periods of human history, the main approach to problem-solving has always been heuristic or metaheuristic—by trial and error. Many important discoveries were made by thinking outside the box, and often by accident; that is heuristics. Archimedes' Eureka moment was a heuristic triumph. In fact, our daily learning experiences (at least as children) are dominantly heuristic (Yang, 2010a). There are many reasons for the popularity and success of metaheuristics, and one of the main reasons is that these algorithms have been developed by mimicking the most successful processes in nature, including biological systems, and physical and chemical processes. For most algorithms, we know their fundamental components, but how exactly these components interact to achieve efficiency still remains largely a mystery, which inspires more active studies. Convergence analysis of a few algorithms shows some insight, but in general the mathematical analysis of metaheuristic algorithms has many open questions and is still an ongoing active research topic (Yang, 2011a,c).

    The notable performance of metaheuristic algorithms is due to their imitation of the best features in nature. Intensification and diversification are the two main features of metaheuristic algorithms (Blum and Roli, 2003; Gandomi and Alavi, 2012a; Yang, 2010a, 2011c). The intensification phase, also called the exploitation phase, searches around the current best solutions and selects the best candidates. The diversification phase, also called the exploration phase, ensures that the algorithm explores the search space efficiently. The overall efficiency of an algorithm is mainly governed by a fine balance between these two components. If there is too little exploration or too much exploitation, the system may be trapped in local optima, making it very difficult or even impossible to find the global optimum. On the other hand, if there is too much exploration but too little exploitation, the system may struggle to converge and the overall search performance slows down. Balancing these two components is itself a major optimization problem (Yang, 2011c). Exploitation and exploration alone are only part of the search: during the search, a proper mechanism or criterion is needed to select the best solutions. Survival of the fittest is a common criterion, based on repeatedly updating the current best solution found so far. Moreover, a certain degree of elitism should be used to ensure that the best or fittest solutions are not lost and are passed on to the next generations.

    Each algorithm and its variants use different ways to obtain a balance between exploration and exploitation. Certain randomization in combination with a deterministic procedure can be considered an efficient way to achieve exploration or diversification. This makes sure that the newly generated solutions distribute as diversely as possible in the feasible search space. From the implementation viewpoint, the actual way of implementing the algorithm does affect the performance to some degree. Hence, validation and testing of implementation of any algorithm are important (Talbi, 2009).

    1.2.2 No Free Lunch Theorems

    There are so-called no-free-lunch theorems, which have significant implications for the field of optimization (Wolpert and Macready, 1997). One of the theorems states that if algorithm A outperforms algorithm B on some optimization functions, then B will be superior to A on other functions; averaged over the space of all possible functions, both algorithms perform equally well. That is to say, there are no universally better algorithms. An alternative viewpoint is that, for a given optimization problem, there is no need to average over all possible functions: the major task is to find the best solutions, which has nothing to do with the average over all possible function space. Other researchers believe that there is no universal tool and that, based on experience, some algorithms do outperform others for given types of optimization problems. Thus, the main objective is either to choose the most suitable algorithm for a given problem or to design better algorithms for most types of problems, not necessarily for all problems.

    1.3 Metaheuristic Algorithms in Modeling

    Various methodologies can be employed for nonlinear system modeling. Each method has its own advantages or drawbacks. The need to determine both the structure and the parameters of the engineering systems makes the modeling of these systems a difficult task. In general, models are classified into two main groups: (i) phenomenological and (ii) behavioral (Metenidis et al., 2004). The first class is established by taking into account the physical relationships governing a system. The structure of a phenomenological model is chosen on the basis of a priori knowledge about the system. To cope with the design complexity of phenomenological models, behavioral models are usually used. The behavioral models capture the relationships between the inputs and the outputs on the basis of a measured set of data. Thus, there is no need for a priori knowledge about the mechanisms that produced the experimental data. Such models are beneficial because they can provide very good results with minimal effort (Gandomi and Alavi, 2011, 2012a,b; Metenidis et al., 2004). Statistical regression techniques are widely used in behavioral modeling approaches.

    Several alternative metaheuristic approaches have been developed for behavioral modeling. Developments in computer hardware during the last two decades have made it much easier for these techniques to grow into more efficient frameworks. In addition, various metaheuristics may be used as efficient tools in problems where conventional approaches fail or perform poorly. Two well-known classes of metaheuristic algorithms used in nonlinear modeling are ANNs (Haykin, 1999) and GP (Koza, 1992). ANNs have been used for a wide range of structural engineering problems (Alavi and Gandomi, 2011a; Sakla and Ashour, 2005). In spite of the successful performance of ANNs, they usually do not give deep insight into the process for which they obtain a solution. GP, as an extension of GAs, possesses completely new characteristics. GP is essentially a supervised machine-learning approach that searches a program space instead of a data space and automatically generates computer programs that are represented as tree structures and expressed using a functional programming language (Gandomi and Alavi, 2011; Koza, 1992). The ability to generate prediction models without assuming the form of the existing relationships is surely a main advantage of GP over regression and ANN techniques. GP and its variants are widely used for solving real-world problems (Alavi and Gandomi, 2011b; Gandomi et al., 2011a,b). Some other metaheuristic algorithms described in the literature for modeling include fuzzy logic (FL) and support vector machines (SVMs). These algorithms (ANNs, GP, FL, and SVM) are explained in the following sections.

    1.3.1 Artificial Neural Networks

    ANNs emerged as a result of simulating biological nervous systems. The ANN method was developed in the early 1940s by McCulloch and coworkers (Perlovsky, 2001). The first studies focused on building simple neural networks to model simple logic functions. At present, ANNs are applied to problems that do not have algorithmic solutions or whose solutions are too complex. In this study, the approximation ability of two of the most well-known ANN architectures, the multilayer perceptron (MLP) and the radial basis function (RBF) network, is investigated.

    1.3.1.1 Multilayer Perceptron Network

    MLP networks are a class of ANN structures using a feed-forward architecture. They are among the most widely used metaheuristics for modeling complex systems in real-world applications (Alavi et al., 2010a). MLP networks are usually applied to perform supervised learning tasks, which involve iterative training methods to adjust the connection weights within the network. MLPs are universal approximators; that is, they are capable of approximating essentially any continuous function to an arbitrary degree of accuracy. They are often trained using back-propagation (BP) algorithms (Rumelhart et al., 1986). MLPs consist of an input layer, at least one hidden layer of neurons, and an output layer. Each layer has several processing units (nodes), and each unit is fully connected with weighted connections to units in the subsequent layer. Every input is multiplied by the interconnection weights of the nodes, and the output (hj) is obtained by passing the weighted sum through an activation function. Further details of MLPs can be found in Cybenko (1989) and Haykin (1999).
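    As a minimal illustration of the forward pass just described, the Python sketch below computes the output of a single-hidden-layer MLP with a sigmoid activation; the 3-4-1 network shape and the random (untrained) weights are illustrative assumptions, not values from the chapter.

```python
import numpy as np

def mlp_forward(x, W1, b1, W2, b2):
    """Forward pass of a single-hidden-layer MLP: each hidden unit sums
    its weighted inputs and passes the sum through a sigmoid activation;
    the output layer is a weighted sum of the hidden outputs h_j."""
    h = 1.0 / (1.0 + np.exp(-(W1 @ x + b1)))   # hidden-layer outputs h_j
    return W2 @ h + b2                          # linear output layer

# Illustrative (untrained) weights for a 3-4-1 network
rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(4, 3)), np.zeros(4)
W2, b2 = rng.normal(size=(1, 4)), np.zeros(1)
print(mlp_forward(np.array([0.5, -1.2, 3.0]), W1, b1, W2, b2))
```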

    1.3.1.2 Radial Basis Function

    RBF networks have feed-forward architectures. Compared with other ANN structures such as MLPs, the RBF procedure for finding complex relationships is generally faster, and RBF training is much less computationally intensive. The structure of an RBF network consists of an input layer, a hidden layer with a nonlinear RBF activation function, and a linear output layer. Input vectors are transformed into radial basis functions by means of the hidden layer (Alavi et al., 2009).

    The transformation functions used are Gaussian activation functions, whose two important parameters are the center and the width. As the distance, usually the Euclidean distance, between the input vector and the center increases, the output given by the activation function decays to zero; the rate of decay is controlled by the width of the RBF. RBF networks with Gaussian basis functions have been shown to be universal function approximators with high point-wise convergence (Girosi and Poggio, 1990).
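    The following sketch illustrates this behavior: each hidden unit is a Gaussian basis function of the Euclidean distance to its center, and the output layer combines the activations linearly. The centers, widths, and weights are illustrative assumptions, not trained values.

```python
import numpy as np

def rbf_output(x, centers, widths, weights):
    """RBF network output: Gaussian hidden units and a linear output
    layer. Each unit's activation decays to zero as the Euclidean
    distance between x and the unit's center grows; the width controls
    the rate of decay."""
    d2 = np.sum((centers - x) ** 2, axis=1)   # squared distances to centers
    phi = np.exp(-d2 / (2.0 * widths ** 2))   # Gaussian basis functions
    return weights @ phi                       # linear combination

centers = np.array([[0.0, 0.0], [1.0, 1.0]])
widths = np.array([1.0, 0.5])
weights = np.array([0.3, 0.7])
print(rbf_output(np.array([0.5, 0.5]), centers, widths, weights))
```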

    1.3.2 Genetic Programming

    GP is a symbolic optimization technique that creates computer programs to solve a problem using the principle of Darwinian natural selection (Koza, 1992). Friedberg (1958) left the first footprints in the area of GP by using a learning algorithm to improve a program stepwise. Much later, Cramer (1985) applied GAs and tree-like structures to evolve programs. The breakthrough in GP then came in the late 1980s with the experiments of Koza (1992) on symbolic regression. GP was introduced by Koza (1992) as an extension of GA. The main difference between GP and GA is the representation of the solution: GP solutions are computer programs represented as tree structures and expressed in a functional programming language (such as LISP) (Koza, 1992), whereas GA creates strings of numbers that represent the solutions. In GP, the evolving programs (individuals) are parse trees that, unlike fixed-length binary strings, can vary in length throughout the run. Essentially, this was the beginning of computer programs that could program themselves (Koza, 1992). Since GP often evolves computer programs, the solutions can be executed without post-processing, while the coded binary strings typically evolved by GA require post-processing. Optimization techniques such as GA are generally used for parameter optimization, that is, to find the best values for a given set of model parameters. GP, on the other hand, provides the basic structure of the approximation model together with the values of its parameters (Javadi and Rezania, 2009). GP optimizes a population of computer programs according to a fitness landscape determined by a program's ability to perform a given computational task. The fitness of each program in the population is evaluated using a predefined fitness function; thus, the fitness function is the objective function GP aims to optimize (Torres et al., 2009).

    This classical GP approach is referred to as tree-based GP. In addition to the traditional tree-based GP, there are other types of GP where programs are represented in different ways (Figure 1.1). These are linear GP and graph-based GP (Alavi et al., 2012; Banzhaf et al., 1998). The emphasis of the present study is on the linear-based GP techniques.

    Figure 1.1 Different types of GP.

    1.3.2.1 Linear-Based GP

    There are a number of reasons for using linear GP. Basic computer architectures are fundamentally the same now as they were 20 years ago when GP began, and almost all of them represent computer programs in a linear fashion. Also, computers do not naturally run tree-shaped programs, so slow interpreters have to be used as part of tree-based GP. Conversely, by evolving the binary bit patterns actually used by computers, the use of an expensive interpreter (or compiler) is avoided and GP can run several orders of magnitude faster (Poli et al., 2007). Several linear variants of GP have been proposed, including (Oltean and Groşan, 2003a): linear genetic programming (LGP) (Brameier and Banzhaf, 2007), gene expression programming (GEP) (Ferreira, 2001), multiexpression programming (MEP) (Oltean and Dumitrescu, 2002), Cartesian genetic programming (CGP) (Miller and Thomson, 2002), genetic algorithm for deriving software (GADS) (Patterson, 2002), and infix form genetic programming (IFGP) (Oltean and Groşan, 2003b). LGP, GEP, and MEP are the most common linear-based GP methods. These variants make a clear distinction between the genotype and the phenotype of an individual, and their individuals are represented as linear strings (Oltean and Groşan, 2003a).

    1.3.2.1.1 Linear Genetic Programming

    LGP is a subset of GP with a linear representation of individuals. There are several main differences between LGP and traditional tree-based GP. Figure 1.2 presents a comparison of program structures in LGP and tree-based GP. A linear genetic program can be seen as a data flow graph generated by multiple usages of register contents. LGP operates on genetic programs that are represented as linear sequences of instructions of an imperative programming language (such as C/C++) (see Figure 1.2A). As shown in Figure 1.2B, the data flow in tree-based GP is more rigidly determined by the tree structure of the program (Brameier and Banzhaf, 2001; Gandomi et al., 2010).

    Figure 1.2 Comparison of the GP structures. (A) LGP and (B) Tree-based GP. Source: After Alavi et al. (2010).

    In the LGP system described here, an individual program is interpreted as a variable-length sequence of simple C instructions. The instruction set or function set of LGP consists of arithmetic operations, conditional branches, and function calls. The terminal set of the system is composed of variables and constants. The instructions are restricted to operations that accept a minimum number of constants or memory variables, called registers (r), and assign the result to a destination register, e.g., r0=r1+1. Linear genetic programs can be converted into a functional representation by successive replacements of variables, starting with the last effective instruction (Oltean and Groşan, 2003a). Automatic induction of machine code by genetic programming (AIMGP) is a particular form of LGP in which evolved programs are stored as linear strings of native binary machine code and are directly executed by the processor during fitness calculation. The absence of an interpreter and of complex memory handling results in a significant speedup of AIMGP execution compared to tree-based GP. This machine-code-based LGP approach searches for the computer program and the constants at the same time. Comprehensive descriptions of the basic parameters used to direct a search for a linear genetic program can be found in Brameier and Banzhaf (2007).
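    To make the register-based representation concrete, here is a minimal Python interpreter for a linear program of the kind described. The tuple encoding and the tiny function set are assumptions for illustration; a real LGP system would evolve such sequences and also include branches and function calls.

```python
def run_linear_program(instructions, inputs, n_registers=4):
    """Interpret a linear genetic program: a variable-length sequence of
    simple register instructions such as ('r0', '+', 'r1', 1.0), i.e.
    r0 = r1 + 1. Operands are register names or numeric constants."""
    ops = {'+': lambda a, b: a + b, '-': lambda a, b: a - b,
           '*': lambda a, b: a * b}
    r = {'r%d' % i: 0.0 for i in range(n_registers)}
    for i, v in enumerate(inputs):                 # load inputs into registers
        r['r%d' % i] = v
    for dest, op, s1, s2 in instructions:
        a = r[s1] if isinstance(s1, str) else s1   # register or constant
        b = r[s2] if isinstance(s2, str) else s2
        r[dest] = ops[op](a, b)
    return r['r0']                                 # r0 holds the program output

# The program r2 = r0*r1; r0 = r2+1, run with inputs r0=2 and r1=3
prog = [('r2', '*', 'r0', 'r1'), ('r0', '+', 'r2', 1.0)]
print(run_linear_program(prog, [2.0, 3.0]))        # 7.0
```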

    1.3.2.1.2 Gene Expression Programming

    GEP is a natural development of GP, first presented by Ferreira (2001). GEP consists of five main components: function set, terminal set, fitness function, control parameters, and termination condition. Unlike the parse-tree representation in conventional GP, GEP uses fixed-length character strings to represent solutions, which are afterward expressed as parse trees of different sizes and shapes, called GEP expression trees (ETs). One advantage of the GEP technique is that the creation of genetic diversity is extremely simple, as genetic operators work at the chromosome level. Another strength of GEP is its unique multigenic nature, which allows the evolution of more complex programs composed of several subprograms (Gandomi and Alavi, 2011b, 2012c).

    GEP genes have a fixed length, which is predetermined for a given problem. Thus, what varies in GEP is not the length of genes but the size of the corresponding ETs.
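    A minimal sketch of this decoding step is shown below: a fixed-length gene (in the style often called Karva notation in the GEP literature) is read breadth-first into an expression tree, so trees of different sizes arise from genes of the same length. The symbol set and the example gene are illustrative assumptions.

```python
OPS = {'+': lambda a, b: a + b, '-': lambda a, b: a - b,
       '*': lambda a, b: a * b, '/': lambda a, b: a / b}

def eval_gep_gene(gene, env):
    """Decode a fixed-length GEP gene into an expression tree,
    breadth-first, then evaluate it. Only the leading symbols are used,
    so the tree size varies while the gene length stays fixed."""
    root = [gene[0], []]
    queue, pos = [root], 1
    while queue:                        # each function symbol consumes
        sym, kids = queue.pop(0)        # two child symbols from the gene
        for _ in range(2 if sym in OPS else 0):
            child = [gene[pos], []]
            pos += 1
            kids.append(child)
            queue.append(child)
    def ev(node):
        sym, kids = node
        if sym not in OPS:              # terminal: look up its value
            return env[sym]
        return OPS[sym](ev(kids[0]), ev(kids[1]))
    return ev(root)

# The gene '+*cababc' expresses (a*b)+c; the trailing symbols are unused.
print(eval_gep_gene('+*cababc', {'a': 2.0, 'b': 3.0, 'c': 4.0}))   # 10.0
```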

    1.3.2.1.3 Multiexpression Programming

    MEP is a branch of GP developed by Oltean and Dumitrescu (2002). MEP uses linear chromosomes for solution encoding and has a special ability to encode multiple solutions (computer programs) of a problem in a single chromosome. Based on the fitness values of the individuals, the best encoded solution is chosen to represent the chromosome. Compared with other GP variants that store a single solution per chromosome, there is no increase in the complexity of the MEP decoding process, except in situations where the set of training data is not known (Oltean and Groşan, 2003a,c). The evolutionary steady-state MEP algorithm typically starts with the creation of a random population of individuals.

    MEP chromosomes are represented in a way similar to how C and Pascal compilers translate mathematical expressions into machine code. The number of MEP genes per chromosome is constant and specifies the length of the chromosome. Each gene encodes a terminal (an element in the terminal set T) or a function symbol (an element in the function set F). A gene that encodes a function includes pointers toward the function arguments, and function arguments always have indices of lower values than the position of the function itself in the chromosome. Consequently, as required by this representation scheme, the first symbol in a chromosome must be a terminal symbol.

    The fitness of each expression in an MEP chromosome is calculated to designate the best encoded expression in that chromosome (Alavi et al., 2010b).
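    The sketch below illustrates this multi-solution encoding: every gene is either a terminal or a function whose pointers refer to earlier genes, each gene therefore encodes one expression, and the fittest expression represents the chromosome. The tuple encoding and the error-based fitness are assumptions for illustration.

```python
OPS = {'+': lambda a, b: a + b, '-': lambda a, b: a - b,
       '*': lambda a, b: a * b}

def decode_mep(chromosome, env, error):
    """Evaluate every expression encoded in an MEP chromosome and return
    the best one. A gene is either a terminal symbol or a tuple
    (function, i, j) whose argument pointers i, j refer to earlier genes,
    which is why the first gene must be a terminal."""
    values = []
    for gene in chromosome:
        if isinstance(gene, tuple):        # function gene with pointers
            op, i, j = gene
            values.append(OPS[op](values[i], values[j]))
        else:                              # terminal gene
            values.append(env[gene])
    return min(values, key=error)          # best of the encoded solutions

# Chromosome encoding, among others, x, y, x+y, and (x+y)*x
chrom = ['x', 'y', ('+', 0, 1), ('*', 2, 0)]
print(decode_mep(chrom, {'x': 2.0, 'y': 3.0}, lambda v: abs(v - 10.0)))  # 10.0
```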

    1.3.3 Fuzzy Logic

    FL is a process of mapping an input space onto an output space using membership functions and linguistically specified rules (Ceven and Ozdemir, 2007). The concept of a fuzzy set was first introduced by Zadeh (1965). The fuzzy approach is more in line with human thought, as it provides possible rules relating input variables to the output variable. FL is well suited to implementing control rules that can only be expressed verbally, and it can be used to model systems that cannot be modeled with linear differential equations (Afandizadeh-Zargari et al., 2012).

    The essential idea in FL is the concept of partial belonging of any object to different subsets of the universal set, instead of full belonging to a single set. Partial belonging to a set can be described numerically by a membership function (Topcu and Sarıdemir, 2008). A membership function is a curve mapping an input element to a membership degree between 0 and 1, showing the extent to which the element belongs to a fuzzy set. Membership functions can have different shapes for different kinds of fuzzy sets, such as bell, sigmoid, triangular, and trapezoidal (Ceven and Ozdemir, 2007). In FL, rules and membership sets are used to make a decision. The idea of a fuzzy set is basic and simple: an object is allowed a gradual membership of a set, which means the degree of truth of a statement can range between 0 and 1 rather than being limited to the two logic values {true, false}.

    When linguistic variables are used, these degrees may be managed by specific functions. A fuzzy system consists of input and output variables. For each variable, fuzzy sets that characterize the variable are formulated, and for each fuzzy set a membership function is defined. After that, the rules that relate the input and output variables to their fuzzy sets are defined. Figure 1.3 depicts a typical FL system; a general fuzzy inference system has four basic components: fuzzification, fuzzy rule base, fuzzy inference engine, and defuzzification (Topcu and Sarıdemir, 2008).

    Figure 1.3 FL system.
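    As a minimal sketch of these components, the Python fragment below fuzzifies a crisp input with triangular membership functions, fires two verbal rules, and defuzzifies with a simple weighted average. The temperature/fan-speed rules and all set parameters are illustrative assumptions, not from the chapter.

```python
def tri(x, a, b, c):
    """Triangular membership function: the degree rises from 0 at a to 1
    at b, then falls back to 0 at c."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x <= b else (c - x) / (c - b)

def infer(temp):
    """Two verbal rules mapping temperature to a fan speed:
    IF temp is cool THEN speed = 20; IF temp is hot THEN speed = 80.
    Rule strengths are the membership degrees; the crisp output is their
    weighted average (a simple defuzzification)."""
    w_cool = tri(temp, 0, 10, 25)     # fuzzification into the 'cool' set
    w_hot = tri(temp, 15, 30, 40)     # fuzzification into the 'hot' set
    if w_cool + w_hot == 0:
        return 0.0
    return (w_cool * 20 + w_hot * 80) / (w_cool + w_hot)

print(infer(20.0))   # partially cool and partially hot -> 50.0
```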

    1.3.4 Support Vector Machines

    SVM is a well-known machine-learning method, based on statistical learning theory (Boser et al., 1992; Vapnik, 1995, 1998). Similar to ANNs, the SVM procedure involves a training phase in which a series of input and target output values are fed into the model. A trained algorithm is then employed to evaluate a separate set of testing data. Two fundamental concepts underlying the SVM are (Goh and Goh, 2007):

    1. An optimum margin classifier. This is a linear classifier that constructs a separating hyperplane (decision surface) such that the distance between the positive and the negative examples is maximized.

    2. Use of kernel functions. A kernel function calculates the dot product of two vectors. A suitable nonlinear kernel can map the original example data onto a new data set that becomes linearly separable in a high-dimensional feature space, even though the data are nonseparable in the original input space (Goh and Goh, 2007; Vapnik, 1995, 1998).

    The SVM procedure can be outlined as follows (Goh and Goh, 2007):

    a. Choosing a kernel function with related kernel parameters.

    b. Solving a Lagrange cost function and obtaining the Lagrange multipliers.

    c. Carrying out the binary classification task, with training input data points.

    Comprehensive descriptions of SVM can be found in more advanced literature (Goh and Goh, 2007; Vapnik, 1995).
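    A small sketch of these two concepts, assuming an RBF kernel and hypothetical (untrained) support vectors and Lagrange multipliers: the decision function is a kernel expansion over the support vectors, and the kernel evaluates a feature-space dot product directly from the original input vectors.

```python
import numpy as np

def rbf_kernel(u, v, gamma=1.0):
    """RBF kernel: an implicit dot product in a high-dimensional feature
    space, computed directly from the original input vectors."""
    return np.exp(-gamma * np.sum((u - v) ** 2))

def svm_decision(x, support_vectors, alphas, labels, b, gamma=1.0):
    """Sign of the kernel expansion f(x) = sum_i alpha_i y_i K(x_i, x) + b.
    The alphas are the Lagrange multipliers from the training phase;
    illustrative values are used below."""
    f = sum(a * y * rbf_kernel(sv, x, gamma)
            for a, y, sv in zip(alphas, labels, support_vectors)) + b
    return 1 if f >= 0 else -1

# Hypothetical support vectors and multipliers (not a trained model)
svs = np.array([[0.0, 0.0], [2.0, 2.0]])
print(svm_decision(np.array([0.2, 0.1]), svs, [0.8, 0.8], [-1, 1], b=0.0))
```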

    1.4 Metaheuristic Algorithms in Optimization

    Finding an optimal solution to an optimization problem is often a very challenging task, and success depends on the choice and the correct use of the right algorithm. The choice of an algorithm may depend on the type of problem, the availability of algorithms, computational resources, and time constraints. For large-scale, nonlinear, global optimization problems, there is often no agreed guideline for algorithm choice, and in many cases there is no efficient algorithm. For hard optimization problems, especially nondeterministic polynomial-time hard (NP-hard) problems, no efficient algorithm exists at all. In most applications, an optimization problem can be expressed in the following generic form (Yang, 2010a, 2011e):

    minimize fi(x), (i=1, 2, …, M), (1.1)

    subject to hj(x)=0, (j=1, 2, …, J), (1.2)

    gk(x)≤0, (k=1, 2, …, K), (1.3)

    where fi(x), hj(x), and gk(x) are functions of the design vector x=(x1, x2, …, xn)T. Here, the components xi of x are called design or decision variables, and they can be real continuous, discrete, or a mix of the two. The functions fi(x), where i=1, 2, …, M, are called the objective functions, or simply cost functions; in the case of M=1, there is only a single objective. The space spanned by the decision variables is called the design space or search space. The equalities for hj and inequalities for gk are called constraints. It is worth pointing out that we can also write the inequalities the other way (gk(x)≥0), and we can also formulate the objectives as a maximization problem.
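    Since most metaheuristics search an essentially unconstrained space, one common way (an illustrative choice here, not the chapter's prescription) to handle the constraints in Eqs. (1.1)-(1.3) is a penalty method that folds constraint violations into the objective:

```python
def penalized_objective(f, equalities, inequalities, x, mu=1e3):
    """Evaluate the generic problem (1.1)-(1.3) as a single unconstrained
    objective with a static penalty: violations of h_j(x) = 0 and
    g_k(x) <= 0 are squared, weighted by mu, and added to f(x)."""
    penalty = sum(h(x) ** 2 for h in equalities)
    penalty += sum(max(0.0, g(x)) ** 2 for g in inequalities)
    return f(x) + mu * penalty

# Minimize (x0-1)^2 + (x1-2)^2 subject to x0 + x1 = 2 and x0 >= 0
f = lambda x: (x[0] - 1) ** 2 + (x[1] - 2) ** 2
h = [lambda x: x[0] + x[1] - 2]       # equality constraint h(x) = 0
g = [lambda x: -x[0]]                 # inequality written as g(x) <= 0
print(penalized_objective(f, h, g, [0.5, 1.5]))
```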

    Various algorithms may be used for solving optimization problems. The conventional or classic algorithms are mostly deterministic. For instance, the simplex method in linear programming is deterministic. Some other deterministic optimization algorithms, such as the Newton-Raphson algorithm, use gradient information and are called gradient-based algorithms. Nongradient-based, or gradient-free/derivative-free, algorithms use only the function values, not any derivatives (Yang, 2011b).

    Heuristic and metaheuristic algorithms are the main types of stochastic algorithms, and the difference between them is negligible. Heuristic means to find or to discover by trial and error. Quality solutions to a tough optimization problem can be found in a reasonable amount of time, but there is no guarantee that optimal solutions are reached. This is useful when good solutions, but not necessarily the best solutions, are needed within a reasonable amount of time (Koziel and Yang, 2011; Yang, 2010a).

    As discussed earlier in this chapter, metaheuristic optimization algorithms are often inspired by nature. These algorithms can be classified into different categories based on the source of inspiration, as shown in Figure 1.4. The main category is the biology-inspired algorithms, which generally use biological evolution and/or the collective behavior of animals as their models. Science is another source of inspiration; such algorithms are usually inspired by physics and chemistry. Moreover, art-inspired algorithms have been successful for global optimization; these are generally inspired by the creative behavior of artists such as musicians and architects. Social behavior is yet another source of inspiration, and socially inspired algorithms simulate social behavior to solve optimization problems.

    Figure 1.4 Source of inspiration in metaheuristic optimization algorithms.

    Although there are different sources of inspiration for the metaheuristic optimization algorithms, they have similarities in their structures. Therefore, they can also be classified into two main categories: evolutionary algorithms and swarm algorithms.

    1.4.1 Evolutionary Algorithms

    Evolutionary algorithms generally use an iterative procedure based on the process of biological evolution to solve optimization problems. Some of the evolutionary algorithms are described below.

    1.4.1.1 Genetic Algorithm

    GAs are a powerful optimization method based on the principles of genetics and natural selection (Holland, 1975). Holland (1975) was the first to use crossover and recombination, mutation, and selection in the study of adaptive and artificial systems, and these genetic operators form the essential part of GA for problem-solving. Up to now, many variants of GA have been developed and applied to a wide range of optimization problems (Nikjoofar and Zarghami, 2013; Rani et al., 2012). One of the main advantages of GA is that it is a gradient-free method with the flexibility to deal with various types of optimization, whether the objective function is stationary or nonstationary, linear or nonlinear, continuous or discontinuous, or subject to random noise. Because the multiple offspring in a population act like independent agents, the population can search the design space in many directions simultaneously, which makes GAs well suited to parallel implementation; moreover, different parameters and groups of encoded strings can be manipulated at the same time. Despite these advantages, GAs have some disadvantages concerning the formulation of the fitness function, the choice of population size and other important parameters, and the selection criteria for a new population. The convergence of GA can depend critically on the appropriate choice of these parameters.
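    A minimal binary GA sketch using the operators named above (tournament selection, one-point crossover, bitwise mutation, and elitism); the parameter values and the OneMax test objective are illustrative assumptions.

```python
import random

def genetic_algorithm(fitness, n_bits=16, pop_size=20, pc=0.8, pm=0.02,
                      generations=100):
    """Minimal binary GA: tournament selection, one-point crossover,
    bitwise mutation, and elitism so the fittest string survives."""
    pop = [[random.randint(0, 1) for _ in range(n_bits)]
           for _ in range(pop_size)]
    best = max(pop, key=fitness)
    for _ in range(generations):
        def select():                      # binary tournament selection
            a, b = random.sample(pop, 2)
            return a if fitness(a) >= fitness(b) else b
        children = [best[:]]               # elitism
        while len(children) < pop_size:
            p1, p2 = select()[:], select()[:]
            if random.random() < pc:       # one-point crossover
                cut = random.randint(1, n_bits - 1)
                p1 = p1[:cut] + p2[cut:]
            # bitwise mutation: flip each gene with probability pm
            children.append([1 - g if random.random() < pm else g
                             for g in p1])
        pop = children
        best = max(pop, key=fitness)
    return best

# Maximize the number of ones (the classic OneMax test problem)
print(genetic_algorithm(sum))
```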

    1.4.1.2 Differential Evolution

    Differential evolution (DE) was developed by Storn and Price (1997). It is a vector-based evolutionary algorithm that can be considered a further development of GAs. It is a stochastic search algorithm with a self-organizing tendency that does not use derivative information. DE carries out operations over each component (each dimension of the solution): solutions are represented as vectors, and mutation and crossover are carried out on these vectors (Gandomi et al., 2012a). For example, in GAs, mutation is carried out at one site or multiple sites of a chromosome, whereas in DE the difference vector of two randomly chosen vectors is used to perturb an existing vector. Such vectorized mutation can be viewed as a self-organizing search, directed toward optimality (Yang, 2008, 2010a). This kind of perturbation is carried out over each population vector and can thus be expected to be more efficient. Similarly, crossover is a vector-based, component-wise exchange of chromosomes or vector segments.
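    The sketch below implements one generation of the common DE/rand/1/bin scheme, consistent with this description; the control parameters F and CR and the sphere test function are illustrative assumptions.

```python
import random

def de_step(population, fitness, F=0.8, CR=0.9):
    """One DE/rand/1/bin generation: each vector is perturbed by the
    scaled difference of two randomly chosen vectors, crossed over
    component-wise, and replaced only if the trial vector is better."""
    n = len(population[0])
    new_pop = []
    for i, x in enumerate(population):
        r1, r2, r3 = random.sample([p for j, p in enumerate(population)
                                    if j != i], 3)
        # Mutation: a difference vector perturbs a third random vector
        v = [r1[d] + F * (r2[d] - r3[d]) for d in range(n)]
        # Binomial crossover, with one dimension forced from the mutant
        j_rand = random.randrange(n)
        u = [v[d] if (random.random() < CR or d == j_rand) else x[d]
             for d in range(n)]
        new_pop.append(u if fitness(u) <= fitness(x) else x)  # greedy
    return new_pop

sphere = lambda x: sum(c * c for c in x)     # minimize the sphere function
pop = [[random.uniform(-5, 5) for _ in range(3)] for _ in range(10)]
for _ in range(50):
    pop = de_step(pop, sphere)
print(min(sphere(p) for p in pop))
```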

    1.4.1.3 Harmony Search

    The harmony search (HS) algorithm is a music-inspired algorithm based on the improvisation process of a musician (Geem et al., 2001). Previous reviews of the HS literature have focused on applications in civil engineering such as engineering optimization (Lee and Geem, 2005), design of structures (Lee et al., 2005), design of water-distribution networks (Geem, 2006), geometry design of geodesic domes (Saka, 2007), design of steel frames (Degertekin, 2008a,b), groundwater management problems (Ayvaz and Elci, 2012), and geotechnical engineering problems (Cheng and Geem, 2012).

    HS algorithms include a number of optimization operators, such as the harmony memory (HM), the harmony memory size (HMS), the harmony memory considering rate (HMCR), and the pitch adjusting rate (PAR). In the HS algorithm, the HM stores the feasible vectors, which are all in the feasible space, and the HMS determines the number of vectors stored.

    During the optimization process, a new harmony vector is generated from the HM, based on memory considerations, pitch adjustments, and randomization. After generating a new harmony vector, if it is better than the worst harmony in the HM, judged in terms of the objective function value, the new harmony is included in the HM and the existing worst harmony is excluded from the HM. Pitch adjustment is similar to the mutation operator in GAs. Although adjusting pitch has a similar role, it is limited to a certain local pitch adjustment and thus corresponds to a local search. The use of randomization can drive the system further to explore various regions with high solution diversity so as to find the global optimality.
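    A sketch of the improvisation step just described, with illustrative HMCR, PAR, and bandwidth values:

```python
import random

def improvise(hm, lower, upper, hmcr=0.9, par=0.3, bw=0.05):
    """Generate one new harmony vector from the harmony memory (HM):
    with probability HMCR a component is drawn from the HM (and
    pitch-adjusted with probability PAR); otherwise it is picked at
    random from the allowed range."""
    new = []
    for d in range(len(hm[0])):
        if random.random() < hmcr:                 # memory consideration
            value = random.choice(hm)[d]
            if random.random() < par:              # local pitch adjustment
                value += bw * random.uniform(-1, 1)
        else:                                      # randomization
            value = random.uniform(lower[d], upper[d])
        new.append(min(max(value, lower[d]), upper[d]))
    return new
```

    In a full HS loop, the improvised vector would then replace the worst harmony in the HM whenever it has a better objective value, as described above.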

    1.4.2 Swarm-Intelligence-Based Algorithms

    Swarm-intelligence-based algorithms use the collective behavior of animals such as birds, insects, or fishes. Here, we introduce briefly some of the most widely used swarm algorithms.

    1.4.2.1 Particle Swarm Optimization

    The PSO algorithm, inspired by the simulation of social behavior, was initially proposed by Kennedy and Eberhart (1995). PSO uses the idea that social sharing of information among members may provide an evolutionary advantage (Kennedy et al., 2001). PSO has been applied to many real-world problems (Talatahari et al., 2012a; Yang, 2008, 2010a). A standard PSO algorithm is initialized with a population (swarm) of random potential solutions (particles). Each particle iteratively moves across the search space and is attracted to the position of the best fitness historically achieved by the particle itself (local best) and by the best among the neighbors of the particle (global best) (Kaveh and Talatahari, 2009a). In fact, in PSO, instead of using more traditional genetic operators, each particle adjusts its flying according to its own flying experience and its companions' flying experience (Hadidi et al., 2011; Kaveh and Talatahari, 2008, 2009b). Chaos theory can also improve the performance of PSO by tuning its main constants or random variables (Gandomi et al., 2013a).

    The original PSO, developed by Kennedy and Eberhart (1995), calculates the velocity of each particle from its previous velocity, the direction toward the best position found by the particle itself, and the direction toward the best position found by the swarm, and then updates the particle position.

    After many numerical simulations, Shi (1998) added a weighting/inertia factor to the velocity equation to control the trade-off between the global exploration and the local exploitation abilities of the flying particles.

    A well-chosen weight can stabilize the swarm as well as speed up the convergence. With a linearly decreasing inertia weight, the PSO may lack global search ability at the end of a run, even when global search is needed to jump out of a local minimum. Nevertheless, results in the literature illustrate that a linearly decreasing inertia weight can greatly improve the performance of the PSO, yielding better results than both a simple PSO and evolutionary programming, as reported in Angeline (1998) and Shi and Eberhart (1999). Eusuff and Lansey (2003) combined the benefits of the local search tool of the PSO with the idea of mixing information from parallel local searches (Duan et al., 1993) to solve global optimization problems; they called this the shuffled frog-leaping (SFL) algorithm.
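    A sketch of the inertia-weighted velocity and position update described above, for a single particle; the coefficient values are common illustrative choices, not prescribed by the chapter.

```python
import random

def pso_update(x, v, pbest, gbest, w=0.7, c1=2.0, c2=2.0):
    """Inertia-weighted PSO update for one particle: the new velocity
    blends the previous velocity (scaled by the inertia weight w) with
    random attractions toward the particle's own best position and the
    swarm's global best position."""
    new_v = [w * v[d]
             + c1 * random.random() * (pbest[d] - x[d])
             + c2 * random.random() * (gbest[d] - x[d])
             for d in range(len(x))]
    new_x = [x[d] + new_v[d] for d in range(len(x))]
    return new_x, new_v
```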

    1.4.2.2 Ant Colony Optimization

    In 1992, Dorigo developed a paradigm known as ant colony optimization (ACO), a cooperative search technique that mimics the foraging behavior of real-life ant colonies (Dorigo, 1992; Dorigo et al., 1996). Ant algorithms mimic the behavior of real ants, which can rapidly establish the shortest route from a food source to their nest and vice versa. Ants start searching the area surrounding their nest in a random manner. Ethologists have observed that ants can construct the shortest path from their colony to the food source and back using pheromone trails (Deneubourg and Goss, 1989; Goss et al., 1990). When ants encounter an obstacle, there is at first an equal probability of each ant moving right or left, but after a while the number of ants choosing the shorter path increases because of the growing amount of pheromone on that path. As the number of ants and the amount of pheromone on the shorter path increase, all of the ants eventually choose and move along the shorter one (Kaveh and Talatahari, 2010a; Talatahari et al., 2012b).

    In fact, real ants use their pheromone trails as a medium for communication of information among them. When an isolated ant comes across some food source in its random sojourn, it deposits a quantity of pheromone on that location. Other randomly moving ants in the neighborhood can detect this marked pheromone trail. Furthermore, these ants can follow this trail with a very high degree of probability and simultaneously enhance the trail by depositing their own pheromone. More and more ants follow the pheromone-rich trail and the probability of the trail being followed by other ants is further enhanced by the increased trail deposition. This is an autocatalytic (positive feedback) process that favors the path along which more ants previously traversed. The ant algorithms are based on the indirect communication capabilities of the ants. In ACO algorithms, virtual ants are deputed to generate rules by using heuristic information or visibility and the principle of indirect pheromone communication capabilities for iterative improvement of rules.

    The general procedure of the ACO algorithm involves the scheduling of three steps: initialization, solution construction, and pheromone updating. The initialization of the ACO includes two parts: the first is the initialization of the pheromone trail; in the second, a number of ants are arbitrarily placed on randomly chosen nodes. Each of the distributed ants then performs a tour on the graph by constructing a path according to the node transition rule described next.

    For generation of a solution, each ant constructs a complete solution to the problem according to a probabilistic state transition rule. The state transition rule depends mainly on the state of the pheromone and visibility of ants. Visibility is an additional ability used to make this method more efficient. When every ant has constructed a solution, the intensity of pheromone trails on each edge is updated by the pheromone updating rule, which is applied in two phases: first, an evaporation phase where a fraction of the pheromone evaporates, and then a reinforcement phase, where the elitist ant, which has the best solution among the others, deposits an amount of pheromone. At the end of each movement, local pheromone update reduces the level of pheromone trail on paths selected by the ant colony during the preceding iteration.
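    The two core rules can be sketched as follows: a probabilistic state transition rule combining pheromone and visibility, and a pheromone update with evaporation followed by elitist reinforcement. The parameter names (alpha, beta, rho, Q) follow common ACO usage, and the exact update form is an illustrative assumption.

```python
import random

def choose_next(i, unvisited, tau, eta, alpha=1.0, beta=2.0):
    """Probabilistic state transition rule: from node i, an ant moves to
    node j with probability proportional to tau[i][j]**alpha *
    eta[i][j]**beta, where tau is the pheromone level and eta the
    visibility (e.g., the reciprocal of the distance)."""
    weights = [tau[i][j] ** alpha * eta[i][j] ** beta for j in unvisited]
    return random.choices(unvisited, weights=weights)[0]

def update_pheromone(tau, best_tour, best_length, rho=0.5, Q=1.0):
    """Pheromone updating rule: evaporation on every edge, followed by
    reinforcement of the edges in the elitist ant's best tour."""
    n = len(tau)
    for i in range(n):                          # evaporation phase
        for j in range(n):
            tau[i][j] *= 1.0 - rho
    for i, j in zip(best_tour, best_tour[1:]):  # reinforcement phase
        tau[i][j] += Q / best_length
        tau[j][i] += Q / best_length
```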

    1.4.2.3 Bee Algorithms

    Bee algorithms are another class of metaheuristic algorithms, which mimic the behavior of bees (Karaboga, 2005; Yang, 2005, 2008). Different variants of bee algorithms use slightly different characteristics of bee behavior. For example, in the honeybee-based algorithms, forager bees are allocated to different food sources (or flower patches) so as to maximize the total nectar intake (Karaboga, 2005; Nakrani and Tovey, 2004; Pham et al., 2006; Yang, 2005). In the virtual bee algorithm (VBA), developed by Yang (2005), pheromone concentrations can be linked with the objective functions more directly. The artificial bee colony (ABC) optimization algorithm, on the other hand, was first developed by Karaboga (2005). In the ABC algorithm, the bees in a colony are divided into three groups; unlike the honeybee algorithm, which has two groups of bees (forager bees and observer bees), bees in ABC are more specialized (Afshar et al., 2007; Karaboga, 2005).

    In the ABC algorithm, the colony of artificial honey bees contains three groups: employed bees (forager bees), onlooker bees (observer bees), and scouts. The first half of the colony consists of the employed artificial bees and the second half of the onlookers. The position of a food source represents a possible solution to the optimization problem under consideration, and the amount of nectar at a food source corresponds to the quality or fitness of the associated solution. The ABC algorithm first generates a randomly distributed initial population of a predefined size. After initialization, the population of positions (solutions) is subjected to repeated cycles of the search processes of the employed, onlooker, and scout bees. An employed bee modifies the position (solution) in its memory depending on local information (visual information) and tests the nectar amount (fitness value) of the new food source (new solution). Provided that the nectar amount of the new source is higher than that of the previous one, the bee memorizes the new position and forgets the old one; otherwise, it keeps the position of the previous source in its memory. When all the employed bees have completed the search process, they share the nectar and position information of the food sources with the onlooker bees in the dance area.
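    A minimal sketch of the employed-bee step just described: one dimension of a food source is perturbed toward (or away from) a random neighbor, and the move is kept only if the nectar amount improves. The list-based encoding is an assumption for illustration.

```python
import random

def employed_bee_step(i, foods, fitness):
    """Employed-bee move in ABC: perturb one randomly chosen dimension of
    food source i relative to a random neighbor k, and keep the new
    position only if its nectar amount (fitness) is higher (the greedy
    memorization described in the text)."""
    x = foods[i]
    k = random.choice([j for j in range(len(foods)) if j != i])
    d = random.randrange(len(x))
    phi = random.uniform(-1.0, 1.0)
    candidate = x[:]
    candidate[d] = x[d] + phi * (x[d] - foods[k][d])
    return candidate if fitness(candidate) > fitness(x) else x
```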

    1.4.2.4 Firefly Algorithm

    The firefly algorithm (FA) was first developed by Yang (2008, 2009) and is based on the flashing patterns and behavior of fireflies. In essence, FA uses the following three idealized rules: (i) fireflies are unisexual, so one firefly will be attracted to other fireflies regardless of their sex; (ii) attractiveness is proportional to brightness, and both decrease as the distance between two fireflies increases, so for any two flashing fireflies the less bright one will move toward the brighter one, and if neither is brighter they each move randomly; and (iii) the brightness of a firefly is determined by the landscape of the objective function.
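    Rule (ii) leads to the standard firefly movement sketched below, where attractiveness decays exponentially with the squared distance; beta0, gamma, and alpha are illustrative parameter choices.

```python
import math
import random

def move_firefly(xi, xj, beta0=1.0, gamma=1.0, alpha=0.2):
    """Move firefly i toward a brighter firefly j: attractiveness decays
    with distance as beta0*exp(-gamma*r**2), and a small random term
    keeps the search diverse."""
    r2 = sum((a - b) ** 2 for a, b in zip(xi, xj))
    beta = beta0 * math.exp(-gamma * r2)   # attractiveness at distance r
    return [a + beta * (b - a) + alpha * random.uniform(-0.5, 0.5)
            for a, b in zip(xi, xj)]
```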

    A demo version of FA implementation, without Lévy flights, can be found at Mathworks file exchange website.¹ FA has attracted much attention (Apostolopoulos and Vlachos, 2011; Gandomi et al., 2011c; Sayadi et al., 2010; Talatahari et al., 2012c; Yang et al., 2012). A discrete version of FA can efficiently solve NP-hard scheduling problems (Sayadi et al., 2010), while a detailed analysis has demonstrated the efficiency of FA over a wide range of test problems, including multiobjective load dispatch problems (Apostolopoulos and Vlachos, 2011). A chaos-enhanced FA with a basic method for automatic parameter tuning is also developed (Yang, 2011b), and the use of various chaotic maps can significantly improve the performance of FA, though different chaotic maps may have different effects (Gandomi et al., 2013b).

    1.4.2.5 Cuckoo Search

    Cuckoo search (CS) is one of the latest nature-inspired metaheuristic algorithms, developed by Yang and Deb (2009). CS is based on the brood parasitism of some cuckoo species. In addition, this algorithm is enhanced by the so-called Lévy flights (Pavlyukevich, 2007), rather than by simple isotropic random walks. Recent studies show that CS is potentially far more efficient than PSO and GAs (Yang and Deb, 2010). For simplicity in describing the standard CS, we now use the following three idealized rules: (i) each cuckoo lays one egg at a time, and dumps it in a randomly chosen nest; (ii) the best nests with high-quality eggs will be carried over to the next generations; and (iii) the number of available host nests is fixed, and the egg laid by a cuckoo is discovered by the host bird with a probability between 0 and 1.
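    A sketch of the Lévy-flight step that distinguishes CS from isotropic random walks, not detailed in this excerpt: Mantegna's algorithm draws heavy-tailed step lengths, and the step scaling and bias toward the current best nest are common illustrative choices.

```python
import math
import random

def levy_step(beta=1.5):
    """One Levy-distributed step length drawn with Mantegna's algorithm:
    the heavy tail produces occasional long jumps among many short ones."""
    sigma = (math.gamma(1 + beta) * math.sin(math.pi * beta / 2)
             / (math.gamma((1 + beta) / 2) * beta * 2 ** ((beta - 1) / 2))
             ) ** (1 / beta)
    u = random.gauss(0.0, sigma)
    v = random.gauss(0.0, 1.0)
    return u / abs(v) ** (1 / beta)

def new_cuckoo(x, best, scale=0.01):
    """Generate a new solution (cuckoo egg) by a Levy flight scaled by
    the difference from the current best nest."""
    return [xd + scale * levy_step() * (xd - bd)
            for xd, bd in zip(x, best)]
```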

    A Matlab implementation is given by the author, and can be downloaded.² CS is very efficient in solving engineering optimization problems (Gandomi et al., 2011d, 2012c).

    1.4.2.6 Bat Algorithm

    The bat algorithm (BA) is a relatively new metaheuristic, developed by Yang (2010b). It was inspired by the echolocation behavior of microbats. Microbats use a type of sonar, called echolocation, to detect prey, avoid obstacles, and locate their roosting crevices in the dark. These bats emit a very loud sound pulse and listen for the echo that bounces back from surrounding objects. Their pulses vary in properties and, depending on the species, can be correlated with their hunting strategies. Most bats use short, frequency-modulated signals that sweep through about an octave, while others more often use constant-frequency signals for echolocation. Their signal bandwidth varies depending on the species and is often increased by using more harmonics.
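    A minimal sketch of the frequency-tuned update used in the standard BA (the update equations themselves are not detailed in this excerpt): a randomly drawn frequency mimics the varying pulse properties described above and steers each bat toward the current best solution. The frequency range is an illustrative assumption, and a full BA would also adapt loudness and pulse emission rate.

```python
import random

def bat_update(x, v, x_best, f_min=0.0, f_max=2.0):
    """Frequency-tuned bat update: each bat draws a random frequency in
    [f_min, f_max], mimicking the varying pulse properties of
    echolocating bats, and moves toward the current best solution."""
    f = f_min + (f_max - f_min) * random.random()   # random frequency
    new_v = [vd + (xd - bd) * f for vd, xd, bd in zip(v, x, x_best)]
    new_x = [xd + vd for xd, vd in zip(x, new_v)]
    return new_x, new_v
```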

    BA has been extended to multiobjective bat algorithm (MOBA) by Yang (2011d), and preliminary results suggest that it is very efficient (Yang and Gandomi, 2012; Gandomi et al.,
