Parallel Computational Fluid Dynamics '99: Towards Teraflops, Optimization and Novel Formulations
By D. Keyes, A. Ecer, N. Satofuka
LIST OF PARTICIPANTS in Parallel CFD’99
Ramesh Agarwal, Wichita State University
Rakhim Aitbayev, University of Colorado, Boulder
Hasan Akay, IUPUI
Giorgio Amati, CASPUR
W. Kyle Anderson, NASA Langley Research Center
Rustem Aslan, Istanbul Technical University
Mike Ashworth, CLRC
Abdelkader Baggag, ICASE
Satish Balay, Argonne National Laboratory
Oktay Baysal, Old Dominion University
Robert Biedron, NASA Langley Research Center
George Biros, Carnegie Mellon University
Daryl Bonhaus, NASA Langley Research Center
Gunther Brenner, University of Erlangen
Edoardo Bucchignani, CIRA SCPA
Xing Cai, University of Oslo
Doru Caraeni, Lund Institute of Technology
Mark Carpenter, NASA Langley Research Center
Po-Shu Chen, ICASE
Jiadong Chen, IUPUI
Guilhem Chevalier, CERFACS
Stanley Chien, IUPUI
Peter Chow, Fujitsu
Mark Cross, University of Greenwich
Wenlong Dai, University of Minnesota
Eduardo D’Azevedo, Oak Ridge National Laboratory
Anil Deane, University of Maryland
Ayodeji Demuren, Old Dominion University
Jean-Antoine Désideri, INRIA
Boris Diskin, ICASE
Florin Dobrian, Old Dominion University
Akin Ecer, IUPUI
David Emerson, CLRC
Karl Engel, Daimler Chrysler
Huiyu Feng, George Washington University
Paul Fischer, Argonne National Laboratory
Randy Franklin, North Carolina State University
Martin Galle, DLR
Marc Garbey, University of Lyon
Alfred Geiger, HLRS
Aytekin Gel, University of West Virginia
Omar Ghattas, Carnegie Mellon University
William Gropp, Argonne National Laboratory
X. J. Gu, CLRC
Harri Hakula, University of Chicago
Xin He, Old Dominion University
Paul Hovland, Argonne National Laboratory
Weicheng Huang, UIUC
Xiangyu Huang, College of William & Mary
David Hysom, Old Dominion University
Cos Ierotheou, University of Greenwich
Eleanor Jenkins, North Carolina State University
Claus Jenssen, SINTEF
Andreas Kahari, Uppsala University
Boris Kaludercic, Computational Dynamics Limited
Dinesh Kaushik, Argonne National Laboratory
Shinichi Kawai, NASDA
David Keyes, Old Dominion University
Matthew Knepley, Purdue University
Suleyman Kocak, IUPUI
John Kroll, Old Dominion University
Stefan Kunze, University of Tubingen
Chia-Chen Kuo, NCHPC
Trond Kvamsdal, SINTEF
Lawrence Leemis, College of William & Mary
Wu Li, Old Dominion University
David Lockhard, NASA Langley Research Center
Josip Loncaric, ICASE
Lian Peet Loo, IUPUI
Isaac Lopez, NASA Glenn Research Center
Li Shi Luo, ICASE
James Martin, ICASE
Dimitri Mavriplis, ICASE
Peter McCorquodale, Lawrence Berkeley National Laboratory
James McDonough, University of Kentucky
Lois McInnes, Argonne National Laboratory
Piyush Mehrotra, ICASE
N. Duane Melson, NASA Langley Research Center
Razi Nalim, IUPUI
Eric Nielsen, NASA Langley Research Center
Stefan Nilsson, Chalmers University of Technology
Jacques Périaux, INRIA
Alex Pothen, Old Dominion University
Alex Povitsky, ICASE
Luie Rey, IBM Austin
Jacques Richard, Illinois State University, Chicago
Jacqueline Rodrigues, University of Greenwich
Kevin Roe, ICASE
Cord Rossow, DLR
David Rudy, NASA Langley Research Center
Jubaraj Sahu, Army Research Laboratory
John Salmon, California Institute of Technology
Widodo Samyono, Old Dominion University
Nobuyuki Satofuka, Kyoto Institute of Technology
Punyam Satya-Narayana, Raytheon
Erik Schnetter, University of Tübingen
Kara Schumacher Olson, Old Dominion University
John Shadid, Sandia National Laboratory
Kenjiro Shimano, Musashi Institute of Technology
Manuel Soria, Universitat Politècnica de Catalunya
Linda Stals, ICASE
Andreas Stathopoulos, College of William & Mary
Azzeddine Soulaimani, Ecole Supérieure
A. (Suga) Sugavanam, IBM Dallas
Samuel Sundberg, Uppsala University
Madhava Syamlal, Fluent Incorporated
James Taft, NASA Ames Research Center
Danesh Tafti, UIUC
Aoyama Takashi, National Aerospace Laboratory
Ilker Tarkan, IUPUI
Virginia Torczon, College of William & Mary
Damien Tromeur-Dervout, University of Lyon
Aris Twerda, Delft University of Technology
Ali Uzun, IUPUI
George Vahala, College of William & Mary
Robert Voigt, College of William & Mary
Johnson Wang, Aerospace Corporation
Tadashi Watanabe, JAERI
Chip Watson, Jefferson Laboratory
Peter Wilders, Technological University of Delft
Kwai Wong, University of Tennessee
Mark Woodgate, University of Glasgow
Paul Woodward, University of Minnesota
Yunhai Wu, Old Dominion University
Shishen Xie, University of Houston
Jie Zhang, Old Dominion University
Conference Photograph
PLENARY PAPERS
Parallel multigrid solution and optimization in compressible flow simulation and design
J.-A. Désidéri, L. Fournier, S. Lanteri and N. Marco, INRIA Projet Sinus, 2004 Route des Lucioles, 06902 Sophia-Antipolis Cedex
B. Mantel and J. Périaux, Dassault Aviation, France
J.F. Wang, Nanjing Institute of Aeronautics and Astronautics, China
This paper describes recent achievements regarding the development of parallel multi-grid (PMG) methods and parallel genetic algorithms (PGAs) in the framework of compressible flow simulation and optimization. More precisely, this work is part of a broad research objective aimed at building efficient and robust optimization strategies for complex multi-disciplinary shape-design problems in aerodynamics. Ultimately, such a parallel optimization technique should combine the following ingredients:
I1. parallel multigrid methods for the acceleration of the underlying flow calculations;
I2. domain decomposition algorithms for an efficient and mathematically well posed distribution of the global optimization process on a network of processors;
I3. robust parallel optimization techniques combining non-deterministic algorithms (genetic algorithms in the present context) and efficient local optimization algorithms to accelerate the convergence to the optimal solution;
I4. distributed shape parametrization techniques.
In this contribution, we address mainly topics I1 and I3 in the context of optimal airfoil design.
1 PARALLEL STRATEGIES IN GENETIC ALGORITHMS
1.1 Introduction
Genetic algorithms (GAs) are search algorithms based on mechanisms simulating natural selection; they rely on the analogy with Darwin's principle of survival of the fittest. John Holland, in the 1970s, introduced the idea that difficult optimization problems could be solved by such an evolutionary approach. The technique operates on a population of potential solutions represented by strings of binary digits (called chromosomes or individuals), which are submitted to several semi-stochastic operators (selection, crossover and mutation). The population evolves over the generations according to the fitness values of the individuals; when a stationary state is reached, the population has converged to an optimized solution (see [3] for an introduction to the subject). GAs differ from classical optimization procedures, such as the steepest descent or conjugate gradient methods, in many ways:
- the entire parameter set is coded;
- the iteration applies to an entire population of potential solutions, in contrast to classical algorithms, in which a single candidate solution is driven to optimality by successive steps;
- the iteration is an evolution step, or new generation, conducted by semi-stochastic operators;
- the search space is investigated (more) globally, enhancing robustness;
- two keywords are linked to GAs: exploration and exploitation. Exploration of the search space is important at the beginning of the GA process, while exploitation is desirable when the GA process is close to the global optimum.
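As a minimal illustration of these operators, the following sketch implements a binary-coded GA with binary tournament selection, one-point crossover and bit-flip mutation. All parameter values and the one-variable test problem are illustrative, not those used in the paper.

```python
import random

def decode(bits, lo, hi):
    """Map a binary chromosome to a real value in [lo, hi]."""
    x = int("".join(map(str, bits)), 2)
    return lo + (hi - lo) * x / (2 ** len(bits) - 1)

def evolve(fitness, n_bits=16, pop_size=30, generations=50,
           p_cross=0.8, p_mut=0.01, lo=-1.0, hi=1.0, seed=0):
    """Maximize `fitness` over [lo, hi] with a simple binary-coded GA."""
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(n_bits)] for _ in range(pop_size)]
    for _ in range(generations):
        scored = [(fitness(decode(c, lo, hi)), c) for c in pop]
        def tournament():
            a, b = rng.sample(scored, 2)   # binary tournament selection
            return list(max(a, b)[1])
        children = []
        while len(children) < pop_size:
            p1, p2 = tournament(), tournament()
            if rng.random() < p_cross:     # one-point crossover
                cut = rng.randrange(1, n_bits)
                p1, p2 = p1[:cut] + p2[cut:], p2[:cut] + p1[cut:]
            for c in (p1, p2):
                for i in range(n_bits):    # bit-flip mutation
                    if rng.random() < p_mut:
                        c[i] = 1 - c[i]
                children.append(c)
        pop = children[:pop_size]
    best = max(pop, key=lambda c: fitness(decode(c, lo, hi)))
    return decode(best, lo, hi)
```

For example, maximizing f(x) = 1 − x² over [−1, 1] drives the best individual toward x = 0; in a real design problem the chromosome instead encodes the shape parameters and the fitness evaluation is a full flow calculation.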
GAs were introduced into aerodynamic shape design problems about fifteen years ago (see Kuiper et al. [4], Périaux et al. [9], Quagliarella [10] and Obayashi [8], who presents 3D results for a transonic flow around a wing geometry). The main concern with the use of GAs for aerodynamic design is the computational effort needed for the accurate evaluation of a design configuration, which, in a crude application of the technique, may lead to unacceptable computer time compared with more classical algorithms. In addition, hard problems need larger populations, and this translates directly into higher computational costs. It is widely accepted that GAs can be effectively parallelized and can in principle take full advantage of (massively) parallel computer architectures. This point of view is above all motivated by the fact that, within a generation (iteration) of the algorithm, the fitness values associated with the individuals of the population can be evaluated in parallel.
In this study, we have developed a shape optimum design methodology that combines the following ingredients:
- the underlying flow solver discretizes the Euler or full Navier-Stokes equations using a mixed finite element/finite volume formulation on triangular meshes. Time integration to steady state is achieved using a linearized Euler implicit scheme which results in the solution of a linear system for advancing the solution to the next time step;
- a binary-coded genetic algorithm is used as the main optimization kernel. In our context, the population of individuals is represented by airfoil shapes. The shape parametrization strategy is based on Bézier curves.
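The Bézier parametrization can be sketched as follows: the chromosome encodes the ordinates of the control points (the abscissas are prescribed, as in Section 1.3), and a surface is recovered by evaluating the curve, e.g. with de Casteljau's algorithm. Control-point values here are purely illustrative.

```python
def bezier_point(control_pts, t):
    """Evaluate a Bezier curve at parameter t via de Casteljau's algorithm."""
    pts = [tuple(p) for p in control_pts]
    while len(pts) > 1:
        pts = [((1 - t) * x0 + t * x1, (1 - t) * y0 + t * y1)
               for (x0, y0), (x1, y1) in zip(pts, pts[1:])]
    return pts[0]

# A candidate upper surface: abscissas fixed, ordinates are the design
# variables the GA would encode (values here are illustrative only).
upper_ctrl = [(0.0, 0.0), (0.1, 0.06), (0.35, 0.09), (0.6, 0.07), (1.0, 0.0)]
surface = [bezier_point(upper_ctrl, i / 20) for i in range(21)]
```

Because the curve interpolates its end control points and stays within the convex hull of the others, small sets of control points yield smooth candidate shapes, which keeps the search space low-dimensional.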
1.2 Parallelization strategy
Several possible strategies can be considered for the parallelization of the GA-based shape design optimization described above:
- a first strategy stems from the following remark: within a given generation of a GA, the evaluations of the fitness values associated with the population of individuals are independent processes. This makes GAs particularly well suited for massively parallel systems; we also note that a parent/child approach is a standard candidate for the implementation of this first level of parallelism, especially when the size of the population is greater than the available number of processors;
- a second strategy consists of concentrating the parallelization effort on the process underlying a fitness evaluation, here the flow solver. This approach finds its main motivation in the fact that, when complex field analysers are used in conjunction with a GA, the aggregate cost of fitness evaluations can represent between 80% and 90% of the total optimization time. An SPMD paradigm is particularly well suited for the implementation of this strategy;
- the third option combines the above two approaches and yields the two-level parallelization strategy adopted here, which will be detailed in the sequel. Our choice has been motivated by the following remarks: (1) a parallel version of the two-dimensional flow solver was available and adapted to the present study; (2) we targeted a distributed-memory SPMD implementation and did not want the resulting optimization tool to be limited by memory-capacity constraints, especially since the present study will find its sequel in an adaptation to 3D shape optimization problems based on more complex aerodynamical models (Navier-Stokes equations coupled with a turbulence model); and (3) we believe that the adopted parallelization strategy will define a good starting point for the construction and evaluation of sub-population-based parallel genetic algorithms (PGAs).
In our context, the parallelization strategy adopted for the flow solver combines domain partitioning techniques and a message-passing programming model [6]. The underlying mesh is assumed to be partitioned into several submeshes, each one defining a subdomain. Basically, the same base serial code is executed within every subdomain. Applying this parallelization strategy to the flow solver results in modifications occurring in the main time-stepping loop, in order to take into account one or several assembly phases of the subdomain results. The coordination of subdomain calculations through information exchange at artificial boundaries is implemented using calls to functions of the MPI library.
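The assembly at artificial boundaries can be illustrated on a toy 1-D analogue: one Jacobi-type sweep for −u″ = f on a partitioned grid, where copying neighbour values into ghost cells stands in for the MPI exchange. This is a sketch of the mechanism only, not the paper's solver.

```python
def jacobi_sweep(u, f, h):
    """One Jacobi sweep for -u'' = f on the interior points of a 1-D grid."""
    v = u[:]
    for i in range(1, len(u) - 1):
        v[i] = 0.5 * (u[i - 1] + u[i + 1] + h * h * f[i])
    return v

def partitioned_sweep(u, f, h, n_sub):
    """Same sweep on n_sub subdomains, each with one ghost cell per
    artificial boundary; the ghost-cell copy stands in for an MPI exchange."""
    n = len(u)
    size = (n - 2) // n_sub               # interior points per subdomain
    new = u[:]
    for s in range(n_sub):
        lo = 1 + s * size
        hi = lo + size if s < n_sub - 1 else n - 1
        local = u[lo - 1:hi + 1]          # local values + ghosts (the "exchange")
        for k, i in enumerate(range(lo, hi), start=1):
            new[i] = 0.5 * (local[k - 1] + local[k + 1] + h * h * f[i])
    return new
```

By construction the partitioned sweep reproduces the serial sweep exactly, which is the property the assembly phases in the time-stepping loop must preserve.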
The parallelization described above aims at reducing the cost of the fitness evaluation for a given individual. However, another level of parallelism can clearly be exploited here, directly related to the binary tournament approach and the crossover operator. In practice, during each generation, individuals of the current population are treated pairwise; this applies to the selection, crossover, mutation and fitness evaluation steps. The main remark is that, for this last step, the evaluations of the fitness functions associated with the two selected individuals are independent operations. We have chosen to exploit this fact using the notion of process groups, one of the main features of the MPI environment. Two groups are defined, each containing the same number of processes; this number is given by the number of subdomains in the partitioned mesh. Each group is then responsible for the evaluation of the fitness function for a given individual. We note in passing that such an approach based on process groups will also be interesting in the context of sub-population-based PGAs (see [1] for a review of the subject); this will be considered in future work.
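The assignment of processes to groups can be sketched as follows; this mirrors what MPI_Comm_split would produce with color = rank // group_size (a plausible implementation of the grouping, not necessarily the authors' exact one).

```python
def group_layout(n_procs, n_groups):
    """Assign each MPI rank a (group, rank-within-group) pair, as
    MPI_Comm_split would with color = rank // group_size."""
    assert n_procs % n_groups == 0, "groups must be equally sized"
    group_size = n_procs // n_groups      # = number of subdomains per group
    return [(rank // group_size, rank % group_size) for rank in range(n_procs)]
```

For instance, `group_layout(4, 2)` yields two groups of two processes each; each group then runs the partitioned flow solver on its own individual, so two fitness evaluations proceed concurrently.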
1.3 An optimum shape design case
The method has been applied to a direct optimization problem: designing the shape of an airfoil, symbolically denoted by γ, to reduce the shock-induced drag, CD, while preserving the lift, CL, corresponding to the RAE2822 airfoil immersed in an Eulerian flow at 2° of incidence and a freestream Mach number of 0.73. Thus, the cost functional was given the following form:
(1)
The non-linear convergence tolerance has been fixed at 10−6. The computational mesh consists of 14747 vertices (160 vertices on the airfoil) and 29054 triangles. Here, each chromosome represents a candidate airfoil defined by a Bézier spline whose support is made of 7+7 control points at prescribed abscissas for the upper and lower surfaces. A population of 30 individuals has been considered. After 50 generations, the shape has evolved and the shock has been notably reduced; the initial and final flows (iso-Mach values) are shown in Figure 1. Additionally, initial and final values of CD and CL are given in Table 1.
Table 1
Drag reduction: initial and optimized values of the CD and CL coefficients
Figure 1 Drag reduction: initial and optimized flows (steady iso-Mach lines)
The calculations have been performed on the following systems: an SGI Origin 2000 (equipped with 64 MIPS R10000/195 MHz processors) and an experimental Pentium Pro (P6/200 MHz, running Linux) cluster whose interconnection is realized through Fast Ethernet (100 Mbit/s) switches. The native MPI implementation has been used on the SGI Origin 2000 system, while MPICH 1.1 has been used on the Pentium Pro cluster. Performance results are given for 64-bit arithmetic computations.
We compare timing measurements for the overall optimization using one and two process groups. Timings are given for a fixed number of generations (generally 5 optimization iterations). In Tables 2 and 3 below, Ng and Np respectively denote the number of process groups and the total number of processes (Ng = 2 and Np = 4 means 2 processes for each of the two groups), CPU is the total CPU time, Flow is the accumulated flow solver time, and Elapsed is the total elapsed time (the distinction between the CPU and elapsed times is particularly relevant for the Pentium Pro cluster). Finally, S(Np) is the parallel speed-up ratio Elapsed(Ng = 1, Np = 5)/Elapsed(Ng, Np), the case Ng = 1, Np = 5 serving as a reference. For the multiple-process cases, the given timing measures (CPU and Flow) always correspond to the maximum value over the per-process measures.
Table 2
Parallel performance results on the SGI Origin 2000
Table 3
Parallel performance results on the Pentium Pro cluster
For both architectures, the optimal speed-up of 2 is close to being achieved. For the Pentium Pro cluster, the communication penalty is larger, and this favors the usage of fewer processors and more groups. For the SGI Origin 2000, the situation is different: communication only involves memory access (shared memory), and parallelization remains efficient as the number of processors increases; moreover, additional gains are achieved due to the larger impact of cache memory when subdomains are small.
1.4 High-lift multi-element airfoil optimization by GAs and multi-agent strategies
In this section, we report on numerical experiments conducted to optimize the configuration (shape and position) of a high-lift multi-element airfoil, by both conventional GAs and more novel ones based on multi-agent strategies better suited to parallel computation. These experiments are also described, in part, in [7].
The increased performance requirements for high-lift systems, as well as the availability of (GA-based) optimization methods, tend to renew the emphasis on multi-element aerodynamics. High-lift systems, as depicted in Figure 2, consist of a leading-edge device (slat), whose effect is to delay the stall angle, and a trailing-edge device (flap), whose effect is to increase the lift while maintaining a high L/D ratio.
Figure 2 Geometry of multi-element airfoil including slat, main body and flap (shape and position definition)
The lift coefficient CL of such an airfoil is very sensitive to the flow features around each element and its position relative to the main body; in particular, the location of the separation point can change rapidly due to the wake/boundary-layer interaction. As a result, the functional is non-convex and presents several local optima, making a robust algorithm necessary for a successful optimization.
Here, the 2D flow computation is conducted by the Dassault Aviation code Damien, which combines an inviscid flow calculation by a panel method with a wake/boundary-layer interaction evaluation. This code, which incorporates data concerning transition criteria, separated zones, and wake/boundary-layer interaction, has been thoroughly calibrated and globally validated through comparisons with ONERA wind-tunnel measurements. As a result, viscous effects can be computed, and this provides, at very low cost, a fairly accurate determination of the aerodynamic coefficients. The counterpart of this simplified numerical model is that the flow solver is non-differentiable and can only be treated as a black box in the optimization process. Evolutionary algorithms are thus a natural choice to conduct the optimization in a situation of this type.
Figure 3 relates to a first experiment in which only the 6 parameters determining the positions relative to the main body (deflection angle, overlap and gap) of the two high-lift devices (slat and flap) have been optimized by a conventional GA, similar to the one of the previous section. Provided reasonable ranges are given for each parameter, an optimum solution is successfully found by the GA, corresponding to an improved lift coefficient of 4.87.
Figure 3 Initial (top) and final (bottom) configurations in position optimization problem
The second experiment is a first step towards the optimization of the entire configuration, consisting of the shapes of the three elements and the positions of both slat and flap. More precisely, only the inverse problem of reconstructing the pressure field is considered here. If γS, γB and γF denote the design variables associated with the slat, main body and flap respectively, and γ = (γS, γB, γF) represents a candidate configuration, one minimizes the following functional:
J(γ) = JS(γ) + JB(γ) + JF(γ)    (2)
in which, for example:
JS(γ) = ∫γS |P(γ) − Pt|² dσ    (3)
is a positive integral extending over one element only (slat) but reflecting in the integrand the interaction of the whole set of design variables. Here, Pt is the given target pressure; similar definitions are made for JB(γ) and JF(γ).
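In a panel-method setting, such a functional is naturally approximated by a panel-weighted sum of squared pressure mismatches. The following discrete analogue uses toy data (pressures and panel lengths are purely illustrative):

```python
def element_functional(p, p_target, ds):
    """Discrete analogue of J_S: sum of squared pressure mismatches over
    the panels of one element, weighted by panel length ds."""
    return sum(w * (a - b) ** 2 for a, b, w in zip(p, p_target, ds))

# Toy data: three panels on the slat (values purely illustrative)
p        = [1.02, 0.95, 0.88]
p_target = [1.00, 0.95, 0.90]
ds       = [0.01, 0.02, 0.01]
J_S = element_functional(p, p_target, ds)
```

Each term is non-negative, so the functional vanishes exactly when the computed pressure matches the target on every panel; this is the property used in the last remark of this section, where driving every positive component to 0 recovers the global optimum.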
In order to reduce the number of design variables and enforce smoothness of the geometry, shapes are represented by Bézier curves.
This inverse problem has been solved successfully by the GA, first by a global algorithm in which the chromosomes contain the coded information associated with all the design variables indiscriminately. The convergence history of this experiment is shown in Figure 4 for the first 200 generations.
Figure 4 Optimization of shape and position design variables by various strategies all involving the same number of cost functional evaluations
An alternative to this approach is provided by optimization algorithms based on (pseudo-)Nash strategies, in which the design variables are partitioned a priori into appropriate subsets. The population of a given subset evolves independently at each new generation according to its own GA, with the remaining design variables held fixed, equal to the best elements found in their respective populations at the previous generation. Evidently, in such an algorithm, the computational tasks of the different GAs, or players to use a term from game theory, can be performed in parallel [11].
Many different groupings of the design variables can be considered, two of which are illustrated in Figure 4: a 3-player strategy (slat shape and position; body shape; flap shape and position) and a 5-player strategy (slat, body and flap shapes; slat and flap positions). The two algorithms achieve about the same global optimum, but the parameters of the GAs (population size) have been adjusted so that the endpoints of the two convergence paths correspond to the same number of functional evaluations as 100 generations of the global algorithm. Thus, this experiment simulates a comparison of algorithms at equal serial cost. It demonstrates the effectiveness of the multi-agent approach, which achieves the global optimum by computations that could evidently be performed in parallel.
We terminate this section by two remarks concerning Nash strategies.
First, in a preliminary experiment related to the slat/flap position inverse problem, a 2-player game in which the main body (γB) is fixed, an attempt was made to evaluate the fitness of the population of γS against the current-generation γF rather than the best γF found at the previous generation, and symmetrically for γF. Figure 5 indicates that, in such a case, the algorithm fails to achieve the desired global optimum.
Figure 5 Effect of the fitness definition on the convergence of slat/flap position parameters
Second, observe that in the case of a general cost function, a Nash equilibrium in which a minimum is found with respect to each subgroup of variables, the other variables being held fixed, does not necessarily realize a global minimum. For example, in the trivial case of a function f(x, y) of two real variables, a standard situation in which the partial functions:
x ↦ f(x, y*) and y ↦ f(x*, y)    (4)
achieve local minima at x* and y* respectively, is realized typically when:
∂f/∂x(x*, y*) = 0, ∂²f/∂x²(x*, y*) > 0, ∂f/∂y(x*, y*) = 0, ∂²f/∂y²(x*, y*) > 0    (5)
and this does not imply that the Hessian matrix be positive definite. However, in the case of an inverse problem in which each individual positive component of the cost function is driven to 0, the global optimum is indeed achieved.
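A concrete instance of this remark, using the standard counterexample f(x, y) = x² + y² + 3xy (our choice, not from the paper): both one-variable restrictions are minimized at the origin, so (0, 0) is a Nash equilibrium of the 2-player game, yet the Hessian [[2, 3], [3, 2]] has eigenvalues 5 and −1 and the origin is a saddle point.

```python
def f(x, y):
    # Each restriction x -> f(x, 0) = x**2 and y -> f(0, y) = y**2 is
    # minimized at 0, so (0, 0) is a Nash equilibrium -- yet it is a
    # saddle point of f, not a minimum: joint moves can decrease f.
    return x * x + y * y + 3 * x * y

assert f(0.1, 0.0) > f(0.0, 0.0)    # player 1 cannot improve alone
assert f(0.0, 0.1) > f(0.0, 0.0)    # player 2 cannot improve alone
assert f(0.1, -0.1) < f(0.0, 0.0)   # but a joint move decreases f
```

This is exactly the gap between condition (5) and positive definiteness of the Hessian; for an inverse problem whose cost is a sum of terms each driven to 0, the gap closes and the Nash point is the global optimum.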
2 PARALLEL MULTIGRID ACCELERATION
2.1 Introduction
Clearly, reducing the time spent in flow calculations is crucial to making GAs a viable alternative to other optimization techniques. One possible strategy to achieve this goal consists in using a multigrid method to accelerate the solution of the linear systems resulting from the linearized implicit time integration scheme. As a first step, we have developed parallel linear multigrid algorithms for the acceleration of compressible steady flow calculations, independently of the optimization framework. This is justified by the fact that the flow solver is mainly used as a black box by the GA. The starting point is an existing flow solver based on the averaged compressible Navier-Stokes equations, coupled with a k–ε turbulence model [2]. The spatial discretization combines finite element and finite volume concepts and is designed for unstructured triangular meshes.
Steady state solutions of the resulting semi-discrete equations are obtained by using an Euler implicit time advancing strategy which has the following features: linearization (approximate linearization of the convective fluxes and exact differentiation of the viscous terms); preconditioning (the Jacobian matrix is based on a first-order Godunov scheme); and local time stepping and CFL law (a local time step is computed on each control volume). Each pseudo time step requires the solution of two sparse linear systems (respectively, for the mean flow variables and for the variables associated with the turbulence model).
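The structure of one such pseudo time step can be sketched on a model problem with a linear residual R(u) = Au, a toy stand-in for the linearized fluxes: solve (I/Δt − A) Δu = R(u) and update. In the real solver A is the sparse first-order Jacobian, the system is solved by the multigrid method below, and Δt varies per control volume via the CFL law; here a dense solve and a global Δt suffice for illustration.

```python
import numpy as np

def implicit_step(u, A, dt):
    """One linearized Euler implicit step for du/dt = R(u) = A @ u:
    solve (I/dt - A) du = R(u), then update u <- u + du."""
    n = len(u)
    residual = A @ u
    du = np.linalg.solve(np.eye(n) / dt - A, residual)
    return u + du

# Stable model problem u' = -u (A = -I): implicit Euler halves u each
# step at dt = 1 and remains stable however large the time step.
A = -np.eye(2)
u = np.array([1.0, 2.0])
for _ in range(100):
    u = implicit_step(u, A, dt=1.0)
```

The unconditional stability of this implicit update is what allows the local time steps to grow with the CFL law as the steady state is approached.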
The multigrid strategy is adopted to gain efficiency in the solution of the two subsystems. In the present method, the coarse-grid approximation is based on the construction of macro-elements, more specifically macro control-volumes obtained by volume agglomeration. Starting from the finest mesh, a greedy coarsening algorithm is applied to generate the coarse discretizations automatically (see Lallemand et al. [5]).
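A serial sketch of such a greedy agglomeration on a small control-volume adjacency graph (the algorithm of [5] operates on the finite-volume dual mesh; this toy version captures only the seed-and-merge idea):

```python
def agglomerate(adjacency):
    """Greedy volume agglomeration: pick an unassigned control volume as
    a seed, merge it with its unassigned neighbours into one macro
    control-volume, and repeat over the fine grid."""
    macro_of = {}
    macros = []
    for seed in adjacency:                 # fine-grid visit order
        if seed in macro_of:
            continue
        group = [seed] + [n for n in adjacency[seed] if n not in macro_of]
        for v in group:
            macro_of[v] = len(macros)
        macros.append(group)
    return macros

# 1-D chain of 6 control volumes: 0-1-2-3-4-5
adj = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3, 5], 5: [4]}
coarse = agglomerate(adj)                  # -> [[0, 1], [2, 3], [4, 5]]
```

Repeating the procedure on the macro-volume graph yields the grid hierarchy; the parallel variant discussed next must additionally agree on macro-volumes that straddle subdomain interfaces.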
Parallelism is introduced in the overall flow solver by using a strategy that combines mesh partitioning techniques and a message-passing programming model. The MPI environment is used for the implementation of the required communication steps. Both the discrete flux calculations and the linear system solutions are performed on a submesh basis; in particular, for the basic linear multigrid algorithm, which is multiplicative (i.e. the different levels are treated in sequence, with inter-dependencies between the partial results produced on the different levels), this can be viewed as an intra-level parallelization which concentrates on the smoothing steps performed on each member of the grid hierarchy. A necessary and important step in this adaptation was the construction of appropriate data structures for the distribution of coarse-grid calculations. Here, this has been achieved by developing a parallel variant of the original greedy-type coarsening algorithm, which now includes additional communication steps for a coherent construction of the communication data structures on the partitioned coarse