Sparse Matrix Technology
Ebook · 649 pages · 6 hours

About this ebook

Sparse Matrix Technology presents the concepts, methods, and applications of sparse matrix technology, together with the procedures and techniques needed to apply it in software development. The book covers the storage schemes and computational techniques on which sparse matrix technology rests; sparse matrix methods and algorithms for the direct solution of linear equations; and algorithms for a variety of related purposes. Engineers, programmers, analysts, teachers, and students in the computer sciences will find the book of interest.
Language: English
Release date: Jun 28, 2014
ISBN: 9781483270401

    Book preview

    Sparse Matrix Technology - Sergio Pissanetzky

    Introduction

    Information from many fields of human activity is frequently represented in the form of matrices. A matrix is a regular array of numbers. There are several definitions of sparse matrix in the specialized literature. All have in common the concept that a matrix is sparse when many of its elements are equal to zero. Some authors use the concept of limit: a matrix of order n is sparse when it has O(n) nonzeros, which means that the number of elements different from zero is proportional to n for n sufficiently large. A given matrix, however, has a given n rather than a sufficiently large n. The definition is useful only for theoretical purposes, such as trying to assess the asymptotic behaviour of an algorithm. Brayton et al. (1970) suggest a constant number of nonzero entries per row, typically 2 to 10. An alternative was proposed by Alvarado (1979): for a matrix of order n to be sparse, the number of nonzeros must be n^(1+γ), where γ < 1. Typical values of γ are 0.2 for electrical problems or 0.5 for a band matrix associated with a grid. A matrix of order 1000 with γ = 0.9 has 501 187 nonzeros, and one may wonder whether such a matrix should be considered as sparse or not.

    A more practical approach to the definition of a sparse matrix relies on the interplay of the three fundamental ingredients: the matrix, the algorithm and the computer. The concept is heuristic in nature: a matrix is sparse when it is worthwhile to take explicit advantage of the existence of many zeros. Any sparse matrix can be processed as if it were dense, and, conversely, any dense or full matrix can be processed by the sparse matrix algorithms. The same numerical results would be obtained in both cases but at a higher computational cost. Attributing the property of sparseness to a matrix is equivalent to contending that an algorithm exists which takes advantage of sparsity and renders computation with the matrix cheaper than otherwise.

    A sparse matrix can be seen as an array of numbers which is generally not regular; we associate the numbers with the positions of a much larger regular array because we are used to thinking in terms of regular arrays and to employing them as a frame of reference for developing algorithms. Consider the following system of eight linear equations with eight unknowns:

    The following matrix of coefficients is associated with the system:

    The system has only 16 coefficients, but we are using 64 entries! From this viewpoint, it even seems quite unnatural to fill the irrelevant positions with zeros and then try to take advantage of the existence of so many zeros. Rigorously speaking, a sparse matrix should not be thought of as a matrix at all, but rather as a graph, where each equation–unknown pair is associated with a vertex and each coefficient with an edge. This is why graph theory plays such an important role in Sparse Matrix Technology, and why Sparse Matrix Technology itself originated. A sparse matrix, being a set of numbers lacking regularity, cannot be represented in the memory of a computer in the same simple way as full matrices are. If we store the numerical values of the coefficients of our system of equations, we must also store, for each coefficient, the number of its equation and the number of its unknown. In sparse matrix jargon, we say that we store the values of the nonzeros plus indexing information telling where each nonzero belongs in the regular array. This additional information constitutes an overhead and is the price paid for avoiding storing the zeros.
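
    By way of illustration, here is a minimal sketch of such a storage scheme in free-form Fortran, with hypothetical data (this is not the book's example): three parallel arrays hold, for each nonzero, its equation number, its unknown number and its value.

        ! hypothetical sketch: store only the nonzeros, plus the
        ! indexing information telling where each nonzero belongs
        ! in the regular array
        program coo_storage
          implicit none
          integer, parameter :: nnz = 4                     ! number of nonzeros
          integer :: irow(nnz) = (/ 1, 1, 2, 8 /)           ! equation numbers
          integer :: icol(nnz) = (/ 1, 4, 2, 8 /)           ! unknown numbers
          real    :: an(nnz)   = (/ 1.0, -2.0, 3.0, 5.0 /)  ! coefficient values
          integer :: k
          do k = 1, nnz
            print *, 'a(', irow(k), ',', icol(k), ') =', an(k)
          end do
        end program coo_storage

    The two integer arrays are precisely the overhead mentioned above: two indices of indexing information accompany each stored value.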

    Matrix algorithms must be devised in such a way that only nonzeros are processed, and that irrelevant operations like addition with or multiplication by zero are avoided by taking advantage of the previous knowledge of the positions of the nonzeros. The number of operations performed by the computer during execution of the algorithm is thus proportional to the number of nonzeros, rather than to the number of elements of the matrix. Notice that it would not be correct to store all the elements including zeros and then to skip operations involving zeros by means of an IF statement. The IF statement would be unnecessarily executed n² times or more, making the algorithm quadratic in the order n of the matrix. A good sparse matrix algorithm uses the knowledge it has of the positions of the nonzeros for performing only the necessary operations. The order of a good algorithm is in many cases as low as n when the sparse matrix has a constant number of nonzeros per row.
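
    Continuing the sketch above (hypothetical data and names), a matrix–vector product computed from this storage visits only the stored nonzeros, so the operation count is proportional to their number and no test over the n² positions of the full array is ever executed:

        ! hypothetical sketch: y = A*x using only the stored nonzeros
        program sparse_product
          implicit none
          integer, parameter :: n = 8, nnz = 4
          integer :: irow(nnz) = (/ 1, 1, 2, 8 /)
          integer :: icol(nnz) = (/ 1, 4, 2, 8 /)
          real    :: an(nnz)   = (/ 1.0, -2.0, 3.0, 5.0 /)
          real    :: x(n), y(n)
          integer :: k
          x = 1.0                      ! some right-hand vector
          y = 0.0
          do k = 1, nnz                ! exactly nnz multiply-adds,
            y(irow(k)) = y(irow(k)) + an(k)*x(icol(k))   ! whatever n is
          end do
          print *, y
        end program sparse_product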

    A sparse matrix which represents given information has a certain given sparsity. However, when the sparse matrix is generated by an algorithm as an intermediate result during a more extensive reckoning, we may ask whether ways exist for improving sparsity by generating fewer nonzeros. The most important case where the answer is yes is Gauss elimination. Gauss elimination impairs the sparsity of the original matrix by introducing new nonzeros. The resulting matrix is less sparse than the original one, but the fill-in depends drastically on the order in which the pivots are chosen. A good sparse matrix algorithm tries to preserve sparsity by keeping the fill-in as low as possible.

    We have thus stated the three leading concepts which have guided the development of most of Sparse Matrix Technology: to store only the nonzeros, to operate only on the nonzeros, and to preserve sparsity. Of course not every sparse matrix algorithm achieves these ends. Only the more sophisticated ones do. Many storage schemes admit a certain proportion of zeros in storage, and the algorithms process them as if they were nonzeros. An algorithm which stores and processes fewer zeros is more complicated and difficult to program, and is convenient only when the matrix is sufficiently large. There exists a whole range of algorithms, from full matrix to strictly sparse matrix algorithms, thus providing the various degrees of sophistication, simplicity and efficiency necessary in practice.

    CHAPTER 1

    Fundamentals

    Publisher Summary

    This chapter discusses certain computational techniques of sufficiently general use to be considered as fundamentals of sparse matrix technology. The chapter discusses structures and internal representations used for storing lists of numbers, graphs, and various types of sparse matrices and sparse block-partitioned matrices. Symbolic and numerical processing of sparse matrices and dynamic storage allocation are examined. Also, the computational algebra of sparse vectors is discussed, as a row of a sparse matrix is a sparse vector and most of sparse matrix algebra requires that operations be performed with sparse vectors. Sparse matrix technology frequently requires the storage and manipulation of lists of items, where an item may be an integer, a real or complex number, or an entity having a more complicated structure such as a matrix, an array or a vertex of a graph together with the corresponding edges or branching information. The selection of a storage scheme depends on the operations to be performed, as the effectiveness of a certain operation may vary widely from one storage scheme to another.

    1.1. Introduction

    1.2. Storage of arrays, lists, stacks and queues

    1.3. Storage of lists of integers

    1.4. Representation and storage of graphs

    1.5. Diagonal storage of band matrices

    1.6. Envelope storage of symmetric matrices

    1.7. Linked sparse storage schemes

    1.8. The sparse row-wise format

    1.9. Ordered and unordered representations

    1.10. Sherman’s compression

    1.11. Storage of block-partitioned matrices

    1.12. Symbolic processing and dynamic storage schemes

    1.13. Merging sparse lists of integers

    1.14. The multiple switch technique

    1.15. Addition of sparse vectors with the help of an expanded real accumulator

    1.16. Addition of sparse vectors with the help of an expanded integer array of pointers

    1.17. Scalar product of two sparse vectors with the help of an array of pointers

    1.1 Introduction

    In this chapter we discuss certain computational techniques of sufficiently general use to be considered as Fundamentals of Sparse Matrix Technology. The chapter begins with a description of structures and internal representations used for storing lists of numbers, graphs, and various types of sparse matrices and sparse block-partitioned matrices. The aim is to introduce precisely those ideas which are relevant to the subject and to illustrate them with simple examples. Symbolic and numerical processing of sparse matrices and dynamic storage allocation are examined. Finally, the computational algebra of sparse vectors is discussed, since a row of a sparse matrix is a sparse vector and most of sparse matrix algebra requires that operations be performed with sparse vectors.

    1.2 Storage of arrays, lists, stacks and queues

    Sparse matrix technology frequently requires the storage and manipulation of lists of items, where an item may be an integer, a real or complex number, or an entity having a more complicated structure such as a matrix, an array or a vertex of a graph together with the corresponding edges or branching information. Examples of operations commonly performed with lists are: adding an item at the end of the list, deleting an item from the end of the list, inserting or deleting an item in the middle or at the beginning of the list, finding the position of a certain item or of the item that follows it, sorting, ordering, etc. The selection of a storage scheme depends on the operations to be performed, since the effectiveness of a certain operation may vary widely from one storage scheme to another.

    The simplest data structure is the array, with which we shall assume the reader to be sufficiently familiar. Examples are A(I), B(I, J), etc. Numbers can be directly stored in an array. Alternatively, an array may contain pointers to items of a more complex nature which are actually stored elsewhere. All elements of an array are directly accessible in a time which is independent of the size of the array. However, it must be borne in mind that the computer memory is one-dimensional, and therefore the use of double or multiple indices must be paid for. It should also be noted that, in virtual memory machines, an array may be stored in peripheral storage and may not be readily accessible.

    A linear linked list is a set of cells linked together in some order. Each cell contains an item of the list and a pointer which indicates where the next cell is located. As an example, consider that we wish to store the numbers a, b, c and d, in that order, in an array A(I). The storage could look as follows (one possible layout, with arrays of length 8), where x indicates irrelevant values:

        position I :  1  2  3  4  5  6  7  8
        A(I)       :  x  b  x  d  a  x  c  x
        NEXT(I)    :  x  7  x  0  2  x  4  x

        IP = 5

    The array A(I) contains the actual items, while NEXT(I) indicates the position of the next item. A list head IP which points to the position of the first item is also necessary. In this case the head is 5. At position 5 we find the first item A(5) = a, and NEXT(5) = 2 indicates that the next item is to be found at position 2. In this way the list can be followed. A terminator has to be stored in the last cell to indicate where the list terminates; in the example above 0 is used as the terminator. Alternatively, we may keep in storage the total number of items in the list, and use this number to find out when the list is exhausted. An empty list can be conveniently indicated by a list head which points to no position in the arrays, for example a nonpositive number.
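
    As a sketch (assumed free-form Fortran, consistent with the layout shown above), the list is followed from its head as:

        ! assumed sketch: traverse the list a, b, c, d; the irrelevant
        ! cells are simply set to 0 here
        program traverse
          implicit none
          character :: a(8)    = (/ 'x', 'b', 'x', 'd', 'a', 'x', 'c', 'x' /)
          integer   :: next(8) = (/  0,   7,   0,   0,   2,   0,   4,   0  /)
          integer   :: ip, i
          ip = 5                      ! the list head: first item at position 5
          i = ip
          do while (i /= 0)           ! 0 is the terminator
            print *, a(i)             ! prints a, b, c, d in order
            i = next(i)
          end do
        end program traverse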

    Items can be inserted or deleted at any point in a simple manner. For example, let us assume that the number e has to be inserted between b and c. Let us also assume that we know that cell 3 is empty and that b is at position 2. The following procedure will do the job:
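
        ! a sketch of the procedure, in the array notation used above
        A(3) = e              ! place the new item in the empty cell
        NEXT(3) = NEXT(2)     ! e inherits b's successor, which is c
        NEXT(2) = 3           ! b now points to e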

    The procedure to delete an element, say c from position 7 of the original linked list, is even simpler:
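
        ! a sketch: bypass position 7, where c is stored
        NEXT(2) = NEXT(7)     ! b now points directly to d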

    Of course, what we really need to know is that the item preceding c is at position 2, so that this procedure actually deletes "the item which is after b" rather than c. If an item must be inserted before the first item, or the first item must be deleted, the head has to be redefined. Inserting or deleting items does not modify the order in which the remaining items are stored.

    When lists are stored in arrays, it is important to keep track of the positions of the array which are empty. A common practice is just to link them together forming another list, which in turn requires another list head. The two lists are certainly disjoint, and can thus be stored in the same arrays. The following data structure results when the empty positions of our example are linked (here in one possible order) and IE is the head of the new list:

        position I :  1  2  3  4  5  6  7  8
        A(I)       :  x  b  x  d  a  x  c  x
        NEXT(I)    :  3  7  6  0  2  8  4  0

        IP = 5,  IE = 1

    The procedures for inserting or deleting an item now become slightly more complicated and are left to the reader as an exercise.

    The linked list becomes a circular linked list when in the last position we store the pointer to the initial position instead of the terminator. A circular list does not have a beginning or an end, but it still requires a list head separately stored, which may now point to any of the occupied positions. In our example, we would have NEXT(4) = 5, so a comes after d, and the head could be 7, in which case the list would be c, d, a, b. Circular lists do not require a terminator, the end of the list being recognized by the fact that 7, the value of the head, is found in NEXT(2). Elements can be deleted or added, order being preserved, and empty positions can be linked together for a circular list in the same way as for a linear list.

    A bidirectional linked list, either linear or circular, is obtained by adding another array which contains, for each cell, the location of the preceding cell. A bidirectional list can be traversed in both directions, and has the advantage that an item can be inserted or deleted without knowledge of the location of the preceding item. A bidirectional list requires two list heads if it is linear, one pointing to its beginning and the other to its end. When the bidirectional list is circular, a single list head is sufficient. Empty locations in a bidirectional list can be linked together, although it will seldom be necessary to use a bidirectional list for this purpose.
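
    As a sketch, assuming the additional array is called PRED(I) and the item to be deleted sits at an interior position I, the deletion requires no search for the preceding item:

        NEXT(PRED(I)) = NEXT(I)     ! the predecessor skips over item I
        PRED(NEXT(I)) = PRED(I)     ! the successor points back past item I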

    A stack is a list stored and manipulated in a simplified manner (Aho et al., 1976). In a stack items are stored in consecutive locations, no linkage thus being necessary. A pointer is used to point to the location of the last item, the top of the stack. Stacks are used in situations where items should be added or deleted only at the top of the stack. To add or push a new item onto the stack, increase the pointer by one unit, check whether sufficient storage is available, and then store the item in the location to which the pointer points. To delete or pop the last item from the top of the stack, just decrease the pointer by one unit. An empty stack is recognized by the fact that the pointer contains 0. Stacks are used in several sparse matrix algorithms, examples being the tree partitioning algorithm of George to be discussed in Section 4.9, and Tarjan’s algorithm for block triangularization of a matrix, discussed in Chapter 5.
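
    A minimal sketch of the two operations (hypothetical names, fixed maximum size):

        ! hypothetical sketch: a stack in consecutive locations, with
        ! TOP pointing to the last item; TOP = 0 means the stack is empty
        program stack_demo
          implicit none
          integer :: stack(100), top, item
          top = 0                          ! the stack starts empty
          top = top + 1                    ! push: advance the pointer,
          if (top > 100) stop 'overflow'   ! check that storage remains,
          stack(top) = 42                  ! and store the new item
          item = stack(top)                ! pop: fetch the top item
          top = top - 1                    ! and retreat the pointer
          print *, item, top
        end program stack_demo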

    A queue is a list of items stored in consecutive locations, in which items are added only at one end, the rear, and deleted only from the other end, the front (Aho et al., 1976). Two pointers are used to indicate the locations of the front and rear of the queue. An empty queue is recognized by the fact that the front pointer points to the location immediately after the one indicated by the rear pointer. When a queue has a single item, both pointers indicate the same location. When the available storage length is exhausted, new items can be stored at the beginning of the storage, treating it as circular, provided the rear does not run into the front.
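
    A minimal sketch consistent with this convention (hypothetical names; an addition is refused when it would make the queue look empty, which is how the rear is kept from running into the front):

        ! hypothetical sketch: a queue in storage treated as circular
        program queue_demo
          implicit none
          integer, parameter :: n = 8
          integer :: q(n), front, rear, item
          front = 1                   ! FRONT just after REAR:
          rear  = 0                   ! the queue starts empty
          ! add an item at the rear, wrapping around the end of storage
          if (mod(mod(rear, n) + 1, n) + 1 == front) stop 'full'
          rear = mod(rear, n) + 1
          q(rear) = 42
          ! delete an item from the front
          if (front == mod(rear, n) + 1) stop 'empty'
          item = q(front)
          front = mod(front, n) + 1
          print *, item
        end program queue_demo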

    All the storage schemes just discussed are based on the array data structure, the only one supported by Fortran. The elegant properties of linked lists can also be implemented in an efficient way using the record data structure supported, for example, by Pascal and Algol W. This scheme does not need indirect addressing and can allocate storage dynamically, at the price of some extra system storage. Some aspects of using the record data structure for sparse matrices have been outlined by N. Houbak (1981), and will be further examined in Section
