27.4: Fill-in and Reordering

Last updated
Save as PDF

Page ID: 55707

Masayuki Yano, James Douglass Penn, George Konidaris, & Anthony T Patera
Massachusetts Institute of Technology via MIT OpenCourseWare

\( \newcommand{\vecs}[1]{\overset { \scriptstyle \rightharpoonup} {\mathbf{#1}} } \) \( \newcommand{\vecd}[1]{\overset{-\!-\!\rightharpoonup}{\vphantom{a}\smash {#1}}} \)\(\newcommand{\id}{\mathrm{id}}\) \( \newcommand{\Span}{\mathrm{span}}\) \( \newcommand{\kernel}{\mathrm{null}\,}\) \( \newcommand{\range}{\mathrm{range}\,}\) \( \newcommand{\RealPart}{\mathrm{Re}}\) \( \newcommand{\ImaginaryPart}{\mathrm{Im}}\) \( \newcommand{\Argument}{\mathrm{Arg}}\) \( \newcommand{\norm}[1]{\| #1 \|}\) \( \newcommand{\inner}[2]{\langle #1, #2 \rangle}\) \( \newcommand{\Span}{\mathrm{span}}\) \(\newcommand{\id}{\mathrm{id}}\) \( \newcommand{\Span}{\mathrm{span}}\) \( \newcommand{\kernel}{\mathrm{null}\,}\) \( \newcommand{\range}{\mathrm{range}\,}\) \( \newcommand{\RealPart}{\mathrm{Re}}\) \( \newcommand{\ImaginaryPart}{\mathrm{Im}}\) \( \newcommand{\Argument}{\mathrm{Arg}}\) \( \newcommand{\norm}[1]{\| #1 \|}\) \( \newcommand{\inner}[2]{\langle #1, #2 \rangle}\) \( \newcommand{\Span}{\mathrm{span}}\)\(\newcommand{\AA}{\unicode[.8,0]{x212B}}\)

Begin Advanced Material

The previous section focused on the computational cost of solving a linear system governed by banded sparse matrices. This section introduces a few additional sparse matrices and also discussed additional concepts on Gaussian elimination for sparse systems.

Cyclic System

First, let us show that a small change in a physical system - and hence the corresponding linear system \(A\) - can make a large difference in the sparsity pattern of the factored matrix \(U\). Here, we consider a modified version of \(n\)-mass mass-spring system, where the first mass is connected to the last mass, as shown in Figure 27.5. We will refer to this system as a "cyclic" system, as the springs form a circle. Recall that a spring-mass system without the extra connection yields a tridiagonal system. With the extra connection between the first and the last mass, now the \((1, n)\) entry and \((n, 1)\) entry of the matrix are nonzero as shown in Figure \(27.6(\mathrm{a})\) (for \(n=25\) ); clearly, the matrix

Screen Shot 2022-03-28 at 12.06.14 PM.png — Figure 27.5: "Cyclic" spring-mass system with \(n=6\) masses.

Screen Shot 2022-03-28 at 12.06.27 PM.png

(a) original matrix \(A\)

Screen Shot 2022-03-28 at 12.06.41 PM.png

(b) beginning of step 5

Screen Shot 2022-03-28 at 12.06.46 PM.png

Figure 27.6: Illustration of Gaussian elimination applied to a \(25 \times 25\) "arrow" system. The red entries are "fill-in" introduced in the factorization process. The pivot for each step is marked by a red square.

is no longer tridiagonal. In fact, if apply our standard classification for banded matrices, the cyclic matrix would be characterized by its bandwidth of \(m_{\mathrm{b}}=n-1\).

Applying Gaussian elimination to the "cyclic" system, we immediately recognize that the \((1, n)\) entry of the original matrix "falls" along the last column, creating \(n-2\) fill-ins (see Figures \(27.6(\mathrm{~b})\) and \(27.6(\mathrm{c}))\). In addition, the original \((n, 1)\) entry also creates a nonzero entry on the bottom row, which moves across the matrix with the pivot as the matrix is factorized. As a result, the operation count for the factorization of the "cyclic" system is in fact similar to that of a pentadiagonal system: approximately \(14 n\) FLOPs. Applying back substitution to the factored matrix - which contains approximately \(3 n\) nonzeros - require \(5 n\) FLOPs. Thus, solution of the cyclic system - which has just two more nonzero entries than the tridiagonal system - requires more than twice the operations \((19 n\) vs. \(8 n)\). However, it is also important to note that this \(\mathcal{O}(n)\) operation count is a significant improvement compared to the \(\mathcal{O}\left(n\left(m_{\mathrm{b}}+1\right)^{2}\right)=\mathcal{O}\left(n^{3}\right)\) operation estimate based on classifying the system as a standard "outrigger" with a bandwidth \(m_{\mathrm{b}}=n-1\).

We note that the fill-in structure of \(U\) takes the form of a skyline defined by the envelope of the columns of the original matrix \(A\). This is a general principal.

Reordering

In constructing a linear system corresponding to our spring-mass system, we associated the \(j^{\text {th }}\) entry of the solution vector - and hence the \(j^{\text {th }}\) column of the matrix - with the displacement of the \(j^{\text {th }}\) mass (counting from the wall) and associated the \(i^{\text {th }}\) equation with the force equilibrium condition

Screen Shot 2022-03-28 at 12.09.07 PM.png

(a) \(A\) (natural, \(\mathrm{nnz}(A)=460)\)

Screen Shot 2022-03-28 at 12.09.17 PM.png

(b) \(U\) (natural, \(\operatorname{nnz}(U)=1009\) )

Screen Shot 2022-03-28 at 12.09.26 PM.png

Screen Shot 2022-03-28 at 12.09.40 PM.png

(d) \(U^{\prime}\left(\mathrm{AMD}, \mathrm{nnz}\left(U^{\prime}\right)=657\right)\)

Figure 27.7: Comparison of the sparsity pattern and Gaussian elimination fill-ins for a \(n=100\) "outrigger" system resulting from natural ordering and an equivalent system using the approximate minimum degree (AMD) ordering.

of the \(i^{\text {th }}\) mass. While this is arguably the most "natural" ordering for the spring-mass system, we could have associated a given column and row of the matrix with a different displacement and force equilibrium condition, respectively. Note that this "reordering" of the unknowns and equations of the linear system is equivalent to "swapping" the rows of columns of the matrix, which is formally known as permutation. Importantly, we can describe the same physical system using many different orderings of the unknowns and equations; even if the matrices appear different, these matrices describing the same physical system may be considered equivalent, as they all produce the same solution for a given right-hand side (given both the solution and right-hand side are reordered in a consistent manner).

Reordering can make a significant difference in the number of fill-ins and the operation count. Figure \(27.7\) shows a comparison of number of fill-ins for an \(n=100\) linear system arising from two different orderings of a finite different discretization of two-dimensional heat equation on a \(10 \times 10\) computational grid. An "outrigger" matrix of bandwidth \(m_{\mathrm{b}}=10\) arising from "natural" ordering is shown in Figure 27.7(a). The matrix has 460 nonzero entries. As discussed in the previous section, Gaussian elimination of the matrix yields an upper triangular matrix \(U\) with approximately \(n\left(m_{\mathrm{b}}+1\right)=1100\) nonzero entries (more precisely 1009 for this particular case), which is shown in Figure 27.7(b). An equivalent system obtained using the approximate minimum degree (AMD) ordering is shown in Figure 27.7(c). This newly reordered matrix \(A^{\prime}\) also has 460 nonzero entries because permuting (or swapping) rows and columns clearly does not change the number of nonzeros. On the other hand, application of Gaussian elimination to this reordered matrix yields an upper triangular matrix \(U\) shown in Figure \(27.7(\mathrm{~d})\), which has only 657 nonzero entries. Note that the number of fill-in has been reduced by roughly a factor of two: from \(1009-280=729\) for the "natural" ordering to \(657-280=377\) for the AMD ordering. (The original matrix \(A\) has 280 nonzero entries in the upper triangular part.)

In general, using an appropriate ordering can significantly reduced the number of fill-ins and hence the computational cost. In particular, for a sparse matrix arising from \(n\)-unknown finite difference (or finite element) discretization of two-dimensional PDEs, we have noted that "natural" ordering produces an "outrigger" system with \(m_{\mathrm{b}}=\sqrt{n}\); Gaussian elimination of the system yields an upper triangular matrix with \(n\left(m_{\mathrm{b}}+1\right) \approx n^{3 / 2}\) nonzero entries. On the other hand, the number of fill-ins for the same system with an optimal (i.e. minimum fill-in) ordering yields an upper triangular matrix with \(\mathcal{O}(n \log (n))\) unknowns. Thus, ordering can have significant impact in both the operation count and storage for large sparse linear systems.