!======================================================================== subroutine splib(A, colind, rwptr, n, & x, b, work, space, & tol, ireltv, maxits, pcmeth, solmeth, & scaling, Dl, Dr, & iparms, rparms, iredo, errflg, iunit, instlvl) implicit none !********************************************************************** ! ! SPLIB is a package of iterative solvers for sparse nonsymmetric linear ! systems, and has 13 iterative methods and 7 preconditioning methods. ! The purpose is to provide a collection that allows practical testing ! on systems of nontrivial size without massive amounts of coding ! effort. In particular, SPLIB relieves the user of the need to code ! up preconditioners, which are often the most difficult and error-prone ! part of iterative solvers. The price to be paid is that the matrix ! must be in a specific data structure: compressed sparse row. This stores ! the nonzero entries of A in a double precision singly dimensioned vector, ! a corresponding integer vector of the column indices, and an integer ! vector of length n+1 with pointers to the beginning of each row in the ! matrix ( entry n+1 points to one position beyond the end of the arrays ! for nonzero values and column indices). ! ! The target users for SPLIB are scientists and engineers in application ! areas who wish to experiment with iterative solvers in their codes to ! find which of the many methods may be practical for them. SPLIB is ! also useful for teaching iterative methods, allowing students to ! implement methods without having to develop the massive amounts of code ! needed for instrumentation, etc. ! ! The code is written in Fortran 77, with a few exceptions. ! Include files are used for some parameters and the memory manager. ! One function, getr.c, is necessarily written in C. That ! function accesses the getrusage structure and returns it to the ! Fortran 77 routine prtres. A timing function (mytime) ! is used. It is set to use etime(), but has commented out options ! for using mclock(), second(), etc. on machines without etime(). ! BLAS routines used include: ddot, daxpy, dcopy, dgemv, ! dasum, drot, dscal, drotg, and dtrsv. Routines lsame and xerbla ! from the LAPACK project are also provided in the directory "blas", ! so that SPLIB can be compiled and run on machines without a native ! BLAS library. LAPACK also supplies the function dlamch, for finding ! machine constants. Finally, "implicit none" is used in all the routines. ! ! This package differs from others in seven ways: ! 1) It uses a sophisticated memory managment module, which allows ! the package to be easily adapted and used by applications ! programs. All they need do is provide a work array, from which ! the necessary arrays are divvied out using pointers. This helps ! keep the calling sequence short. ! 2) It provides implementations of levels of fill LU preconditioning, ! well documented and with various options. ! 3) The matrices dealt with are always kept in standard data structures, ! so that they can be easily extracted and examined/processed using ! other tools, such as emily for visualization or Sparskit for basic ! manipulations. ! 4) Detailed instrumentation of the code is provided for, at the risk of ! losing some portability. In particular, this code allows tracking ! several different measures of error norms, including Oetli's backward ! error analysis based measure. Furthermore, the error norms are written ! out to files which can then be plotted via xgraph, giving graphical ! comparison of different methods and preconditioners. Finally, the ! norms are plotted against both iteration number and time, allowing ! a more fair comparison than the usual "number of iterations" results. ! The plotting against time is especially crucial, since methods can ! vary drastically in the amount of work per iteration. ! 5) Further instrumentation provides access to rusage structures, which ! allows information about memory utilization, page faults, context ! switches, and other important performance measures. ! 6) Implementation of the partial orthogonalization preconditioners of ! Bramley, Gallivan, and Wang. These generally produce better factors ! than the ILU type methods, and are more robust. However, they require ! more intermediate storage to compute. ! 7) Stringent tests for having too large of a preconditioned ! residual, for dividing by small quantities, etc. ! ! History of SPLIB: ! ----------------- ! ! This code has evolved through many hands. Originally it started ! out as RP-PACK, for examining row projection methods. It then ! was converted to a front end for CG-like iterative methods. ! Ulrike Meier and G. Skinner did most of the preliminary work, for ! the Center for Supercomputer Research and Development at the University ! of Illinois. They created a uniform calling sequence and data naming ! convention that tied many disparate codes together, as well as writing and ! testing heroic amounts of codes. Xiaoge Wang has done a great deal ! of work, especially on the incomplete preconditioners based on levels ! of fill-in, and all of the ECIMGS material. ! ! Youcef Saad wrote the original GMRES(k) solver, and the package ! heavily uses modified SPARSKIT routines provided by him and Edmond Chow. ! The memory manager is a modification of one developed by Dominique ! Pelletier of Ecole Polytechnique. The machine constant function, ! dlamch(), is from the LAPACK project headed by J. Dongarra. ! ! Usage of SPLIB: ! --------------- ! ! Compilation is via the makefile. There are two features to note: ! 1) You should first edit the include file intdbl.inc, to give the ! number of integer words per double precision word. For most ! machines this is intdbl = 2, but for Cray machines it is ! generally intdbl = 1. For 64 bit machines with long ! integers, this is also intdbl = 1. This is used to force ! double word alignment in the memory manager. ! 2) The functions in dlamch() will compute machine epsilon and ! safe invertible minimum with methods that cause under- and ! over-flow. If you have IEEE traps set, this will cause the ! program to abend. Turn off IEEE traps if they are by default ! on for your Fortran compiler. ! ! Compilation sets up the archive library splib.a, which you can ! link with your code. ! ! Calling SPLIB requires you to: ! 1) Set up the matrix in CSR format. ! The main sticking point in using SPLIB is that the matrix must be ! provided in compressed sparse row (CSR) format. This is necessary ! because the preconditioners need to be able to access the ! entries of the matrix. Unpreconditioned iterative methods based on ! Krylov subspaces generally only need to access the matrix via ! matrix-vector products, but they also generally do not work. The ! tradeoff here is that most applications scientists find it relatively ! easy to write a data structure conversion routine to put their ! coefficient matrix into a CSR format, but writing robust factorization ! based preconditioners are much more difficult. ! 2) Provide n-vectors x and b for the solution and right hand side: A x = b. ! 3) Provide an integer array work(1:space), aligned on a double word ! boundary. All of the additional storage needed by SPLIB will be ! carved out from this array. To assure double word alignment, you can ! declare it as a double precision array of length m, then pass it in ! with space = intdbl*m. ! 4) Set the control parameters. Defaults are in parenthesis, and are ! triggered by giving an invalid value for the parameter. ! solmeth -- solution method (CG-Stabilized, solmeth = 5) ! pcmeth -- preconditioning method (No preconditioning, pcmeth = 0) ! iredo -- whether or not to recompute the preconditioner. ! (0 for recompute, 1 to not). ! tol -- tolerance on residual (1.0e-6) ! ireltv -- whether or not to use relative residual for stopping ! (0 for not, 1 to use relative residual). ! maxits -- maximal allowed iterations (100) ! scaling -- scaling of the system (0) ! instlvl -- instrumentation level (0) ! In addition, some methods require setting iparms and rparms. For ! example, GMRES(m) requires a value m for the Krylov subspace size (but ! this is set to 10 by default). ILU(s) factorization requires a level ! of fill value, which by default is 0. Because of the wide variety of ! methods available, it is necessary to check the individual routines ! to find all the possibilities. ! 5) If instrumentation is to be used, a vector of 16 logical unit numbers ! must be provided. SPLIB will open files with those unit numbers, to ! write out summary data and Xgraph files tracing the iteration history. ! !------------------------------------------------------------------------------ ! ! Randall Bramley ! Department of Computer Science ! Indiana University, Bloomington ! bramley@cs.indiana.edu ! Wed Jul 12 10:03:18 EST 1995 ! !------------------------------------------------------------------------------ ! ! Parameters: ! ----------- ! ! intdbl : number of integer words per double precision word for ! target machine. This is used for alignment within ! the memory manager routines. ! ! Arguments: ! ---------- ! n : Order of the matrix A. ! A : double precision array for nonzero entries of matrix A; of ! length nnz = rwptr(n+1)-1. ! colind : Integer array of column indices for entries in A; of ! length nnz = rwptr(n+1)-1. ! rwptr : Integer array of pointers to beginning position of rows ! in arrays A and colind. Must be of length n+1. ! ! x : double precision vector of unknowns to be solved for. ! b : double precision right hand side vector of system to be solved: Ax = b. ! See lrhs and lsol for options on this vector. ! ! tol : tolerance on norm used in stopping test. ! maxits : Maximum number of iterations allowed. On return, contains actual ! number performed. ! ! solmeth : Solution method to use. ! solmeth = 1 ---> BiCG ! solmeth = 2 ---> CG on normal equations AA'y = b; x = A'y. ! solmeth = 3 ---> CG on normal equations A'A x = A'b ! solmeth = 4 ---> CGS ! solmeth = 5 ---> CGSTAB ! solmeth = 6 ---> GMRES(k) ! solmeth = 7 ---> transpose-free QMR ! solmeth = 8 ---> Bi-CG stabilized (Templates version of CG-Stab). ! solmeth = 9 ---> Templates version of GMRES ! solmeth = 10 ---> Jacobi ! solmeth = 11 ---> Gauss-Seidel ! solmeth = 12 ---> SOR ! solmeth = 13 ---> Orthomin(m) ! ! pcmeth : Preconditioning method to use. ! pcmeth = 0 ---> no preconditioner. ! pcmeth = 1 ---> ILU(s) ! pcmeth = 2 ---> MILU(s,rpcprm) ! pcmeth = 3 ---> ILUT(s,rpcprm) ! pcmeth = 4 ---> SSOR(rpcprm) ! pcmeth = 5 ---> TRID(s), where s is the block size. ! pcmeth = 6 ---> ILU(-1), the space saver version of ILU(0) ! pcmeth = 7 ---> ECIMGS. ! ! rparms : real preconditioner parameter, with interpretation depending ! on preconditioning method to be used. ! - tolerance for numerical dropping with ILUT; ! - omega parameter for SSOR(omega); ! - relaxation constant for MILU or ILU0. ! - sparsity pattern indicator for ECIMGS ! rparms = 0.0 Use nonzero pattern of A for L/U ! rparms = 1.0 Use a diagonal matrix as precond. ! 0.0 < rparms < 1.0 Use a pattern sparser than A's. ! rparms > 1.0 Use int(rparms) levels of fill-in. ! ! iparms : integer parameters, with interpretation depending on methods used. ! - iparms(1) is levels of fill for ILU(s), MILU(s), ILUT ! - iparms(1) is block size for TRID. ! - iparms(2) is maximum Kyrlov subspace size k ! for GMRES(k), Orthomin(k). ! ! instlvl : Level of instrumentation. The level is cummulative, so specifying ! output of Oetli/Prager norms implies residual norms, etc. ! instlvl < 0 ---> no output from splib ! instlvl = 0 ---> only summary data ! instlvl = 1 ---> residual norm data ! instlvl = 2 ---> preconditioned residual norm ! instlvl = 3 ---> relative residual norm ! instlvl = 4 ---> relative preconditioned residual norm ! instlvl = 5 ---> Oetli/Prager norms ! instlvl = 6 ---> error norms (if true solution vector is provided ! via a common block) ! ! iunit : vector of I/O unit numbers to use for output. ! The I/O unit numbers are provided in the vector iunit(16). ! The units are optional, depending on the error norm(s) you wish to ! plot/use. However, if you specify instlvl >= 0, ! the following units will be opened for writing: ! ! iunit(1): File for output of summary data ! iunit(2-3): Not used here ! iunit(4): || Ax - b || vs. iter ! iunit(5): || inv(M) (Ax-b) || vs. iter ! iunit(6): || Ax - b || / || r_0 || vs. iter ! iunit(7): || inv(M) (Ax-b) || / || inv(M) r_0 || vs. iter ! iunit(8): || Ax - b ||_inf / ( ||A||_inf ||x||_1 + ||b||_inf) ! vs. iter ! iunit(9): || x - xstar || vs. iter ! iunit(10): Not used. ! iunit(11): || Ax - b || vs. time ! iunit(12): || inv(M) (Ax-b) || vs. time ! iunit(13): || Ax - b || / || r_0 || vs. time ! iunit(14): || inv(M) (Ax-b) || / || inv(M) r_0 || vs. time ! iunit(15): || Ax - b ||_inf / ( ||A||_inf ||x||_1 + ||b||_inf) ! vs. time ! iunit(16): || x - xstar || vs. time ! ! work : Array of work storage, for preconditioners, solvers, and. ! instrumentation routines. The matrices and vectors used ! are all indexed from this integer array, using pointers. ! ! space : maximum number of integer words allowed for all problem ! dependent arrays. This includes the preconditioners ! and the additional work vectors needed in the solvers. ! On return, space holds the actual amount of storage ! required by SPLIB. ! ! iredo : Recompute preconditioner if redo = 0, reuse if redo = 1. ! This is useful if you wish to reuse a preconditioner from ! an earlier call to SPLIB. Converted internally to logical ! variable redo ! ! Variables (also see the descriptions in include file instr.inc): ! -------------------------------------------------------------- ! ! r : (int pointer) double precision residual vector. ! ! uptr, ! lu,jlu : Data structure for the preconditioner M, with different meanings ! for different preconditioners. All three of uptr, lu, jlu are ! simply pointers into the work() array in splib.f, but are ! used as the names of the double precision arrays when passed into the ! subroutines called by splib.f. ! ! ILU(s), MILU(s), or ILUt ---> ! (lu,jlu) gives the combined L/U factors for the ! preconditioner in a modified sparse row (MSR) format. ! L is unit lower triangular and U is upper triangular. ! The diagonal of U is stored in lu(1:n) inverted, to ! replace divisions in the algorithm with multiplications. ! Each i-th row of the combined L/U matrix in the (lu,jlu) ! data structure contains the i-th row of L (excluding ! the diagonal entry = 1), followed by the i-th row of u. ! uptr is a integer vector of length n containing pointers ! to the start position of each row of U in (lu, jlu). ! ! SSOR(omega) ---> ! lu is the scalar acceleration parameter omega. ! jlu is not used, and uptr is a vector of pointers to the ! diagonal entries in the CSR data structure for A. ! ! TRID(s) ---> ! This implements block diagonal preconditioning, where ! each diagonal block is from the corresponding tridiagonal ! part of the matrix A. The size of the blocks is given ! implicitly in the integer iskip, stored in jlu(n). ! Factors are set up for contiguous blocks of iskip rows ! and so by using iskip = n the factor can be for the entire ! matrix's tridiagonal part. Using iskip = 1 is an ! inefficient way of performing diagonal preconditioning, ! while 1 < iskip < n is for a block tridiagonal ! preconditioner. Gaussian elimination with pivoting is ! used for factoring each block, so it is necessary to store ! the pivot sequence in jlu(1:n-1). lu has the four vectors ! giving the nonzeros of the tridiagonal preconditioner. ! Vector uptr is not used for TRID. ! ! ILU0 ---> ! This implements the version of ILU(0) that does not change ! any off-diagonal entries of A, and so only needs to store ! the n-vector of diagonal pivots. lu(1:n) stores those ! pivots, and jlu(1:n) contains pointers to the diagonal ! entry in each row of A. ! ! ! next : points to next available location in array work() ! remain : number of integer words remaining in work() to be used. ! ! ireltv : variable to indicate whether stopping test is relative or not. ! converted internally to logical variable reltv ! iters : Actual number of iterations used, returned by solver. ! Second component contains Krylov subspace size for GMRES. ! krylov : Size of Krylov subspace to use in GMRES; k above. ! ipcprm : integer preconditioning parameter, usually levels of fill. ! For tridiag preconditioning, it means the block size. ! nzlu : Integer giving number of nonzeros in preconditioner. ! mneed : Integer parameter from preconditioner call indicating ! the amount of additional integer word storage needed. ! Warning: this is only an approximation in general. ! nnz : Number of nonzero entries in A. ! ! errflg : Integer variable indicating error conditions. For the ! solvers, the interpretation is: ! errflg = -m --> Not enough workspace provided; m more double ! precision words needed in array work(). ! errflg = 0 --> Successful return; tolerance achieved. ! errflg = 1 --> The initial guess satisfies the stopping test. ! This is often a clue that rhs not set properly. ! errflg = 2 --> Tolerance reset 8 times in attempt to ! get true residual small enough. ! errflg = 3 --> Convergence not achieved in maxits iterations. ! errflg = 4 --> Preconditioned residual is too large relative ! to the unpreconditioned residual. Indicates ! an unstable preconditioner. ! 4 < errflg < 9 --> Breakdown condition, usually from an attempt to ! divide by a near-zero number in algorithm. ! errflg = 10+k --> Error condition k occured in preconditioner. ! Note that any of conditions 0, 1, 2 may mean success. ! ! res : Final residual norm (2-norm) ! err : Final error norm (2-norm) ! bnorm : Two-norm of the right hand side vector b. Used for relative ! residual stopping tests: ||r_k||/||r_0|| < tol. ! sptime : Integer pointer to array of times in memory manager. ! Used for interacting with CADYF program. ! ! scaling : Job control parameter for scaling of system. ! scaling = 0 ---> perform no scaling ! scaling = 1 ---> set max(max(A)) = 1.0 ! scaling = 2 ---> set max of each row to be 1.0 ! scaling = 3 ---> scale each column so diagonal entry is 1.0 ! scaling = 4 ---> perform first scaling=2, then scaling=3 ! If scaling = 0 or 1, Dl and Dr can be dummy variables. ! If scaling = 2 or 4, Dl must provided (see below) ! If scaling = 3 or 4, Dr must provided (see below) ! Note: SPLIB will unscale the solution vector so that ! it is the solution to the original system when ! scaling = 3 or 4. ! ! Dl, Dr : Double arrays of length n, used to hold scaling of matrix. ! system actually solved is Dl*A*Dr y = Dl b. On return, the ! matrix A has been scaled and x = Dr*y. ! ! Instrumentation Common Block Variables: ! --------------------------------------- ! ! xstar : Vector with true solution, for error norm checking. ! title : Character array to hold title of Harwell/Boeing matrix. ! key : Character array for short title of Harwell/Boeing matrix. ! times : vector containing timing data, for plotting convergence ! history versus time. ! times(1) --> time spent in computing the preconditioner. ! times(2) --> time spent in iterative solver, including pc solves ! times(3) --> time spent in A*v multiplications ! (includes instrumentation calls to A*v) ! times(4) --> time spent in A'*v multiplications ! (includes instrumentation calls to A'*v) ! times(5) --> time spent in applying preconditioner ! (includes instrumentation calls to lusl) ! times(6) --> time spent in remainder of solver (not yet done) ! times(7) --> time to be excluded because spent in instrumentation. ! times(8) --> time to be excluded because of other initializations, ! or because of earlier runs with the same matrix. ! ! ! I/O Files ! --------- ! ! results : File containing summary of results from run. ! ! resi : File with residual norm vs. iteration number. ! rest : File with residual norm vs. elapsed time. ! ! reli : File with relative residual norm vs. iteration number. ! relt : File with relative residual norm vs. elapsed time. ! ! Mresi : File with preconditioned residual norm vs. iteration number. ! Mrest : File with preconditioned residual norm vs. elapsed time. ! ! Mreli : File with preconditioned relative residual norm ! vs. iteration number. ! Mrelt : File with preconditioned relative residual norm ! vs. elapsed time. ! ! oetli : File with Oettli and Prager norm vs. iteration number. ! oetlt : File with Oettli and Prager norm vs. elapsed time. ! ! errori : File with error norm vs. iteration number (if lsol = true). ! errort : File with error norm vs. time (if lsol = true). ! ! ! External Routines ! ----------------- ! ! bmux : Sparse matrix multiply (Sparskit) ! ddot : Blas routine for dotproduct ! dcopy : Blas routine for copy of a vector ! dasum : Blas routine for max norm of vector ! dnrm2 : Blas routine for two-norm of vector ! dgemv : Blas routine for matrix-vector product ! ! Workspace ! --------- ! ! The amount of workspace depends on the methods used, the levels ! of fill-in (and way of defining fill-in levels), and the number ! of Krylov subspace vectors used. The following guide excludes ! the storage for x, r, and b which appear in the calling ! sequence for all the routines. Note that GMRES(k), GMREST(k), and ! ORTHOMIN(k) depend on the size k of the Krylov subspace. Also note ! that the methods with fill-in will require more storage for lu, jlu, ! uptr, but the actual amount is not predictable in advance. The size ! shown is for the preconditioners with no fill-in allowed, and does ! not include the storage that depends on nnz. Even for zero levels ! of fill, ECIMGS requires intermediate storage that is unpredictable. ! ! Routine Reals Integers ! ------- ----- -------- ! ! (m)ilus n 2n ! ilut 2n + 1 3n ! ssor 1 n (but uptr is recycled for this) ! trid 4n n ! ilu0 n n ! ecimgs ? ? (problem dependent) ! ! Routine Reals ! ------- ----- ! bicg 6n } ! cgne 3n } ! cgnr 3n } Add 2n for all of these if ! cgs 6n } the instrumentation routine ! cgstab 6n } is to be used at all; the ! gmres(k) k*(n+k+5)+n + 1 } extra is for work arrays ! qmrtf 10n } used in residual computation. ! bicgs 6n } ! orthomin(k) (2k+6)*n } ! jacobi 0 } ! gauss 0 } ! SOR 0 } ! !======================================================================== ! Memory management module !======================================================================== include 'include/memory.inc' include 'include/intdbl.inc' include 'include/symbols.inc' !======================================================================== ! Declarations of variables !======================================================================== ! ! Double precision arrays and pointers: ! ------------------------------------- integer :: n, space double precision :: a(*), x(n), b(n), rparms(*) double precision :: Dl(n), Dr(n) integer :: lu, r, work(space), errflg, wu, wl ! Double precision scalars: ! ------------------------- double precision :: tol, rpcprm, loctol double precision :: res,bnorm ! Timing variables: ! ----------------- double precision :: mytime ! Function programs from BLAS: ! ---------------------------- double precision :: ddot,dasum,dnrm2 external :: ddot,dasum,dnrm2 ! Function programs from memory manager: ! ------------------------------------- integer :: addtbl, loctbl, sizeof, storof, getnext, memleft logical :: istbl external :: addtbl, loctbl, sizeof, storof, istbl external :: getnext, memleft ! Other function programs ! ----------------------- double precision :: Oetli external :: Oetli ! integer :: arrays: ! --------------- integer :: rwptr(n+1), colind(*) ! Integer array pointers: ! ----------------------- integer :: uptr, jlu, diag, itvecs integer :: next, remain integer :: jr, jwl, jwu, ofclu ! integer :: sptime ! Integers: ! --------- integer :: ierr, maxits, nnz, iparms(*) integer :: iters(2), mneed, wksz, loctot, num_entries integer :: k integer :: pcmeth, solmeth, krylov, nzlu integer :: alloc integer :: scaling integer :: storage ! Character variables: ! -------------------- character*8 :: from ! Logical variables: ! ------------------ logical :: reltv, redo, results, nogood, force integer :: ireltv, iredo ! Instrumentation variables: ! -------------------------- integer :: iunit(16) integer :: instlvl include 'include/heads.inc' !======================================================================== ! Beginning of Executable Statements !======================================================================== ! ------------------------------------------------------------ ! Convert to logical variables the redo and reltv input var's. ! ------------------------------------------------------------ redo = .true. if (iredo == 1) redo = .false. reltv = .true. if (ireltv == 0) reltv = .false. ! ------------------------------------------------------------ ! Determine if any printing is to be done by checking iunit(1) ! ------------------------------------------------------------ if (instlvl >= 0) then results = (instlvl >= 0) open(iunit(1), file='results.html') end if ! --------------------------------------------------------------- ! Initialize memory management. Have it write to 'results' file. ! --------------------------------------------------------------- call initbl(iunit(1),space) storage = getnext() ! ---------------------------------------------- ! Set up array to return times to CADYF program. ! ---------------------------------------------- from = 'splib' ! sptime = addtbl('sptime', 8, 'real', from) ! -------------------------------------------------- ! Set pointer to next available location in work(): ! -------------------------------------------------- next = getnext() remain = memleft() ! ------------------------------------------------------ ! Call getrusage data to initialize resource usage file: ! ------------------------------------------------------ if (results) call prtres(0,iunit(1)) ierr = 0 ! ----------------------------------------------------------- ! Set up strings giving solution method, preconditioner names ! ----------------------------------------------------------- call strings(solmeth, solver, pcmeth, precond) ! ---------------------------------------------------------------- ! Set number of nonzeros, and unpack parameter for preconditioners ! ---------------------------------------------------------------- nnz = rwptr(n+1) - 1 rpcprm = rparms(1) ipcprm = iparms(1) iters(1) = maxits if (solmeth == sGMRES .or. solmeth == sGMRESt .or. solmeth == sORTHM) then krylov = iparms(2) end if ! ----------------------------- ! Echo some info to output file ! ----------------------------- if (results) then write(iunit(1),2040) write(iunit(1),6020) n, nnz write(iunit(1),6050) solver,precond if (solmeth == sGMRES .or. solmeth == sGMRESt .or. solmeth == sORTHM) then write(iunit(1),6030) krylov end if if (scaling == NOSC) write(iunit(1),7000) if (scaling == MXSC) write(iunit(1),7010) if (scaling == RWSC) write(iunit(1),7020) if (scaling == COLSC) write(iunit(1),7030) if (scaling == RCSC) write(iunit(1),7040) write(iunit(1),6060) ipcprm write(iunit(1),6090) rpcprm write(iunit(1),6040) tol write(iunit(1),6070) maxits end if !================================================================================ ! Set default values if not specified by calling routine, ! and perform some simple validity checks. !================================================================================ ! ------------------------------------------ ! Valid solution method? Use CGSTAB if not. ! ------------------------------------------ if (solmeth .le. 0 .or. solmeth .gt. 14) then if (results) write(iunit(1),3000) solmeth = 5 end if ! --------------------------------------- ! Valid precond method? Use NoPC if not. ! --------------------------------------- if (pcmeth .lt. pNOPC .or. pcmeth .gt. pEIMGS ) then if (results) write(iunit(1),3010) pcmeth = pNOPC end if ! ------------------------------------------------- ! Valid maximal iterations allowed? Use 100 if not. ! ------------------------------------------------- if (maxits .le. 0 ) then if (results) write(iunit(1),3020) iters(1) = 100 end if ! ------------------------------------------------ ! Set local tolerance and then check for valid ! tolerance on residual. Use 1.0d-6 if not valid. ! ------------------------------------------------ loctol = tol if (tol .lt. 0.0d0) then if (results) write(iunit(1),3030) tol loctol = 1.0d-6 end if ! ---------------------------------------------------------------- ! If Jacobi, Gauss-Seidel, or SOR then cannot use preconditioning. ! ---------------------------------------------------------------- nogood = ((solmeth == 10) .or. (solmeth == 11) .or. & (solmeth == 12)) .and. (pcmeth .ne. pNOPC) if (nogood) then if (results) write(iunit(1),3040) pcmeth = pNOPC end if ! --------------------------------------------------------------- ! If GMRES, GMRESt, or Orthomin then Krylov subspace size must be ! at least equal to one. Reset to 10 otherwise. ! --------------------------------------------------------------- nogood = (solmeth == 6 .or. solmeth == 9 .or. solmeth == 13) & .and. (krylov .le. 0) if (nogood) then if (results) write(iunit(1),3050) krylov = 10 end if ! ---------------------------------------------------------- ! Set up space for the residual vector r if not already done ! ---------------------------------------------------------- if (.not. istbl('r', from)) then r = addtbl('r', n, 'real', from) else r = loctbl('r', from) end if remain = memleft() next = getnext() if (remain .le. 0) then errflg = -1.0d6 if (results) write(iunit(1),5040) next return end if ! --------------------------------------- ! Compute norm of right hand side vector. ! --------------------------------------- bnorm = dnrm2(n,b,1) if (results) write(iunit(1),2020) bnorm ! ---------------------------------------- ! Scale the matrix A and right hand side b ! ---------------------------------------- if (scaling .lt. NOSC .or. scaling .gt. RCSC) scaling = NOSC if (scaling == MXSC) call scala(MXSC, A, rwptr, colind, n, b, Dl) if (scaling == RWSC .or. scaling == RCSC) & call scala(RWSC, A, rwptr, colind, n, b, Dl) if (scaling == COLSC .or. scaling == RCSC) & call scala(COLSC, A, rwptr, colind, n, b, Dr) ! ------------------------ ! Initialize timing vector ! ------------------------ do k = 1, 8 times(k) = 0.0d0 end do times(1) = mytime() ! -------------------------------------------------------- ! Time to exclude from instrumentation routines, because ! it was spent in initialization or other solution methods ! for the same matrix. ! -------------------------------------------------------- times(8) = mytime() !======================================================================= ! *** Preconditioner Computation *** !======================================================================= ! ----------------------------------------------------------------- ! If we are to reuse the preconditioner from a previous run, skip ! computation of it. Otherwise need to go through and set it up. ! Later extend this to whether or not to compute structure as well. ! ----------------------------------------------------------------- if (redo .or. .not. istbl('lu',from)) then ! --------------------------------------------------- ! Free up space used by the previous preconditioner ! --------------------------------------------------- if (istbl('lu',from)) call rmvtbl(work(1),'lu',from) if (istbl('jlu',from)) call rmvtbl(work(1),'jlu',from) if (istbl('uptr', from)) call rmvtbl(work(1),'uptr' ,from) ! ----------------------------------------------------------------- ! Call the routines to set up the preconditioner. ! Note that ilus must align everything on double word boundaries, ! so it is assumed that mneed is multiple of intdbl if ierr .ne. 0. ! First set pointers in case no preconditioner is used. ! ----------------------------------------------------------------- next = getnext() remain = memleft() uptr = next jlu = next lu = next nzlu = 0 ! -------------------------------------- ! Preconditioner set up for ILU or MILU: ! -------------------------------------- if (pcmeth == pILUS) then call ilus (n,a,colind,rwptr,ipcprm,0.0d0,lu,jlu,uptr,work(next),remain, & ierr,mneed) end if if (pcmeth == pMILU) then call ilus (n,a,colind,rwptr, & ipcprm,rpcprm, & lu,jlu,uptr,work(next),remain, & ierr,mneed) end if ! ----------------------------------------------------------- ! Update memory management pointers if pcmeth is ILU or MILU: ! ----------------------------------------------------------- if ((pcmeth == pILUS .or. pcmeth == pMILU) .and. ierr == 0) then nzlu = work(jlu+next-1+n) - 1 mneed = (n + (1 + intdbl)*nzlu) - remain if (mneed .gt. 0) then ierr = 1 go to 200 end if from = '(M)ILU' uptr = addtbl('uptr', n, 'integer', from) jlu = addtbl('jlu', nzlu, 'integer', from) lu = addtbl('lu', nzlu, 'real', from) end if ! ------------------------------- ! Preconditioner set up for ILUT: ! ------------------------------- if (pcmeth == pILUT) then next = getnext() remain = memleft() loctot = next + remain - intdbl mneed = 2*intdbl *(n+1) + 4*n - remain if (mneed .gt. 0) then ierr = 1 go to 200 end if ! ----------------------------------------------------- ! Make sure allocated length n is a multiple of intdbl ! ----------------------------------------------------- alloc = n + mod(n,intdbl) ! ------------------------------------------------------- ! The integer work arrays need not be double word aligned, ! since they will be ignored outside of ilut.f ! ------------------------------------------------------- jwu = loctot - alloc jwl = jwu - alloc jr = jwl - alloc ! ------------------------------------------------------------ ! The double precision arrays need to be aligned, since they might cause ! data alignment errors within ilut.f otherwise. ! ------------------------------------------------------------ alloc = intdbl*(n+1) wl = jr - alloc - intdbl wl = wl + mod(wl-1,intdbl) wu = wl - alloc - intdbl wu = wu + mod(wu-1,intdbl) ! ------------------------------------------------------- ! Unless ilut fails, uptr will be allocated "permanently", ! so do it here. ! ------------------------------------------------------- from = 'ILUT' uptr = addtbl('uptr',n,'integer',from) ! ------------------------------------------------------------------ ! Partition the rest of available memory between jlu and lu, keeping ! in mind that for the same number of entries lu requires twice as ! many integer words. ! ------------------------------------------------------------------ next = getnext() remain = memleft() num_entries = (wu - next)/(intdbl + 1) jlu = next lu = next + num_entries lu = lu + mod(lu-1,intdbl) call ilut(n, a, colind, rwptr, & ipcprm, rpcprm, & work(lu), work(jlu), work(uptr), num_entries, & work(wu), work(wl), work(jr), work(jwl), & work(jwu), ierr) if (ierr .gt. 0) then if (results) write(iunit(1),5000) ierr go to 100 end if if (ierr == -1 .or. ierr == -5) then if (results) write(iunit(1),5030) ierr ierr = -ierr go to 100 end if if (ierr == -4) then if (results) write(iunit(1),5033) ipcprm ierr = -ierr go to 100 end if if (ierr .ne. 0) then mneed = (1 + intdbl) * ( 2*n*(ipcprm+1) + nnz) nzlu = work(jlu+n)-1 go to 100 end if ! ------------ ! Allocate jlu ! ------------ from = 'ILUT' mneed = num_entries*(1+intdbl) - remain if (mneed .gt. 0) then ierr = 1 go to 200 end if k = addtbl('jlu', num_entries, 'integer', from) ! ------------------------------------------------------ ! Allocate the lu array, and then shift the computed one ! up in the work array to fill in any empty spaces. ! ------------------------------------------------------ ofclu = addtbl('lu', num_entries, 'real', from) call dcopy(num_entries, work(lu), 1, work(ofclu), 1) lu = ofclu ! ----------------- ! End of ILUT setup ! ----------------- 100 continue end if ! ------------------------------- ! Preconditioner set up for SSOR: ! ------------------------------- if (pcmeth == pSSOR) then nzlu = n mneed = (intdbl + n) - remain if (mneed .gt. 0) then ierr = 1 go to 200 end if from = 'SSOR' jlu = addtbl('jlu', n, 'integer', from) lu = addtbl('lu', 1, 'real', from) uptr = jlu call ssor(n,a,colind,rwptr,rpcprm,work(lu), & work(jlu),work(jlu),work(next),mneed,ierr) end if ! --------------------------------- ! Preconditioner set up for TRDIAG: ! --------------------------------- if (pcmeth == pTRID) then nzlu = 4*n mneed = (n + (1 + intdbl)*nzlu) - remain if (mneed .gt. 0) then ierr = 1 go to 200 end if from = 'TRDIAG' jlu = addtbl('jlu', n, 'integer', from) lu = addtbl('lu', nzlu, 'real', from) uptr = jlu call tridiag(n,a,colind,rwptr, work(lu),work(jlu),ipcprm,ierr) end if ! ---------------------------------- ! Preconditioner set up for ILU(-1): ! ---------------------------------- if (pcmeth == pILU0) then nzlu = n mneed = (n + (1 + intdbl)*nzlu) - remain if (mneed .gt. 0) then ierr = 1 go to 200 end if from = 'ILU0' jlu = addtbl('jlu', n, 'integer', from) lu = addtbl('lu', nzlu, 'real', from) uptr = jlu call ilu0(n, a, colind, rwptr, rpcprm, work(lu), work(jlu), ierr) end if ! ------------------------------- ! Preconditioner set up for EIMGS ! -------------------------------- if (pcmeth == pEIMGS) then call ecimgs(n,a,colind,rwptr, rpcprm, jlu, & lu, uptr, work(next), & remain, ierr, mneed) nzlu = work(jlu+n+next-1) - 1 mneed = (n + (1 + intdbl)*nzlu) - remain if (mneed .gt. 0) then ierr = 1 go to 200 end if from = 'ECMGS' uptr = addtbl('uptr', n, 'integer', from) jlu = addtbl('jlu', nzlu, 'integer', from) lu = addtbl('lu', nzlu, 'real', from) end if next = getnext() remain = memleft() ! ------------------------------------------------- ! Error conditions from computing preconditioners: ! ------------------------------------------------- 200 continue if (ierr .ne. 0) then errflg = 10 + ierr ! ----------------------------------------------------- ! Error codes in TRID differ from other preconditioners ! ----------------------------------------------------- if (pcmeth == pTRID .and. results) then write(iunit(1),4040) return end if ! ----------------------- ! Small pivot encountered ! ----------------------- if (ierr .lt. 0 .and. results) write(iunit(1),4090) ierr ! ----------------- ! Not enough memory ! ----------------- if (ierr == 1 .and. results) then write(iunit(1),4080) mneed end if ! ----------------- ! Illegal parameter ! ----------------- if (ierr == 2 .and. results) write(iunit(1),4070) ! ---------------------------------------------------------- ! List memory usage and clean up mess in case of error in PC ! ---------------------------------------------------------- from = 'PCFail' if (results) call listbl(from) from = 'splib' if (istbl('lu',from)) call rmvtbl(work(1),'lu',from) if (istbl('jlu',from)) call rmvtbl(work(1),'jlu',from) if (istbl('uptr',from)) call rmvtbl(work(1),'uptr',from) return else ! ------------------------------------------------------ ! No error in preconditioner; output memory requirements ! ------------------------------------------------------ if (results) write(iunit(1),2030) nzlu, dfloat(nzlu)/dfloat(nnz) end if ! ---------------------------------------------------- ! If not recomputing preconditioner, just loctbl them: ! ---------------------------------------------------- else lu = loctbl('lu' , from) jlu = loctbl('jlu' , from) uptr = loctbl('uptr', from) ! -------------------------------------------------- ! End if block for (re)computing the preconditioner ! -------------------------------------------------- end if ! ================================================================== ! *** End Preconditioner Computation *** ! ================================================================== ! ---------------------------------- ! Time for computing preconditioner: ! ---------------------------------- times(1) = mytime() - times(1) ! ================================================================== ! *** Call Solver Routine *** ! ================================================================== times(2) = mytime() wksz = memleft()/intdbl ! if (solmeth == sBICG) then k = 6*n if (memleft() - intdbl*k .lt. 0) then errflg = memleft() - intdbl*k return end if from = 'BiCG' itvecs = addtbl('itvecs', k, 'real', from) call bicg( & a,colind,rwptr,n, & x,work(r),b, & loctol,reltv,iters,ierr, & work(next),wksz, & work(lu),work(jlu),work(uptr), & pcmeth, iunit, instlvl) end if if (solmeth == sCGNE) then k = 3*n if (memleft() - intdbl*k .lt. 0) then errflg = memleft() - intdbl*k return end if from = 'CGNE' itvecs = addtbl('itvecs', k, 'real', from) call cgne( & a,colind,rwptr,n, & x,work(r),b, & loctol,reltv,iters,ierr, & work(next),wksz, & work(lu),work(jlu),work(uptr), & pcmeth, iunit, instlvl) end if if (solmeth == sCGNR) then k = 3*n if (memleft() - intdbl*k .lt. 0) then errflg = memleft() - intdbl*k return end if from = 'CGNR' itvecs = addtbl('itvecs', k, 'real', from) call cgnr( & a,colind,rwptr,n, & x,work(r),b, & loctol,reltv,iters,ierr, & work(next),wksz, & work(lu),work(jlu),work(uptr), & pcmeth, iunit, instlvl) end if if (solmeth == sCGS) then k = 6*n if (memleft() - intdbl*k .lt. 0) then errflg = memleft() - intdbl*k return end if from = 'CGS' itvecs = addtbl('itvecs', k, 'real', from) call cgs ( & a,colind,rwptr,n, & x,work(r),b, & loctol,reltv,iters,ierr, & work(next),wksz, & work(lu),work(jlu),work(uptr),& pcmeth, iunit, instlvl) end if if (solmeth == sCGSTAB) then k = 6*n if (memleft() - intdbl*k .lt. 0) then errflg = memleft() - intdbl*k return end if from = 'CGStab' itvecs = addtbl('itvecs', k, 'real', from) call cgstab( & a,colind,rwptr,n, & x,work(r),b, & loctol,reltv,iters,ierr, & work(next),wksz, & work(lu),work(jlu),work(uptr), & pcmeth, iunit, instlvl) end if if (solmeth == sGMRES) then iters(2) = krylov k = krylov*(n+krylov+5)+n if (memleft() - intdbl*k .lt. 0) then errflg = memleft() - intdbl*k return end if from = 'GMRES' itvecs = addtbl('itvecs', k, 'real', from) call gmres( & a,colind,rwptr,n, & x,work(r),b, & loctol,reltv,iters,ierr, & work(next),wksz, & work(lu),work(jlu),work(uptr), & pcmeth, iunit, instlvl) end if if (solmeth == sQMR) then k = 6*n if (memleft() - intdbl*k .lt. 0) then errflg = memleft() - intdbl*k return end if from = 'QMR' itvecs = addtbl('itvecs', k, 'real', from) call qmrtf( & a,colind,rwptr,n, & x,work(r),b, & loctol,reltv,iters,ierr, & work(next),wksz, & work(lu),work(jlu),work(uptr), & pcmeth, iunit, instlvl) end if if (solmeth == sBICGS) then k = 6*n if (memleft() - intdbl*k .lt. 0) then errflg = memleft() - intdbl*k return end if from = 'BiCGS' itvecs = addtbl('itvecs', k, 'real', from) call bicgs( & a,colind,rwptr,n, & x,work(r),b, & loctol,reltv,iters,ierr, & work(next),wksz, & work(lu),work(jlu),work(uptr), & pcmeth, iunit, instlvl) end if if (solmeth == sGMRESt) then iters(2) = krylov k = krylov*(n+krylov+5)+n from = 'GMRESt' itvecs = addtbl('itvecs', k, 'real', from) call gmrest( & a,colind,rwptr,n, & x,work(r),b, & loctol,reltv,iters,ierr, & work(next),wksz, & work(lu),work(jlu),work(uptr), & pcmeth, iunit, instlvl) end if if (solmeth == sJACOBI) then k = n if (memleft() - intdbl*k .lt. 0) then errflg = memleft() - intdbl*k return end if from = 'jacobi' itvecs = addtbl('itvecs', k, 'real', from) if (.not. istbl('diag', from)) then diag = addtbl('diag', n, 'integer', from) next = getnext() wksz = memleft()/intdbl end if call jacobi( & a,colind,rwptr,n, & x,work(r),b, & loctol,reltv,iters,ierr, & work(next),wksz, & work(lu),work(jlu),work(diag), & pcmeth, iunit, instlvl) end if if (solmeth == sGAUSS) then k = n if (memleft() - intdbl*k .lt. 0) then errflg = memleft() - intdbl*k return end if from = 'gauss' itvecs = addtbl('itvecs', k, 'real', from) if (.not. istbl('diag', from)) then diag = addtbl('diag', n, 'integer', from) next = getnext() wksz = memleft()/intdbl end if call gauss( & a,colind,rwptr,n, & x,work(r),b, & loctol,reltv,iters,ierr, & work(next),wksz, & work(lu),work(jlu),work(diag), & pcmeth, iunit, instlvl ) end if if (solmeth == sSUCR) then k = n if (memleft() - intdbl*k .lt. 0) then errflg = memleft() - intdbl*k return end if from = 'sor' itvecs = addtbl('itvecs', k, 'real', from) if (.not. istbl('diag', from)) then diag = addtbl('diag', n, 'integer', from) next = getnext() wksz = memleft()/intdbl end if call sor( & a,colind,rwptr,n, & x,work(r),b, & loctol,reltv,iters,ierr, & work(next),wksz, & rparms,work(jlu),work(diag), & pcmeth, iunit, instlvl) end if if (solmeth == sORTHM) then from = 'orthomin' iters(2) = krylov k = 2*krylov*n if (memleft() - intdbl*k .lt. 0) then errflg = memleft() - intdbl*k return end if itvecs = addtbl('itvecs', k, 'real', from) call orthomin( & a,colind,rwptr,n, & x,work(r),b, & loctol,reltv,iters,ierr, & work(next),wksz, & work(lu),work(jlu),work(uptr), & pcmeth, iunit, instlvl) end if if (solmeth == sCSCGS) then k = 17*n if (memleft() - intdbl*k .lt. 0) then errflg = memleft() - intdbl*k return end if from = 'CSCGS' force = .false. itvecs = addtbl('itvecs', k, 'real', from) call cscgstab( & a,colind,rwptr,n, & x,work(r),b, & loctol,reltv,iters,ierr,force, & work(next),wksz, & work(lu),work(jlu),work(uptr), & pcmeth, iunit, instlvl) end if ! ================================================================== ! *** End Solver Routines *** ! ================================================================== ! -------------------------- ! Time for iterating solver: ! -------------------------- times(2) = mytime() - times(2) - times(7) ! ------------------------------------------------- ! Write out message indicating returned error code. ! ------------------------------------------------- if (results) then if (ierr .lt. 0) write(iunit(1),2120) -intdbl*ierr if (ierr == 0) write(iunit(1),2130) if (ierr == 1) write(iunit(1),2140) if (ierr == 2) write(iunit(1),2150) if (ierr == 3) write(iunit(1),2160) if (ierr == 4) write(iunit(1),2180) if (ierr .gt. 4 .and. ierr .lt. 100) write(iunit(1),2170) ierr if (ierr == 100) then write(iunit(1),2190) Oetli(A,rwptr,colind,n, b, x, .false.) end if end if errflg = ierr ! if (results) write(iunit(1),2090) iters(1) ! ---------------------------- ! Compute final residual norms ! ---------------------------- call bmux(n,n,x,work(r),a,colind,rwptr) call daxpy(n,-1.0d0,b,1,work(r),1) res = dnrm2(n,work(r),1) if (results) write(iunit(1),2000) res ! -------------------------------------------- ! Output timing summary and total memory used: ! -------------------------------------------- space = getnext() - storage if (results) then write(iunit(1),2080) times(3),times(4),times(5) write(iunit(1),2070) times(1),times(2),times(1)+times(2) write(iunit(1),2085) space end if ! ---------------------------- ! Output memory usage summary: ! ---------------------------- from = 'summem' if (results) call listbl(from) if (istbl('diag', from)) call rmvtbl(work(1),'diag',from) if (istbl('itvecs',from)) call rmvtbl(work(1),'itvecs',from) ! ------------------------------------- ! Call getrusage data for timings file: ! ------------------------------------- if (results) call prtres(1,iunit(1)) maxits = iters(1) ! --------------------------- ! Indicate end of output file ! --------------------------- if (results) write(iunit(1),2050) ! ------------------------- ! Copy times over for CADYF ! ------------------------- ! call dcopy(8,times,1,work(sptime),1) ! ------------------------------ ! Unscale x vector to reflect Dr ! ------------------------------ if(scaling == COLSC .or. scaling == RCSC) call unscalx(n, x, Dr) ! ------------------------------------------ ! Unscale the matrix A and right hand side b ! ------------------------------------------ if (scaling == RWSC .or. scaling == RCSC) & call unscala(RWSC, A, rwptr, colind, n, b, Dl) if (scaling == COLSC .or. scaling == RCSC) & call unscala(COLSC, A, rwptr, colind, n, b, Dr) return !=========================================================================== ! *** Format Statements *** !=========================================================================== 1000 format(128(2(4(d16.8,3x),/),/)) 1010 format(128(8(i5,3x),/)) 1090 format(i4,3x,e11.5) 6010 format('

Linear System Information

',/,'',/) 6050 format('

Information about iterative method

',/,'',/) 2100 format('

Results of Execution

',/,'