### Table of contents

### Conditions of use

Copyright 1991-2023 CERFACS, CNRS, ENS Lyon, INP Toulouse, Inria, Mumps Technologies, University of Bordeaux.

This version of MUMPS is provided to you free of charge. It is
released under the CeCILL-C license

(see doc/CeCILL-C_V1-en.txt, doc/CeCILL-C_V1-fr.txt, and
https://cecill.info/licences/Licence_CeCILL-C_V1-en.html),
except for variants of AMD ordering and [sdcz]MUMPS_TRUNCATED_RRQR
derived from the LAPACK package distributed under BSD 3-clause
license (see headers of ana_orderings.F and [sdcz]lr_core.F),
and except for the external and optional ordering PORD provided
in a separate directory PORD (see PORD/README for License information).

You can acknowledge (using references [1] and [2]) the contribution of this package in any scientific publication dependent upon the use of the package. Please use reasonable endeavours to notify the authors of the package of this publication.

[1] P. R. Amestoy, I. S. Duff, J. Koster and J.-Y. L'Excellent, A fully asynchronous multifrontal solver using distributed dynamic scheduling, SIAM Journal on Matrix Analysis and Applications, Vol 23, No 1, pp 15-41 (2001).

[2] P. R. Amestoy, A. Buttari, J.-Y. L'Excellent and T. Mary, Performance and scalability of the block low-rank multifrontal factorization on multicore architectures, ACM Transactions on Mathematical Software, Vol 45, Issue 1, pp 2:1-2:26 (2019)

As a counterpart to the access to the source code and rights to copy, modify and redistribute granted by the license, users are provided only with a limited warranty and the software's author, the holder of the economic rights, and the successive licensors have only limited liability.

In this respect, the user's attention is drawn to the risks associated with loading, using, modifying and/or developing or reproducing the software by the user in light of its specific status of free software, that may mean that it is complicated to manipulate, and that also therefore means that it is reserved for developers and experienced professionals having in-depth computer knowledge. Users are therefore encouraged to load and test the software's suitability as regards their requirements in conditions enabling the security of their systems and/or data to be ensured and, more generally, to use and operate it in the same conditions as regards security.

The fact that you are presently reading this means that you have had knowledge of the CeCILL-C license and that you accept its terms.

### Download request submission

**To obtain the MUMPS package**, please fill in the form below with care,
and you will receive it by separate e-mail.
**Feedback** or questions related to MUMPS are welcome. Please use the
email address given at the beginning of the README file in the source code
archive.

#### CHECKSUM (SHA256) for last release archive :

1920426d543e34d377604070fde93b8d102aa38ebdf53300cbce9e15f92e2896### ChangeLog/History

#### ChangeLog

##### Changes from 5.6.0 to 5.6.1

- Fixed compatibility of parallel analysis with analysis by blocks/out-of-range
- Minor fix of INFO(2) value in a case of INFO(1)=-7 with parallel analysis
- WRITE_PROBLEM: avoid leading space when printing "%%MatrixMarket matrix [..]"
- Fix incorrect error code -74 instead of -76 (save-restore)
- Avoid a remaining hard-coded Fortran unit (save-restore)
- Avoid -Wundef warning in case __STDC_VERSION__ is not defined
- Avoid calling MPI_IPROBE in case of a single MPI process (performance improvement on matrices with small fronts)
- During analysis, write time for symbolic factorization
- Update GEMMT link in INSTALL file

##### Changes from 5.5.1 to 5.6.0

- Analysis by blocks and out-of-range entries compatible with parallel analysis
- Compact workarray S before solution phase (ICNTL(49)=1,2)
- JOB=-4 frees data from factorization and keeps results from analysis
- NEC vector engine version: tuned block low rank and GEMMT usage
- Reduced memory for symbolic datastructures (order-N arrays, arrowheads)
- Use new symbolic factorization (column counts) by default (+allow forest in case of Schur)
- Improved amalgamation algorithm to reduce the number tiny nodes
- Discard factors option is now compatible with BLR
- Discard L factors option in case of LU factorization available for in-core
- Count null negative pivots (INFOG(50))
- Compilation with -DBLR_MT is no longer needed (see also -DBLR_NOOPENMP)
- Improve performance of A-1 entries computation when #MPI > 1
- Fixed risk of hanging in case of OOC error (e.g. disk full)
- Free internal data earlier (e.g., factors in case of failed factorization, etc.)
- Avoid setenv/unsetenv on MinGW environments (Scotch versions >= 7)
- Fix error during ordering when processing matrices of order 1
- Out-Of-Core, values of ICNTL(22) different from 1 are treated as zero (in-core)
- BLKPTR ignored when ICNTL(15) < 0 (INFO(1:2)=-57 4 no longer occurs)
- BLR OpenMP fac_lr.F fix for possible non-increasing scheduling of loop indices
- Fix: id%LRGROUPS was not nullified in CMUMPS_FREE_ONENTRY_ANA_DRIVER
- Fixed parameter error 1202 in PDGETRS after restoring a factorization
- Workaround MPI_SSEND intel MPI 2021.6 issue
- NEC: Fix MUMPS_WRAP_GINP94 offloading to host
- Shared libraries: minor update in Makefiles
- Fixed missing printings of preprocessing constants in mumps_print_defined.F
- -DNOSCALAPACK: compilation without BLACS/ScaLAPACK (reduced performance)

##### Changes from 5.5.0 to 5.5.1

- 64-bit default integer version: avoid MPI buffer size > 2^31 (could lead to negative counts in MPI_Recv with Intel ilp64 MPI)
- Use ICNTL(3) unit instead of '*' to print parallel ordering time
- Output symbolic factorization choice (ICNTL(58)) on ICNTL(3) unit
- SCOTCH 7.x on Windows: use
__putenv instead of setenv (unsymmetric case, #MPI >1)__ __Fixed parallel analysis in case of 64-bit integers in parmetis and -i8 MUMPS____Limit recursivity depth in mumps__static_mapping.F (limit stack usage on trees with small nodes at the top and large nodes below)- Limit subroutine names to 32 characters
- -DWORKAROUNDILP64MPICUSTOMREDUCE introduced (platformMPI and 64-bit default integer)

##### Changes from 5.4.1 to 5.5.0

- New feature: analysis by blocks with compressed graph (ICNTL(15))
- New feature: symbolic factorization using column counts (ICNTL(58))
- Matrix scaling compatible with Schur complement feature
- Automatic setting of CNTL(3) and CNTL(5) compatible with Schur complement feature
- Fixed possible error in memory (INFOG(3)) and FLOPS (RINFOG(1)) estimations with Schur feature
- Memory allowed (ICNTL(23)) is per MPI process and compatible with WK_USER
- More compact storage of L factors for symmetric matrices
- Exploit multithreading in SCOTCH versions >= 7.0
- Support for NEC Aurora vector processors
- Possibility to build shared instead of static libraries
- Adjustment of BLR cluster size
- Shorter time for matrix redistribution
- Avoid a possible case of overflow in computation of MAXIS
- Fixed memory estimate of PTRAIW/PTRARW arrays
- -DDPRINT_BACKTRACE_ON_ABORT added in case of call to MUMPS_ABORT()
- More usage of dynamic memory to avoid -9 errors
- Fixed a possible "Internal error 3 in ZMUMPS_BLR_RETRIEVE_PANEL_LORU"
- Change construction of mumps_int_def.h to allow for cross compilation
- Fixed a possible error in deficiency detection on unsymmetric matrices
- Duplicated few loops during solve (OMP / non-OMP) for gfortran performance
- Fixed compilation issue with -DUPPER and -DMUMPS_WIN32 (5.4.1 regression)
- Find free Fortan units instead of imposing a choice (WRITE_PROBLEM, save/restore)
- Avoid freeing IRN/JCN when not allocated by MUMPS
- Fixed a potential memory leak in dynamic memory management
- -DALLOC_FROM_C introduced to fix dynamic memory management with Cray compilers

##### Changes from 5.4.0 to 5.4.1

- Added feature to dump a matrix/rhs in binary form (see id%WRITE_PROBLEM)
- Fixed error during constrained ordering (ICNTL(12)=3) leading to a segfault
- Fixed type 2 assembly by pieces in case of huge blocks of delayed pivots
- Avoid repeated small allocations in MUMPS_BUILD_SORT_INDEX
- Fixed automatic choice for ICNTL(6) when values are not provided at analysis
- Avoid an access to an uninitialized value (FLOP_FRFRONTS, MPI only, LDLT)
- Fixed error printing in case of error -53
- Fixed an erroneous -9 error in case of large Schur + discard factors
- -DWORKAROUNDINTELILP64OPENMPLIMITATION avoids -i8 compilation warnings
- Limited repeated amalgamations of large child with small parent

##### Changes from 5.3.5 to 5.4.0

- Modified default threshold for null pivot detection (CNTL(3))
- Improved performance of Schur complement computations
- Reduced memory of centralized analysis with distributed matrix
- Improved BLR clustering time during analysis
- Improved BLR performance when CNTL(1)=0.0 and ICNTL(36)=0
- Improved OOC performance in case CNTL(1)=0.0 or SYM=1
- Improved multitreaded performance on matrices with many 2x2 pivots
- Fixed a problem with parallel compilation (in case of: make -j all)
- Fixed some timings during solve that were printed as 0 when PAR=0
- Avoid possible integer overflow on INFOG(10)
- Fixed type 1 assembly by pieces in case static CB moves to dynamic storage
- Use with ITAC: avoid errors (e.g. in DMUMPS_UPPER_PREDICT...) due to MPI returning nonzero error codes when traces are activated
- Add a JOB=-200 call (no MPI communications, clean local OOC files)
- Fixed "Start processing the root ..." printing in case of Schur
- Fixed SYM_PERM in case of Schur on reducible matrices

##### Changes from 5.3.4 to 5.3.5

- Fixed 2x2 pivots bug from 5.3.4 release in MPI LDLT factorization
- Fixed ICNTL(8)=-2 option during analysis (code and documentation)
- Include basic fix for make -j

##### Changes from 5.3.3 to 5.3.4

- Fixed a rare bug (segfault) related to dynamic storage management on numerically difficult matrices
- Fixed a rare deadlock in BLR for symmetric matrices
- Fixed an uninitialized variable (which could lead to incorrect -19 error)
- Minor fix in userguide (CNTL(1) vs. ICNTL(1) in ICNTL(36) description)
- Fixed a possible runtime issue during solve, related to "TO_PROCESS" array

##### Changes from 5.3.1 to 5.3.3

- Assume ilp64 MPI interface only applies to Fortran in c_example.c
- Note on gfortran-10 compilation added
- Avoid intent on pointers (F2003-only)
- More robust multithreading for matrix reformatting (arrowheads)
- Fixed ICNTL(31) interpretation in case of repeated analysis
- Fixed multiple mpif.h inclusion (distributed rhs, ifort+openmpi)
- Fixed computation of effectively used memory statistics
- Minor fix in userguide
- Suppressed a !$OMP CRITICAL from solve phase (introduced in 5.3.0)

##### Changes from 5.3.0 to 5.3.1

- Improved multithreaded performance of BLR backward solve
- Fixed return code in build_mumps_int_def.c + openmp compilation (pgi)
- Forbid a loop vectorization in [sdcz]sol_c.F (segfault with ifort)

##### Changes from 5.2.1 to 5.3.0

- New feature: distributed right-hand sides
- Improved time for arrowheads construction (single MPI case, mainly)
- C interface: ability to know if MUMPS_INT is 64-bit from include file
- Improved BLR performance when CNTL(1)=0.0 and ICNTL(36)=1
- Fixed INFO(34),INFO(35),INFO(37),INFO(38) on processes with rank > 0
- More portable MPI_IS_IN_PLACE feature in libseq
- Fixed determinant computation when Cholesky ScaLapack is used
- Information on advancement (flops done) on each MPI process
- Allow rhs_sparse and irhs_sparse to be unassociated if nz_rhs=0
- Fixed INFO(30) and INFO(31) computation on MPI processes with rank > 0
- OMP collapsed loops: avoid FIRSTPRIVATE on internal loop bound (for pgi)
- Fix for compilers not freeing local allocatable arrays (64-bit metis)
- Fixed RINFO(5-6) and RINFOG(15-16) metrics (
**entries=>**bytes) - C interface: A_ELT/SCHUR/RHS/REDRHS/RHS_loc/SOL_loc may exceed 2^31 entries
- Local Schur (ICNTL(19)=2 or 3) may now exceed 2^31 entries
- Fixed internal dynamic storage of blocks with more than 2^31 entries
- Fixed a bug in the parallel analysis that limited scalability

##### Changes from 5.2.0 to 5.2.1

- Fixed a minor "Internal error in CMUMPS_DM_FREEALLDYNAMICCB"
- Default value of ICNTL(14) for MPI executions independent of SYM + slightly less aggressive than for 5.2.0
- Avoided accesses to uninitialized data in symmetric (2D root, BLR)
- Fixed some incorrect "out" intents for routine arguments
- Avoided CHUNK=0 in OMP loops even if loop not parallelized (pgi)
- Fixed COLSCA&ROWSCA declarations in [SDCZ]MUMPS_ANA_F
- Avoided a possible segfault in presence of NaN's in pivot search
- Minor update to userguide
- Fixed MPI_IN_PLACE usage in libseq (preventing compiler optimization)

##### Changes from 5.1.2 to 5.2.0

- Memory gains due to low-rank factorization are now effective, low-rank solve
- Internal dynamic storage possible in case static workspace too small
- Improved distributed memory usage and MPI granularity (some sym. matrices)
- Improved granularity (and performance) for symmetric matrices; ability to use [DSCZ]GEMMT kernel (BLAS extension) if available (see INSTALL)
- A-1 functionality: improved performance due to solution gathering
- Memory peak for analysis reduced (distributed-entry, 64-bit orderings)
- Time for analysis reduced by avoiding some preprocessing (when possible)
- More exploitation of RHS sparsity during forward substitution
- Ability to save/restore an instance to/from disk
- INFO and INFOG dimension extended from 40 to 80
- METIS_OPTIONS introduced for METIS users to define some specific Metis options
- MUMPS can be asked to call omp_set_num_threads with a value provided in ICNTL(16)
- Fixed: INFO(16)/INFOG(21)/INFOG(22) did not take into account the extra memory allocated due to memory allowed (ICNTL(23)>0); INFOG(8) was not correclty set
- Initialize only lower-diagonal part for workers in symmetric type 2 fronts
- Workaround a segfault at beg. of facto due to a gfortran-8 bug
- Fixed a bug in weighted matching algorithm when all matrix values are 0
- Portability: include stdint.h instead of int_types.h
- Forced some initializations to make C interface more valgrind-friendly
- Workaround intel 2017 vectorization bug in pivot search (symmetric+MPI+large matrices)
- Stop trying to send messages on COMM_LOAD in case of error (risk of deadlock)
- Avoided most array creation by compiler due to Fortran pointers
- Avoid two cases of int. overflow (KEEP(66), A-1 with large ICNTL(27))
- Fixed a bug with compressed ordering (ICNTL(12)=2) (regression from 5.0.0) and suppress compress ordering only in case of automatic setting

##### Changes from 5.1.1 to 5.1.2

- Corrected an overestimation of memory (regression from 5.1.0)
- Corrected/extended WORKAROUNDINTELILP64MPI2INTEGER mechanism (see INSTALL)
- Parallel analysis: fixed a bug, limited number of MPI processes on small problems, and reverted to sequential analysis on tiny problems. This is to avoid erroneous behavior and failures in the parallel ordering tools.
- Faster BLR clustering on matrices with quasi-dense rows (which are skipped)
- Improved performance of solve phase on very small matrices
- Solve phase with a single MPI process is more thread-safe
- Fixed compilation issue with opensolaris ([SDCZ]MUMPS_TRUNCATED_RRQR)
- Fixed minor bug in BLR factorization (uninitialized timer)
- Corrected minor compiler warnings
- Minor correction to userguide
- Add -DBLR_MT in Intel example Makefile

##### Changes from 5.1.0 to 5.1.1

- Fix in parallel analysis
- Stabilization of 5.1.0:
- Improved stability of Block-Low-Rank feature
- Corrected an incorrect deallocation of POSINRHSCOMP_COL
- Correction of a case of uninitialized data access in type 2 pivoting
- Suppressed occasional debug trace "write(6,*) " KEEP265= ", KEEP265"

##### Changes from 5.0.2 to 5.1.0

- New feature: selective 64-bit integers (introduced only where needed)
to process matrices with more than 2^{31}-1 entries.
-mixed 32/64 bit integers for API: NNZ/NNZ_loc 64-bit
(NZ/NZ_LOC kept temporarily for backward compatibility)
- both 32 or 64 bit integer versions of external orderings (Metis/ParMetis, SCOTCH/pt-SCOTCH, PORD), can be used
- Error -51 when a 32-bit external ordering is invoked on a graph larger than 2^{31}-1

- New feature: (experimental) factorization based on Block-Low-Rank format, (ICNTL(35) to activate it and CNTL(7) for low-rank precision)
- Improved performance on numerically hard matrices (LU and LDLt)
- "-DALLOW_NON_INIT" flag has disappeared and needs no longer be used
- Fixed incorrect deallocation in case of JOB=3/ICNTL(26)=1 followed by JOB=2
- Fixed compilation problem with Intel2017 + openMPI in [sdcz]ana_aux_par.F
- Minor correction of memory statistics for solve
- Use 64-bit integers where needed during the solve phase to enable large number of right-hand-sides (NRHS) in one block (i.e. ICNTL(27)xN can be larger than 2^{31}-1)
- Improved performance of solve phase
- Allow pivoting thresholds CNTL(1) equal to 1.0
- New error -52: when default Fortran integers are 64 bit, external orderings should also have 64-bit default integers
- New error -22, INFO(2)=16 when IRN_loc or JCN_loc not associated while ICNTL(18) is set to 3
- Missing O_BINARY flag was added to open binary files on MINGW systems
- New error -53 that could reflect a matrix structure change between analysis and factorization

##### Changes from 5.0.1 to 5.0.2

- Suppress error on id%SCHUR_CINTERFACE in mumps_driver.F when bound check is enabled and when using 2D block cyclic Schur complement feature (ICNTL(19)=2 or 3) from C or Matlab interfaces
- Problem of failed assertion in [SDCZ]MUMPS_TREAT_DESCBAND solved (static variable INODE_WAITED_FOR was not initialized and was not detected by valgrind)
- Correction of very minor memory leaks and access to uninitialized data
- A setting of INFO(1)=-1-17 should have been INFO(1)=-17
- Some settings of INFO(1)=-17 should have been INFO(1)=-20
- Suppress absolute tolerance 10^-20 in pivot selection for SYM=2; skip 2x2 pivot search if only 1 pivot candidate, avoid pivots that are subnormal numbers (their inverse is equal to infinity)
- Warning +2 now only occurs when solution is really close to 0
- Occasional bug in OOC and multiple instances solved
- Better selection of equations for bwd errors (W1 and W2) and better forward error estimates on some machines with 80-bit registers
- Improved users' guide (OOC files cleaning, permutation details, usage of multithreading, clarification of MegaByte unit)
- Cleaning of asynchronous messages after facto/solve was revisited and is more robust
- More robust suppression of integer overflow risk during solve for huge ICNTL(23)
- Improved performance of symbolic factorization in case of matrices with relatively dense rows and/or with large number of Lagrange multipliers
- Improved performance of numerical factorization phase during pivot search for symmetric indefinite matrices
- Use of -xcore-avx2 requires !DEC$ NOOPTIMIZE in MUMPS_BIT_GET4PROC with current versions of Intel compilers
- Suppressed some temporary array creation and implicit conversions

##### Changes from 5.0.0 to 5.0.1

- Iterative refinement convergence check corrected (problem introduced in 5.0.0)
- Used communicator provided by user instead of MPI_COMM_WORLD in two places (parallel analysis only)
- Matlab interface patched to avoid memory corruption in some situations (Schur, colsca/rowsca management)
- Corrected a case of error not properly processed which could cause a segfault instead of a standard "-9" error, or an abort on "ERR: ERROR: NBROWS > NBROWF"
- Amalgamation without fill forced for single children
- (rare) segfault related to assemblies of delayed columns in scalapack root node corrected
- Automatic strategy for ordering choice improved
- Further improvements to userguide (mainly iterative refinement, error analysis, discard factors and forward elimination during factorization)
- Error -51 also raised in case of integer overflow during parallel analysis

##### Changes from 4.10.0 to 5.0.0

- Userguide revisited
- Compatibility with Metis 5.1.0/ParMetis 4.0.3, and with SCOTCH/pt-SCOTCH 6.0
- Matlab interface updated (scaling vectors (COLSCA, ROWSCA) and A-1 feature ICNTL(30) are now available)
- Improved sequential and parallel performance for computing selected entries of A-1 (ICNTL(30))
- Workspace for solve phase, of size B x N per processor (B: block size controlled by ICNTL(27)) divided by almost #procs. Default value of B increased.
- Parallel symmetric indefinite elemental matrices: improved numerical behaviour
- Performance of solve phase improved
- Finer control of error analysis and iterative refinement (ICNTL(11))
- Memory for analysis phase (mapping) reduced.
- Better support for 64-bit integers (see INSTALL file)
- Error raised instead of silent integer overflow during analysis (but not during external orderings)
- Improvements and corrections to parallel analysis (ICNTL(28)), deterministic graph construction forced with -DDETERMINISTIC_PARALLEL_GRAPH
- Forward elimination (ICNTL(32)) can be done during factorization
- Possibility to use a workspace (WK_USER, LWK_USER) allocated by user
- Very occasional numerical bug in parallel out-of-core case corrected (thanks to EDF and Samtech for the validation)
- More efficient processing of sparse right-hand-sides (see ICNTL(20))
- Count for entries in factors now include parallel root node
- Amalgamation of the assembly tree revisited
- Scaling arrays (COLSCA, ROWSCA) also returned at C interface level
- OOC_NB_FILE_TYPE is part of the MUMPS structure, for a better management of multiple OOC instances
- Warning +2 set only once (could lead to incorrect +4 in case of iterative refinement + error analysis)
- Warning +4 has disappeared from documentation (since it was never occurring -- JCN never modified on exit)
- Error code -16 now raised for the case N=0 even on distributed matrices (thanks to P. Jolivet for noticing this)
- Use BLAS3 routines for efficiency even in case of BLAS2 operations (-DMUMPS_USE_BLAS2 allows the use of BLAS2 routines for such operations)
- Message "problem with NIV2_FLOPS message" should no more occur (there was still an occasional problem in 4.10.0)
- Improved determinant computation (ICNTL(33)) in case of singular matrix + scaling (where zero pivots are excluded)
- Trace ' PANEL: INIT and force STRAT_IO=' suppressed
- Some OpenMP directives added (multithreaded BLAS still needed)
- Later allocation of strips of distributed fronts with improved locality
- Front factorization algorithms redesigned (two levels of panels)
- Null pivot (ICNTL(24)) and null space detection ICNTL(25)) improved for unsymmetric matrices
- Fortran automatic arrays (e.g. in mumps_static_mapping.F) suppressed to avoid risks of stack overflows
- Routine names and filenames changed

##### Changes from 4.9.2 to 4.10.0

- Modified variable names and variable contents in Make.inc/Makefile* for Windows (Makefile.inc from an older version needs modifications, please do a diff)
- Option to discard factors during factorization when not needed (ICNTL(31))
- Option to compute the determinant (ICNTL(33))
- Experimental "A-1" functionality (ICNTL(30))
- Matlab interface updated for 64-bit machines
- Improved users' guide
- Suppressed a memory leak occurring when Scalapack is used and user does loops on JOB=6 without JOB=-2/JOB=-1 in-between
- Avoid occasional deadlock with huge values of ICNTL(14)
- Avoid problem of -17 error code during solve phase
- Avoid checking association of pointer arrays ISOL_loc and SOL_loc on procs with no components of solution (small problems)
- Some data structures were not free at the end of the parallel analysis. Bug fixed.
- Fixed unsafe test of overflow "IF (WFLG+N .LE. WFLG)"
- Large Schur complements sent by blocks if ICNTL(19)=1 (but options ICNTL(19)=2 or 3 are recommended when Schur complement is large)
- Corrected problem with sparse RHS + unsymmetric permutation + transpose solve (problem appeared in 4.9)
- Case where ICNTL(19)=2 or 3 and small value of SIZE_SCHUR causing problems in parallel solved.
- In case an error is detected, solved occasional problem of deallocating non-allocated local array PERM.
- Correction in computation of matrix norm in complex arithmetic (MPI_COMPLEX was used in place of MPI_REAL in MPI_REDUCE)
- Scaling works on singular matrices
- Compilation problem with -i8 solved
- MUMPS_INT used in OOC layer to facilitate compilation with 64 bit integers

##### Changes from 4.9.1 to 4.9.2

- Compressed orderings (ICNTL(12)=2) are now compatible with PORD and PT-Scotch
- Mapping problem on large numbers of MPI processes, leading to INFOG(1)=-135 on "special" matrices solved (problem appeared in 4.9.1)

##### Changes from 4.9 to 4.9.1

- Balancing on the processors of both work and memory improved. In a parallel environment memory consumption should be reduced and performance improved
- Modification of the amalgamation to solve both the problem of small root nodes and the problem of tiny nodes implying too many small MPI messages
- Corrected bug occurring on big-endian environments when passing a 64-bit integer argument in place of 32-bit one. This was causing problems in parallel, when ScaLAPACK is used, on IBM machines.
- Internal ERROR 2 in MUMPS_271 now impossible (was already not happening in practice)
- Solved compiler warnings (or even errors) related to the order of the declarations of arrays and array sizes
- Parallel analysis: fixed the problem due to the invocation of the size function on non-allocated pointers, corrected a bug due to initialization of pointers in the declaration statements, and improved the Makefiles
- Corrected bug in the reallocation of arrays
- Corrected several accesses to uninitialized variables
- Internal Error (4) in OOC (MUMPS_597) no more occurs
- Suppressed possible printing of "Internal WARNING 1 in CMUMPS_274"
- (Minor) numerical pivoting problem in parallel LDLt solved
- Estimated flops corrected when SYM=2 and Scalapack is used (because we use LU on the root node, not LDLt, in that case)
- Scaling option effectively used is now returned in INFOG(33) and ICNTL(8) is no more modified by the package
- INFO(25) is now correctly documented, new statistic INFO(27) added

##### Changes from 4.8.4 to 4.9

- Parallel analysis available
- Use of 64-bit integer addressing for large internal workarrays
- overflow in computation of INFO(9) in out-of-core corrected
- fixed Matlab and Scilab interfaces to sparse RHS functionality
- time cost of analysis reduced for "optimisation" matrices
- time to gather solution on processor 0 reduced and automatic copying of some routine arguments by some compilers resolved.
- extern "C" added to header file mpi.h of libseq for C++ compilers
- Problem with NZ_loc=0 and scaling with ifort 10 solved
- Statistics about current state of the factorization produced/printed even in case of error.
- Avoid using complex arrays as real workspace (complex versions)
- New error code -40 (instead of -10) when SYM=1 is used and ScaLAPACK detects a negative pivot
- Solved problem of "Internal error 1" in [SDCZ]MUMPS_264 and [SDCZ]MUMPS_274
- Solved undeterministic bug occurring with asynchronous OOC + panels when uninitialized memory access had value -7777
- Fixed a remaining problem with OOC filenames having more than 150 characters
- Fixed some problems related to the usage of intrinsic functions inside PARAMETER statements (HP-UX compilers)
- Fixed problem of explicit interface in [SDCZ]MUMPS_521
- Out-of-core strategy from 4.7.3 can be reactivated with -DOLD_OOC_NOPANEL
- Message "problem with NIV2_FLOPS message" should no more occur
- Avoid compilation problem with old versions of gfortran

##### Changes from 4.8.3 to 4.8.4

- Absolute threshold criterion for null pivot detection added to CNTL(3)
- Problems related to messages "Increase small buffer size ..." solved.
- New option for ICNTL(8) to scale matrices. Default scaling cheaper to compute
- Problem of filename clash with unsymmetric matrices on Windows platforms solved
- Allow for longer filenames for temporary OOC files
- Strategy to update blocksize during factorization of frontal matrices modified to avoid too large messages during pipelined factorization (that could lead to a -17 error code)
- Messages corresponding to delayed pivots can now be sent in several packets. This avoids some other cases of error -17
- One rare case of deadlock solved
- Corrected values and sign of INFO(8) and INFO(20)

##### Changes from 4.8.2 to 4.8.3

- Fix compilation issues on Windows platforms
- Fix ranlib issue with libseq on MacOSX platforms
- Fix a few problems of uninitialized variables

##### Changes from 4.8.1 to 4.8.2

- Problem of wrong argument in the call to [sdcz]mumps_246 solved
- Limit occurrence of error -11 in the in-core case
- Problem with the use of SIZE on an unassociated pointer solved
- Problem with distributed solution combined with non-working host solved
- Fix generation of MM matrices
- Fix of a minor bug in OOC error management
- Fix portability issues on usleep

##### Changes from 4.8.0 to 4.8.1

- New distributed scaling is now on by default for distributed matrices
- Error management corrected in case of 32-bit overflow during factorization
- SEPARATOR is now defined as "\\" in Windows version
- Bug fix in OOC panel version

##### Changes from 4.7.3 to 4.8.0

- Parallel scalings algorithms available
- Possibility to dump a matrix in matrix-market format from both C and Fortran interfaces
- Correction when dumping a distributed matrix in matrix-market format
- Minor numerical stability problem in some LDL^t parallel factorizations corrected.
- Memory usage significantly reduced in both parallel and sequential (limit communication buffers, in-place assembly for assembled matrices, overlapping during stack).
- Better alignment properties of mumps_struc.h
- Reduced time for static mapping during the analysis phase.
- Correction in dynamic scheduler
- "Internal error 2 in DMUMPS_26" no more occurs, even if SIZE_SCHUR=0
- Corrections in the management of ICNTL(25), some useful code was protected with -Dtry_null_space and not compiled.
- Scaling arrays are now declared real even in complex versions
- Out-of-core functionality storing factors on disk
- Possibility to tell MUMPS how much memory the package is allowed to allocate (ICNTL(23))
- Estimated and effective number of entries in factors returned to user
- API change: MAXS and MAXIS have disappeared from the interface, please use ICNTL(14) and ICNTL(23) to control the memory usage
- Error code -11 raised less often, especially in out-of-core executions
- Error code -14 should no more occur
- Memory used at the solve phase is now returned to the user
- Possibility to control the blocking size for multiple right-hand sides (strong impact on performance, in particular for out-of-core executions)
- Solved problems of 32-bit integer overflows during analysis related to memory estimations.
- New error code -37 related to integer overflows during factorization
- Compile one single arithmetic with make s, make d, make c or make z, examples are now in examples/, test/ has disappeared.
- Arithmetic-independent parts are isolated into a libmumps_common.a, that must now be linked too (see examples/Makefile).

##### Changes from 4.7.2 to 4.7.3

- detection of null pivots for unsymmetric matrices corrected
- improved pivoting in parallel symmetric solver
- possible problem when Schur on and out-of-core : Schur was splitted
- type of parameters of intrinsic function MAX not compatible in single precision arithmetic versions.
- minor changes for Windows
- correction with reduced RHS functionality in parallel case

##### Changes from 4.7.1 to 4.7.2

- negative loads suppressed in mumps distribution

##### Changes from 4.7 to 4.7.1

- Release number in Fortran interface corrected
- "Negative load !!" message replaced by a warning

##### Changes from 4.6.4 to 4.7

- New functionality: build reduced RHS / use partial solution
- New functionality: detection of zero pivots
- Memory reduced (especially communication buffers)
- Problem of integer overflow "MEMORY_SENT" corrected
- Error code -20 used when receive buffer too small (instead of -17 in some cases)
- Erroneous memory access with singular matrices (since 4.6.3) corrected
- Minor bug correction in hybrid scheduler
- Parallel solution step uses less memory
- Performance and memory usage of solution step improved
- String containing the version number now available as a component of the MUMPS structure
- Case of error "-9964" has disappeared

##### Changes from 4.6.3 to 4.6.4

- Avoid name clashes (F_INT, ...) when C interface is used and user wants to include, say, smumps_c.h, zmumps_c.h (etc.) at the same time
- Avoid large array copies (by some compilers) in distributed matrix entry functionality
- Default ordering less dependent on number of processors
- New garbage collector for contribution blocks
- Original matrix in "arrowhead form" on candidate processors only (assembled case)
- Corrected bug occurring rarely, on large number of processors, and that depended on value of uninitialized data
- Parallel LDL^t factorization numerically improved
- Less memory allocation in mapping phase (in some cases)

##### Changes from 4.6.2 to 4.6.3

- Reduced memory usage for symmetric matrices (compressed CB)
- Reduced memory allocation for parallel executions
- Scheduler parameters for parallel executions modified
- Memory estimates (that were too large) corrected with 2Dcyclic Schur complement option
- Portability improved (C/Fortran interfacing for strings)
- The situation leading to Warning "RHS associated in MUMPS_301" no more occurs.
- Parameters INFO/RINFO from the Scilab/Matlab API are now called INFOG/RINFOG in order to match the MUMPS user's guide.

##### Changes from 4.6.1 to 4.6.2

- Metis ordering now available with Schur option
- Schur functionality correctly working with Scilab interface
- Occasional SIGBUS problem on single precision versions corrected

##### Changes from 4.6 to 4.6.1

- Problem with hybrid scheduler and elemental matrix entry corrected
- Improved numerical processing of symmetric matrices with quasi-dense rows
- Better use of Blacs/Scalapack on processor grids smaller than MPI_COMM_WORLD
- Block sizes improved for large symmetric matrices

##### Changes from 4.5.6 to 4.6

- Official release with Scilab and Matlab interfaces available
- Correction in 2x2 pivots for symmetric indefinite complex matrices
- New hybrid scheduler active by default

##### Changes from 4.5.5 to 4.5.6

- Preliminary developments for an out-of-core code (not yet available)
- Improvement in parallel symmetric indefinite solver
- Preliminary distribution of a SCILAB and a MATLAB interface to MUMPS.

##### Changes from 4.5.4 to 4.5.5

- Improved tree management
- Improved weighted matching preprocessing: duplicates allowed, overflow avoided, dense rows
- Improved strategy for selecting default ordering
- Improved node amalgamation

##### Changes from 4.5.3 to 4.5.4

- Double complex version no more depends on double precision version.
- Simplification of some complex indirections in mumps_cv.F that were causing difficultiels to some compilers.

##### Changes from 4.5.2 to 4.5.3

- Correction of a minor problem leading to INFO(1)=-135 in some cases.

##### Changes from 4.5.1 to 4.5.2

- correction of two uninitialized variables in proportional mapping

##### Changes from 4.5.0 to 4.5.1

- better management of contribution messages
- minor modifications in symmetric preprocessing step

##### Changes from 4.4.0 to 4.5.0

- improved numerical features for symmetric indefinite matrices
- two-by-two pivots
- symmetric scaling
- ordering based on compressed graph preserving two by two pivots
- constrained ordering

- 2D cyclic Schur better validated
- problems resulting from automatic array copies done by compiler corrected
- reduced memory requirement for maximum transversal features

##### Changes from 4.3.4 to 4.4.0

- 2D block cyclic Schur complement matrix
- symmetric indefinite matrices better handled
- Right-hand side vectors can be sparse
- Solution can be kept distributed on the processors
- Metis allowed for element-entry
- Parallel performance and memory usage improved:
- load is updated more often for type 2 nodes
- scheduling under memory constraints
- reduced message sizes in symmetric case
- some linear searches avoided when sending contributions

- Avoid array copies in the call to the partial mapping routine (candidates); such copies appeared with intel compiler version 8.0.
- Workaround MPI_AllReduce problem with booleans if mpich and MUMPS are compiled with different compilers
- Reduced message sizes for CB blocks in symmetric case
- Various minor improvements

##### Changes from 4.3.3 to 4.3.4

- Copies of some large CB blocks suppressed in local assemblies from child to parent
- gathering of solution optimized in solve phase

##### Changes from 4.3.2 to 4.3.3

- Control parameters of symbolic factorization modified.
- Global distribution time and arrowheads computation slightly optimized.
- Multiple Right-Hand-Side implemented.

##### Changes from 4.3.1 to 4.3.2

- Thresholds for symbolic factorization modified.
- Use merge sort for candidates (faster)
- User's communicator copied when entering MUMPS
- Code to free CB areas factorized in various places
- One array suppressed in solve phase

##### Changes from 4.3 to 4.3.1

- Memory leaks in PORD corrected
- Minor compilation problem on T3E solved
- Avoid taking into account absolute criterion CNTL(3) for partial LDLt factorization when whole column is known (relative stability is enough).
- Symbol MPI_WTICK removed from mpif.h
- Bug wrt inertia computation INFOG(12) corrected

##### Changes from 4.2beta to 4.3

- C INTERFACE CHANGE: comm_fortran must be defined from the calling program, since MUMPS uses a Fortran communicator (see user guide).
- LAPACK library is no more required
- User guide improved
- Default ordering changed
- Return number of negative diagonal elements in LDLt factorization (except for root node if treated in parallel)
- Rank-revealing options no more available by default
- Improved parallel performance
- new incremental mechanism for load information
- new communicator dedicated to load information
- improved candidate strategy
- improved management of SMP platforms

- Include files can be used in both free and fixed forms
- Bug fixes:
- some uninitialized values
- pbs with size of data on t3e
- minor problems corrected with distributed matrix entry
- count of negative pivots corrected
- AMD for element entries
- symbolic factorization
- memory leak in tree reordering and in solve step

- Solve step uses less memory (and should be more efficient)

##### Changes from 4.1.6 to 4.2beta

- More precisions available (single, double, complex, double complex).
- Uniprocessor version available (doesn't require MPI installed)
- Interface changes (Users of MUMPS 4.1.6 will have to slightly
modify their codes):
- MUMPS -> ZMUMPS, CMUMPS, SMUMPS, DMUMPS depending the precision
- the Schur complement matrix should now be allocated by the user before the call to MUMPS
- NEW: C interface available.
- ICNTL(6)=6 in 4.1.6 (automatic choice) is now ICNTL(6)=7 in 4.2

- Tighter integration of new ordering packages (for assembled matrices),
see the description of ICNTL(7):
- AMF,
- Metis,
- PORD,

- Memory usage decreased and memory scalability improved.
- Problem when using multiple instances solved.
- Various improvments and bug fixes.

##### Changes from 4.1.4 to 4.1.6

- Modifications/Tuning done by P.Amestoy during his visit at NERSC.
- Additional memory and communication statistics.
- minor pbs solved.

##### Changes from 4.0.4 to 4.1.4

- Tuning on Cray T3e (and minor debugging)
- Improved strategy for asynchronous communications (irecv during factorization)
- Improved Dynamic scheduling and splitting strategies
- New maximal transversal strategies
- New Option (default) automatic decision for scaling and maximum transversal

#### Release history

Release 5.6.1 | : July 2023 |

Release 5.6.0 | : April 2023 |

Release 5.5.1 | : July 2022 |

Release 5.5.0 | : April 2022 |

Release 5.4.1 | : August 2021 |

Release 5.4.0 | : April 2021 |

Release 5.3.5 | : October 2020 |

Release 5.3.4 | : September 2020 |

Release 5.3.3 | : June 2020 |

Release 5.3.2[.x] | : May 2020, internal/experimental |

Release 5.3.1 | : April 2020 |

Release 5.3.0 | : April 2020 |

Release 5.2.1 | : June 2019 |

Release 5.2.0 | : April 2019 |

Release 5.1.2 | : October 2017 |

Release 5.1.1 | : March 2017 |

Release 5.1.0 | : Feb 2017, internal release (limited diffusion) |

Release 5.0.2 | : July 2016 |

Release 5.0.1 | : July 2015 |

Release 5.0.0 | : February 2015 |

Release 4.10.0 | : May 2011 |

Release 4.9.2 | : November 2009 |

Release 4.9.1 | : October 2009 |

Release 4.9 | : July 2009 |

Release 4.8.4 | : December 2008 |

Release 4.8.3 | : September 2008 |

Release 4.8.2 | : September 2008 |

Release 4.8.1 | : August 2008 |

Release 4.8.0 | : July 2008 |

Release 4.7.3 | : May 2007 |

Release 4.7.2 | : April 2007 |

Release 4.7.1 | : April 2007 |

Release 4.7 | : April 2007 |

Release 4.6.4 | : January 2007 |

Release 4.6.3 | : June 2006 |

Release 4.6.2 | : April 2006 |

Release 4.6.1 | : February 2006 |

Release 4.6 | : January 2006 |

Release 4.5.6 | : December 2005, internal release |

Release 4.5.5 | : October 2005 |

Release 4.5.4 | : September 2005 |

Release 4.5.3 | : September 2005 |

Release 4.5.2 | : September 2005 |

Release 4.5.1 | : September 2005 |

Release 4.5.0 | : July 2005 |

Releases 4.3.3 -- 4.4.3 | : internal releases |

Release 4.3.2 | : November 2003 |

Release 4.3.1 | : October 2003 |

Release 4.3 | : July 2003 |

Release 4.2 (beta) | : December 2002 |

Release 4.1.6 | : March 2000 |

Release 4.0.4 | : Wed Sept 22, 1999 <-- Final version from PARASOL |