MUMPS Download Page


Conditions of use

This version of MUMPS is provided to you free of charge. It is released under the CeCILL-C license (see doc/CeCILL-C_V1-en.txt, doc/CeCILL-C_V1-fr.txt, and https://cecill.info/licences/Licence_CeCILL-C_V1-en.html), except for variants of AMD ordering and [sdcz]MUMPS_TRUNCATED_RRQR derived from the LAPACK package distributed under BSD 3-clause license (see headers of ana_orderings.F and [sdcz]lr_core.F), and except for the external and optional ordering PORD provided in a separate directory PORD (see PORD/README for License information).

You can acknowledge (using references [1] and [2]) the contribution of this package in any scientific publication dependent upon the use of the package. Please use reasonable endeavours to notify the authors of the package of this publication.

[1] P. R. Amestoy, I. S. Duff, J. Koster and J.-Y. L'Excellent, A fully asynchronous multifrontal solver using distributed dynamic scheduling, SIAM Journal on Matrix Analysis and Applications, Vol 23, No 1, pp 15-41 (2001).

[2] P. R. Amestoy, A. Buttari, J.-Y. L'Excellent and T. Mary, Performance and scalability of the block low-rank multifrontal factorization on multicore architectures, ACM Transactions on Mathematical Software, Vol 45, Issue 1, pp 2:1-2:26 (2019).

As a counterpart to the access to the source code and rights to copy, modify and redistribute granted by the license, users are provided only with a limited warranty and the software's author, the holder of the economic rights, and the successive licensors have only limited liability.

In this respect, the user's attention is drawn to the risks associated with loading, using, modifying and/or developing or reproducing the software by the user in light of its specific status of free software, that may mean that it is complicated to manipulate, and that also therefore means that it is reserved for developers and experienced professionals having in-depth computer knowledge. Users are therefore encouraged to load and test the software's suitability as regards their requirements in conditions enabling the security of their systems and/or data to be ensured and, more generally, to use and operate it in the same conditions as regards security.

The fact that you are presently reading this means that you have had knowledge of the CeCILL-C license and that you accept its terms.


Download request submission

To obtain the MUMPS package, please fill in the form below with care, and you will receive it by separate e-mail. Feedback or questions related to MUMPS are welcome. Please use the email address given at the beginning of the README file in the source code archive.

CHECKSUM (SHA256) for the latest release archive:

e91b6dcd93597a34c0d433b862cf303835e1ea05f12af073b06c32f652f3edd8
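
One way to compare a downloaded archive against this SHA256 value is sketched below in Python, using only the standard library. The helper names `sha256_of_file` and `archive_matches` are illustrative and not part of the MUMPS distribution, and the archive filename in the usage note is an assumption; substitute the name of the file you actually received.

```python
import hashlib

# Checksum published on this page for the latest release archive.
EXPECTED_SHA256 = "e91b6dcd93597a34c0d433b862cf303835e1ea05f12af073b06c32f652f3edd8"


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the hexadecimal SHA256 digest of the file at `path`."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read in chunks so that large archives need not fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def archive_matches(path, expected=EXPECTED_SHA256):
    """Return True if the file's SHA256 digest equals `expected`."""
    # hexdigest() is lowercase, so normalise the expected value as well.
    return sha256_of_file(path) == expected.strip().lower()
```

For example, `archive_matches("MUMPS_5.8.1.tar.gz")` (hypothetical filename) returns `True` only if the file on disk hashes to the value listed here. A mismatch usually indicates an incomplete or corrupted download.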

ChangeLog/History

Changes from 5.8.0 to 5.8.1

Fixed explicit interface expected in [sdcz]ana_driver.F for ftn compiler
Fixed error during MPI communication due to irregular blocks in block format
Fixed compiler semantic error detected by amdflang 6.2 in atomic statement
Fixed error when rank-revealing (ICNTL(56)) is on at analysis and off at factorization
Fixed performance of ICNTL(56) for highly reducible matrices
Fixed valgrind warning in [sdcz]mumps_arrow_treat_recv_buf_1th

Changes from 5.7.3 to 5.8.0

Memory estimations in an MPI context improved using stricter proportional mapping
Improved multithreading during scaling and reformat/distribute matrix
MPI buffer size no longer constrained by factored panel sizes
Fixed a memory leak (COLSCA_LOC) in LDLT with save-restart
Enable different numbers of threads per MPI
Fixed a case where error -74 was raised instead of -77
More OpenMP parallelism when compacting factors of frontal matrices
Fixed a minor issue in BLR clustering when size of halo is 0
Provide statistics on iterative refinement settings and timings
Fixed error with binary format of WRITE_PROBLEM feature under WINDOWS
Avoid pivoting with SYM=1 even when user forces CNTL(1)>0
More multithreading in low intensity arithmetic kernels
Fixed error with ICNTL(37)=1
Fixed possible access to uninitialized var for SYM=0 + OOC + static pivoting
Fixed a valgrind warning in fac_front_*type1.F

Changes from 5.7.2 to 5.7.3

Fixed regression for I/O's larger than 2GB during solve (out-of-core)
Fixed size of internal workspace in case of OOC+BLR with ICNTL(37)=1
Fix: out-of-range values of ICNTL(58) are treated as 2, as documented

Changes from 5.7.1 to 5.7.2

Fixed a bound check error with ICNTL(37)=1 in case of symmetric numerically difficult problems
Fixed compilation with -DAVOID_MPI_IN_PLACE

Changes from 5.7.0 to 5.7.1

Fixed dynamic library creation for PORD
Avoid need to link with deprecated LAPACK routine [csdz]geqpf
Fixed a numerical error with ICNTL(37)=1 in case of numerically difficult problems and very limited workspace

Changes from 5.6.2 to 5.7.0

New feature: Evolution of the distributed RHS: exploit sparsity (empty rows)
New feature: more efficient multithreading (see ICNTL(48))
New feature: Improved null space detection with rank revealing (ICNTL(56))
New statistics about pivots and number of swaps
Reduced time spent in BLR clustering
OOC: add support for files > 2GB to avoid opening too many files (requires -DMUMPS_WINLARGEFILES in Windows environments)
Increased various string max sizes: OOC_PREFIX(255), SAVE_PREFIX(255), WRITE_PROBLEM(1023), OOC_TMPDIR(1023), SAVE_DIR(1023)
-DMUMPS_SCOTCHIMPORTOMPTHREADS: uses MUMPS OMP threads in SCOTCH
Improved size of BLR groups with analysis by blocks
Allow for larger N (avoid need for 2*N to fit in a 32-bit integer)
Improved performance on small matrices (avoid costly string comparisons when computing RINFO(7,8) and RINFOG(17,18))
-DNO_SAVE_RESTORE: suppress SAVE_RESTORE, RINFO(7,8), RINFOG(17,18)
Avoid calling MPI_IPROBE during solve in case of a single MPI process
MPI_ALLREDUCE workaround in case it fails for large buffers
Raise error -55 in case Nloc_RHS>0 on non-working host
Raise new error -88 in case of error during SCOTCH ordering
Fixed bound-check error in copies for 64 bit integer orderings and empty graph
Raise error -69 in case of installation with incorrect integer sizes
Compilation with -DNOSCALAPACK: suppress dependency on NUMROC
Fix for METIS > 5.1.0: METIS options indices no longer hardcoded
Fixed a bug in solve when restarting (JOB=8) with a new process
Fixed error with exploit sparsity during fwd phase on reducible matrices and when Scalapack is used
Makefiles: introduced SHARED_OPT=-shared to allow defining a different option

Changes from 5.6.1 to 5.6.2

Fixed time for symbolic factorization
Fixed parallel analysis when PAR=0, 1 MPI per node, #MPI not a power of 2
-DAVOID_MPI_IN_PLACE avoids MPI_IN_PLACE usage (in case not supported by MPI)
Fixed a bug in case of parallel analysis+analysis by blocks+scalapack off
Added missing $(PLAT) for libpord in makefile
Fixed message printing with error -17

Changes from 5.6.0 to 5.6.1

Fixed compatibility of parallel analysis with analysis by blocks/out-of-range
Minor fix of INFO(2) value in a case of INFO(1)=-7 with parallel analysis
WRITE_PROBLEM: avoid leading space when printing "%%MatrixMarket matrix [..]"
Fix incorrect error code -74 instead of -76 (save-restore)
Avoid a remaining hard-coded Fortran unit (save-restore)
Avoid -Wundef warning in case __STDC_VERSION__ is not defined
Avoid calling MPI_IPROBE during factorization in case of a single MPI process (performance improvement on matrices with small fronts)
During analysis, write time for symbolic factorization
Update GEMMT link in INSTALL file

Changes from 5.5.1 to 5.6.0

Analysis by blocks and out-of-range entries compatible with parallel analysis
Compact workarray S before solution phase (ICNTL(49)=1,2)
JOB=-4 frees data from factorization and keeps results from analysis
NEC vector engine version: tuned block low rank and GEMMT usage
Reduced memory for symbolic datastructures (order-N arrays, arrowheads)
Use new symbolic factorization (column counts) by default (+allow forest in case of Schur)
Improved amalgamation algorithm to reduce the number of tiny nodes
Discard factors option is now compatible with BLR
Discard L factors option in case of LU factorization available for in-core
Count null negative pivots (INFOG(50))
Compilation with -DBLR_MT is no longer needed (see also -DBLR_NOOPENMP)
Improve performance of A-1 entries computation when #MPI > 1
Fixed risk of hanging in case of OOC error (e.g. disk full)
Free internal data earlier (e.g., factors in case of failed factorization, etc.)
Avoid setenv/unsetenv on MinGW environments (Scotch versions >= 7)
Fix error during ordering when processing matrices of order 1
Out-Of-Core, values of ICNTL(22) different from 1 are treated as zero (in-core)
BLKPTR ignored when ICNTL(15) < 0 (INFO(1:2)=-57 4 no longer occurs)
BLR OpenMP fac_lr.F fix for possible non-increasing scheduling of loop indices
Fix: id%LRGROUPS was not nullified in CMUMPS_FREE_ONENTRY_ANA_DRIVER
Fixed parameter error 1202 in PDGETRS after restoring a factorization
Workaround MPI_SSEND intel MPI 2021.6 issue
NEC: Fix MUMPS_WRAP_GINP94 offloading to host
Shared libraries: minor update in Makefiles
Fixed missing printings of preprocessing constants in mumps_print_defined.F
-DNOSCALAPACK: compilation without BLACS/ScaLAPACK (reduced performance)

Changes from 5.5.0 to 5.5.1

64-bit default integer version: avoid MPI buffer size > 2^31 (could lead to negative counts in MPI_Recv with Intel ilp64 MPI)
Use ICNTL(3) unit instead of '*' to print parallel ordering time
Output symbolic factorization choice (ICNTL(58)) on ICNTL(3) unit
SCOTCH 7.x on Windows: use putenv instead of setenv (unsymmetric case, #MPI >1)
Fixed parallel analysis in case of 64-bit integers in parmetis and -i8 MUMPS
Limit recursivity depth in mumpsstatic_mapping.F (limit stack usage on trees with small nodes at the top and large nodes below)
Limit subroutine names to 32 characters
-DWORKAROUNDILP64MPICUSTOMREDUCE introduced (platformMPI and 64-bit default integer)

Changes from 5.4.1 to 5.5.0

New feature: analysis by blocks with compressed graph (ICNTL(15))
New feature: symbolic factorization using column counts (ICNTL(58))
Matrix scaling compatible with Schur complement feature
Automatic setting of CNTL(3) and CNTL(5) compatible with Schur complement feature
Fixed possible error in memory (INFOG(3)) and FLOPS (RINFOG(1)) estimations with Schur feature
Memory allowed (ICNTL(23)) is per MPI process and compatible with WK_USER
More compact storage of L factors for symmetric matrices
Exploit multithreading in SCOTCH versions >= 7.0
Support for NEC Aurora vector processors
Possibility to build shared instead of static libraries
Adjustment of BLR cluster size
Shorter time for matrix redistribution
Avoid a possible case of overflow in computation of MAXIS
Fixed memory estimate of PTRAIW/PTRARW arrays
-DDPRINT_BACKTRACE_ON_ABORT added in case of call to MUMPS_ABORT()
More usage of dynamic memory to avoid -9 errors
Fixed a possible "Internal error 3 in ZMUMPS_BLR_RETRIEVE_PANEL_LORU"
Change construction of mumps_int_def.h to allow for cross compilation
Fixed a possible error in deficiency detection on unsymmetric matrices
Duplicated a few loops during solve (OMP / non-OMP) for gfortran performance
Fixed compilation issue with -DUPPER and -DMUMPS_WIN32 (5.4.1 regression)
Find free Fortran units instead of imposing a choice (WRITE_PROBLEM, save/restore)
Avoid freeing IRN/JCN when not allocated by MUMPS
Fixed a potential memory leak in dynamic memory management
-DALLOC_FROM_C introduced to fix dynamic memory management with Cray compilers

Changes from 5.4.0 to 5.4.1

Added feature to dump a matrix/rhs in binary form (see id%WRITE_PROBLEM)
Fixed error during constrained ordering (ICNTL(12)=3) leading to a segfault
Fixed type 2 assembly by pieces in case of huge blocks of delayed pivots
Avoid repeated small allocations in MUMPS_BUILD_SORT_INDEX
Fixed automatic choice for ICNTL(6) when values are not provided at analysis
Avoid an access to an uninitialized value (FLOP_FRFRONTS, MPI only, LDLT)
Fixed error printing in case of error -53
Fixed an erroneous -9 error in case of large Schur + discard factors
-DWORKAROUNDINTELILP64OPENMPLIMITATION avoids -i8 compilation warnings
Limited repeated amalgamations of large child with small parent

Changes from 5.3.5 to 5.4.0

Modified default threshold for null pivot detection (CNTL(3))
Improved performance of Schur complement computations
Reduced memory of centralized analysis with distributed matrix
Improved BLR clustering time during analysis
Improved BLR performance when CNTL(1)=0.0 and ICNTL(36)=0
Improved OOC performance in case CNTL(1)=0.0 or SYM=1
Improved multithreaded performance on matrices with many 2x2 pivots
Fixed a problem with parallel compilation (in case of: make -j all)
Fixed some timings during solve that were printed as 0 when PAR=0
Avoid possible integer overflow on INFOG(10)
Fixed type 1 assembly by pieces in case static CB moves to dynamic storage
Use with ITAC: avoid errors (e.g. in DMUMPS_UPPER_PREDICT...) due to MPI returning nonzero error codes when traces are activated
Add a JOB=-200 call (no MPI communications, clean local OOC files)
Fixed "Start processing the root ..." printing in case of Schur
Fixed SYM_PERM in case of Schur on reducible matrices

Changes from 5.3.4 to 5.3.5

Fixed 2x2 pivots bug from 5.3.4 release in MPI LDLT factorization
Fixed ICNTL(8)=-2 option during analysis (code and documentation)
Include basic fix for make -j

Changes from 5.3.3 to 5.3.4

Fixed a rare bug (segfault) related to dynamic storage management on numerically difficult matrices
Fixed a rare deadlock in BLR for symmetric matrices
Fixed an uninitialized variable (which could lead to incorrect -19 error)
Minor fix in userguide (CNTL(1) vs. ICNTL(1) in ICNTL(36) description)
Fixed a possible runtime issue during solve, related to "TO_PROCESS" array

Changes from 5.3.1 to 5.3.3

Assume ilp64 MPI interface only applies to Fortran in c_example.c
Note on gfortran-10 compilation added
Avoid intent on pointers (F2003-only)
More robust multithreading for matrix reformatting (arrowheads)
Fixed ICNTL(31) interpretation in case of repeated analysis
Fixed multiple mpif.h inclusion (distributed rhs, ifort+openmpi)
Fixed computation of effectively used memory statistics
Minor fix in userguide
Suppressed a !$OMP CRITICAL from solve phase (introduced in 5.3.0)

Changes from 5.3.0 to 5.3.1

Improved multithreaded performance of BLR backward solve
Fixed return code in build_mumps_int_def.c + openmp compilation (pgi)
Forbid a loop vectorization in [sdcz]sol_c.F (segfault with ifort)

Changes from 5.2.1 to 5.3.0

New feature: distributed right-hand sides
Improved time for arrowheads construction (single MPI case, mainly)
C interface: ability to know if MUMPS_INT is 64-bit from include file
Improved BLR performance when CNTL(1)=0.0 and ICNTL(36)=1
Fixed INFO(34),INFO(35),INFO(37),INFO(38) on processes with rank > 0
More portable MPI_IS_IN_PLACE feature in libseq
Fixed determinant computation when Cholesky ScaLapack is used
Information on advancement (flops done) on each MPI process
Allow rhs_sparse and irhs_sparse to be unassociated if nz_rhs=0
Fixed INFO(30) and INFO(31) computation on MPI processes with rank > 0
OMP collapsed loops: avoid FIRSTPRIVATE on internal loop bound (for pgi)
Fix for compilers not freeing local allocatable arrays (64-bit metis)
Fixed RINFO(5-6) and RINFOG(15-16) metrics (entries=>bytes)
C interface: A_ELT/SCHUR/RHS/REDRHS/RHS_loc/SOL_loc may exceed 2^31 entries
Local Schur (ICNTL(19)=2 or 3) may now exceed 2^31 entries
Fixed internal dynamic storage of blocks with more than 2^31 entries
Fixed a bug in the parallel analysis that limited scalability

Changes from 5.2.0 to 5.2.1

Fixed a minor "Internal error in CMUMPS_DM_FREEALLDYNAMICCB"
Default value of ICNTL(14) for MPI executions independent of SYM + slightly less aggressive than for 5.2.0
Avoided accesses to uninitialized data in symmetric (2D root, BLR)
Fixed some incorrect "out" intents for routine arguments
Avoided CHUNK=0 in OMP loops even if loop not parallelized (pgi)
Fixed COLSCA&ROWSCA declarations in [SDCZ]MUMPS_ANA_F
Avoided a possible segfault in presence of NaN's in pivot search
Minor update to userguide
Fixed MPI_IN_PLACE usage in libseq (preventing compiler optimization)

Changes from 5.1.2 to 5.2.0

Memory gains due to low-rank factorization are now effective, low-rank solve
Internal dynamic storage possible in case static workspace too small
Improved distributed memory usage and MPI granularity (some sym. matrices)
Improved granularity (and performance) for symmetric matrices; ability to use [DSCZ]GEMMT kernel (BLAS extension) if available (see INSTALL)
A-1 functionality: improved performance due to solution gathering
Memory peak for analysis reduced (distributed-entry, 64-bit orderings)
Time for analysis reduced by avoiding some preprocessing (when possible)
More exploitation of RHS sparsity during forward substitution
Ability to save/restore an instance to/from disk
INFO and INFOG dimension extended from 40 to 80
METIS_OPTIONS introduced for METIS users to define some specific Metis options
MUMPS can be asked to call omp_set_num_threads with a value provided in ICNTL(16)
Fixed: INFO(16)/INFOG(21)/INFOG(22) did not take into account the extra memory allocated due to memory allowed (ICNTL(23)>0); INFOG(8) was not correctly set
Initialize only lower-diagonal part for workers in symmetric type 2 fronts
Workaround a segfault at beg. of facto due to a gfortran-8 bug
Fixed a bug in weighted matching algorithm when all matrix values are 0
Portability: include stdint.h instead of int_types.h
Forced some initializations to make C interface more valgrind-friendly
Workaround intel 2017 vectorization bug in pivot search (symmetric+MPI+large matrices)
Stop trying to send messages on COMM_LOAD in case of error (risk of deadlock)
Avoided most array creation by compiler due to Fortran pointers
Avoid two cases of int. overflow (KEEP(66), A-1 with large ICNTL(27))
Fixed a bug with compressed ordering (ICNTL(12)=2) (regression from 5.0.0) and suppress compress ordering only in case of automatic setting

Changes from 5.1.1 to 5.1.2

Corrected an overestimation of memory (regression from 5.1.0)
Corrected/extended WORKAROUNDINTELILP64MPI2INTEGER mechanism (see INSTALL)
Parallel analysis: fixed a bug, limited number of MPI processes on small problems, and reverted to sequential analysis on tiny problems. This is to avoid erroneous behavior and failures in the parallel ordering tools.
Faster BLR clustering on matrices with quasi-dense rows (which are skipped)
Improved performance of solve phase on very small matrices
Solve phase with a single MPI process is more thread-safe
Fixed compilation issue with opensolaris ([SDCZ]MUMPS_TRUNCATED_RRQR)
Fixed minor bug in BLR factorization (uninitialized timer)
Corrected minor compiler warnings
Minor correction to userguide
Add -DBLR_MT in Intel example Makefile

Changes from 5.1.0 to 5.1.1

Fix in parallel analysis
Stabilization of 5.1.0:
- Improved stability of Block-Low-Rank feature
- Corrected an incorrect deallocation of POSINRHSCOMP_COL
- Correction of a case of uninitialized data access in type 2 pivoting
- Suppressed occasional debug trace: write(6,*) " KEEP265= ", KEEP265

Changes from 5.0.2 to 5.1.0

New feature: selective 64-bit integers (introduced only where needed) to process matrices with more than 2^{31}-1 entries.
- mixed 32/64 bit integers for API: NNZ/NNZ_loc 64-bit (NZ/NZ_LOC kept temporarily for backward compatibility)
- both 32 or 64 bit integer versions of external orderings (Metis/ParMetis, SCOTCH/pt-SCOTCH, PORD), can be used
- Error -51 when a 32-bit external ordering is invoked on a graph larger than 2^{31}-1
New feature: (experimental) factorization based on Block-Low-Rank format, (ICNTL(35) to activate it and CNTL(7) for low-rank precision)
Improved performance on numerically hard matrices (LU and LDLt)
"-DALLOW_NON_INIT" flag has disappeared and needs no longer be used
Fixed incorrect deallocation in case of JOB=3/ICNTL(26)=1 followed by JOB=2
Fixed compilation problem with Intel2017 + openMPI in [sdcz]ana_aux_par.F
Minor correction of memory statistics for solve
Use 64-bit integers where needed during the solve phase to enable large number of right-hand-sides (NRHS) in one block (i.e. ICNTL(27)xN can be larger than 2^{31}-1)
Improved performance of solve phase
Allow pivoting thresholds CNTL(1) equal to 1.0
New error -52: when default Fortran integers are 64 bit, external orderings should also have 64-bit default integers
New error -22, INFO(2)=16 when IRN_loc or JCN_loc not associated while ICNTL(18) is set to 3
Missing O_BINARY flag was added to open binary files on MINGW systems
New error -53 that could reflect a matrix structure change between analysis and factorization
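
Several entries in this list refer to the 2^{31}-1 limit of signed 32-bit integers, for instance the note that ICNTL(27)xN can be larger than 2^{31}-1. The arithmetic behind that limit can be sketched as follows. This is only an illustration of the bound, not MUMPS code, and the function names and the sample sizes are hypothetical.

```python
# Largest value representable by a signed 32-bit integer.
INT32_MAX = 2**31 - 1


def fits_in_int32(value):
    """Return True if `value` is representable as a signed 32-bit integer."""
    return -INT32_MAX - 1 <= value <= INT32_MAX


def solve_block_needs_int64(n, nrhs_block):
    """Return True if n * nrhs_block exceeds the signed 32-bit range.

    `n` stands for the matrix order N and `nrhs_block` for the block size
    of right-hand sides (the quantity controlled by ICNTL(27)).
    """
    # Python integers never overflow, so the product can be tested directly.
    return not fits_in_int32(n * nrhs_block)
```

With a matrix of order 50,000,000 and a block of 64 right-hand sides, the product is 3,200,000,000, which exceeds 2^{31}-1 = 2,147,483,647, so `solve_block_needs_int64(50_000_000, 64)` is true. With order 1,000,000 and a block of 32 it is false.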

Changes from 5.0.1 to 5.0.2

Suppress error on id%SCHUR_CINTERFACE in mumps_driver.F when bound check is enabled and when using 2D block cyclic Schur complement feature (ICNTL(19)=2 or 3) from C or Matlab interfaces
Problem of failed assertion in [SDCZ]MUMPS_TREAT_DESCBAND solved (static variable INODE_WAITED_FOR was not initialized and was not detected by valgrind)
Correction of very minor memory leaks and access to uninitialized data
A setting of INFO(1)=-1-17 should have been INFO(1)=-17
Some settings of INFO(1)=-17 should have been INFO(1)=-20
Suppress absolute tolerance 10^-20 in pivot selection for SYM=2; skip 2x2 pivot search if only 1 pivot candidate, avoid pivots that are subnormal numbers (their inverse is equal to infinity)
Warning +2 now only occurs when solution is really close to 0
Occasional bug in OOC and multiple instances solved
Better selection of equations for bwd errors (W1 and W2) and better forward error estimates on some machines with 80-bit registers
Improved users' guide (OOC files cleaning, permutation details, usage of multithreading, clarification of MegaByte unit)
Cleaning of asynchronous messages after facto/solve was revisited and is more robust
More robust suppression of integer overflow risk during solve for huge ICNTL(23)
Improved performance of symbolic factorization in case of matrices with relatively dense rows and/or with large number of Lagrange multipliers
Improved performance of numerical factorization phase during pivot search for symmetric indefinite matrices
Use of -xcore-avx2 requires !DEC$ NOOPTIMIZE in MUMPS_BIT_GET4PROC with current versions of Intel compilers
Suppressed some temporary array creation and implicit conversions

Changes from 5.0.0 to 5.0.1

Iterative refinement convergence check corrected (problem introduced in 5.0.0)
Used communicator provided by user instead of MPI_COMM_WORLD in two places (parallel analysis only)
Matlab interface patched to avoid memory corruption in some situations (Schur, colsca/rowsca management)
Corrected a case of error not properly processed which could cause a segfault instead of a standard "-9" error, or an abort on "ERR: ERROR: NBROWS > NBROWF"
Amalgamation without fill forced for single children
(rare) segfault related to assemblies of delayed columns in scalapack root node corrected
Automatic strategy for ordering choice improved
Further improvements to userguide (mainly iterative refinement, error analysis, discard factors and forward elimination during factorization)
Error -51 also raised in case of integer overflow during parallel analysis

Changes from 4.10.0 to 5.0.0

Userguide revisited
Compatibility with Metis 5.1.0/ParMetis 4.0.3, and with SCOTCH/pt-SCOTCH 6.0
Matlab interface updated (scaling vectors (COLSCA, ROWSCA) and A-1 feature ICNTL(30) are now available)
Improved sequential and parallel performance for computing selected entries of A-1 (ICNTL(30))
Workspace for solve phase, of size B x N per processor (B: block size controlled by ICNTL(27)) divided by almost #procs. Default value of B increased.
Parallel symmetric indefinite elemental matrices: improved numerical behaviour
Performance of solve phase improved
Finer control of error analysis and iterative refinement (ICNTL(11))
Memory for analysis phase (mapping) reduced.
Better support for 64-bit integers (see INSTALL file)
Error raised instead of silent integer overflow during analysis (but not during external orderings)
Improvements and corrections to parallel analysis (ICNTL(28)), deterministic graph construction forced with -DDETERMINISTIC_PARALLEL_GRAPH
Forward elimination (ICNTL(32)) can be done during factorization
Possibility to use a workspace (WK_USER, LWK_USER) allocated by user
Very occasional numerical bug in parallel out-of-core case corrected (thanks to EDF and Samtech for the validation)
More efficient processing of sparse right-hand-sides (see ICNTL(20))
Count for entries in factors now includes parallel root node
Amalgamation of the assembly tree revisited
Scaling arrays (COLSCA, ROWSCA) also returned at C interface level
OOC_NB_FILE_TYPE is part of the MUMPS structure, for a better management of multiple OOC instances
Warning +2 set only once (could lead to incorrect +4 in case of iterative refinement + error analysis)
Warning +4 has disappeared from documentation (since it was never occurring -- JCN never modified on exit)
Error code -16 now raised for the case N=0 even on distributed matrices (thanks to P. Jolivet for noticing this)
Use BLAS3 routines for efficiency even in case of BLAS2 operations (-DMUMPS_USE_BLAS2 allows the use of BLAS2 routines for such operations)
Message "problem with NIV2_FLOPS message" should no longer occur (there was still an occasional problem in 4.10.0)
Improved determinant computation (ICNTL(33)) in case of singular matrix + scaling (where zero pivots are excluded)
Trace ' PANEL: INIT and force STRAT_IO=' suppressed
Some OpenMP directives added (multithreaded BLAS still needed)
Later allocation of strips of distributed fronts with improved locality
Front factorization algorithms redesigned (two levels of panels)
Null pivot (ICNTL(24)) and null space detection (ICNTL(25)) improved for unsymmetric matrices
Fortran automatic arrays (e.g. in mumps_static_mapping.F) suppressed to avoid risks of stack overflows
Routine names and filenames changed

Changes from 4.9.2 to 4.10.0

Modified variable names and variable contents in Make.inc/Makefile* for Windows (Makefile.inc from an older version needs modifications, please do a diff)
Option to discard factors during factorization when not needed (ICNTL(31))
Option to compute the determinant (ICNTL(33))
Experimental "A-1" functionality (ICNTL(30))
Matlab interface updated for 64-bit machines
Improved users' guide
Suppressed a memory leak occurring when Scalapack is used and user does loops on JOB=6 without JOB=-2/JOB=-1 in-between
Avoid occasional deadlock with huge values of ICNTL(14)
Avoid problem of -17 error code during solve phase
Avoid checking association of pointer arrays ISOL_loc and SOL_loc on procs with no components of solution (small problems)
Some data structures were not freed at the end of the parallel analysis. Bug fixed.
Fixed unsafe test of overflow "IF (WFLG+N .LE. WFLG)"
Large Schur complements sent by blocks if ICNTL(19)=1 (but options ICNTL(19)=2 or 3 are recommended when Schur complement is large)
Corrected problem with sparse RHS + unsymmetric permutation + transpose solve (problem appeared in 4.9)
Case where ICNTL(19)=2 or 3 and small value of SIZE_SCHUR causing problems in parallel solved.
In case an error is detected, solved occasional problem of deallocating non-allocated local array PERM.
Correction in computation of matrix norm in complex arithmetic (MPI_COMPLEX was used in place of MPI_REAL in MPI_REDUCE)
Scaling works on singular matrices
Compilation problem with -i8 solved
MUMPS_INT used in OOC layer to facilitate compilation with 64 bit integers
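
The overflow test quoted in this list, `IF (WFLG+N .LE. WFLG)`, only detects overflow if the sum wraps around to a smaller value, which the Fortran standard does not guarantee. A common safe alternative is to compare against the maximum before adding. The sketch below shows that idea for 32-bit integers. It is not necessarily the fix applied in MUMPS, the function name is hypothetical, and it assumes `n` is non-negative.

```python
# Largest value representable by a signed 32-bit integer.
INT32_MAX = 2**31 - 1


def add_would_overflow(wflg, n):
    """Return True if wflg + n would exceed INT32_MAX.

    Assumes 0 <= n <= INT32_MAX. The comparison is rearranged so that no
    intermediate value leaves the 32-bit range, which is what makes the
    same test safe in a fixed-width integer language.
    """
    return wflg > INT32_MAX - n
```

Here `add_would_overflow(INT32_MAX, 1)` is true and `add_would_overflow(INT32_MAX - 1, 1)` is false.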

Changes from 4.9.1 to 4.9.2

Compressed orderings (ICNTL(12)=2) are now compatible with PORD and PT-Scotch
Mapping problem on large numbers of MPI processes, leading to INFOG(1)=-135 on "special" matrices solved (problem appeared in 4.9.1)

Changes from 4.9 to 4.9.1

Balancing on the processors of both work and memory improved. In a parallel environment memory consumption should be reduced and performance improved
Modification of the amalgamation to solve both the problem of small root nodes and the problem of tiny nodes implying too many small MPI messages
Corrected bug occurring on big-endian environments when passing a 64-bit integer argument in place of 32-bit one. This was causing problems in parallel, when ScaLAPACK is used, on IBM machines.
Internal ERROR 2 in MUMPS_271 now impossible (was already not happening in practice)
Solved compiler warnings (or even errors) related to the order of the declarations of arrays and array sizes
Parallel analysis: fixed the problem due to the invocation of the size function on non-allocated pointers, corrected a bug due to initialization of pointers in the declaration statements, and improved the Makefiles
Corrected bug in the reallocation of arrays
Corrected several accesses to uninitialized variables
Internal Error (4) in OOC (MUMPS_597) no longer occurs
Suppressed possible printing of "Internal WARNING 1 in CMUMPS_274"
(Minor) numerical pivoting problem in parallel LDLt solved
Estimated flops corrected when SYM=2 and Scalapack is used (because we use LU on the root node, not LDLt, in that case)
Scaling option effectively used is now returned in INFOG(33) and ICNTL(8) is no longer modified by the package
INFO(25) is now correctly documented, new statistic INFO(27) added

Changes from 4.8.4 to 4.9

Parallel analysis available
Use of 64-bit integer addressing for large internal workarrays
Overflow in computation of INFO(9) in out-of-core corrected
Fixed Matlab and Scilab interfaces to sparse RHS functionality
Time cost of analysis reduced for "optimisation" matrices
Time to gather solution on processor 0 reduced and automatic copying of some routine arguments by some compilers resolved.
extern "C" added to header file mpi.h of libseq for C++ compilers
Problem with NZ_loc=0 and scaling with ifort 10 solved
Statistics about current state of the factorization produced/printed even in case of error.
Avoid using complex arrays as real workspace (complex versions)
New error code -40 (instead of -10) when SYM=1 is used and ScaLAPACK detects a negative pivot
Solved problem of "Internal error 1" in [SDCZ]MUMPS_264 and [SDCZ]MUMPS_274
Solved nondeterministic bug occurring with asynchronous OOC + panels when uninitialized memory access had value -7777
Fixed a remaining problem with OOC filenames having more than 150 characters
Fixed some problems related to the usage of intrinsic functions inside PARAMETER statements (HP-UX compilers)
Fixed problem of explicit interface in [SDCZ]MUMPS_521
Out-of-core strategy from 4.7.3 can be reactivated with -DOLD_OOC_NOPANEL
Message "problem with NIV2_FLOPS message" should no longer occur
Avoid compilation problem with old versions of gfortran

Changes from 4.8.3 to 4.8.4

Absolute threshold criterion for null pivot detection added to CNTL(3)
Problems related to messages "Increase small buffer size ..." solved.
New option for ICNTL(8) to scale matrices. Default scaling cheaper to compute
Problem of filename clash with unsymmetric matrices on Windows platforms solved
Allow for longer filenames for temporary OOC files
Strategy to update blocksize during factorization of frontal matrices modified to avoid too large messages during pipelined factorization (that could lead to a -17 error code)
Messages corresponding to delayed pivots can now be sent in several packets. This avoids some other cases of error -17
One rare case of deadlock solved
Corrected values and sign of INFO(8) and INFO(20)

Changes from 4.8.2 to 4.8.3

Fix compilation issues on Windows platforms
Fix ranlib issue with libseq on MacOSX platforms
Fix a few problems of uninitialized variables

Changes from 4.8.1 to 4.8.2

Problem of wrong argument in the call to [sdcz]mumps_246 solved
Limit occurrence of error -11 in the in-core case
Problem with the use of SIZE on an unassociated pointer solved
Problem with distributed solution combined with non-working host solved
Fix generation of MM matrices
Fix of a minor bug in OOC error management
Fix portability issues on usleep

Changes from 4.8.0 to 4.8.1

New distributed scaling is now on by default for distributed matrices
Error management corrected in case of 32-bit overflow during factorization
SEPARATOR is now defined as "\\" in Windows version
Bug fix in OOC panel version

Changes from 4.7.3 to 4.8.0

Parallel scaling algorithms available
Possibility to dump a matrix in matrix-market format from both C and Fortran interfaces
Correction when dumping a distributed matrix in matrix-market format
Minor numerical stability problem in some LDL^t parallel factorizations corrected.
Memory usage significantly reduced in both parallel and sequential (limit communication buffers, in-place assembly for assembled matrices, overlapping during stack).
Better alignment properties of mumps_struc.h
Reduced time for static mapping during the analysis phase.
Correction in dynamic scheduler
"Internal error 2 in DMUMPS_26" no longer occurs, even if SIZE_SCHUR=0
Corrections in the management of ICNTL(25), some useful code was protected with -Dtry_null_space and not compiled.
Scaling arrays are now declared real even in complex versions
Out-of-core functionality storing factors on disk
Possibility to tell MUMPS how much memory the package is allowed to allocate (ICNTL(23))
Estimated and effective number of entries in factors returned to user
API change: MAXS and MAXIS have disappeared from the interface, please use ICNTL(14) and ICNTL(23) to control the memory usage
Error code -11 raised less often, especially in out-of-core executions
Error code -14 should no longer occur
Memory used at the solve phase is now returned to the user
Possibility to control the blocking size for multiple right-hand sides (strong impact on performance, in particular for out-of-core executions)
Solved problems of 32-bit integer overflows during analysis related to memory estimations.
New error code -37 related to integer overflows during factorization
Compile a single arithmetic with make s, make d, make c or make z. Examples are now in examples/; test/ has been removed.
Arithmetic-independent parts are isolated into a libmumps_common.a, that must now be linked too (see examples/Makefile).
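
For readers unfamiliar with the matrix-market format mentioned in this list, the sketch below writes a small real, general sparse matrix in the coordinate variant of that format. It illustrates the file layout only and is not the dump routine of MUMPS; the function name to_matrix_market and the (row, column, value) triplet input are assumptions made for this example.

```python
def to_matrix_market(n_rows, n_cols, entries):
    """Format a sparse matrix as Matrix Market coordinate text.

    entries is a list of (row, col, value) triplets with 1-based indices,
    as the coordinate format requires.
    """
    lines = [
        "%%MatrixMarket matrix coordinate real general",  # banner line
        f"{n_rows} {n_cols} {len(entries)}",               # size line
    ]
    for row, col, value in entries:
        # 17 significant digits are enough to round-trip a double.
        lines.append(f"{row} {col} {value:.17g}")
    return "\n".join(lines) + "\n"


text = to_matrix_market(2, 2, [(1, 1, 4.0), (2, 2, -1.5)])
```

The banner line states the storage scheme (coordinate), the arithmetic (real) and the symmetry (general); the size line gives the number of rows, columns and stored entries.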

Changes from 4.7.2 to 4.7.3

detection of null pivots for unsymmetric matrices corrected
improved pivoting in parallel symmetric solver
possible problem when Schur on and out-of-core: Schur was split
type of parameters of intrinsic function MAX not compatible in single precision arithmetic versions.
minor changes for Windows
correction with reduced RHS functionality in parallel case

Changes from 4.7.1 to 4.7.2

negative loads suppressed in mumps distribution

Changes from 4.7 to 4.7.1

Release number in Fortran interface corrected
"Negative load !!" message replaced by a warning

Changes from 4.6.4 to 4.7

New functionality: build reduced RHS / use partial solution
New functionality: detection of zero pivots
Memory reduced (especially communication buffers)
Problem of integer overflow "MEMORY_SENT" corrected
Error code -20 used when receive buffer too small (instead of -17 in some cases)
Erroneous memory access with singular matrices (since 4.6.3) corrected
Minor bug correction in hybrid scheduler
Parallel solution step uses less memory
Performance and memory usage of solution step improved
String containing the version number now available as a component of the MUMPS structure
Case of error "-9964" has disappeared

Changes from 4.6.3 to 4.6.4

Avoid name clashes (F_INT, ...) when C interface is used and user wants to include, say, smumps_c.h, zmumps_c.h (etc.) at the same time
Avoid large array copies (by some compilers) in distributed matrix entry functionality
Default ordering less dependent on number of processors
New garbage collector for contribution blocks
Original matrix in "arrowhead form" on candidate processors only (assembled case)
Corrected bug occurring rarely, on large number of processors, and that depended on value of uninitialized data
Parallel LDL^t factorization numerically improved
Less memory allocation in mapping phase (in some cases)

Changes from 4.6.2 to 4.6.3

Reduced memory usage for symmetric matrices (compressed CB)
Reduced memory allocation for parallel executions
Scheduler parameters for parallel executions modified
Memory estimates (that were too large) corrected with 2Dcyclic Schur complement option
Portability improved (C/Fortran interfacing for strings)
The situation leading to Warning "RHS associated in MUMPS_301" no longer occurs.
Parameters INFO/RINFO from the Scilab/Matlab API are now called INFOG/RINFOG in order to match the MUMPS user's guide.

Changes from 4.6.1 to 4.6.2

Metis ordering now available with Schur option
Schur functionality correctly working with Scilab interface
Occasional SIGBUS problem on single precision versions corrected

Changes from 4.6 to 4.6.1

Problem with hybrid scheduler and elemental matrix entry corrected
Improved numerical processing of symmetric matrices with quasi-dense rows
Better use of Blacs/Scalapack on processor grids smaller than MPI_COMM_WORLD
Block sizes improved for large symmetric matrices

Changes from 4.5.6 to 4.6

Official release with Scilab and Matlab interfaces available
Correction in 2x2 pivots for symmetric indefinite complex matrices
New hybrid scheduler active by default

Changes from 4.5.5 to 4.5.6

Preliminary developments for an out-of-core code (not yet available)
Improvement in parallel symmetric indefinite solver
Preliminary distribution of a SCILAB and a MATLAB interface to MUMPS.

Changes from 4.5.4 to 4.5.5

Improved tree management
Improved weighted matching preprocessing: duplicates allowed, overflow avoided, dense rows
Improved strategy for selecting default ordering
Improved node amalgamation

Changes from 4.5.3 to 4.5.4

Double complex version no longer depends on double precision version.
Simplification of some complex indirections in mumps_cv.F that were causing difficulties for some compilers.

Changes from 4.5.2 to 4.5.3

Correction of a minor problem leading to INFO(1)=-135 in some cases.

Changes from 4.5.1 to 4.5.2

correction of two uninitialized variables in proportional mapping

Changes from 4.5.0 to 4.5.1

better management of contribution messages
minor modifications in symmetric preprocessing step

Changes from 4.4.0 to 4.5.0

improved numerical features for symmetric indefinite matrices
- two-by-two pivots
- symmetric scaling
- ordering based on compressed graph preserving two by two pivots
- constrained ordering
2D cyclic Schur better validated
problems resulting from automatic array copies done by compiler corrected
reduced memory requirement for maximum transversal features

Changes from 4.3.4 to 4.4.0

2D block cyclic Schur complement matrix
symmetric indefinite matrices better handled
Right-hand side vectors can be sparse
Solution can be kept distributed on the processors
Metis allowed for element-entry
Parallel performance and memory usage improved:
- load is updated more often for type 2 nodes
- scheduling under memory constraints
- reduced message sizes in symmetric case
- some linear searches avoided when sending contributions
Avoid array copies in the call to the partial mapping routine (candidates); such copies appeared with Intel compiler version 8.0.
Workaround for MPI_AllReduce problem with booleans if MPICH and MUMPS are compiled with different compilers
Reduced message sizes for CB blocks in symmetric case
Various minor improvements
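
As background for the 2D block cyclic Schur complement listed above, the following sketch shows how a 2D block cyclic distribution maps a global matrix entry to a process in a process grid and to a local index. It follows the usual ScaLAPACK-style convention and assumes that the first block is held by process (0, 0) and that indices are 0-based; the function names and parameter names are chosen for this example only and are not part of the MUMPS interface.

```python
def block_cyclic_owner(i, j, mblock, nblock, nprow, npcol):
    """Return the (process row, process column) owning global entry (i, j).

    Blocks of mblock x nblock entries are dealt out cyclically over an
    nprow x npcol process grid, starting at process (0, 0).
    """
    return ((i // mblock) % nprow, (j // nblock) % npcol)


def block_cyclic_local_index(g, block, nprocs):
    """Return the local index of global index g along one dimension."""
    # Full cycles of blocks already stored locally, plus offset in the block.
    return (g // (block * nprocs)) * block + g % block


# 2 x 2 blocks on a 2 x 3 process grid.
owner = block_cyclic_owner(2, 5, 2, 2, 2, 3)
```

With these parameters, process row 0 holds global rows 0, 1, 4, 5, ... and stores them locally as rows 0, 1, 2, 3, ...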

Changes from 4.3.3 to 4.3.4

Copies of some large CB blocks suppressed in local assemblies from child to parent
gathering of solution optimized in solve phase

Changes from 4.3.2 to 4.3.3

Control parameters of symbolic factorization modified.
Global distribution time and arrowheads computation slightly optimized.
Multiple Right-Hand-Side implemented.

Changes from 4.3.1 to 4.3.2

Thresholds for symbolic factorization modified.
Use merge sort for candidates (faster)
User's communicator copied when entering MUMPS
Code to free CB areas factorized in various places
One array suppressed in solve phase

Changes from 4.3 to 4.3.1

Memory leaks in PORD corrected
Minor compilation problem on T3E solved
Avoid taking into account absolute criterion CNTL(3) for partial LDLt factorization when whole column is known (relative stability is enough).
Symbol MPI_WTICK removed from mpif.h
Bug wrt inertia computation INFOG(12) corrected

Changes from 4.2beta to 4.3

C INTERFACE CHANGE: comm_fortran must be defined from the calling program, since MUMPS uses a Fortran communicator (see user guide).
LAPACK library is no longer required
User guide improved
Default ordering changed
Return number of negative diagonal elements in LDLt factorization (except for root node if treated in parallel)
Rank-revealing options no longer available by default
Improved parallel performance
- new incremental mechanism for load information
- new communicator dedicated to load information
- improved candidate strategy
- improved management of SMP platforms
Include files can be used in both free and fixed forms
Bug fixes:
- some uninitialized values
- problems with size of data on T3E
- minor problems corrected with distributed matrix entry
- count of negative pivots corrected
- AMD for element entries
- symbolic factorization
- memory leak in tree reordering and in solve step
Solve step uses less memory (and should be more efficient)

Changes from 4.1.6 to 4.2beta

More precisions available (single, double, complex, double complex).
Uniprocessor version available (doesn't require MPI installed)
Interface changes (Users of MUMPS 4.1.6 will have to slightly modify their codes):
- MUMPS -> ZMUMPS, CMUMPS, SMUMPS, DMUMPS depending on the precision
- the Schur complement matrix should now be allocated by the user before the call to MUMPS
- NEW: C interface available.
- ICNTL(6)=6 in 4.1.6 (automatic choice) is now ICNTL(6)=7 in 4.2
Tighter integration of new ordering packages (for assembled matrices), see the description of ICNTL(7):
- AMF,
- Metis,
- PORD,
Memory usage decreased and memory scalability improved.
Problem when using multiple instances solved.
Various improvements and bug fixes.

Changes from 4.1.4 to 4.1.6

Modifications/Tuning done by P.Amestoy during his visit at NERSC.
Additional memory and communication statistics.
Minor problems solved.

Changes from 4.0.4 to 4.1.4

Tuning on Cray T3E (and minor debugging)
Improved strategy for asynchronous communications (irecv during factorization)
Improved Dynamic scheduling and splitting strategies
New maximal transversal strategies
New Option (default) automatic decision for scaling and maximum transversal

Release history

Release 5.8.1	: July 2025
Release 5.8.0	: May 2025
Release 5.7.3	: July 2024
Release 5.7.2	: June 2024
Release 5.7.1	: May 2024
Release 5.7.0	: April 2024
Release 5.6.2	: October 2023
Release 5.6.1	: July 2023
Release 5.6.0	: April 2023
Release 5.5.1	: July 2022
Release 5.5.0	: April 2022
Release 5.4.1	: August 2021
Release 5.4.0	: April 2021
Release 5.3.5	: October 2020
Release 5.3.4	: September 2020
Release 5.3.3	: June 2020
Release 5.3.2[.x]	: May 2020, internal/experimental
Release 5.3.1	: April 2020
Release 5.3.0	: April 2020
Release 5.2.1	: June 2019
Release 5.2.0	: April 2019
Release 5.1.2	: October 2017
Release 5.1.1	: March 2017
Release 5.1.0	: Feb 2017, internal release (limited diffusion)
Release 5.0.2	: July 2016
Release 5.0.1	: July 2015
Release 5.0.0	: February 2015
Release 4.10.0	: May 2011
Release 4.9.2	: November 2009
Release 4.9.1	: October 2009
Release 4.9	: July 2009
Release 4.8.4	: December 2008
Release 4.8.3	: September 2008
Release 4.8.2	: September 2008
Release 4.8.1	: August 2008
Release 4.8.0	: July 2008
Release 4.7.3	: May 2007
Release 4.7.2	: April 2007
Release 4.7.1	: April 2007
Release 4.7	: April 2007
Release 4.6.4	: January 2007
Release 4.6.3	: June 2006
Release 4.6.2	: April 2006
Release 4.6.1	: February 2006
Release 4.6	: January 2006
Release 4.5.6	: December 2005, internal release
Release 4.5.5	: October 2005
Release 4.5.4	: September 2005
Release 4.5.3	: September 2005
Release 4.5.2	: September 2005
Release 4.5.1	: September 2005
Release 4.5.0	: July 2005
Releases 4.3.3 -- 4.4.3	: internal releases
Release 4.3.2	: November 2003
Release 4.3.1	: October 2003
Release 4.3	: July 2003
Release 4.2 (beta)	: December 2002
Release 4.1.6	: March 2000
Release 4.0.4	: Wed Sept 22, 1999 <-- Final version from PARASOL

MUMPS:

MUltifrontal Massively Parallel sparse direct Solver