Hermitian to tridiagonal¶
The standard approach for computing eigenpairs of dense Hermitian matrices begins by performing a unitary similarity transformation which reduces the matrix to real symmetric tridiagonal form (usually through Householder transformations). The following routines perform said reduction on a Hermitian matrix and store the scaled Householder vectors in place of the introduced zeroes.
Note
While so-called Successive Band Reduction approaches, which reduce the matrix to tridiagonal form using a two-stage process, are sometimes preferred, they are not yet supported within Elemental. Please see [ELPA2011] for such an implementation.
Reduction¶
Elemental currently provides two different basic strategies for the reduction of a Hermitian matrix to tridiagonal form via unitary similarity:
Run a pipelined algorithm designed for general (rectangular) process grids.
Redistribute the matrix so that it is owned by a perfect square number of processes, perform a fast reduction to tridiaogal form, and redistribute the data back to the original process grid. This algorithm is nearly identical to that of [HJS1999] and [Stanley1997].
There is clearly a small penalty associated with the extra redistributions
necessary for the second approach, but the benefit from using a square process
grid is usually quite signficant. By default, HermitianTridiag()
will
run the standard algorithm (approach 1) unless the matrix is already distributed
over a square process grid. The reasoning is that good performance depends upon
a “good” ordering of the square (say, \(\hat p \times \hat p\)) subgrid,
though usually either a row-major or column-major ordering of the first
\(\hat p^2\) processes suffices.
-
enum
HermitianTridiagApproach
¶ -
enumerator
HERMITIAN_TRIDIAG_NORMAL
¶ Run the pipelined rectangular algorithm.
-
enumerator
HERMITIAN_TRIDIAG_SQUARE
¶ Run the square grid algorithm on the largest possible square process grid.
-
enumerator
HERMITIAN_TRIDIAG_DEFAULT
¶ If the given process grid is already square, run the square grid algorithm, otherwise use the pipelined non-square approach.
Note
A properly tuned
HERMITIAN_TRIDIAG_SQUARE
approach is almost always fastest, so it is worthwhile to test it with both theCOLUMN_MAJOR
andROW_MAJOR
subgrid orderings, as described below.Note
The first algorithm heavily depends upon the performance of distributed
Symv()
, so users interested in maximizing the performance of the first algorithm will likely want to investigate different values for the local blocksizes through the routineSetLocalSymvBlocksize<T>( int blocksize )
; the default value is 64.-
enumerator
C++ API¶
-
void
HermitianTridiag
(UpperOrLower uplo, Matrix<F> &A, Matrix<F> &t)¶
-
void
HermitianTridiag
(UpperOrLower uplo, AbstractDistMatrix<F> &A, AbstractDistMatrix<F> &t, const HermitianTridiagCtrl &ctrl = HermitianTridiagCtrl())¶ Overwrites the main and sub (super) diagonal of the real matrix A with its similar symmetric tridiagonal matrix and stores the scaled Householder vectors below (above) its tridiagonal entries. Complex Hermitian reductions have the added complication of needing to also store the scalings for the Householder vectors (the scaling can be inferred since the Householder vectors must be unit length) if they are to be applied (in the column vector t).
-
void
herm_tridiag
::
ExplicitCondensed
(UpperOrLower uplo, Matrix<F> &A)¶
-
void
herm_tridiag
::
ExplicitCondensed
(UpperOrLower uplo, AbstractDistMatrix<F> &A, const HermitianTridiagCtrl &ctrl = HermitianTridiagCtrl())¶ Returns just the (appropriate triangle of the) resulting tridiagonal matrix.
C API¶
-
ElHermitianTridiagCtrlDefault
(ElHermitianTridiagCtrl* ctrl)¶ Sets approach to
HERMITIAN_TRIDIAG_SQUARE
and order toROW_MAJOR
.
Single-precision¶
-
ElError
ElHermitianTridiag_s
(ElUpperOrLower uplo, ElMatrix_s A, ElMatrix_s t)¶
-
ElError
ElHermitianTridiagDist_s
(ElUpperOrLower uplo, ElDistMatrix_s A, ElDistMatrix_s t)¶
-
ElError
ElHermitianTridiagXDist_s
(ElUpperOrLower uplo, ElDistMatrix_s A, ElDistMatrix_s t, ElHermitianTridiagCtrl ctrl)¶
-
ElError
ElHermitianTridiagExplicitCondensed_s
(ElUpperOrLower uplo, ElMatrix_s A)¶
-
ElError
ElHermitianTridiagExplicitCondensedDist_s
(ElUpperOrLower uplo, ElDistMatrix_s A)¶
-
ElError
ElHermitianTridiagExplicitCondensedXDist_s
(ElUpperOrLower uplo, ElDistMatrix_s A, ElHermitianTridiag ctrl)¶
Double-precision¶
-
ElError
ElHermitianTridiag_d
(ElUpperOrLower uplo, ElMatrix_d A, ElMatrix_d t)¶
-
ElError
ElHermitianTridiagDist_d
(ElUpperOrLower uplo, ElDistMatrix_d A, ElDistMatrix_d t)¶
-
ElError
ElHermitianTridiagXDist_d
(ElUpperOrLower uplo, ElDistMatrix_d A, ElDistMatrix_d t, ElHermitianTridiagCtrl ctrl)¶
-
ElError
ElHermitianTridiagExplicitCondensed_d
(ElUpperOrLower uplo, ElMatrix_d A)¶
-
ElError
ElHermitianTridiagExplicitCondensedDist_d
(ElUpperOrLower uplo, ElDistMatrix_d A)¶
-
ElError
ElHermitianTridiagExplicitCondensedXDist_d
(ElUpperOrLower uplo, ElDistMatrix_d A, ElHermitianTridiag ctrl)¶
Single-precision complex¶
-
ElError
ElHermitianTridiag_c
(ElUpperOrLower uplo, ElMatrix_c A, ElMatrix_c t)¶
-
ElError
ElHermitianTridiagDist_c
(ElUpperOrLower uplo, ElDistMatrix_c A, ElDistMatrix_c t)¶
-
ElError
ElHermitianTridiagXDist_c
(ElUpperOrLower uplo, ElDistMatrix_c A, ElDistMatrix_c t, ElHermitianTridiagCtrl ctrl)¶
-
ElError
ElHermitianTridiagExplicitCondensed_c
(ElUpperOrLower uplo, ElMatrix_c A)¶
-
ElError
ElHermitianTridiagExplicitCondensedDist_c
(ElUpperOrLower uplo, ElDistMatrix_c A)¶
-
ElError
ElHermitianTridiagExplicitCondensedXDist_c
(ElUpperOrLower uplo, ElDistMatrix_c A, ElHermitianTridiag ctrl)¶
Double-precision complex¶
-
ElError
ElHermitianTridiag_z
(ElUpperOrLower uplo, ElMatrix_z A, ElMatrix_z t)¶
-
ElError
ElHermitianTridiagDist_z
(ElUpperOrLower uplo, ElDistMatrix_z A, ElDistMatrix_z t)¶
-
ElError
ElHermitianTridiagXDist_z
(ElUpperOrLower uplo, ElDistMatrix_z A, ElDistMatrix_z t, ElHermitianTridiagCtrl ctrl)¶
-
ElError
ElHermitianTridiagExplicitCondensed_z
(ElUpperOrLower uplo, ElMatrix_z A)¶
-
ElError
ElHermitianTridiagExplicitCondensedDist_z
(ElUpperOrLower uplo, ElDistMatrix_z A)¶
-
ElError
ElHermitianTridiagExplicitCondensedXDist_z
(ElUpperOrLower uplo, ElDistMatrix_z A, ElHermitianTridiag ctrl)¶
Applying the change of basis¶
Apply (from the left or right) the implicitly defined unitary matrix (or its adjoint) represented by the Householder transformations stored within the specified triangle of A and their scalings are stored in the vector t.
C++ API¶
-
void
herm_tridiag
::
ApplyQ
(LeftOrRight side, UpperOrLower uplo, Orientation orientation, const Matrix<F> &A, const Matrix<F> &t, Matrix<F> &B)¶
-
void
herm_tridiag
::
ApplyQ
(LeftOrRight side, UpperOrLower uplo, Orientation orientation, const AbstractDistMatrix<F> &A, const AbstractDistMatrix<F> &t, AbstractDistMatrix<F> &B)¶
C API¶
Single-precision¶
-
ElError
ElApplyQAfterHermitianTridiag_z
(ElLeftOrRight side, ElUpperOrLower uplo, ElOrientation orientation, ElConstMatrix_z A, ElConstMatrix_z t, ElMatrix_z B)¶
-
ElError
ElApplyQAfterHermitianTridiagDist_z
(ElLeftOrRight side, ElUpperOrLower uplo, ElOrientation orientation, ElConstDistMatrix_z A, ElConstDistMatrix_z t, ElDistMatrix_z B)¶
Double-precision¶
-
ElError
ElApplyQAfterHermitianTridiag_d
(ElLeftOrRight side, ElUpperOrLower uplo, ElOrientation orientation, ElConstMatrix_d A, ElConstMatrix_d t, ElMatrix_d B)¶
-
ElError
ElApplyQAfterHermitianTridiagDist_d
(ElLeftOrRight side, ElUpperOrLower uplo, ElOrientation orientation, ElConstDistMatrix_d A, ElConstDistMatrix_d t, ElDistMatrix_d B)¶
Single-precision complex¶
-
ElError
ElApplyQAfterHermitianTridiag_c
(ElLeftOrRight side, ElUpperOrLower uplo, ElOrientation orientation, ElConstMatrix_c A, ElConstMatrix_c t, ElMatrix_c B)¶
-
ElError
ElApplyQAfterHermitianTridiagDist_c
(ElLeftOrRight side, ElUpperOrLower uplo, ElOrientation orientation, ElConstDistMatrix_c A, ElConstDistMatrix_c t, ElDistMatrix_c B)¶
Double-precision complex¶
-
ElError
ElApplyQAfterHermitianTridiag_z
(ElLeftOrRight side, ElUpperOrLower uplo, ElOrientation orientation, ElConstMatrix_z A, ElConstMatrix_z t, ElMatrix_z B)
-
ElError
ElApplyQAfterHermitianTridiagDist_z
(ElLeftOrRight side, ElUpperOrLower uplo, ElOrientation orientation, ElConstDistMatrix_z A, ElConstDistMatrix_z t, ElDistMatrix_z B)
References¶
- ELPA2011
Auckenthaler, V. Blum, H.-J. Bungartz, T. Huckle, R. Johanni, L. Kramer, B. Lang, H. Lederer, and P.R. Willems, Parallel solution of partial symmetric eigenvalue problems from electronic structure calculations, Parallel Computing, Vol. 37, Issue 12, pp. 783–794, 2011. DOI: http://dx.doi.org/10.1016/j.parco.2011.05.002
- HJS1999
Bruce Hendrickson, Elizabeth Jessup, and Christopher Smith, Towards an efficient parallel eigensolver for dense symmetric matrices, SIAM Journal on Scientific Computing, Vol. 20, No. 3, pp. 1132–1154, 1999. DOI: http://dx.doi.org/10.1137/S1064827596300681
- Stanley1997
Kendall S. Stanley, Execution time of symmetric eigensolvers, EECS Department, UC Berkeley, Technical Report No. UCB/CSD-98-992, 1997. Accessed from: http://www.eecs.berkeley.edu/Pubs/TechRpts/1999/5365.html