Quickstart

Htool-DDM can be used for several use cases, described in the sections below: geometric clustering, hierarchical matrix compression, distributed operators and DDM solvers.

Dependencies

Htool-DDM is a header-only C++ library. It requires:

  • A C++ compiler supporting the C++14 standard.

  • BLAS, to perform dense linear algebra operations.

And, depending on the features used:

  • For DDM solvers: an MPI implementation, HPDDM and LAPACK.

  • For shared-memory parallelism with the in-house \(\mathcal{H}\)-matrix algebra: OpenMP.

  • For SVD (re)compression: LAPACK.

  • CMake, to generate a build system for the examples and tests.

Installation

Since Htool-DDM is header-only, it is sufficient to add the include folder of this repository to your project's include path. If you prefer, the following commands copy the include folder to your OS-specific include directory to make it more widely available. In the root of this repository, run:

cmake -B build
cmake --build build --config Release --target install -- -j $(nproc)

Note

The install prefix can be modified by replacing the first line with cmake -B build -DCMAKE_INSTALL_PREFIX:PATH=your/install/path.

Once installed, you can also use Htool-DDM as a CMake imported target in your CMakeLists.txt as follows:

find_package(Htool REQUIRED)
find_package(MPI REQUIRED)
find_package(BLAS REQUIRED)
# other optional packages

add_executable(your_target your_file.cpp)
target_link_libraries(your_target Htool::htool)

Geometric clustering

The hierarchical clustering of a geometry is stored in a htool::Cluster object, which can be created using a htool::ClusterTreeBuilder. For example,

int number_points = 10000;
int spatial_dimension = 3;
int number_of_partitions = 2;
int number_of_children = 2;
std::vector<double> coordinates(number_points*spatial_dimension);
// coordinates = {...}
htool::ClusterTreeBuilder<double> cluster_tree_builder;
htool::Cluster<double> cluster = cluster_tree_builder.create_cluster_tree(number_points, spatial_dimension, coordinates.data(), number_of_children, number_of_partitions);

where

  • number_of_partitions defines the number of children at the level of the cluster tree called the “partition”, which can be used to distribute data in an MPI context.

  • number_of_children defines the number of children of every other non-leaf node.

  • coordinates stores all geometric points associated with the evaluation or discretisation of the kernel to be compressed, interleaved point by point, e.g., in 2D \((x_0,y_0,x_1,y_1,\dots)\); see the sketch below.
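
For instance, the interleaved layout can be filled point by point. The following minimal sketch (requiring <random>) uses random points; it only illustrates the expected memory layout, and any point distribution works:

std::mt19937 random_generator(0);
std::uniform_real_distribution<double> distribution(0., 1.);
for (int p = 0; p < number_points; p++) {
    for (int d = 0; d < spatial_dimension; d++) {
        // component d of point p, stored interleaved
        coordinates[spatial_dimension * p + d] = distribution(random_generator);
    }
}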

Note

See example

Hierarchical compression

To use the in-house hierarchical compression htool::HMatrix, Htool-DDM needs two inputs:

  1. The underlying geometry of the kernel to be compressed for the Geometric clustering.

  2. A function to generate any coefficient of the matrix to be compressed.

Note

See example

Coefficient generator

A coefficient generator is defined by implementing the interface htool::VirtualGenerator, in particular by overriding htool::VirtualGenerator::copy_submatrix(). For example, a user could define the following generator:

class UserOperator : public htool::VirtualGenerator<double> {
  public:
    void copy_submatrix(int M, int N, const int *rows, const int *cols, double *ptr) const override {
        // Fill the M x N submatrix in column-major order
        for (int i = 0; i < M; i++) {
            for (int j = 0; j < N; j++) {
                ptr[i + M * j] = ...; // A(rows[i], cols[j]), where A is the kernel to be compressed
            }
        }
    }
};
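
As a concrete illustration, the following generator evaluates a regularised Laplace-type kernel \(1/(4\pi(|x-y|+\delta))\) on the points stored in coordinates. This is only a sketch: the kernel, the regularisation \(\delta\) and the constructor are assumptions made for this example, not part of the Htool-DDM API.

#include <cmath> // for std::sqrt

class LaplaceKernel : public htool::VirtualGenerator<double> {
    int m_spatial_dimension;
    const double *m_coordinates; // interleaved points, as in Geometric clustering

  public:
    LaplaceKernel(int spatial_dimension, const double *coordinates)
        : m_spatial_dimension(spatial_dimension), m_coordinates(coordinates) {}

    void copy_submatrix(int M, int N, const int *rows, const int *cols, double *ptr) const override {
        const double pi = 3.14159265358979323846;
        for (int i = 0; i < M; i++) {
            for (int j = 0; j < N; j++) {
                // Squared distance between points rows[i] and cols[j]
                double squared_distance = 0.;
                for (int d = 0; d < m_spatial_dimension; d++) {
                    double difference = m_coordinates[m_spatial_dimension * rows[i] + d] - m_coordinates[m_spatial_dimension * cols[j] + d];
                    squared_distance += difference * difference;
                }
                // Column-major storage; the small shift avoids division by zero on the diagonal
                ptr[i + M * j] = 1. / (4. * pi * (std::sqrt(squared_distance) + 1e-10));
            }
        }
    }
};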

Build a hierarchical matrix

To build a htool::HMatrix, a htool::HMatrixBuilder can be used:

// Geometry
const int number_points     = 10000;
const int spatial_dimension = 3;
std::vector<double> coordinates(number_points*spatial_dimension);
// coordinates = {...}

// HMatrix parameters
const double epsilon = 0.01;
const double eta     = 10;
char symmetry        = 'S';
char UPLO            = 'L';

// Generator (see the previous section)
UserOperator A = ...;

// HMatrix
htool::HMatrixBuilder<double> hmatrix_builder(number_points, spatial_dimension, coordinates.data());
htool::HMatrix<double> hmatrix = hmatrix_builder.build(A, htool::HMatrixTreeBuilder<double>(epsilon, eta, symmetry, UPLO));

where

  • epsilon is the parameter controlling the relative error when compressing a subblock with a low-rank approximation.

  • eta is the parameter \(\eta\) in the admissibility condition (1).

  • coordinates stores all geometric points associated with the evaluation or discretisation of the kernel to be compressed, e.g., in 2d \((x_0,y_0,x_1,y_1,...)\).

Note

The geometric clustering is done within the constructor of htool::HMatrixBuilder. You can still access the resulting target and source clusters as public members of hmatrix_builder.

Warning

htool::HMatrix stores pointers to its target and source clusters, so the clusters must outlive the htool::HMatrix objects that reference them. In particular, a htool::HMatrixBuilder, which owns the clusters, must be deleted after any htool::HMatrix built from it, as in the sketch below.
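
A minimal sketch of a safe ordering: objects declared in the same scope are destroyed in reverse order of declaration, so declaring the builder before the \(\mathcal{H}\)-matrix guarantees that the clusters outlive it.

{
    htool::HMatrixBuilder<double> hmatrix_builder(number_points, spatial_dimension, coordinates.data());
    htool::HMatrix<double> hmatrix = hmatrix_builder.build(A, htool::HMatrixTreeBuilder<double>(epsilon, eta, symmetry, UPLO));
    // ... use hmatrix ...
} // hmatrix is destroyed first, then hmatrix_builder and the clusters it owns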

Use a hierarchical matrix

Basic linear algebra is provided for htool::HMatrix. Shared-memory parallelism is also supported via execution policy traits for C++17 and above.

Table 1 Supported operations

Operations                              | C++ function                         | Supported execution policy
----------------------------------------|--------------------------------------|------------------------------------------
Assembly                                | htool::HMatrixTreeBuilder::build     | std::execution::seq, std::execution::par
\(\mathcal{H}\)-matrix vector product   | htool::add_hmatrix_vector_product()  | std::execution::seq, std::execution::par
\(\mathcal{H}\)-matrix matrix product   | htool::add_hmatrix_matrix_product()  | std::execution::seq, std::execution::par
\(\mathcal{H}\)-LU factorisation        | htool::lu_factorization()            | std::execution::seq, std::execution::par
\(\mathcal{H}\)-LU solve                | htool::lu_solve()                    | None
\(\mathcal{H}\)-Cholesky factorisation  | htool::cholesky_factorization()      | std::execution::seq, std::execution::par
\(\mathcal{H}\)-Cholesky solve          | htool::cholesky_solve()              | None

Note

We try to keep an API similar to BLAS/LAPACK or <linalg>. In particular, parallel versions of the linear algebra functions are available through execution policy traits; if none is given, the operation defaults to sequential execution. If the standard execution policy traits are unavailable, you can still use exec_compat::seq and exec_compat::par in C++17 and later, or invoke the underlying function directly with the parallelism type you need.
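
For example, an \(\mathcal{H}\)-matrix vector product \(y \leftarrow \alpha A x + \beta y\) could look as follows. This is a sketch assuming a BLAS gemv-like signature, with the execution-policy overload taking the policy as its first argument; check the API reference for the exact signatures.

#include <execution> // for std::execution::seq and std::execution::par

std::vector<double> x(number_points, 1.), y(number_points, 0.);
// Sequential product: y = 1.0 * hmatrix * x + 0.0 * y
htool::add_hmatrix_vector_product('N', 1., hmatrix, x.data(), 0., y.data());
// Same product with a parallel execution policy (C++17 and later)
htool::add_hmatrix_vector_product(std::execution::par, 'N', 1., hmatrix, x.data(), 0., y.data());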

Distributed operator

A htool::DistributedOperator mainly consists of a vector of global-to-local and local-to-local operators. Here, “local” refers to vectors local to the current MPI process/partition.

Note

See example

Build a distributed operator

To assemble a distributed operator, a geometric clustering must first be performed. The partition defined by the target geometric cluster tree is then used to define a row-wise distributed operator (see Row-wise distributed operator). htool::DefaultApproximationBuilder builds a row-wise distributed operator htool::DistributedOperator in which each block of rows is compressed using a htool::HMatrix; both can be accessed as public members.

htool::HMatrixTreeBuilder<double> hmatrix_tree_builder(epsilon, eta, symmetry, UPLO);
UserOperator A = ...; // coefficient generator, see above
htool::DefaultApproximationBuilder<double> default_approximation_builder(A, target_cluster, source_cluster, hmatrix_tree_builder, MPI_COMM_WORLD);
htool::DistributedOperator<double> &distributed_operator = default_approximation_builder.distributed_operator;

A htool::DistributedOperator object provides distributed products with matrices and vectors.

Use a distributed operator

Basic linear algebra is provided for htool::DistributedOperator. “Local” means local to the current MPI process/partition.
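
As an example, a global product, where every MPI process holds the full input and output vectors, might look as follows. This is a sketch: the member name vector_product_global_to_global is an assumption for this example; consult the htool::DistributedOperator API reference for the exact names and signatures.

// Every process provides the full global vectors (hypothetical member name)
std::vector<double> in_global(number_points, 1.), out_global(number_points, 0.);
distributed_operator.vector_product_global_to_global(in_global.data(), out_global.data());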

DDM Solver

To solve the linear system associated with a htool::DistributedOperator, an iterative solver can be used via a htool::DDM object, which can be built with a htool::DDMSolverBuilder. We give here an example of a simple block-Jacobi solver; in particular, we use htool::DefaultApproximationBuilder::block_diagonal_hmatrix, a pointer to the \(\mathcal{H}\)-matrix of the diagonal block associated with the local subdomain.

// Create DDM object
htool::HMatrix<double> local_hmatrix = *default_approximation_builder.block_diagonal_hmatrix; // copy the diagonal block, since it will be factorized in place
htool::DDMSolverBuilder<double> ddm_solver_builder(distributed_operator, local_hmatrix);
htool::DDM<double> &solver = ddm_solver_builder.solver;

// Prepare solver
HPDDM::Option &opt = *HPDDM::Option::get();
opt.parse("-hpddm_schwarz_method asm");
solver.facto_one_level(); // local_hmatrix is factorized in place

// Solve
int mu = 1; // number of right-hand sides
std::vector<double> x = ...; // unknowns
std::vector<double> b = ...; // right-hand sides
solver.solve(b.data(), x.data(), mu);

Note

We rely on the external library HPDDM for the implementation of efficient iterative solvers. We refer to its cheatsheet, which lists the various options available for the solver.

Note

See example