Merged
180 changes: 136 additions & 44 deletions README.md
@@ -1,69 +1,161 @@
# diff-lapack
# `diffblas`

## Installation
`diffblas` is a library that provides (algorithmically) differentiated BLAS routines, generated from their reference implementation in [LAPACK](https://github.com/Reference-LAPACK/lapack) using the automatic differentiation tool [Tapenade](https://gitlab.inria.fr/tapenade/tapenade) in four modes: forward (`_d`), vector forward (`_dv`), reverse (`_b`), and vector reverse (`_bv`).
The compiled `libdiffblas` can be linked into applications that need derivatives of BLAS operations, for example for optimization or sensitivity analysis.

We support building `diffblas` with the [Meson Build system](https://mesonbuild.com):
This work was inspired in part by the need to differentiate the Fortran code [HFBTHO](https://www.sciencedirect.com/science/article/abs/pii/S0010465525004564), which uses LAPACK and BLAS routines, and to use the differentiated application for gradient-based optimization.

## Using the pre-compiled library from your application

Use the library the same way the tests do:
1. Call the differentiated routines from your application.
2. Include the right header/module.
3. Link your application with `libdiffblas` and the BLAS library (`refblas`).


**Calling convention (same as in the tests):**

1. **Forward (`_d`):** Declare the differentiated routine as `external` (or use a module). Call it with the same arguments as the original BLAS routine, plus one extra “derivative” argument per floating-point input/output (e.g. `a`, `a_d`). Example for DGEMM:
```fortran
external :: dgemm_d
! ... set transa, transb, m, n, k, alpha, a, lda, b, ldb, beta, c, ldc
! ... set alpha_d, a_d, b_d, beta_d, c_d (derivative inputs; c_d is output)
call dgemm_d(transa, transb, m, n, k, alpha, alpha_d, a, a_d, lda, b, b_d, ldb, beta, beta_d, c, c_d, ldc)
```
2. **Reverse (`_b`):** Call the reverse routine with the same shape as the forward call; the “b” arguments carry adjoints (see generated test files such as `BLAS/test/test_dgemm_reverse.f90`).
3. **Vector forward (`_dv`) / vector reverse (`_bv`):** Same idea with an extra dimension for multiple directions; see `BLAS/test/test_*_vector_forward.f90` and `test_*_vector_reverse.f90`.

***Handling dimensions:***
BLAS uses assumed-size arrays, as seen in `dgemm.f`:
```fortran
DOUBLE PRECISION A(LDA,*),B(LDB,*),C(LDC,*)
```

To make the differentiated versions of these routines usable in a portable way, you must call additional routines before calling the differentiated routine. For **reverse** (`_b`) and **vector reverse** (`_bv`) modes, and in some cases **vector forward** (`_dv`), the differentiated code expects certain "ISIZE" values to be set before the call. These describe the dimensions of the assumed-size arrays and matrices.

For example (as in `BLAS/test/test_dgemm_reverse.f90`): for `dgemm_b`, set the size of the second dimension of the two matrix arguments `A` and `B` before the call:

```fortran
use DIFFSIZES
! ...
call set_ISIZE2OFA(lda) ! leading dimension of A
call set_ISIZE2OFB(ldb) ! leading dimension of B
call dgemm_b(transa, transb, m, n, k, alpha, alphab, a, ab, lda, b, bb, ldb, beta, betab, c, cb, ldc)
call set_ISIZE2OFA(-1)
call set_ISIZE2OFB(-1)
```

Matrix routines often use `set_ISIZE2OFA` / `set_ISIZE2OFB`; level-1 routines with vectors may use `set_ISIZE1OFX`, `set_ISIZE1OFY`, or `set_ISIZE1OFZx` / `set_ISIZE1OFZy`. The tests in `BLAS/test/` show the exact `set_ISIZE*` calls for each routine. If a required setter is not called, the differentiated routine will stop with an error. You can optionally reset the dimensions by calling `set_ISIZE*(-1)` so that the next use must set the dimension again. This is recommended if you reuse the same differentiated routine in multiple contexts.

***Handling directions in vector mode:***
The number of directions is currently hardcoded to 4. If you need a different number, download the source of this repository, edit `nbdirsmax` in both `BLAS/include/DIFFSIZES.inc` and `BLAS/include/DIFFSIZES.F90`, and then rebuild the library.
```fortran
INTEGER, PARAMETER :: nbdirsmax = 4
```
```fortran
integer nbdirsmax
parameter (nbdirsmax=4)
```
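
The edit described above can also be scripted. The following is a minimal sketch, assuming both files declare `nbdirsmax` in lowercase exactly as in the two snippets above; `set_nbdirsmax` is a hypothetical helper and not part of the repository.

```bash
# Hypothetical helper (not part of the repository): rewrite the value assigned
# to nbdirsmax in one file. Usage: set_nbdirsmax FILE N
set_nbdirsmax() {
    nb_file=$1
    nb_value=$2
    # Match "nbdirsmax = 4" as well as "nbdirsmax=4" and replace only the number.
    sed -E "s/(nbdirsmax[[:space:]]*=[[:space:]]*)[0-9]+/\1${nb_value}/" \
        "$nb_file" > "$nb_file.tmp" && mv "$nb_file.tmp" "$nb_file"
}

# Intended use from the repository root, followed by a rebuild of the library:
#   set_nbdirsmax BLAS/include/DIFFSIZES.inc 8
#   set_nbdirsmax BLAS/include/DIFFSIZES.F90 8
```

Writing to a temporary file and moving it into place avoids `sed -i`, whose syntax differs between GNU and BSD `sed`.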


**Minimal example (forward mode, one routine):** Compile your main program and link with the built library and BLAS:

```bash
meson setup builddir
meson compile -C builddir
meson install -C builddir
gfortran -O2 -I/path/to/diff-lapack/BLAS/include -c my_main.f90 -o my_main.o
gfortran -o my_main my_main.o /path/to/diff-lapack/builddir/libdiffblas.a -L$LAPACKDIR -lrefblas
```

By default, Meson compiles a static library `libdiffblas.a`.
For a shared library, please use the option `-Ddefault_library=shared`.
To install in a specific location, use the option `--prefix=install/dir`.
(If you built with Make, use `BLAS/build/libdiffblas_d.a` for forward mode only.) The tests in `BLAS/test/` (e.g. `test_dgemm.f90`, `test_dgemm_reverse.f90`) are full examples of how to declare and call the differentiated routines and how to link; use them as templates for your application.

## Building the library (and tests)

You need **pre-generated** sources (see "Creating the library sources with run_tapenade_blas.py"). The build compiles them and links with a BLAS library and the Tapenade adStack runtime.

### Build with Meson (library and tests)

**Dependencies:**

- **Fortran compiler** (e.g. gfortran, ifort, ifx) and **C compiler** (e.g. gcc).
- **LAPACK installation** — a built Reference LAPACK (or compatible) providing BLAS (e.g. `librefblas.a` or `libblas.a`). Set **`LAPACKDIR`** (or equivalent) so Meson can find it (see below).
- **Tapenade adStack** — the repo already contains `TAPENADE/adStack.c` and `TAPENADE/include/`; Meson compiles and links these automatically. No separate Tapenade install is required for the build.

**Configure and build from the project root:**

```bash
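# The setup commands below are alternatives; run only one of them.
# Shared library, installed under ./diffblas
meson setup builddir -Ddefault_library=shared --prefix=$(pwd)/diffblas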
# If BLAS is in a custom location (e.g. LAPACK build dir), pass library search path
meson setup builddir -Dlibblas=refblas -Dlibblas_path=/path/to/lapack/build

# Or rely on system / environment (e.g. LIBRARY_PATH, or refblas in default path)
meson setup builddir -Dlibblas=refblas

meson compile -C builddir
meson install -C builddir
```

## Launching the differentiation
This produces `builddir/libdiffblas.a` (or a shared library if configured with `-Ddefault_library=shared`). The Meson build uses `BLAS/meson.build` and compiles everything in `BLAS/src/` plus `BLAS/include/DIFFSIZES.f90`, `BLAS/src/DIFFSIZES_access.f`, and `TAPENADE/adStack.c`; it does **not** build the test executables (tests are built by the BLAS Makefile, see "Build with Make").

Get the latest version from the LAPACK webpage
To also run the tests, use the BLAS Makefile (see "Build with Make") and run the test programs built there. To install the library:

```shell
wget -q -O tmp.html https://raw.githubusercontent.com/Reference-LAPACK/lapack/refs/heads/master/README.md
VERSION=`cat tmp.html | grep "VERSION" | tail -1 | awk '{print $3}'`
rm tmp.html
```
```bash
meson install -C builddir --prefix /your/install
```
And use this to create the link to the most recent LAPACK release

```shell
wget -P tmp/ https://github.com/Reference-LAPACK/lapack/archive/refs/tags/v$VERSION.tar.gz
tar xvzf tmp/v$VERSION.tar.gz -C tmp/
rm tmp/v$VERSION.tar.gz
python run_tapenade_blas.py --input-dir tmp/lapack-$VERSION/BLAS/SRC/ --out-dir out
```
### Build with Make (library and tests)

**Dependencies:**

These commands are gathered in the bash script `get_and_run.sh`, which can be called directly from bash.
- **Fortran compiler** (e.g. gfortran) and **C compiler** (e.g. gcc).
- **LAPACK installed** — build and install [Reference LAPACK](https://github.com/Reference-LAPACK/lapack) so that you have `librefblas` (or set `BLAS_LIB` in the Makefile to point at your BLAS).
- **`LAPACKDIR`** — set to the directory where LAPACK was built (or where `librefblas.a` lives), so the linker can find `-lrefblas`:
```bash
export LAPACKDIR=/path/to/lapack/build
```
- **Tapenade adStack** — the Makefile looks for `adStack.c` in this order: `BLAS/src/adStack.c`, then `../TAPENADE/adStack.c`, then `$TAPENADEDIR/ADFirstAidKit/adStack.c`. The repo’s `TAPENADE/` copy is used by default; no Tapenade install needed if you do not override.
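
The lookup order for `adStack.c` can be sketched as a small shell function. This illustrates the order described above and is not the Makefile's actual code; `find_adstack` is a hypothetical helper that takes the path of the `BLAS` directory.

```bash
# Sketch of the adStack.c lookup order (hypothetical helper, not the Makefile).
# Usage: find_adstack /path/to/diff-lapack/BLAS
find_adstack() {
    ad_blas_dir=$1
    # Candidates in priority order: BLAS/src first, then the repository copy.
    set -- "$ad_blas_dir/src/adStack.c" "$ad_blas_dir/../TAPENADE/adStack.c"
    # A Tapenade install is consulted last, and only if TAPENADEDIR is set.
    if [ -n "${TAPENADEDIR:-}" ]; then
        set -- "$@" "$TAPENADEDIR/ADFirstAidKit/adStack.c"
    fi
    for ad_candidate in "$@"; do
        if [ -f "$ad_candidate" ]; then
            printf '%s\n' "$ad_candidate"
            return 0
        fi
    done
    echo "adStack.c not found" >&2
    return 1
}
```

Running it before `make` shows which copy of `adStack.c` would be picked up under this order.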

Some remarks:
-------------
**Build from the BLAS directory:**

* If one needs to run only a subset of the BLAS files:
```shell
python run_tapenade_blas.py --input-dir tmp/lapack-$VERSION/BLAS/SRC/ --out-dir out --file FILE1 FILE2
```
```bash
cd BLAS
export LAPACKDIR=/path/to/your/lapack/build # or wherever librefblas is
make
```
* The script has been tested with Python 3. Depending on your system, you may need to adapt the command line to your Python installation (e.g. `python3` instead of `python`).
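
One way to handle systems where the interpreter is named differently is to pick the first name that exists. This is a minimal sketch; `first_available` is a hypothetical helper, not part of the repository.

```bash
# Hypothetical helper: print the path of the first command name that exists.
# Usage: first_available NAME [NAME...]
first_available() {
    for fa_name in "$@"; do
        if command -v "$fa_name" >/dev/null 2>&1; then
            command -v "$fa_name"
            return 0
        fi
    done
    return 1
}

# Intended use:
#   PYTHON=$(first_available python3 python) || echo "no Python interpreter found" >&2
#   "$PYTHON" run_tapenade_blas.py --help
```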

## Calling the tests
This builds per-mode static libraries (`build/libdiffblas_d.a`, `libdiffblas_b.a`, `libdiffblas_dv.a`, `libdiffblas_bv.a`) and test executables in `build/`. Run tests:

```shell
cd out/
make # Will compile everything
./run_tests.sh # will run all available tests
```
```bash
./run_tests.sh
# or run individual test executables, e.g. build/test_dgemm, build/test_dgemm_reverse
```
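
To run the individual test executables in one pass, a loop such as the following can be used. It is a sketch and not the repository's `run_tests.sh`; `run_test_executables` is a hypothetical helper, and it assumes that a failing test exits with a nonzero status.

```bash
# Hypothetical helper (not the repository's run_tests.sh): run every executable
# named test_* in a directory and report PASS/FAIL from its exit status.
# Assumption: a failing test exits with a nonzero status.
run_test_executables() {
    rt_dir=$1
    rt_failed=0
    for rt_exe in "$rt_dir"/test_*; do
        # Skip object files, sources, and an unmatched glob pattern.
        if [ ! -f "$rt_exe" ] || [ ! -x "$rt_exe" ]; then
            continue
        fi
        if "$rt_exe" >/dev/null 2>&1; then
            echo "PASS ${rt_exe##*/}"
        else
            echo "FAIL ${rt_exe##*/}"
            rt_failed=1
        fi
    done
    # Returns 1 if at least one executable failed, 0 otherwise.
    return "$rt_failed"
}

# Intended use from the BLAS directory: run_test_executables build
```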

Similarly, one can limit to certain functions only:
```shell
make dgemm # Will compile only dgemm
```
## Creating the library sources with run_tapenade_blas.py

This step **generates** the differentiated Fortran sources and test programs (e.g. under `BLAS/src/`, `BLAS/test/`, or a custom output directory like `out/`).

**Dependencies:**

- **LAPACK source tree** — Reference LAPACK (or at least its BLAS SRC directory) so that `--input-dir` points to the BLAS source files (e.g. `lapack-3.x/BLAS/SRC/`).
- **Tapenade** — installed and on your `PATH` (or pass `--tapenade-bin=/path/to/tapenade`).
- **Python 3** — to run `run_tapenade_blas.py`.

**Example: generate from a Reference LAPACK tarball**

```bash
# Download and unpack Reference LAPACK (or set LAPACK_SRC to your tree)
wget -q -O tmp.html https://raw.githubusercontent.com/Reference-LAPACK/lapack/refs/heads/master/README.md
VERSION=$(grep VERSION tmp.html | tail -1 | awk '{print $3}')
rm tmp.html
wget -P tmp/ https://github.com/Reference-LAPACK/lapack/archive/refs/tags/v${VERSION}.tar.gz
tar xzf tmp/v${VERSION}.tar.gz -C tmp/

# Generate differentiated BLAS into BLAS/ (flat layout: src/, test/, include/)
python run_tapenade_blas.py --input-dir tmp/lapack-${VERSION}/BLAS/SRC/ --out-dir BLAS --flat
```
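
The `VERSION` line in the block above takes the third whitespace-separated field of the last README line that contains `VERSION`. The sketch below runs the same pipeline on sample input and guards against an empty result. The two sample lines are illustrative and assume release lines of the form `* VERSION x.y.z`; they are not copied from the LAPACK README.

```bash
# Same grep/tail/awk pipeline as above, reading README text from standard input.
latest_version() {
    grep VERSION | tail -1 | awk '{print $3}'
}

# Illustrative sample input (an assumption about the README layout).
sample_readme='* VERSION 3.11.0
* VERSION 3.12.0'

VERSION=$(printf '%s\n' "$sample_readme" | latest_version)
if [ -z "$VERSION" ]; then
    # An empty value would otherwise produce a broken download URL.
    echo "could not determine the LAPACK version" >&2
else
    echo "would download v${VERSION}.tar.gz"
fi
```

If the README layout changes so that the version is no longer the third field, the guard makes the failure visible before `wget` is called.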

This library generates a differentiated BLAS code using Tapenade in the following modes.
1. Forward mode
2. Vector forward mode
3. Reverse mode
4. Vector reverse mode
**Options:** Use `--file dgemm.f sgemm.f` to restrict to specific routines; `--mode d b` to generate only certain modes; see `python run_tapenade_blas.py --help`.

## Future work
Several BLAS routines have not yet been differentiated and verified; we plan to add them. We also plan to add differentiated versions of the CBLAS routines.
We are beginning to write a differentiated version of the LAPACK library.

## Acknowledgement
This work was supported in part by the Applied Mathematics activity within the U.S. Department of Energy, Office of Science, Office
of Advanced Scientific Computing Research Applied Mathematics, and Office of Nuclear Physics SciDAC program under Contract No. DE-AC02-06CH11357. This work was supported in part by NSF CSSI grant 2104068.