Commit 5d045e8

Update readme
1 parent d0baee6 commit 5d045e8

1 file changed: +63 -19 lines changed

README.md

Lines changed: 63 additions & 19 deletions
@@ -1,30 +1,36 @@
 # OpenAlphaDiffract

-OpenAlphaDiffract is an open-source implementation of the AlphaDiffract research project. It provides a reproducible pipeline to:
+OpenAlphaDiffract is an open-source implementation of the [AlphaDiffract](https://arxiv.org/abs/2603.23367) research project. It provides a reproducible pipeline to:
 - Create a dataset from the Materials Project
 - Simulate powder diffraction patterns from those structures
 - Train and evaluate models on the generated dataset
 - Run an inference web app to try out the model


-## Dataset Pipeline Overview
+## Pretrained Model & Demo

-1. Acquire CIFs (downloader.Dockerfile)
+- **Model weights:** [linked-liszt/OpenAlphaDiffract](https://huggingface.co/linked-liszt/OpenAlphaDiffract) on Hugging Face
+- **Live demo:** [linked-liszt/OpenAlphaDiffract-UI](https://huggingface.co/spaces/linked-liszt/OpenAlphaDiffract-UI) on Hugging Face Spaces
+
+
+## Dataset Pipeline
+
+1. Acquire CIFs (`docker/downloader.Dockerfile`)
 - Uses the Materials Project API to fetch crystal structures as CIF files
 - Configurable via `configs/download.yaml`
 - Filters structures by checking conventional cell consistency across multiple angle tolerances. This filters ~4.4% of MP structures as of 10/22/2025.

-2. GSAS-II XRD Simulation (simulator.Dockerfile)
+2. GSAS-II XRD Simulation (`docker/simulator.Dockerfile`)
 - Generates synthetic powder diffraction patterns from CIFs
 - Configurable via `configs/simulator.yaml` (e.g., instrument file, noise ranges, job parallelism)
 - Creates .npy files with simulated pattern and metadata ready to be consumed by the training system

-3. Open Alpha Diffract Training (trainer.Dockerfile)
+3. Open Alpha Diffract Training (`docker/trainer.Dockerfile`)
 - Trains the multi-task AlphaDiffract model on the generated dataset
 - Configurable via `configs/trainer.docker.yaml` or `configs/trainer.local.yaml`
 - Logs checkpoints and metrics (CSV and optional MLflow)

-4. XRD Inference Web App (ui.Dockerfile)
+4. XRD Inference Web App (`docker/ui.Dockerfile`)
 - FastAPI service for model inference with a React frontend
 - Accepts processed XRD patterns and returns predictions via `/api/predict`
 - Serves the built frontend from the same container
@@ -39,8 +45,8 @@ Prerequisites:
 > [!WARNING]
 > Building the dataset and training will take a significant amount of space and computational resources:
 > - Expect to use around 1TB+ of space in total to replicate the paper's 100-variation dataset
-> - We recommend running simulation with ~100 processes in parallel. For reference [XYZ] this should take [XYZ hours].
-> - Training took [XYZ hours] on [XYZ hardware]
+> - We recommend running simulation with ~100+ processes in parallel. For reference, simulation took ~18 hours on 2x AMD EPYC 7742 (128 processes).
+> - Training took ~15 hours on a single H100 GPU.


 Setup:
@@ -50,11 +56,11 @@ Setup:
 - Optionally set `UID` and `GID` so the containers write files as your user.

 2. Download CIFs:
-- `scripts/download.sh` (or docker compose run --rm trainer)
+- `scripts/download.sh` (or `docker compose run --rm downloader`)
 - CIFs will be written to `./data/raw_cif`

 3. Simulate diffraction patterns:
-- `scripts/simulate.sh` (or docker compose run --rm simulator)
+- `scripts/simulate.sh` (or `docker compose run --rm simulator`)
 - Patterns will be written to `./data/dataset`
 - Errors (if any) go to `./data/error_logs`

@@ -65,6 +71,7 @@ Setup:
 5. Run the inference UI:
 - Move a model checkpoint to `./src/ui/models/xrd_model.ckpt`
 - `docker compose up ui`
+- Open `http://localhost:7860`

 Notes:
 - You can pass extra CLI args to the simulator via `scripts/simulate.sh`, e.g. `--sims_per_file 1 --parallel_jobs 4`
@@ -74,17 +81,54 @@ Notes:

 ```
 OpenAlphaDiffract/
-├── configs/ - Pipeline configuration files
-├── docker/ - Container definitions
-├── scripts/ - User-facing scripts
-├── src/ - Source code for pipeline components
-│   ├── downloader/
-│   ├── simulator/
-│   ├── trainer/
-│   └── ui/
+├── configs/ - Pipeline configuration files
+│   ├── instruments/ - GSAS-II instrument parameter files
+│   └── resources/ - Space group distance matrix for GEMD loss
+├── docker/ - Container definitions
+├── scripts/ - User-facing scripts
+├── src/
+│   ├── downloader/ - Materials Project CIF acquisition
+│   ├── simulator/ - GSAS-II powder diffraction simulation
+│   ├── trainer/ - Model definition, dataset, and training loop
+│   └── ui/ - FastAPI backend + React frontend
+│       └── frontend/
+└── compose.yaml
 ```


+## Testing
+
+Tests run in CI via GitHub Actions on every push/PR to `main`. To run locally:
+
+```bash
+# All Python tests (from repo root)
+pytest
+
+# Individual components
+pytest src/downloader/tests/ -v
+pytest src/simulator/tests/ -v
+pytest src/trainer/tests/ -v
+pytest src/ui/tests/ -v
+
+# Frontend tests
+cd src/ui/frontend && npx vitest run
+```
+
+
+## License
+
+This project is licensed under the BSD 3-Clause License. See [LICENSE](LICENSE) for details.
+
+
 ## Citation

-We hope this code was helpful to your work! If you use our code or extend our work, please consider citing our paper. (Will be added once released!)
+We hope this code was helpful! Please consider citing our paper:
+
+```bibtex
+@article{andrejevic2026alphadiffract,
+title={AlphaDiffract: Automated Crystallographic Analysis of Powder X-ray Diffraction Data},
+author={Andrejevic, Nina and Du, Ming and Sharma, Hemant and Horwath, James P. and Luo, Aileen and Yin, Xiangyu and Prince, Michael and Toby, Brian H. and Cherukara, Mathew J.},
+journal={arXiv preprint arXiv:2603.23367},
+year={2026}
+}
+```
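
The updated README says the inference UI is reachable at `http://localhost:7860` and returns predictions via `/api/predict`, but it does not document the request format. The following is a minimal sketch of how a client might call that endpoint using only the Python standard library. The JSON body shape (a single `pattern` field holding a list of intensities) and the helper names `build_predict_request` and `predict` are assumptions made for illustration, not part of the project; check the FastAPI route definition under `src/ui/` for the real schema.

```python
import json
import urllib.request

# Base URL taken from the README step "Open `http://localhost:7860`".
BASE_URL = "http://localhost:7860"


def build_predict_request(intensities, base_url=BASE_URL):
    """Build (but do not send) a POST request for the /api/predict endpoint.

    ASSUMPTION: the body shape {"pattern": [...]} is hypothetical; the README
    does not describe the request schema.
    """
    body = json.dumps({"pattern": [float(x) for x in intensities]}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/api/predict",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def predict(intensities, base_url=BASE_URL, timeout=30.0):
    """Send the request and decode the JSON reply.

    Requires the UI container to be running (`docker compose up ui`).
    """
    request = build_predict_request(intensities, base_url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Building the request separately from sending it lets the URL and payload be inspected without a running container; `predict([...])` only succeeds once the service is up and the assumed payload shape matches what the backend expects.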
