Corrfunc 3.0 release discussion #328

@manodeep

@lgarrison To guide what needs to/could happen for Corrfunc 3.0, here is my list:

Essential (?)

  • Remove python2 support completely

    • Remove python2 from setup.py
    • Remove python2 related code from C extensions
    • Remove 'from __future__' imports and similar Python 2 compatibility constructs from the Python code
  • Add modern packaging with pyproject.toml / meson / whatever-else-we-should-be-using

    • how?
  • Solve the multiple OpenMP runtime library issue

    • I don't know of any way to stop duplicate OMP runtime libraries from being linked alongside the C code. However, we might be able to detect the problem by running a simple OMP reduction routine as a C extension and checking that the result matches the known correct answer.
    • We might also want to switch to creating shared libraries rather than static libraries
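
A complementary check to the reduction test above, sketched in Python: list the shared libraries mapped into the current process and flag the case where more than one OpenMP runtime is present. This is a different technique from the C reduction test, not a replacement for it. The function names, the filename pattern, and the reliance on /proc/self/maps (Linux only) are all assumptions made for illustration.

```python
import os
import re

# Assumed filename pattern for the common OpenMP runtimes:
# GNU (libgomp), Intel (libiomp5) and LLVM (libomp).
_OMP_PATTERN = re.compile(r"lib(gomp|iomp5|omp)\b[^/]*\.(so|dylib)")


def openmp_runtimes(paths):
    """Map each OpenMP runtime flavour found in `paths` to its first library path."""
    found = {}
    for path in paths:
        match = _OMP_PATTERN.match(os.path.basename(path))
        if match:
            found.setdefault(match.group(1), path)
    return found


def loaded_library_paths(maps_file="/proc/self/maps"):
    """Return the files mapped into this process (Linux only; empty list elsewhere)."""
    paths = set()
    try:
        with open(maps_file) as fh:
            for line in fh:
                parts = line.split(None, 5)
                # The sixth column, when present, is the backing file path.
                if len(parts) == 6 and parts[5].startswith("/"):
                    paths.add(parts[5].strip())
    except OSError:
        pass
    return sorted(paths)


def has_multiple_openmp_runtimes():
    """True if more than one OpenMP runtime flavour is currently loaded."""
    return len(openmp_runtimes(loaded_library_paths())) > 1
```

Such a check could run after the C extensions are imported and emit a warning. It only shows that two runtimes are loaded, not that the results are wrong, so the reduction test would still be needed to confirm an actual problem.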

Possibly (?)

  • Add numbins optimisation that only uses the number of bins necessary (as determined by the min. and max. distance possible between the two cells of a cell-pair)

    • Add a function that takes a pair of cells and the histogram bins and returns the min. and max. bin-indices needed
    • These bin indices should be stored as part of the cell-pair struct and passed to the SIMD kernels (by default, the bin indices need to be initialised to 0 and nbins-1 so that all bins are used)
    • Ideally, there is a specialised kernel for a single histogram bin that uses a simd reduction clause, or simply a += . This could be in conjunction with a bit-setting routine that sets bits and shifts to count up to 64 pairs before calling popcnt. However, I don't see how to do that without a LOT OF code duplication.
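
The bin-range helper described in the list above could look roughly like the following Python sketch. It assumes each cell is an axis-aligned box given as a (lower corner, upper corner) pair, that the bin edges are sorted, and that there is no periodic wrapping. The names cell_pair_distance_range and needed_bin_range are hypothetical, and the real version would be written in C against the cell-pair struct.

```python
import math
from bisect import bisect_right


def cell_pair_distance_range(cell_a, cell_b):
    """Return (dmin, dmax) between two axis-aligned boxes, ignoring periodic wrap."""
    (lo_a, hi_a), (lo_b, hi_b) = cell_a, cell_b
    dmin_sq = 0.0
    dmax_sq = 0.0
    for la, ha, lb, hb in zip(lo_a, hi_a, lo_b, hi_b):
        # Smallest separation along this axis (zero if the boxes overlap).
        gap = max(0.0, lb - ha, la - hb)
        # Largest separation along this axis.
        span = max(hb - la, ha - lb)
        dmin_sq += gap * gap
        dmax_sq += span * span
    return math.sqrt(dmin_sq), math.sqrt(dmax_sq)


def needed_bin_range(cell_a, cell_b, rbins):
    """Return (min_bin, max_bin) that can receive pairs, or None if no bin can.

    `rbins` holds the nbins + 1 sorted bin edges; bin i covers
    rbins[i] <= r < rbins[i + 1].
    """
    nbins = len(rbins) - 1
    dmin, dmax = cell_pair_distance_range(cell_a, cell_b)
    if dmax < rbins[0] or dmin >= rbins[-1]:
        return None  # every possible separation falls outside the histogram
    min_bin = max(0, bisect_right(rbins, dmin) - 1)
    max_bin = min(nbins - 1, bisect_right(rbins, dmax) - 1)
    return min_bin, max_bin
```

For two unit cubes separated by one cell width along x and bin edges [0.1, 0.5, 1.0, 2.0, 4.0, 8.0], this returns (2, 3), so the kernel would only need to touch two of the five bins.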
  • Change the OpenMP parallelization to go over cell-pairs (improves cache utilisation, reduces memory requirement -> we can increase the max-bin-ref factors)

    • Create a new generate_cell_pairs function that returns the potential neighbouring cell-pairs for any given primary cell
    • Change the OpenMP parallelization to go over these cell-pairs (improves cache re-use since the primary cell is always one of the cells)
    • Test that the code gives identical results but (hopefully) faster
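
A minimal Python sketch of what generate_cell_pairs might return for one primary cell. The name comes from the list above, but the signature is an assumption: dims is the grid shape, reach is the number of neighbouring cells to visit along each axis (hypothetical, standing in for whatever the bin refine factors imply), and periodic toggles wrap-around. The C implementation would likely return structs rather than index tuples.

```python
from itertools import product


def generate_cell_pairs(primary, dims, reach=1, periodic=True):
    """Return (primary_index, neighbour_index) pairs as flat cell indices.

    `primary` is an (ix, iy, iz) tuple and `dims` is (nx, ny, nz).
    The primary cell is paired with itself as well.
    """
    _, ny, nz = dims

    def flat(ix, iy, iz):
        return (ix * ny + iy) * nz + iz

    primary_index = flat(*primary)
    pairs = []
    seen = set()
    for offsets in product(range(-reach, reach + 1), repeat=3):
        neighbour = []
        for i, off, n in zip(primary, offsets, dims):
            j = i + off
            if periodic:
                j %= n
            elif not 0 <= j < n:
                break  # neighbour lies outside a non-periodic grid
            neighbour.append(j)
        else:
            index = flat(*neighbour)
            # On small periodic grids, different offsets can wrap to the same cell.
            if index not in seen:
                seen.add(index)
                pairs.append((primary_index, index))
    return pairs
```

Because every pair shares the same primary cell, a thread that works through one such list keeps that cell's particles in cache, which is the cache re-use argument made above. Checking that the pair counts are identical to the current implementation would still be the first test.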

Maybe (?)

  • Add ARM64 kernels

    • Add kernels to all pair-counters
    • Run the INTEGRATION_TESTS on laptop
    • Add -march=armv8-a (or -mcpu=apple-m1) to CFLAGS
    • Make sure that the OSX tests run for both ARM64 and Intel cpus
  • Rename package to corrfunc and release conda wheels

    • Add target_clones to all functions
    • cross-compile with recent gcc (possibly on linux + x86_64)
    • Add deprecation option so that import Corrfunc still works (how?)
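
One possible answer to the "how?" above, as a hedged Python sketch: register a thin alias module under the old name that forwards attribute access to the new package and emits a DeprecationWarning. It relies on module-level __getattr__ (PEP 562, Python 3.7+). The helper name install_deprecated_alias is hypothetical, and this is not something Corrfunc currently does.

```python
import sys
import types
import warnings


def install_deprecated_alias(old_name, new_module):
    """Register `old_name` in sys.modules as a warning alias for `new_module`."""
    shim = types.ModuleType(old_name)
    shim.__doc__ = "Deprecated alias for {!r}.".format(new_module.__name__)

    def __getattr__(attr):
        # Called only for attributes the shim itself does not define (PEP 562).
        warnings.warn(
            "'{}' is deprecated; import '{}' instead".format(
                old_name, new_module.__name__
            ),
            DeprecationWarning,
            stacklevel=2,
        )
        return getattr(new_module, attr)

    shim.__getattr__ = __getattr__
    sys.modules[old_name] = shim
    return shim
```

In a real release the same forwarding would more likely live in a small Corrfunc/__init__.py shipped next to the renamed package. One caveat to verify: on case-insensitive filesystems (the macOS default), directories named Corrfunc and corrfunc cannot coexist, so an on-disk shim may not be installable there.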

This is also open for community discussion. If anyone has opinions on what should go in, please do add a comment.
