Skip to content

Conversation

@nevi-me
Copy link
Contributor

@nevi-me nevi-me commented Jan 25, 2021

This is on top of #9297

I was curious if (ab)using the compute::unary kernel would perform better on slightly complex functions.

I implemented the Haversine function, which calculates the distance between two geographic coordinates.
I then benchmarked an implementation that I tried to simplify and optimise with unary kernels, vs one that I'd have to write if I couldn't use the unary kernels for things like:

  • arithmetics with scalars
  • functions that would otherwise require generating intermediate arrays (e.g. sin(x) * cos(x) would be multiply(sin(x), cos(x)))

The function that uses unary kernels for the above, is slightly faster.

I ran this on an M1 CPU, with the below options

cargo bench --bench trigonometry_kernels
cargo bench --bench trigonometry_kernels --features simd
RUSTFLAGS="-C target-cpu=native" cargo bench --bench trigonometry_kernels
RUSTFLAGS="-C target-cpu=native" cargo bench --bench trigonometry_kernels --features simd
haversine_no_unary 512  time:   [14.074 us 14.140 us 14.216 us]

haversine_unary 512     time:   [11.191 us 11.308 us 11.436 us]
haversine_no_unary_nulls 512                                                                             
                        time:   [15.902 us 15.985 us 16.083 us]

haversine_unary_nulls 512                                                                             
                        time:   [12.486 us 12.552 us 12.625 us]

The biggest benefit is from setting the RUSTFLAGS, the non-null benches go 3-10% faster.

@nevi-me nevi-me marked this pull request as draft January 25, 2021 06:04
@github-actions
Copy link

Thanks for opening a pull request!

Could you open an issue for this pull request on JIRA?
https://issues.apache.org/jira/browse/ARROW

Then could you also rename pull request title in the following format?

ARROW-${JIRA_ID}: [${COMPONENT}] ${SUMMARY}

See also:

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jorgecarleitao I had to convert the floats to T::Native in order to use them in the unary kernel as part of Fn(_) expressions.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did not find any easy way of doing this, also. It feels like we would need a generic of a generic. It looks good to me, though.

@jorgecarleitao
Copy link
Member

IMO looked good to me (before the rebase), however, a more through review would need a rebase xD

@nevi-me nevi-me force-pushed the trigonometry-kernels branch 2 times, most recently from d19ae1d to ef36a3d Compare January 28, 2021 03:45
@nevi-me
Copy link
Contributor Author

nevi-me commented Jan 28, 2021

I've rebased.

Ideas for the PR before I work more on it:

  • Are the trig kernels fine (is the direction good)?
  • Should I remove Haversine? I was using it to evaluate the performance benefit of using unary

@codecov-io
Copy link

Codecov Report

Merging #9313 (ef36a3d) into master (53d392c) will increase coverage by 0.00%.
The diff coverage is 87.01%.

Impacted file tree graph

@@           Coverage Diff           @@
##           master    #9313   +/-   ##
=======================================
  Coverage   81.90%   81.91%           
=======================================
  Files         216      217    +1     
  Lines       52990    53066   +76     
=======================================
+ Hits        43402    43468   +66     
- Misses       9588     9598   +10     
Impacted Files Coverage Δ
rust/arrow/src/compute/kernels/trigonometry.rs 86.48% <86.48%> (ø)
rust/arrow/src/compute/kernels/arity.rs 92.85% <100.00%> (+1.19%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 53d392c...ef36a3d. Read the comment docs.

Copy link
Member

@jorgecarleitao jorgecarleitao left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not sure if this is ready to review, as it is marked as draft, but I reviewed it anyways.

LGTM, thanks a lot for taking this! :)

I left some ideas / suggestions but I am fine as is also.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

shouldn't we move this to the tests, if it is only used for testing?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've removed it, as it's also in the benchmark

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would try to get the result from the exercise, drop this code once the best option was found, and document which was was better in implementation notes on the decided approach.

@nevi-me
Copy link
Contributor Author

nevi-me commented Feb 4, 2021

Thanks @jorgecarleitao, I'm currently constrained with time during the week. I'll finish this up over the weekend

@nevi-me nevi-me force-pushed the trigonometry-kernels branch from ef36a3d to 7423452 Compare February 5, 2021 19:58
@alamb
Copy link
Contributor

alamb commented Mar 3, 2021

@nevi-me I am closing this PR for the time being to clean up the Rust/Arrow PR backlog. Please let me know if this is a mistake

@alamb alamb closed this Mar 3, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants