Further cleaning the handling of grid searching errors #2177
Conversation
Note that this breaks some of the unit tests
If you get rid of the while loop, how will we handle the case where particles come in with an existing element index?

Well, since the Morton encoding is so fast, we may not even need it. But if this PR is successful, we could save some memory overhead by not using the element index?
I guess it would be worth pulling runtimes for a few key curvilinear grid examples (e.g. MOi) from v4-dev and comparing. In the meantime, I'm working on getting you up to speed on how the Morton hashing works. There's a new notebook on the unstructured-mesh-parcels-sandbox repository I should be able to finish up in a few days. The goal is to help facilitate discussion on algorithm choices around two main features of the Morton hashing implementation.
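For readers catching up on how Morton hashing works: the core idea is interleaving the bits of the two per-axis cell indices into a single z-order code, so that nearby cells get nearby codes. The sketch below is purely illustrative and is not the Parcels implementation; the function name and default bit width are assumptions.

```python
import numpy as np

def morton_encode_2d(ix, iy, bits=16):
    """Interleave the bits of two integer cell indices into one Morton
    (z-order) code. ix, iy must lie in [0, 2**bits)."""
    ix = np.asarray(ix, dtype=np.int64)
    iy = np.asarray(iy, dtype=np.int64)
    code = np.zeros_like(ix)
    for b in range(bits):
        code |= ((ix >> b) & 1) << (2 * b)      # ix bits go to even positions
        code |= ((iy >> b) & 1) << (2 * b + 1)  # iy bits go to odd positions
    return code
```

Because the loop runs over bits rather than points, this encodes whole NumPy arrays of indices at once, which is part of why the encoding is cheap even for millions of particles.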
…l check. The query method now requires a point-in-cell check method to be sent in as an argument
@erikvansebille - this last commit here fixes the issue. To gain speed, I originally went with a quick-and-dirty distance minimization for the point-in-cell check. Additionally, this only required registering faces to the hash table based on Morton encoding of the source grid face centers. While this is fast and puts us "in the neighborhood", as you've found, it still requires some additional searching. Here, I've switched (back) to registering faces to the hash table by calculating all the Morton codes associated with a face's bounding box. This slightly increases the storage requirement for the hash table, but still seems to be OK; I need to see how this does on the MOi grid. Additionally, the query method now requires a point-in-cell check method to be passed in as an argument. What needs to happen next (assuming we're happy with this):
Let me know if you'd want to hop on a call Monday afternoon to talk this over a bit. Could be quicker in getting this resolved.
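The bounding-box registration described above can be sketched as follows. This is a simplified illustration only: the array layout, the function name, and the use of a plain dict keyed by (i, j) instead of the actual Morton-coded hash structure are all assumptions, not the PR's code.

```python
import numpy as np
from collections import defaultdict

def register_faces_by_bbox(face_lon, face_lat, nbits=8):
    """Register each face under every hash cell its bounding box overlaps.

    face_lon, face_lat: arrays of shape (nfaces, nverts) holding the corner
    coordinates of each face. Assumes a non-degenerate domain.
    Returns a dict mapping hash-cell key -> list of face indices.
    """
    ncells = 2 ** nbits
    lon0, lat0 = face_lon.min(), face_lat.min()
    dlon = (face_lon.max() - lon0) / ncells   # implied hash-grid cell size
    dlat = (face_lat.max() - lat0) / ncells
    # Per-face bounding box, expressed in hash-grid cell indices
    i_min = np.clip(((face_lon.min(axis=1) - lon0) / dlon).astype(int), 0, ncells - 1)
    i_max = np.clip(((face_lon.max(axis=1) - lon0) / dlon).astype(int), 0, ncells - 1)
    j_min = np.clip(((face_lat.min(axis=1) - lat0) / dlat).astype(int), 0, ncells - 1)
    j_max = np.clip(((face_lat.max(axis=1) - lat0) / dlat).astype(int), 0, ncells - 1)
    table = defaultdict(list)
    for face, (i0, i1, j0, j1) in enumerate(zip(i_min, i_max, j_min, j_max)):
        for i in range(i0, i1 + 1):
            for j in range(j0, j1 + 1):
                table[(i, j)].append(face)  # key could equally be a Morton code
    return table
```

Registering every overlapped cell (instead of only the face-center cell) is what guarantees the query lands on a candidate list that actually contains the enclosing face, at the cost of a somewhat larger table.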
This ensures overlaps are properly calculated across the antimeridian
We're back to having long construction times for the MOi grid. I'm working on incorporating land masks again.
My calendar is fully blocked on Monday and Tuesday; shall we discuss this on our regular Wednesday call? I'm not sure I'm a fan of landmasks, though; they put a lot of extra burden on users (who may not even have a landmask of their flow field). If we can't get the cell search faster, then let's just leave the while-loop in?
The search is plenty fast. It's the construction (of specifically MOi) that is painfully slow.
I'm planning on seeing if increasing the bit width helps improve initialization time. This should effectively increase the resolution of the implied hash grid and reduce the number of hits per hash cell during construction. As before, some hash cells overlap thousands of elements, which is what is causing the slowdown.
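The arithmetic behind that idea: each extra bit per axis doubles the number of hash cells per axis, so the implied hash-grid cell shrinks by half and fewer source-grid faces land in any one cell during construction. A back-of-the-envelope illustration (the domain extent and bit widths here are examples, not the PR's settings):

```python
# Each extra bit per axis halves the hash-cell size on that axis.
domain_deg = 360.0  # e.g. the zonal extent of a global grid
cell_sizes = {bits: domain_deg / 2**bits for bits in (8, 10, 12)}
for bits, size in cell_sizes.items():
    print(f"{bits} bits/axis -> {2**bits} cells/axis, {size:.3f} deg per hash cell")
```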
@erikvansebille - I've done some work to vectorize the hash table creation, which significantly improves performance. On the MOi grid, it takes 8-9 s (on my laptop) to initialize the hash table, and less than a second for a 1 million particle search. Right now, the computational coordinates are computed twice. We'll need to look into a strategy to only compute them once during the point-in-cell check and filter out the correct ones within query. I'm keen to see what your experience is with this new version on the MOi grid. @VeckoTheGecko - I'd be interested to hear your thoughts on the code organization here. Let me know what you think.
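One common NumPy pattern for vectorizing this kind of grouping is a sort-based, CSR-style layout: sort the (entry, code) pairs once, then slice by code, with no Python loop over entries. This is a hedged sketch of the general technique, not the code in this PR:

```python
import numpy as np

def build_hash_table(codes):
    """Group entry indices by hash code without a Python loop (CSR layout).

    codes[k] is the hash code of entry k (e.g. one entry per face per hash
    cell its bounding box overlaps). Returns (unique_codes, offsets, faces)
    such that the entries in cell unique_codes[c] are
    faces[offsets[c]:offsets[c+1]].
    """
    faces = np.argsort(codes, kind="stable")              # entry ids, sorted by code
    unique_codes, counts = np.unique(codes, return_counts=True)
    offsets = np.concatenate(([0], np.cumsum(counts)))    # CSR row pointers
    return unique_codes, offsets, faces
```

The whole construction is then dominated by one O(n log n) sort, rather than millions of Python-level list appends.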
Great work @fluidnumerics-joe! I just added performance benchmarking at Parcels-code/parcels-benchmarks#5 (comment). We get a factor 10 speedup for 5 million particles. The question is whether we consider this fast enough to drop the while-loop.
Too cool! It was looking good on my side, but I didn't want to claim victory until you had numbers on your system too. This is fantastic!
I think that's a good idea. I'd like to have the point-in-cell check return computational coordinates anyway (to cut the number of coordinate evaluations in half), and this does seem to fit nicely.
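For reference, a point-in-cell check that also returns the computational coordinates (so the caller can reuse them instead of recomputing) might look like the following Newton inversion of the bilinear map on a quadrilateral cell. This is purely illustrative; the actual Parcels routine and its signature differ.

```python
import numpy as np

def point_in_cell(px, py, xq, yq, tol=1e-10):
    """Invert the bilinear map of a quad cell; return (inside, xsi, eta).

    xq, yq: the 4 corner coordinates, ordered counter-clockwise.
    Returning (xsi, eta) lets the caller skip a second coordinate solve.
    """
    xq, yq = np.asarray(xq, float), np.asarray(yq, float)
    xsi = eta = 0.5                       # initial guess at the cell centre
    for _ in range(20):                   # Newton iteration on the residual
        # Bilinear shape functions and their derivatives
        N    = np.array([(1-xsi)*(1-eta), xsi*(1-eta), xsi*eta, (1-xsi)*eta])
        dNdx = np.array([-(1-eta), (1-eta),  eta, -eta])
        dNde = np.array([-(1-xsi), -xsi,     xsi, (1-xsi)])
        rx, ry = N @ xq - px, N @ yq - py
        J = np.array([[dNdx @ xq, dNde @ xq],
                      [dNdx @ yq, dNde @ yq]])
        dxsi, deta = np.linalg.solve(J, [rx, ry])
        xsi, eta = xsi - dxsi, eta - deta
        if abs(dxsi) + abs(deta) < tol:
            break
    inside = (0.0 <= xsi <= 1.0) and (0.0 <= eta <= 1.0)
    return inside, xsi, eta
```

Even when the point is outside the cell, the converged (xsi, eta) tells you which side it fell out of, which is useful for filtering hash-table candidates in query.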
This function is now only needed within the uxgrid module
@erikvansebille - Did a little digging here. The peak memory consumption occurs during the hash table construction, and very little is needed for querying (see e.g. the run for 1000 particles). In this example, the construction is called separately. However, on my system I'm not able to reproduce the memory behavior with increasing particle count that you're seeing here.
We're not anticipating having beyond 2^31 faces in problems for the foreseeable future.
The last commit uses int32 for all integers in hash table construction, which reduces peak memory usage during initial construction (previously 10454 MB). The main limitation is that a mesh can't have more than 2^31 (~2 billion) faces. Since most global meshes at mesoscale-resolving resolution have O(1-10 million) faces, I think we'll be fine.
This eliminates the extra step needed to mask out grid search errors
This removes the need for passing the PIC method during the query call. At the moment, we're assuming that if an XGrid type comes in as the source_grid, it is indeed curvilinear.
…earch_without_while_loop
@erikvansebille - I'm curious to see how this latest version works with your benchmarks.
Just checked; same speed and memory footprint as yesterday. Is that to be expected?
VeckoTheGecko left a comment
No further comments from my end - though my review here has been at a more surface level. If there's anything specific you'd like me to look at, I can shift some bandwidth to this PR.
Co-authored-by: Nick Hodgskin <36369090+VeckoTheGecko@users.noreply.github.com>
I think 0be54db is the last major change for memory consumption. Between that and the most recent version, we've cut out the additional computational coordinate calculation. Getting the same speed tells me that the additional coordinate calculation was not a dominant contributor to runtime; I can't say if this was expected or not without a hotspot profile. A final remark on memory consumption, though: in testing I did find that the peak memory consumption occurs during hash table initialization (hardly any is used in query). So if we were to look at reducing memory consumption in the future, effort ought to be focused on the initialization method for the SpatialHash class.
Hi @fluidnumerics-joe, is there something specific that you would like me to review here? If so, I can get onto that next week. I'm currently focusing on FieldSet ingestion, which I think needs to remain my priority, so I don't know if I can commit enough bandwidth here to do a detailed review. Have a great weekend :))
@VeckoTheGecko - I've made all the requested changes. I'm really looking for approval on this at this point. I'm not too keen on starting the integration with UXGrid unless this is settled and merged into v4-dev.
VeckoTheGecko left a comment
yes, I should have clicked "approve" and not "comment" in the prior review :)
This PR addresses #2175 by exploring whether it is possible to remove the while-loop in the `_search_indices_curvilinear_2d` function. Currently, there are three breaking unit tests.
Note that in the first breaking unit test, a -3 means `GRID_SEARCH_ERROR`, so this is a clear case where the first computation of `xsi` and `eta` failed to produce results within the [0, 1] range.
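The error masking described above could be sketched as follows. The function name and tolerance are hypothetical; only the -3 sentinel and the [0, 1] range come from the discussion.

```python
import numpy as np

GRID_SEARCH_ERROR = -3  # sentinel used for failed grid searches (see above)

def check_search(xsi, eta, tol=1e-12):
    """Flag searches whose computational coordinates fall outside [0, 1].

    Returns an int array: 0 where the search succeeded, GRID_SEARCH_ERROR
    where xsi/eta left the unit cell (illustrative sketch only).
    """
    xsi, eta = np.asarray(xsi), np.asarray(eta)
    ok = (xsi >= -tol) & (xsi <= 1 + tol) & (eta >= -tol) & (eta <= 1 + tol)
    return np.where(ok, 0, GRID_SEARCH_ERROR)
```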