Opencl vng demosaicer #20068
|
Inspired by #20055, which reports an OpenCL performance regression for the VNG demosaicer. @agat114 if you can self-compile you might want to check. For performance checks use Compared with RCD it shouldn't be that bad. Release note: Fixed some performance regression for OpenCL VNG demosaicers |
|
I'm going to compile today. |
|
Without this PR: With this PR: So a bit faster but nothing noticeable by user I would say. This is using an NVIDIA CUDA Quadro T1000. |
Thanks also for the profiling data. I checked here comparing to 5.2 branch too and couldn't find a performance regression. |
|
I'll run some measurements soon. Should I use huge files that cause tiling, or smaller ones, without tiling? |
small ones will do. Also compare vs RCD if possible please. |
|
@jenshannoschwalm Ignore the above. It seems I have pulled wrong repo and branch. |
|
@jenshannoschwalm I have compiled DT from your branch. The stutter is still there with VNG4 when turning it on and modifying module parameters. No stutter whatsoever with AMAZE. UPD: the version of the compiled app is still weird, even though I pulled it from your repo and this PR branch |
|
I see no difference (well, no speed-up). RCD PR 5.5.0+58~g91629d8df3: VNG4 PR 5.5.0+58~g91629d8df3: Image: https://discuss.pixls.us/uploads/short-url/hqnf0h20vaTzsr5Ay0NmWsFsWu2.NEF with minimal processing, capture sharpen OFF. Nvidia 1060/6GB, resources = large. |
|
Thanks for all that testing, we seem to have a very small benefit with this PR on small GPUs. The main culprit seems to be the VNG algo itself. |
No problem. It was interesting to help. |
In my case, it actually became a bit slower, 286 vs 265 ms. Of course, it was a single test. Such small changes, either speed-ups or slow-downs, are not really perceptible. I could try with a huge file, but then the tiling would dominate over the algo, I think. |
Fallback to non-const buffer should only be reported in DT_DEBUG_VERBOSE mode.
1. subtle vng border_interpolate kernel improvements
   - #define AVGWINDOW
   - use samplerA where coordinates have been checked
2. in OpenCL VNG code don't calculate and copy to device buffers if not required as we do only the linear interpolation part.
3. use vectorized copy_zero
4. capture log fix
When dual demosaicing we use the only_linear VNG/VNG4 demosaicer mode. For bayer sensors we must do the green-equilibrate too to avoid color casts.
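To illustrate why unbalanced greens matter, here is a minimal sketch of a global green equilibration on a Bayer mosaic. It is not darktable's implementation: the function name `green_equilibrate_global`, the RGGB layout, and the simple mean-matching approach are all assumptions made for this example.

```python
def green_equilibrate_global(mosaic):
    """Return a copy of an RGGB mosaic with both green channels brought to the same mean.

    Hypothetical helper for illustration only; not taken from darktable.
    """
    height, width = len(mosaic), len(mosaic[0])
    # Assumed RGGB layout: G1 sits on even rows / odd columns,
    # G2 sits on odd rows / even columns.
    g1 = [mosaic[y][x] for y in range(0, height, 2) for x in range(1, width, 2)]
    g2 = [mosaic[y][x] for y in range(1, height, 2) for x in range(0, width, 2)]
    out = [list(row) for row in mosaic]
    if not g1 or not g2 or sum(g2) == 0:
        return out  # nothing to balance
    # Scale G2 so that its mean matches the mean of G1.
    ratio = (sum(g1) / len(g1)) / (sum(g2) / len(g2))
    for y in range(1, height, 2):
        for x in range(0, width, 2):
            out[y][x] = mosaic[y][x] * ratio
    return out


# Toy 4x4 mosaic where the second green channel reads 10% too high.
mosaic = [
    [50, 100, 50, 100],
    [110, 30, 110, 30],
    [50, 100, 50, 100],
    [110, 30, 110, 30],
]
balanced = green_equilibrate_global(mosaic)
```

If the two green channels are left unbalanced, a linear interpolation averages mismatched green values, which is one way a low-frequency color cast can appear.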
91629d8 to
a12eea9
|
This introduces at least 2 regressions: The diff for 0163:
The diff for 0164:
In both cases the Max dE is > 10 and the number of changed pixel is quite big. @jenshannoschwalm : Can you look at this? TIA. |
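For readers unfamiliar with the two figures quoted above, the sketch below shows one way a maximum dE and a changed-pixel count could be computed from two images already converted to Lab. The thread does not say which dE formula the integration tests use, so the CIE76 (Euclidean) distance, the function name `compare_lab`, and the threshold of 2.0 are assumptions made for this example.

```python
import math


def compare_lab(expected, actual, threshold=2.0):
    """Return (max dE, number of pixels whose dE exceeds threshold).

    Both inputs are flat lists of (L, a, b) tuples of equal length.
    dE is the CIE76 distance here, which is an assumption of this sketch.
    """
    if len(expected) != len(actual):
        raise ValueError("images must have the same number of pixels")
    max_de = 0.0
    changed = 0
    for lab_expected, lab_actual in zip(expected, actual):
        # CIE76 dE is the Euclidean distance between the two Lab triples.
        de = math.dist(lab_expected, lab_actual)
        max_de = max(max_de, de)
        if de > threshold:
            changed += 1
    return max_de, changed


expected = [(50.0, 0.0, 0.0), (50.0, 0.0, 0.0)]
actual = [(50.0, 0.0, 0.0), (50.0, 3.0, 4.0)]
max_de, changed = compare_lab(expected, actual)
```

A large maximum on only a handful of pixels usually points to a local artifact, while a moderate dE across many pixels suggests a global shift such as a changed color balance.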
|
Yes, i just did again and confess openly, i don't have the integration test suite working on my new system yet :-) I am sure the new algo including green equil is correct and leads to better results - iirc i mentioned the color cast while working on the 5.4 demosaicer. It's not very visible; especially on light-brown sands there was some color discrepancy between low vs high frequency content, which is now gone. (xtrans was not affected as that is VNG) How to proceed? If you want me to do that, i could add another demosaic version bump introducing a flag |
|
@jenshannoschwalm : I had a look at the expected vs new output and I cannot see a difference visually. So let this be the new expected output. Thanks. |
|
Maybe a release note: Fixed subtle color casts in bayer dual demosaicers |
|
Seems like this has also affected 0172-capture-dual-rcd. Is that expected? |
|
Will check again |
|
Sorry i misunderstood your question. Yes of course! The VNG4 linear-only step for dual demosaicing missed the greens-equil for bayer sensors. BTW there is one more pr about vng coming later |


Fix misleading OpenCL oversize buffer log
Fix some VNG OpenCL performance regressions