55 commits
b99ae6e
Fix small bug in GenericDataAccessor definition
Nov 11, 2025
b9537ea
First draft of Warpmap Generation workgroup implementation
Nov 11, 2025
a737173
Add warp concept
Nov 18, 2025
64349db
Add spherical warp
Nov 18, 2025
e44fcf4
Remove envmap accessors.hlsl
Nov 18, 2025
9b29dfd
Hierarchical image sampling implementation
Nov 18, 2025
9be65a0
changed quaternion struct name, brought in quaternion stuff from prze…
keptsecret Dec 1, 2025
3e03467
_static_create quaternion to matrix, create quaternion from axis-angle
keptsecret Dec 2, 2025
34d1385
added more quaternion funcs: transformVec, inverse, normalize + stati…
keptsecret Dec 2, 2025
06ae3e4
Merge branch 'master' into env_map_importance_sampling
Dec 19, 2025
16ecb52
Merge branch 'master' into env_map_importance_sampling
Dec 20, 2025
8d682b9
Remove envmap.hlsl
Dec 20, 2025
890f7c6
Move to sampling namespace and implement backward density
Dec 22, 2025
f99c63b
Remove private, public from hierarchical_image
Dec 22, 2025
3ff2791
Refactor hierarchical image to keep accessor and common data as member
Dec 22, 2025
76ef536
Refactor hierarchical image to separate binarySearch from Hierarchica…
Dec 26, 2025
ef773fd
Fix Spherical warp indentation
Dec 26, 2025
b9467fe
Add some comment why we add xi to the sample uvs
Dec 26, 2025
3682604
Merge branch 'master' into env_map_importance_sampling
Dec 30, 2025
ac1e2f3
WIP
Jan 6, 2026
baca1cf
Rename uv to coord for LuminanceAccessor concepts
Jan 9, 2026
f12b797
Fix hierarchical_image.hlsl
Jan 9, 2026
0957aed
Fix typo in spherical.hlsl
Jan 9, 2026
1b35d34
Implement gen_luma, gen_warpmap and measure_luma shaders
Jan 9, 2026
665bb8d
EnvmapImportanceSampling CMakeLists
Jan 9, 2026
b522b4f
Initial implementation of CEnvmapImportanceSampling
Jan 9, 2026
3e51c69
Initial implementation of CEnvmapImportanceSampling
Jan 12, 2026
c72d305
Small fixes
Jan 12, 2026
5ee2ce7
Initial implementation of computeWarpMap
Jan 20, 2026
867868c
Fix arithmetic config no const specifier for method
Jan 30, 2026
1a66157
Define config_t from outside
Jan 30, 2026
d4b8105
More fixes on computeWarpMap implementation
Jan 30, 2026
8853738
Fix chose second to be placed inside the loop
Jan 30, 2026
6bde489
LuminanceReadAccessor take ScalarT as template parameter
Jan 30, 2026
756fbb0
gen_warpmap to gen_warp
Jan 30, 2026
8d64a19
Move hierarchical_image concepts to sampling subdirectory
Feb 18, 2026
70d8423
Add some comment regarding corner sampling
Feb 18, 2026
2842d29
Parameterize spherical warp and make sure all literal is in the corre…
Feb 18, 2026
a51848c
Refactor CEnvmapImportanceSampling to block and calculate avgLuma
Feb 18, 2026
58c9c13
merge master, fix conflicts
keptsecret Feb 19, 2026
fa94ac2
Fix warp concept and add density type to warp concept
Feb 19, 2026
8494124
Rename luminanceScale to lumaRGBCoefficients
Feb 19, 2026
b273d87
Remove measure_luma.comp.hlsl
Feb 19, 2026
3bc0e57
Fix some bug in hierarchical_image.hlsl
Feb 19, 2026
1dadf92
Rename luminanceScales to lumaRGBCoefficients
Feb 19, 2026
f19cbe9
Move EnvmapImportanceSampling from ext to core
Feb 21, 2026
fde2bba
Fix binarySearch implementation. when last is 2x1 we should check for…
Feb 21, 2026
81cae21
Rename private member with underscore prefix
Feb 21, 2026
733a4ab
Merge branch 'master' into env_map_importance_sampling
Feb 22, 2026
ba6be93
Update submodule to follow master branch
Feb 22, 2026
f04d98b
Add missed EnvmapSampler.h and cpp
Feb 23, 2026
df2bfc3
Rename get and gather to texelFetch and texelGather
Feb 23, 2026
4930e25
Merge branch 'hlsl_path_tracer_example' into env_map_importance_sampling
Feb 23, 2026
05b862a
Include missing files into commit
Feb 23, 2026
d50b50f
Merge branch 'master' into env_map_importance_sampling
Feb 23, 2026
2 changes: 1 addition & 1 deletion 3rdparty/openexr
Submodule openexr updated 1 files
+2 −0 cmake/CMakeLists.txt
1 change: 1 addition & 0 deletions CMakeLists.txt
Original file line number Diff line number Diff line change
Expand Up @@ -181,6 +181,7 @@ option(NBL_BUILD_EXAMPLES "Enable building examples" ON)
option(NBL_BUILD_MITSUBA_LOADER "Enable nbl::ext::MitsubaLoader?" ON)
option(NBL_BUILD_IMGUI "Enable nbl::ext::ImGui?" ON)
option(NBL_BUILD_DEBUG_DRAW "Enable Nabla Debug Draw extension?" ON)
option(NBL_BUILD_ENVMAP_IMPORTANCE_SAMPLING "Enable Nabla Envmap Importance Sampling extension?" ON)

no longer an extension, don't need the option


option(NBL_BUILD_OPTIX "Enable nbl::ext::OptiX?" OFF)
if(NBL_COMPILE_WITH_CUDA)
2 changes: 1 addition & 1 deletion examples_tests
Submodule examples_tests updated 37 files
+40 −0 31_HLSLPathTracer/CMakeLists.txt
+837 −0 31_HLSLPathTracer/app_resources/glsl/common.glsl
+182 −0 31_HLSLPathTracer/app_resources/glsl/litByRectangle.comp
+60 −0 31_HLSLPathTracer/app_resources/glsl/litBySphere.comp
+105 −0 31_HLSLPathTracer/app_resources/glsl/litByTriangle.comp
+36 −0 31_HLSLPathTracer/app_resources/hlsl/accumulator.hlsl
+50 −0 31_HLSLPathTracer/app_resources/hlsl/common.hlsl
+204 −0 31_HLSLPathTracer/app_resources/hlsl/concepts.hlsl
+409 −0 31_HLSLPathTracer/app_resources/hlsl/example_common.hlsl
+74 −0 31_HLSLPathTracer/app_resources/hlsl/intersector.hlsl
+242 −0 31_HLSLPathTracer/app_resources/hlsl/material_system.hlsl
+458 −0 31_HLSLPathTracer/app_resources/hlsl/next_event_estimator.hlsl
+268 −0 31_HLSLPathTracer/app_resources/hlsl/pathtracer.hlsl
+19 −0 31_HLSLPathTracer/app_resources/hlsl/present.frag.hlsl
+51 −0 31_HLSLPathTracer/app_resources/hlsl/rand_gen.hlsl
+75 −0 31_HLSLPathTracer/app_resources/hlsl/ray_gen.hlsl
+335 −0 31_HLSLPathTracer/app_resources/hlsl/render.comp.hlsl
+31 −0 31_HLSLPathTracer/app_resources/hlsl/render_common.hlsl
+17 −0 31_HLSLPathTracer/app_resources/hlsl/render_rwmc_common.hlsl
+66 −0 31_HLSLPathTracer/app_resources/hlsl/resolve.comp.hlsl
+15 −0 31_HLSLPathTracer/app_resources/hlsl/resolve_common.hlsl
+7 −0 31_HLSLPathTracer/app_resources/hlsl/rwmc_global_settings_common.hlsl
+252 −0 31_HLSLPathTracer/app_resources/hlsl/scene.hlsl
+28 −0 31_HLSLPathTracer/config.json.template
+17 −0 31_HLSLPathTracer/include/nbl/this_example/common.hpp
+167 −0 31_HLSLPathTracer/include/nbl/this_example/transform.hpp
+1,748 −0 31_HLSLPathTracer/main.cpp
+50 −0 31_HLSLPathTracer/pipeline.groovy
+17 −1 40_PathTracer/src/renderer/CRenderer.cpp
+72 −0 74_EnvmapImportanceSampling/CMakeLists.txt
+33 −0 74_EnvmapImportanceSampling/app_resources/common.hlsl
+24 −0 74_EnvmapImportanceSampling/app_resources/present.frag.hlsl
+103 −0 74_EnvmapImportanceSampling/app_resources/test.comp.hlsl
+28 −0 74_EnvmapImportanceSampling/config.json.template
+7 −0 74_EnvmapImportanceSampling/imagesTestList.txt
+440 −0 74_EnvmapImportanceSampling/main.cpp
+4 −2 CMakeLists.txt
Original file line number Diff line number Diff line change
Expand Up @@ -69,7 +69,7 @@ NBL_CONCEPT_END(
#include <nbl/builtin/hlsl/concepts/__end.hlsl>

template<typename T, typename V, typename I=uint32_t>
-NBL_BOOL_CONCEPT GenericDataAccessor = GenericWriteAccessor<T,V,I> && GenericWriteAccessor<T,V,I>;
+NBL_BOOL_CONCEPT GenericDataAccessor = GenericReadAccessor<T,V,I> && GenericWriteAccessor<T,V,I>;

}
}
193 changes: 193 additions & 0 deletions include/nbl/builtin/hlsl/sampling/hierarchical_image.hlsl
Original file line number Diff line number Diff line change
@@ -0,0 +1,193 @@
// Copyright (C) 2018-2025 - DevSH Graphics Programming Sp. z O.O.
// This file is part of the "Nabla Engine".
// For conditions of distribution and use, see copyright notice in nabla.h

#ifndef _NBL_BUILTIN_HLSL_SAMPLING_HIERARCHICAL_IMAGE_INCLUDED_
#define _NBL_BUILTIN_HLSL_SAMPLING_HIERARCHICAL_IMAGE_INCLUDED_

#include <nbl/builtin/hlsl/sampling/basic.hlsl>
#include <nbl/builtin/hlsl/sampling/warp.hlsl>
#include <nbl/builtin/hlsl/sampling/hierarchical_image/accessors.hlsl>
#include <nbl/builtin/hlsl/cpp_compat/intrinsics.hlsl>

namespace nbl
{
namespace hlsl
{
namespace sampling
{

template <typename ScalarT, typename LuminanceAccessorT
NBL_PRIMARY_REQUIRES(
is_scalar_v<ScalarT> &&
hierarchical_image::LuminanceReadAccessor<LuminanceAccessorT, ScalarT>
)
Comment on lines +20 to +24

add a bool for whether the accessor is corner sampled or not

struct LuminanceMapSampler

this one is actually the HierarchicalLuminanceSampler

{
using scalar_type = ScalarT;
using vector2_type = vector<scalar_type, 2>;
using vector4_type = vector<scalar_type, 4>;

LuminanceAccessorT _map;
uint32_t2 _mapSize;
uint32_t2 _lastWarpPixel;
bool _aspect2x1;
@devshgraphicsprogramming devshgraphicsprogramming Feb 25, 2026

store these instead

float32_t2 rcpMapSize;
uint16_t mip2x1 : 15;
uint16_t aspect2x1 : 1;
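
A CPU-side C++ sketch of the layout suggested above, assuming a power-of-two map height and that `mip2x1` is the MSB index of the height (as in the `findMSB(_mapSize.y)` call in the diff). `PackedSamplerState`, `makeState` and this `findMSB` are hypothetical stand-ins for illustration, not Nabla API.

```cpp
#include <cstdint>

// Hypothetical stand-in for the shader intrinsic: index of the highest set bit.
// Unlike the intrinsic, this returns 0 for an input of 0.
inline uint16_t findMSB(uint32_t v)
{
    uint16_t msb = 0;
    while (v >>= 1)
        ++msb;
    return msb;
}

// Precomputed state following the members suggested in the review comment.
struct PackedSamplerState
{
    float rcpMapSizeX;
    float rcpMapSizeY;
    uint16_t mip2x1 : 15;   // mip level at which the map has collapsed to 2x1 (or 1x1)
    uint16_t aspect2x1 : 1; // 1 when the map is twice as wide as it is tall
};

inline PackedSamplerState makeState(uint32_t width, uint32_t height)
{
    PackedSamplerState s;
    s.rcpMapSizeX = 1.0f / float(width);
    s.rcpMapSizeY = 1.0f / float(height);
    s.mip2x1 = findMSB(height);                   // computed once, not per sample
    s.aspect2x1 = (width == 2u * height) ? 1 : 0;
    return s;
}
```

Computing these once at creation keeps the per-sample path free of the `findMSB` call and replaces the division by the map size with a multiply.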


static LuminanceMapSampler<ScalarT, LuminanceAccessorT> create(NBL_CONST_REF_ARG(LuminanceAccessorT) lumaMap, uint32_t2 mapSize, bool aspect2x1, uint32_t2 warpSize)
{
LuminanceMapSampler<ScalarT, LuminanceAccessorT> result;
result._map = lumaMap;
result._mapSize = mapSize;
result._lastWarpPixel = warpSize - uint32_t2(1, 1);
result._aspect2x1 = aspect2x1;
return result;
}

static bool choseSecond(scalar_type first, scalar_type second, NBL_REF_ARG(scalar_type) xi)

prefix with __ for private methods

{
// numerical resilience against IEEE754
scalar_type dummy = scalar_type(0);
PartitionRandVariable<scalar_type> partition;
partition.leftProb = scalar_type(1) / (scalar_type(1) + (second / first));
return partition(xi, dummy);
}

vector2_type binarySearch(const uint32_t2 coord)
{
// We use _lastWarpPixel here for corner sampling
float32_t2 xi = float32_t2(coord)/ _lastWarpPixel;
uint32_t2 p = uint32_t2(0, 0);
const uint32_t2 mip2x1 = findMSB(_mapSize.y);
Comment on lines +55 to +60

to be able to use this as a sampler, I also need a `codomain_type generate(const domain_type xi)`, so it's best to construct this binarySearch in terms of that

you also wouldn't store _lastWarpPixel here then, because it would just be a warpmap agnostic sampler

why are you using a uint32_t2 to store this? It's a uint16_t at most, and it's a scalar quantity

Also should be precomputed
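
A minimal CPU-side C++ sketch of what a warpmap-agnostic `generate(xi)` could look like, assuming a square power-of-two luminance pyramid with strictly positive texels. `Level`, `UV` and this `choseSecond` are hypothetical; in particular the rescaling of `xi` is an assumption about how `PartitionRandVariable` behaves, not a copy of it.

```cpp
#include <cstdint>
#include <vector>

// One mip level of a square luminance pyramid, row-major.
struct Level
{
    uint32_t side;
    std::vector<float> texels;
    float at(uint32_t x, uint32_t y) const { return texels[y * side + x]; }
};

// Chooses between two weighted options and rescales xi back to [0,1)
// so it can be reused at the next level.
inline bool choseSecond(float first, float second, float& xi)
{
    const float leftProb = first / (first + second);
    if (xi < leftProb)
    {
        xi /= leftProb;
        return false;
    }
    xi = (xi - leftProb) / (1.0f - leftProb);
    return true;
}

struct UV { float u, v; };

// levels[0] is full resolution, levels.back() is 1x1.
// xi is the only input: nothing here depends on a warpmap size.
inline UV generate(const std::vector<Level>& levels, float xiX, float xiY)
{
    uint32_t px = 0, py = 0;
    // Start at the 2x2 level and descend to full resolution.
    for (int i = int(levels.size()) - 2; i >= 0; --i)
    {
        px <<= 1;
        py <<= 1;
        const Level& l = levels[i];
        const float w00 = l.at(px, py),     w10 = l.at(px + 1, py);
        const float w01 = l.at(px, py + 1), w11 = l.at(px + 1, py + 1);
        // Choose the row first, then the column within the chosen row.
        if (choseSecond(w00 + w10, w01 + w11, xiY))
        {
            py |= 1;
            if (choseSecond(w01, w11, xiX))
                px |= 1;
        }
        else if (choseSecond(w00, w10, xiX))
            px |= 1;
    }
    // The leftover xi places the sample uniformly inside the chosen texel.
    const float side = float(levels[0].side);
    return { (float(px) + xiX) / side, (float(py) + xiY) / side };
}
```

With this shape, a warpmap generation shader would compute `xi` from its own texel coordinate and call `generate`, so `_lastWarpPixel` no longer has to live inside the sampler.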


if (_aspect2x1) {
// do one split in the X axis first cause penultimate full mip would have been 2x1
p.x = choseSecond(_map.texelFetch(uint32_t2(0, 0), mip2x1), _map.texelFetch(uint32_t2(1, 0), mip2x1), xi.x) ? 1 : 0;
}

for (int i = mip2x1 - 1; i >= 0; i--)
{
p <<= 1;
const vector4_type values = _map.texelGather(p, i);
scalar_type wx_0, wx_1;
{
const scalar_type wy_0 = values[3] + values[2];
const scalar_type wy_1 = values[1] + values[0];
if (choseSecond(wy_0, wy_1, xi.y))
{
p.y |= 1;
wx_0 = values[0];
wx_1 = values[1];
}
else
{
wx_0 = values[3];
wx_1 = values[2];
}
}
if (choseSecond(wx_0, wx_1, xi.x))
p.x |= 1;
Comment on lines +87 to +88

btw you can get the PDF of the finally chosen texel as metadata: if on the final call to choseSecond you save wx_0 or wx_1, that's your chosen pixel luminance. You just need to know the average luminance and the total number of texels (corner-sampled edges count as half pixels) to get the PDF

(obviously if mip2x1 is 0, the PDF is either 1.0 or the dummy from the choice on line 64)

}


// If we don`t add xi, the sample will clump to the lowest corner of environment map texel. We add xi to simulate uniform distribution within a pixel and make the sample continuous. This is why we compute the pdf not from the normalized luminance of the texel, instead from the reciprocal of the Jacobian.
const vector2_type directionUV = (vector2_type(p.x, p.y) + xi) / vector2_type(_mapSize);

if your spheremap or octahedral maps are corner sampled, the edge pixels need special treatment

Essentially whenever your X or Y coordinate is an edge coordinate, you'd need to weight their luma contribution down by 50% and also change how the remaining xi gets added:

  1. coord==0 then xi gets rescaled to [0.5,1.0]
  2. coord==Last then xi gets rescaled to [0,0.5]

then you'd actually rescale that final UV from [0.5/size,1-0.5/size] to [0,1] before outputting it in the warpmap

return directionUV;
}

matrix<scalar_type, 4, 2> sampleUvs(uint32_t2 sampleCoord) NBL_CONST_MEMBER_FUNC
{
const vector2_type dir0 = binarySearch(sampleCoord + vector2_type(0, 1));
const vector2_type dir1 = binarySearch(sampleCoord + vector2_type(1, 1));
const vector2_type dir2 = binarySearch(sampleCoord + vector2_type(1, 0));
const vector2_type dir3 = binarySearch(sampleCoord);
return matrix<scalar_type, 4, 2>(
dir0,
dir1,
dir2,
dir3
);
}
Comment on lines +97 to +109
@devshgraphicsprogramming devshgraphicsprogramming Feb 25, 2026

not sure that sampleUvs should be exposed like that (if something needs it, like a test, let them call binarySearch 4 times). Why?

  1. if not using a warpmap you wouldn't actually importance sample 4 times to perform finite differences and get your Jacobian, you'd use the luma of the image (with a NEAREST, not linear, sampler) as the PDF because that's 100% accurate
  2. if using a warpmap you'd textureGather it and work out the Jacobian from differentiating the bilinear interpolation equation

};

template <typename ScalarT, typename LuminanceAccessorT, typename HierarchicalSamplerT, typename PostWarpT
NBL_PRIMARY_REQUIRES(is_scalar_v<ScalarT> &&
concepts::accessors::GenericReadAccessor<LuminanceAccessorT, ScalarT, float32_t2> &&
hierarchical_image::HierarchicalSampler<HierarchicalSamplerT, ScalarT> &&
concepts::Warp<PostWarpT>)
struct HierarchicalImage
Comment on lines +112 to +117

this should be called WarpmapSampler or something like that

{
using scalar_type = ScalarT;
using vector2_type = vector<ScalarT, 2>;
using vector3_type = vector<ScalarT, 3>;
using vector4_type = vector<ScalarT, 4>;
LuminanceAccessorT _lumaMap;
HierarchicalSamplerT _warpMap;
uint32_t2 _warpSize;
uint32_t2 _lastWarpPixel;

you actually only need to store the _lastWarpPixel.x * _lastWarpPixel.y product; I expect the warpmap to handle gathering

scalar_type _rcpAvgLuma;

static HierarchicalImage create(NBL_CONST_REF_ARG(LuminanceAccessorT) lumaMap, NBL_CONST_REF_ARG(HierarchicalSamplerT) warpMap, uint32_t2 warpSize, scalar_type avgLuma)
{
HierarchicalImage<ScalarT, LuminanceAccessorT, HierarchicalSamplerT, PostWarpT> result;
result._lumaMap = lumaMap;
result._warpMap = warpMap;
result._warpSize = warpSize;
result._lastWarpPixel = warpSize - uint32_t2(1, 1);
result._rcpAvgLuma = ScalarT(1.0) / avgLuma;
return result;
}

vector2_type inverseWarp_and_deferredPdf(NBL_REF_ARG(scalar_type) pdf, vector3_type direction) NBL_CONST_MEMBER_FUNC
{
vector2_type envmapUv = PostWarpT::inverseWarp(direction);
scalar_type luma;
_lumaMap.get(envmapUv, luma);
pdf = (luma * _rcpAvgLuma) * PostWarpT::backwardDensity(direction);
return envmapUv;
}

scalar_type deferredPdf(vector3_type direction) NBL_CONST_MEMBER_FUNC
{
vector2_type envmapUv = PostWarpT::inverseWarp(direction);
scalar_type luma;
_lumaMap.get(envmapUv, luma);
return luma * _rcpAvgLuma * PostWarpT::backwardDensity(direction);
}

vector3_type generate_and_pdf(NBL_REF_ARG(scalar_type) pdf, NBL_REF_ARG(vector2_type) uv, vector2_type xi) NBL_CONST_MEMBER_FUNC
Comment on lines +140 to +157

make it conform to #1001 's ResamplableSampler

{
const vector2_type texelCoord = xi * float32_t2(_lastWarpPixel);

matrix<scalar_type, 4, 2> uvs = _warpMap.sampleUvs(uint32_t2(texelCoord));

the warpmap should convert you from [0,1] normalized xi to its own texels; sampleUvs should really work the same as textureGather and take normalized uvs as input

const vector2_type interpolant = frac(texelCoord);
@devshgraphicsprogramming devshgraphicsprogramming Feb 25, 2026

make the _warpMap.gather spit out the interpolant so its cleaner (requires a specialized gather overload, adding on top of Gatherable concept)


const vector2_type xDiffs[] = {
uvs[2] - uvs[3],
uvs[1] - uvs[0]
};
const vector2_type yVals[] = {
xDiffs[0] * interpolant.x + uvs[3],
xDiffs[1] * interpolant.x + uvs[0]
};
const vector2_type yDiff = yVals[1] - yVals[0];
uv = yDiff * interpolant.y + yVals[0];

const WarpResult<vector3_type> warpResult = PostWarpT::warp(uv);

const scalar_type detInterpolJacobian = determinant(matrix<scalar_type, 2, 2>(
lerp(xDiffs[0], xDiffs[1], interpolant.y), // first column dFdx
yDiff // second column dFdy
)) * _lastWarpPixel.x * _lastWarpPixel.y;

pdf = abs(warpResult.density / detInterpolJacobian);

return warpResult.dst;
}
};
Comment on lines +112 to +187
@devshgraphicsprogramming devshgraphicsprogramming Feb 25, 2026

separate file, also struct needs better name like WarpMap sampler


}
}
}

#endif
Original file line number Diff line number Diff line change
@@ -0,0 +1,62 @@
#ifndef _NBL_BUILTIN_HLSL_HIERARCHICAL_IMAGE_ACCESSORS_INCLUDED_
#define _NBL_BUILTIN_HLSL_CONCEPTS_ACCESSORS_HIERARCHICAL_IMAGE_INCLUDED_

#include "nbl/builtin/hlsl/concepts/accessors/generic_shared_data.hlsl"

namespace nbl
{
namespace hlsl
{
namespace sampling
{
namespace hierarchical_image
{
// declare concept
#define NBL_CONCEPT_NAME LuminanceReadAccessor
#define NBL_CONCEPT_TPLT_PRM_KINDS (typename)(typename)
#define NBL_CONCEPT_TPLT_PRM_NAMES (U)(ScalarT)
// not the greatest syntax but works
#define NBL_CONCEPT_PARAM_0 (a,U)
#define NBL_CONCEPT_PARAM_1 (coord,uint32_t2)
#define NBL_CONCEPT_PARAM_2 (level,uint32_t)
// start concept
NBL_CONCEPT_BEGIN(3)
// need to be defined AFTER the concept begins
#define a NBL_CONCEPT_PARAM_T NBL_CONCEPT_PARAM_0
#define coord NBL_CONCEPT_PARAM_T NBL_CONCEPT_PARAM_1
#define level NBL_CONCEPT_PARAM_T NBL_CONCEPT_PARAM_2
NBL_CONCEPT_END(
((NBL_CONCEPT_REQ_EXPR_RET_TYPE)((a.template texelFetch(coord,level)) , ::nbl::hlsl::is_same_v, ScalarT))
((NBL_CONCEPT_REQ_EXPR_RET_TYPE)((a.template texelGather(coord,level)) , ::nbl::hlsl::is_same_v, vector<ScalarT, 4>))
);
#undef level
#undef coord
#undef a
#include <nbl/builtin/hlsl/concepts/__end.hlsl>
Comment on lines +14 to +35
@devshgraphicsprogramming devshgraphicsprogramming Feb 25, 2026

if you merge latest path tracer branch, you can forego this concept in lieu of a LoadableImage with Components=1 and add a GatherableImage with all the other accessors , see #969 (comment)


// sampleUvs return 4 UVs in a square to calculate the jacobian matrix

replace "to calculate the jacobian matrix" with "for manual bilinear interpolation with differentiability"

// declare concept
#define NBL_CONCEPT_NAME HierarchicalSampler
#define NBL_CONCEPT_TPLT_PRM_KINDS (typename)(typename)
#define NBL_CONCEPT_TPLT_PRM_NAMES (HierarchicalSamplerT)(ScalarT)
// not the greatest syntax but works
#define NBL_CONCEPT_PARAM_0 (sampler,HierarchicalSamplerT)

sampler is a reserved keyword in HLSL

#define NBL_CONCEPT_PARAM_1 (coord,vector<uint32_t, 2>)
// start concept
NBL_CONCEPT_BEGIN(2)
// need to be defined AFTER the concept begins
#define sampler NBL_CONCEPT_PARAM_T NBL_CONCEPT_PARAM_0
#define coord NBL_CONCEPT_PARAM_T NBL_CONCEPT_PARAM_1
NBL_CONCEPT_END(
((NBL_CONCEPT_REQ_EXPR_RET_TYPE)((sampler.template sampleUvs(coord)) , ::nbl::hlsl::is_same_v, matrix<ScalarT, 4, 2>))

signature should be template sampleUvs<matrix<>,vector<>>(outMatrix,coord)

);
#undef sampler
#undef coord
#include <nbl/builtin/hlsl/concepts/__end.hlsl>

}
}
}
}

#endif
26 changes: 26 additions & 0 deletions include/nbl/builtin/hlsl/sampling/hierarchical_image/common.hlsl
Original file line number Diff line number Diff line change
@@ -0,0 +1,26 @@
#ifndef _NBL_HLSL_SAMPLING_HIERARCHICAL_IMAGE_COMMON_INCLUDED_
#define _NBL_HLSL_SAMPLING_HIERARCHICAL_IMAGE_COMMON_INCLUDED_

#include "nbl/builtin/hlsl/cpp_compat.hlsl"

namespace nbl
{
namespace hlsl
{
namespace sampling
{
namespace hierarchical_image
{

struct SLumaGenPushConstants
{
float32_t3 lumaRGBCoefficients;
uint32_t2 lumaMapResolution;

you can use uint32_t bitfield of 16 bit each per axis (don't use uint16_t2 though because AMD doesn't support 16 bit push constants)
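
A C++ sketch of the packing suggested above, assuming each axis fits in 16 bits (at most 65535 texels). `SPackedResolution` and `packResolution` are hypothetical names, and the 4-byte size holds on mainstream ABIs but is formally implementation-defined.

```cpp
#include <cstdint>

// Both axes share one 32-bit word, so no 16-bit push constant support is needed.
struct SPackedResolution
{
    uint32_t width : 16;
    uint32_t height : 16;
};

static_assert(sizeof(SPackedResolution) == 4, "expected both axes to share one 32-bit word");

inline SPackedResolution packResolution(uint32_t width, uint32_t height)
{
    SPackedResolution p;
    p.width = width;   // values above 65535 would be truncated
    p.height = height;
    return p;
}
```

The shader side reads `pc.width` and `pc.height` as before, and the push constant block shrinks by four bytes.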

};

}
}
}
}

#endif
Original file line number Diff line number Diff line change
@@ -0,0 +1,25 @@
#include "common.hlsl"

using namespace nbl;
using namespace nbl::hlsl;
using namespace nbl::hlsl::sampling::hierarchical_image;

[[vk::push_constant]] SLumaGenPushConstants pc;

[[vk::binding(0, 0)]] Texture2D<float32_t4> envMap;
[[vk::binding(1, 0)]] RWTexture2D<float32_t> outImage;
Comment on lines +9 to +10

this assumes a 2D map and not a cubemap/layered thing, use layered images instead


[numthreads(WORKGROUP_DIM, WORKGROUP_DIM, 1)]

imho don't pass a define, and hardcode to 16x16 (but obviously leave a constexpr in a shared header so the C++ can be in sync)
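
A sketch of the shared-header idea, shown as plain C++. `WorkgroupDim` and `workgroupCount` are hypothetical names; in a header shared with HLSL the constant would presumably go through whatever constexpr macro the codebase already uses.

```cpp
#include <cstdint>

// Single source of truth for the workgroup size, visible to both shader and host code.
constexpr uint32_t WorkgroupDim = 16u;

// Number of workgroups needed to cover `texels` along one axis (rounds up).
constexpr uint32_t workgroupCount(uint32_t texels)
{
    return (texels + WorkgroupDim - 1u) / WorkgroupDim;
}
```

The host would dispatch `workgroupCount(width)` by `workgroupCount(height)` groups, while the shader uses the same constant in `[numthreads(...)]`.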

[shader("compute")]
void main(uint32_t3 threadID : SV_DispatchThreadID)
{
if (all(threadID < pc.lumaMapResolution))
{

const float uv_y = (float(threadID.y) + float(0.5f)) / pc.lumaMapResolution.y;
const float32_t3 envMapSample = envMap.Load(float32_t3(threadID.xy, 0));
const float32_t luma = hlsl::dot(envMapSample, pc.lumaRGBCoefficients) * sin(numbers::pi<float32_t> * uv_y);

this assumes a spherical warp

generalize
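
One possible direction for generalizing, sketched in C++: take the per-texel solid-angle weight as a policy instead of hardcoding `sin(pi * uv_y)`. All names here are hypothetical. The equirectangular weight drops the constant factor 2π², which cancels once the luminance map is normalized.

```cpp
#include <cmath>

constexpr float Pi = 3.14159265358979323846f;

// Equirectangular mapping: solid angle per unit UV area is proportional to sin(pi * v).
struct EquirectangularWeight
{
    float operator()(float /*u*/, float v) const { return std::sin(Pi * v); }
};

// A mapping whose texels are treated as covering equal solid angle.
struct UniformWeight
{
    float operator()(float, float) const { return 1.0f; }
};

// Luminance of one texel, scaled by the solid-angle weight of the chosen mapping.
template <typename WeightT>
float weightedLuma(const float rgb[3], const float coeffs[3], float u, float v, WeightT weight)
{
    const float luma = rgb[0] * coeffs[0] + rgb[1] * coeffs[1] + rgb[2] * coeffs[2];
    return luma * weight(u, v);
}
```

Here `u` and `v` are expected at texel centres, matching the `(threadID.y + 0.5) / resolution.y` computation in the shader.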


outImage[threadID.xy] = luma;
}
}
Original file line number Diff line number Diff line change
@@ -0,0 +1,48 @@
#include "nbl/builtin/hlsl/sampling/hierarchical_image.hlsl"

[[vk::binding(0, 0)]] Texture2D<float32_t> lumaMap;

[[vk::binding(1, 0)]] RWTexture2D<float32_t2> outImage;

using namespace nbl;
using namespace nbl::hlsl;
using namespace nbl::hlsl::sampling;

struct LuminanceAccessor
{
float32_t texelFetch(uint32_t2 coord, uint32_t level)
{
return lumaMap.Load(uint32_t3(coord, level));
}

float32_t4 texelGather(uint32_t2 coord, uint32_t level)
{
return float32_t4(
lumaMap.Load(uint32_t3(coord, level), uint32_t2(0, 1)),
lumaMap.Load(uint32_t3(coord, level), uint32_t2(1, 1)),
lumaMap.Load(uint32_t3(coord, level), uint32_t2(1, 0)),
lumaMap.Load(uint32_t3(coord, level), uint32_t2(0, 0))
);
Comment on lines +20 to +25

btw OOB reads using texelFetch are undefined, assert that coord is between 0 and lastPixel-1 for both axes


}
};

[numthreads(WORKGROUP_DIM, WORKGROUP_DIM, 1)]
[shader("compute")]
void main(uint32_t3 threadID : SV_DispatchThreadID)
{
LuminanceAccessor luminanceAccessor;
uint32_t lumaMapWidth, lumaMapHeight;

lumaMap.GetDimensions(lumaMapWidth, lumaMapHeight);

using LuminanceSampler = LuminanceMapSampler<float32_t, LuminanceAccessor>;

LuminanceSampler luminanceSampler =
LuminanceSampler::create(luminanceAccessor, uint32_t2(lumaMapWidth, lumaMapHeight), lumaMapWidth != lumaMapHeight, uint32_t2(lumaMapWidth, lumaMapHeight));
Comment on lines +41 to +42

why are we passing the uint32_t2(lumaMapWidth, lumaMapHeight) resolution twice ?


uint32_t2 pixelCoord = threadID.xy;

outImage[pixelCoord] = luminanceSampler.binarySearch(pixelCoord);

Comment on lines +44 to +47

you need to handle OOB, I can have a 4x4 sphere map, but my workgroup is 16x16
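
A tiny C++ simulation of the guard being asked for, using hypothetical helper names: each invocation checks its coordinate against the output resolution before writing, so a 16x16 workgroup over a 4x4 map performs only 16 writes.

```cpp
#include <cstdint>

// True when the invocation's coordinate addresses a real texel.
inline bool inBounds(uint32_t x, uint32_t y, uint32_t width, uint32_t height)
{
    return x < width && y < height;
}

// Counts how many invocations of a single workgroup would actually write.
inline uint32_t countWrites(uint32_t workgroupDim, uint32_t width, uint32_t height)
{
    uint32_t writes = 0;
    for (uint32_t y = 0; y < workgroupDim; ++y)
        for (uint32_t x = 0; x < workgroupDim; ++x)
            if (inBounds(x, y, width, height))
                ++writes;
    return writes;
}
```

In the shader this corresponds to wrapping the body of `main` in the same kind of check that `gen_luma` already does with `all(threadID < pc.lumaMapResolution)`.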

}