Conversation
- Enable pulling from private GitLab repo - Improve caching for pip, apt and cargo - Fix cMake version - Remove problematic pip installation in favor of apt package - Add ZSH an Oh My ZSH - Add package dependencies for GAP9 SDK - Remove unused files from the container - Fix banshee package problems
… issueFix duplicate template generation due to PULP inheritance issue
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Fix all issues with AI agents
In `@TargetLibraries/GAP9/prebuilt/include/pulp_nn_utils.h`:
- Around line 1616-1633: The tail loop uses the variable left as the loop guard
but erroneously decrements length inside the loop, causing an infinite loop; in
the loop in pulp_nn_utils (the block that extracts nibbles with bitextu from
*pIn and *pCom, computes inA0/inA1/inB0/inB1, and writes back with bitins)
replace the decrement of length (length--) with a decrement of left (left--) so
the while(left > 0u) loop terminates correctly while leaving pIn, pCom, and
length semantics intact.
🧹 Nitpick comments (1)
TargetLibraries/GAP9/prebuilt/include/pulp_nn_utils.h (1)
857-865: Existing TODO indicates potential correctness issue for non-multiple-of-8 channel sizes.The TODO comment warns that this code path may be incorrect when the number of channels is not a multiple of 8. The leftover loop decrements
lfover -= 8which could cause issues or skip elements iflfoveris between 1-7.Would you like me to open an issue to track fixing this limitation, or is this an acceptable constraint documented elsewhere?
- Add GAP9 tiled L3 memory mode tests (singlebuffer/doublebuffer) - Fix duplicate NUM_CORES definition (global → target-specific) - Add prebuilt library NUM_CORES validation with auto-fallback - Fix hex file count reporting (CMAKE_MATCH_COUNT → list LENGTH) - Reduce dangerous warning suppressions in deeploygap9 - Standardize math.h include in BatchNorm_fp32.c - Update GVSOC_INSTALL_DIR error messages (env var → CMake var) - Add stack size configuration comments in sdk.config - Remove dead code in deeployRunner_tiled_gap9.py
There was a problem hiding this comment.
Actionable comments posted: 4
🤖 Fix all issues with AI agents
In `@cmake/gap9/gap9_gvsoc.cmake`:
- Around line 18-21: The guard currently uses if(NOT DEFINED GAP9_SDK_HOME)
which won't catch an empty environment value because set(GAP9_SDK_HOME
$ENV{GAP_SDK_HOME}) always creates the CMake variable; change the check to test
for emptiness or the environment directly: either replace the guard with if(NOT
GAP9_SDK_HOME) after the set, or check the environment before setting with
if(NOT DEFINED ENV{GAP_SDK_HOME}) and then set(GAP9_SDK_HOME
$ENV{GAP_SDK_HOME}); update the error message path check around GAP9_SDK_HOME
accordingly so an empty or missing SDK path triggers the FATAL_ERROR.
In `@cmake/simulation.cmake`:
- Around line 84-95: The add_custom_target for gvsoc_${name} currently uses a
COMMAND that includes shell-specific constructs (wildcard *.bin and "|| true")
which are not handled because add_custom_target does not run a shell; replace
the copy COMMAND so it runs via an explicit shell at build time (e.g., invoke sh
-c "...") or implement a build-time CMake script invoked by a COMMAND that
checks for and copies any *.bin files, ensuring the wildcard expansion and the
"|| true" fallback happen at build execution rather than configure time; update
the COMMAND line in the add_custom_target definition (the one referencing
${CMAKE_COMMAND} -E copy_if_different ${CMAKE_BINARY_DIR}/*.bin
${GVSOC_WORKDIR}/ || true) to use the shell wrapper or the build-time script so
the copy is evaluated correctly.
In `@Deeploy/Targets/GAP9/Platform.py`:
- Around line 275-306: The untiledOps lists in GAP9Platform and
MemoryGAP9PlatformWrapper currently contain the lowercase string "add", which
never matches ONNX node.op values like "Add"; update the untiledOps definitions
(in both GAP9Platform.untiledOps and MemoryGAP9PlatformWrapper.untiledOps) to
use the correct ONNX name "Add" (or implement a case-insensitive check in
getTargetMemoryLevel by normalizing node.op) so the getTargetMemoryLevel methods
properly detect Add nodes and return ctxt.lookup(tensorName)._memoryLevel.
In `@DeeployTest/Platforms/GAP9/sdk.config`:
- Around line 5-6: The config currently enables two boards
(CONFIG_BOARD_GAP9MOD_V1_0_B and CONFIG_BOARD_GAP9EVK_V1_3) which conflicts with
the build target hardcoded as --target=gap9.evk in gap9_gvsoc.cmake; update
sdk.config to select only the EVK board to match the CMake target (remove or set
to n the CONFIG_BOARD_GAP9MOD_V1_0_B entry) or alternatively update
gap9_gvsoc.cmake to accept the MOD board if intended—ensure only one of
CONFIG_BOARD_GAP9MOD_V1_0_B or CONFIG_BOARD_GAP9EVK_V1_3 remains enabled to
remove ambiguity.
🧹 Nitpick comments (3)
CMakeLists.txt (2)
36-40: GuardGAP_SDK_HOMEbefore include to avoid opaque configure failures.
If the env var is missing,include()fails with a confusing path error. A direct guard makes the failure explicit.🔧 Suggested patch
elseif(platform STREQUAL GAP9) message(STATUS "Building for platform 'GAP9'") set(CMAKE_RUNTIME_OUTPUT_DIRECTORY ${CMAKE_BINARY_DIR}) set(ENV{KCONFIG_CONFIG} DeeployTest/Platforms/GAP9/sdk.config) - include($ENV{GAP_SDK_HOME}/utils/cmake/setup.cmake) + if(NOT DEFINED ENV{GAP_SDK_HOME}) + message(FATAL_ERROR "GAP_SDK_HOME is not set") + endif() + include($ENV{GAP_SDK_HOME}/utils/cmake/setup.cmake)
225-227: Avoid undefinedTESTNAMEin GAP9 project setup.
IfTESTNAMEisn’t provided,project(${TESTNAME} ...)can fail at configure time. Consider a safe default or explicit error.🔧 Suggested patch
if(platform STREQUAL GAP9) - project(${TESTNAME} LANGUAGES C ASM) + if(NOT DEFINED TESTNAME) + set(TESTNAME deeploy) + endif() + project(${TESTNAME} LANGUAGES C ASM)Deeploy/Targets/GAP9/Platform.py (1)
264-286: Avoid mutable defaults and instantiated objects in__init__parameters.
Using a list literal withGAP9ClusterEngine(...)shares state across instances.🔧 Suggested patch
class GAP9Platform(DeploymentPlatform): def __init__(self, - engines = [GAP9ClusterEngine("GAP9Cluster")], + engines = None, variableBuffer = GAP9VariableBuffer, constantBuffer = GAP9ConstantBuffer, structBuffer = GAP9StructBuffer, transientBuffer = GAP9TransientBuffer) -> None: + if engines is None: + engines = [GAP9ClusterEngine("GAP9Cluster")] super().__init__(engines, variableBuffer, constantBuffer, structBuffer, transientBuffer) @@ class MemoryGAP9Platform(MemoryPlatform): def __init__(self, memoryHierarchy: MemoryHierarchy, defaultTargetMemoryLevel: MemoryLevel, - engines = [GAP9ClusterEngine("GAP9Cluster")], + engines = None, variableBuffer = GAP9VariableBuffer, constantBuffer = GAP9ConstantBuffer, structBuffer = GAP9StructBuffer, transientBuffer = GAP9TransientBuffer) -> None: + if engines is None: + engines = [GAP9ClusterEngine("GAP9Cluster")] super().__init__(memoryHierarchy, defaultTargetMemoryLevel, engines, variableBuffer, constantBuffer, structBuffer, transientBuffer)
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Fix all issues with AI agents
In @.github/workflows/_runner-gap9.yml:
- Around line 38-42: The workflow step named "Cache ccache" currently uses
actions/cache/restore@v4 to only restore /app/.ccache (key ccache-ci) but never
saves updated entries; update this by either replacing uses:
actions/cache/restore@v4 with uses: actions/cache@v4 so restore+save are handled
automatically, or add a matching post step that uses actions/cache/save@v4 with
the same path (/app/.ccache) and key (ccache-ci) so that ccache entries produced
during runs (CCACHE_DIR=/app/.ccache) are persisted across runs.
The prebuilt pulp-nn library headers already have Apache 2.0 licenses but lack SPDX identifiers. Since these are external prebuilt binaries (similar to third_party code), they should be excluded from SPDX license validation. Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
There was a problem hiding this comment.
Actionable comments posted: 4
🤖 Fix all issues with AI agents
In @.github/workflows/infra-generate-ccache-gap9.yml:
- Around line 16-17: The comment stating the cron schedule is CET is incorrect
because GitHub Actions uses UTC; update the comment above the cron entry that
currently reads the schedule as "0 2 * * *" so it accurately states the timezone
(UTC) or converts the time to CET if you want a 02:00 CET run (e.g., note that
"0 2 * * *" runs at 02:00 UTC / 03:00 CET). Keep the cron line "- cron: \"0 2 *
* *\"" as-is if you intend UTC and change the explanatory comment to mention
UTC, or adjust the cron expression and comment together if you want actual CET
scheduling.
- Around line 50-54: The cache step using actions/cache@v4 currently uses a
static key "ccache-gap9" which prevents updates; update the step that defines
path: /app/.ccache and key: ccache-gap9 to generate a unique, versioned key
(e.g., include github.run_id or a hashFiles() call) and add a restore-keys entry
so older caches can be used as fallbacks; specifically modify the "Clean and
Upload CCache" step to compute a dynamic key (for example by combining a prefix
like "ccache-gap9-" with hashFiles('**/ccache-config') or github.run_id) and add
restore-keys: ccache-gap9- to allow partial restores.
In `@Deeploy/Targets/GAP9/Platform.py`:
- Around line 202-218: The _bufferRepresentation output in GAP9TransientBuffer
uses self._type while GAP9VariableBuffer uses self._instance, causing
inconsistent type/property serialization; inspect the classes
GAP9TransientBuffer and GAP9VariableBuffer and decide on the canonical attribute
name (either _type or _instance), then update
GAP9TransientBuffer._bufferRepresentation (or GAP9VariableBuffer) to use the
chosen attribute consistently and add a fallback (e.g., check hasattr and
default to None) to avoid AttributeError when the attribute is absent; ensure
the updated attribute name matches how instances are initialized elsewhere in
the class constructors.
- Around line 294-306: MemoryGAP9PlatformWrapper defines untiledOps = [] which
is inconsistent with MemoryGAP9Platform (which sets untiledOps = ["add"]) and
renders the conditional in getTargetMemoryLevel dead; update
MemoryGAP9PlatformWrapper.untiledOps to match the platform (set untiledOps to
["add"]) following the pattern used in MemoryPULPPlatformWrapper so the check in
getTargetMemoryLevel(node, tensorName, ctxt) behaves as intended.
🧹 Nitpick comments (4)
cmake/simulation.cmake (1)
72-78: Minor stylistic inconsistency withgvsoc_flags_add_files_to_hyperflash.The hyperflash version (lines 63-70) uses
list(TRANSFORM)while this uses aforeachloop. Both are correct, but aligning the implementation style would improve maintainability.♻️ Optional: align with hyperflash version using list(TRANSFORM)
function(gvsoc_flags_add_files_to_flash out_var files_var) - set(flags) - foreach(file ${${files_var}}) - list(APPEND flags "--flash-property=${file}@flash:readfs_flash:files") - endforeach() + set(flags ${${files_var}}) + list(TRANSFORM flags PREPEND "--flash-property=") + list(TRANSFORM flags APPEND "@flash:readfs_flash:files") set(${out_var} ${flags} PARENT_SCOPE) endfunction()Deeploy/Targets/GAP9/Platform.py (2)
221-237: LGTM!Properly extends parent class buffer representation. The same
getattrsimplification could be applied here for consistency:memoryLevel = getattr(self, "_memoryLevel", None)
262-270: Avoid mutable default argument with function call.The default
engines = [GAP9ClusterEngine("GAP9Cluster")]is evaluated once at function definition time. While unlikely to cause issues in practice, this pattern can lead to subtle bugs if the list is modified.♻️ Suggested fix
class GAP9Platform(DeploymentPlatform): def __init__(self, - engines = [GAP9ClusterEngine("GAP9Cluster")], + engines = None, variableBuffer = GAP9VariableBuffer, constantBuffer = GAP9ConstantBuffer, structBuffer = GAP9StructBuffer, transientBuffer = GAP9TransientBuffer) -> None: + if engines is None: + engines = [GAP9ClusterEngine("GAP9Cluster")] super().__init__(engines, variableBuffer, constantBuffer, structBuffer, transientBuffer).github/workflows/infra-generate-ccache-gap9.yml (1)
32-41: Avoid swallowing setup/build failures.
|| truehides failures in environment setup and installation, which can lead to generating or uploading an incomplete cache with no signal. If the script is optional, log and skip explicitly; otherwise fail fast.🛠️ Proposed fix (fail fast or explicit guard)
- name: Build Deeploy shell: bash run: | source /app/install/gap9-sdk/.gap9-venv/bin/activate - source /app/install/gap9-sdk/configs/gap9_evk_audio.sh || true - pip install -e . || true + if [ ! -f /app/install/gap9-sdk/configs/gap9_evk_audio.sh ]; then + echo "Missing gap9_evk_audio.sh" >&2 + exit 1 + fi + source /app/install/gap9-sdk/configs/gap9_evk_audio.sh + pip install -e . deactivate - name: Generate CCache for GAP9 run: | source /app/install/gap9-sdk/.gap9-venv/bin/activate - source /app/install/gap9-sdk/configs/gap9_evk_audio.sh || true + source /app/install/gap9-sdk/configs/gap9_evk_audio.sh
f4face7 to
67e7e5b
Compare
- Remove TargetLibraries/GAP9/third_party from git tracking (this is a CMake-generated symlink that should not be in the repo) - Add GAP9/third_party to .gitignore to prevent future tracking - Revert pulp-nn-mixed submodule to original commit (a9b4aaf5) (submodule was unintentionally updated to b69ec23e)
Xeratec
left a comment
There was a problem hiding this comment.
Thanks for the adjustments. I think we will converge in the next round. I only have a few minor comments and would still prefer to move the pulp-nn libraries from TargetLibraries/PULPOpen/third_party to TargetLibraries/third_party.
As discussed offline, I will create another PR once this is merged to properly update the Dockerflow, including support for running on external hardware.
| with: | ||
| path: /app/.ccache | ||
| key: ccache-ci | ||
| key: ccache-gap9 |
There was a problem hiding this comment.
Out of curiosity, why use a separate cache?
There was a problem hiding this comment.
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
TargetLibraries/PULPOpen/CMakeLists.txt (1)
42-47:⚠️ Potential issue | 🟠 MajorGuard
pulp-nn-mixedinclusion to prevent duplicate target errors.Both
TargetLibraries/PULPOpen/CMakeLists.txtandTargetLibraries/GAP9/CMakeLists.txtcalladd_subdirectory(../third_party/pulp-nn-mixed). Configuring both platforms in the same build causes CMake to fail with "source directory already added" error. Wrap the subdirectory call with aTARGETguard:-add_subdirectory(../third_party/pulp-nn-mixed) +if(NOT TARGET pulp-nn-mixed) + add_subdirectory(../third_party/pulp-nn-mixed) +endif() target_include_directories(pulp-nn-mixed PUBLIC ${PULP_SDK_INCLUDES}) target_compile_options(pulp-nn-mixed PUBLIC ${PULP_SDK_COMPILE_FLAGS})
🤖 Fix all issues with AI agents
In @.github/workflows/ci-platform-gap9-tiled.yml:
- Around line 27-30: The conditional passed to the select-env step (the value of
input docker_image_deeploy in the select-env block) lacks explicit grouping for
the || and && operators; update the expression to add parentheses so the
repo-equals check is grouped with the && fallback (e.g. wrap (github.repository
== 'pulp-platform/Deeploy' && 'ghcr.io/pulp-platform/deeploy-gap9:latest')
before the ||) so forks don't yield the literal "false" string and the fallback
behaves correctly; modify the docker_image_deeploy expression in the select-env
step accordingly.
8c1f6ed to
a0e65a6
Compare
Specify binary directory when adding third_party submodules since they are not subdirectories of PULPOpen/GAP9.
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Fix all issues with AI agents
In `@TargetLibraries/GAP9/CMakeLists.txt`:
- Around line 39-47: NUM_CORES is used without being defined which leads to
surprising behavior; add an explicit guard that checks DEFINED NUM_CORES before
the USE_PREBUILT_PULPNN check and error out if it is unset. Specifically, before
the block that uses NUM_CORES (the if(USE_PREBUILT_PULPNN AND NOT NUM_CORES
EQUAL 8) check), add a check like if(NOT DEFINED NUM_CORES) and call
message(FATAL_ERROR "...") to fail fast and tell the user to supply NUM_CORES;
keep the rest of the logic for option(USE_PREBUILT_PULPNN ...) and the 8-core
validation unchanged.
🧹 Nitpick comments (2)
TargetLibraries/GAP9/CMakeLists.txt (2)
5-12: AddCONFIGURE_DEPENDSto keep globbed sources in sync.
Without it, adding/removing files won’t trigger a CMake reconfigure.♻️ Proposed change
-file(GLOB_RECURSE SOURCES - "src/**" -) +file(GLOB_RECURSE SOURCES CONFIGURE_DEPENDS + "${CMAKE_CURRENT_LIST_DIR}/src/**" +) @@ -file(GLOB_RECURSE PULPOPEN_SOURCES "../PULPOpen/src/**") +file(GLOB_RECURSE PULPOPEN_SOURCES CONFIGURE_DEPENDS + "${CMAKE_CURRENT_LIST_DIR}/../PULPOpen/src/**" +)
21-23: Prefertarget_compile_definitionsforNUM_CORES.
This keeps usage requirements in the right property instead ofCOMPILE_OPTIONS.♻️ Proposed change
-target_compile_options(deeploygap9 PUBLIC - -DNUM_CORES=${NUM_CORES} - ) +target_compile_definitions(deeploygap9 PUBLIC NUM_CORES=${NUM_CORES})
Summary
This PR adds complete GAP9 platform support to Deeploy, including platform integration, DMA support, tiling capabilities, CI/CD workflows, and comprehensive testing infrastructure. This represents 20 commits specifically focused on GAP9 development.
Added
GAP9 Platform Support
Changed
- Transpose operator: Fixed GCC segmentation fault caused by template syntax (commit 9ca4595)
- LayerNorm operator: Resolved epsilon ABI compatibility issue (commit 6b5c2e5)
Known Limitations
Platform Capabilities
✅ Multi-core (1-8) | ✅ L1/L2/L3 memory | ✅ Multi-channel DMA
✅ GVSoC simulation | ✅ Tiling | ✅ PULP-NN integration
PR Merge Checklist
develcommit and pointing todevel.CHANGELOG.mdfile has been updated.