Fix Python/FastAPI/SQL parsing: route false positives, Depends() tracking, SQL size guard, DLL calls#66

Open
kingchenc wants to merge 2 commits into DeusData:main from kingchenc:main


Summary

Details

#28 — Python dict .get() misidentified as Route nodes

Source-based route extractors (extractGoRoutes, extractExpressRoutes, extractLaravelRoutes, extractKtorRoutes) were running on all function nodes regardless of file type. The Ktor regex \b(get|post|...)\("..." matched payload.get("sub") in Python files, creating ~125 false Route nodes.

Fix: File extension guard via switch filepath.Ext() — each extractor only runs on its own language (.go, .js/.ts, .php, .kt).
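A minimal sketch of how such a guard might look, assuming a simplified stand-in for the Ktor pattern (the full expression is elided above). `extractorFor` and `ktorRoutes` are hypothetical helpers for illustration, not the functions in `internal/httplink/httplink.go`.

```go
package main

import (
	"fmt"
	"path/filepath"
	"regexp"
	"strings"
)

// ktorRouteRe is an assumed, simplified version of the Ktor route regex;
// the verb list in the real pattern is elided in the PR text.
var ktorRouteRe = regexp.MustCompile(`\b(get|post|put|delete|patch)\("([^"]*)"`)

// extractorFor maps a file extension to the one extractor allowed to run on it.
// An empty string means no source-based route extractor applies.
func extractorFor(path string) string {
	switch strings.ToLower(filepath.Ext(path)) {
	case ".go":
		return "extractGoRoutes"
	case ".js", ".ts":
		return "extractExpressRoutes"
	case ".php":
		return "extractLaravelRoutes"
	case ".kt":
		return "extractKtorRoutes"
	default:
		return ""
	}
}

// ktorRoutes applies the Ktor pattern only when the guard selects it.
func ktorRoutes(path, src string) []string {
	if extractorFor(path) != "extractKtorRoutes" {
		return nil
	}
	var routes []string
	for _, m := range ktorRouteRe.FindAllStringSubmatch(src, -1) {
		routes = append(routes, m[1]+" "+m[2])
	}
	return routes
}

func main() {
	py := `sub = payload.get("sub")`
	kt := `get("/users") { call.respond(users) }`
	fmt.Println(ktorRouteRe.MatchString(py)) // the raw pattern alone does match Python
	fmt.Println(ktorRoutes("auth.py", py))
	fmt.Println(ktorRoutes("Routes.kt", kt))
}
```

The pattern itself is unchanged in this sketch. Only the dispatch decides whether it runs, which is why the Python `.get("sub")` call stops producing Route nodes.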

#27 — FastAPI Depends() not tracked

Functions passed to Depends() as parameter defaults (e.g. user = Depends(get_current_user)) were not extracted as calls — making critical auth/DI functions appear as dead code with in_degree=0.

Fix: New extractPythonDependsEdges() scans Python function signatures for Depends(func_ref) patterns and emits CALLS edges (resolution_strategy: "fastapi_depends"). Includes fallback for import aliases (from X import Y as Z) by extracting the original function name from the import path.

Tested: 392 Depends edges across 39 router files on a real FastAPI project. require_admin went from in_degree: 64 to in_degree: 180.
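The Depends() scan described in the fix could be sketched roughly as follows. The regular expressions, the `Edge` type, and the helper names are assumptions for illustration. The sketch only handles the single-name `from X import Y as Z` form and is not the code in `extractPythonDependsEdges()`.

```go
package main

import (
	"fmt"
	"regexp"
)

// Edge is a hypothetical stand-in for the pipeline's CALLS edge record.
type Edge struct {
	Caller, Callee, Strategy string
}

var (
	// Matches Depends(name) or Depends(module.name) with a bare reference inside.
	dependsRe = regexp.MustCompile(`\bDepends\(\s*([A-Za-z_][\w.]*)\s*\)`)
	// Matches "from pkg.mod import original as alias" on a single line.
	aliasRe = regexp.MustCompile(`(?m)^\s*from\s+[\w.]+\s+import\s+(\w+)\s+as\s+(\w+)\s*$`)
)

// importAliases returns a map from alias to original imported name.
func importAliases(src string) map[string]string {
	aliases := make(map[string]string)
	for _, m := range aliasRe.FindAllStringSubmatch(src, -1) {
		aliases[m[2]] = m[1]
	}
	return aliases
}

// dependsEdges emits one CALLS edge per Depends(func_ref) in a signature,
// resolving import aliases back to the original function name.
func dependsEdges(caller, signature string, aliases map[string]string) []Edge {
	var edges []Edge
	for _, m := range dependsRe.FindAllStringSubmatch(signature, -1) {
		callee := m[1]
		if original, ok := aliases[callee]; ok {
			callee = original
		}
		edges = append(edges, Edge{Caller: caller, Callee: callee, Strategy: "fastapi_depends"})
	}
	return edges
}

func main() {
	imports := "from app.auth import require_admin as _admin\nfrom app.auth import get_current_user\n"
	sig := "def list_users(user=Depends(get_current_user), admin=Depends(_admin)):"
	for _, e := range dependsEdges("list_users", sig, importAliases(imports)) {
		fmt.Printf("%s -> %s (%s)\n", e.Caller, e.Callee, e.Strategy)
	}
}
```

Resolving the alias before emitting the edge is what lets a function such as `require_admin` collect incoming edges even when routers import it under another name.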

#62 — Stack overflow in tree-sitter SQL parser

Large .sql files (bulk INSERT dumps of ~4.5MB) cause deep recursion in the tree-sitter SQL grammar, exhausting the C stack (especially on Windows, where the default stack size is 1MB).

Fix: Per-language file size guard in cbmParseFile(): SQL >1MB skipped, any file >4MB skipped. Logged as cbm.skip.large_sql / cbm.skip.large_file.
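As a rough illustration of the guard, the check can be a pure function of path and size that returns the log key to emit. `skipReason` is a hypothetical helper. Reading "1MB" and "4MB" as binary megabytes and checking the SQL limit first are assumptions that the PR text does not confirm.

```go
package main

import (
	"fmt"
	"path/filepath"
	"strings"
)

const (
	maxSQLBytes  = 1 << 20 // assumed: 1MB interpreted as 1 MiB
	maxFileBytes = 4 << 20 // assumed: 4MB interpreted as 4 MiB
)

// skipReason returns the log key for a file that should not be parsed,
// or an empty string when the file is small enough to parse.
// The SQL check runs first so an oversized dump reports the SQL-specific key;
// that ordering is a choice made for this sketch.
func skipReason(path string, size int64) string {
	if strings.EqualFold(filepath.Ext(path), ".sql") && size > maxSQLBytes {
		return "cbm.skip.large_sql"
	}
	if size > maxFileBytes {
		return "cbm.skip.large_file"
	}
	return ""
}

func main() {
	fmt.Println(skipReason("dump.sql", 4_500_000))
	fmt.Println(skipReason("bundle.js", 5<<20))
	fmt.Println(skipReason("schema.sql", 20_000) == "")
}
```

Checking the size before handing the buffer to the parser keeps the recursion from starting at all, so the guard does not depend on how deep a particular grammar recurses.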

#29 — Dynamic DLL calling not tracked

C/C++ code using GetProcAddress(handle, "Func"), dlsym(handle, "func"), or .Resolve("Func") for dynamic DLL loading had no call graph edges to the resolved functions.

Fix: New extractDLLResolveEdges() detects these patterns via regex and creates CALLS edges to synthetic stub nodes with dll_name/dll_function metadata. Stubs are created during the sequential flush phase (same path as LSP stub nodes).
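A hedged sketch of the pattern detection follows. The two regular expressions and the `DLLRef` type are assumptions, not the expressions used in `extractDLLResolveEdges()`. How the PR derives `dll_name` is not shown, so the sketch records only the handle or receiver identifier alongside the resolved function name.

```go
package main

import (
	"fmt"
	"regexp"
)

// DLLRef is a hypothetical record for one dynamically resolved symbol.
type DLLRef struct {
	Handle   string // identifier passed as the handle, or the receiver of .Resolve
	Function string // string literal naming the resolved function
}

var (
	// GetProcAddress(handle, "Func") and dlsym(handle, "func")
	procRe = regexp.MustCompile(`\b(?:GetProcAddress|dlsym)\s*\(\s*(\w+)\s*,\s*"([^"]+)"\s*\)`)
	// receiver.Resolve("Func")
	resolveRe = regexp.MustCompile(`(\w+)\.Resolve\s*\(\s*"([^"]+)"\s*\)`)
)

// dllRefs returns GetProcAddress/dlsym matches first, then .Resolve matches,
// each group in source order.
func dllRefs(src string) []DLLRef {
	var refs []DLLRef
	for _, re := range []*regexp.Regexp{procRe, resolveRe} {
		for _, m := range re.FindAllStringSubmatch(src, -1) {
			refs = append(refs, DLLRef{Handle: m[1], Function: m[2]})
		}
	}
	return refs
}

func main() {
	src := `
auto init = (InitFn)GetProcAddress(hMod, "PluginInit");
void *stop = dlsym(handle, "plugin_shutdown");
auto render = lib.Resolve("Render");
`
	for _, r := range dllRefs(src) {
		fmt.Printf("%s -> %s\n", r.Handle, r.Function)
	}
}
```

Only calls with a string literal as the symbol name are matched here. A name built at runtime cannot be resolved statically and would produce no edge in this sketch.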

Test plan

Changed files

  • internal/httplink/httplink.go — file extension guard in discoverRoutes()
  • internal/pipeline/pipeline_cbm.go — SQL size guard, extractPythonDependsEdges(), extractDLLResolveEdges()
  • internal/pipeline/pipeline.go — extend createLSPStubNodes() to handle dll_resolve strategy

…king, SQL size guard, DLL calls

  - Fix DeusData#28: Restrict source-based route extractors (Go/Express/Laravel/Ktor)
    to their own file extensions. Prevents Python dict .get() from matching
    Ktor route regex and creating ~125 spurious Route nodes.

  - Fix DeusData#27: Track FastAPI Depends(func_ref) in parameter defaults as CALLS
    edges. Scans Python function signatures for Depends() patterns so
    dependency-injected functions no longer appear as dead code (in_degree=0).
    Includes fallback for import aliases (e.g. `import X as _Y`).

  - Fix DeusData#62: Add file size guard in cbmParseFile() to prevent tree-sitter
    SQL parser stack overflow on large .sql files (bulk INSERTs). SQL files
    >1MB and any file >4MB are skipped with a logged warning.

  - Fix DeusData#29: Detect dynamic DLL resolution patterns (GetProcAddress, dlsym,
    Resolve) in C/C++ source and create CALLS edges to synthetic stub nodes
    with dll_name/dll_function metadata.