Skip to content

perf(gui): implement batched UI rendering#2514

Open
githubawn wants to merge 22 commits intoTheSuperHackers:mainfrom
githubawn:perf/gui-batch
Open

perf(gui): implement batched UI rendering#2514
githubawn wants to merge 22 commits intoTheSuperHackers:mainfrom
githubawn:perf/gui-batch

Conversation

@githubawn
Copy link
Copy Markdown

@githubawn githubawn commented Mar 31, 2026

Reduces draw calls by batching 2D elements by texture state in setup2DRenderState.

Environment Scenario (-quickstart) Before After
Main Machine Skirmish Screen 477 FPS 918 FPS
M1 Macbook (Wine) Skirmish Screen 16 FPS 61 FPS
M1 Macbook (Wine) In-Game (Match Start) 22 FPS 33 FPS
Main Machine Selecting a thousand units in skirmish 45 FPS 56 FPS

Added early-out clipping and refined bounds checks to skip rendering objects outside the active region.

Optimized HotKey rendering by removing a redundant Render() call.

@greptile-apps
Copy link
Copy Markdown

greptile-apps bot commented Mar 31, 2026

Greptile Summary

This PR batches 2D UI draw calls by texture/mode state through a new setup2DRenderState helper and beginBatch/endBatch/flush API on the Display base class, deferring m_2DRender->Render() until the state must change or the batch ends. It also adds an early-out clip check in drawImage before any texture work, fixes a redundant hotkey Render() call in W3DDisplayString, and carries an unrelated camera-offset refactor in W3DView.cpp that has a few leftover debug artifacts.

Confidence Score: 5/5

Safe to merge; the batching logic and texture ref-counting are correct, and all remaining findings are P2 style issues.

The core batching implementation — setup2DRenderState, ref-counting for managed and raw textures, flush-on-state-change, and the beginBatch/endBatch API — is well-structured and handles all draw-call paths consistently. Previous P0/P1 thread concerns (m_batchNeedsInit initialisation, deferred Release_Ref) have both been addressed in subsequent commits. The only open items are two commented-out debug lines and one uncertain comment in the unrelated camera-offset refactor in W3DView.cpp, none of which affect 2D rendering correctness.

Core/GameEngineDevice/Source/W3DDevice/GameClient/W3DView.cpp — two commented-out m_cameraOffset.z lines and one uncertain developer comment should be cleaned up before merge.

Important Files Changed

Filename Overview
GeneralsMD/Code/GameEngineDevice/Source/W3DDevice/GameClient/W3DDisplay.cpp Core batching implementation: adds setup2DRenderState, onBeginBatch/onEndBatch/onFlush, converts all draw* functions to defer Render() when batching is active. Texture ref-counting for both raw and managed textures looks correct. Also refactors setDisplayMode to drop the local-capture pattern.
GeneralsMD/Code/GameEngine/Source/GameClient/Display.cpp Adds beginBatch/endBatch/flush with re-entrancy guard (nested beginBatch is a no-op) and initialises m_isBatching to FALSE in the constructor.
GeneralsMD/Code/GameEngineDevice/Source/W3DDevice/GameClient/W3DInGameUI.cpp Wraps the entire W3DInGameUI::draw() body in beginBatch/endBatch, collapsing all HUD draw calls into a single batched pass.
GeneralsMD/Code/GameEngineDevice/Source/W3DDevice/GameClient/W3DDisplayString.cpp Calls TheDisplay->flush() before text rendering to commit any pending 2D batch before switching to the font renderer; removes redundant m_textRendererHotKey.Render() that fired before the main sentence was drawn.
Core/GameEngineDevice/Source/W3DDevice/GameClient/W3DView.cpp Adds beginBatch/endBatch around iterateDrawablesInRegion and refactors camera-offset computation into a cached m_cameraOffset vector; contains two commented-out m_cameraOffset.z update lines with debug-prefix comments and one uncertain dev comment in resetCamera.
Generals/Code/GameEngineDevice/Source/W3DDevice/GameClient/W3DDisplay.cpp Identical batching changes replicated from GeneralsMD; same setup2DRenderState, draw* batching, and setDisplayMode refactor.

Sequence Diagram

sequenceDiagram
    participant UI as W3DInGameUI::draw()
    participant D as Display (beginBatch/endBatch)
    participant W as W3DDisplay
    participant R as Render2DClass (m_2DRender)

    UI->>D: beginBatch()
    D->>W: onBeginBatch() — Reset(), m_batchNeedsInit=TRUE

    loop Per drawImage / drawLine / drawFillRect call
        UI->>W: drawImage(image, ...)
        W->>W: setup2DRenderState(tex, mode, grayscale)
        alt same tex+mode+grayscale and !m_batchNeedsInit
            W-->>W: early return (no flush, accumulate)
        else state changed
            W->>W: onFlush() → Render() + Reset()
            W->>W: Add_Ref(newTex), Release_Ref(oldTex)
            W->>R: Enable_Texturing / Set_Texture / Enable_Alpha
        end
        W->>R: Add_Quad / Add_Tri / Add_Line (geometry only)
    end

    UI->>D: endBatch()
    D->>W: onEndBatch()
    W->>W: onFlush() → Render() + Reset()
    W->>W: Release_Ref(m_batchTexture)
Loading
Prompt To Fix All With AI
This is a comment left during a code review.
Path: Core/GameEngineDevice/Source/W3DDevice/GameClient/W3DView.cpp
Line: 3254-3256

Comment:
**Debug-prefixed commented-out code**

`m_cameraOffset.z` is commented out with a heavy slash prefix (`/////////////////////`) at both waypoint completion (line 3254) and the in-progress interpolation path (line 3330). Because `m_cameraOffset.y` and `m_cameraOffset.x` are derived from `z`, updating `x`/`y` while `z` stays frozen is a no-op when `m_cameraPitch` and `m_cameraYaw` are constant — making those two lines effectively dead code during waypoint movement. Either the whole block should be active (updating all three) or removed if the intent is to keep the offset fixed.

How can I resolve this? If you propose a fix, please make it concise.

---

This is a comment left during a code review.
Path: Core/GameEngineDevice/Source/W3DDevice/GameClient/W3DView.cpp
Line: 2900

Comment:
**Uncertain developer comment**

The inline comment `// this isn't right... or is it?` signals unresolved doubt about this call and should be resolved before merging. If the computation is confirmed correct, remove the comment; if it's incorrect, fix it.

How can I resolve this? If you propose a fix, please make it concise.

Reviews (18): Last reviewed commit: "implemented review feedback" | Re-trigger Greptile

@githubawn
Copy link
Copy Markdown
Author

githubawn commented Mar 31, 2026

Significantly improves #109
Fully fixing this requires significant refactoring to Render2DSentenceClass for fonts

optimize clipping check
return statement on seperate line
centralized flush
@githubawn
Copy link
Copy Markdown
Author

@greptileai

"One P1 render-state initialization bug in onBeginBatch() should be fixed before merging to avoid potential cross-frame visual corruption."

The P1 finding could be a false positive based on a local variable analysis that omits the downstream hardware synchronization in Render2DClass::Render() (render2d.cpp:601).

Render2DClass is a self-synchronizing state machine. Its Render() call (line 628 for textures, line 694 for shaders) explicitly pushes its internal members to the DX8Wrapper on every execution. The tracking variables in W3DDisplay (m_batchTexture, etc.) are mirror-images of Render2DClass's internal members.

happy?

@greptile-apps
Copy link
Copy Markdown

greptile-apps bot commented Mar 31, 2026

No, the P1 finding stands.

Your argument about Render2DClass being self-synchronizing addresses hardware state consistency, but not the C++ undefined behavior in W3DDisplay.

The issue:

  • onFlush() reads !m_batchNeedsInit (line 617 of your diff)
  • setup2DRenderState() reads !m_batchNeedsInit (line 627)
  • If uninitialized, these branches are unpredictable

Even if Render2DClass correctly synchronizes hardware state on every Render() call, W3DDisplay's control flow still depends on m_batchNeedsInit to decide:

  1. Whether to call Render() in onFlush()
  2. Whether to skip redundant state setup in setup2DRenderState()

An uninitialized bool means these decisions are based on garbage. The fact that the hardware might recover downstream doesn't make reading uninitialized memory defined behavior.

The fix is one line. Keep it.

@Mauller
Copy link
Copy Markdown

Mauller commented Apr 1, 2026

I see greptile woke up on the sassy side of the motherboard.

@Mauller
Copy link
Copy Markdown

Mauller commented Apr 1, 2026

Reduces draw calls by batching 2D elements by texture state in setup2DRenderState.

Environment Scenario (-quickstart) Before After
Main Machine Skirmish Screen 477 FPS 918 FPS
M1 Macbook (Wine) Skirmish Screen 16 FPS 61 FPS
M1 Macbook (Wine) In-Game (Match Start) 22 FPS 33 FPS

Another test would be to get a lot of units on screen then select them all so their unit info is being drawn.

The health bars make use of the line drawing classes which i believe are inherently affected by this change.

@githubawn
Copy link
Copy Markdown
Author

I was initially going to say that health bars render in drawViews() (outside this batch), but your comment made me realize it's only a 6-line change to wrap that too.

Jumped from 45 FPS to 56 FPS.

image

@Mauller
Copy link
Copy Markdown

Mauller commented Apr 1, 2026

I was initially going to say that health bars render in drawViews() (outside this batch), but your comment made me realize it's only a 6-line change to wrap that too.

Jumped from 45 FPS to 56 FPS.
image

Nice ~30% improvement there

@Skyaero42
Copy link
Copy Markdown

Compile errors need to be fixed before this can be reviewed

@githubawn githubawn force-pushed the perf/gui-batch branch 2 times, most recently from 33ccf35 to 5dd0e6e Compare April 6, 2026 18:53
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants