Webui/prompt processing progress #18300
Conversation
Just a nit-level improvement: I think showing percentage + ETA instead of elapsed time could be more useful:
It can still be improved; I don't know if people have prompts that take several minutes, but adding minutes might be a good idea! (And since we already calculate the tokens/s we could display it too, but that would bloat the UI, and we already show the final value.)
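As a side note, a duration formatter along these lines would cover the minutes case (a minimal TypeScript sketch, not the PR's actual code):

```ts
// Hypothetical helper (not the PR's actual code): format a duration in
// seconds as "Ss" under one minute and "Mm Ss" at or above one minute.
export function formatDuration(totalSeconds: number): string {
  const seconds = Math.max(0, Math.round(totalSeconds));
  if (seconds < 60) {
    return `${seconds}s`;
  }
  const minutes = Math.floor(seconds / 60);
  return `${minutes}m ${seconds % 60}s`;
}

// Example: formatDuration(75) === "1m 15s"
```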
ngxson left a comment
Very nice feature!
(May need approval from @allozaur too)
I made a small observation: the chunk format is very close to the one expected during normal inference, which spoofs the stat bubbles shown during inference (the ones enabled by the Settings option "Keep stats visible after generation"). It might be wise to filter at this stage when `delta.content` is null.
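To illustrate the filtering idea (a hedged sketch; the `prompt_progress` field and its members are assumptions drawn from this discussion, not the exact WebUI types):

```ts
// Sketch of the filtering idea from the comment above; field names are
// assumptions based on this discussion, not the exact WebUI types.
interface StreamChunk {
  choices?: { delta?: { content?: string | null } }[];
  prompt_progress?: { total: number; cache: number; processed: number; time_ms: number };
}

// A chunk that carries prompt_progress data but no text should only update
// the progress indicator, never the inference stat bubbles.
function isProgressOnlyChunk(chunk: StreamChunk): boolean {
  const content = chunk.choices?.[0]?.delta?.content;
  return chunk.prompt_progress !== undefined
    && (content === null || content === undefined || content === '');
}
```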
allozaur left a comment
Overall good stuff, but some changes are required in order to make it ready for merging. I will handle this on my end.
Address review feedback from ngxson
…atMessageAssistant.svelte Co-authored-by: Aleksander Grygier <aleksander.grygier@gmail.com>
…atMessageAssistant.svelte Co-authored-by: Aleksander Grygier <aleksander.grygier@gmail.com>
Force-pushed from 2eeb45f to c56418e
I'm doing a quick re-test and then we can merge it
Wait! I haven't finished yet, I'm polishing it up
No worries, I'll test it when you're finished, and I never merge myself :)
Alright, last changes pushed, and I also updated the PR description with a new demo video.
allozaur left a comment
Alright, now I think this is ready for merging
@ggerganov please take a look at this and let me know if you also think that it's production ready :)
https://github.com/user-attachments/assets/6bec9d79-5b63-4d37-9a88-11be4aa0deae
I see... maybe in this case I will remove this client-side countdown and just leave the default ETA value
Mathematically speaking, I designed the progress object such that the ETA can be inferred without keeping track of a timer on the client side. The progress can be non-linear: for example, if you're doing something else (watching a video, opening a new tab, etc.) while the prompt is processing, it can slow the progress down enough to be noticeable.
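A minimal sketch of that design, assuming the progress object carries fields like `total`, `cache`, `processed`, and `time_ms` (names inferred from this thread, not confirmed):

```ts
// Sketch only: derive the percentage and a linear ETA from a single progress
// chunk, so the client never needs its own countdown timer. Field names are
// assumptions taken from this discussion.
interface PromptProgress {
  total: number;     // total prompt tokens
  cache: number;     // tokens already in the KV cache (skipped)
  processed: number; // tokens processed so far, excluding cached ones
  time_ms: number;   // processing time spent so far
}

function progressStats(p: PromptProgress): { percent: number; etaSeconds: number } {
  const toProcess = Math.max(1, p.total - p.cache);
  const percent = (p.processed / toProcess) * 100;
  // Linear extrapolation: remaining tokens at the rate observed so far.
  const msPerToken = p.processed > 0 ? p.time_ms / p.processed : 0;
  const etaSeconds = ((toProcess - p.processed) * msPerToken) / 1000;
  return { percent, etaSeconds };
}
```

Because each chunk carries its own elapsed time, a throttled or backgrounded tab would simply display the latest server-reported estimate instead of drifting away from it.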
Yes, the timer creates a race with the server updates, causing the jump. My original approach skipped the first chunk and recalculated from the total elapsed time, avoiding both the initial error and the need for a client-side countdown. I'll let allozaur finish the refactoring; the update of the existing stats fields is superb! Now we also have the pre-processing tokens/s!
Well, I've applied 4807b0f to keep the client-side counter as one more attempt at a smoother waiting experience. @ngxson @ServeurpersoCom please check it and test on your ends. If it works well for you, let's keep it, but if you think it's better to just use the chunk-based calculations, then I will remove this browser time-counter logic.
If we want to be meticulous, it's a good idea to retest the very first batch slowly on the CPU (testing in progress). Still not perfect:
Yes, we need this!
The ETA is quite inaccurate: Screen.Recording.2025-12-29.at.18.18.35.mov
I would suggest removing the ETA since it is not very useful.
Can do! :D So in the end we are simplifying this, and that's probably the best outcome.
Now it's perfect: no more strange jitter, and one refresh per batch.
A merge is needed to continue on #18226, which affects many files.
The formula seems to be off (that's why the ETA was incorrect). I'll push a fix.
Done with a frontend-only commit from ngxson: batch refresh rate + linear ETA + better UI from allozaur.
530831841-7c575e2f-fd79-4d3b-b0da-092ac798d769.mp4
Now it works with what everyone brought! Perfect!
* webui: display prompt preprocessing progress
* webui: add percentage/ETA and exclude cached tokens from progress
  Address review feedback from ngxson
* webui: add minutes and first chunk (0%) case
* Update tools/server/webui/src/lib/components/app/chat/ChatMessages/ChatMessageAssistant.svelte
  Co-authored-by: Aleksander Grygier <aleksander.grygier@gmail.com>
* Update tools/server/webui/src/lib/components/app/chat/ChatMessages/ChatMessageAssistant.svelte
  Co-authored-by: Aleksander Grygier <aleksander.grygier@gmail.com>
* webui: address review feedback from allozaur
* chore: update webui build output
* webui: address review feedback from allozaur
* nit
* chore: update webui build output
* feat: Enhance chat processing state
* feat: Improve chat processing statistics UI
* chore: update webui build output
* feat: Add live generation statistics to processing state hook
* feat: Persist prompt processing stats in hook for better UX
* refactor: Enhance ChatMessageStatistics for live stream display
* feat: Implement enhanced live chat statistics into assistant message
* chore: update webui build output
* fix: Proper tab for each stage of prompt processing/generation
* chore: update webui build output
* fix: Improved ETA calculation & display logic
* chore: update webui build output
* feat: Simplify logic & remove ETA from prompt progress
* chore: update webui build output
---------
Co-authored-by: Aleksander Grygier <aleksander.grygier@gmail.com>


Close #17079
Integrates the existing backend `return_progress` feature into the WebUI to show real-time token-processing statistics during both the prompt preprocessing and generation phases.
Key Features
Implementation
- `useProcessingState` hook: extended with `getLiveProcessingStats()` and `getLiveGenerationStats()` methods, plus ETA countdown logic
- `ChatMessageStatistics` component: enhanced with `isLive` and `isProcessingPrompt` props for streaming mode
- `ChatMessageAssistant` component: uses the unified statistics component during the loading phase
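For illustration, a rough sketch of how a component might consume such a hook; the type shapes and return values below are assumptions for this example, not the actual WebUI API:

```ts
// Hypothetical usage sketch (not the actual WebUI API): minimal shapes for
// the hook described above and how a component could read live stats while
// a response is streaming.
interface LivePromptStats { percent: number; processedTokens: number; totalTokens: number }
interface LiveGenerationStats { tokens: number; tokensPerSecond: number }
interface ProcessingState {
  getLiveProcessingStats(): LivePromptStats | null;
  getLiveGenerationStats(): LiveGenerationStats | null;
}

function renderLiveStats(state: ProcessingState, isStreaming: boolean): string {
  if (!isStreaming) return '';

  const prompt = state.getLiveProcessingStats();
  if (prompt) {
    // Prompt preprocessing phase: show how much of the prompt has been processed.
    return `Processing prompt… ${prompt.percent.toFixed(0)}% (${prompt.processedTokens}/${prompt.totalTokens})`;
  }

  const gen = state.getLiveGenerationStats();
  if (gen) {
    // Generation phase: tokens emitted so far and current speed.
    return `${gen.tokens} tokens (${gen.tokensPerSecond.toFixed(1)} tok/s)`;
  }

  return '';
}
```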
Demo
demo.mp4