task(content analytics) #34710 : Fix naming and reference inconsistencies for Engagement Dashboard #34711
Open
jcastro-dotcms wants to merge 4 commits into main
Summary
Fixes naming inconsistencies and several correctness issues in the Engagement Dashboard analytics infrastructure (ClickHouse schema and CubeJS cube definitions).
CubeJS schema changes:
- Renamed `contextSiteId` → `siteId` across all four cube definitions (`EngagementDaily`, `SessionsByBrowserDaily`, `SessionsByDeviceDaily`, `SessionsByLanguageDaily`) to align with the underlying `context_site_id` column convention

ClickHouse schema fixes:
- Changed the `utc_time` column type from `DateTime` to `DateTime64(3, 'UTC')` to support millisecond-precision timestamps
- Fixed the `session_states_mv` aggregate functions: replaced `_timestamp` with `utc_time` across all `minState`, `maxState`, and `argMaxState` calls so session time windows and dimension states are correctly computed
- Removed the plain `context_site_id String` column from `session_states`; `context_site_id` is now exclusively tracked via its `argMax` aggregate state and properly materialized via `argMaxMerge(context_site_id_state)` in `session_facts_rmv`
- Changed `ORDER BY` from `(customer_id, cluster_id, sessionid, context_site_id)` to `(customer_id, cluster_id, sessionid)`: site is a derived dimension, not part of the sorting key
- Extended the `session_facts_rmv` rolling finalization window from 72 hours to 5 days to better handle late-arriving events
- Changed `now()` to `now64(3, 'UTC')` in `session_facts_rmv` for consistent timestamp precision
- Fixed the `content_presents_in_conversion_mv` syntax: `REFRESH EVERY 15 MINUTE` and `APPEND TO` are now on separate lines for correctness
- Added `REFRESH EVERY 30 SECOND` hints in `session_facts_rmv` and `content_presents_in_conversion_mv` for local development iteration

Test plan
- Session time windows (`min_ts`, `max_ts`) and dimension states (`context_site_id`, `user_agent`, `device_category`, `browser_family`, `language_id`) are populated correctly in `session_states` and `session_facts`
- `engagement_daily` and `sessions_by_*_daily` roll-ups refresh on schedule and return correctly aggregated data

Fixes #34710
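
As a rough illustration of the first test-plan check, the ClickHouse SQL below builds a simplified stand-in for the state table and reads back the merged values. It is a sketch under assumptions, not the schema from this PR: the `*_sketch` table and view names, the `min_ts_state` and `max_ts_state` column names, and the `String` types for the ID columns are hypothetical. Only `utc_time`, `context_site_id_state`, `min_ts`, `max_ts`, the key columns, and the aggregate functions come from the description above.

```sql
-- Hypothetical source table; stands in for wherever raw events land.
CREATE TABLE events_sketch
(
    customer_id     String,
    cluster_id      String,
    sessionid       String,
    context_site_id String,
    utc_time        DateTime64(3, 'UTC')  -- millisecond precision
)
ENGINE = MergeTree
ORDER BY (customer_id, cluster_id, sessionid, utc_time);

-- Simplified state table: site is kept only as an argMax state,
-- so it is not part of the sorting key.
CREATE TABLE session_states_sketch
(
    customer_id           String,
    cluster_id            String,
    sessionid             String,
    min_ts_state          AggregateFunction(min, DateTime64(3, 'UTC')),
    max_ts_state          AggregateFunction(max, DateTime64(3, 'UTC')),
    context_site_id_state AggregateFunction(argMax, String, DateTime64(3, 'UTC'))
)
ENGINE = AggregatingMergeTree
ORDER BY (customer_id, cluster_id, sessionid);

-- All states are keyed on utc_time, so the time window and the
-- "latest site" dimension are derived from the same clock.
CREATE MATERIALIZED VIEW session_states_mv_sketch TO session_states_sketch AS
SELECT
    customer_id,
    cluster_id,
    sessionid,
    minState(utc_time)                     AS min_ts_state,
    maxState(utc_time)                     AS max_ts_state,
    argMaxState(context_site_id, utc_time) AS context_site_id_state
FROM events_sketch
GROUP BY customer_id, cluster_id, sessionid;

-- Spot check: merge the states and inspect recent sessions.
SELECT
    customer_id,
    cluster_id,
    sessionid,
    minMerge(min_ts_state)             AS min_ts,
    maxMerge(max_ts_state)             AS max_ts,
    argMaxMerge(context_site_id_state) AS context_site_id
FROM session_states_sketch
GROUP BY customer_id, cluster_id, sessionid
HAVING max_ts >= now64(3, 'UTC') - INTERVAL 5 DAY
ORDER BY max_ts DESC
LIMIT 20;
```

The 5-day filter only mirrors the window size mentioned in the summary; how `session_facts_rmv` actually applies that window is not shown in this description. A session passes the check when `min_ts <= max_ts` and `context_site_id` matches the site of its most recent event.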