diff --git a/contents/teams/experiments/objectives.mdx b/contents/teams/experiments/objectives.mdx
index 686159b832de..5b2e334241cd 100644
--- a/contents/teams/experiments/objectives.mdx
+++ b/contents/teams/experiments/objectives.mdx
@@ -1,33 +1,26 @@
-### Q4 2025 Objectives
-This quarter we’re doubling down on making experiments more powerful and easier to use – pushing towards AI automation, strengthening the feature flag foundations, and expanding the metrics and tools that help teams learn faster and act with confidence.
+### Q1 2026 Objectives
+This quarter we’re focused on making experiments faster, clearer, and easier to work with by improving query performance and results loading, and by fixing rough edges in how experiments work. We’re also expanding data warehouse and session replay support, and starting to use AI to help teams come up with better experiments.

+#### Query performance
+For large-scale users, experiment queries are timing out or taking many minutes to load. We will build a way to precompute the heavy parts of these queries on a schedule, so that the final computation runs in a reasonable amount of time. This will be enabled for select customers and will be easy to toggle. The goal is to use high-scale customers as a testing ground, so that this solution is in place before we onboard more users at this scale.

-#### AI features
-Motivation: We’re pushing towards more automation, using AI to make experimentation easier to set up, interpret, and act on.
+#### Anonymous -> Logged in experiments: full support
+Last quarter, we implemented support for running reliable experiments across the anonymous-to-authenticated flow, currently available in posthog-js. This quarter, we will extend this support to all libraries and gather user feedback on how well it works in practice.

-* **Integrate Experiments into Tasks (AI)** – Deploy any code change behind an experiment, track its progress, and make a decision based on results.
+#### Experiment phases
+Currently, changing rollout conditions mid-experiment creates a poor experience, with confusing warnings and potentially invalid statistical analysis. We will introduce experiment phases: distinct periods within an experiment where rollout conditions stay the same. This makes it clear when and how an experiment changed over time, and makes the results easier to reason about.

-* **“Analyze results”** AI agent – An AI agent that reviews experiment results, highlights important findings, and explains them in plain language. It can suggest whether to continue the test, stop it early, or roll out a variant. The agent should also call out risks, like low sample size or unusual data patterns, to help teams make better decisions.
+#### Integrate session replay summaries
+PostHog now supports session replay summaries via chat. We will integrate replay summaries into experiments so they help explain and add context to experiment results.

+#### Extend data warehouse support
+We will extend data warehouse experiment support to funnels. We will also improve the overall experience by providing clear guidance on how to run data warehouse experiments, and we will run two data warehouse experiments ourselves to build up hands-on knowledge within the team.

-#### Feature flags foundation
-Motivation: Experiments rely on flags, so we need to make sure the basics are solid and ready to support advanced use cases.
+#### AI generated experiments
+We will help users come up with experiment ideas interactively via chat. We will use existing insights, recordings, and "signals" as input, combine them with user-provided context, and then generate an experiment idea, set up the experiment with metrics, and recommend how to implement it in code.

-* **Anonymous -> Logged in users experiments** – Create a seamless experience for experiments that start with anonymous users but continue when the same user logs in.
+#### Great results loading experience
+We already precompute timeseries results. We will use these timeseries as the main experiment results, so users see freshly calculated results when they open the app, instead of having to refresh manually and wait.

-* **Feature flags integration**
-  - Make it possible to run multiple experiments on the same feature flag
-  - Support boolean flags in experiments
-
-
-#### Metrics & results
-Motivation: Users need the right metrics and analysis options to get meaningful answers from experiments.
-
-* **Retention metric** – Add support for retention as a metric type, covering use cases like “users who return after 7 days”. This is the last major metric type missing from experiments.
-
-* **Metric breakdowns** – Enable users to break down experiment metrics by properties such as country, device, plan type, or any custom property. This allows deeper analysis of how variants perform across different user segments.
-
-* **Running time calculator**
-Build a new calculator with two functions:
-  - Before launch: let users enter their expected traffic and baseline conversion to estimate how long the experiment will take to reach significance.
-  - During the experiment: show how much longer the experiment likely needs based on current progress and live data.
+#### Retire legacy experiments to a read-only mode
+Legacy experiments take up a large part of our codebase. We will cleanly separate them and move them to a read-only mode: still visible for audit and reference, but no longer editable or recalculable. This will reduce the support burden and simplify the codebase.
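
For context on the "Anonymous -> Logged in experiments: full support" objective above, here is a minimal sketch of the flow it refers to, using the existing posthog-js API. The flag key `new-signup-flow`, the distinct ID `user_123`, and the email are placeholders for illustration; the point is only that a visitor can be bucketed into a variant while anonymous and identified later, and the experiment should keep treating both halves of that journey as one participant.

```ts
import posthog from 'posthog-js'

// Standard posthog-js setup; the key and host are placeholders.
posthog.init('<project_api_key>', { api_host: 'https://us.i.posthog.com' })

// 1. The visitor is still anonymous: read the experiment variant once flags are loaded.
posthog.onFeatureFlags(() => {
  const variant = posthog.getFeatureFlag('new-signup-flow') // hypothetical flag key
  if (variant === 'test') {
    // render the test experience for the anonymous visitor
  }
})

// 2. The same person later signs up or logs in: identify them.
//    The objective is that exposures and metrics recorded before and after this call
//    are attributed to the same experiment participant.
posthog.identify('user_123', { email: 'jane@example.com' })
```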
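
Similarly, for the "Experiment phases" objective, one possible shape of the concept (illustrative only, not the actual PostHog data model) is an experiment that owns an ordered list of phases, each with its rollout conditions frozen for its duration:

```ts
// Hypothetical types sketching the "experiment phases" idea; field names are illustrative.
interface ExperimentPhase {
  startedAt: string                // ISO timestamp when this phase began
  endedAt: string | null           // null while the phase is still active
  rolloutPercentage: number        // share of users enrolled while this phase was active
  variants: { key: string; rolloutPercentage: number }[]
}

interface Experiment {
  id: number
  name: string
  phases: ExperimentPhase[]        // results can then be computed and read per phase
}
```

Scoping analysis to a single phase is what would remove the ambiguity that today shows up as confusing warnings when rollout conditions change mid-experiment.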