|
67 | 67 | * when the exploration context is provided, make your suggestion based on the context as well as the original dataset; otherwise leverage the original dataset to suggest questions. |
68 | 68 |
|
69 | 69 | Guidelines for question suggestions: |
70 | | -1. Suggest a list of question_groups of interesting analytical questions that are not obvious that can uncover nontrivial insights, including both breadth and depth questions. |
71 | | - |
| 70 | +1. Suggest a list of question_groups of interesting analytical questions that are not obvious that can uncover nontrivial insights. |
72 | 71 | 2. Use a diverse language style to display the questions (can be questions, statements etc) |
73 | 72 | 3. If there are multiple datasets in a thread, consider relationships between them |
74 | 73 | 4. CONCISENESS: the questions should be concise and to the point |
75 | 74 | 5. QUESTION GROUP GENERATION: |
76 | 75 | - different questions groups should cover different aspects of the data analysis for user to choose from. |
77 | | - - each question_group should include both 'breadth_questions' and 'depth_questions': |
78 | | - - breadth_questions: a group of questions that are all relatively simple that helps the user understand the data in a broad sense. |
79 | | - - depth_questions: a sequence of questions that build on top of each other to answer a specific aspect of the user's goal. |
80 | | - - you have a budget of generating 4 questions in total (or as directed by the user). |
81 | | - - allocate 2-3 questions to 'breadth_questions' and 2-3 questions to 'depth_questions' based on the user's goal and the data. |
82 | | - - each question group should slightly lean towards 'breadth' or 'depth' exploration, but not too much. |
83 | | - - the more focused area can have more questions than the other area. |
| 76 | + - each question_group is a sequence of 'questions' that builds on top of each other to answer the user's goal. |
84 | 77 | - each question group should have a difficulty level (easy / medium / hard), |
85 | 78 | - simple questions should be short -- single sentence exploratory questions |
86 | 79 | - medium questions can be 1-2 sentences exploratory questions |
87 | 80 | - hard questions should introduce some new analysis concept but still make it concise |
88 | 81 | - if suitable, include a group of questions that are related to statistical analysis: forecasting, regression, or clustering. |
89 | 82 | 6. QUESTIONS WITHIN A QUESTION GROUP: |
90 | | - - all questions should be a new question based on the thread of exploration the user provided, do not repeat questions that have already been explored in the thread |
| 83 | + - raise new questions that are related to the user's goal, do not repeat questions that have already been explored in the context provided to you. |
91 | 84 | - if the user provides a start question, suggested questions should be related to the start question. |
92 | | - - when suggesting 'breadth_questions' in a question_group, they should be a group of questions: |
93 | | - - they are related to the user's goal, they should each explore a different aspect of the user's goal in parallel. |
94 | | - - questions should consider different fields, metrics and statistical methods. |
95 | | - - each question within the group should be distinct from each other that they will lead to different insights and visualizations |
96 | | - - when suggesting 'depth_questions' in a question_group, they should be a sequence of questions: |
97 | | - - start of the question should provide an overview of the data in the direction going to be explored, and it will be refined in the subsequent questions. |
98 | | - - they progressively dive deeper into the data, building on top of the previous question. |
99 | | - - each question should be related to the previous question, introducing refined analysis (e.g., updated computation, filtering, different grouping, etc.) |
| 85 | + - the questions should progressively dive deeper into the data, building on top of the previous question. |
| 86 | + - start of the question should provide an overview of the data in the direction going to be explored. |
| 87 | + - followup questions should refine the previous question, introducing refined analysis to deep dive into the data (e.g., updated computation, filtering, different grouping, etc.) |
| 88 | + - don't jump too far from the previous question so that readers can understand the flow of the questions. |
100 | 89 | - every question should be answerable with a visualization. |
101 | 90 | 7. FORMATTING: |
102 | | - - include "breadth_questions" and "depth_questions" in the question group: |
103 | | - - each question group should have 2-3 questions (or as directed by the user). |
| 91 | + - include "questions" in the question group: |
| 92 | + - each question group should have 2-4 questions (or as directed by the user). |
104 | 93 | - For each question group, include a 'goal' that summarizes the goal of the question group. |
105 | 94 | - The goal should all be a short single sentence (<12 words). |
106 | 95 | - Meaning of the 'goal' should be clear that the user won't misunderstand the actual question descibed in 'text'. |
107 | 96 | - It should capture the key computation and exploration direction of the question (do not omit any information that may lead to ambiguity), but also keep it concise. |
108 | 97 | - include the **bold** keywords for the attributes / metrics that are important to the question, especially when the goal mentions fields / metrics in the original dataset (don't have to be exact match) |
109 | 98 | - include 'difficulty' to indicate the difficulty of the question, it should be one of 'easy', 'medium', 'hard' |
110 | | - - a 'focus' field to indicate whether the overall question group leans more on 'breadth' or 'depth' exploration. |
111 | 99 |
|
112 | 100 | Output should be a list of json objects in the following format, each line should be a json object representing a question group, starting with 'data: ': |
113 | 101 |
|
114 | 102 | Format: |
115 | 103 |
|
116 | | -data: {"breadth_questions": [...], "depth_questions": [...], "goal": ..., "difficulty": ..., "focus": "..."} |
117 | | -data: {"breadth_questions": [...], "depth_questions": [...], "goal": ..., "difficulty": ..., "focus": "..."} |
| 104 | +data: {"questions": [...], "goal": ..., "difficulty": ...} |
| 105 | +data: {"questions": [...], "goal": ..., "difficulty": ...} |
118 | 106 | ... // more question groups |
119 | 107 | ''' |
120 | 108 |
|
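The new output format (one `data: {...}` line per question group, with only `questions`, `goal`, and `difficulty`) can be consumed line by line. The following is a minimal sketch of such a consumer, not code from this change: the function name `parse_question_groups` and the decision to silently skip malformed lines are assumptions made for illustration.

```python
import json
from typing import Any, Dict, List

# Keys the new format requires in every question group.
REQUIRED_KEYS = {"questions", "goal", "difficulty"}
# Difficulty levels named in the prompt.
ALLOWED_DIFFICULTY = {"easy", "medium", "hard"}
PREFIX = "data: "


def parse_question_groups(raw: str) -> List[Dict[str, Any]]:
    """Hypothetical helper: extract question groups from model output.

    Lines that lack the 'data: ' prefix, are not valid JSON objects,
    or miss a required key are skipped (an assumed policy).
    """
    groups: List[Dict[str, Any]] = []
    for line in raw.splitlines():
        line = line.strip()
        if not line.startswith(PREFIX):
            continue
        try:
            group = json.loads(line[len(PREFIX):])
        except json.JSONDecodeError:
            continue
        if not isinstance(group, dict):
            continue
        if not REQUIRED_KEYS <= group.keys():
            continue
        if group["difficulty"] not in ALLOWED_DIFFICULTY:
            continue
        if not isinstance(group["questions"], list):
            continue
        groups.append(group)
    return groups


if __name__ == "__main__":
    sample = (
        'data: {"questions": ["Overview of sales by region", '
        '"Which regions grew fastest?"], '
        '"goal": "Compare **sales** growth by **region**", '
        '"difficulty": "easy"}\n'
        "... // more question groups\n"
    )
    for group in parse_question_groups(sample):
        print(group["difficulty"], group["goal"], len(group["questions"]))
```

Skipping bad lines rather than raising keeps a streaming consumer usable when the model emits a partial or malformed group; a stricter caller could raise instead.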
|