# tasks.yaml
tasks:
  - id: T1
    title: "Initialize repo with CI/CD"
    assignee: "devops"
    dependencies: []
    deliverables:
      - GitHub repo
      - GitHub Actions pipeline
    acceptance_criteria: "Push triggers build + unit tests."

  - id: T2
    title: "Set up Postgres with PGVector"
    assignee: "backend"
    dependencies: [T1]
    deliverables:
      - Docker Compose service for Postgres
      - PGVector extension installed
    acceptance_criteria: "Embedding inserted and retrieved via SQL."

  - id: T3
    title: "Deploy MinIO instance"
    assignee: "devops"
    dependencies: [T1]
    deliverables:
      - MinIO service in Docker Compose
      - SDK integration in backend
    acceptance_criteria: "File upload/download tested with SDK."

  - id: T4
    title: "Integrate Unstructured.io for PDF & Word"
    assignee: "backend"
    dependencies: [T2, T3]
    deliverables:
      - Parsing functions for PDF, DOCX
    acceptance_criteria: "Parsed sample doc stored as JSON in Postgres."

  - id: T5
    title: "Implement PlantUML/Draw.io/Mermaid parsers"
    assignee: "backend"
    dependencies: [T4]
    deliverables:
      - Parser scripts/utilities
    acceptance_criteria: "Diagram file produces structured JSON nodes."

  - id: T6
    title: "Design JSON schema for structured data"
    assignee: "architect"
    dependencies: [T4, T5]
    deliverables:
      - JSON schema spec
    acceptance_criteria: "Schema validated against multiple sample docs."

  - id: T7
    title: "Implement embedding pipeline with context-preserving chunking"
    assignee: "ml-engineer"
    dependencies: [T6]
    deliverables:
      - Chunking strategy (sliding window + late chunking)
      - Embedding storage in PGVector
    acceptance_criteria: "Chunks embed successfully; semantic queries retrieve expected context."

  - id: T8
    title: "Integrate LangChain DeepAgent"
    assignee: "ml-engineer"
    dependencies: [T7]
    deliverables:
      - Agent setup with retrieval + generation
    acceptance_criteria: "Question → answer pipeline works with sample doc."

  - id: T9
    title: "Implement LLM-as-judge checks"
    assignee: "ml-engineer"
    dependencies: [T8]
    deliverables:
      - JSON schema validation tool
      - Consistency checker
    acceptance_criteria: "System flags malformed JSON and traceability gaps."

  - id: T10
    title: "Frontend for labeling & review"
    assignee: "frontend-dev"
    dependencies: [T6]
    deliverables:
      - React container + API integration
    acceptance_criteria: "User can label/edit record and save to DB."

  - id: T11
    title: "Dockerize frontend & backend"
    assignee: "devops"
    dependencies: [T10, T8]
    deliverables:
      - Dockerfiles
      - Compose stack
    acceptance_criteria: "Frontend and backend run as containers with API link."

  - id: T12
    title: "Set up logging database"
    assignee: "devops"
    dependencies: [T11]
    deliverables:
      - Logging schema
      - Error pipeline
    acceptance_criteria: "Errors captured and retrievable by query."

  - id: T13
    title: "Deploy to Kubernetes"
    assignee: "devops"
    dependencies: [T12]
    deliverables:
      - K8s manifests
      - Helm chart
    acceptance_criteria: "System deploys in cluster; pods healthy."

  - id: T14
    title: "Integrate multi-LLM providers"
    assignee: "ml-engineer"
    dependencies: [T8, T13]
    deliverables:
      - Config for OpenAI, Anthropic, local LLaMA2
    acceptance_criteria: "Agent runs using at least 3 LLM backends."