This document is the single source of truth for the GitLab Knowledge Graph (Orbit) project.
Project Overview
The GitLab Knowledge Graph (GKG), product name Orbit, is a backend service that builds a property graph from GitLab instance data (SDLC metadata + code structure) and exposes it through a JSON-based Cypher-like DSL compiled to ClickHouse SQL. It provides a unified context API for AI systems (via MCP) and human users, and queryable APIs for data products.
GA Target: GitLab.com end of April 2026 | Dedicated/Self-Managed Q2 FY27
Deployment: Cloud native only (Kubernetes/Helm). No Omnibus packaging for the initial iteration.
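To make the "DSL compiled to ClickHouse SQL" idea in the overview concrete, here is a minimal sketch of compiling a structured query into parameterized SQL. It is not the project's compiler: the `NodeQuery` struct, the `compile` function, and the `nodes_<label>` table naming are all hypothetical, and the sketch covers only a single-label node lookup rather than graph traversals. The only real syntax relied on is ClickHouse's `{name:Type}` query-parameter placeholder.

```rust
/// Hypothetical, already-parsed form of a DSL query: match nodes of one
/// label, filter on string properties, and cap the result count.
pub struct NodeQuery {
    pub label: String,
    pub filters: Vec<(String, String)>, // (property, expected value)
    pub limit: u32,
}

/// True for names that are safe to splice into SQL text as identifiers.
fn is_identifier(s: &str) -> bool {
    let mut chars = s.chars();
    match chars.next() {
        Some(c) if c.is_ascii_alphabetic() || c == '_' => {}
        _ => return false,
    }
    chars.all(|c| c.is_ascii_alphanumeric() || c == '_')
}

/// Compile to SQL text plus named parameters. Filter values never enter the
/// SQL string; they are bound through `{name:Type}` placeholders.
pub fn compile(q: &NodeQuery) -> Result<(String, Vec<(String, String)>), String> {
    if !is_identifier(&q.label) {
        return Err(format!("invalid label: {}", q.label));
    }
    // Assumed convention (not from the source): one table per node label.
    let mut sql = format!("SELECT * FROM nodes_{}", q.label.to_ascii_lowercase());
    let mut params = Vec::new();
    for (i, (prop, value)) in q.filters.iter().enumerate() {
        if !is_identifier(prop) {
            return Err(format!("invalid property: {prop}"));
        }
        let name = format!("p{i}");
        sql.push_str(if i == 0 { " WHERE " } else { " AND " });
        sql.push_str(&format!("{prop} = {{{name}:String}}"));
        params.push((name, value.clone()));
    }
    sql.push_str(&format!(" LIMIT {}", q.limit));
    Ok((sql, params))
}
```

Validating identifiers and binding values as parameters keeps user-supplied DSL input out of the SQL text, which matters for any compiler that turns client queries into database statements.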
GitLab Core -- PostgreSQL (OLTP) and Rails (application server). The source of all SDLC and code data. Handles authentication and authorization for graph queries. Rails proxies repository archive downloads for code indexing.
Data Insights Platform -- Siphon (CDC) streams PostgreSQL logical replication events through NATS JetStream into ClickHouse.
ClickHouse -- Columnar database serving two logical databases on one instance: the datalake (raw CDC rows from Siphon) and the graph database (indexed property graph tables).
Knowledge Graph (Orbit) -- Rust service that transforms datalake rows into a property graph, parses code fetched via the Rails internal API, and serves graph queries over gRPC. Ships as a single binary that can run as indexer, webserver, scheduler, or health-check.
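As an illustration of the single-binary design, the sketch below shows one way a process could select its role at startup. The `Mode` enum, its variant names, the command-line spellings, and the `mode_from_args` helper are assumptions made for this example; the source only states that the binary runs as indexer, webserver, scheduler, and health-check.

```rust
use std::str::FromStr;

/// The four roles named in the component description above.
/// Names and CLI spellings here are illustrative, not the real interface.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Mode {
    Indexer,
    Webserver,
    Scheduler,
    HealthCheck,
}

impl FromStr for Mode {
    type Err = String;

    fn from_str(s: &str) -> Result<Self, Self::Err> {
        match s.trim().to_ascii_lowercase().as_str() {
            "indexer" => Ok(Mode::Indexer),
            "webserver" => Ok(Mode::Webserver),
            "scheduler" => Ok(Mode::Scheduler),
            "health-check" => Ok(Mode::HealthCheck),
            other => Err(format!("unknown mode: {other}")),
        }
    }
}

/// Hypothetical helper: read the mode from argv-style arguments,
/// where the first item is the program name.
pub fn mode_from_args<I, S>(args: I) -> Result<Mode, String>
where
    I: IntoIterator<Item = S>,
    S: AsRef<str>,
{
    let mut it = args.into_iter();
    it.next(); // skip the program name
    match it.next() {
        Some(arg) => arg.as_ref().parse::<Mode>(),
        None => Err("missing mode argument".to_string()),
    }
}
```

In a real entry point this would typically be called as `mode_from_args(std::env::args())`, with each mode dispatching to its own startup routine.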
Note: gitlab-org/rust/knowledge-graph is the old repository for the local client-side knowledge graph, which will be archived. The code graph was taken from that repo and moved into orbit/knowledge-graph.
Zoekt code search indexer (historical context: early KG integration MRs in CNG attempted embedding KG via Zoekt FFI)
Infrastructure (ops.gitlab.net)
These repositories on ops.gitlab.net manage the Kubernetes infrastructure and deployment configs for the GitLab production and staging environments. GKG/Siphon staging infrastructure is configured here.
Terraform modules for GKE clusters, Vault integration, Private Service Connect (PSC) networking for Patroni connectivity, and CI runner signing (KMS HSM via OIDC). Contains 8+ MRs for Siphon PSC setup (Jan-Feb 2026).
docs/dev/e2e-testing.md -- full-stack e2e tests on GKE (GitLab + Siphon + GKG), runs in CI on MRs
Contributors: adding a new language
docs/dev/adding-a-language.md -- step-by-step guide for adding a new language to the v2 code indexer (DSL trait surface, define_languages! macro, fixture registration, common traps)
vm-gitlab-omnibus (n4-standard-8, includes Gitaly + PostgreSQL)
Domain: gitlab.gkg.dev
Secrets: GCP Secret Manager -> External Secrets Operator
GCP sandbox infrastructure (GKE cluster, VMs, networking) is managed via the GCP console and Helm charts. See the Terraform section for IaC references and the server configuration runbook for application config.
Staging (gitlab-helmfiles managed)
Staging is deployed to the analytics-eventsdot-stg environment. All configs live in gitlab-helmfiles:
All Terraform lives in config-mgmt on ops.gitlab.net, managed via Atlantis. No dedicated GKG Terraform project exists -- the sandbox is managed via Helm charts and GCP console.
GKG uses GitLab's consumption-based billing system (CustomersDot, or CDot, and Snowplow). The gkg-billing crate emits orbit_workflow_completion Snowplow events on every successful query and enforces credit quotas via the CDot API.
Billing events: BillingObserver fires on pipeline success, attaching query type, source type, and feature-qualified name. Metrics: gkg.billing.events.{emitted,dropped,rejected}.
Quota gate: QuotaService sends a HEAD request to CDot before query execution for metered source types (mcp, rest). It fails open on CDot errors, so queries proceed when CDot is unavailable. Results are cached with a jittered TTL.
SOX boundary: crates/gkg-server/src/billing_adapter.rs is the single Claims → BillingInputs conversion point. This file plus crates/gkg-billing/ are the entire auditable surface.
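The quota gate behaviour described above (check only metered source types, fail open on upstream errors, cache results with a jittered TTL) can be sketched as follows. This is not the code in crates/gkg-billing: `QuotaGateSketch`, its `allow` method, the cache key, and the `check` closure standing in for the HEAD request to CDot are all hypothetical. Time and the jitter fraction are passed in explicitly so the sketch needs only the standard library.

```rust
use std::collections::HashMap;
use std::time::{Duration, Instant};

/// Source types the document lists as metered.
const METERED_SOURCE_TYPES: [&str; 2] = ["mcp", "rest"];

pub struct QuotaGateSketch {
    base_ttl: Duration,
    max_jitter: Duration,
    // cache key -> (allowed, expires_at); the key format is an assumption
    cache: HashMap<String, (bool, Instant)>,
}

impl QuotaGateSketch {
    pub fn new(base_ttl: Duration, max_jitter: Duration) -> Self {
        Self { base_ttl, max_jitter, cache: HashMap::new() }
    }

    /// Returns true when the query may run.
    /// `check` stands in for the upstream quota request: Ok(true) means
    /// credits remain, Ok(false) means exhausted, Err means the call failed.
    /// `jitter_fraction` in [0, 1] would normally come from a random source.
    pub fn allow<F>(
        &mut self,
        key: &str,
        source_type: &str,
        now: Instant,
        jitter_fraction: f64,
        check: F,
    ) -> bool
    where
        F: FnOnce() -> Result<bool, String>,
    {
        // Non-metered source types bypass the gate entirely.
        if !METERED_SOURCE_TYPES.iter().any(|m| *m == source_type) {
            return true;
        }
        // Serve a cached decision while it is still fresh.
        if let Some(&(allowed, expires_at)) = self.cache.get(key) {
            if now < expires_at {
                return allowed;
            }
        }
        match check() {
            Ok(allowed) => {
                let fraction = if jitter_fraction.is_finite() {
                    jitter_fraction.clamp(0.0, 1.0)
                } else {
                    0.0
                };
                // Jitter spreads expiries so cached entries do not all refresh at once.
                let ttl = self.base_ttl + self.max_jitter.mul_f64(fraction);
                self.cache.insert(key.to_string(), (allowed, now + ttl));
                allowed
            }
            // Fail open: an upstream error must not block the query.
            Err(_) => true,
        }
    }
}
```

Errors are deliberately not cached in this sketch, so the next request retries the upstream check instead of reusing a fail-open decision.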
Orbit, aka the GitLab Knowledge Graph, is a project that aims to provide a unified context API for AI systems and human users. This project has both a local Knowledge Graph for your code and a backend service for the entire SDLC.