Skip to content

junhewk/simple-graph-builder

Repository files navigation

Simple Graph Builder

This plugin builds a lightweight knowledge graph from users' Obsidian notes using LLM-powered entity extraction with a simple yet expressive ontology model to provide knowledge extraction, exploration, and RAG search. Since Obsidian provides wonderful links between notes, implementing ontology model would meet users' (especially researchers') needs.

Graph View

Why Lightweight Ontology?

Traditional knowledge graphs often require complex schemas with dozens of entity and relationship types, making them difficult to maintain and query. Simple Graph Builder takes a different approach:

  • 10 Fixed Entity Types: PERSON, ORGANIZATION, CONCEPT, PROJECT, TOOL, EVENT, PLACE, DOCUMENT, METHOD, TOPIC - covering all common knowledge domains
  • Free-form Relationship Verbs: Express relationships naturally with active verbs like "develops", "uses", "causes", "cites"
  • Detail Property: Each relationship includes a detail field for nuanced descriptions without schema explosion

This design provides structured entity classification with expressive relationships, making it easy to build, query, and maintain your personal knowledge graph.

Features

  • Lightweight Ontology Model: Simple but expressive - 10 fixed entity types + free-form relationship verbs with detail annotations
  • Hybrid Entity Resolution: Multi-stage deduplication pipeline combining fast lookups with embedding similarity and LLM verification (inspired by KGGen [3])
  • Smart Search: AI-powered natural language queries over your knowledge graph with multi-path exploration
  • Entity Extraction: Automatically extract entities from your notes using AI (configurable extraction depth)
  • Internal Link Support: Automatically processes [[wikilinks]] to build note-to-note connections
  • Multiple LLM Support: Works with Claude, OpenAI, Gemini, and Ollama (local)
  • Korean Language Support: Bigram Jaccard similarity for robust Korean text matching (handles particles and spacing variations)
  • Interactive Graph View: Visualize your knowledge graph with fCoSE force-directed layout
  • Large Graph Support: Optimized for thousands of nodes with fast rendering
  • Note Neighborhood Panel: See connections for the current note in a sidebar
  • Manual Entity Merge: Merge duplicate entities via graph view context menu
  • Quick Access: Ribbon icon menu for common actions
  • Status Bar: Real-time graph statistics display

Entity Resolution

A key insight from recent knowledge graph research is that entity resolution is critical for quality knowledge graphs [3]. Without proper deduplication, "AI", "artificial intelligence", and "Artificial Intelligence" appear as separate nodes, fragmenting your knowledge.

Simple Graph Builder uses a hybrid resolution pipeline (opt-in feature):

Stage Method Speed
1. Persistent cache Previously resolved tokens O(1)
2. Session cache Same name resolved this session O(1)
3. Exact name Hash lookup on canonical name O(1)
4. Alias match Hash lookup on stored aliases O(1)
5. Embedding similarity Cosine similarity > 0.90 = auto-merge O(n)
6. LLM verification Ambiguous matches (0.80-0.90) verified by LLM API call
7. Create new No match found -

This approach resolves most entities via fast hash lookups, reserving expensive embedding searches and LLM calls for genuinely ambiguous cases.

Commands

Command Description
Analyze current note Extract entities from the active note
Search related notes Find notes by entity name (exact/fuzzy match)
Smart Search (AI) Natural language search using LLM to explore the graph
Open graph view Show the knowledge graph visualization
Open note neighborhood panel Show current note's connections in sidebar
Remove current note from graph Remove active note from the graph
Clear all graph data Reset the entire graph

Data Model

Entity Types (10 Fixed Types)

The LLM must classify each entity into one of these types:

Type Description Examples
PERSON People, individuals Authors, researchers, team members
ORGANIZATION Companies, institutions Google, MIT, research labs
CONCEPT Ideas, theories, principles Machine learning, API design
PROJECT Projects, products, initiatives Obsidian, GraphRAG
TOOL Software, hardware, instruments Python, VS Code, Docker
EVENT Meetings, conferences, milestones NeurIPS 2024, sprint review
PLACE Locations, venues, geography San Francisco, AWS us-east-1
DOCUMENT Papers, books, articles, notes "Attention Is All You Need"
METHOD Techniques, approaches, workflows Agile, TDD, fine-tuning
TOPIC Subjects, themes, fields, domains NLP, distributed systems

Relationships (Free-form Verbs)

Relationships are expressed as active verbs describing how entities relate:

Verb Examples Meaning
develops, creates, builds Creation, authorship
uses, applies, implements Usage, application
causes, leads to, enables Causality, dependency
contains, includes, has Composition, membership
cites, references, based on Citation, source
relates to, similar to General association

Each relationship also includes an optional detail field for additional context.

UI Elements

Ribbon Icon

Click the graph icon in the left ribbon to access:

  • Analyze current note
  • Open graph view

Status Bar

Shows real-time graph statistics with node counts by label.

Note Neighborhood Panel

A sidebar panel showing:

  • Extracted Nodes: Entities from the current note with entity type badges
  • Connected Nodes: Grouped by entity type (PERSON, CONCEPT, TOOL, etc.)
  • Relationships: Shows relationship verb and detail for each connection
  • Click nodes to see source notes and relationship details

Graph View Context Menu

Right-click a node to:

  • Merge into...: Manually merge duplicate entities (source becomes alias of target)

Settings

API Configuration

  • API Provider: Choose between Claude, OpenAI, Gemini, or Ollama
  • API Key: Your API key (not needed for Ollama)
  • Model: Select or enter a custom model name

Analysis Settings

  • Extraction Mode: Control extraction depth
    • Standard: Max 15 entities per chunk (fast, low cost)
    • Thorough: No limits per chunk (comprehensive extraction)
  • Chunked Processing: Long notes are automatically split into ~500 token chunks and processed in parallel (max 3 concurrent)
  • Auto-analyze on save: Automatically analyze notes when you save them (2-second debounce)
  • Analyze entire vault: Batch analyze all notes with progress tracking and cancellation support

Smart Search Model

You can configure a separate model for Smart Search queries, allowing you to use faster/cheaper models for extraction while using more capable models for search:

  • Use separate model for smart search: Enable to configure a different model
  • Smart search provider: Choose provider (Claude, OpenAI, Gemini, Ollama)
  • Smart search model: Select or enter a custom model name

This is useful for optimizing cost vs. quality - e.g., use GPT-4o-mini for extraction and GPT-4o for search.

Entity Resolution (Opt-in)

Enable embedding-based entity resolution for intelligent deduplication:

  • Enable embeddings: Turn on the hybrid resolution pipeline
  • Embedding provider: OpenAI, Gemini, or Ollama (can differ from main LLM provider)
  • Embedding API key: Separate key for embedding API calls
  • Embedding model:
    • OpenAI: text-embedding-3-small (1536 dims), text-embedding-3-large (3072 dims)
    • Gemini: text-embedding-004 (768 dims)
    • Ollama: nomic-embed-text (768 dims), mxbai-embed-large (1024 dims)
  • High confidence threshold: Auto-merge above this similarity (default: 0.90)
  • Low confidence threshold: LLM verification range floor (default: 0.80)
  • Enable LLM verification: Verify ambiguous matches with LLM calls
  • Compute embeddings: Generate embeddings for existing nodes
  • Clear resolution cache: Reset learned token mappings

View Settings

  • Open graph in main window: Toggle to open the graph visualization in a main tab instead of the right sidebar

Data Management

  • View graph statistics (nodes by entity type, total relationships)
  • Clear all graph data

Installation

From Obsidian Community Plugins

  1. Open Settings → Community plugins
  2. Search for "Simple Graph Builder"
  3. Click Install, then Enable

Using BRAT (Recommended for Beta)

  1. Install BRAT from Community Plugins
  2. Open command palette → "BRAT: Add a beta plugin"
  3. Enter: junhewk/simple-graph-builder
  4. Enable the plugin in Settings → Community plugins

Manual Installation

  1. Download main.js, styles.css, and manifest.json from the latest release
  2. Create folder: VaultFolder/.obsidian/plugins/simple-graph-builder/
  3. Copy the downloaded files into the folder
  4. Reload Obsidian and enable the plugin

Usage

Quick Start

  1. Configure your API key in Settings → Simple Graph Builder
  2. Open a note and run command: Analyze current note
  3. View results with command: Open graph view

Graph View

  • Click a node to highlight its connections
  • Double-click a node to open search with that term
  • Right-click a node to access merge options
  • Hover on edges to see relationship type and detail
  • Click the background to reset highlights
  • Scroll to zoom in/out
  • Drag to pan around the graph

Node colors are determined by entity type (10 predefined colors). Edges use unified gray styling with relationship verbs shown on hover.

Search

Two search modes are available:

Basic Search

  1. Run command: Search related notes
  2. Enter a concept or entity name
  3. Toggle Exact match for precise matching
  4. Click results to navigate to notes

Smart Search (AI)

  1. Run command: Smart Search (AI)
  2. Enter a natural language question (e.g., "What methods did we use for the recommendation project?")
  3. The LLM explores the graph using tool calls, following multiple paths
  4. View the AI-generated answer with relevant nodes and source notes
  5. Click source note links to navigate

Note: Smart Search requires models with tool calling support. Some Ollama models (deepseek-r1:*, gemma3:*) have limited support. Recommended: qwen3:*, gpt-oss:* for Ollama.

API Costs

This plugin makes API calls to extract entities from your notes.

  • Claude, OpenAI, Gemini: Each note analysis and Smart Search query will incur API costs based on your provider's pricing
  • Ollama: Free (runs locally on your machine)

Embedding Costs (if enabled)

  • OpenAI: ~$0.02 per 1M tokens for text-embedding-3-small
  • Gemini: Free tier available for text-embedding-004
  • Ollama: Free (local models like nomic-embed-text)

Consider using Ollama for cost-free operation, or batch analyze during off-peak hours to manage costs.

Privacy

  • Your notes are sent to the configured LLM provider for entity extraction
  • No data is stored externally; all graph data stays in your vault
  • Consider using Ollama for fully local, private processing
  • Embeddings are stored locally in binary format (embeddings.bin)

Technical Background

This plugin's entity resolution approach is inspired by recent advances in knowledge graph construction:

  • LightRAG [1] demonstrated lightweight graph-based RAG but lacks entity resolution
  • Microsoft GraphRAG [2] provides comprehensive extraction but at high cost ($50-100+ per corpus)
  • KGGen [3] introduced the insight that entity resolution is critical for quality knowledge graphs

Simple Graph Builder combines the simplicity of LightRAG with KGGen's hybrid resolution approach, adapted for Obsidian's local-first architecture.

References

[1] Guo, Z., et al. (2024). "LightRAG: Simple and Fast Retrieval-Augmented Generation." https://github.com/HKUDS/LightRAG

[2] Edge, D., et al. (2024). "From Local to Global: A Graph RAG Approach to Query-Focused Summarization." arXiv:2404.16130. https://github.com/microsoft/graphrag

[3] Shu, Y., et al. (2025). "KGGen: Extracting Knowledge Graphs from Plain Text with Language Models." NeurIPS 2025. arXiv:2502.09956. https://github.com/stair-lab/kggen

[4] Neo4j, Inc. (2024). "Neo4j GraphRAG Package for Python." https://neo4j.com/docs/neo4j-graphrag-python/current/

[5] Veen, A. (2024). "pgvector: Open-source vector similarity search for Postgres." https://github.com/pgvector/pgvector

Support

License

MIT License - see LICENSE for details.

About

No description, website, or topics provided.

Resources

License

Stars

Watchers

Forks

Packages

No packages published

Contributors 28