Simplify CLI: Accept URLs directly for indexing #12
base: main
- Parse GitHub URLs (owner/repo, tree/branch, commit/sha)
- Parse GitLab URLs (project path, subgroups, /-/tree/branch)
- Parse Bitbucket URLs (workspace/repo, src/branch, branch/name)
- Fallback to website source for unknown URLs
- Extract default index names from URLs
- Support self-hosted GitLab and Bitbucket instances
- Export parseSourceUrl and ParsedUrl from @augmentcode/context-connectors/core
- Add comprehensive unit tests (19 test cases)

Agent-Id: agent-8394bd07-7a81-41d0-ac95-1ca62623e6fb
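To make the GitHub case from the list above concrete, here is a minimal sketch of how such a URL might be split into owner, repo, and ref. It is not the PR's implementation: the `GitHubUrlSketch` type and `parseGitHubUrlSketch` function are hypothetical names, and the real `ParsedUrl` shape may differ.

```typescript
// Hypothetical result shape for illustration; the PR's ParsedUrl type may differ.
interface GitHubUrlSketch {
  owner: string;
  repo: string;
  ref: string | undefined; // branch name or commit SHA, when present in the URL
  defaultIndexName: string;
}

// Hypothetical function; handles github.com only and returns null otherwise.
function parseGitHubUrlSketch(input: string): GitHubUrlSketch | null {
  let url: URL;
  try {
    url = new URL(input);
  } catch {
    return null; // not a valid absolute URL
  }
  if (url.hostname !== "github.com") return null;

  const parts = url.pathname.split("/").filter(Boolean);
  const owner = parts[0];
  const repo = parts[1];
  if (!owner || !repo) return null;

  // /owner/repo/tree/<branch> or /owner/repo/commit/<sha>
  let ref: string | undefined;
  if (parts[2] === "tree" && parts.length > 3) {
    // Assumption: everything after "tree" is the branch name. A branch that
    // contains "/" cannot be told apart from a sub-path without an API call.
    ref = parts.slice(3).join("/");
  } else if (parts[2] === "commit" && parts.length > 3) {
    ref = parts[3];
  }

  // Assumption: the default index name is simply the repo name.
  return { owner, repo, ref, defaultIndexName: repo };
}
```

Under these assumptions, `parseGitHubUrlSketch("https://github.com/owner/repo/tree/main")` yields owner `owner`, repo `repo`, and ref `main`, while a non-GitHub URL yields `null` so a caller could fall through to another source type.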
Adds a command that auto-detects the source type (GitHub, GitLab, Bitbucket, or website) from the URL and creates the appropriate source.

Features:
- Parses URL using parseSourceUrl() to determine source type
- Supports --ref option to override URL-detected branch/tag
- Supports -i/--index option to override default index name
- Supports --store and --store-path options
- Default index name derived from repo/project name
- Graceful error handling for invalid URLs

Agent-Id: agent-c9423996-94bb-4ab3-8311-ca0cc822da14
Adds pre-parse argument rewriting that auto-detects when a URL is passed directly to 'ctxc index' and transparently inserts the 'url' subcommand.

Before: ctxc index url https://github.com/owner/repo
After: ctxc index https://github.com/owner/repo

Both syntaxes now work. Existing subcommands (github, gitlab, etc.) are unchanged and continue to work.

Agent-Id: agent-ce81a04d-72f2-4289-8eb7-c3074d7d8030
🤖 Augment PR Summary
Summary: This PR simplifies the indexing CLI by allowing users to pass a repository/website URL directly and having the tool auto-detect the source type.
Changes:
Technical Notes: Self-hosted GitLab/Bitbucket URLs are supported via
    }

    const owner = pathParts[0];
    const repo = pathParts[1];
URLs pasted from clone endpoints often end with .git; currently repo/projectName/defaultIndexName will include the suffix, which likely breaks GitHub/GitLab/Bitbucket API paths and yields odd default index keys. Consider stripping a trailing .git from repo/project segments when parsing.
Other Locations
- src/core/url-parser.ts:125
- src/core/url-parser.ts:152
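One way the suggested `.git` stripping could look is sketched below. `stripGitSuffix` is a hypothetical helper, not code from the PR, and the sample URL is only illustrative.

```typescript
// Hypothetical helper, not taken from the PR: removes one trailing ".git"
// from a repository or project path segment and leaves other names alone.
function stripGitSuffix(segment: string): string {
  return segment.endsWith(".git") ? segment.slice(0, -".git".length) : segment;
}

// Illustrative use on a clone-style URL.
const pathParts = new URL("https://github.com/owner/repo.git").pathname
  .split("/")
  .filter(Boolean);
const owner = pathParts[0];
const repo = stripGitSuffix(pathParts[1] ?? "");
```

Applying the helper once, at the point where the repo or project segment is read, would keep API paths and default index names free of the suffix at all three reported locations.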
    }

    // GitLab (gitlab.com or hostname contains "gitlab")
    if (hostname === "gitlab.com" || hostname.includes("gitlab")) {
Using hostname.includes("gitlab") / hostname.includes("bitbucket") can misclassify unrelated websites whose domain happens to contain those substrings and then fail parsing instead of falling back to website. Consider a more conservative self-hosted detection strategy to reduce false positives.
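As one possible shape for a more conservative check, the sketch below matches whole DNS labels instead of substrings. `hasHostLabel`, `isGitLabHost`, and `isBitbucketHost` are hypothetical names, and label matching is only one of several strategies that could be used here.

```typescript
// Hypothetical sketch: a host counts as self-hosted GitLab/Bitbucket only when
// one of its dot-separated labels is exactly "gitlab" or "bitbucket".
function hasHostLabel(hostname: string, label: string): boolean {
  return hostname
    .toLowerCase()
    .split(".")
    .some((part) => part === label);
}

function isGitLabHost(hostname: string): boolean {
  return hostname === "gitlab.com" || hasHostLabel(hostname, "gitlab");
}

function isBitbucketHost(hostname: string): boolean {
  return hostname === "bitbucket.org" || hasHostLabel(hostname, "bitbucket");
}
```

With this rule `gitlab.example.com` is still detected, while a domain such as `mygitlabfans.org` is not and can fall back to the website source. It is not free of false positives (any host with a `gitlab` label matches), and the explicit subcommands remain available for hosts the heuristic misses.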
    program.parse();
    // Auto-detect URL mode: ctxc index <url> -> ctxc index url <url>
    // This allows users to skip the 'url' subcommand when providing a URL directly
    const indexIdx = process.argv.indexOf("index");
The URL auto-rewrite only triggers when the URL is the first argument after index, so ctxc index -i myidx https://… won’t be rewritten and will likely error. Is that limitation intentional, or should the rewrite scan forward for the first non-option arg?
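If scanning forward is the desired behavior, it could look roughly like the sketch below. `rewriteIndexArgs` and `VALUE_OPTIONS` are hypothetical, and the set of options assumed to consume a following value is a guess based on the flags named in this PR, not the CLI's actual definition.

```typescript
// Assumption: these options each take a value as the next token.
const VALUE_OPTIONS = new Set(["-i", "--index", "--ref", "--store", "--store-path"]);

// Hypothetical rewrite: find the first positional argument after "index" and
// insert "url" right after "index" when that argument looks like a URL.
function rewriteIndexArgs(argv: string[]): string[] {
  const indexIdx = argv.indexOf("index");
  if (indexIdx === -1) return argv;

  for (let i = indexIdx + 1; i < argv.length; i++) {
    const arg = argv[i] ?? "";
    if (arg.startsWith("-")) {
      if (VALUE_OPTIONS.has(arg)) i++; // skip the option's value as well
      continue;
    }
    if (/^https?:\/\//.test(arg)) {
      return [...argv.slice(0, indexIdx + 1), "url", ...argv.slice(indexIdx + 1)];
    }
    return argv; // first positional is a subcommand such as "github"
  }
  return argv;
}
```

Under these assumptions, `["node", "ctxc", "index", "-i", "myidx", "https://github.com/owner/repo"]` becomes `["node", "ctxc", "index", "url", "-i", "myidx", "https://github.com/owner/repo"]`, while invocations that already name a subcommand pass through unchanged. The trade-off is that the rewrite has to know which options take values, which duplicates information the argument parser already holds.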
Summary
Simplifies the ctxc index command to accept URLs directly, eliminating the need for verbose source-specific flags.

Before
After
Changes
1. URL Parser Module (src/core/url-parser.ts)
- parseSourceUrl() function that auto-detects source type from URLs
- /tree/main
2. CLI URL Mode (src/bin/cmd-index.ts)
- url subcommand: ctxc index url <url>
- (-i)
- --ref to override branch/tag
3. Direct URL Syntax (src/bin/index.ts)
- ctxc index <url> without the url subcommand
- url when a URL is detected

Backward Compatibility
✅ All existing subcommands work unchanged:
ctxc index github --owner augmentcode --repo context-connectors # Still works

Testing
Examples
Pull Request opened by Augment Code with guidance from the PR author