193 changes: 193 additions & 0 deletions .ai-ready.yml
@@ -0,0 +1,193 @@
# .ai-ready.yml — Project metadata for AI-assisted development
# https://github.com/nichochar/ai-ready

project:
  name: ferrum
  description: Ruby headless Chrome driver using the Chrome DevTools Protocol (CDP)
  version: 0.17.2
  language: ruby
  min_ruby_version: "3.1"
  license: MIT
  homepage: https://ferrum.rubycdp.com/

architecture:
  style: "Object-oriented library wrapping Chrome DevTools Protocol over WebSocket"
  summary: |
    Ferrum is a pure-Ruby driver for headless Chrome/Chromium. It launches a browser
    process, connects to it via WebSocket using the Chrome DevTools Protocol (CDP),
    and provides a Ruby object model for controlling pages, frames, network, input,
    and screenshots. There is no Selenium dependency; Ferrum talks directly to Chrome.

object_model:
  - name: Browser
    path: lib/ferrum/browser.rb
    role: "Top-level entry point. Spawns a Chrome process, holds the CDP Client, manages Contexts. Delegates most page-level methods to the default page via Forwardable."

  - name: Browser::Process
    path: lib/ferrum/browser/process.rb
    role: "Spawns and manages the Chrome OS process. Parses the WebSocket URL from Chrome's stderr output. Handles process lifecycle (start/stop/kill) and temp user data directories."

  - name: Browser::Options
    path: lib/ferrum/browser/options.rb
    role: "Parses and validates all browser configuration (headless, timeout, proxy, window_size, extensions, etc). Immutable after initialization."

  - name: Client
    path: lib/ferrum/client.rb
    role: "CDP WebSocket client. Sends JSON-RPC commands, receives responses via Concurrent::IVar. Routes CDP events to the Subscriber. Each command gets an incrementing ID."

  - name: SessionClient
    path: lib/ferrum/client.rb
    role: "Wraps Client for a specific CDP session (target). Appends sessionId to all outgoing messages. Used in flatten mode where one WebSocket serves all targets."

  - name: Client::WebSocket
    path: lib/ferrum/client/web_socket.rb
    role: "Low-level WebSocket I/O using websocket-driver gem over a raw TCPSocket. Runs a reader thread that pushes parsed JSON messages into a Queue."

  - name: Client::Subscriber
    path: lib/ferrum/client/subscriber.rb
    role: "Event dispatch system with two priority levels. Priority queue handles Fetch.requestPaused and Fetch.authRequired (network interception). Regular queue handles all other CDP events."

  - name: Contexts
    path: lib/ferrum/contexts.rb
    role: "Manages browser contexts (CDP's isolation mechanism, similar to incognito profiles). Subscribes to Target.* events for target lifecycle management."

  - name: Context
    path: lib/ferrum/context.rb
    role: "A single browser context holding multiple Targets. Creates pages, manages target attachment."

  - name: Target
    path: lib/ferrum/target.rb
    role: "Represents a CDP target (page or iframe). Builds Page instances and SessionClient connections. Used by cuprite to inject custom Page subclasses."

  - name: Page
    path: lib/ferrum/page.rb
    role: "Central class for interacting with a browser tab. Composes Mouse, Keyboard, Headers, Cookies, Network, Downloads, Tracing. Subscribes to CDP events for frame lifecycle and navigation."

  - name: Frame
    path: lib/ferrum/frame.rb
    role: "Represents a document frame in the page tree. Includes DOM (CSS/XPath finders) and Runtime (JS evaluation) modules. Tracks execution context IDs via Concurrent::MVar."

  - name: Frame::DOM
    path: lib/ferrum/frame/dom.rb
    role: "CSS and XPath selectors (at_css, at_xpath, css, xpath), body/title/url accessors, script/style tag injection. All implemented via JS evaluation through Runtime."

  - name: Frame::Runtime
    path: lib/ferrum/frame/runtime.rb
    role: "JavaScript evaluation engine. Calls Runtime.callFunctionOn, handles return value deserialization (primitives, arrays, objects, DOM nodes, cyclic objects). Retries on intermittent context errors."

  - name: Node
    path: lib/ferrum/node.rb
    role: "Represents a DOM element. Provides click, type, focus, scroll, select, attribute access, computed styles. Click uses coordinate-based approach with movement detection."

  - name: Network
    path: lib/ferrum/network.rb
    role: "Network traffic monitoring and interception. Tracks all exchanges (request/response/error triples). Supports blocklist/allowlist, authorization, network condition emulation."

  - name: Network::Exchange
    path: lib/ferrum/network/exchange.rb
    role: "Groups a request, response, and optional error into one traffic entry. Tracks loading state (pending/finished)."

key_patterns:
  - name: "CDP command abstraction"
    description: "All Chrome interaction goes through `command(method, **params)` which maps to CDP JSON-RPC. Page adds wait/slowmo semantics on top."

  - name: "Forwardable delegation chains"
    description: "Browser delegates to Page, Page delegates to Frame. This lets users call `browser.at_css(...)` which flows through: Browser -> default_context -> default_target -> page -> main_frame."

  - name: "Event subscription with on/off"
    description: "CDP events are subscribed via `on('Domain.event') { |params| ... }`. Page provides symbolic shortcuts (:dialog, :request, :auth) that map to specific CDP events."

  - name: "Concurrent data structures everywhere"
    description: "Uses concurrent-ruby extensively: Concurrent::Map for frames/targets/contexts, Concurrent::IVar for pending command responses, Concurrent::MVar for execution context IDs, Concurrent::Hash for thread-safe storage."

  - name: "Two-queue priority subscriber"
    description: "Network interception events (Fetch.*) go to a priority queue processed by a dedicated thread, so they don't get blocked behind regular events."

  - name: "Flatten mode (single WebSocket)"
    description: "By default, one WebSocket connection serves the browser and all pages. SessionClient adds sessionId to route messages to the correct target. This is more efficient than per-page connections."

dependencies:
  runtime:
    - name: websocket-driver
      purpose: "WebSocket protocol implementation for CDP communication"
    - name: concurrent-ruby
      purpose: "Thread-safe data structures (Map, IVar, MVar, Hash, Array) used throughout for concurrent CDP event handling"
    - name: addressable
      purpose: "URI parsing for WebSocket URLs and base_url handling"
    - name: webrick
      purpose: "Used internally for HTTP utilities"
    - name: base64
      purpose: "Encoding/decoding screenshots and PDF output"

  development:
    - name: rspec
      purpose: "Test framework"
    - name: rspec-wait
      purpose: "Async-aware RSpec matchers for browser interaction tests"
    - name: sinatra
      purpose: "Test web application that specs run against"
    - name: puma
      purpose: "Test server for the Sinatra application"
    - name: rubocop
      purpose: "Code linting"
    - name: yard
      purpose: "API documentation generation"
    - name: rbs
      purpose: "Type signatures (sig/ directory)"

testing:
  framework: rspec
  command: "bundle exec rake"
  ci_matrix: "Ruby 3.1, 3.2, 3.3, 3.4, 4.0 on ubuntu-latest"
  structure: |
    Tests are integration-heavy: they launch a real Chrome browser and interact with
    a local Sinatra test app (spec/support/application.rb served by spec/support/server.rb).
    Spec files mirror the lib/ structure: spec/browser_spec.rb, spec/page_spec.rb,
    spec/network_spec.rb, etc. Unit tests live under spec/unit/.

    The shared context "Global helpers" (spec/support/global_helpers.rb) provides
    browser, page, network, and traffic accessors. Each spec group gets its own
    Browser instance via before(:all), and reset is called after each example.

environment_variables:
  - FERRUM_DEFAULT_TIMEOUT: "Override default CDP timeout (default: 5s)"
  - FERRUM_PROCESS_TIMEOUT: "Override browser process startup timeout (default: 10s)"
  - FERRUM_INTERMITTENT_ATTEMPTS: "Retry count for intermittent JS context errors (default: 6)"
  - FERRUM_INTERMITTENT_SLEEP: "Sleep between intermittent retries (default: 0.1s)"
  - FERRUM_NODE_MOVING_WAIT: "Delay for node movement detection (default: 0.01s)"
  - FERRUM_NODE_MOVING_ATTEMPTS: "Max attempts for node stop-moving check (default: 50)"
  - FERRUM_GOTO_WAIT: "Wait after navigation command (default: 0.1s)"
  - FERRUM_NEW_WINDOW_WAIT: "Wait for new window events (default: 0.3s)"
  - FERRUM_DEBUG: "Enable debug logging to stdout"
  - FERRUM_LOGGING_SCREENSHOTS: "Include base64 screenshot data in logs"
  - HEADLESS: "Set to 'false' to run browser visibly"
  - SLOWMO: "Add delay between commands (seconds)"
  - BROWSER_PATH: "Path to Chrome/Chromium binary"
  - CI: "Enables CI-specific logging and screenshot capture on failure"

file_structure:
  lib/ferrum.rb: "Entry point, requires core modules"
  lib/ferrum/browser.rb: "Browser class, the main public API"
  lib/ferrum/browser/: "Browser subsystem (process, options, command, binary detection, xvfb)"
  lib/ferrum/client.rb: "CDP WebSocket client and SessionClient"
  lib/ferrum/client/: "WebSocket and Subscriber internals"
  lib/ferrum/page.rb: "Page class with navigation, events, viewport"
  lib/ferrum/page/: "Page modules (screenshot, screencast, animation, frames, tracing, stream)"
  lib/ferrum/frame.rb: "Frame class"
  lib/ferrum/frame/: "Frame modules (dom finders, JS runtime evaluation)"
  lib/ferrum/node.rb: "DOM node interaction (click, type, attributes)"
  lib/ferrum/network.rb: "Network monitoring and interception"
  lib/ferrum/network/: "Network types (request, response, exchange, error, intercepted_request)"
  lib/ferrum/cookies.rb: "Cookie management"
  lib/ferrum/headers.rb: "HTTP header management"
  lib/ferrum/keyboard.rb: "Keyboard input simulation"
  lib/ferrum/mouse.rb: "Mouse input simulation"
  lib/ferrum/downloads.rb: "File download handling"
  lib/ferrum/proxy.rb: "Per-page proxy support"
  lib/ferrum/dialog.rb: "JavaScript dialog (alert/confirm/prompt) handling"
  lib/ferrum/errors.rb: "Error class hierarchy"
  lib/ferrum/utils/: "Utilities (elapsed time, platform detection, thread spawning, event, retry)"
  sig/: "RBS type signatures"
  spec/: "RSpec tests mirroring lib/ structure"
  spec/support/: "Test infrastructure (Sinatra app, Puma server, helpers)"
  docs/: "Markdown documentation organized by feature"
138 changes: 138 additions & 0 deletions .cursorrules
@@ -0,0 +1,138 @@
# Cursor Rules for Ferrum

## Project context

Ferrum is a Ruby gem that controls headless Chrome/Chromium via the Chrome DevTools Protocol (CDP). It connects directly over WebSocket, with no Selenium and no chromedriver in between. It has ~2k GitHub stars and is used as the backend for the cuprite Capybara driver.

## Architecture at a glance

```
Browser
├── Browser::Process (spawns Chrome, parses WS URL from stderr)
├── Browser::Options (immutable config: headless, timeout, proxy, etc.)
├── Client (WebSocket CDP JSON-RPC: send commands, receive responses)
│   ├── Client::WebSocket (raw TCP + websocket-driver, reader thread)
│   └── Client::Subscriber (two-queue event dispatch: priority for Fetch.*, regular for rest)
├── Contexts (manages browser contexts via Target.* CDP events)
│   └── Context (holds Targets, creates pages)
│       └── Target (CDP target, builds Page + SessionClient)
└── (delegates to) Page
    ├── Frame (DOM finders + JS Runtime evaluation)
    ├── Mouse, Keyboard (input simulation)
    ├── Network (traffic monitoring, interception, Exchange objects)
    ├── Headers, Cookies, Downloads
    ├── Page::Screenshot, Page::Screencast, Page::Animation, Page::Tracing
    └── Node (DOM element: click, type, attributes, computed styles)
```
```

## Code style rules

- Always use `# frozen_string_literal: true` at the top of every Ruby file.
- One class/module per file. File path mirrors the namespace: `Ferrum::Page::Screenshot` -> `lib/ferrum/page/screenshot.rb`.
- Use `extend Forwardable` and `delegate` for method forwarding. The Browser -> Page -> Frame chain relies heavily on this.
- Prefer composition over inheritance. Page composes Mouse, Keyboard, Network, etc. as instance variables.
- Page capabilities are Ruby modules included into Page (Screenshot, Frames, Animation, etc.).
- Keep runtime dependencies minimal. Currently only 5 gems. Do not add new runtime deps without strong justification.
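
The forwarding rule above can be illustrated with a minimal sketch. The `SketchBrowser`, `SketchPage`, and `SketchFrame` classes are hypothetical stand-ins invented for this example, not Ferrum's real classes; only `Forwardable` and its `delegate` method come from the Ruby standard library.

```ruby
require "forwardable"

# Hypothetical stand-ins for illustration only; not Ferrum's classes.
class SketchFrame
  # Pretend finder: returns a description instead of a real DOM node.
  def at_css(selector)
    "node for #{selector}"
  end
end

class SketchPage
  extend Forwardable

  # Forward finder calls to whatever #main_frame returns.
  delegate %i[at_css] => :main_frame

  attr_reader :main_frame

  def initialize
    @main_frame = SketchFrame.new
  end
end

class SketchBrowser
  extend Forwardable

  # Forward the same call one level up, so browser.at_css works too.
  delegate %i[at_css] => :page

  attr_reader :page

  def initialize
    @page = SketchPage.new
  end
end

browser = SketchBrowser.new
puts browser.at_css("#element")
```

Each layer only names the methods it forwards, so removing one entry from a `delegate` list is enough to break the chain for callers at the top.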

## CDP interaction pattern

All Chrome communication follows this pattern:

```ruby
# Raw CDP call
client.command("Domain.method", paramName: value)

# Page-level call with navigation wait semantics
command("Page.navigate", wait: GOTO_WAIT, url: url)

# Page-level call with slowmo support
command("Page.reload", wait: timeout, slowmoable: true)
```

CDP method names are PascalCase domains with camelCase methods: `Page.captureScreenshot`, `Runtime.callFunctionOn`, `DOM.getDocument`, `Network.enable`.

Parameters use camelCase: `nodeId`, `objectId`, `executionContextId`, `browserContextId`.
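
As a rough sketch of what a command looks like on the wire, the example below builds the JSON message for a CDP call with an incrementing `id` and an optional `sessionId`. `SketchCommandBuilder` is a hypothetical helper written for this illustration; it is not Ferrum's `Client`, which also handles sending, timeouts, and responses.

```ruby
require "json"

# Hypothetical helper for illustration; not Ferrum's Client implementation.
class SketchCommandBuilder
  def initialize
    @command_id = 0
    @mutex = Mutex.new
  end

  # Returns a JSON string for one CDP command. Keyword arguments become
  # the CDP `params` object, so they should already be camelCase.
  def build(method, session_id: nil, **params)
    # Guard the counter so concurrent callers never share an id.
    id = @mutex.synchronize { @command_id += 1 }
    message = { id: id, method: method, params: params }
    message[:sessionId] = session_id if session_id
    JSON.generate(message)
  end
end

builder = SketchCommandBuilder.new
puts builder.build("Page.navigate", url: "https://example.com")
puts builder.build("Runtime.evaluate", session_id: "ABC", expression: "1 + 1")
```

The `id` is what lets a response be matched back to the command that caused it, which is why it must be unique per connection.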

## Thread safety rules

Ferrum runs multiple threads (WebSocket reader, two subscriber threads, main thread). Follow these rules:

- Use `Concurrent::Map` instead of `Hash` for any data shared between threads.
- Use `Concurrent::IVar` for one-shot futures (pending CDP responses).
- Use `Concurrent::MVar` for values that get set/cleared repeatedly (execution context IDs).
- Use `Concurrent::Array` for thread-safe ordered collections.
- Never use plain Ruby `Hash`, `Array`, or instance variables for cross-thread data without synchronization.
- Spawn threads via `Utils::Thread.spawn` which names threads and sets `abort_on_exception`.
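
To show the idea behind a one-shot future for a pending response, here is a stdlib-only sketch. `SketchOneShot` is a hypothetical class that only approximates what `Concurrent::IVar` provides; for example, it returns `false` on a second `set` rather than raising. Real code in this project should use the concurrent-ruby classes listed above.

```ruby
# Hypothetical stdlib-only stand-in for a one-shot future; an
# approximation for illustration, not Concurrent::IVar itself.
class SketchOneShot
  def initialize
    @mutex = Mutex.new
    @condition = ConditionVariable.new
    @set = false
    @value = nil
  end

  # Stores the value once. Later calls are ignored and return false.
  def set(value)
    @mutex.synchronize do
      return false if @set

      @value = value
      @set = true
      @condition.broadcast
    end
    true
  end

  # Blocks until a value arrives or `timeout` seconds elapse.
  # Returns nil on timeout.
  def value(timeout)
    deadline = Process.clock_gettime(Process::CLOCK_MONOTONIC) + timeout
    @mutex.synchronize do
      until @set
        remaining = deadline - Process.clock_gettime(Process::CLOCK_MONOTONIC)
        return nil if remaining <= 0

        @condition.wait(@mutex, remaining)
      end
      @value
    end
  end
end

pending = SketchOneShot.new
# Simulates a reader thread delivering the response for command id 1.
reader = Thread.new { pending.set({ "id" => 1, "result" => {} }) }
response = pending.value(1)
reader.join
puts response.inspect
```

The waiting side blocks on a condition variable with a deadline instead of polling, which is the same reason the rules above steer cross-thread hand-offs toward `IVar` and `MVar`.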

## Testing patterns

```ruby
# Tests use a real Chrome browser + local Sinatra app
# Available helpers from "Global helpers" shared context:
browser # Ferrum::Browser instance
page # Creates a new page via browser.create_page
network # page.network shortcut
traffic # Filtered network traffic (excludes chrome-error:// URLs)
server # Test server instance
base_url # Test server URL

# Navigation to test views:
page.go_to("/view_name") # loads spec/support/views/view_name.erb

# Common patterns:
page.at_css("#element") # find single element
page.css(".elements") # find multiple elements
page.evaluate("js code") # evaluate JavaScript
node.click # click element
node.text # get text content
```

- Add HTML fixtures as ERB files in `spec/support/views/`
- Add routes in `spec/support/application.rb` (Sinatra)
- Run tests: `bundle exec rake`
- CI matrix: Ruby 3.1, 3.2, 3.3, 3.4, 4.0

## Error handling

Map CDP errors to Ferrum error classes defined in `lib/ferrum/errors.rb`:

| CDP error message | Ruby exception |
|---|---|
| "No node with given id found" | `NodeNotFoundError` |
| "Cannot find context with specified id" | `NoExecutionContextError` |
| "No target with given id found" | `NoSuchPageError` |
| "Could not compute content quads" | `CoordinatesNotFoundError` |
| Timeout waiting for response | `TimeoutError` |
| WebSocket closed | `DeadBrowserError` |

Use `Utils::Attempt.with_retry` for transient errors during page loads (NodeNotFoundError, NoExecutionContextError).
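
The mapping and retry described above could be sketched as follows. Everything under the `Sketch` module is hypothetical and written for this example: the error classes only mirror the names in the table, and `Sketch.with_retry` is a simplified stand-in whose keyword arguments are an assumption, not a copy of `Utils::Attempt.with_retry`.

```ruby
# Hypothetical module for illustration; not Ferrum's error hierarchy.
module Sketch
  class Error < StandardError; end
  class NodeNotFoundError < Error; end
  class NoExecutionContextError < Error; end
  class NoSuchPageError < Error; end
  class CoordinatesNotFoundError < Error; end

  # CDP error message => exception class, following the table above.
  ERRORS_BY_MESSAGE = {
    "No node with given id found" => NodeNotFoundError,
    "Cannot find context with specified id" => NoExecutionContextError,
    "No target with given id found" => NoSuchPageError,
    "Could not compute content quads" => CoordinatesNotFoundError
  }.freeze

  # Builds an exception for a CDP error message, falling back to Error.
  def self.error_for(message)
    ERRORS_BY_MESSAGE.fetch(message, Error).new(message)
  end

  # Re-runs the block when one of `errors` is raised, up to `max`
  # attempts in total, pausing `wait` seconds between attempts.
  def self.with_retry(errors:, max:, wait:)
    attempts = 0
    begin
      attempts += 1
      yield
    rescue *errors
      raise if attempts >= max

      sleep(wait) # back-off between attempts, not thread synchronization
      retry
    end
  end
end

calls = 0
result = Sketch.with_retry(errors: [Sketch::NodeNotFoundError], max: 6, wait: 0.01) do
  calls += 1
  raise Sketch.error_for("No node with given id found") if calls < 3

  "found after #{calls} attempts"
end
puts result
```

Errors outside the `errors` list propagate immediately, so only the failures known to be transient are retried.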

## JS evaluation internals

All JavaScript runs through `Frame::Runtime#call` -> `Runtime.callFunctionOn`:

1. Expression is wrapped in a function: `"function() { return <expr> }"`
2. Called on an execution context (frame) or a specific object (node)
3. Return values deserialized by `handle_response`: primitives, arrays, objects, nodes, dates, cyclic objects
4. Node references come back as `Node` instances (via DOM.requestNode + DOM.describeNode)
5. Retries on `NodeNotFoundError` and `NoExecutionContextError` (up to INTERMITTENT_ATTEMPTS times)
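
Steps 1 and 2 can be sketched as building the parameters for `Runtime.callFunctionOn`. The helper name `sketch_call_function_params` is hypothetical, and the choice of `returnByValue: true` is an assumption made to keep the example small; the real code decides per call how values are returned.

```ruby
# Hypothetical helper for illustration; not Ferrum's Frame::Runtime.
# Wraps a JS expression in a function and targets an execution context.
def sketch_call_function_params(expression, execution_context_id:)
  {
    functionDeclaration: "function() { return #{expression} }",
    executionContextId: execution_context_id,
    returnByValue: true # assumption: ask Chrome to serialize the result
  }
end

params = sketch_call_function_params("document.title", execution_context_id: 7)
puts params[:functionDeclaration]
```

Calling on a specific node instead of a frame would pass `objectId` in place of `executionContextId`, which is the distinction step 2 describes.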

## Environment variables

| Variable | Purpose |
|---|---|
| `FERRUM_DEFAULT_TIMEOUT` | CDP command timeout (default: 5s) |
| `FERRUM_PROCESS_TIMEOUT` | Chrome startup timeout (default: 10s) |
| `FERRUM_DEBUG` | Enable debug logging |
| `BROWSER_PATH` | Chrome binary path |
| `HEADLESS` | "false" for visible browser |
| `SLOWMO` | Delay between commands |
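
A minimal sketch of reading these variables with their documented defaults is shown below. The helper `sketch_env_float` and the `env:` parameter are hypothetical and exist only so the example can be exercised with a plain Hash; this is not how Ferrum itself parses its configuration.

```ruby
# Hypothetical helper for illustration. `env` defaults to ENV but accepts
# any object that responds to #[] with String keys, such as a Hash.
def sketch_env_float(name, default, env: ENV)
  value = env[name]
  (value.nil? || value.empty?) ? default : Float(value)
end

default_timeout = sketch_env_float("FERRUM_DEFAULT_TIMEOUT", 5.0)
process_timeout = sketch_env_float("FERRUM_PROCESS_TIMEOUT", 10.0)
# HEADLESS is only treated as off when it is exactly the string "false".
headless = ENV["HEADLESS"] != "false"

puts "timeout=#{default_timeout} process_timeout=#{process_timeout} headless=#{headless}"
```

Using `Float(value)` rather than `to_f` makes a malformed value raise an `ArgumentError` instead of silently becoming `0.0`.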

## Things to avoid

- Do not add Selenium or WebDriver dependencies.
- Do not use `sleep` for synchronization. Use Event, IVar, or MVar.
- Do not break the Forwardable delegation chain (users call `browser.at_css` and expect it to work).
- Do not cache node IDs across navigations (they become stale).
- Do not add global mutable state. Configuration is immutable after Browser initialization.
- Do not introduce per-page WebSocket connections unless explicitly needed (flatten mode is the default).