Agentic Browser Proxy

An autonomous AI-powered browser agent that can perform tasks on websites using natural language goals. Built with Playwright and integrated with LLM backends for intelligent decision-making.

Features

30+ browser actions - click, type, scroll, extract data, fill forms, and more
Human-like behavior - realistic typing, random delays, error recovery
Skill system - pre-configured automation recipes for specific websites
Profile management - multiple browser profiles for parallel sessions
Duplicate tracking - avoid repeating already completed tasks
Persistent memory - learns from past sessions on each domain

Prerequisites

Python 3.10+
Chrome/Chromium browser
Windows, macOS, or Linux

Quick Start

1. Install Dependencies

pip install -r requirements.txt

2. Start the LLM Proxy

# Terminal 1 - Start the proxy server
python glm_proxy.py

3. Run the Agent

# Terminal 2 - Run the agent
python run_agent.py "https://google.com" "search for python tutorials" 50

Usage

Generic Mode

For general website automation without a skill:

python run_agent.py "<url>" "<goal>" [max_steps]

Example:

python run_agent.py "https://amazon.com" "find shoes under $100" 50

Skill Mode

For websites with pre-configured skills:

python run_agent.py --skill <skill_name> "<goal>" [max_steps]

Examples:

python run_agent.py --skill quora "answer 5 questions about python" 0
python run_agent.py --skill twitter "post a tweet about AI" 0

Using 0 for max_steps enables unlimited running until the skill's rules are met.

Skill System

The .yaser/ directory contains skill configuration files for specific websites. Each skill defines:

Profiles - Multiple browser profiles for parallel sessions
Rules - Answer limits, wait times, profile switching behavior
Selectors - Custom CSS selectors for site-specific elements
Instructions - AI instructions for that particular website

Creating a Skill

Create a .yaser/<site>.md file with:

site: quora.com
start_url: https://www.quora.com

profiles:
  - name: profile1
    cookies: []
  - name: profile2
    cookies: []

rules:
  answers_per_session: 5
  wait_between_posts_seconds: 30
  switch_profile_after: 5
  ask_user_to_login: true

tracking:
  completed_file: completed_quora.txt

selectors:
  answer_button: "a.q-text"
  submit_button: "button.q-text"

instructions: |
  Focus on unanswered questions with high views.
  Always be helpful and provide detailed answers.

Available Actions

The agent can perform these actions:

Action	Description
`click`	Click an element by index
`type`	Type text into an input
`select_option`	Select from dropdown
`scroll`	Scroll the page
`goto`	Navigate to URL
`extract`	Extract page content
`wait`	Wait for specified seconds
`goal_completed`	Mark task as done
`goal_failed`	Mark task as failed

Skill Actions (when skill loaded)

Action	Description
`check_duplicate`	Check if URL already completed
`mark_completed_item`	Mark URL as completed
`switch_profile`	Switch to next browser profile

Configuration

Edit agentic_browser_v2/config.py to customize:

MAX_STEPS - Default max steps per session
LONG_RUN_MODE - Enable unlimited steps
VISION_FAILURE_THRESHOLD - Screenshot capture threshold
YASER_DIR - Skill files directory

Project Structure

.
├── agentic_browser_v2/     # Core agent modules
│   ├── agent.py           # Main agent logic
│   ├── actions.py         # Browser actions
│   ├── prompts.py         # LLM prompts
│   ├── skill_loader.py    # Skill system
│   ├── profile_manager.py # Profile management
│   └── duplicate_tracker.py
├── glm_proxy.py           # LLM proxy server
├── proxy_api.py           # REST API server
├── run_agent.py           # Agent launcher
├── .yaser/                # Skill configurations
└── requirements.txt       # Python dependencies

License

MIT

Name		Name	Last commit message	Last commit date
Latest commit History 5 Commits
agentic_browser_v2		agentic_browser_v2
.gitignore		.gitignore
README.md		README.md
SKILL.md		SKILL.md
glm_proxy.py		glm_proxy.py
launch.py		launch.py
proxy_api.py		proxy_api.py
run_agent.py		run_agent.py
test_cursor_movement.py		test_cursor_movement.py
twitter-agent.bat		twitter-agent.bat
twitter-agent.sh		twitter-agent.sh

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Agentic Browser Proxy

Features

Prerequisites

Quick Start

1. Install Dependencies

2. Start the LLM Proxy

3. Run the Agent

Usage

Generic Mode

Skill Mode

Skill System

Creating a Skill

Available Actions

Skill Actions (when skill loaded)

Configuration

Project Structure

License

About

Uh oh!

Releases

Packages

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

Agentic Browser Proxy

Features

Prerequisites

Quick Start

1. Install Dependencies

2. Start the LLM Proxy

3. Run the Agent

Usage

Generic Mode

Skill Mode

Skill System

Creating a Skill

Available Actions

Skill Actions (when skill loaded)

Configuration

Project Structure

License

About

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages