diff --git a/cookbook/company-info/scrapegraph_sdk.ipynb b/cookbook/company-info/scrapegraph_sdk.ipynb
new file mode 100644
index 0000000..cf207c9
--- /dev/null
+++ b/cookbook/company-info/scrapegraph_sdk.ipynb
@@ -0,0 +1,1942 @@
+{
+ "cells": [
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "jEkuKbcRrPcK"
+ },
+ "source": [
+ "## \ud83d\udd77\ufe0f Extract Company Info with Official Scrapegraph SDK\n",
+ "\n",
+ "[View on Alph](https://www.runalph.ai/notebooks/scrapegraphai/scrapegraph-sdk) [Open in Colab](https://colab.research.google.com/drive/12d7LycLAYO2bFsBo_jtPHXSaIg7AqR3O?usp=sharing)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "IzsyDXEWwPVt"
+ },
+ "source": [
+ "### \ud83d\udd27 Install `dependencies`"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "os_vm0MkIxr9"
+ },
+ "outputs": [],
+ "source": [
+ "%%capture\n",
+ "!pip install scrapegraph-py"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "apBsL-L2KzM7"
+ },
+ "source": [
+ "### \ud83d\udd11 Import `ScrapeGraph` API key"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "ol9gQbAFkh9b"
+ },
+ "source": [
+ "You can find the Scrapegraph API key [here](https://dashboard.scrapegraphai.com/)"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "colab": {
+ "base_uri": "https://localhost:8080/"
+ },
+ "executionInfo": {
+ "elapsed": 6877,
+ "status": "ok",
+ "timestamp": 1734532300517,
+ "user": {
+ "displayName": "ScrapeGraphAI",
+ "userId": "10474323355016263615"
+ },
+ "user_tz": -60
+ },
+ "id": "sffqFG2EJ8bI",
+ "outputId": "f6b837cd-0f00-49cc-cb6f-f2bca57544f5"
+ },
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "SGAI_API_KEY not found in environment.\n",
+ "Please enter your SGAI_API_KEY: \u00b7\u00b7\u00b7\u00b7\u00b7\u00b7\u00b7\u00b7\u00b7\u00b7\n",
+ "SGAI_API_KEY has been set in the environment.\n"
+ ]
+ }
+ ],
+ "source": [
+ "import getpass\n",
+ "import os\n",
+ "\n",
+ "if not os.environ.get(\"SGAI_API_KEY\"):\n",
+ " os.environ[\"SGAI_API_KEY\"] = getpass.getpass(\"Scrapegraph API key:\\n\")"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "jnqMB2-xVYQ7"
+ },
+ "source": [
+ "### \ud83d\udcdd Defining an `Output Schema` for Webpage Content Extraction\n"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "VZvxbjfXvbgd"
+ },
+ "source": [
+ "If you already know what you want to extract from a webpage, you can **define an output schema** using **Pydantic**. This schema acts as a \"blueprint\" that tells the AI how to structure the response.\n",
+ "\n",
+ "\n",
+ "**Pydantic Schema Quick Guide**\n",
+ "\n",
+ "**Types of Schemas**\n",
+ "\n",
+ "**1. Simple Schema**\n",
+ "\n",
+ "Use this when you want to extract straightforward information, such as a single piece of content.\n",
+ "\n",
+ "```python\n",
+ "from pydantic import BaseModel, Field\n",
+ "\n",
+ "# Simple schema for a single webpage\n",
+ "class PageInfoSchema(BaseModel):\n",
+ " title: str = Field(description=\"The title of the webpage\")\n",
+ " description: str = Field(description=\"The description of the webpage\")\n",
+ "\n",
+ "# Example Output JSON after AI extraction\n",
+ "{\n",
+ " \"title\": \"ScrapeGraphAI: The Best Content Extraction Tool\",\n",
+ " \"description\": \"ScrapeGraphAI provides powerful tools for structured content extraction from websites.\"\n",
+ "}\n",
+ "```\n",
+ "\n",
+ "**2. Complex Schema (Nested)**\n",
+ "\n",
+ "If you need to extract structured information with multiple related items (like a list of repositories), you can **nest schemas**.\n",
+ "\n",
+ "```python\n",
+ "from pydantic import BaseModel, Field\n",
+ "from typing import List\n",
+ "\n",
+ "# Define a schema for a single repository\n",
+ "class RepositorySchema(BaseModel):\n",
+ " name: str = Field(description=\"Name of the repository (e.g., 'owner/repo')\")\n",
+ " description: str = Field(description=\"Description of the repository\")\n",
+ " stars: int = Field(description=\"Star count of the repository\")\n",
+ " forks: int = Field(description=\"Fork count of the repository\")\n",
+ " today_stars: int = Field(description=\"Stars gained today\")\n",
+ " language: str = Field(description=\"Programming language used\")\n",
+ "\n",
+ "# Define a schema for a list of repositories\n",
+ "class ListRepositoriesSchema(BaseModel):\n",
+ " repositories: List[RepositorySchema] = Field(description=\"List of GitHub trending repositories\")\n",
+ "\n",
+ "# Example Output JSON after AI extraction\n",
+ "{\n",
+ " \"repositories\": [\n",
+ " {\n",
+ " \"name\": \"google-gemini/cookbook\",\n",
+ " \"description\": \"Examples and guides for using the Gemini API\",\n",
+ " \"stars\": 8036,\n",
+ " \"forks\": 1001,\n",
+ " \"today_stars\": 649,\n",
+ " \"language\": \"Jupyter Notebook\"\n",
+ " },\n",
+ " {\n",
+ " \"name\": \"TEN-framework/TEN-Agent\",\n",
+ " \"description\": \"TEN Agent is a conversational AI powered by TEN, integrating Gemini 2.0 Multimodal Live API, OpenAI Realtime API, RTC, and more.\",\n",
+ " \"stars\": 3224,\n",
+ " \"forks\": 311,\n",
+ " \"today_stars\": 361,\n",
+ " \"language\": \"Python\"\n",
+ " }\n",
+ " ]\n",
+ "}\n",
+ "```\n",
+ "\n",
+ "**Key Takeaways**\n",
+ "- **Simple Schema**: Perfect for small, straightforward extractions. \n",
+ "- **Complex Schema**: Use nesting to extract lists or structured data, like \"a list of repositories.\" \n",
+ "\n",
+ "Both approaches give the AI a clear structure to follow, ensuring that the extracted content matches exactly what you need.\n",
+ " \n"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "dlrOEgZk_8V4"
+ },
+ "outputs": [],
+ "source": [
+ "from pydantic import BaseModel, Field\n",
+ "from typing import List\n",
+ "\n",
+ "# Schema for founder information\n",
+ "class FounderSchema(BaseModel):\n",
+ " name: str = Field(description=\"Name of the founder\")\n",
+ " role: str = Field(description=\"Role of the founder in the company\")\n",
+ " linkedin: str = Field(description=\"LinkedIn profile of the founder\")\n",
+ "\n",
+ "# Schema for pricing plans\n",
+ "class PricingPlanSchema(BaseModel):\n",
+ " tier: str = Field(description=\"Name of the pricing tier\")\n",
+ " price: str = Field(description=\"Price of the plan\")\n",
+ " credits: int = Field(description=\"Number of credits included in the plan\")\n",
+ "\n",
+ "# Schema for social links\n",
+ "class SocialLinksSchema(BaseModel):\n",
+ " linkedin: str = Field(description=\"LinkedIn page of the company\")\n",
+ " twitter: str = Field(description=\"Twitter page of the company\")\n",
+ " github: str = Field(description=\"GitHub page of the company\")\n",
+ "\n",
+ "# Schema for company information\n",
+ "class CompanyInfoSchema(BaseModel):\n",
+ " company_name: str = Field(description=\"Name of the company\")\n",
+ " description: str = Field(description=\"Brief description of the company\")\n",
+ " founders: List[FounderSchema] = Field(description=\"List of company founders\")\n",
+ " logo: str = Field(description=\"Logo URL of the company\")\n",
+ " partners: List[str] = Field(description=\"List of company partners\")\n",
+ " pricing_plans: List[PricingPlanSchema] = Field(description=\"Details of pricing plans\")\n",
+ " contact_emails: List[str] = Field(description=\"Contact emails of the company\")\n",
+ " social_links: SocialLinksSchema = Field(description=\"Social links of the company\")\n",
+ " privacy_policy: str = Field(description=\"URL to the privacy policy\")\n",
+ " terms_of_service: str = Field(description=\"URL to the terms of service\")\n",
+ " api_status: str = Field(description=\"API status page URL\")"
+ ]
+ },
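+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "Tip: if a page might not expose some of these fields, it's safer to declare them as `Optional` with a default so Pydantic validation doesn't fail on missing values. A minimal sketch, using the `github` link as an example:\n",
+ "\n",
+ "```python\n",
+ "from typing import Optional\n",
+ "\n",
+ "class SocialLinksSchema(BaseModel):\n",
+ "    linkedin: str = Field(description=\"LinkedIn page of the company\")\n",
+ "    twitter: str = Field(description=\"Twitter page of the company\")\n",
+ "    github: Optional[str] = Field(default=None, description=\"GitHub page of the company\")\n",
+ "```"
+ ]
+ },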
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "cDGH0b2DkY63"
+ },
+ "source": [
+ "### \ud83d\ude80 Initialize `SGAI Client` and start extraction"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "4SLJgXgcob6L"
+ },
+ "source": [
+ "Initialize the client for scraping (there's also an async version [here](https://github.com/ScrapeGraphAI/scrapegraph-sdk/blob/main/scrapegraph-py/examples/async_smartscraper_example.py))"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "PQI25GZvoCSk"
+ },
+ "outputs": [],
+ "source": [
+ "from scrapegraph_py import Client\n",
+ "\n",
+ "# Initialize the client with the API key stored in the environment\n",
+ "sgai_client = Client(api_key=os.environ[\"SGAI_API_KEY\"])"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "M1KSXffZopUD"
+ },
+ "source": [
+ "Here we use the `Smartscraper` service to extract structured data from a webpage using AI.\n",
+ "\n",
+ "> If you already have an HTML file, you can upload it and use `Localscraper` instead.\n"
+ ]
+ },
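+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "For reference, a `Localscraper` call would look roughly like this (a sketch based on the SDK's examples; check the SDK docs for the exact signature):\n",
+ "\n",
+ "```python\n",
+ "with open(\"page.html\") as f:\n",
+ "    html = f.read()\n",
+ "\n",
+ "response = sgai_client.localscraper(\n",
+ "    website_html=html,\n",
+ "    user_prompt=\"Extract info about the company\",\n",
+ "    output_schema=CompanyInfoSchema,\n",
+ ")\n",
+ "```"
+ ]
+ },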
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "2FIKomclLNFx"
+ },
+ "outputs": [],
+ "source": [
+ "# Request structured company information from the landing page\n",
+ "repo_response = sgai_client.smartscraper(\n",
+ " website_url=\"https://scrapegraphai.com/\",\n",
+ " user_prompt=\"Extract info about the company\",\n",
+ " output_schema=CompanyInfoSchema,\n",
+ ")"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "YZz1bqCIpoL8"
+ },
+ "source": [
+ "Print the response"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "colab": {
+ "base_uri": "https://localhost:8080/"
+ },
+ "executionInfo": {
+ "elapsed": 339,
+ "status": "ok",
+ "timestamp": 1734532533318,
+ "user": {
+ "displayName": "ScrapeGraphAI",
+ "userId": "10474323355016263615"
+ },
+ "user_tz": -60
+ },
+ "id": "F1VfD8B4LPc8",
+ "outputId": "8d7b2955-1569-4b3a-8ffe-014a8442dd12"
+ },
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "Request ID: 87a7ea1a-9dd4-4d1d-ae76-b419ead57c11\n",
+ "Company Info:\n",
+ "{\n",
+ " \"company_name\": \"ScrapeGraphAI\",\n",
+ " \"description\": \"ScrapeGraphAI is a powerful AI scraping API designed for efficient web data extraction to power LLM applications and AI agents. It enables developers to perform intelligent AI scraping and extract structured information from websites using advanced AI techniques.\",\n",
+ " \"founders\": [\n",
+ " {\n",
+ " \"name\": \"\",\n",
+ " \"role\": \"Founder & Technical Lead\",\n",
+ " \"linkedin\": \"https://www.linkedin.com/in/perinim/\"\n",
+ " },\n",
+ " {\n",
+ " \"name\": \"Marco Vinciguerra\",\n",
+ " \"role\": \"Founder & Software Engineer\",\n",
+ " \"linkedin\": \"https://www.linkedin.com/in/marco-vinciguerra-7ba365242/\"\n",
+ " },\n",
+ " {\n",
+ " \"name\": \"Lorenzo Padoan\",\n",
+ " \"role\": \"Founder & Product Engineer\",\n",
+ " \"linkedin\": \"https://www.linkedin.com/in/lorenzo-padoan-4521a2154/\"\n",
+ " }\n",
+ " ],\n",
+ " \"logo\": \"https://scrapegraphai.com/images/scrapegraphai_logo.svg\",\n",
+ " \"partners\": [\n",
+ " \"PostHog\",\n",
+ " \"AWS\",\n",
+ " \"NVIDIA\",\n",
+ " \"JinaAI\",\n",
+ " \"DagWorks\",\n",
+ " \"Browserbase\",\n",
+ " \"ScrapeDo\",\n",
+ " \"HackerNews\",\n",
+ " \"Medium\",\n",
+ " \"HackADay\"\n",
+ " ],\n",
+ " \"pricing_plans\": [\n",
+ " {\n",
+ " \"tier\": \"Free\",\n",
+ " \"price\": \"$0\",\n",
+ " \"credits\": 100\n",
+ " },\n",
+ " {\n",
+ " \"tier\": \"Starter\",\n",
+ " \"price\": \"$20/month\",\n",
+ " \"credits\": 5000\n",
+ " },\n",
+ " {\n",
+ " \"tier\": \"Growth\",\n",
+ " \"price\": \"$100/month\",\n",
+ " \"credits\": 40000\n",
+ " },\n",
+ " {\n",
+ " \"tier\": \"Pro\",\n",
+ " \"price\": \"$500/month\",\n",
+ " \"credits\": 250000\n",
+ " }\n",
+ " ],\n",
+ " \"contact_emails\": [\n",
+ " \"contact@scrapegraphai.com\"\n",
+ " ],\n",
+ " \"social_links\": {\n",
+ " \"linkedin\": \"https://www.linkedin.com/company/101881123\",\n",
+ " \"twitter\": \"https://x.com/scrapegraphai\",\n",
+ " \"github\": \"https://github.com/ScrapeGraphAI/Scrapegraph-ai\"\n",
+ " },\n",
+ " \"privacy_policy\": \"https://scrapegraphai.com/privacy\",\n",
+ " \"terms_of_service\": \"https://scrapegraphai.com/terms\",\n",
+ " \"api_status\": \"https://scrapegraphapi.openstatus.dev\"\n",
+ "}\n"
+ ]
+ }
+ ],
+ "source": [
+ "import json\n",
+ "\n",
+ "# Print the response\n",
+ "request_id = repo_response['request_id']\n",
+ "result = repo_response['result']\n",
+ "\n",
+ "print(f\"Request ID: {request_id}\")\n",
+ "print(\"Company Info:\")\n",
+ "print(json.dumps(result, indent=2))"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "2as65QLypwdb"
+ },
+ "source": [
+ "### \ud83d\udcbe Save the output to a `CSV` file"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "HTLVFgbVLLBR"
+ },
+ "source": [
+ "Let's create pandas DataFrames and display the extracted content as tables"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "1lS9O1KOI51y"
+ },
+ "outputs": [],
+ "source": [
+ "import pandas as pd\n",
+ "\n",
+ "# Flatten and save main company information\n",
+ "company_info = {\n",
+ " \"company_name\": result[\"company_name\"],\n",
+ " \"description\": result[\"description\"],\n",
+ " \"logo\": result[\"logo\"],\n",
+ " \"contact_emails\": \", \".join(result[\"contact_emails\"]),\n",
+ " \"privacy_policy\": result[\"privacy_policy\"],\n",
+ " \"terms_of_service\": result[\"terms_of_service\"],\n",
+ " \"api_status\": result[\"api_status\"],\n",
+ " \"linkedin\": result[\"social_links\"][\"linkedin\"],\n",
+ " \"twitter\": result[\"social_links\"][\"twitter\"],\n",
+ " \"github\": result[\"social_links\"].get(\"github\", None)\n",
+ "}\n",
+ "\n",
+ "# Creating dataframes\n",
+ "df_company = pd.DataFrame([company_info])\n",
+ "df_founders = pd.DataFrame(result[\"founders\"])\n",
+ "df_pricing = pd.DataFrame(result[\"pricing_plans\"])\n",
+ "df_partners = pd.DataFrame({\"partner\": result[\"partners\"]})"
+ ]
+ },
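+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "As an alternative to flattening by hand, `pandas.json_normalize` can expand the nested JSON directly (a sketch; nested dictionaries such as `social_links` are flattened, while list fields like `founders` are pulled out with `record_path`):\n",
+ "\n",
+ "```python\n",
+ "df_company_alt = pd.json_normalize(result, sep=\"_\")\n",
+ "df_founders_alt = pd.json_normalize(result, record_path=\"founders\")\n",
+ "```"
+ ]
+ },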
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "JJI9huPkOY9t"
+ },
+ "source": [
+ "Show flattened tables"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "colab": {
+ "base_uri": "https://localhost:8080/",
+ "height": 153
+ },
+ "executionInfo": {
+ "elapsed": 199,
+ "status": "ok",
+ "timestamp": 1734533012061,
+ "user": {
+ "displayName": "ScrapeGraphAI",
+ "userId": "10474323355016263615"
+ },
+ "user_tz": -60
+ },
+ "id": "vZs8ZutKOT63",
+ "outputId": "1278a9b9-2ab8-4150-8d37-328d4eb27e49"
+ },
+ "outputs": [
+ {
+ "data": {
+ "text/plain": [
+ " company_name description \\\n",
+ "0 ScrapeGraphAI ScrapeGraphAI is a powerful AI scraping API de... \n",
+ "\n",
+ " logo \\\n",
+ "0 https://scrapegraphai.com/images/scrapegraphai... \n",
+ "\n",
+ " contact_emails privacy_policy \\\n",
+ "0 contact@scrapegraphai.com https://scrapegraphai.com/privacy \n",
+ "\n",
+ " terms_of_service api_status \\\n",
+ "0 https://scrapegraphai.com/terms https://scrapegraphapi.openstatus.dev \n",
+ "\n",
+ " linkedin twitter \\\n",
+ "0 https://www.linkedin.com/company/101881123 https://x.com/scrapegraphai \n",
+ "\n",
+ " github \n",
+ "0 https://github.com/ScrapeGraphAI/Scrapegraph-ai "
+ ]
+ },
+ "execution_count": 10,
+ "metadata": {},
+ "output_type": "execute_result"
+ }
+ ],
+ "source": [
+ "df_company"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "colab": {
+ "base_uri": "https://localhost:8080/",
+ "height": 143
+ },
+ "executionInfo": {
+ "elapsed": 304,
+ "status": "ok",
+ "timestamp": 1734533051319,
+ "user": {
+ "displayName": "ScrapeGraphAI",
+ "userId": "10474323355016263615"
+ },
+ "user_tz": -60
+ },
+ "id": "QR-fyx5cOetl",
+ "outputId": "4b7d55ed-9ef4-44f9-9008-688d734ca820"
+ },
+ "outputs": [
+ {
+ "data": {
+ "text/plain": [
+ " name role \\\n",
+ "0 Founder & Technical Lead \n",
+ "1 Marco Vinciguerra Founder & Software Engineer \n",
+ "2 Lorenzo Padoan Founder & Product Engineer \n",
+ "\n",
+ " linkedin \n",
+ "0 https://www.linkedin.com/in/perinim/ \n",
+ "1 https://www.linkedin.com/in/marco-vinciguerra-... \n",
+ "2 https://www.linkedin.com/in/lorenzo-padoan-452... "
+ ]
+ },
+ "execution_count": 11,
+ "metadata": {},
+ "output_type": "execute_result"
+ }
+ ],
+ "source": [
+ "df_founders"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "colab": {
+ "base_uri": "https://localhost:8080/",
+ "height": 175
+ },
+ "executionInfo": {
+ "elapsed": 312,
+ "status": "ok",
+ "timestamp": 1734533059550,
+ "user": {
+ "displayName": "ScrapeGraphAI",
+ "userId": "10474323355016263615"
+ },
+ "user_tz": -60
+ },
+ "id": "SWpCvl53OgyQ",
+ "outputId": "c256f5e5-227a-4df4-da16-d0021aaf03a1"
+ },
+ "outputs": [
+ {
+ "data": {
+ "text/plain": [
+ " tier price credits\n",
+ "0 Free $0 100\n",
+ "1 Starter $20/month 5000\n",
+ "2 Growth $100/month 40000\n",
+ "3 Pro $500/month 250000"
+ ]
+ },
+ "execution_count": 12,
+ "metadata": {},
+ "output_type": "execute_result"
+ }
+ ],
+ "source": [
+ "df_pricing"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "colab": {
+ "base_uri": "https://localhost:8080/",
+ "height": 363
+ },
+ "executionInfo": {
+ "elapsed": 216,
+ "status": "ok",
+ "timestamp": 1734533067079,
+ "user": {
+ "displayName": "ScrapeGraphAI",
+ "userId": "10474323355016263615"
+ },
+ "user_tz": -60
+ },
+ "id": "jNLaHXlEOisi",
+ "outputId": "6f075db5-fc3f-437d-9aaa-d6f8e3085c49"
+ },
+ "outputs": [
+ {
+ "data": {
+ "text/plain": [
+ " partner\n",
+ "0 PostHog\n",
+ "1 AWS\n",
+ "2 NVIDIA\n",
+ "3 JinaAI\n",
+ "4 DagWorks\n",
+ "5 Browserbase\n",
+ "6 ScrapeDo\n",
+ "7 HackerNews\n",
+ "8 Medium\n",
+ "9 HackADay"
+ ]
+ },
+ "execution_count": 13,
+ "metadata": {},
+ "output_type": "execute_result"
+ }
+ ],
+ "source": [
+ "df_partners"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "v0CBYVk7qA5Z"
+ },
+ "source": [
+ "Save the results to CSV"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "colab": {
+ "base_uri": "https://localhost:8080/"
+ },
+ "executionInfo": {
+ "elapsed": 213,
+ "status": "ok",
+ "timestamp": 1734533092882,
+ "user": {
+ "displayName": "ScrapeGraphAI",
+ "userId": "10474323355016263615"
+ },
+ "user_tz": -60
+ },
+ "id": "BtEbB9pmQGhO",
+ "outputId": "3f05c8ba-7b34-4b53-ab20-bfcc78060557"
+ },
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "Data saved to CSV files\n"
+ ]
+ }
+ ],
+ "source": [
+ "# Save the DataFrames to a CSV file\n",
+ "df_company.to_csv(\"company_info.csv\", index=False)\n",
+ "df_founders.to_csv(\"founders.csv\", index=False)\n",
+ "df_pricing.to_csv(\"pricing_plans.csv\", index=False)\n",
+ "df_partners.to_csv(\"partners.csv\", index=False)\n",
+ "# Print confirmation\n",
+ "print(\"Data saved to CSV files\")\n"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "-1SZT8VzTZNd"
+ },
+ "source": [
+ "## \ud83d\udd17 Resources"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "dUi2LtMLRDDR"
+ },
+ "source": [
+ "\n",
+ "\n",
+ "\n",
+ "\n",
+ "- \ud83d\ude80 **Get your API Key:** [ScrapeGraphAI Dashboard](https://dashboard.scrapegraphai.com) \n",
+ "- \ud83d\udc19 **GitHub:** [ScrapeGraphAI GitHub](https://github.com/scrapegraphai) \n",
+ "- \ud83d\udcbc **LinkedIn:** [ScrapeGraphAI LinkedIn](https://www.linkedin.com/company/scrapegraphai/) \n",
+ "- \ud83d\udc26 **Twitter:** [ScrapeGraphAI Twitter](https://twitter.com/scrapegraphai) \n",
+ "- \ud83d\udcac **Discord:** [Join our Discord Community](https://discord.gg/uJN7TYcpNa) \n",
+ "\n",
+ "Made with \u2764\ufe0f by the [ScrapeGraphAI](https://scrapegraphai.com) Team \n"
+ ]
+ }
+ ],
+ "metadata": {
+ "colab": {
+ "authorship_tag": "ABX9TyO57uo4LpNqAm10rmE0B6Q5",
+ "collapsed_sections": [
+ "IzsyDXEWwPVt"
+ ],
+ "provenance": []
+ },
+ "kernelspec": {
+ "display_name": "Python 3 (ipykernel)",
+ "language": "python",
+ "name": "python3"
+ },
+ "language_info": {
+ "codemirror_mode": {
+ "name": "ipython",
+ "version": 3
+ },
+ "file_extension": ".py",
+ "mimetype": "text/x-python",
+ "name": "python",
+ "nbconvert_exporter": "python",
+ "pygments_lexer": "ipython3",
+ "version": "3.12.11"
+ }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 4
+}