
Structured data gathering from any website using an AI-powered scraper, crawler, and browser automation. Scraping and crawling with natural language prompts. Equip your LLM agents with fresh data. AI Studio Python SDK for intelligent web data gathering.


oxylabs/oxylabs-ai-studio-py


Oxylabs AI Studio Python SDK


A simple Python SDK for seamlessly interacting with Oxylabs AI Studio API services, including AI-Scraper, AI-Crawler, AI-Browser-Agent and other data extraction tools.

Requirements

  • Python 3.10 or later
  • An Oxylabs AI Studio API key
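
Rather than hard-coding the key in source files, one option is to read it from an environment variable. The sketch below is an illustration only: the variable name OXYLABS_AI_STUDIO_API_KEY and the helper load_api_key are assumptions, not something the SDK defines.

```python
import os


def load_api_key(env_var: str = "OXYLABS_AI_STUDIO_API_KEY") -> str:
    """Return the API key stored in the given environment variable.

    The default variable name is a hypothetical convention, not part of the SDK.
    """
    api_key = os.environ.get(env_var, "").strip()
    if not api_key:
        raise RuntimeError(f"Set the {env_var} environment variable to your API key.")
    return api_key
```

The returned string can then be passed as the api_key argument of the clients shown under Usage.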

Installation

pip install oxylabs-ai-studio

Usage

Crawl (AiCrawler.crawl)

from oxylabs_ai_studio.apps.ai_crawler import AiCrawler

crawler = AiCrawler(api_key="<API_KEY>")

url = "https://oxylabs.io"
result = crawler.crawl(
    url=url,
    user_prompt="Find all pages with proxy products pricing",
    output_format="markdown",
    render_javascript=False,
    return_sources_limit=3,
    geo_location="US",
)
print("Results:")
for item in result.data:
    print(item, "\n")

Parameters:

  • url (str): Starting URL to crawl (required)
  • user_prompt (str): Natural language prompt to guide extraction (required)
  • output_format (Literal["json", "markdown"]): Output format (default: "markdown")
  • schema (dict | None): OpenAPI schema for structured extraction (required if output_format is "json")
  • render_javascript (bool): Render JavaScript (default: False)
  • return_sources_limit (int): Max number of sources to return (default: 25)
  • geo_location (str): Proxy location in ISO2 (two-letter country code) format, for example "US"
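
The rule that schema is required when output_format is "json" can be checked locally before a request is sent. The helper below is a hypothetical sketch (check_output_args is not part of the SDK); it only mirrors the constraints listed in the parameters above.

```python
from typing import Any, Optional


def check_output_args(
    output_format: str = "markdown",
    schema: Optional[dict[str, Any]] = None,
) -> None:
    """Raise ValueError if the arguments contradict the documented rules.

    Hypothetical helper: it does not call the SDK or the API.
    """
    allowed_formats = {"json", "markdown"}
    if output_format not in allowed_formats:
        raise ValueError(f"output_format must be one of {sorted(allowed_formats)}")
    if output_format == "json" and schema is None:
        raise ValueError('schema is required when output_format is "json"')
```

Calling check_output_args(output_format="json") raises ValueError, while the default arguments pass silently.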

Scrape (AiScraper.scrape)

from oxylabs_ai_studio.apps.ai_scraper import AiScraper

scraper = AiScraper(api_key="<API_KEY>")

schema = scraper.generate_schema(prompt="want to parse developer, platform, type, price, game title, genre (array) and description")
print(f"Generated schema: {schema}")

url = "https://sandbox.oxylabs.io/products/3"
result = scraper.scrape(
    url=url,
    output_format="json",
    schema=schema,
    render_javascript=False,
)
print(result)

Parameters:

  • url (str): Target URL to scrape (required)
  • output_format (Literal["json", "markdown"]): Output format (default: "markdown")
  • schema (dict | None): OpenAPI schema for structured extraction (required if output_format is "json")
  • render_javascript (bool): Render JavaScript (default: False)
  • geo_location (str): Proxy location in ISO2 (two-letter country code) format, for example "US"
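
Since geo_location expects an ISO2 code, a small shape check can catch obvious mistakes such as passing a full country name. This is a sketch under assumptions: normalize_geo_location is a hypothetical helper, it assumes ISO2 means a two-letter country code written in upper case as in the "US" example, and it does not verify that the code is one the API actually supports.

```python
from typing import Optional


def normalize_geo_location(code: Optional[str]) -> Optional[str]:
    """Upper-case a two-letter country code, or pass None through unchanged.

    Hypothetical helper: checks the shape only, not whether the code exists.
    """
    if code is None:
        return None
    normalized = code.strip().upper()
    if len(normalized) != 2 or not (normalized.isascii() and normalized.isalpha()):
        raise ValueError(f"expected a two-letter country code, got {code!r}")
    return normalized
```

For example, normalize_geo_location(" us ") returns "US", and normalize_geo_location("United States") raises ValueError.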

Browser Agent (BrowserAgent.run)

from oxylabs_ai_studio.apps.browser_agent import BrowserAgent

browser_agent = BrowserAgent(api_key="<API_KEY>")

schema = browser_agent.generate_schema(
    prompt="game name, platform, review stars and price"
)
print("schema: ", schema)

prompt = "Find if there is game 'super mario odyssey' in the store. If there is, find the price. Use search bar to find the game."
url = "https://sandbox.oxylabs.io/"
result = browser_agent.run(
    url=url,
    user_prompt=prompt,
    output_format="json",
    schema=schema,
)
print(result.data)

Parameters:

  • url (str): Starting URL to browse (required)
  • user_prompt (str): Natural language prompt for extraction (required)
  • output_format (Literal["json", "markdown", "html", "screenshot"]): Output format (default: "markdown")
  • schema (dict | None): OpenAPI schema for structured extraction (required if output_format is "json")
  • geo_location (str): Proxy location in ISO2 (two-letter country code) format, for example "US"

Search (AiSearch.search)

from oxylabs_ai_studio.apps.ai_search import AiSearch


search = AiSearch(api_key="<API_KEY>")

query = "lasagna recipe"
result = search.search(
    query=query,
    limit=5,
    render_javascript=False,
    return_content=True,
)
print(result.data)

Parameters:

  • query (str): What to search for (required)
  • limit (int): Maximum number of results to return (default: 10, maximum: 50)
  • render_javascript (bool): Render JavaScript (default: False)
  • return_content (bool): Whether to return markdown contents in results (default: True)
  • geo_location (str): Search proxy location in ISO2 (two-letter country code) format, for example "US"
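
Because limit has a documented maximum of 50, callers that take the value from user input may want to cap it first. How the API responds to larger values is not stated here, so the sketch below simply clamps locally; clamp_search_limit is a hypothetical helper, not an SDK function.

```python
def clamp_search_limit(requested: int, maximum: int = 50) -> int:
    """Cap a requested result count at the documented maximum of 50.

    Hypothetical helper: rejects non-positive values instead of guessing.
    """
    if requested < 1:
        raise ValueError("limit must be at least 1")
    return min(requested, maximum)
```

The result can be passed as the limit argument of search.search, for example limit=clamp_search_limit(200).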

Map (AiMap.map)

from oxylabs_ai_studio.apps.ai_map import AiMap


ai_map = AiMap(api_key="<API_KEY>")
payload = {
    "url": "https://career.oxylabs.io",
    "user_prompt": "job ad pages",
    "return_sources_limit": 10,
    "geo_location": None,
    "render_javascript": False,
}
result = ai_map.map(**payload)
print(result.data)

Parameters:

  • url (str): Starting URL to crawl (required)
  • user_prompt (str): Natural language prompt to guide extraction (required)
  • render_javascript (bool): Render JavaScript (default: False)
  • return_sources_limit (int): Max number of sources to return (default: 25)
  • geo_location (str): Proxy location in ISO2 (two-letter country code) format, for example "US"

See the examples folder for usage examples of each method. Each method has a corresponding async version.
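
The names of the async versions are not listed here, so the sketch below does not use them. It shows a generic alternative instead: running several blocking calls concurrently with asyncio.to_thread from the standard library. fake_scrape is a stand-in that only sleeps and echoes its input; in real use it would be replaced by one of the synchronous SDK calls shown above.

```python
import asyncio
import time


def fake_scrape(url: str) -> str:
    """Stand-in for a blocking SDK call; it only sleeps and echoes the URL."""
    time.sleep(0.1)
    return f"scraped:{url}"


async def scrape_many(urls: list[str]) -> list[str]:
    """Run the blocking stand-in in worker threads, keeping input order."""
    tasks = [asyncio.to_thread(fake_scrape, url) for url in urls]
    return list(await asyncio.gather(*tasks))


if __name__ == "__main__":
    print(asyncio.run(scrape_many(["https://example.com/a", "https://example.com/b"])))
```

If the SDK's own async methods fit your use case, they are likely the better choice; the examples folder is the place to check their exact names.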
