
Lucas Mitchell
Automation Engineer
TLDR: This guide demonstrates how to combine Botasaurus, a Python web scraping framework with built-in anti-detection features, and CapSolver, a captcha solving API, to automatically bypass reCAPTCHA v2, reCAPTCHA v3, and Cloudflare Turnstile during large-scale web scraping. The core process involves setting up the environment, using the CapSolver browser extension to identify captcha parameters, calling the CapSolver API via a Python helper function to get a solution token, and finally using Botasaurus to inject the token into the webpage for form submission.

Web scraping at scale often encounters captchas that block automated access. This guide demonstrates how to combine Botasaurus, a powerful web scraping framework, with CapSolver to automatically solve reCAPTCHA v2, reCAPTCHA v3, and Cloudflare Turnstile captchas.
Botasaurus is a Python web scraping framework that simplifies browser automation with built-in anti-detection capabilities. It provides a clean decorator-based API for browser tasks.
Key feature: the @browser decorator, which turns a plain function into a browser task.
Installation:
pip install botasaurus
Basic usage:
from botasaurus.browser import browser, Driver

@browser()
def scrape_page(driver: Driver, data):
    driver.get("https://example.com")
    title = driver.get_text("h1")
    return {"title": title}

# Run the scraper
result = scrape_page()
CapSolver is a captcha solving service that provides an API to solve various captcha types including reCAPTCHA and Cloudflare Turnstile.
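Supported captcha types covered in this guide: reCAPTCHA v2, reCAPTCHA v3, and Cloudflare Turnstile.
Getting your API key: copy it from your CapSolver Dashboard (keys start with CAP-).
Install the dependencies:
pip install botasaurus capsolver requests python-dotenv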
Create a .env file in your project root:
CAPSOLVER_API_KEY=CAP-YOUR_API_KEY_HERE
Create a shared configuration loader:
# shared/config.py
import os
from pathlib import Path
from dotenv import load_dotenv

# Load .env file from project root
ROOT_DIR = Path(__file__).parent.parent
load_dotenv(ROOT_DIR / ".env")

class Config:
    """Configuration class for CapSolver integration."""

    # CapSolver API Key
    CAPSOLVER_API_KEY: str = os.getenv("CAPSOLVER_API_KEY", "")

    # CapSolver API endpoints
    CAPSOLVER_API_URL = "https://api.capsolver.com"
    CREATE_TASK_ENDPOINT = f"{CAPSOLVER_API_URL}/createTask"
    GET_RESULT_ENDPOINT = f"{CAPSOLVER_API_URL}/getTaskResult"

    @classmethod
    def validate(cls) -> bool:
        """Check if the configuration is valid."""
        if not cls.CAPSOLVER_API_KEY:
            print("Error: CAPSOLVER_API_KEY not set!")
            return False
        return True
Before integrating with the API, you need to identify the correct parameters for the target captcha. The CapSolver browser extension provides an easy way to detect all required parameters automatically.
Install the CapSolver extension from the Chrome Web Store.
Important: Do not close the CapSolver panel before triggering the captcha, as closing it erases previously detected information.
The extension automatically identifies the required reCAPTCHA parameters and presents them as formatted JSON that is ready for API integration, so you can copy the exact parameters needed for your solving tasks.
For more details, see the complete guide on identifying captcha parameters.
reCAPTCHA v2 is the classic "I'm not a robot" checkbox captcha. It may present image selection challenges to users.
You can use the CapSolver extension detector (described above) or find the site key manually:
Look in the page HTML for:
<div class="g-recaptcha" data-sitekey="6Le-wvkSAAAAAPBMRTvw0Q4Muexq9bi0DJwx_mJ-"></div>
Or in JavaScript:
grecaptcha.render('container', {'sitekey': '6Le-xxxxx...'});
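When the page HTML is already available as a string, the manual lookup described above can be sketched with Python's standard library. This is a minimal illustration, not part of Botasaurus or CapSolver: the names SiteKeyParser and find_recaptcha_site_key are hypothetical, and it assumes the widget is rendered as an element with the g-recaptcha class and a data-sitekey attribute, as in the HTML snippet above.

```python
from html.parser import HTMLParser
from typing import Optional


class SiteKeyParser(HTMLParser):
    """Remember the data-sitekey of the first element with class g-recaptcha."""

    def __init__(self) -> None:
        super().__init__()
        self.site_key: Optional[str] = None

    def handle_starttag(self, tag, attrs):
        attr_map = dict(attrs)
        classes = (attr_map.get("class") or "").split()
        # Only keep the first match; later widgets are ignored in this sketch.
        if self.site_key is None and "g-recaptcha" in classes:
            self.site_key = attr_map.get("data-sitekey")


def find_recaptcha_site_key(html: str) -> Optional[str]:
    """Return the reCAPTCHA site key found in html, or None if absent."""
    parser = SiteKeyParser()
    parser.feed(html)
    return parser.site_key
```

Keys that are only passed to grecaptcha.render in JavaScript will not be found this way; the extension detector remains the more reliable route for those pages.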
# utils/capsolver_helper.py
import time
import requests
from shared.config import Config

def solve_recaptcha_v2(
    website_url: str,
    website_key: str,
    is_invisible: bool = False,
    timeout: int = 120
) -> dict:
    """
    Solve reCAPTCHA v2 using CapSolver API.

    Args:
        website_url: The URL of the page with the captcha
        website_key: The reCAPTCHA site key
        is_invisible: Whether it's invisible reCAPTCHA v2
        timeout: Maximum time to wait for solution (seconds)

    Returns:
        dict with 'gRecaptchaResponse' token
    """
    if not Config.validate():
        raise Exception("Invalid configuration - check your API key")

    # Build task payload
    task = {
        "type": "ReCaptchaV2TaskProxyLess",
        "websiteURL": website_url,
        "websiteKey": website_key,
    }
    if is_invisible:
        task["isInvisible"] = True

    payload = {
        "clientKey": Config.CAPSOLVER_API_KEY,
        "task": task
    }

    # Create task
    response = requests.post(Config.CREATE_TASK_ENDPOINT, json=payload)
    result = response.json()
    if result.get("errorId") and result.get("errorId") != 0:
        raise Exception(f"Failed to create task: {result.get('errorDescription')}")
    task_id = result.get("taskId")

    # Poll for result
    start_time = time.time()
    while time.time() - start_time < timeout:
        time.sleep(2)
        result_payload = {
            "clientKey": Config.CAPSOLVER_API_KEY,
            "taskId": task_id
        }
        response = requests.post(Config.GET_RESULT_ENDPOINT, json=result_payload)
        result = response.json()
        if result.get("status") == "ready":
            return result.get("solution", {})
        elif result.get("status") == "failed":
            raise Exception(f"Task failed: {result.get('errorDescription')}")

    raise Exception(f"Timeout after {timeout} seconds")
from botasaurus.browser import browser, Driver
from shared.config import Config
from utils.capsolver_helper import solve_recaptcha_v2

DEMO_URL = "https://www.google.com/recaptcha/api2/demo"
DEMO_SITEKEY = "6Le-wvkSAAAAAPBMRTvw0Q4Muexq9bi0DJwx_mJ-"

@browser(headless=False)
def solve_recaptcha_v2_with_api(driver: Driver, data: dict):
    """Solve reCAPTCHA v2 using CapSolver API and inject the token."""
    url = data.get("url", DEMO_URL)
    site_key = data.get("site_key", DEMO_SITEKEY)

    # Step 1: Load the page
    driver.get(url)
    driver.sleep(2)

    # Step 2: Extract site key from page (optional)
    extracted_key = driver.run_js("""
        const recaptchaDiv = document.querySelector('.g-recaptcha');
        return recaptchaDiv ? recaptchaDiv.getAttribute('data-sitekey') : null;
    """)
    if extracted_key:
        site_key = extracted_key

    # Step 3: Solve captcha via CapSolver API
    solution = solve_recaptcha_v2(
        website_url=url,
        website_key=site_key
    )
    token = solution.get("gRecaptchaResponse")
    if not token:
        # Avoid injecting the string "None" when the solution has no token
        return {"success": False, "error": "No token in CapSolver response"}

    # Step 4: Inject token into the page
    driver.run_js(f"""
        // Set the hidden textarea value
        const responseField = document.querySelector('[name="g-recaptcha-response"]');
        if (responseField) {{
            responseField.value = "{token}";
        }}
        // Trigger callback if available
        if (typeof ___grecaptcha_cfg !== 'undefined') {{
            try {{
                const clients = ___grecaptcha_cfg.clients;
                for (const key in clients) {{
                    const client = clients[key];
                    if (client && client.callback) {{
                        client.callback("{token}");
                    }}
                }}
            }} catch (e) {{}}
        }}
    """)

    # Step 5: Submit the form
    submit_button = driver.select('input[type="submit"]')
    if submit_button:
        submit_button.click()
        driver.sleep(2)

    return {"success": True, "token_length": len(token)}

# Run the demo
result = solve_recaptcha_v2_with_api(data={"url": DEMO_URL, "site_key": DEMO_SITEKEY})
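The injection step above interpolates the token directly into JavaScript source. Tokens are normally plain URL-safe strings, but serializing the value with json.dumps is a cheap way to guarantee that quotes or backslashes can never break the script. The sketch below is one possible approach; build_injection_script is a hypothetical helper name, not a Botasaurus or CapSolver function.

```python
import json


def build_injection_script(token: str, field_name: str = "g-recaptcha-response") -> str:
    """Return JavaScript that writes token into the form field named field_name."""
    # json.dumps yields a valid, fully escaped JavaScript string literal.
    token_js = json.dumps(token)
    selector_js = json.dumps(f'[name="{field_name}"]')
    return (
        f"const field = document.querySelector({selector_js});\n"
        f"if (field) {{ field.value = {token_js}; }}\n"
        "return Boolean(field);"
    )
```

The returned string can be passed to driver.run_js in place of the hand-written f-string, for example driver.run_js(build_injection_script(token)), and the same helper works for Turnstile by passing field_name="cf-turnstile-response".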
reCAPTCHA v3 is invisible and works by analyzing user behavior to generate a score from 0.0 to 1.0.
Key difference from v2: reCAPTCHA v3 requires a pageAction parameter.
The easiest way to find the pageAction is using the CapSolver extension detector. Alternatively, look in the page JavaScript for:
grecaptcha.execute('siteKey', {action: 'login'})
// The 'login' is your pageAction
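If the page's JavaScript is available as text, the manual search for the action can be approximated with a regular expression. This is a rough sketch under the assumption that the call looks like the grecaptcha.execute example above, with the action written as a string literal; find_page_actions is a hypothetical helper, and minified or dynamically built calls will not match.

```python
import re
from typing import List

# Matches grecaptcha.execute(<anything>, {action: '<value>'}) with either quote style.
_ACTION_RE = re.compile(
    r'grecaptcha\.execute\(\s*[^,]+,\s*\{\s*action\s*:\s*[\'"]([^\'"]+)[\'"]'
)


def find_page_actions(js_source: str) -> List[str]:
    """Return every action value passed to grecaptcha.execute in js_source."""
    return _ACTION_RE.findall(js_source)
```

For the snippet above, find_page_actions returns a list containing 'login'; an empty list means the action has to be found another way, such as with the extension detector.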
def solve_recaptcha_v3(
    website_url: str,
    website_key: str,
    page_action: str,
    timeout: int = 120
) -> dict:
    """
    Solve reCAPTCHA v3 using CapSolver API.

    Args:
        website_url: The URL of the page with the captcha
        website_key: The reCAPTCHA site key
        page_action: The action parameter (REQUIRED for v3)
        timeout: Maximum time to wait for solution (seconds)

    Returns:
        dict with 'gRecaptchaResponse' token
    """
    if not Config.validate():
        raise Exception("Invalid configuration - check your API key")
    if not page_action:
        raise Exception("pageAction is REQUIRED for reCAPTCHA v3")

    # Build task payload
    task = {
        "type": "ReCaptchaV3TaskProxyLess",
        "websiteURL": website_url,
        "websiteKey": website_key,
        "pageAction": page_action,  # REQUIRED for v3
    }

    payload = {
        "clientKey": Config.CAPSOLVER_API_KEY,
        "task": task
    }

    # Create task
    response = requests.post(Config.CREATE_TASK_ENDPOINT, json=payload)
    result = response.json()
    if result.get("errorId") and result.get("errorId") != 0:
        raise Exception(f"Failed to create task: {result.get('errorDescription')}")
    task_id = result.get("taskId")

    # Poll for result
    start_time = time.time()
    while time.time() - start_time < timeout:
        time.sleep(2)
        result_payload = {
            "clientKey": Config.CAPSOLVER_API_KEY,
            "taskId": task_id
        }
        response = requests.post(Config.GET_RESULT_ENDPOINT, json=result_payload)
        result = response.json()
        if result.get("status") == "ready":
            return result.get("solution", {})
        elif result.get("status") == "failed":
            raise Exception(f"Task failed: {result.get('errorDescription')}")

    raise Exception(f"Timeout after {timeout} seconds")
from botasaurus.browser import browser, Driver
from shared.config import Config
from utils.capsolver_helper import solve_recaptcha_v3

DEMO_URL = "https://recaptcha-demo.appspot.com/recaptcha-v3-request-scores.php"
DEMO_SITEKEY = "6LdyC2cUAAAAACGuDKpXeDorzUDWXmdqeg-xy696"
PAGE_ACTION = "examples/v3scores"

@browser(headless=False)
def solve_recaptcha_v3_with_api(driver: Driver, data: dict):
    """Solve reCAPTCHA v3 using CapSolver API and inject the token."""
    url = data.get("url", DEMO_URL)
    site_key = data.get("site_key", DEMO_SITEKEY)
    page_action = data.get("page_action", PAGE_ACTION)

    # Step 1: Load the page
    driver.get(url)
    driver.sleep(2)

    # Step 2: Solve captcha via CapSolver API
    solution = solve_recaptcha_v3(
        website_url=url,
        website_key=site_key,
        page_action=page_action
    )
    token = solution.get("gRecaptchaResponse")
    if not token:
        # Avoid injecting the string "None" when the solution has no token
        return {"success": False, "error": "No token in CapSolver response"}

    # Step 3: Inject token into the page
    driver.run_js(f"""
        const token = "{token}";
        // Set hidden field if exists
        const responseField = document.querySelector('[name="g-recaptcha-response"]');
        if (responseField) {{
            responseField.value = token;
        }}
        // Create hidden field if form exists but field doesn't
        const forms = document.querySelectorAll('form');
        forms.forEach(form => {{
            let field = form.querySelector('[name="g-recaptcha-response"]');
            if (!field) {{
                field = document.createElement('input');
                field.type = 'hidden';
                field.name = 'g-recaptcha-response';
                form.appendChild(field);
            }}
            field.value = token;
        }});
    """)

    # Step 4: Submit or verify
    buttons = driver.select_all("button")
    for button in buttons:
        if "verify" in button.text.lower() or "submit" in button.text.lower():
            button.click()
            driver.sleep(2)
            break

    return {"success": True, "token_length": len(token)}

# Run the demo
result = solve_recaptcha_v3_with_api(data={
    "url": DEMO_URL,
    "site_key": DEMO_SITEKEY,
    "page_action": PAGE_ACTION
})
Cloudflare Turnstile is a privacy-focused captcha alternative designed to be less intrusive than traditional captchas.
Key differences from reCAPTCHA:
Task type: AntiTurnstileTaskProxyLess
Response field: token (not gRecaptchaResponse)
Site key: starts with 0x4
Look in the page HTML for:
<div class="cf-turnstile" data-sitekey="0x4AAAAAAABS7vwvV6VFfMcD"></div>
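Because Turnstile site keys start with 0x4 while the reCAPTCHA keys used in this guide start with 6L, a simple prefix check can route a detected key to the matching solver. The sketch below is only a heuristic: guess_captcha_family is a hypothetical name, and the 6L prefix is an observation from the example keys in this guide, not a documented guarantee.

```python
def guess_captcha_family(site_key: str) -> str:
    """Guess the captcha family from a site key prefix.

    Returns "turnstile", "recaptcha", or "unknown".
    """
    key = (site_key or "").strip()
    if key.startswith("0x4"):
        return "turnstile"
    # Assumption: reCAPTCHA keys begin with "6L", as the demo keys here do.
    if key.startswith("6L"):
        return "recaptcha"
    return "unknown"
```

A prefix alone cannot distinguish reCAPTCHA v2 from v3, so the result still needs to be combined with other signals, such as whether the page calls grecaptcha.execute with an action.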
def solve_turnstile(
    website_url: str,
    website_key: str,
    action: str = None,
    cdata: str = None,
    timeout: int = 120
) -> dict:
    """
    Solve Cloudflare Turnstile using CapSolver API.

    Args:
        website_url: The URL of the page with Turnstile
        website_key: The Turnstile site key (starts with 0x4)
        action: Optional action from data-action attribute
        cdata: Optional cdata from data-cdata attribute
        timeout: Maximum time to wait for solution (seconds)

    Returns:
        dict with 'token' field
    """
    if not Config.validate():
        raise Exception("Invalid configuration - check your API key")

    # Build task payload
    task = {
        "type": "AntiTurnstileTaskProxyLess",
        "websiteURL": website_url,
        "websiteKey": website_key,
    }

    # Add optional metadata
    metadata = {}
    if action:
        metadata["action"] = action
    if cdata:
        metadata["cdata"] = cdata
    if metadata:
        task["metadata"] = metadata

    payload = {
        "clientKey": Config.CAPSOLVER_API_KEY,
        "task": task
    }

    # Create task
    response = requests.post(Config.CREATE_TASK_ENDPOINT, json=payload)
    result = response.json()
    if result.get("errorId") and result.get("errorId") != 0:
        raise Exception(f"Failed to create task: {result.get('errorDescription')}")
    task_id = result.get("taskId")

    # Poll for result
    start_time = time.time()
    while time.time() - start_time < timeout:
        time.sleep(2)
        result_payload = {
            "clientKey": Config.CAPSOLVER_API_KEY,
            "taskId": task_id
        }
        response = requests.post(Config.GET_RESULT_ENDPOINT, json=result_payload)
        result = response.json()
        if result.get("status") == "ready":
            return result.get("solution", {})
        elif result.get("status") == "failed":
            raise Exception(f"Task failed: {result.get('errorDescription')}")

    raise Exception(f"Timeout after {timeout} seconds")
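The three helper functions repeat the same create-then-poll sequence and differ only in the task payload. One possible refactor, sketched below, moves that sequence into a single function that takes the HTTP call as a parameter so it can be exercised without network access. The names create_and_poll and JsonPost are hypothetical; the endpoints and response fields are the ones already used in the helpers above.

```python
import time
from typing import Any, Callable, Dict

# A callable that POSTs a JSON payload to a URL and returns the decoded JSON reply.
JsonPost = Callable[[str, Dict[str, Any]], Dict[str, Any]]


def create_and_poll(
    post: JsonPost,
    client_key: str,
    task: Dict[str, Any],
    base_url: str = "https://api.capsolver.com",
    timeout: float = 120,
    interval: float = 2,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, Any]:
    """Create a task, then poll until it is ready, fails, or times out."""
    created = post(f"{base_url}/createTask", {"clientKey": client_key, "task": task})
    if created.get("errorId"):
        raise RuntimeError(f"Failed to create task: {created.get('errorDescription')}")
    task_id = created.get("taskId")

    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        sleep(interval)
        result = post(
            f"{base_url}/getTaskResult",
            {"clientKey": client_key, "taskId": task_id},
        )
        status = result.get("status")
        if status == "ready":
            return result.get("solution", {})
        if status == "failed":
            raise RuntimeError(f"Task failed: {result.get('errorDescription')}")
    raise TimeoutError(f"Timeout after {timeout} seconds")
```

With requests, the post argument could be a small wrapper such as lambda url, payload: requests.post(url, json=payload, timeout=30).json(), and each solver would then only need to build its own task dictionary.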
from botasaurus.browser import browser, Driver
from shared.config import Config
from utils.capsolver_helper import solve_turnstile

DEMO_URL = "https://peet.ws/turnstile-test/non-interactive.html"
DEMO_SITEKEY = "0x4AAAAAAABS7vwvV6VFfMcD"

@browser(headless=False)
def solve_turnstile_with_api(driver: Driver, data: dict):
    """Solve Cloudflare Turnstile using CapSolver API and inject the token."""
    url = data.get("url", DEMO_URL)
    site_key = data.get("site_key", DEMO_SITEKEY)

    # Step 1: Load the page
    driver.get(url)
    driver.sleep(3)

    # Step 2: Extract site key from page (optional)
    extracted_params = driver.run_js("""
        const turnstileDiv = document.querySelector('.cf-turnstile, [data-sitekey]');
        if (turnstileDiv) {
            const key = turnstileDiv.getAttribute('data-sitekey');
            if (key && key.startsWith('0x')) {
                return {
                    sitekey: key,
                    action: turnstileDiv.getAttribute('data-action')
                };
            }
        }
        return null;
    """)
    if extracted_params and extracted_params.get("sitekey"):
        site_key = extracted_params["sitekey"]

    # Step 3: Solve Turnstile via CapSolver API
    solution = solve_turnstile(
        website_url=url,
        website_key=site_key,
        action=extracted_params.get("action") if extracted_params else None
    )
    token = solution.get("token")
    if not token:
        # Avoid injecting the string "None" when the solution has no token
        return {"success": False, "error": "No token in CapSolver response"}

    # Step 4: Inject token into the page
    driver.run_js(f"""
        const token = "{token}";
        // Find and fill cf-turnstile-response field
        const responseFields = [
            document.querySelector('[name="cf-turnstile-response"]'),
            document.querySelector('[name="cf_turnstile_response"]'),
            document.querySelector('input[name*="turnstile"]')
        ];
        for (const field of responseFields) {{
            if (field) {{
                field.value = token;
                break;
            }}
        }}
        // Create hidden field if form exists but field doesn't
        const forms = document.querySelectorAll('form');
        forms.forEach(form => {{
            let field = form.querySelector('[name="cf-turnstile-response"]');
            if (!field) {{
                field = document.createElement('input');
                field.type = 'hidden';
                field.name = 'cf-turnstile-response';
                form.appendChild(field);
            }}
            field.value = token;
        }});
    """)

    # Step 5: Submit the form
    submit_btn = driver.select('button[type="submit"], input[type="submit"]')
    if submit_btn:
        submit_btn.click()
        driver.sleep(2)

    return {"success": True, "token_length": len(token)}

# Run the demo
result = solve_turnstile_with_api(data={"url": DEMO_URL, "site_key": DEMO_SITEKEY})
| Captcha Type | Task Type | Response Field | Required Parameters |
|---|---|---|---|
| reCAPTCHA v2 | ReCaptchaV2TaskProxyLess | gRecaptchaResponse | websiteURL, websiteKey |
| reCAPTCHA v2 Enterprise | ReCaptchaV2EnterpriseTaskProxyLess | gRecaptchaResponse | websiteURL, websiteKey |
| reCAPTCHA v3 | ReCaptchaV3TaskProxyLess | gRecaptchaResponse | websiteURL, websiteKey, pageAction |
| reCAPTCHA v3 Enterprise | ReCaptchaV3EnterpriseTaskProxyLess | gRecaptchaResponse | websiteURL, websiteKey, pageAction |
| Cloudflare Turnstile | AntiTurnstileTaskProxyLess | token | websiteURL, websiteKey |
For sites that block datacenter IPs, use the proxy variants (e.g., ReCaptchaV2Task) and provide your own residential proxy.
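The task table can also be kept in code so that a payload is checked before it is sent. The sketch below copies the task types, response fields, and required parameters from the table; TASK_SPECS, missing_parameters, and extract_token are hypothetical names introduced for illustration.

```python
from typing import Any, Dict, List, Optional

_RECAPTCHA_V2 = {"response_field": "gRecaptchaResponse",
                 "required": ["websiteURL", "websiteKey"]}
_RECAPTCHA_V3 = {"response_field": "gRecaptchaResponse",
                 "required": ["websiteURL", "websiteKey", "pageAction"]}

# Task type -> response field and required parameters, mirroring the table.
TASK_SPECS: Dict[str, Dict[str, Any]] = {
    "ReCaptchaV2TaskProxyLess": _RECAPTCHA_V2,
    "ReCaptchaV2EnterpriseTaskProxyLess": _RECAPTCHA_V2,
    "ReCaptchaV3TaskProxyLess": _RECAPTCHA_V3,
    "ReCaptchaV3EnterpriseTaskProxyLess": _RECAPTCHA_V3,
    "AntiTurnstileTaskProxyLess": {"response_field": "token",
                                   "required": ["websiteURL", "websiteKey"]},
}


def missing_parameters(task: Dict[str, Any]) -> List[str]:
    """Return the required parameters that are absent or empty in task."""
    spec = TASK_SPECS.get(task.get("type", ""))
    if spec is None:
        raise ValueError(f"Unknown task type: {task.get('type')!r}")
    return [name for name in spec["required"] if not task.get(name)]


def extract_token(task_type: str, solution: Dict[str, Any]) -> Optional[str]:
    """Read the token from solution using the field named in the table."""
    return solution.get(TASK_SPECS[task_type]["response_field"])
```

Calling missing_parameters on a v3 task without pageAction returns ["pageAction"], which catches the most common v3 mistake before any API credit is spent.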
1. Token Expiry
Captcha tokens expire quickly (typically within 2 minutes). Always use the token immediately after receiving it:
# Get token
solution = solve_recaptcha_v2(url, site_key)
token = solution.get("gRecaptchaResponse")
# Use immediately - don't store for later
driver.run_js(f'document.querySelector("[name=g-recaptcha-response]").value = "{token}"')
driver.select('button[type="submit"]').click()
2. Error Handling
Always implement proper error handling for API failures:
try:
    solution = solve_recaptcha_v2(url, site_key)
except Exception as e:
    print(f"Captcha solving failed: {e}")
    # Implement retry logic or fallback
3. Rate Limiting
Add delays between requests to avoid triggering anti-bot measures:
driver.sleep(2) # Wait after page load
# ... solve captcha ...
driver.sleep(1) # Wait before form submission
4. Validate Configuration
Always validate your API key before making requests:
if not Config.validate():
    raise Exception("Please configure your API key in .env file")
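The error-handling advice above mentions retry logic without showing it. A minimal sketch of one common approach, retrying with exponential backoff, follows. The helper name with_retries and its defaults are assumptions chosen for illustration; suitable attempt counts and delays depend on the target site and your CapSolver budget.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


def with_retries(
    solve: Callable[[], T],
    attempts: int = 3,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call solve until it succeeds or attempts are exhausted.

    Waits base_delay, then 2x, then 4x ... between failed attempts.
    """
    last_error: Optional[Exception] = None
    for attempt in range(attempts):
        try:
            return solve()
        except Exception as exc:
            last_error = exc
            # No point sleeping after the final failed attempt.
            if attempt < attempts - 1:
                sleep(base_delay * (2 ** attempt))
    raise RuntimeError(f"Captcha solving failed after {attempts} attempts") from last_error
```

A call such as with_retries(lambda: solve_recaptcha_v2(url, site_key)) requests a fresh token on every attempt, which also fits the token-expiry advice because a failed attempt never reuses an old token.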
Combining Botasaurus with CapSolver provides a robust solution for handling captchas in web scraping projects. The API-based approach gives you full control over the solving process and works reliably across different captcha types.
reCAPTCHA v2 requires the websiteURL and websiteKey parameters.
reCAPTCHA v3 additionally requires the pageAction parameter.
Cloudflare Turnstile returns a token field instead of gRecaptchaResponse.
The most effective method is to use a robust browser automation framework like Botasaurus, which handles anti-detection, and integrate it with a dedicated captcha solving API like CapSolver to programmatically obtain the required solution tokens.
Botasaurus simplifies browser automation with a clean, decorator-based API while providing essential built-in stealth features to minimize the risk of being detected and blocked by target websites.
While both require the websiteURL and websiteKey, solving reCAPTCHA v3 (the invisible, score-based version) additionally requires a pageAction parameter to be included in the task payload sent to the CapSolver API.
Once the token (e.g., gRecaptchaResponse or token) is received, it must be immediately injected into the target webpage's hidden form field using a JavaScript execution command before the form can be successfully submitted to the server.
The solution tokens provided by CapSolver for reCAPTCHA and Turnstile have a very short validity period, typically expiring in approximately 2 minutes, requiring immediate use.
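To make the short validity window explicit in code, a token can carry the time it was received and be checked just before use. The sketch below assumes the roughly two-minute lifetime mentioned above as its default; TimedToken is a hypothetical class, and the real expiry is decided by the captcha provider, so the default should be treated as an upper bound at best.

```python
import time
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class TimedToken:
    """A captcha token together with the moment it was received."""

    value: str
    max_age: float = 120.0  # seconds; assumption based on the ~2 minute figure
    issued_at: float = field(default_factory=time.monotonic)

    def is_fresh(self, now: Optional[float] = None) -> bool:
        """Return True while the token is younger than max_age seconds."""
        current = time.monotonic() if now is None else now
        return (current - self.issued_at) < self.max_age
```

Wrapping the solver result as TimedToken(solution.get("token")) and checking is_fresh() right before injection gives a clear point at which to request a new token instead of submitting a stale one.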