
Lucas Mitchell
Automation Engineer

Web scraping allows you to extract data from websites, but websites may implement anti-scraping measures such as captchas or rate-limiting. In this guide, we’ll introduce the Requests library and provide an example of how to scrape data from a live website: Quotes to Scrape. Additionally, we'll explore how to handle reCAPTCHA v2 challenges using Requests and Capsolver.
Requests is a simple and powerful Python library used to make HTTP requests. It's widely used for tasks like interacting with APIs, downloading web pages, and scraping data. With its user-friendly API, it's easy to send requests, handle sessions, and deal with HTTP headers and cookies.
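To make the point about sessions and headers concrete, here is a minimal sketch that builds a request through a `requests.Session` without sending anything over the network. The User-Agent string is a hypothetical value chosen for illustration, not something the target site requires.

```python
import requests

# A Session keeps headers and cookies across requests.
session = requests.Session()
# Hypothetical identifier for this scraper; replace it with your own.
session.headers.update({"User-Agent": "my-scraper/0.1"})

# Build the request and let the session merge in its default headers.
request = requests.Request("GET", "http://quotes.toscrape.com/")
prepared = session.prepare_request(request)

print(prepared.method, prepared.url)
print(prepared.headers["User-Agent"])

# To actually send it, one option is: session.send(prepared, timeout=10)
```

Preparing the request first is a convenient way to inspect exactly which headers will go out before any traffic reaches the site.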
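Install the Requests library using pip. The scraping example below also uses Beautiful Soup to parse HTML, so install it at the same time:
pip install requests beautifulsoup4
Let's start with a basic web scraping example where we'll extract quotes from the Quotes to Scrape website using Requests and Beautiful Soup.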
import requests
from bs4 import BeautifulSoup

# URL of the page to scrape
url = 'http://quotes.toscrape.com/'

# Send a GET request to the page
response = requests.get(url)

# Check if the request was successful
if response.status_code == 200:
    # Parse the page content using BeautifulSoup
    soup = BeautifulSoup(response.text, 'html.parser')
    # Find all the quotes on the page
    quotes = soup.find_all('span', class_='text')
    # Print each quote
    for quote in quotes:
        print(quote.text)
else:
    print(f"Failed to retrieve the page. Status Code: {response.status_code}")
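The same approach can be extended to pull more than the quote text. The sketch below parses a small inline HTML sample rather than the live site, so it runs offline. The `div.quote`, `small.author`, and `a.tag` selectors are assumptions about the site's markup (only `span.text` appears in the example above), and `parse_quotes` is a hypothetical helper name.

```python
from bs4 import BeautifulSoup

# Inline sample that mimics the assumed structure of one quote block.
SAMPLE_HTML = """
<div class="quote">
  <span class="text">A sample quote.</span>
  <span>by <small class="author">Jane Doe</small></span>
  <div class="tags"><a class="tag">life</a><a class="tag">humor</a></div>
</div>
"""

def parse_quotes(html):
    """Return a list of dicts with the text, author, and tags of each quote."""
    soup = BeautifulSoup(html, "html.parser")
    results = []
    for block in soup.find_all("div", class_="quote"):
        text = block.find("span", class_="text")
        author = block.find("small", class_="author")
        if text is None or author is None:
            continue  # skip blocks that do not match the assumed layout
        tags = [tag.get_text(strip=True) for tag in block.find_all("a", class_="tag")]
        results.append({
            "text": text.get_text(strip=True),
            "author": author.get_text(strip=True),
            "tags": tags,
        })
    return results

for item in parse_quotes(SAMPLE_HTML):
    print(f'{item["text"]} by {item["author"]} ({", ".join(item["tags"])})')
```

Passing `response.text` from the earlier script to `parse_quotes` would apply the same logic to a fetched page, provided the markup matches these assumptions.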
Some websites, however, may employ reCAPTCHA to prevent scraping. In this case, solving reCAPTCHA is necessary before accessing content. Using Capsolver alongside Requests, we can automate the captcha-solving process.
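Before involving a solver, it helps to know whether a page embeds a reCAPTCHA v2 widget at all and which site key it uses. One possible check, sketched below, looks for an element carrying a `data-sitekey` attribute, which is how the widget is commonly declared in HTML. Pages that render the widget from JavaScript will not match, and `find_site_key` is a hypothetical helper name.

```python
from bs4 import BeautifulSoup

def find_site_key(html):
    """Return the value of the first data-sitekey attribute, or None if absent."""
    soup = BeautifulSoup(html, "html.parser")
    widget = soup.find(attrs={"data-sitekey": True})
    return widget["data-sitekey"] if widget else None

# Inline samples so the sketch runs without network access.
with_captcha = '<form><div class="g-recaptcha" data-sitekey="abc123"></div></form>'
without_captcha = "<p>No captcha on this page.</p>"

print(find_site_key(with_captcha))
print(find_site_key(without_captcha))
```

The value found this way is what a solver request would need as the site key for that page.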
Install the Capsolver library:
pip install capsolver requests
Below is a sample script that solves reCAPTCHA v2 challenges using Capsolver and sends a request with the solved captcha token:
import capsolver
import requests

# Consider using environment variables for sensitive information
PROXY = "http://username:password@host:port"
capsolver.api_key = "Your Capsolver API Key"
PAGE_URL = "https://example.com"
PAGE_KEY = "Your-Site-Key"

def solve_recaptcha_v2(url, key):
    # Ask Capsolver to solve the challenge for the given page and site key
    solution = capsolver.solve({
        "type": "ReCaptchaV2Task",
        "websiteURL": url,
        "websiteKey": key,
        "proxy": PROXY
    })
    return solution['solution']['gRecaptchaResponse']

def main():
    print("Solving reCaptcha v2")
    solution = solve_recaptcha_v2(PAGE_URL, PAGE_KEY)
    print("Solution: ", solution)

    # Headers to simulate browser
    headers = {
        'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/91.0.4472.124 Safari/537.36'
    }

    # Data payload with the captcha solution
    data = {
        'g-recaptcha-response': solution
    }

    # Send the captcha solution as form data in a POST request
    # (a body attached to a GET request is usually ignored by servers)
    response = requests.post(PAGE_URL, headers=headers, data=data, proxies={"http": PROXY, "https": PROXY})

    # Check the response status and print the content if successful
    if response.status_code == 200:
        print("Successfully bypassed captcha and fetched the page!")
        print(response.text)
    else:
        print(f"Failed to fetch the page. Status Code: {response.status_code}")

if __name__ == "__main__":
    main()
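The script's own comment suggests keeping sensitive values in environment variables. A minimal sketch of that idea follows. The variable names `CAPSOLVER_API_KEY` and `SCRAPER_PROXY` and the `load_settings` helper are hypothetical choices for illustration, not names defined by Capsolver.

```python
import os

def load_settings(env=os.environ):
    """Read the API key (required) and proxy (optional) from a mapping."""
    api_key = env.get("CAPSOLVER_API_KEY")
    if not api_key:
        raise RuntimeError("Set CAPSOLVER_API_KEY before running the script")
    # The proxy is optional; None means no proxy was configured.
    return {"api_key": api_key, "proxy": env.get("SCRAPER_PROXY")}

# Demonstration with an explicit mapping instead of the real environment.
settings = load_settings({"CAPSOLVER_API_KEY": "example-key"})
print(settings)
```

In the script, the returned values could then replace the hard-coded `capsolver.api_key` and `PROXY` assignments, which keeps credentials out of source control.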
In the reCAPTCHA script:
- The solve_recaptcha_v2 function sends the site's key and URL to Capsolver, along with proxy information, to obtain a solved captcha token.
- The token is included in the request data payload as g-recaptcha-response and sent with custom headers to the target URL.
- The request sets a browser-like User-Agent header to avoid detection as a bot.
When web scraping, it is essential to be ethical and follow best practices:
- robots.txt: Always check the website's robots.txt to ensure scraping is permitted.
- Headers: Send realistic headers such as User-Agent.
The Requests library offers an easy and efficient way to scrape websites, and Capsolver can handle advanced scenarios such as reCAPTCHA. Always ensure your scraping activities comply with the website's terms of service and legal guidelines.
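As a concrete version of the robots.txt check recommended above, the standard library's `urllib.robotparser` can evaluate rules before any page is requested. The sketch below parses an inline sample so it runs offline. The rules, the `my-scraper` agent name, and the `build_parser` helper are hypothetical.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules; a real site's robots.txt will differ.
SAMPLE_ROBOTS = """\
User-agent: *
Disallow: /private/
"""

def build_parser(robots_txt):
    """Create a RobotFileParser from the text of a robots.txt file."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser

parser = build_parser(SAMPLE_ROBOTS)
allowed = parser.can_fetch("my-scraper", "https://example.com/quotes/")
blocked = parser.can_fetch("my-scraper", "https://example.com/private/data")
print(allowed, blocked)
```

For a live site, calling `set_url()` with the robots.txt address followed by `read()` loads the rules over the network instead of from a string.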
Happy scraping!
