Using Proxies with Puppeteer
A complete guide to using proxies with Puppeteer and handling all possible corner cases.
Puppeteer is a Node.js library for managing Chrome/Chromium browsers via code.
You can use Puppeteer for web scraping, form submission automation, taking screenshots of websites, and almost everything else you can do manually with a browser.
Often, especially when scraping, you need proxies to avoid being blocked. You may also need them to test your website from different locations.
But there are some nuances to using proxies with Puppeteer.
Prerequisites
All code examples are written in JavaScript and run using Node.js version 20+.
Make sure you have installed the Puppeteer library:
npm install puppeteer
In the package.json file, the type property is set to module, which enables the import syntax and top-level await used in the examples.
Simple browser-wide proxies
Chrome/Chromium supports HTTP, HTTPS, SOCKS, and QUIC proxies, with some differences in how each is used. Most of the time, though, you will use HTTP or HTTPS proxies. So let’s assemble a working example of using proxies with Puppeteer.
import puppeteer from "puppeteer";
const proxy = {
  server: "http://proxy-ip-or-domain:proxy-port",
};
const browser = await puppeteer.launch({
  args: [`--proxy-server=${proxy.server}`],
});
const page = await browser.newPage();
await page.goto("https://httpbin.org/ip");
const ip = await page.$eval("body", (body) => body.textContent);
console.log(ip);
await browser.close();
That works great for most cases. But what if your proxy requires authentication?
Proxy authentication with Puppeteer
It is easy. You can use the authenticate method provided by Puppeteer:
import puppeteer from "puppeteer";
const proxy = {
  server: "http://proxy-ip-or-domain:proxy-port",
  username: "your-username",
  password: "your-password",
};
const browser = await puppeteer.launch({
  args: [`--proxy-server=${proxy.server}`],
});
// create the page first: authenticate is a method of the page
const page = await browser.newPage();
if (proxy.username && proxy.password) {
  // if your proxy requires authentication
  await page.authenticate({
    username: proxy.username,
    password: proxy.password,
  });
}
await page.goto("https://httpbin.org/ip");
const ip = await page.$eval("body", (body) => body.textContent);
console.log(ip);
await browser.close();
But it only works for HTTP and HTTPS proxies. Chrome does not support authentication for SOCKS proxies.
There is also a caveat: what if the website you are trying to access requires basic authentication too? How will you handle it, now that your proxy credentials conflict with the website credentials?
Unfortunately, setting the Proxy-Authorization header yourself won’t work most of the time. Not every proxy provider supports it, and it is not implemented in Puppeteer natively. But there is a way out of it.
Using a custom proxy server for more complex scenarios
You can use a programmable HTTP proxy server for Node.js, such as proxy-chain:
npm i proxy-chain
And now:
import puppeteer from "puppeteer";
import proxyChain from "proxy-chain";
const oldProxyUrl = `http://proxy-user:proxy-password@proxy-host:proxy-port`;
// starts a local proxy server that forwards to the authenticated upstream proxy
const newProxyUrl = await proxyChain.anonymizeProxy(oldProxyUrl);
// prints something like "http://127.0.0.1:45678"
console.log(newProxyUrl);
const browser = await puppeteer.launch({
  args: [`--proxy-server=${newProxyUrl}`],
});
const page = await browser.newPage();
// these credentials are for the website's basic authentication, not for the proxy
await page.authenticate({
  username: "foo",
  password: "bar",
});
const response = await page.goto("https://httpbin.org/basic-auth/foo/bar");
// ok and statusText are methods of the response object
if (!response.ok()) {
  console.log(response.statusText());
} else {
  const result = await page.$eval("body", (body) => body.textContent);
  console.log(result);
}
await browser.close();
await proxyChain.closeAnonymizedProxy(newProxyUrl, true);
The only downside of that approach is that all requests go through the local proxy server opened by your application. This extra hop slows down communication between the browser and the upstream proxy.
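The earlier examples keep proxy settings in an object with server, username, and password, while anonymizeProxy takes a single URL with the credentials embedded. Here is a minimal sketch of converting one form into the other. buildProxyUrl is a hypothetical helper, not part of Puppeteer or proxy-chain, and the host and credentials are placeholders. It assumes the consumer decodes percent-encoded credentials, as URL parsers conventionally do.

```javascript
// Hypothetical helper (not part of Puppeteer or proxy-chain): converts the
// { server, username, password } object used in the earlier examples into
// a single proxy URL with embedded credentials.
function buildProxyUrl({ server, username, password }) {
  const url = new URL(server);
  if (!username) {
    // No credentials: return the bare scheme://host:port form.
    return `${url.protocol}//${url.host}`;
  }
  // Percent-encode credentials so characters like "@" or ":" cannot be
  // mistaken for URL delimiters.
  const user = encodeURIComponent(username);
  const pass = encodeURIComponent(password ?? "");
  return `${url.protocol}//${user}:${pass}@${url.host}`;
}

// Placeholder host and credentials, assumed for illustration.
const upstreamProxyUrl = buildProxyUrl({
  server: "http://proxy.example.com:8080",
  username: "proxy-user",
  password: "p@ss:word",
});
console.log(upstreamProxyUrl);
// http://proxy-user:p%40ss%3Aword@proxy.example.com:8080
```

The resulting string can then be passed to proxyChain.anonymizeProxy in place of the hand-written oldProxyUrl.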
Proxy per page
There is no built-in way to use proxies on a page level with Puppeteer. But there is a handy plugin that can do it for you, puppeteer-page-proxy:
npm i @lem0-packages/puppeteer-page-proxy
The original puppeteer-page-proxy library doesn’t work with the latest version of Puppeteer, so I use a fork that addresses the latest errors and issues.
But be aware of the huge downside of that plugin: it will intercept and execute every browser request through your application, which will have a significant performance impact.
If it is OK for you, then it is fairly easy to use:
import puppeteer from "puppeteer";
import useProxy from "@lem0-packages/puppeteer-page-proxy";
const proxyUrl = `http://proxy-user:proxy-password@proxy-host:proxy-port`;
const browser = await puppeteer.launch({});
const page = await browser.newPage();
await useProxy(page, proxyUrl);
await page.goto("https://httpbin.org/ip");
const result = await page.$eval("body", (body) => body.textContent);
console.log(result);
await browser.close();
Rotating proxies
Often, a paid proxy provider already gives you a proxy server that rotates proxies for you, so you don’t need to do anything at all: just set it up and forget about it.
But if you have a static or even a dynamic list of proxy servers, you can rotate proxies yourself:
import puppeteer from "puppeteer";
const proxies = [
  //...
  {
    server: "http://proxy-ip-or-domain:proxy-port",
    username: "your-username",
    password: "your-password",
  },
  // ...
];
async function execute(proxy) {
  const browser = await puppeteer.launch({
    args: [`--proxy-server=${proxy.server}`],
  });
  const page = await browser.newPage();
  if (proxy.username && proxy.password) {
    // if your proxy requires authentication
    await page.authenticate({
      username: proxy.username,
      password: proxy.password,
    });
  }
  await page.goto("https://httpbin.org/ip");
  const ip = await page.$eval("body", (body) => body.textContent);
  console.log(ip);
  await browser.close();
}
const randomProxy = proxies[Math.floor(Math.random() * proxies.length)];
await execute(randomProxy);
You can also check whether the randomly chosen proxy has already been used, or use a more deterministic algorithm for proxy rotation.
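One deterministic option is round-robin rotation, where each proxy is used once before any of them is reused. The sketch below is only an illustration: createProxyRotator is a hypothetical helper and the proxy addresses are placeholders.

```javascript
// Hypothetical helper: returns a function that hands out proxies in
// round-robin order, so no proxy is reused until all others have been used.
function createProxyRotator(proxies) {
  if (proxies.length === 0) {
    throw new Error("The proxy list is empty");
  }
  let index = 0;
  return function nextProxy() {
    const proxy = proxies[index];
    // Wrap around to the first proxy after the last one.
    index = (index + 1) % proxies.length;
    return proxy;
  };
}

// Placeholder proxy addresses, assumed for illustration.
const nextProxy = createProxyRotator([
  { server: "http://10.0.0.1:8080" },
  { server: "http://10.0.0.2:8080" },
]);

console.log(nextProxy().server); // http://10.0.0.1:8080
console.log(nextProxy().server); // http://10.0.0.2:8080
console.log(nextProxy().server); // http://10.0.0.1:8080
```

With the execute function from the example above, each run would then call await execute(nextProxy()) instead of picking a random entry.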
Do not use free proxies
You may find many free proxies by googling. I am not promoting any proxy provider, but I do not recommend using free proxies in production, or even in development environments.
Free proxies are often unreliable and unstable, and it is not secure to send your traffic through unknown proxies.
Instead, use paid proxies or set up your own proxy server for development purposes.
Troubleshooting
Let me share a few common reasons why proxies for Puppeteer might not work.
Your proxy provider blocks specific websites
Your proxy provider may block specific websites because they want to maintain their IP reputation. Usually, you can get it unblocked by reaching out to their support.
Check proxy without Puppeteer
Use a simple cURL request to test if your proxy works at all:
curl -x 'your-proxy-url' 'https://example.com'
Potentially Asked Questions
I gathered some questions that might be asked about using proxies with Puppeteer.
Is using an HTTP proxy secure?
Yes and no. It depends.
The connection to an HTTP proxy server is itself unencrypted, meaning proxied HTTP requests are sent in the clear.
But when proxying HTTPS requests through an HTTP proxy, the TLS exchange is forwarded through the proxy using the CONNECT method, so end-to-end encryption is not broken.
However, when establishing the tunnel, the hostname of the target URL is sent to the proxy server in the clear.
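To make the last point concrete, here is a rough sketch of the plaintext request a client sends to open the tunnel. buildConnectRequest is a hypothetical helper written for illustration. Real clients usually add more headers, such as Proxy-Authorization or User-Agent.

```javascript
// Hypothetical helper: builds the plaintext CONNECT request a client sends
// to an HTTP proxy before any TLS handshake with the target takes place.
function buildConnectRequest(targetUrl) {
  const { hostname, port } = new URL(targetUrl);
  // The URL parser leaves port empty for the default HTTPS port, so fall
  // back to 443.
  const authority = `${hostname}:${port || 443}`;
  // Only the host and port appear here; the path and query string of the
  // target URL travel later, inside the encrypted tunnel.
  return [
    `CONNECT ${authority} HTTP/1.1`,
    `Host: ${authority}`,
    "",
    "",
  ].join("\r\n");
}

console.log(buildConnectRequest("https://httpbin.org/basic-auth/foo/bar"));
// CONNECT httpbin.org:443 HTTP/1.1
// Host: httpbin.org:443
```

The proxy can see that you are connecting to httpbin.org, but not which page you request or what the response contains.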