Hey. I'm in a situation where I have to choose either Puppeteer or Playwright. I'm interested in nothing else but maximum efficiency and stability, knowing that my scripts take hours/days to finish.
Thanks.
Which of these MCP servers is best for connecting Cursor so it can see the errors and issues my card is dealing with? Or is there another option?
the puppeteer stealth package was deprecated as i read. how "bad" is it now? i dont need perfect stealth detection right now, good stealth detection would be sufficient for me.
is there a similar stealth package for playwright? or is there any up to date stealth package right now in general? i'm looking for the 20% effort 80% result approach right here.
or what would be your general take on medium-effort scraping in Node.js? basically i just need to read some og:images from some websites :) thanks for your answers!
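For the og:image case specifically, a browser may not be needed at all when the tag is present in the server-rendered HTML. Below is a minimal sketch, assuming you already have the page HTML as a string; `extractOgImage` is a hypothetical helper name, not part of Puppeteer or Playwright, and the regex parsing is deliberately simple (it would miss a tag whose attribute values contain a `>` character).

```typescript
// Hypothetical helper: pull the og:image URL out of raw HTML.
// Assumption: the <meta> tag is in the static HTML, not injected by JavaScript.
function extractOgImage(html: string): string | null {
  // Collect every <meta ...> tag, then inspect its attributes one by one.
  const metaTags = html.match(/<meta\b[^>]*>/gi) ?? [];
  for (const tag of metaTags) {
    const attrs = new Map<string, string>();
    // Matches name="value", name='value', or name=value.
    const attrPattern =
      /([a-zA-Z:_-]+)\s*=\s*(?:"([^"]*)"|'([^']*)'|([^\s"'>]+))/g;
    let m: RegExpExecArray | null;
    while ((m = attrPattern.exec(tag)) !== null) {
      attrs.set(m[1].toLowerCase(), m[2] ?? m[3] ?? m[4] ?? "");
    }
    // Most sites use property="og:image"; some use name="og:image" instead.
    const key = attrs.get("property") ?? attrs.get("name");
    if (key !== undefined && key.toLowerCase() === "og:image") {
      return attrs.get("content") ?? null;
    }
  }
  return null;
}
```

In Node.js 18 or later the HTML could come from the built-in `fetch`. If the function returns `null` for a site, that is a hint the tag is rendered client-side or the request was blocked, and only then would a headless browser be worth the overhead.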
In 2024 is Puppeteer still best for this?
Quick question for Node.js devs: between Playwright and Puppeteer, which one is less resource intensive in terms of CPU and RAM usage?
Running browser automation on a VPS with limited resources, so performance matters.
Thanks!
I'm new to web scraping and I wanted to know which of these I could use to create a database of phone specs and laptop specs, around 10,000-20,000 items.
First I started learning BeautifulSoup, then hit a roadblock when a "load more" button needed to be clicked
Then I wanted to check out Selenium but heard everyone say it's outdated, and even the tutorial I was trying to follow didn't match what I had to code, because Selenium updates had changed the functions
Now I'm going to learn Playwright because the tutorial guy is doing something similar to what I'm doing
and also I saw some people saying using requests by finding endpoints is the easiest way
Can someone help me out with this?
Hey Everyone, I ran a [rather silly] race between Puppeteer, Playwright and Selenium to see which one would be fastest on a simple scrape.
Far from a comprehensive benchmark, this race is 100% free from advanced configurations, multi-threading or anything complicated. It just opens Wallapop (a second-hand marketplace in Spain) and times how long it takes to extract the first 2000 results of a search.
Another thing to note is that I ran this on Google Colab, which throttles resources unpredictably, so take this as it is: just a simple, fun race with lots of questionable decisions.
If you like this simple format, have any ideas on how to improve a race like this or have a strong urge to prove Ward Cunningham wright, let me know in the comments!
(Also, if you think your tool of choice isn't being represented fairly, feel free to show how simple code improvements yield more speed with the same resources :)
Just as the title suggests :)
I remember thinking Playwright was the obvious option for a few years, but I've never really found myself needing the extra browsers.
I'm a full-stack Typescript fanatic anyway, almost exclusively using chromium based browsers, and I'm wondering if Puppeteer has any advantages in speed, dev tooling or reliability seeing as it focuses on the same.
How would you host Puppeteer or Playwright with minimal downtime? I've tried a VPS, but sometimes it can max out with just 1 user on it - but it's affordable.
I see browserless.io starts at $200 a month, which seems a bit steep - can it handle hundreds of requests at a time or something? Anyone got experience of running Puppeteer on an app used by hundreds of people? What would you recommend going with?
I want to write a small script to scrape a small business directory in my city. Nothing crazy, it's a single page with filters, not hundreds of individual pages.
I'm looking for a lightweight library. Don't want to download a full Chromium install if I don't have to.
I've looked into Osmosis, Xray, and NoodleJS and none of them are actively maintained (will that even be an issue for my use case?).
Are there alternatives to Puppeteer and Cheerio for scraping in 2023 or are they still the "go-to"? I liked the simplified API of Osmosis and Xray, but yeah, like I mentioned, they are not maintained.
Thanks for the suggestions!
If your Playwright/Puppeteer scripts work fine and never get blocked, this isn't for you.
But if you're tired of your automation breaking every time a site updates their anti-bot detection, keep reading.
The problem: Traditional browser automation gets flagged. You spend more time fixing broken scripts than actually automating things. Especially painful for sites without solid APIs like LinkedIn, Twitter, or Reddit.
What I switched to: CDP MCP (Chrome DevTools Protocol with Model Context Protocol)
Here's the magic: The AI runs the workflow once, learns the pattern, then it executes without the LLM - making it 100x cheaper and way more reliable.
What I'm automating now:
Go to twitter and post this {content}
Open Gmail and send this email: {content} to {recipient} with subject:{subject}
Open my web app and Test the login functionality with these credentials {username}, {password}
Go to this LinkedIn profile {profile link} and extract the professional experiences and details of this person (output in JSON)
Go to Reddit and post {content} in this community: {community}, adhering to Guidelines: {guidelines}
Go to Reddit and get all comments from this post: {link}
Go to Reddit and reply {response} to this comment {comment}
The killer feature: These workflows become API calls you can plug into n8n, Make, or your own pipelines.
Same outcome every time. No more "why did my automation break overnight?"
For the automation engineers here: How much of your time is spent debugging scripts that worked yesterday?
Because mine just got that time back. And my monthly LLM costs went from $200 to $2.
It's free and open source if you want to try it out.
Hi! I'm working on a pet project requiring headful automated page browsing. After doing a bit of preliminary research, I've settled on Puppeteer. One other thing I noticed is the whole Cypress vs Playwright vs Selenium debate going on, and Puppeteer is omitted from it. Any ideas why?
fwiw one of the head playwright devs (pavel feldman) invented puppeteer. I don't think playwright is using puppeteer under the hood but it's a really great tool for automating browsers.
Puppeteer is more of a tool to control Chromium than a testing framework. The company I work for built our AT solution using Puppeteer, with Jest as the testing framework, and while it worked and we used it for 5-6 years, it’s just a ton of our own custom-built logic. Compared to Playwright, where everything seems to just work right out of the box and there's even codegen, it’s just not worth essentially writing your own tool. Will Puppeteer work? Yes, but you will be building a lot of features to get it to compare to what Cypress/Playwright/Selenium give you in their vanilla forms. Now if you aren’t testing with assertions etc., Puppeteer is indeed awesome for just controlling the browser, which is what you are asking about. I still think Playwright just feels smoother and the codegen works pretty well for click and record, whereas Puppeteer doesn’t have that.