
Playwright URL Scraping

Sample Code to get all URLs

While experimenting with Playwright this week, I put together a script that grabs all the URLs from a website and writes them to a file. Here's the code that I finally came up with:
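As a rough illustration of what such a script can look like, here is a minimal sketch that assumes Playwright for Python and its sync API. The names `unique_urls` and `scrape_urls` and the `urls.txt` output path are hypothetical, and the sketch is not necessarily the script described above: it reads the anchors on a single page rather than crawling a whole site.

```python
from pathlib import Path
from urllib.parse import urldefrag


def unique_urls(hrefs):
    """Drop fragments, non-HTTP links, and duplicates while keeping page order."""
    seen = set()
    result = []
    for href in hrefs:
        url, _fragment = urldefrag(href.strip())
        if url.startswith(("http://", "https://")) and url not in seen:
            seen.add(url)
            result.append(url)
    return result


def scrape_urls(start_url, out_path="urls.txt"):
    """Collect every anchor href on start_url and write one URL per line."""
    # Imported here so unique_urls() stays usable without Playwright installed.
    from playwright.sync_api import sync_playwright

    with sync_playwright() as p:
        browser = p.chromium.launch()
        page = browser.new_page()
        page.goto(start_url)
        # el.href is the absolute URL as resolved by the browser.
        hrefs = page.locator("a[href]").evaluate_all(
            "els => els.map(el => el.href)"
        )
        browser.close()

    urls = unique_urls(hrefs)
    Path(out_path).write_text("\n".join(urls) + "\n", encoding="utf-8")
    return urls
```

Calling `scrape_urls("https://example.com")` would write the page's links to `urls.txt`. Filtering in a separate pure function keeps the deduplication logic easy to check without launching a browser.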

This approach is useful when you need to confirm that every anchor tag on the homepage works as expected. Checking the anchor tags in their own test isolates broken or misconfigured links, which makes problems easier to pinpoint and fix.

Additionally, I'll create another test specifically to validate that the URLs associated with these anchor tags are correct. This two-pronged strategy ensures that both the structure and the destinations of your links are accurate.
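One way such a second check could be sketched, using only the Python standard library: the sketch assumes that "correct" means the URL answers with a status below 400, which is an interpretation rather than something stated above. The names `fetch_status` and `find_broken` are hypothetical.

```python
import urllib.error
import urllib.request


def fetch_status(url, timeout=10):
    """Return the HTTP status for url, or None if the request never completed."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        return err.code  # 4xx/5xx responses are raised as exceptions
    except OSError:
        return None  # DNS failure, refused connection, timeout, and similar


def find_broken(urls, get_status=fetch_status):
    """Return (url, status) pairs for every URL that did not answer below 400."""
    broken = []
    for url in urls:
        status = get_status(url)
        if status is None or status >= 400:
            broken.append((url, status))
    return broken
```

Passing the status lookup in as a parameter lets the check be exercised with a fake lookup, so the logic can be verified without network access. Some servers reject `HEAD` requests, so a real check may need to fall back to `GET`.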

Pro Tip: Validate the URLs in a separate test rather than while scraping the homepage. Smaller, targeted tests can run in parallel, which shortens the total runtime of the suite, and a failure points more directly at its cause, so debugging is faster.
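To make the parallelism concrete, here is a minimal sketch that assumes pytest as the test runner and a hypothetical `urls.txt` file with one URL per line. Neither is named in the tip itself. Parametrizing over the file turns each URL into its own test item, which a parallel runner can then distribute across workers.

```python
from pathlib import Path
import urllib.request

import pytest

URL_FILE = Path("urls.txt")  # hypothetical output of the scraping step


def load_urls(path=URL_FILE):
    """Read one URL per line, skipping blanks; an absent file yields no tests."""
    if not path.exists():
        return []
    lines = path.read_text(encoding="utf-8").splitlines()
    return [line.strip() for line in lines if line.strip()]


@pytest.mark.parametrize("url", load_urls())
def test_url_responds(url):
    # urlopen raises HTTPError on 4xx/5xx, which also fails the test.
    request = urllib.request.Request(url, method="HEAD")
    with urllib.request.urlopen(request, timeout=10) as response:
        assert response.status < 400
```

With the pytest-xdist plugin installed, running `pytest -n auto` spreads these per-URL tests across the available CPU cores, and each failure is reported against the specific URL that caused it.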


About

Welcome to Playwright Tips and Tricks, your go-to resource for mastering the art of web automation and testing with Playwright! Whether you're a seasoned developer looking to streamline your workflows or a curious beginner eager to dive into the world of browser automation, this blog is designed with you in mind. Here, I'll share a treasure trove of practical insights, clever hacks, and step-by-step guides to help you harness the full power of Playwright - a modern, open-source tool that's revolutionizing how we interact with web applications.
