WiFi-Money: Crawl (with Selenium + AI) your way to freedom
No website is safe from scraping when you learn how to use Selenium + AI
A remarkable number of my contracting jobs have involved scraping data from the web.
A client on an old version of software needs to download their data before their vendor migrates them to a new version.
An upstart fintech wants to pull in the latest financial data for free before committing to expensive data licensing agreements.
An indie hacker wants to aggregate data on all key influencers in a given industry and sell access to the data set.
A new crypto app wants to pull social media trend data to offer a way to bet for or against certain hashtags.
In all these cases, Selenium was the only way to pull mass amounts of clean data, consistently, from some of the most locked-down websites in the world.
Ship fast, or software that will last? Now, you don't have to choose.
Fortune 500 companies and well-funded startups usually snatch up all the 10x programmers. Turns out, not all of them.
Hire me as CTO or tech lead for your next project.
Focus on growing the business, never worry about the tech again.
Slide into my DMs on X or Substack. Select portfolio on Deca Labs.
With AI, it's now easier than ever to write software that uses Selenium to extract data from any website, captchas be damned.
This post is a deep dive on how to do this, whether for your own project or for a client.
I've already made five figures just from data-scraping projects where I used Selenium.
How much could you make?
First off, a special thank you to the readers who pay to make this newsletter possible. Your support lets me continue to share tactics to grow your tech career and in-depth reporting on Canada's economy, politics, and national trajectory.
Are you looking for 1:1 coaching on anything from your tech career to Canadian real estate? DM me on Substack or X (formerly Twitter) with your proposed topic, and I'll let you know if I can help. Skip the mid-IQ takes on Reddit; pay $100 for 30 minutes of information you can trust.
Ready to work through the paid subscriber tech career tactics archive? Start here and report back your wins. It’s a growing club of winners, and we’re waiting for you to join.
– Fullstack
What is Selenium?
Selenium has long been the dominant library for web scraping and remains a powerful tool.
Notably, Selenium can drive a full Chrome browser: typing into text fields, clicking links, and effectively simulating human actions on almost any website.
Pair it with a captcha-breaking API (see my preferred one below) and the online world is your oyster.
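To make that concrete, here's a minimal sketch of driving Chrome with Selenium's Python bindings, assuming Selenium 4+ (where Selenium Manager fetches a matching chromedriver automatically). It uses the Selenium project's public demo form page; swap in your own target URL and locators:

```python
from selenium import webdriver
from selenium.webdriver.common.by import By

# Selenium 4+ downloads a matching chromedriver automatically via Selenium Manager.
driver = webdriver.Chrome()
try:
    # The Selenium project's demo form page -- swap in your target site.
    driver.get("https://www.selenium.dev/selenium/web/web-form.html")

    # Type into a text field and submit, just like a human would.
    text_box = driver.find_element(By.NAME, "my-text")
    text_box.send_keys("hello from Selenium")
    driver.find_element(By.CSS_SELECTOR, "button").click()

    # Read data back from the resulting page.
    message = driver.find_element(By.ID, "message")
    print(message.text)
finally:
    driver.quit()
```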
Selenium has well-tested client libraries for Python, Java, Node, and many other languages. I even shipped a Kotlin starter project called KtSaaS which included a Selenium crawler.
While popular in frontend app integration tests, Selenium can also be your tool to make any website your personal API.
While AI offerings are slowly starting to dabble in AI-controlled browsers and scraping, their lack of determinism and risk of hallucination mean that for many use cases a Selenium scraper will still be necessary if you plan to use the data for mission-critical tasks.
Map, Reduce, Clean, Export
A scraping project comes down to four key phases (a minimal end-to-end sketch follows the list):
Map: Iterate through the full set of input data (web pages)
Reduce: Aggregate data into a structured, durable location (JSON in S3, rows in a database...)
Clean: Iterate over the aggregated data and sanitize it. Websites often use human-readable formats for timestamps and other fields that you'd prefer to parse into more easily usable machine formats, like a Unix timestamp.
Export: Export your aggregate data in a format that you can use in your final product (CSV file, paged JSON response in a REST endpoint...)
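Here's a hedged skeleton of all four phases in Python. The page list, CSS selectors, and date format are hypothetical placeholders, and a local JSON file stands in for S3 or a database:

```python
import csv
import json
from datetime import datetime

from selenium import webdriver
from selenium.webdriver.common.by import By

# Hypothetical input set -- in a real project this might come from a sitemap or index page.
PAGES = [f"https://example.com/articles?page={i}" for i in range(1, 4)]

def map_pages(driver):
    """Map: iterate the full set of input pages and yield raw records."""
    for url in PAGES:
        driver.get(url)
        for row in driver.find_elements(By.CSS_SELECTOR, ".article"):  # placeholder selector
            yield {
                "title": row.find_element(By.CSS_SELECTOR, "h2").text,
                "published": row.find_element(By.CSS_SELECTOR, ".date").text,
            }

def reduce_records(records, path="raw.json"):
    """Reduce: aggregate raw records into one durable location (local JSON here, S3 in production)."""
    data = list(records)
    with open(path, "w") as f:
        json.dump(data, f)
    return data

def clean_records(records):
    """Clean: parse human-readable dates into machine-friendly Unix timestamps."""
    for r in records:
        # Assumes dates like "January 5, 2024" -- adjust the format string to your site.
        r["published_ts"] = int(datetime.strptime(r["published"], "%B %d, %Y").timestamp())
    return records

def export_csv(records, path="out.csv"):
    """Export: dump the cleaned data in the format your final product needs."""
    with open(path, "w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=["title", "published", "published_ts"])
        writer.writeheader()
        writer.writerows(records)

if __name__ == "__main__":
    driver = webdriver.Chrome()
    try:
        export_csv(clean_records(reduce_records(map_pages(driver))))
    finally:
        driver.quit()
```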
Each of these will include some amount of coding, architectural design, and devops, depending on your project.
For the coding parts, you can easily write your scraping software with Copilot, Cursor, Claude Code, or whatever your vibe coding tool of choice is.
For maximum-accuracy projects, I'll generally use Selenium, set breakpoints with a debugger, and step through my script to confirm the extracted values are as expected while the Selenium-controlled browser clicks along.
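For example, dropping Python's built-in breakpoint() into the extraction loop pauses the script with the Selenium-controlled browser still open, so you can inspect values at the pdb prompt while eyeballing the live page (URL and selectors below are placeholders):

```python
from selenium import webdriver
from selenium.webdriver.common.by import By

driver = webdriver.Chrome()
driver.get("https://example.com/articles")  # placeholder URL

for row in driver.find_elements(By.CSS_SELECTOR, ".article"):  # placeholder selector
    title = row.find_element(By.CSS_SELECTOR, "h2").text
    # Pause here: the browser window stays open, so you can compare `title`
    # at the pdb prompt against what's actually rendered on screen.
    breakpoint()
```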
For less critical projects, you can simply spot-check the final JSON or S3 output where you dump your crawled data.
For your most YOLO projects, you can even leverage the latest AI crawling programs like OpenAI's Operator and hope the AI-crawled data is accurate enough for your use case.
At the end of the day, you'll need to assess for your project or with your client what level of diligence is required for the task at hand.
Map
The Map phase is where you'll write your Selenium crawler.
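As a sketch of what that crawler might look like, here's a Map loop that walks "next page" links and waits for dynamically loaded content before reading it. The start URL and selectors are hypothetical; adapt them to your target site:

```python
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.common.exceptions import NoSuchElementException

def crawl(start_url):
    driver = webdriver.Chrome()
    try:
        driver.get(start_url)
        while True:
            # Wait for the JS-rendered listing to appear before reading it.
            WebDriverWait(driver, 10).until(
                EC.presence_of_element_located((By.CSS_SELECTOR, ".listing"))  # placeholder
            )
            for item in driver.find_elements(By.CSS_SELECTOR, ".listing .item"):
                yield item.text
            # Follow the "next page" link until there isn't one.
            try:
                driver.find_element(By.LINK_TEXT, "Next").click()
            except NoSuchElementException:
                break
    finally:
        driver.quit()

for record in crawl("https://example.com/listings"):  # placeholder URL
    print(record)
```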