Screen scraping and robotic process automation (RPA) for fun (and profit?)

When I first started screen scraping, things were simple. Most data lived in plain HTML tables, and there was (almost) no JavaScript. You could just download pages with cURL, parse them with Perl (!), and get whatever data you wanted.

Today, things are different. Everything uses JavaScript, content loads asynchronously, filling in forms enables and disables elements based on validation rules, validation doesn’t always run when you expect it to, and the list goes on.

Here I’m capturing some of the problems I’ve run into and how to work around them:

  • Buttons not clickable until a form field is filled in
  • Form validation not running until a resource is loaded
  • Warnings about duplicate submission attempts
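For the first of these, the general fix is to wait for the element to become clickable rather than sleeping for a fixed time. A minimal plain-Java sketch of that polling pattern (the `WaitUtil` helper and its names are mine; in real scraping code Selenium's `WebDriverWait` or Selenide's built-in conditions do this for you):

```java
import java.time.Duration;
import java.util.function.BooleanSupplier;

// Hypothetical helper: poll a condition until it holds or a timeout expires.
// In a browser test the condition would be "the submit button is enabled".
class WaitUtil {
    public static boolean waitUntil(BooleanSupplier condition, Duration timeout) {
        long deadline = System.nanoTime() + timeout.toNanos();
        while (true) {
            if (condition.getAsBoolean()) {
                return true; // e.g. the button became clickable
            }
            if (System.nanoTime() >= deadline) {
                return false; // gave up: condition never held within the timeout
            }
            try {
                Thread.sleep(100); // poll interval
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
        }
    }
}
```

The point is that the wait is condition-driven with an upper bound, instead of a hard-coded `Thread.sleep` that is either too short (flaky) or too long (slow).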

Notes:

  • Selenide - https://youtu.be/P-vureOnDWY?t=1062
    • No need to download WebDriver binaries by hand (Selenide manages them)
    • Helps write code that copes with AJAX-driven pages
    • No need to add explicit waits (conditions are retried until a timeout)
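For example (Selenide's fluent API; the URL and selectors here are made up), `shouldBe`/`shouldHave` retry the condition until it holds or the default timeout expires, so the AJAX-related waiting is implicit:

```java
import static com.codeborne.selenide.Selenide.*;
import static com.codeborne.selenide.Condition.*;

class LoginFlow {
    static void logIn() {
        open("https://example.com/login");  // made-up URL
        $("#username").setValue("alice");   // made-up selectors
        $("#password").setValue("secret");
        // shouldBe(enabled) retries until the button becomes clickable,
        // then click() runs -- no Thread.sleep, no explicit WebDriverWait
        $("#submit").shouldBe(enabled).click();
        // waits for the AJAX response to render the greeting
        $(".greeting").shouldHave(text("Welcome"));
    }
}
```

Running this for real requires the Selenide dependency and a browser; it is shown only to illustrate the implicit-wait style.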
  • Page object pattern - https://youtu.be/P-vureOnDWY?t=2084
    • IntelliJ → Tools → Open Selenium Page Object Playground
    • Separates tests from page-specific code
    • Single repository for the operations offered by a page
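A framework-agnostic sketch of the pattern (the `Browser` interface is a stand-in of my own for Selenide/WebDriver): selectors live inside the page object, and tests only call the operations it offers.

```java
// Stand-in for the real browser automation API (Selenide, WebDriver, ...).
interface Browser {
    void type(String selector, String text);
    void click(String selector);
    String textOf(String selector);
}

// The page object: one class per page, exposing operations, not selectors.
class LoginPage {
    private final Browser browser;

    LoginPage(Browser browser) {
        this.browser = browser;
    }

    // Tests call high-level operations; the selectors stay private here,
    // so a markup change means editing one class, not every test.
    public void loginAs(String user, String password) {
        browser.type("#username", user);
        browser.type("#password", password);
        browser.click("#submit");
    }

    public String errorMessage() {
        return browser.textOf(".error");
    }
}
```

This is also what makes page objects easy to unit-test: hand the page object a fake `Browser` and assert on the recorded operations.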
  • Selenoid
    • Run browsers in Docker containers
    • To install and run
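A minimal install-and-run sketch, as I remember it from the Aerokube docs (image names and flags may differ by version; check the current documentation before copying):

```shell
# Pull the Selenoid server and a browser image
docker pull aerokube/selenoid:latest-release
docker pull selenoid/chrome:latest

# browsers.json tells Selenoid which browser images it may launch
mkdir -p ~/selenoid
cat > ~/selenoid/browsers.json <<'EOF'
{
  "chrome": {
    "default": "latest",
    "versions": {
      "latest": { "image": "selenoid/chrome:latest", "port": "4444" }
    }
  }
}
EOF

# Selenoid needs the Docker socket so it can start browser containers itself
docker run -d --name selenoid -p 4444:4444 \
  -v /var/run/docker.sock:/var/run/docker.sock \
  -v ~/selenoid:/etc/selenoid:ro \
  aerokube/selenoid:latest-release

# Tests then point at Selenoid as a remote WebDriver:
#   http://localhost:4444/wd/hub
```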

Resources: