We all have been in situations were we need content or information from a connected website, but have no access to a REST Api or any other backend feed.
In these cases screen scraping is the only option to get needed information to finalize an integration. You can do that directly in CURL, but that can be tedious. Far easier to use a nicely packaged solution that combines a component that simulates web browser behavior and a component that eases DOM navigation for HTML and XML documents. Meet Goutte!
Install via composer.
composer require fabpot/goutte
Login into a website and navigate to the page that has your needed information
PHP dotenv loads environment variables from .env to getenv(), $_ENV and $_SERVER automagically.
You should never store sensitive credentials in your code. Anything that is likely to change between deployment environments – such as database credentials or credentials for 3rd party services – should be extracted from the code into environment variables.
Add your application configuration to a .env file in the root of your project. Make sure the .env file is added to your .gitignore so it is not being checked-in.
“Logstalgia is a website traffic visualization tool that replays or streams web-server access logs as a pong-like battle between the web server and an never ending torrent of requests.
Requests appear as colored balls (the same color as the host) which travel across the screen to arrive at the requested location. Successful requests are hit by the paddle while unsuccessful ones (eg 404 – File Not Found) are missed and pass through.
The paths of requests are summarized within the available space by identifying common path prefixes. Related paths are grouped together under headings. For instance, by default paths ending in png, gif or jpg are grouped under the heading Images. Paths that don’t match any of the specified groups are lumped together under a Miscellaneous section.”