For my BrickCompare project, I decided to build a microservice to scrape websites for the pricing data I needed. AWS Lambda from Amazon Web Services was a perfect fit for the task.
Scraping Websites with X-ray
I had already decided to use the Node.js platform to run my microservice, as I was familiar with it. Next I had to select a module to get me started on website scraping. Initially I selected noodlejs, as it looked easy to use and had decent documentation. But after writing 10 or so scrapers for different websites, I found that it was rather buggy and did not return consistent results.
With noodlejs having seen no development in almost 2 years, I decided to find a new module for my needs. I eventually settled on the x-ray web scraper. It was much easier to use, gave more consistent results, and had a much nicer API. I would thoroughly recommend this library for scraping websites.
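To give a feel for the x-ray API, here is a minimal sketch of a price scraper. It is not the actual BrickCompare code: the CSS selectors, field names, example URL, and the `parsePrice` and `scrapePrices` helpers are all assumptions made up for illustration. The only x-ray behaviour it relies on is that calling an x-ray instance with a URL, a scope, and a selector returns a function that takes a Node-style callback.

```javascript
// Sketch only: the selectors and field names below are hypothetical.
const PRODUCT_SCOPE = '.product';
const PRODUCT_FIELDS = [{
  name: '.product-title',
  price: '.product-price',
  link: 'a@href', // "@href" asks x-ray for the attribute instead of the text
}];

// Turn scraped text such as "$1,299.99" into a number, or null if none is found.
function parsePrice(text) {
  const match = String(text || '').replace(/,/g, '').match(/\d+(\.\d+)?/);
  return match ? Number(match[0]) : null;
}

// `x` is an x-ray instance, passed in so the function is easy to test.
// Resolves to an array of { name, price, link } records with numeric prices.
function scrapePrices(x, url) {
  return new Promise((resolve, reject) => {
    x(url, PRODUCT_SCOPE, PRODUCT_FIELDS)((err, items) => {
      if (err) return reject(err);
      resolve(items.map((item) => ({ ...item, price: parsePrice(item.price) })));
    });
  });
}

// Usage, assuming `npm install x-ray` and a hypothetical store URL:
//   const Xray = require('x-ray');
//   scrapePrices(Xray(), 'https://example.com/lego').then(console.log);
```

Passing the x-ray instance in as an argument is just a design choice for this sketch. It keeps the selector logic separate from the network call, so the price parsing can be checked without hitting a real website.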
AWS Lambda Deployed with Serverless
The Serverless framework provided a command line interface for me to easily deploy my website scrapers to AWS Lambda. All the configuration was specified in a YAML file, and Serverless handled the rest through the AWS API. I could even invoke the remote functions through the Serverless command line.
For BrickCompare, I configured the scrapers to trigger every three hours, and Serverless handled the AWS configuration for me. To be honest, it was like magic; it was that easy to use. Whether Serverless configured everything in the most efficient way I am not so sure, as I did run into some limits as the number of my website scrapers increased.
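As an illustration, a `serverless.yml` for a scheduled scraper might look roughly like the fragment below. This is a sketch rather than the actual BrickCompare configuration: the service name, function name, handler path, runtime, region, and timeout are all assumed values. The `schedule` event with a `rate(3 hours)` expression is what asks Serverless to set up the three-hourly trigger.

```yaml
# Sketch only: service name, function name, handler path, runtime,
# region and timeout are hypothetical values.
service: brickcompare-scrapers

provider:
  name: aws
  runtime: nodejs18.x
  region: us-east-1

functions:
  scrapePrices:
    handler: handler.scrape   # the `scrape` export in handler.js
    timeout: 60               # seconds; scraping can be slow
    events:
      - schedule: rate(3 hours)
```

With a file like this in place, `serverless deploy` pushes the function to AWS Lambda, and `serverless invoke -f scrapePrices` runs the deployed function from the command line.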
Microservices are the Future
I found that AWS Lambda with Serverless was a great way to run my scrapers, and I could easily scrape a few website pages for the data I needed in a matter of seconds. The best part is that AWS is pretty generous with the Lambda free tier, and I am still within those limits. So everything was done at zero cost!