Simple Amazon Web scraper in PHP

This is a very short tutorial on how to scrape data from the web (we will use amazon.in as the example).

It's a very simple project to get you started with how web scraping works, so that you can scrape other pages, store the data, and then analyze it or build other applications on top of it.

Prerequisites:

  • Basic PHP
  • Some HTML knowledge, to locate the data that you want

So let's begin.


Only a few lines of PHP are needed to extract the price from a URL (an Amazon product page).

Explanation

  1. First, we store the URL of the product page whose price we want in a variable called $filename.
  2. Then we pass this URL to the PHP function file_get_contents. Given a URL, it returns the content found there as a string (or false on failure).
  3. To parse data out of that string, we use the DOMDocument class.
  4. We load the HTML into a DOMDocument object with the loadHTML method, then look up the element that holds the price with getElementById. This method returns a DOMElement object (or null if no element has that id).
  5. Then we read the node value (nodeValue) from this DOMElement object.
  6. The returned value may contain junk characters, so we have to trim and filter it (i.e. remove commas etc.).
    We use explode to split the price string and keep the part we need.
  7. Finally, we cast it to int and process it further. We may store it in a database for later processing.
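
The steps above can be sketched roughly as follows. This is a minimal illustration, not the original listing. The function names extractPrice and fetchPrice, the element id priceblock_ourprice, the sample markup, and the choice to explode on the decimal point are all assumptions. Inspect the real page to find the id that actually holds the price. Note that file_get_contents can only fetch URLs when allow_url_fopen is enabled.

```php
<?php
// Hypothetical sketch of steps 1-7. Ids, names and sample HTML are assumptions.

/**
 * Steps 3-7: parse the HTML and return the price as an int,
 * or null if the element is missing or holds no digits.
 */
function extractPrice(string $html, string $elementId): ?int
{
    $doc = new DOMDocument();
    // Real pages are rarely valid HTML, so keep libxml warnings quiet.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $element = $doc->getElementById($elementId);
    if ($element === null) {
        return null;
    }

    $raw = trim($element->nodeValue);              // e.g. "1,299.00"
    $wholePart = explode('.', $raw)[0];            // assumed: keep the part before the decimal point
    $digits = preg_replace('/\D/', '', $wholePart); // strip commas, spaces, currency signs
    return $digits === '' ? null : (int) $digits;
}

/**
 * Steps 1-2: download the page at $filename, then extract the price.
 */
function fetchPrice(string $filename, string $elementId): ?int
{
    $html = @file_get_contents($filename);
    if ($html === false || $html === '') {
        return null; // request failed or returned nothing
    }
    return extractPrice($html, $elementId);
}

// Offline demonstration on a hand-written snippet instead of a live page.
$sampleHtml = '<html><body><span id="priceblock_ourprice"> 1,299.00 </span></body></html>';
var_dump(extractPrice($sampleHtml, 'priceblock_ourprice')); // int(1299)
```

Splitting parsing (extractPrice) from downloading (fetchPrice) lets you try the parsing logic on a saved copy of the page before making any live requests.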

Advanced

  • You can also scrape the title and product id and store them in a database along with the price
  • You can use this method for other websites as well, such as Flipkart, Snapdeal, Paytm etc.
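
As a rough sketch of the first idea, the snippet below reads a title and a price from a hand-written HTML sample and stores them with PDO. Everything specific here is an assumption for illustration: the helper extractText, the ids productTitle and priceblock_ourprice, the products table layout, the placeholder product id, and the use of an in-memory SQLite database (which needs the pdo_sqlite extension). Other sites will use different ids, so inspect each page first.

```php
<?php
// Hypothetical sketch: ids, table layout and the product id are assumptions.

/** Return the trimmed text of the element with the given id, or null. */
function extractText(DOMDocument $doc, string $elementId): ?string
{
    $element = $doc->getElementById($elementId);
    return $element === null ? null : trim($element->nodeValue);
}

// Hand-written sample standing in for a downloaded product page.
$sampleHtml = '<html><body>'
    . '<span id="productTitle"> Example Phone </span>'
    . '<span id="priceblock_ourprice"> 12,999.00 </span>'
    . '</body></html>';

$doc = new DOMDocument();
libxml_use_internal_errors(true); // tolerate imperfect HTML
$doc->loadHTML($sampleHtml);
libxml_clear_errors();

$title = extractText($doc, 'productTitle');
$priceText = extractText($doc, 'priceblock_ourprice') ?? '';
// Keep the part before the decimal point, then strip non-digits.
$price = (int) preg_replace('/\D/', '', explode('.', $priceText)[0]);

// In-memory SQLite keeps the example self-contained; swap the DSN for a real database.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec('CREATE TABLE products (product_id TEXT PRIMARY KEY, title TEXT, price INTEGER)');

// A prepared statement keeps scraped text from being interpreted as SQL.
$stmt = $pdo->prepare('INSERT INTO products (product_id, title, price) VALUES (?, ?, ?)');
$stmt->execute(['EXAMPLE-ID', $title, $price]); // placeholder product id
```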
