How to scrape data and images from Mediamarkt website listings

Published: 21 July 2020
Channel: Webharvy Videos

Because the first listings page differs from the pages that follow it, we need to create two configuration files: one for the first page and one for the subsequent pages.

First page data extraction:
--------------------------------------
Start URL:

https://www.mediamarkt.de/de/category...


Regular expression for the product name:

meta itemprop="name" content="([^"]*)


Regular expression for the image (srcset attribute):

srcset="([^"]*)
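The two regular expressions above can be tried outside WebHarvy to see what they capture. The following is a minimal sketch in Python, used here only for illustration; WebHarvy applies the expressions itself. The sample HTML, the example.com URLs and the helper names first_match and all_matches are assumptions, and the real Mediamarkt markup may differ.

```python
import re

# Hypothetical details-page fragment (assumption, not real Mediamarkt markup).
SAMPLE_HTML = '''
<meta itemprop="name" content="Example Phone 128 GB">
<img srcset="https://example.com/img/a.jpg 1x, https://example.com/img/a@2x.jpg 2x">
<img srcset="https://example.com/img/b.jpg 1x">
'''

# The two expressions from the description, copied unchanged.
NAME_REGEX = r'meta itemprop="name" content="([^"]*)'
IMAGE_REGEX = r'srcset="([^"]*)'


def first_match(pattern, html):
    """Return the first captured group, or None if the pattern does not match."""
    match = re.search(pattern, html)
    return match.group(1) if match else None


def all_matches(pattern, html):
    """Return the captured group of every match, in document order."""
    return re.findall(pattern, html)


name = first_match(NAME_REGEX, SAMPLE_HTML)
first_image = first_match(IMAGE_REGEX, SAMPLE_HTML)
all_images = all_matches(IMAGE_REGEX, SAMPLE_HTML)

print(name)
print(first_image)
print(len(all_images))
```

Each capture group stops at the next double quote, so it returns the whole attribute value. For srcset that value may hold several comma-separated image candidates rather than a single URL.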


JS code to return to the listings page:

window.history.back();


To capture multiple images, open the configuration file in a text editor and replace:

Image_RegEx
with
Image_RegExMulti
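The same replacement can be scripted instead of done by hand. This is a minimal sketch, assuming the configuration file is plain text that contains the literal token Image_RegEx; the function name enable_multi_image and the sample line are hypothetical and do not reflect the actual configuration file layout.

```python
import re


def enable_multi_image(config_text):
    """Replace Image_RegEx with Image_RegExMulti.

    The negative lookahead skips tokens that already end in "Multi",
    so running the function twice does not produce Image_RegExMultiMulti.
    """
    return re.sub(r"Image_RegEx(?!Multi)", "Image_RegExMulti", config_text)


# Hypothetical line for illustration only (assumption about the file content).
sample = 'capture type: Image_RegEx, pattern: srcset="([^"]*)'
updated = enable_multi_image(sample)
print(updated)
```

To apply it to a real file, read the file's text, pass it through enable_multi_image, and write the result back, keeping a backup copy of the original configuration.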


Subsequent pages data extraction:
-------------------------------------------------
Start URL:

https://www.mediamarkt.de/de/category...

Added URL:

https://www.mediamarkt.de/de/category...%%pagenumber%%
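To illustrate what the %%pagenumber%% placeholder stands for, here is a minimal sketch that expands a URL template into one URL per page. The example.com template, the page range and the function name expand_pages are assumptions. The real Mediamarkt URL is truncated in this description, and how WebHarvy performs the substitution internally is not stated here.

```python
PLACEHOLDER = "%%pagenumber%%"


def expand_pages(template, first_page, last_page):
    """Return one URL per page number, from first_page to last_page inclusive."""
    if PLACEHOLDER not in template:
        raise ValueError("template does not contain " + PLACEHOLDER)
    return [
        template.replace(PLACEHOLDER, str(page))
        for page in range(first_page, last_page + 1)
    ]


# Hypothetical template (assumption); starts at page 2 because the first
# page is handled by the separate first-page configuration.
urls = expand_pages("https://example.com/category?page=%%pagenumber%%", 2, 4)
for url in urls:
    print(url)
```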


Details page extraction is the same for both configuration files. Follow the same steps used for the first configuration file, then return to the listings page using the JS code (window.history.back();).