Portfolio, Mikołaj Kabała

Gravel Scraper

A Python-based web scraping application designed to automatically collect gravel bike listings from multiple online stores, unify the data into a consistent format, and enable easy comparison of prices, specifications, and ratings using structured JSON output.

Background

The project was created out of a personal need to quickly compare gravel bike models available across different online stores without manually browsing multiple websites. The goal was to automate the data collection process and build a unified dataset that enables fast and efficient comparison of available offers.

The application uses web scraping techniques to extract product data from two major online bike retailers: Centrum Rowerowe and Decathlon. The system automatically navigates through paginated listings, collects product details such as name, brand, price, rating, and URL, and stores the results in a structured JSON format for further analysis using pandas.

Currently, I am scraping data daily to collect a representative dataset for further analysis. I assume that a time period of 30–60 days of data collection will be sufficient.

Technologies: Python, Selenium, BeautifulSoup, Pandas, JSON

View on GitHub

Image Gallery

Main.py file

Web driver scraping from Centrum rowerowe

Web driver scraping from Decathlon

Folder with data in JSON files