Rarebeauty Scraper is a robust data extraction tool that collects detailed cosmetic product information from rarebeauty.com. It helps businesses and analysts track product details, pricing, and availability to support smarter e-commerce and market decisions.
Created by Bitbash to showcase our approach to scraping and automation.
This project extracts structured cosmetic product data directly from individual product pages. It solves the challenge of manually collecting consistent product information at scale. It is designed for e-commerce teams, data analysts, and researchers working with beauty product data.
- Collects accurate product-level details from official product pages
- Supports batch processing through multiple URLs
- Outputs clean, structured data ready for analysis
- Ensures consistency across all extracted records
| Feature | Description |
|---|---|
| Batch URL Input | Scrape multiple product pages in a single run. |
| Detailed Product Data | Captures pricing, images, descriptions, and SKUs. |
| Structured Output | Delivers clean, analysis-ready JSON data. |
| Consistent Parsing | Maintains accuracy across different product pages. |
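The batch input step can be sketched roughly as follows. This is a minimal illustration, not the project's actual loader: the function name `load_product_urls` is hypothetical, and it assumes the input is one URL per line and that product pages live under `/products/` on rarebeauty.com (as in the sample record's `product_url`).

```python
from urllib.parse import urlparse

# Assumption: only these hosts count as the target site.
ALLOWED_HOSTS = {"rarebeauty.com", "www.rarebeauty.com"}


def load_product_urls(lines):
    """Return unique product URLs in first-seen order, skipping anything else."""
    seen = set()
    urls = []
    for raw in lines:
        url = raw.strip()
        # Skip blank lines and comment lines.
        if not url or url.startswith("#"):
            continue
        parsed = urlparse(url)
        if parsed.scheme != "https" or parsed.netloc.lower() not in ALLOWED_HOSTS:
            continue
        # Assumption: product pages use the /products/<slug> path layout.
        if not parsed.path.startswith("/products/"):
            continue
        if url not in seen:
            seen.add(url)
            urls.append(url)
    return urls
```

Filtering and de-duplicating before the run keeps a batch from fetching the same page twice or spending requests on non-product pages.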
| Field Name | Field Description |
|---|---|
| product_name | Official name of the cosmetic product. |
| product_price | Product price followed by its currency code (e.g., `22.00 USD`). |
| product_image | Main image URL of the product. |
| product_url | Source URL of the product page. |
| description | Full product description and benefits. |
| sku | Unique stock keeping unit identifier. |
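A consumer of the output might check each record against the field list above before loading it into a dataset. The sketch below is one possible check, not the contents of the project's `validators.py`; `validate_record` and its rules (every field a non-empty string, URL fields starting with `https://`) are assumptions made for illustration.

```python
# Field names taken from the table above.
REQUIRED_FIELDS = (
    "product_name",
    "product_price",
    "product_image",
    "product_url",
    "description",
    "sku",
)
URL_FIELDS = ("product_image", "product_url")


def validate_record(record):
    """Return a list of problems; an empty list means the record looks complete."""
    problems = []
    for field in REQUIRED_FIELDS:
        value = record.get(field)
        if not isinstance(value, str) or not value.strip():
            problems.append(f"missing or empty field: {field}")
    for field in URL_FIELDS:
        value = record.get(field)
        # Assumption: both URL fields should be absolute https URLs.
        if isinstance(value, str) and value.strip() and not value.startswith("https://"):
            problems.append(f"not an https URL: {field}")
    return problems
```

Returning a list of problems rather than raising on the first one makes it easier to report everything wrong with a record in a single pass.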
[
  {
    "product_name": "Soft Pinch Tinted Lip Oil",
    "product_price": "22.00 USD",
    "product_image": "https://www.rarebeauty.com/cdn/shop/products/soft-pinch-tinted-lip-oil-serenity-1440x1952.jpg",
    "product_url": "https://www.rarebeauty.com/products/soft-pinch-tinted-lip-oil",
    "description": "An innovative lip jelly that transforms into a lightweight oil.",
    "sku": "FGPSPO0001F1"
  }
]
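Because `product_price` arrives as a single string such as `"22.00 USD"`, price analysis usually needs it split into an amount and a currency. The helper below is a hedged sketch: `parse_price` is a hypothetical name, and it assumes the value is always `<amount> <currency>` separated by whitespace, as in the sample record.

```python
from decimal import Decimal, InvalidOperation


def parse_price(price_text):
    """Split a string like '22.00 USD' into (Decimal amount, currency code)."""
    parts = price_text.split()
    # Assumption: exactly two whitespace-separated parts, amount first.
    if len(parts) != 2:
        raise ValueError(f"unexpected price format: {price_text!r}")
    amount_text, currency = parts
    try:
        amount = Decimal(amount_text)
    except InvalidOperation:
        raise ValueError(f"amount is not a number: {amount_text!r}") from None
    return amount, currency.upper()
```

`Decimal` is used instead of `float` so that prices compare and sum exactly.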
Rarebeauty Scraper/
├── src/
│   ├── runner.py
│   ├── extractors/
│   │   ├── product_parser.py
│   │   └── validators.py
│   ├── outputs/
│   │   └── json_exporter.py
│   └── config/
│       └── settings.example.json
├── data/
│   ├── inputs.sample.txt
│   └── sample_output.json
├── requirements.txt
└── README.md
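The tree lists `src/outputs/json_exporter.py`, but its contents are not shown here. As a rough idea of what an export step could look like, the sketch below serializes a list of records to JSON with the standard library; the function names `records_to_json` and `write_records` are hypothetical.

```python
import json


def records_to_json(records):
    """Serialize records to pretty-printed JSON, keeping non-ASCII text readable."""
    return json.dumps(records, indent=2, ensure_ascii=False)


def write_records(records, path):
    """Write records to `path` as UTF-8 JSON (the path is chosen by the caller)."""
    with open(path, "w", encoding="utf-8") as handle:
        handle.write(records_to_json(records))
        handle.write("\n")
```

Keeping serialization separate from file writing makes the formatting easy to check without touching the filesystem.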
- E-commerce teams use it to monitor product pricing so they can stay competitive.
- Market researchers use it to analyze beauty product trends for strategic insights.
- Data analysts use it to build structured datasets for reporting and dashboards.
- Retailers use it to track product availability and catalog changes.
Can I scrape multiple products at once? Yes, you can provide multiple product URLs to extract data in a single execution.
What type of data format is returned? The scraper returns structured JSON suitable for analytics, storage, or integration.
Does it support product updates over time? Yes. Running the scraper repeatedly and comparing the outputs lets you track price and content changes.
Are all products supported? It works with standard product pages available on the website.
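Tracking changes over time comes down to comparing the output of two runs. The following is a minimal sketch of that comparison, assuming `sku` is a stable key across runs and that price and description are the fields of interest; `diff_runs` is a hypothetical helper, not part of the project.

```python
# Assumption: these are the fields worth watching between runs.
TRACKED_FIELDS = ("product_price", "description")


def diff_runs(previous, current):
    """Compare two lists of records by SKU and list new products and changed fields."""
    previous_by_sku = {record["sku"]: record for record in previous}
    changes = []
    for record in current:
        sku = record["sku"]
        old = previous_by_sku.get(sku)
        if old is None:
            changes.append({"sku": sku, "change": "new"})
            continue
        for field in TRACKED_FIELDS:
            if old.get(field) != record.get(field):
                changes.append({
                    "sku": sku,
                    "change": field,
                    "old": old.get(field),
                    "new": record.get(field),
                })
    return changes
```

Products that disappear between runs are not reported by this sketch; that would need a second pass over `previous`.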
Primary Metric: Processes an average product page in under 2 seconds.
Reliability Metric: Maintains a success rate above 98% on valid product URLs.
Efficiency Metric: Handles large URL batches with minimal memory overhead.
Quality Metric: Extracts complete product records with consistent field coverage.
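To check success rate and field coverage on your own batches, a summary like the one below may help. It is only an illustration of how such numbers could be computed, not how the figures above were measured; `summarize_run` is a hypothetical name, and it assumes each URL yields either a record dict or `None` on failure.

```python
def summarize_run(outcomes, fields):
    """Compute success rate and per-field coverage for one batch.

    outcomes: one entry per input URL, a record dict or None if the page failed.
    fields: field names whose presence should be measured.
    """
    total = len(outcomes)
    records = [record for record in outcomes if record is not None]
    success_rate = len(records) / total if total else 0.0
    coverage = {}
    for field in fields:
        # A field counts as covered when it is present and non-empty.
        filled = sum(1 for record in records if record.get(field))
        coverage[field] = filled / len(records) if records else 0.0
    return {"success_rate": success_rate, "coverage": coverage}
```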
