A modern data warehouse built with SQL Server, demonstrating end-to-end data engineering — from raw data ingestion through to business-ready analytics. Built using medallion architecture (bronze, silver, gold layers) with ETL pipelines, star schema modelling, and analytical reporting.
Bronze Layer — Stores raw data as-is from source systems. Data is ingested from CSV files (ERP and CRM) into SQL Server.
Silver Layer — Data cleansing, standardisation, and normalisation. Handles quality issues, resolves inconsistencies, and prepares data for downstream modelling.
Gold Layer — Business-ready data modelled into a star schema. Optimised for analytical queries and reporting, with fact and dimension tables designed for performance.
- Medallion architecture - Structured data flow across bronze, silver, and gold layers
- Star schema design - Fact and dimension tables optimised for analytical workloads
- ETL pipeline development - Extract, transform, and load processes from source to warehouse
- Data quality engineering - Cleansing, deduplication, and standardisation of raw data
- Multi-source integration - Combining ERP and CRM systems into a unified data model
- Analytical reporting — SQL-based insights into customer behaviour, product performance, and sales trends
| Area | Detail |
|---|---|
| Data Sources | ERP and CRM systems (provided as CSV files) |
| Database | SQL Server |
| Architecture | Medallion (Bronze → Silver → Gold) |
| Data Model | Star schema with fact and dimension tables |
| Focus | Latest dataset only (no historisation) |
sql-data-warehouse-project/
│
├── datasets/ # Raw datasets (ERP and CRM data)
│
├── documents/ # Architecture diagrams and documentation
│ ├── Data Flow Diagram.drawio.png
│ ├── Data Mart (Star Schema).drawio.png
│ ├── Data Warehouse Architecture.drawio.png
│ ├── Integration Model.drawio.png
│ └── data_catalog.md
│
├── scripts/ # SQL scripts for ETL and transformations
│ ├── bronze/ # Raw data extraction and loading
│ ├── silver/ # Cleaning and transformation
│ └── gold/ # Analytical models and star schema
│
├── tests/ # Data quality tests and validation
│
├── README.md
└── LICENSE
The gold layer supports SQL-based analytics delivering insights into:
- Customer behaviour - Segmentation, purchasing patterns, and retention analysis
- Product performance - Revenue by product, category trends, and margin analysis
- Sales trends - Period-over-period comparisons, seasonal patterns, and growth metrics
These insights are designed to empower stakeholders with actionable business metrics for strategic decision-making.
| Diagram | Description |
|---|---|
| Data Warehouse Architecture | End-to-end system architecture |
| Data Flow Diagram | Data movement across layers |
| Star Schema | Gold layer dimensional model |
| Integration Model | Source system integration design |
| Data Catalog | Field descriptions and metadata |
Data engineering postgraduate with a background in Accounting and Finance. Currently completing an MSc in Computer Science with Data Analytics, building end-to-end data solutions that bridge business understanding with technical execution.
This project is licensed under the MIT License. Free to use, modify, and share with proper attribution.
