🍱 A dining recommendation app that suggests meals and restaurants based on the user's geographical location (longitude and latitude).
👉 Access the app and start your exploration now at https://letsdine.sotisanalytics.com.
- LetsDine/: main folder
  - .github/workflows/ci.yml: CI configuration for GitHub
  - logger/: optional notebooks for tests
  - logs/: process logs
    - execution_log.log
    - loading_log.log
  - modules/: modules code
    - cache_data.py: data caching strategy
    - config.py: contains the initial parameters of the app
    - find_restaurants_spark.py: Spark version of find_restaurants
    - find_restaurants.py: calculates the distance between two sets of coordinates
    - load_data_spark.py: Spark version of load_data
    - load_data.py: fetches data from GeoJSON or Parquet files
    - search_GUI.py: web app prototype
  - main.py: main script
  - main_GUI.py: displays the main script in a GUI
  - packages.txt: Java installation for Streamlit Cloud
  - requirements.txt: Python dependencies
  - search: executable script
  - .streamlit/: page configuration (colors, etc.)
  - test/: unit tests
    - test_find_restaurant.py: evaluates distance calculations
    - test_load_data.py: evaluates data loading and data quality
    - test_main.py: evaluates the main script
    - test_search_GUI.py: evaluates the web app prototype
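For context, find_restaurants.py calculates the distance between two sets of coordinates. The repository's exact implementation is not shown here, but a common approach for this kind of calculation is the haversine formula; the sketch below is illustrative (the function name and the Earth-radius constant are assumptions, not taken from the repository):

```python
import math

def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two (latitude, longitude) points."""
    R = 6371000.0  # mean Earth radius in meters (approximation)
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * R * math.asin(math.sqrt(a))
```

A restaurant is then kept if this distance to the point of interest is below the requested radius.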
To run the app locally, follow the instructions below.
- Anaconda or Miniconda
- Docker (for Docker deployment)
- Python 3.11
- Python libraries:

```
pip install -r requirements.txt
```
restaurants_paris.parquet is derived from restaurants_paris.geojson: the data were cleaned (null values and duplicates removed) and only 3 columns were kept, as follows.
| name | latitude | longitude |
|---|---|---|
| str | float | float |

*6,273 rows*
restaurants_simulated_france.parquet is a simulated dataset with 15 million rows. Each row is a simulated restaurant with a basic name (e.g. Restaurant_13235, with indices ranging from 0 to 15M) and a latitude/longitude somewhere in France.
| name | latitude | longitude |
|---|---|---|
| str | float | float |

*15,000,000 rows*
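A dataset of this shape can be produced with a simple generator. The sketch below is illustrative only (the bounding box for metropolitan France and the function name are assumptions, not the repository's actual generation script):

```python
import random

# Rough bounding box for metropolitan France (an assumption for illustration)
LAT_MIN, LAT_MAX = 41.3, 51.1
LON_MIN, LON_MAX = -5.1, 9.6

def simulate_restaurants(n, seed=42):
    """Yield (name, latitude, longitude) rows shaped like the simulated dataset."""
    rng = random.Random(seed)
    for i in range(n):
        yield (
            f"Restaurant_{i}",
            rng.uniform(LAT_MIN, LAT_MAX),
            rng.uniform(LON_MIN, LON_MAX),
        )
```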
You can try the app in big data mode, with a table of 15,000,000 rows instead of 6,200+.
If you want to use this feature, you need to download the data stored in an AWS S3 bucket by clicking here.
Once downloaded, the file restaurants_simulated_france.parquet should be placed in the LetsDine/static/data/ folder. It will then be detected automatically when you use this mode.
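The automatic detection can be as simple as checking that the file exists at the expected path before enabling big data mode; a minimal sketch (the function name is illustrative, not from the repository):

```python
from pathlib import Path

# Expected location of the downloaded big-data file (path from the folder layout above)
BIG_DATA_FILE = Path("LetsDine/static/data/restaurants_simulated_france.parquet")

def big_data_available():
    """Return True only if the big-data Parquet file has been downloaded."""
    return BIG_DATA_FILE.is_file()
```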
You can choose between 3 execution modes. Use the one that suits you best:
- Run using the executable (2.1.)
- Run using the Python script (2.2.)
- Run using Streamlit (web UI, 2.3.)
In the terminal, example 1:

```
./search latitude=48.865 longitude=2.380 radius=1000
```

In the terminal, example 2:

```
./search latitude=48.865 longitude=2.380 radius=1000 use_spark=False big_data=False verbose=False
```

You must specify 3 mandatory values (see example 1):
- latitude: float, example: 48.865. Geographic coordinate of the place of interest.
- longitude: float, example: 2.380. Geographic coordinate of the place of interest.
- radius: float, example: 100. Radius around the place of interest in which you want to find restaurants.
You can specify 3 optional values (see example 2):
- use_spark: bool, default is False. Use Spark to process dataframes instead of pandas.
- big_data: bool, default is False. Use a simulated dataset with 15 million simulated restaurant names and coordinates; if False, use the provided dataset (around 6,000 restaurants).
- verbose: bool, default is False. Print info, mainly for debugging.
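The key=value arguments above can be parsed with a few lines of plain Python. This is only a sketch of the idea, not the repository's actual parser (the function name and error messages are assumptions):

```python
def parse_cli_args(argv):
    """Parse tokens like 'latitude=48.865 radius=1000' into typed values."""
    to_bool = lambda s: s.lower() == "true"
    casts = {
        "latitude": float, "longitude": float, "radius": float,
        "use_spark": to_bool, "big_data": to_bool, "verbose": to_bool,
    }
    args = {}
    for token in argv:
        key, _, value = token.partition("=")
        if key not in casts:
            raise ValueError(f"Unknown argument: {key}")
        args[key] = casts[key](value)
    # The 3 positional-style values are mandatory; the booleans default to False.
    for required in ("latitude", "longitude", "radius"):
        if required not in args:
            raise ValueError(f"Missing mandatory argument: {required}")
    return args
```

For example, `parse_cli_args(["latitude=48.865", "longitude=2.380", "radius=1000"])` returns the three mandatory values as floats.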
In the terminal:

```
python main.py
```

[OPTIONAL] In the config file modules/config.py, you can change the following parameters:
- LATITUDE: float, default is 48.865
- LONGITUDE: float, default is 2.380
- RADIUS: int, default is 1000
- USE_SPARK: bool, default is False
- BIG_DATA: bool, default is False
- VERBOSE: bool, default is False
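The list above maps naturally onto a plain-Python settings module. A sketch of what modules/config.py might look like, using only the defaults stated above (the comments are my own annotations):

```python
# modules/config.py -- default parameters (sketch based on the list above)
LATITUDE = 48.865    # float, latitude of the place of interest
LONGITUDE = 2.380    # float, longitude of the place of interest
RADIUS = 1000        # int, search radius around the place of interest
USE_SPARK = False    # bool, use Spark instead of pandas
BIG_DATA = False     # bool, use the 15M-row simulated dataset
VERBOSE = False      # bool, print debug info
```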
In the terminal:

```
streamlit run main_GUI.py
```

The app will be available at:
- Local URL: http://localhost:8501
- Network URL: http://192.168.1.5:8501
Streamlit has compatibility problems with Spark. To process data with Spark, you should use Option 1 or 2 above.
In the terminal:

```
pytest tests/
```

This solution implements a CI/CD pipeline where unit tests are executed and deployment is carried out on each code push. In this prototype phase, failing unit tests do not halt the deployment process, allowing for flexible development, but this should be reconsidered for production stages to ensure application stability.
For this prototype, environment variables are kept in the main directory (.env) for easy sharing. No sensitive information is found there.
If you run into problems using Spark, you may need to follow these instructions:

1. Install Homebrew (if not already installed):

```
/bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/HEAD/install.sh)"
```

2. Install Java:

```
brew install openjdk@11
```

3. Install Apache Spark using Homebrew:

```
brew install apache-spark
```

4. Set up environment variables. Add the following lines to your shell configuration file (e.g., ~/.bash_profile or ~/.zshrc):

```
export SPARK_HOME=/usr/local/opt/apache-spark/libexec
export PYSPARK_PYTHON=/usr/bin/python3  # use your Python 3 interpreter path
export PATH=$SPARK_HOME/bin:$PATH
```

5. Install PySpark:

```
pip install pyspark
```

6. Verify your installation. To ensure everything is set up correctly, open a terminal and start a PySpark shell:

```
pyspark
```

This will launch the PySpark interactive shell; you should see the Spark logo and version information if the installation was successful.
- LinkedIn: Ludovic Gardy
- Website: https://www.sotisanalytics.com
