Oceans 1876

Data Pipeline

Requires Python >= 3.10

After cloning the repo, run git submodule init and git submodule update to fetch the data from https://github.com/oceans-1876/challenger-data.
Install gnfinder v0.19.
Install gnverifier v1.0.
Install Poetry on your system, either globally or in your virtual environments (the former is preferred).
Run poetry install to install the project dependencies.
If you only need to run the extractor and not the dev dependencies, run poetry install --no-dev (deprecated since Poetry 1.2 in favor of poetry install --without dev).
If you are going to do development work, run pre-commit install in the project root.
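
The prerequisites above can be checked before installing with a small preflight script. This is only a sketch and not part of the repository: the names python_ok and missing_tools are hypothetical, and the executable names in REQUIRED_TOOLS are assumptions about how the tools are installed on PATH. It checks for presence only, not for the gnfinder or gnverifier versions.

```python
import shutil
import sys

# Minimum interpreter version stated in the requirements above.
MIN_PYTHON = (3, 10)

# Assumption: the command-line tools are installed under these names on PATH.
REQUIRED_TOOLS = ("git", "gnfinder", "gnverifier", "poetry")


def python_ok(version=None, minimum=MIN_PYTHON):
    """Return True if the (major, minor) version meets the minimum."""
    if version is None:
        version = sys.version_info[:2]
    return tuple(version[:2]) >= tuple(minimum)


def missing_tools(tools=REQUIRED_TOOLS, which=shutil.which):
    """Return the tools that cannot be found on PATH, in the order given."""
    return [tool for tool in tools if which(tool) is None]


if __name__ == "__main__":
    if not python_ok():
        print(f"Python {MIN_PYTHON[0]}.{MIN_PYTHON[1]} or newer is required")
    for tool in missing_tools():
        print(f"Not found on PATH: {tool}")
```

If the script prints nothing, the interpreter is recent enough and all four executables were found.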

All scripts must be run from the project root and as modules, i.e., python -m <module>.

Command	Description
create_test_data.py	Saves a subset of the actual data for use with the API's test database
process_stations.py	Updates stations text and species
process_summary_report_species_index.py	Extracts the species mentioned in the summary report index
update_data_sources.py	Updates Global Names data source info
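
As a sketch of the python -m <module> form, the helper below turns a script file name from the table into a module invocation. The helper name module_command is hypothetical, and the package name "scripts" is an assumption about where the scripts live; adjust it to the actual module path.

```python
import sys
from pathlib import PurePosixPath


def module_command(script, package="scripts"):
    """Build the argument list for running a script file as a module.

    `package` is an assumption about the package that contains the scripts.
    """
    stem = PurePosixPath(script).stem  # "process_stations.py" -> "process_stations"
    return [sys.executable, "-m", f"{package}.{stem}"]


if __name__ == "__main__":
    # Print the command instead of running it, so the sketch has no side effects.
    print(" ".join(module_command("process_stations.py")))
```

To actually run a script, the returned list can be passed to subprocess.run with cwd set to the project root.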
