Commit 7cf95a6

Merge pull request #12 from NHSDigital/midyear_changes_2425_LC
2024-25 mid year changes from GitLab to GitHub.
2 parents d6ae9f5 + b588894 commit 7cf95a6

35 files changed

Lines changed: 1023 additions & 812 deletions

.gitignore

Lines changed: 1 addition & 0 deletions
@@ -5,6 +5,7 @@
 *.ipynb
 *.xlsx
 *.xls
+Outputs/
 
 
 

CHANGELOG.txt

Lines changed: 10 additions & 0 deletions
@@ -1,5 +1,14 @@
 NCDes new CQRS data update.
 
+Changelog V1.2
+------------------------------------------------------------------
+Updated metadata files
+Updates made to the run_ncdes.bat file for .venv creation and update.
+Replacement of the enviroment.yml file with a requirements.txt file.
+measure_and_indicator_mappings() function updated to account for removal of CAN02 and replaced with CAN04, in line with the 24-25 Mid year changes.
+Addition of git project supporting documents added to the "supporting_docs" folder.
+
+
 Changelog V1.1
 ------------------------------------------------------------------
 Updated metadata files
@@ -30,3 +39,4 @@ Added three files in root_directory:
 input/CQRS_files/NCD_CQRS_Metadata_Mapping.csv - COntains a mapping table to fix field names
 input/CQRS_files/NCD_metadata.csv - Contains a table to produce indicator and measure mappings
 output/CQRS_omitted_tracker.csv - Contains omitted practices for certain rules (does not count PCN codes)
+

LICENSE

Lines changed: 1 addition & 1 deletion
@@ -1,6 +1,6 @@
 MIT License
 
-Copyright (c) 2022, Crown Copyright NHS Digital
+Copyright (c) 2024, Crown Copyright NHS England.
 
 Permission is hereby granted, free of charge, to any person obtaining a copy
 of this software and associated documentation files (the "Software"), to deal

README.md

Lines changed: 5 additions & 5 deletions
@@ -1,10 +1,10 @@
-> Warning: This is the README for the publically accessible version of the NCDes package. If you are an analyst please don't use the below instructions to run the publication process.
+> Warning: This is the README for the publically accessible version of the NCDes package. If you are an internal analyst please don't use the below instructions to run the publication process.
 
 <p>&nbsp;</p>
 
-Repository owner: Primary Care Domain Analytical Team
+Repository owner: General Practice Data, Extracts, Specifications and Analysis (GPDESA) Team
 
-Email: primarycare.domain@nhs.net
+Email: gpses@nhs.net
 
 To contact us raise an issue on Github or via email and we will respond promptly.
 
@@ -61,7 +61,7 @@ root
 ## Instructions for publication production
 After the above set up steps have been completed you can follow the below instructions to create the publication. Please note that you will not be able to run the code as this requires access to a private server. The data on the private server contains reference data that is used for mapping purposes. The reference tables used contain data from the [epraccur file](https://digital.nhs.uk/services/organisation-data-service/file-downloads/gp-and-gp-practice-related-data) and the [ONS code history database](https://www.ons.gov.uk/methodology/geography/geographicalproducts/namescodesandlookups/codehistorydatabasechd)
 
-1) In the config file edit the root directory value so that it matches the root of the directory that you set up. Make use of escape characters and end path with a double "\\\\" e.g. "\\\\\\\example\\\root\\\directory\\\\".
+1) Review the config file and edit the paramaters as necessary e.g. edit the root directory value so that it matches the root of the directory that you set up. Make use of escape characters and end path with a double "\\\\" e.g. "\\\\\\\example\\\root\\\directory\\\\".
 
 2) Download the epcn excel file from this [webpage](https://digital.nhs.uk/services/organisation-data-service/file-downloads/gp-and-gp-practice-related-data). Move it to the location specified in the above diagram. Copy the absolute path of this file and use it as the "epcn_path" in the config.json.
 
@@ -80,7 +80,7 @@ There are a number of acronyms used in the text. They are set out in full and ex
 
 CQRS: Calculating Quality Reporting Service. The Calculating Quality Reporting Service (CQRS) is an approvals, reporting and payments calculation system for GP practices. More information on CQRS can be found [here](https://welcome.cqrs.nhs.uk/).
 
-NCDes: Network Contract Directed Enhanced Services. This is explained in detail on this [page](https://digital.nhs.uk/data-and-information/publications/statistical/mi-network-contract-des/2022-23).
+NCDes: Network Contract Directed Enhanced Services. This is explained in detail on this [page](https://digital.nhs.uk/data-and-information/publications/statistical/mi-network-contract-des).
 
 PCN: Primary care networks. Groups of GP practices working closely together - along with other healthcare staff and organisations - providing integrated services to the local population.
 

config.toml

Lines changed: 9 additions & 3 deletions
@@ -1,7 +1,11 @@
 #NCD publication Parameters
+#Please check the confluence page for the standardised values
 
 [Setup]
-test_mode = "false"
+test_mode = "false" #Must be lower case - "true" = test mode, "false" = live run
+
+[Dates]
+Publication_date = "MMM_YYYY" #MMM_YYYY
 
 [Filepaths]
 root_directory = "insert path to your root directory here"
@@ -22,5 +26,7 @@ removal_indicator_list = []
 bad_measure_list = []
 
 [Pairs]
-remove_ind_pair.EHCH01 = "Numerator"
-remove_meas_pair.EHCH04 = "Denominator"
+remove_ind_pair.EHCH01 = "Denominator"
+remove_meas_pair.EHCH04 = "Numerator"
+
+
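
Editor's note: `test_mode` is stored as a quoted string, so the code that reads it has to compare text, not booleans. The create_publication.py diff in this commit changes the check from `test_run == True` to `test_run.lower() == "true"` for this reason. The sketch below shows one way to make that conversion explicit. The helper name `is_test_mode` is hypothetical and does not appear in the repository.

```python
def is_test_mode(raw_value: str) -> bool:
    """Interpret the string-valued test_mode setting as a boolean.

    Hypothetical helper, not part of the ncdes package.
    """
    normalised = raw_value.strip().lower()
    if normalised not in {"true", "false"}:
        raise ValueError(f"test_mode must be 'true' or 'false', got {raw_value!r}")
    return normalised == "true"


# A string never equals the boolean True, so `value == True` is False
# for both "true" and "false".
print("true" == True)
print(is_test_mode("true"), is_test_mode("false"))
```

Raising on unexpected values is a design choice in this sketch: a misspelt setting stops the run instead of being treated as a live run.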

ncdes/Archived_files/run_ncdes.bat

Lines changed: 10 additions & 0 deletions
@@ -0,0 +1,10 @@
+@echo off
+call C:\ProgramData\Anaconda3\Scripts\activate.bat C:\ProgramData\Anaconda3
+call conda env remove --name ncdes
+call conda env create --name ncdes --file environment.yml
+call conda activate ncdes
+python -m ncdes.main
+pause
+
+
+

ncdes/create_publication.py

Lines changed: 54 additions & 40 deletions
@@ -1,17 +1,19 @@
 from datetime import datetime
 import pandas as pd
 import os
+from pathlib import PurePath
+import logging
 
-from ncdes.data import sql_connection
-from ncdes.data.data_load import *
-from ncdes.data import amender
+from ncdes.utils import sql_connection
+from ncdes.data_ingestion.data_load import *
+from ncdes.data_ingestion import amender
 from ncdes.processing import processing_steps
 from ncdes.processing import validation_check as check
 
-from ncdes.output import outputs
+from ncdes.data_export import outputs
 
 from ncdes.utils.adhoc_fix import remove_problem_indicators, remove_problem_measures, remove_problem_indicator_measure_pairs
-from ncdes.utils.setup import check_saved_changes
+from ncdes.utils.setup import check_saved_changes, log_setup
 import warnings
 
 warnings.simplefilter(action="ignore", category=UserWarning)
@@ -36,57 +38,69 @@ def main() -> None:
     #check that all changes made are saved and have indeed been updated ready to run - important for test runs
     check_saved_changes()
 
+    #load config parameters
     print("\n"*3,"Loading config file")
     config = get_config('config.toml')
     config_conn = config['Connections']
     config_fp = config['Filepaths']
-
+    root_directory = config_fp["root_directory"]
     test_run = config['Setup']['test_mode']
-    print("\n"*3,f"Test mode established as: {test_run}")
+    pub_date = config['Dates']['Publication_date']
 
-    print("Establishing SQL connection")
-    connection = sql_connection.connect(server=config_conn["server"], database=config_conn["database"])
-    root_directory = config_fp["root_directory"]
 
     output_directory = outputs.test_run_change_outputs_fldr(test_run, root_directory)
-    print(f"Selecting output folder: {output_directory}")
+    log_output_directory = PurePath(output_directory, "Logs")
+    # set test or live outputs folder and create directory where this does not already exist
+    if not os.path.exists(str(log_output_directory)):
+        os.makedirs(str(log_output_directory))
+
+    # logger setup
+    log_setup(log_output_directory, pub_date)
+
+    #input logger messages around this run
+    logging.info(f"Test mode established as: {test_run} \n\n\n")
+    logging.info(f'Outputs will be written to the following folder: {output_directory}.')
+
+    logging.info("Establishing SQL connection")
+    connection = sql_connection.connect(server=config_conn["server"], database=config_conn["database"])
 
-    print("Loading NCDes data")
-    ncdes_raw = load_csvs_in_directory_as_concat_dataframe(f"{root_directory}\\Input\\Current")
+    logging.info("Loading NCDes data")
+    ncdes_raw_filepath = PurePath(root_directory, 'Input', 'Current')
+    ncdes_raw = load_csvs_in_directory_as_concat_dataframe(ncdes_raw_filepath)
     ncdes_raw_archive = ncdes_raw.copy(deep=True)
 
-    print("Amending CQRS data")
+    logging.info("Amending CQRS data")
     ncdes_raw = amender.update_dataframe(ncdes_raw, config)
 
-    print("Cleaning NCDes data")
+    logging.info("Cleaning NCDes data")
     ncdes_clean = processing_steps.clean_ncdes(ncdes_raw)
     reporting_period = processing_steps.get_formatted_reporting_end_date_from_ncdes_data(ncdes_clean)
 
     geo_ccg_sql_str, geo_reg_sql_str, stp_sql_str, prac_sql_str = get_sql_query_strings(reporting_period)
 
-    print("Loading SQL mapping data")
+    logging.info("Loading SQL mapping data")
     geo_ccg_df = pd.read_sql(sql=geo_ccg_sql_str, con=connection)
     geo_reg_df = pd.read_sql(sql=geo_reg_sql_str, con=connection)
     stp_df = pd.read_sql(sql=stp_sql_str, con=connection)
     prac_df = pd.read_sql(sql=prac_sql_str, con=connection)
 
-    print("Formatting SQL mapping data")
+    logging.info("Formatting SQL mapping data")
     geo_ccg_df, geo_reg_df, stp_df = processing_steps.sql_df_cols_to_upper_case(geo_ccg_df, geo_reg_df, stp_df)
 
-    print("Loading ePCN data")
+    logging.info("Loading ePCN data")
     raw_epcn = load_epcn_excel_table(epcn_path=config_fp["epcn_path"])
     epcn_df = processing_steps.epcn_transform(raw_epcn)
 
-    print("Creating mapping table")
+    logging.info("Creating mapping table")
     mapping_table = processing_steps.create_mapping_table(geo_ccg_df, geo_reg_df, stp_df, prac_df, epcn_df)
 
-    print("Merging NCDes data with mapping data")
+    logging.info("Merging NCDes data with mapping data")
     NCDes_with_geogs = processing_steps.merge_tables_fill_Na_reorder_cols(mapping_df=mapping_table, ncdes_df_cleaned=ncdes_clean, CORRECT_COLUMN_ORDER_NCDes_with_geogs=CORRECT_COLUMN_ORDER_NCDes_with_geogs)
 
-    print("Starting validation checks")
+    logging.info("Starting validation checks")
     check.run_all_column_has_expected_values_validations(NCDes_with_geogs, root_directory)
 
-    print("Applying suppression")
+    logging.info("Applying suppression")
     NCDes_suppressed = processing_steps.suppress_output(
         main_table=NCDes_with_geogs,
         root_directory=root_directory,
@@ -99,47 +113,47 @@ def main() -> None:
         main_table_ind_code_col_name='IND_CODE'
     )
 
-    print("Removing problem indicators:")
-    print(str(config["Indicators"]["removal_indicator_list"]))
+    #logging.info(f"Removing problem indicators: {str(config["Indicators"]["removal_indicator_list"])}")
+    logging.info("Removing problem indicators")
     NCDes_problem_ind_rem = remove_problem_indicators.remove_indicators(NCDes_suppressed, removal_indicator_list=config["Indicators"]["removal_indicator_list"])
-    print("Removing problem measures")
+    logging.info("Removing problem measures")
     NCDes_problem_meas_rem = remove_problem_measures.remove_measures(NCDes_problem_ind_rem, bad_measure_list=config["Measures"]["bad_measure_list"])
 
-    print("Removing problem indicator-measure combinations if required")
+    logging.info("Removing problem indicator-measure combinations if required")
     NCDes_problem_meas_rem = remove_problem_indicator_measure_pairs.remove_pairs(NCDes_problem_ind_rem, bad_indicator_measure_list=config["Pairs"]['remove_ind_pair'])
     NCDes_problem_meas_rem = remove_problem_indicator_measure_pairs.remove_pairs(NCDes_problem_meas_rem, bad_indicator_measure_list=config["Pairs"]['remove_meas_pair'])
 
-    print("Joining ruleset ID to copy of output data for ruleset-specific outputs")
+    logging.info("Joining ruleset ID to copy of output data for ruleset-specific outputs")
     NCDes_with_rulesets = processing_steps.merge_data_with_ruleset_id(NCDes_problem_meas_rem, root_directory)
 
-    print("Saving main output")
+    logging.info("Saving main output")
     outputs.save_NCDes_main_to_csv(NCDes_problem_meas_rem, output_directory)
 
-    print("Zipping main output")
+    logging.info("Zipping main output")
     outputs.save_NCDes_main_to_zip(NCDes_problem_meas_rem, output_directory)
 
-    print("Saving outputs split by ruleset")
+    logging.info("Saving outputs split by ruleset")
     outputs.save_NCDes_by_ruleset_to_csvs(NCDes_with_rulesets, output_directory)
 
-    print("Zipping outputs split by ruleset")
+    logging.info("Zipping outputs split by ruleset")
     outputs.save_NCDes_by_ruleset_to_zip(NCDes_with_rulesets, output_directory)
 
-    print("Saving trend monitor")
-    if test_run == True:
-        print("Test mode: skipping trend monitor")
+    logging.info("Trend monitor updates")
+    if test_run.lower() == "true":
+        logging.info("Test mode: skipping trend monitor")
     else:
         outputs.save_trendmonitor(NCDes_problem_meas_rem, root_directory)
 
-    print("Saving excel output")
+    logging.info("Saving LDHC excel output")
     outputs.save_NCDes_main_to_excel(NCDes_problem_meas_rem, root_directory, server=config_conn["server"], database=config_conn["database"], test_run=test_run)
 
-    print("Archiving input")
-    outputs.archive_input_as_csv(ncdes_raw_archive, output_directory)
+    logging.info("Archiving input")
+    outputs.archive_input(output_directory)
 
-    print("Deleting input files from input folder")
-    outputs.remove_files_from_input_folder(path=f"{root_directory}Input\\Current\\")
+    logging.info("Deleting input files from input folder")
+    outputs.remove_files_from_input_folder(ncdes_raw_filepath)
 
-    print("Job complete")
+    logging.info("Job complete")
     outputs.open_outputs(NCDes_problem_meas_rem, output_directory)
 
 
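
Editor's note: the diff above imports `log_setup` from `ncdes.utils.setup` and calls it with the log directory and publication date, but the commit view does not show the function body. The following is a minimal sketch of what such a helper might do, using only the standard `logging` module. The function name `log_setup_sketch`, the log file naming pattern and the message format are all assumptions, not the repository's implementation.

```python
import logging
import os
from pathlib import PurePath


def log_setup_sketch(log_dir: PurePath, pub_date: str) -> PurePath:
    """Hypothetical stand-in for ncdes.utils.setup.log_setup."""
    # Create the log folder if it does not exist yet.
    os.makedirs(str(log_dir), exist_ok=True)
    # File naming pattern is an assumption made for this sketch.
    log_file = PurePath(log_dir, f"ncdes_{pub_date}.log")
    logging.basicConfig(
        level=logging.INFO,
        format="%(asctime)s %(levelname)s %(message)s",
        # Write each message to the log file and echo it to the console.
        handlers=[logging.FileHandler(str(log_file)), logging.StreamHandler()],
        force=True,  # replace any handlers configured earlier in the process
    )
    return log_file
```

After a call such as `log_setup_sketch(PurePath(output_directory, "Logs"), pub_date)`, the `logging.info(...)` calls that replace `print(...)` in the diff would write to both the console and a dated log file.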
