Labels
data issues: issues that require going back and fixing affected data in an augur database
high priority: Blocking multiple other things, causing data loss, or other incredibly urgent things
Description
While investigating #3693, we (the maintainers) noticed that far too many entries in the commits table were missing author names. Since this table has a NOT NULL constraint on this column, "missing" here means "is set to the empty string".
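To make the "NOT NULL but empty" point concrete, here is a minimal sketch of how affected rows could be counted. It runs against an in-memory SQLite table as a stand-in for the real database, and the table layout and the column names `cmt_commit_hash` and `cmt_author_name` are assumptions for illustration, not a confirmed copy of the Augur schema.

```python
import sqlite3


def make_demo_db():
    """Build a throwaway table with a NOT NULL author column (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE commits ("
        "cmt_commit_hash TEXT NOT NULL, "
        "cmt_author_name TEXT NOT NULL)"
    )
    conn.executemany(
        "INSERT INTO commits VALUES (?, ?)",
        [("a1", "Alice"), ("b2", ""), ("c3", "")],
    )
    return conn


def count_blank_authors(conn):
    """Count rows whose author name is empty or only spaces."""
    row = conn.execute(
        "SELECT COUNT(*) FROM commits WHERE TRIM(cmt_author_name) = ''"
    ).fetchone()
    return row[0]


def null_is_rejected(conn):
    """Return True if the NOT NULL constraint refuses a NULL author name."""
    try:
        conn.execute("INSERT INTO commits VALUES (?, ?)", ("d4", None))
    except sqlite3.IntegrityError:
        return True
    return False
```

The constraint rejects NULL but accepts the empty string, so a check for NULL values alone would not find the affected rows; the query has to compare against `''`.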
Observations
- We manually ran the git CLI command used to grab the commit log data against a local clone of the augur repo and a specific unresolved commit, and confirmed that the git CLI returned the information correctly.
- We didn't see any problematic logic errors in the text parsing of the author name (i.e. up until it gets added to the dict that presumably gets INSERTed eventually).
- We did, however, notice that part of the text-parsing code includes an `author_name or ''` line, which could be what is inserting the empty string; we just weren't sure how.
- We briefly checked that the column names match between the dict and the table they were going into.
- The missing names seemed to correlate a fair bit with commits whose committer information is attributed to a bot, but don't read too much into this.
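As a sketch of why the `author_name or ''` fallback is a plausible suspect: in Python, `or` returns its right-hand operand whenever the left-hand one is falsy, so a name that was never parsed (`None`) and one that was parsed as empty both end up as `''`. The function names, dict keys, and column name below are hypothetical and only illustrate the pattern; they are not the actual Augur parsing code.

```python
def build_commit_record(parsed):
    """Mimic the fallback pattern: any falsy author name becomes ''."""
    author_name = parsed.get("author_name")
    return {"cmt_author_name": author_name or ""}


def build_commit_record_strict(parsed):
    """Alternative that fails loudly instead of silently storing ''."""
    author_name = parsed.get("author_name")
    if not author_name:
        raise ValueError(
            f"no author name parsed for commit {parsed.get('hash')!r}"
        )
    return {"cmt_author_name": author_name}
```

With the first variant, a missing key, a `None` value, and an empty string are indistinguishable once stored. Raising (or logging) as in the second variant would at least show which commits reach this point without a name, which might help narrow down where the value is lost.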