fix(github_graphql): remove stale local issue rows when an issue no longer resolves #8820

Open
squatboy wants to merge 1 commit into apache:main from squatboy:fix/github-graphql-stale-issue-cleanup

@squatboy

⚠️ Pre Checklist

  • I have read through the Contributing Documentation.
  • I have added relevant tests.
  • I have added relevant documentation.
  • I will add labels to the PR, such as pr-type/bug-fix, pr-type/feature-development, etc.

Summary

This is a follow-up to #8637.

#8637 stopped the GraphQL collector from failing immediately when GitHub returned "Could not resolve to an Issue", but the missing issue could still remain in DevLake's local issue and raw tables.

This patch keeps the GraphQL collector tolerant of that missing-issue data error and also removes the stale local rows that keep the missing issue in the input set of the GitHub GraphQL "refresh OPEN issues" path.

What changed:

  • continue into ResponseParser when the only GraphQL data errors are ignorable Could not resolve to an Issue errors
  • track which issue numbers were requested in the GitHub GraphQL open-issue refresh batch
  • compare the requested issue numbers with the successfully resolved issues returned by GitHub
  • delete stale local refresh-input rows for unresolved issues from:
    • _tool_github_issues
    • _tool_github_issue_comments
    • _tool_github_issue_events
    • _tool_github_issue_labels
    • _tool_github_issue_assignees
    • _tool_github_pull_request_issues
    • the corresponding raw GraphQL issue row when available through RawDataOrigin
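
The first bullet above (continue only when every data error is an ignorable missing-issue error) could be sketched roughly as follows. This is a minimal illustration, not the code in this patch: `graphqlDataError` and `onlyIgnorableErrors` are hypothetical names, and matching on the message substring is an assumption about how the error is recognized.

```go
package main

import (
	"fmt"
	"strings"
)

// graphqlDataError is a hypothetical, simplified stand-in for one entry of a
// GraphQL response's "errors" array. It is not the collector's real type.
type graphqlDataError struct {
	Message string
}

// missingIssueMessage is the error text quoted in this PR description.
const missingIssueMessage = "Could not resolve to an Issue"

// onlyIgnorableErrors reports whether every data error is a missing-issue
// error. An empty slice returns false because there is nothing to ignore.
func onlyIgnorableErrors(errs []graphqlDataError) bool {
	if len(errs) == 0 {
		return false
	}
	for _, e := range errs {
		if !strings.Contains(e.Message, missingIssueMessage) {
			// Any other error must still fail the collector.
			return false
		}
	}
	return true
}

func main() {
	errs := []graphqlDataError{
		{Message: "Could not resolve to an Issue with the number of 42."},
	}
	fmt.Println(onlyIgnorableErrors(errs))
}
```

Requiring all errors to match, rather than any one of them, keeps unrelated failures (rate limits, auth errors) from being swallowed along with the missing-issue case.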

Why:

  • without cleanup, deleted or transferred issues remain orphaned in the GitHub GraphQL collector input set and can be retried forever during later refreshes
  • this patch completes the fix from #8637 (fix: skip transferred issue for Github source) for the refresh OPEN issues collector path

Does this close any open issues?

Closes #8819

Screenshots

N/A

Other Information

Tests and validation:

  • added unit test for ignorable GraphQL missing-issue errors
  • added unit tests for missing-issue detection in the GitHub GraphQL issue refresh path
  • verified:
    • go test ./plugins/github_graphql/tasks
    • go build ./helpers/pluginhelper/api

Scope note:

@dosubot (bot) added the following labels on Mar 31, 2026:

  • size:L (This PR changes 100-499 lines, ignoring generated files.)
  • component/plugins (This issue or PR relates to plugins)
  • pr-type/bug-fix (This PR fixes a bug)
  • severity/p1 (This bug affects functionality or significantly affect ux)

}

func findMissingGithubIssues(requestedIssues map[int]missingGithubIssueRef, resolvedIssues []GraphqlQueryIssue) []missingGithubIssueRef {
	if len(requestedIssues) == 0 {
Contributor

Should we return on len(requestedIssues) == len(resolvedIssues) as well?

Author

I considered that, but I left it out because the counts can still match even when one of the requested issues comes back as a zero-value entry with graphql-extend:"true". So comparing the actual issue numbers felt safer here.
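
To illustrate the reply above, here is a rough sketch of comparing issue numbers instead of counts. The types are hypothetical, trimmed-down stand-ins for `missingGithubIssueRef` and `GraphqlQueryIssue`, and it assumes the map is keyed by issue number and that an unresolved issue comes back with `Number == 0`. It is not the implementation in this PR.

```go
package main

import (
	"fmt"
	"sort"
)

// issueRef and queryIssue are hypothetical stand-ins for the PR's
// missingGithubIssueRef and GraphqlQueryIssue types.
type issueRef struct {
	ConnectionId uint64
	GithubId     int
	Number       int
}

type queryIssue struct {
	Number int // assumed to stay 0 when GitHub could not resolve the issue
}

// findMissing returns the requested issues whose number is absent from the
// resolved list, sorted by number so the output is stable.
func findMissing(requested map[int]issueRef, resolved []queryIssue) []issueRef {
	if len(requested) == 0 {
		return nil
	}
	seen := make(map[int]struct{}, len(resolved))
	for _, r := range resolved {
		if r.Number != 0 { // skip zero-value placeholders
			seen[r.Number] = struct{}{}
		}
	}
	var missing []issueRef
	for number, ref := range requested {
		if _, ok := seen[number]; !ok {
			missing = append(missing, ref)
		}
	}
	sort.Slice(missing, func(i, j int) bool { return missing[i].Number < missing[j].Number })
	return missing
}

func main() {
	requested := map[int]issueRef{
		1: {ConnectionId: 1, GithubId: 101, Number: 1},
		2: {ConnectionId: 1, GithubId: 102, Number: 2},
	}
	// Same length as requested, but the second entry is a zero value.
	resolved := []queryIssue{{Number: 1}, {}}
	for _, m := range findMissing(requested, resolved) {
		fmt.Printf("missing issue #%d (github id %d)\n", m.Number, m.GithubId)
	}
}
```

In this example both collections have length 2, so an early return on equal lengths would skip the cleanup even though issue 2 never resolved.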


func cleanupMissingGithubIssue(db dal.Dal, issue missingGithubIssueRef) errors.Error {
	deleteByIssueId := func(model any, table string) errors.Error {
		err := db.Delete(model, dal.From(table), dal.Where("connection_id = ? AND issue_id = ?", issue.ConnectionId, issue.GithubId))
Contributor

Is it possible to move an issue from one repository to another when both repos are already configured in Apache DevLake?

Author

Do you mean a case where an issue is transferred from repo A to repo B while both repositories are already configured in Apache DevLake?
If so, then yeah, I think that is possible. In that case, deleting by connection_id and issue_id alone could be too broad.

Development

Successfully merging this pull request may close these issues.

  • [Bug][Github GraphQL] Deleted or transferred issues leave stale local rows after #8637
  • [Bug][Github] GraphQL collector fails on transferred issues