diff --git a/README.md b/README.md
index 7e045bf..1676e87 100644
--- a/README.md
+++ b/README.md
@@ -14,6 +14,7 @@ A sample family of reusable [GitHub Agentic Workflows](https://githubnext.github
 ### Depth Triage & Analysis Workflows
 
 - [🏷️ Issue Triage](docs/issue-triage.md) - Triage issues and pull requests
+- [🔁 Issue Duplication Detector](docs/issue-duplication-detector.md) - Detect and comment on duplicate issues automatically
 - [🏥 CI Doctor](docs/ci-doctor.md) - Monitor CI workflows and investigate failures automatically
 - [🔍 Repo Ask](docs/repo-ask.md) - Intelligent research assistant for repository questions and analysis
 - [🔍 Daily Accessibility Review](docs/daily-accessibility-review.md) - Review application accessibility by automatically running and using the application
diff --git a/docs/issue-duplication-detector.md b/docs/issue-duplication-detector.md
new file mode 100644
index 0000000..7a3a4d3
--- /dev/null
+++ b/docs/issue-duplication-detector.md
@@ -0,0 +1,52 @@
+# 🔁 Issue Duplication Detector
+
+> For an overview of all available workflows, see the [main README](../README.md).
+
+The [Issue Duplication Detector workflow](../workflows/issue-duplication-detector.md?plain=1) automatically scans for newly created or recently updated issues every 5 minutes and flags likely duplicates with a helpful comment.
+
+## Installation
+
+```bash
+# Install the 'gh aw' extension
+gh extension install githubnext/gh-aw
+
+# Add the Issue Duplication Detector workflow to your repository
+gh aw add githubnext/agentics/issue-duplication-detector --pr
+```
+
+This creates a pull request to add the workflow to your repository.
+
+You must also [choose a coding agent](https://githubnext.github.io/gh-aw/reference/engines/) and add an API key secret for the agent to your repository.
+
+After merging the PR and syncing to main, you can run the workflow manually if desired:
+
+```bash
+gh aw run issue-duplication-detector
+```
+
+## Configuration
+
+This workflow works out of the box. You can customize detection strictness, comment tone, or the batching window via a local config file at `.github/workflows/agentics/issue-duplication-detector.config.md`.
+
+After editing, run `gh aw compile` to update the workflow, then commit all changes to the default branch.
+
+## What it reads from GitHub
+
+- Repository issues (open and closed)
+- Recent issues created or updated in the last 10 minutes
+
+## What it creates
+
+- Adds comments to issues that appear to be duplicates, including links to the matching issues
+- Requires `issues: write` permission
+
+## Human in the loop
+
+- Review duplicate comments for accuracy and tone
+- Close or link issues as appropriate
+- Disable or uninstall the workflow if it is not valuable
+
+## Activity duration
+
+- By default this workflow will trigger for at most 30 days, after which it will stop triggering.
+- This allows you to experiment with the workflow for a limited time before deciding whether to keep it active.
diff --git a/workflows/issue-duplication-detector.md b/workflows/issue-duplication-detector.md
new file mode 100644
index 0000000..39bb4de
--- /dev/null
+++ b/workflows/issue-duplication-detector.md
@@ -0,0 +1,102 @@
+---
+description: Detect duplicate issues and suggest next steps (batched every 5 minutes)
+on:
+  schedule:
+    - cron: "*/5 * * * *"  # Every 5 minutes
+  workflow_dispatch:
+
+permissions: read-all
+
+tools:
+  github:
+    toolsets: [default]
+  bash:
+    - "*"
+
+safe-outputs:
+  add-comment:
+    max: 10  # Allow multiple comments in batch mode
+
+timeout-minutes: 15
+---
+
+# Issue Duplication Detector
+
+You are an AI agent that detects duplicate issues in the repository `${{ github.repository }}`.
+
+## Your Task
+
+Analyze recently created or updated issues to determine if they are duplicates of existing issues. This workflow runs every 5 minutes to batch-process issues, providing cost control and natural request batching.
+
+## Instructions
+
+1. **Find recent issues to check**:
+   - Use GitHub tools to search for issues in this repository that were created or updated in the last 10 minutes
+   - Query: `repo:${{ github.repository }} is:issue updated:>=$(date -u -d '10 minutes ago' +%Y-%m-%dT%H:%M:%SZ)`
+   - This captures any issues that might have been created or edited since the last run
+   - If no recent issues are found, exit successfully without further action
+
+2. **For each recent issue found**:
+   - Fetch the full issue details using GitHub tools
+   - Note the issue number, title, and body content
+
+3. **Search for duplicate issues**:
+   - For each recent issue, use GitHub tools to search for similar existing issues
+   - Search using keywords from the issue's title and body
+   - Look for issues that describe the same problem, feature request, or topic
+   - Consider both open and closed issues (closed issues might have been resolved)
+   - Focus on semantic similarity, not just exact keyword matches
+   - Exclude the current issue itself from the duplicate search
+
+4. **Analyze and compare**:
+   - Review the content of potentially duplicate issues
+   - Determine if they are truly duplicates or just similar topics
+   - A duplicate means the same underlying problem, request, or discussion
+   - Consider that different wording might describe the same issue
+
+5. **For issues with duplicates found**:
+   - Use the `output.add-comment` safe output to post a comment on the issue
+   - In your comment:
+     - Politely inform that this appears to be a duplicate
+     - List the duplicate issue(s) with their numbers and titles using markdown links (e.g., "This appears to be a duplicate of #123")
+     - Provide a brief explanation of why they are duplicates
+     - Suggest next steps, such as:
+       - Reviewing the existing issue(s) to see if they already address the concern
+       - Adding any new information to the existing issue if this one has additional context
+       - Closing this issue as a duplicate if appropriate
+   - Keep the tone helpful and constructive
+
+6. **For issues with no duplicates**:
+   - Do not add any comment
+   - The issue is unique and can proceed normally
+
+## Important Guidelines
+
+- **Batch processing**: Process multiple issues in a single run when available
+- **Read-only analysis**: You are only analyzing and commenting, not modifying issues
+- **Be thorough**: Search comprehensively to avoid false negatives
+- **Be accurate**: Only flag clear duplicates to avoid false positives
+- **Be helpful**: Provide clear reasoning and actionable suggestions
+- **Use safe-outputs**: Always use `output.add-comment` for commenting, never try to use GitHub write APIs directly
+- **Cost control**: The 5-minute batching window provides a natural upper bound on costs
+
+## Example Comment Format
+
+When you find duplicates, structure your comment like this:
+
+```markdown
+👋 Hi! It looks like this issue might be a duplicate of existing issue(s):
+
+- #123 - [Title of duplicate issue]
+
+Both issues describe [brief explanation of the common problem/request].
+
+**Suggested next steps:**
+- Review issue #123 to see if it addresses your concern
+- If this issue has additional context not covered in #123, consider adding it there
+- If they are indeed the same, this issue can be closed as a duplicate
+
+Let us know if you think this assessment is incorrect!
+```
+
+Remember: Only comment if you have high confidence that duplicates exist.
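
Note on step 1 of the workflow instructions: the lookback query embeds `date -u -d '10 minutes ago'`, which is GNU `date` syntax (BSD/macOS `date` does not accept it). The sketch below shows the same timestamp arithmetic and query string in Python, in case the agent or a reviewer wants to compute it without relying on the shell. The function name `build_recent_issues_query` and its parameters are hypothetical, introduced only for illustration; they are not part of this PR or of `gh aw`.

```python
from datetime import datetime, timedelta, timezone


def build_recent_issues_query(repo, now=None, window_minutes=10):
    """Return a GitHub issue search query for issues updated within the window.

    `repo` is an "owner/name" string. `now` is an optional timezone-aware
    datetime (hypothetical parameter, useful for reproducible checks).
    """
    # Default to the current UTC time when no reference time is supplied.
    if now is None:
        now = datetime.now(timezone.utc)
    # Normalize to UTC so the trailing "Z" in the timestamp is accurate.
    since = now.astimezone(timezone.utc) - timedelta(minutes=window_minutes)
    stamp = since.strftime("%Y-%m-%dT%H:%M:%SZ")
    # Same shape as the query in step 1 of the instructions.
    return f"repo:{repo} is:issue updated:>={stamp}"
```

For example, calling `build_recent_issues_query("octo-org/octo-repo", datetime(2024, 1, 1, 0, 5, tzinfo=timezone.utc))` returns `repo:octo-org/octo-repo is:issue updated:>=2023-12-31T23:55:00Z`. With a 5-minute schedule, a 10-minute window overlaps consecutive runs, so the same issue can show up in two batches.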