GitHub engineered an event-driven system leveraging GitHub Actions, GitHub Copilot, and GitHub Models to automate and streamline accessibility feedback processing. This system transforms scattered user reports into structured, actionable issues, ensuring continuous improvement and efficient resource allocation. It focuses on human-in-the-loop validation, with AI handling repetitive tasks like initial triage and classification.
The article describes GitHub's architectural solution to a common problem in large-scale software development: managing cross-cutting concerns, such as accessibility feedback, that don't fit neatly into single-team ownership. Their approach centralizes feedback and uses automation to process it efficiently, freeing human experts to focus on complex problem-solving rather than manual triage.
GitHub built an event-driven architecture where each step in the feedback process triggers a GitHub Action. This ensures consistent handling regardless of feedback origin. Key GitHub products form the backbone: GitHub Actions orchestrates each step of the pipeline, while GitHub Models and GitHub Copilot provide the AI-assisted classification described below.
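The event-driven pattern can be sketched as a registry that maps each feedback event to a handler, mirroring how one GitHub Action fires per pipeline step. All names here (`EVENT_HANDLERS`, `handle_event`, the event strings) are illustrative assumptions, not GitHub's actual implementation.

```python
def triage(payload):
    """Open a structured issue from a raw feedback report."""
    return {"issue": payload["report"], "state": "triaged"}

def classify(payload):
    """Attach initial metadata to a triaged issue."""
    payload["labels"] = ["accessibility"]
    return payload

# Registry: event name -> handler, so every feedback source flows
# through the same steps regardless of origin.
EVENT_HANDLERS = {
    "feedback.submitted": triage,
    "issue.opened": classify,
}

def handle_event(event_name, payload):
    """Dispatch an incoming event to its registered handler."""
    handler = EVENT_HANDLERS.get(event_name)
    if handler is None:
        raise ValueError(f"no handler registered for {event_name}")
    return handler(payload)
```

Because dispatch is centralized, adding a new feedback source only means registering another handler; the rest of the pipeline is unchanged.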
A crucial design decision was to integrate AI for repetitive tasks while maintaining human oversight. When an issue is created, a GitHub Action calls the GitHub Models API to engage GitHub Copilot. Copilot analyzes the report and populates approximately 80% of the issue's metadata, including type, severity, affected user segments, and recommended team assignments. This automated classification uses stored prompts rather than model fine-tuning, so non-ML experts can update the AI's behavior through ordinary pull requests as policies change.
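A minimal sketch of that classification step, assuming the Action-side script sends the report plus a stored prompt to a chat-completion endpoint and merges the model's JSON reply into the issue metadata. The `call_model` stub below stands in for the real GitHub Models API call; the prompt text and field names are invented for illustration.

```python
import json

# Stored prompt: versioned text, not a fine-tuned model (see next section).
STORED_PROMPT = (
    "Classify this accessibility report. Reply with JSON containing "
    "type, severity, affected_users, and suggested_team."
)

def call_model(prompt, report):
    # Stub: a real implementation would POST prompt + report to the
    # model inference endpoint and return its reply.
    return json.dumps({
        "type": "contrast",
        "severity": "medium",
        "affected_users": ["low-vision"],
        "suggested_team": "web-ui",
    })

def classify_issue(report):
    """Merge the model's suggested metadata into the issue record."""
    reply = json.loads(call_model(STORED_PROMPT, report))
    # The AI fills most fields up front; humans review and can override.
    return {"report": report, **reply, "needs_human_review": True}
```

Note the `needs_human_review` flag: the AI's output is a suggestion that enters a review queue, never a final decision.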
Decoupling AI from Training Pipelines
Using stored prompts for AI behavior configuration is a powerful pattern for system design, especially when rapid iteration or domain expert input is required without involving complex ML retraining pipelines. It decouples the AI's logic from its underlying model, making it more adaptable and maintainable.
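The "prompt as configuration" pattern can be as simple as reading the prompt from a versioned text file, so changing the AI's behavior is just a pull request against that file, with no retraining pipeline involved. The file name and template variable below are assumptions for illustration.

```python
from pathlib import Path

def load_prompt(path, **variables):
    """Read a versioned prompt file and fill in its template variables."""
    template = Path(path).read_text(encoding="utf-8")
    return template.format(**variables)
```

Reviewers can diff prompt changes like any other code change, which is what makes this pattern accessible to domain experts who never touch ML tooling.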
Two layers of human review—by the issue submitter and then the accessibility team—validate Copilot's recommendations, ensuring accuracy and providing a feedback loop for continuous improvement of the AI's prompt instructions. This hybrid approach optimizes efficiency while preserving the quality of human judgment for critical decisions.
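The two review layers form a simple linear state machine: an AI-classified issue must pass the submitter's review, then the accessibility team's, before it is final. The state names below are illustrative, not GitHub's.

```python
# Ordered review stages; an issue may only move forward one stage at a time.
REVIEW_ORDER = ["ai_classified", "submitter_approved", "team_approved"]

def advance(issue):
    """Move an issue to the next review stage; the last stage is terminal."""
    i = REVIEW_ORDER.index(issue["state"])
    if i == len(REVIEW_ORDER) - 1:
        raise ValueError("already fully reviewed")
    issue["state"] = REVIEW_ORDER[i + 1]
    return issue
```

Keeping the stages explicit also makes the feedback loop auditable: any stage where humans frequently override the AI points at a prompt that needs revising.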