Project Case Study

AI Code Review Bot

A production-oriented GitHub App that delivers fast, structured pull-request feedback using automated diff analysis and LLM-assisted review.

Overview

Teams were spending senior review time on repetitive quality checks before getting to architectural or product-level feedback. The goal was to shorten first-pass review loops without reducing signal quality.

  • Reduced routine PR review latency by automating first-pass feedback for style, correctness, and risk checks.

  • Improved review consistency by standardizing comment structure and severity labels across repositories.

  • Lowered reviewer context-switch load by surfacing actionable findings directly on changed lines.

  • Kept operational behavior production-safe with webhook verification, retries, and bounded token budgets.

Architecture

The system separates ingestion, analysis, and review publishing so reliability controls can be applied at each stage. This keeps webhook handling fast while preserving deterministic review behavior.

GitHub Webhooks
      |
      v
Ingestion API (signature verify)
      |
      v
Diff Normalizer + Prompt Builder
      |
      v
OpenAI Review Engine
      |
      v
Comment Orchestrator -> GitHub PR Review API
      |
      v
Observability (logs, latency, error tracking)
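The ingestion stage's "signature verify" step maps to GitHub's documented webhook scheme: an `X-Hub-Signature-256` header carrying the HMAC-SHA256 of the raw request body under the shared webhook secret. A minimal sketch (function name is illustrative):

```python
import hashlib
import hmac


def verify_signature(payload: bytes, secret: str, signature_header: str) -> bool:
    """Return True only if the X-Hub-Signature-256 header matches the
    HMAC-SHA256 of the raw payload computed with the webhook secret."""
    expected = "sha256=" + hmac.new(
        secret.encode("utf-8"), payload, hashlib.sha256
    ).hexdigest()
    # Constant-time comparison avoids leaking match length via timing.
    return hmac.compare_digest(expected, signature_header)
```

Unsigned or mismatched payloads are rejected before any downstream work, which keeps the webhook acknowledgement path cheap.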

Metrics Snapshot

  • Automated First-Pass Coverage: 87%

  • Median Review Turnaround: 2m 18s

  • False-Positive Rate: < 9%

  • Webhook Verification Success: 100%

  • P95 End-to-End Processing: 5.4s

  • Prompt Token Budget / PR: 8k (capped)
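The per-PR token cap can be enforced with greedy whole-hunk selection, dropping trailing hunks rather than truncating mid-hunk. A minimal sketch, assuming a rough 4-characters-per-token estimate (a real deployment would use the model's actual tokenizer):

```python
def cap_prompt(diff_hunks: list[str], max_tokens: int = 8000) -> list[str]:
    """Keep whole diff hunks, in order, until the estimated token budget
    is spent; remaining hunks are dropped so no hunk is cut in half."""
    kept: list[str] = []
    used = 0
    for hunk in diff_hunks:
        # Crude estimate: ~4 characters per token (heuristic, not a tokenizer).
        estimate = max(1, len(hunk) // 4)
        if used + estimate > max_tokens:
            break
        kept.append(hunk)
        used += estimate
    return kept
```

Bounding the window this way trades occasional missing context for predictable latency and cost, which matches the design choice noted under Deep Dive.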

Deep Dive

  • Used installation-scoped GitHub auth so one deployment can operate safely across multiple repositories.

  • Normalized diffs before prompt construction to keep generated feedback tied to changed code only.

  • Separated ingestion, analysis, and publishing steps to support retries without duplicate comments.

  • Chose bounded prompt windows over full-repository context to keep latency predictable.

  • Prioritized deterministic first-pass feedback quality over maximum comment volume.

  • Used asynchronous review publishing to avoid blocking webhook acknowledgement paths.

  • Verified webhook signatures for every inbound event and rejected unsigned payloads.

  • Stored installation credentials in environment-bound secrets with least-privilege scopes.

  • Redacted sensitive code fragments from logs while preserving traceability metadata.

Next Steps

  • Add repository-specific policy packs so feedback aligns with each team’s coding standards.

  • Introduce confidence scoring to prioritize high-signal comments for reviewer attention.

  • Expand observability with model-cost and token-efficiency dashboards per repository.
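The retry-without-duplicate-comments behavior described above hinges on idempotent publishing. One way to sketch it is a stable fingerprint per finding, checked before posting (class and function names here are illustrative, not the production API):

```python
import hashlib
from typing import Callable


def finding_key(pr_number: int, path: str, line: int, rule_id: str) -> str:
    """Stable fingerprint for a review finding; a retried publish of the
    same finding produces the same key, so duplicates can be filtered."""
    raw = f"{pr_number}:{path}:{line}:{rule_id}"
    return hashlib.sha256(raw.encode("utf-8")).hexdigest()[:16]


class CommentOrchestrator:
    """Skips any finding whose fingerprint was already published."""

    def __init__(self) -> None:
        self._published: set[str] = set()

    def publish(
        self,
        pr_number: int,
        path: str,
        line: int,
        rule_id: str,
        post_fn: Callable[[], None],
    ) -> bool:
        key = finding_key(pr_number, path, line, rule_id)
        if key in self._published:
            return False  # retry or duplicate event: do not repost
        post_fn()  # e.g. the call into the GitHub PR Review API
        self._published.add(key)
        return True
```

In production the published-key set would live in durable storage rather than process memory, so retries survive restarts.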

Screenshots

PR summary with categorized findings

Inline suggestions on changed lines

Operational dashboard: latency and failure rates