Module 3: How Ingestion Works
Understand the automated pipeline from bug report to GitHub issue.
Estimated time: 15 minutes
The Pipeline
When someone reports a bug through Marker.io or creates a ticket in Notion, the ingestion service turns it into a fully standardized GitHub Issue — automatically. Here's the full flow:
External Source (Marker.io or Notion)
↓
HMAC-SHA256 Signature Verification
↓
Dedup Check (skip Marker-origin Notion pages)
↓
Source-Specific Parser → NormalizedTicket
↓
Load Repo Registry (sync.yml via GitHub raw URL)
↓
Claude API: classify + route + rewrite
↓
GitHub Issues API: create standardized issue

If Claude fails at any point, a fallback path creates a raw (unformatted) issue using keyword-based repo routing.
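The fallback behavior can be pictured as a try/catch around the Claude step. This is a minimal sketch, not the service's actual code: `NormalizedTicket` is named in the flow above, but its fields, `IssueDraft`, `buildIssue`, and both callback types are hypothetical.

```typescript
// Hypothetical shapes; the real service's types are not shown in this module.
interface NormalizedTicket {
  title: string;
  description: string;
  sourceUrl?: string;
}

interface IssueDraft {
  repo: string;
  title: string;
  body: string;
  standardized: boolean;
}

type Standardizer = (ticket: NormalizedTicket) => Promise<IssueDraft>;
type FallbackRouter = (ticket: NormalizedTicket) => string;

// Try the Claude path first; on any error, build a raw issue routed by keywords.
async function buildIssue(
  ticket: NormalizedTicket,
  standardize: Standardizer,
  routeByKeywords: FallbackRouter,
): Promise<IssueDraft> {
  try {
    return await standardize(ticket);
  } catch {
    return {
      repo: routeByKeywords(ticket),
      title: ticket.title,
      body: ticket.description,
      standardized: false,
    };
  }
}
```

Passing the two steps in as functions keeps the sketch independent of any particular Claude or GitHub client.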
Sources
The service handles two webhook sources:
Marker.io
Marker.io is a visual bug reporting widget embedded on client websites. Users click a button, annotate a screenshot, and submit. The webhook payload includes:
- Title and description
- Reporter name and email
- The URL where the bug was filed
- Screenshots and attachments
- Priority level
Notion Webhooks
Notion automation webhooks fire when a page is created or updated in a tracked database. The parser extracts:
- Title from the `Name` property
- Description from rich text properties
- Type mapping: `BUG` → `Bug`, `New Feature` → `Feature`, `Improvement` → `Improvement`, `Design` → `Design`
- Priority mapping: P0–P4 from the `Priority` select property
- Additional fields like Figma links and level-of-effort estimates
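The type and priority mappings in the list above could be expressed as two small lookup helpers. This is a sketch under assumptions: the function names are hypothetical, the exact label format of the Priority select is not documented here (the sketch accepts values that begin with P0 through P4), and returning `undefined` for unmapped values is an illustrative choice.

```typescript
// Ticket types as listed in the mapping above.
type TicketType = "Bug" | "Feature" | "Improvement" | "Design";

const NOTION_TYPE_MAP = new Map<string, TicketType>([
  ["BUG", "Bug"],
  ["New Feature", "Feature"],
  ["Improvement", "Improvement"],
  ["Design", "Design"],
]);

// Returns undefined for unmapped values; how the real parser handles them is not stated.
function mapNotionType(raw: string): TicketType | undefined {
  return NOTION_TYPE_MAP.get(raw.trim());
}

// Accepts select values that begin with P0..P4, e.g. "P2" or "P2 - Medium" (assumed format).
function mapNotionPriority(raw: string): number | undefined {
  const match = /^P([0-4])\b/.exec(raw.trim());
  return match ? Number(match[1]) : undefined;
}
```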
Marker-Origin Dedup
Here's a subtlety: Marker.io creates Notion pages when bugs are filed. Without dedup, both the Marker webhook AND the Notion page-created webhook would fire, creating duplicate issues. The dedup check inspects the Created By field — if it contains "marker" (case-insensitive), the Notion webhook is skipped.
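A minimal sketch of the dedup check, assuming the Created By value has already been reduced to a display string. The helper name `isMarkerOrigin` and the handling of a missing value are hypothetical.

```typescript
// Hypothetical helper: true when a Notion page appears to have been created by Marker.io.
// A missing Created By value is treated as "not Marker" (an assumption).
function isMarkerOrigin(createdBy: string | null | undefined): boolean {
  return (createdBy ?? "").toLowerCase().includes("marker");
}
```

A Notion webhook handler would return early when this is true, leaving the Marker.io webhook as the only path that creates the issue.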
Security: HMAC-SHA256 Verification
Every webhook request is verified using HMAC-SHA256 signatures before processing:
| Source | Signature Header | Secret |
|---|---|---|
| Marker.io | x-hub-signature-256 | MARKER_WEBHOOK_SECRET |
| Notion | x-notion-signature | NOTION_WEBHOOK_SECRET |
The verification uses crypto.timingSafeEqual to prevent timing attacks. The HMAC is computed over the raw request body (not the parsed JSON) — this is critical because the signature must match the exact bytes received.
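The check might look like the sketch below, using Node's built-in `crypto` module. It assumes the header carries a hex digest, optionally prefixed with `sha256=`; the actual encoding each source uses is not stated here, and `verifySignature` is a hypothetical name.

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

// Assumption: the header holds a hex digest, optionally prefixed with "sha256=".
function verifySignature(rawBody: Buffer, signatureHeader: string, secret: string): boolean {
  // Hash the exact bytes received, never a re-serialized JSON object.
  const expected = createHmac("sha256", secret).update(rawBody).digest();
  const received = Buffer.from(signatureHeader.replace(/^sha256=/, ""), "hex");
  // timingSafeEqual throws when lengths differ, so compare lengths first.
  if (received.length !== expected.length) return false;
  return timingSafeEqual(received, expected);
}
```

Re-serializing parsed JSON can change whitespace or key order, which is why the function takes a `Buffer` of the raw body rather than an object.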
Claude Classification
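This step does the classification, routing, and rewriting. The ingestion service calls Claude (currently claude-sonnet-4-6) using tool use with a forced tool call to standardize_issue, so the response always arrives as structured tool input rather than free text.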
What Claude Decides
For every incoming ticket, Claude determines:
- Title: formatted as `[TYPE] PLATFORM: concise description`
- Body: structured markdown with `## Description`, `## Acceptance Criteria`, `## Implementation Guidance`, and `## Steps to Reproduce` (bugs only)
- Labels: always includes `needs-review` (never `ai-triaged`), plus type and platform labels
- Repo: which `org/repo` the issue belongs to, based on the registry
- Complexity: `low`, `medium`, or `high`
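A forced tool call is expressed in the Anthropic Messages API through the `tool_choice` field. The request body below is a sketch: the tool name and model come from this module, while the schema field names, the description strings, `max_tokens`, the prompt wording, and `buildClassifyRequest` itself are assumptions based on the list above.

```typescript
// Field names in the schema are assumptions derived from the list of decisions above.
const standardizeIssueTool = {
  name: "standardize_issue",
  description: "Return the standardized GitHub issue for a ticket.",
  input_schema: {
    type: "object",
    properties: {
      title: { type: "string", description: "[TYPE] PLATFORM: concise description" },
      body: { type: "string" },
      labels: { type: "array", items: { type: "string" } },
      repo: { type: "string", description: "org/repo from the registry" },
      complexity: { type: "string", enum: ["low", "medium", "high"] },
    },
    required: ["title", "body", "labels", "repo", "complexity"],
  },
};

// Builds the JSON body for a Messages API request (hypothetical helper).
function buildClassifyRequest(ticketText: string, registryText: string) {
  return {
    model: "claude-sonnet-4-6",
    max_tokens: 2048, // illustrative value
    tools: [standardizeIssueTool],
    // Forcing the tool makes the reply contain a tool_use block for standardize_issue.
    tool_choice: { type: "tool", name: "standardize_issue" },
    messages: [
      { role: "user", content: `Repo registry:\n${registryText}\n\nTicket:\n${ticketText}` },
    ],
  };
}
```

The service would then read the `input` of the returned `tool_use` block and still validate it before creating the issue.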
How Repo Routing Works
Claude receives the full repo registry from sync.yml — each entry with its repo, description, and keywords. Combined with the ticket content and any screenshots (sent as image content blocks via Claude's vision capability), Claude picks the best matching repo.
If Claude fails, a keyword fallback scans the ticket title, description, and source URL against each repo's keyword list. The repo with the most keyword hits wins.
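The keyword fallback can be sketched as a simple hit count. The `RepoEntry` fields mirror the registry description above; the function names, the case-insensitive substring matching, and the tie-breaking rule (earlier entry wins) are assumptions.

```typescript
interface RepoEntry {
  repo: string;
  description: string;
  keywords: string[];
}

interface TicketText {
  title: string;
  description: string;
  sourceUrl?: string;
}

// Counts how many of a repo's keywords appear in the text (case-insensitive substring match).
// Substring matching is an assumption; a short keyword can match inside a longer word.
function keywordHits(entry: RepoEntry, haystack: string): number {
  return entry.keywords.filter((keyword) => {
    const needle = keyword.trim().toLowerCase();
    return needle !== "" && haystack.includes(needle);
  }).length;
}

// Returns the repo with the most hits, or undefined when nothing matches.
// Ties go to the earlier registry entry; the real tie-breaking rule is not documented.
function routeByKeywords(ticket: TicketText, registry: RepoEntry[]): string | undefined {
  const haystack = [ticket.title, ticket.description, ticket.sourceUrl ?? ""]
    .join(" ")
    .toLowerCase();
  let best: { repo: string; hits: number } | undefined;
  for (const entry of registry) {
    const hits = keywordHits(entry, haystack);
    if (hits > 0 && (best === undefined || hits > best.hits)) {
      best = { repo: entry.repo, hits };
    }
  }
  return best?.repo;
}
```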
The <!-- l3-standardized --> Marker
Every AI-standardized issue gets an invisible HTML comment appended: <!-- l3-standardized -->. This signals the standardize-issue.yml GitHub Actions workflow to skip re-processing — without it, the workflow would call Claude again on the issue it just created.
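Both sides of this handshake reduce to a string check. In the sketch below the marker text comes from this module; the helper names and the spacing before the marker are hypothetical.

```typescript
const STANDARDIZED_MARKER = "<!-- l3-standardized -->";

// Used by the consumer side: skip issues that already carry the marker.
function hasStandardizedMarker(body: string): boolean {
  return body.includes(STANDARDIZED_MARKER);
}

// Used by the producer side: append the marker once; a second call leaves the body unchanged.
function withStandardizedMarker(body: string): string {
  return hasStandardizedMarker(body) ? body : `${body.trimEnd()}\n\n${STANDARDIZED_MARKER}`;
}
```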
API Endpoints
The ingestion service is deployed on Vercel at l3-platform-ingestion.vercel.app:
| Endpoint | Purpose |
|---|---|
| POST /api/webhook?source=marker | Receives Marker.io webhooks |
| POST /api/webhook?source=notion | Receives Notion webhooks |
| POST /api/notion-webhook | Dedicated Notion endpoint (no query param) |
| POST /api/standardize | Internal: re-standardize an existing raw issue |
| GET /api/health | Health check |
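The three webhook routes resolve to two sources. One way to picture that mapping is the sketch below; `resolveSource` is a hypothetical helper, and returning `undefined` for unknown paths or sources is an assumption about behavior this module does not describe.

```typescript
type WebhookSource = "marker" | "notion";

// Resolves the webhook source from the request path and query string.
// Returns undefined for non-webhook paths or unknown sources (assumed behavior).
function resolveSource(requestUrl: string): WebhookSource | undefined {
  // The base is only needed so that relative URLs parse; its value is arbitrary.
  const url = new URL(requestUrl, "http://localhost");
  if (url.pathname === "/api/notion-webhook") return "notion";
  if (url.pathname !== "/api/webhook") return undefined;
  const source = url.searchParams.get("source");
  return source === "marker" || source === "notion" ? source : undefined;
}
```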
Observability
All Claude calls are traced via Langfuse. Each standardization creates a trace named standardize-issue with a generation named classify-and-format, recording the model, input, output, and token usage. If Langfuse keys aren't set, tracing is silently disabled.
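"Silently disabled" usually means swapping in a no-op tracer when credentials are missing. The sketch below illustrates that pattern only. The `Tracer` interface is hypothetical and is not the Langfuse SDK's API, and the variable names `LANGFUSE_PUBLIC_KEY` and `LANGFUSE_SECRET_KEY` are assumed to be the keys this service reads.

```typescript
// Hypothetical record shape covering the fields named above.
interface GenerationRecord {
  model: string;
  input: unknown;
  output: unknown;
  inputTokens: number;
  outputTokens: number;
}

// Minimal tracer shape for this sketch; it is not the Langfuse SDK's API.
interface Tracer {
  record(traceName: string, generationName: string, data: GenerationRecord): void;
}

// Does nothing, so call sites never need to check whether tracing is on.
const noopTracer: Tracer = { record: () => {} };

// Returns the real tracer only when both keys are present (assumed variable names).
function createTracer(
  env: Record<string, string | undefined>,
  makeReal: () => Tracer,
): Tracer {
  const enabled = Boolean(env.LANGFUSE_PUBLIC_KEY && env.LANGFUSE_SECRET_KEY);
  return enabled ? makeReal() : noopTracer;
}
```

A call site would then record every standardization the same way, for example `tracer.record("standardize-issue", "classify-and-format", data)`, whether or not tracing is configured.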