Methodology & Known Limitations
Transparency about how skills are inferred and what the data does not tell you
Deterministic Rule-Based Core · Optional LLM Review
All skill inferences are produced by explicit, deterministic, auditable rules applied to raw GitHub API activity data.
The same input always produces the same output. Every score is fully traceable.
An optional LLM review layer (configurable in Settings) can provide human-readable summaries and recommendations,
but it never affects scores, confidence levels, or evidence trails. We do not use opaque AI scoring.
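To make the determinism claim concrete, here is an illustrative sketch (not the platform's actual rule): a deterministic rule is a pure function of the raw activity data, so re-running it on the same input always yields the same score. The weights and cap below are invented for illustration.

```python
def score_activity(monthly_commits):
    """Toy deterministic rule: combine volume and consistency into a 0.0-1.0 score.

    Pure function of its input: no randomness, no external state,
    so the same input always produces the same output.
    """
    if not monthly_commits:
        return 0.0
    volume = min(1.0, sum(monthly_commits) / 50)  # capped volume signal (illustrative cap)
    consistency = sum(1 for c in monthly_commits if c) / len(monthly_commits)
    return round(0.5 * volume + 0.5 * consistency, 3)
```

Because the function is pure, its output can be stored alongside its input as a fully reproducible evidence trail.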
Data Sources
GitHub REST API Only
All data is fetched exclusively from the GitHub REST API. This means:
- No local Git repository cloning or mining is performed
- Analysis is limited to what the GitHub API exposes (may not include all historical data)
- Private repository access requires a properly scoped Personal Access Token
- API rate limits apply and may affect ingestion completeness for large repositories
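GitHub reports rate-limit state in the `X-RateLimit-Remaining` and `X-RateLimit-Reset` response headers (the latter is a Unix epoch timestamp). A minimal sketch of how an ingestion loop might decide how long to back off; the function name is hypothetical:

```python
import time

def seconds_until_reset(headers, now=None):
    """Return how long to wait before retrying when the GitHub REST API
    rate limit is exhausted, based on the standard rate-limit headers.

    Returns 0.0 when requests remain in the current window.
    """
    now = time.time() if now is None else now
    if int(headers.get("X-RateLimit-Remaining", "1")) > 0:
        return 0.0
    reset = int(headers.get("X-RateLimit-Reset", str(int(now))))
    return max(0.0, reset - now)
```

An ingestion loop would call this after each response and sleep for the returned duration before continuing, which is why ingestion of large repositories can be slow rather than incomplete.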
What we ingest via GitHub API
- Commits: SHA, author, message, timestamp, additions, deletions, files changed (via detail endpoint)
- Pull Requests: Author, title, state, merge status, timestamps, additions, deletions, changed files (via detail endpoint)
- Reviews: Reviewer, state (approved/changes_requested/commented), timestamp
- Review Comments: Reviewer, file path, body, timestamp
- Files: Path, detected language (by file extension heuristic)
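The per-commit fields above come from the commit-detail endpoint, `GET /repos/{owner}/{repo}/commits/{ref}`, which includes `stats` and the per-file change list. A sketch of extracting those fields (the helper names are hypothetical; the JSON field names are the GitHub API's):

```python
def commit_detail_url(owner, repo, sha):
    """URL of the commit-detail endpoint, which returns additions,
    deletions, and the list of changed files for one commit."""
    return f"https://api.github.com/repos/{owner}/{repo}/commits/{sha}"

def parse_commit(detail):
    """Extract the ingested fields from a commit-detail JSON response.

    `author` can be null when the commit email is not linked to a
    GitHub account, hence the defensive `or {}`.
    """
    return {
        "sha": detail["sha"],
        "author": (detail.get("author") or {}).get("login"),
        "message": detail["commit"]["message"],
        "timestamp": detail["commit"]["author"]["date"],
        "additions": detail["stats"]["additions"],
        "deletions": detail["stats"]["deletions"],
        "files_changed": [f["filename"] for f in detail.get("files", [])],
    }
```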
Metrics
| Metric | Description | Limitations |
|---|---|---|
| Contribution Activity | Commits and PRs per contributor per month | Volume does not indicate quality. Pair programming and mob programming are not captured. |
| Code Ownership | Proportion of changes to each file by each contributor | Based on change frequency, not code criticality. Refactoring skews results. |
| Code Churn | Ratio of deletions to total changes | High churn may indicate refactoring (positive) or rework (negative). Context needed. |
| Bus Factor | Minimum contributors to cover 50% of files | Based on files touched, not knowledge depth. Does not account for documentation or mentoring. |
| Language Distribution | File changes grouped by detected language | Language detected by file extension only. Config files may be miscategorized. |
| Review Participation | Reviews and comments per contributor as proportion of total PRs | Does not measure review quality, depth, or impact. |
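As one example of how these metrics can be computed from ingested data, here is a sketch of the bus-factor calculation. It uses a greedy approximation (exact minimum cover is NP-hard), so the platform's actual computation may differ:

```python
def bus_factor(file_touches, coverage=0.5):
    """Greedy approximation of the minimum number of contributors whose
    touched files jointly cover `coverage` of all files.

    file_touches: {contributor: set of file paths they have changed}
    Note the documented limitation: this counts files touched, not
    knowledge depth.
    """
    all_files = set().union(*file_touches.values()) if file_touches else set()
    target = coverage * len(all_files)
    covered, count = set(), 0
    remaining = dict(file_touches)
    while len(covered) < target and remaining:
        # Pick the contributor adding the most not-yet-covered files.
        best = max(remaining, key=lambda c: len(remaining[c] - covered))
        covered |= remaining.pop(best)
        count += 1
    return count
```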
Inference Rules
| Rule | What It Infers | Key Criteria | Limitations |
|---|---|---|---|
| language_familiarity_v1 | Familiarity with a programming language | Repeated commits in files of that language, weighted by language share and commit share | Based on file extensions, not code quality. Familiarity does not equal proficiency. |
| module_familiarity_v1 | Familiarity with a code module/directory | Repeated changes across multiple files in same directory structure, weighted by module share | Module boundaries from directory structure may not match logical domains. |
| review_participation_v1 | Active participation in code review | Sustained review activity over time (min 2 reviews, higher thresholds for higher confidence) | Does not distinguish rubber-stamp approvals from thorough reviews. |
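To illustrate the shape of such a rule, here is a minimal sketch in the spirit of `language_familiarity_v1`. The weights, the `min_commits` threshold, and the function name are all illustrative, not the platform's actual values; what matters is that the rule is a pure function that emits its evidence alongside the score:

```python
def language_familiarity(commits_in_lang, total_commits,
                         files_in_lang, total_files, min_commits=3):
    """Illustrative rule: weight commit share by language share.

    Returns (score, evidence) so the inference stays traceable,
    or None when there is too little evidence to infer anything.
    """
    if commits_in_lang < min_commits or total_commits == 0 or total_files == 0:
        return None
    commit_share = commits_in_lang / total_commits
    language_share = files_in_lang / total_files
    score = round(min(1.0, 0.6 * commit_share + 0.4 * language_share), 3)
    evidence = {
        "commits_in_language": commits_in_lang,
        "commit_share": round(commit_share, 3),
        "language_share": round(language_share, 3),
    }
    return score, evidence
```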
Scoring & Confidence
Score (0.0 - 1.0)
Reflects volume, consistency, and proportion of observed activity. Higher scores mean more repeated, sustained activity. Scores are not proficiency ratings.
Confidence Levels
- High: Substantial evidence from many data points over an extended period
- Medium: Moderate evidence; more data would strengthen the inference
- Low: Limited evidence; treat as a preliminary signal only
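A sketch of how such a mapping might look; the cutoffs below are invented for illustration, since each rule defines its own thresholds:

```python
def confidence_level(data_points, span_days):
    """Map evidence volume and observation span to a confidence label.

    Illustrative thresholds only: confidence rises with both the number
    of data points and the length of the period they cover.
    """
    if data_points >= 20 and span_days >= 90:
        return "high"
    if data_points >= 5 and span_days >= 30:
        return "medium"
    return "low"
```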
What This Platform Does NOT Do
- Does not measure code quality, architecture skills, or design ability
- Does not infer seniority from activity counts
- Does not equate commit volume with skill
- Does not capture pair programming, mentoring, or verbal contributions
- Does not account for work done outside the analyzed repositories
- Does not use opaque AI scoring (core inferences use explicit, auditable rules; the optional LLM review layer does not affect scoring)
- Does not produce a complete picture of any contributor's abilities
- Absence of activity does not indicate absence of skill
Background Processing Model
How ingestion works
Ingestion runs as an in-process background task within the web server. When you trigger ingestion for a repository, the pipeline runs through three sequential stages: GitHub API data fetch, metrics computation, and skill inference.
Current guarantees and limitations
- In-process only: Processing runs inside the web server process — there is no external job queue or background worker. If the process restarts mid-run, that run is lost.
- Durability: Each pipeline run is tracked as a ProcessingRun record with status (running / completed / failed) and error details
- Timeout: A 600-second per-repository timeout prevents indefinite hangs
- No retry: Failed ingestion must be manually re-triggered; there is no automatic retry
- No queue: Concurrent ingestion tasks run in the same process; resource contention is possible for many simultaneous repos
- Visibility: Pipeline run status is visible via the ingestion status API (/api/ingestion/status)
- Partial failure: If metrics or inference fail after ingestion succeeds, the run is marked as partially failed and the granular error is recorded
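The in-process model above can be sketched as follows. This is an assumption-laden illustration, not the platform's implementation: the stage callables stand in for the real fetch, metrics, and inference steps, and only the `ProcessingRun` name and the 600-second timeout come from the description above.

```python
import asyncio
from dataclasses import dataclass
from typing import Optional

@dataclass
class ProcessingRun:
    repo: str
    status: str = "running"   # running / completed / failed
    error: Optional[str] = None

async def run_pipeline(repo, stages, timeout=600):
    """Run the stages sequentially, in-process, under one per-repository
    timeout. Any stage failure (or the timeout) marks the run failed
    with error details; there is no automatic retry.
    """
    run = ProcessingRun(repo)

    async def _all_stages():
        for stage in stages:
            await stage(repo)

    try:
        await asyncio.wait_for(_all_stages(), timeout=timeout)
        run.status = "completed"
    except Exception as exc:  # stage failure or asyncio timeout
        run.status = "failed"
        run.error = f"{type(exc).__name__}: {exc}"
    return run
```

Because everything runs inside the web server process, a process restart mid-run simply drops the coroutine, which is why such a run is lost rather than resumed.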
Principles
- Every inferred skill is explainable, traceable, and auditable
- Observed evidence is separated from interpretation
- Uncertainty is never hidden
- Deterministic rules are preferred over opaque models
- Useful analytics over decorative AI features