codefang

module
v0.0.0-...-ffc4fba Latest
Published: Mar 13, 2026 License: Apache-2.0

Codefang

The heavy lifter for your codebase.

Codefang is a comprehensive code analysis platform that understands your code deeply—not just as text, but as structure (AST) and history (Git). Whether you're tracking technical debt, analyzing developer churn, or feeding context to an AI, Codefang does the heavy lifting so you don't have to.


1. Preface

For detailed technical documentation, architectural decisions, and algorithm deep-dives, please visit our Documentation.

So, you've stumbled upon Codefang. You might be thinking, "Wait, didn't src-d build something like this years ago?"

Yes, they did. And it was glorious. But like many great empires, src-d faded into the annals of history, and the original hercules was left gathering dust in the digital attic.

This is not that project.

This is a reincarnation. We took the core philosophy, stripped out the obsolete parts, and rebuilt it with a modern engine. It's not a drop-in replacement; it's a spiritual successor with a gym membership and a PhD in Abstract Syntax Trees.


2. Historical Context

Once upon a time, there was a company called source{d}. They pioneered "Mining Software Repositories" on a massive scale. Their crown jewel was hercules, a tool that could chew through git logs faster than you can say git blame.

It gave us insights like:

  • Burndown Charts: How much code from 2015 is still surviving today?
  • Coupling: If I change A.go, does B.go always change with it?

When src-d ceased operations, the tool became abandonware. But the need to understand code didn't go away.

We revived the project with a new mission: Combine history with structure. The old hercules was great at Git history. Codefang adds UAST (Universal Abstract Syntax Tree) powers, allowing it to understand the meaning of the code across 60+ languages, not just the diffs.


3. Quickstart

Enough history class. Let's see some muscles.

Installation

You'll need Go installed. Then build and install both tools:

# The brain: Code analysis and history mining
go install github.com/Sumatoshi-tech/codefang/cmd/codefang@latest

# The eyes: Universal parser (supports 60+ languages)
go install github.com/Sumatoshi-tech/codefang/cmd/uast@latest

Let's Flex

Codefang follows the UNIX philosophy: small tools, joined by pipes.

1. Analyze complexity (Static Analysis): Parse your code and pipe it into the analyzer.

# How messy is my main.go?
uast parse main.go | codefang analyze -a complexity

# ...or my entire codebase? (recursive ** needs zsh, or bash with `shopt -s globstar`)
uast parse **/*.go | codefang analyze -a complexity

2. Analyze history (Git Forensics): See who knows what, and how code is aging.

# Generate a burndown chart (lines surviving over time)
codefang history -a burndown .

3. Find the experts: Who actually wrote the code that's running in production?

codefang history -a devs --head .
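
The burndown analysis from step 2 answers "how many lines from each year are still alive at a given point in time?" Here is a minimal sketch of that idea in Go. It assumes the birth and death year of every line is already known; the `Line` type and `Burndown` function are hypothetical names for illustration, not Codefang's implementation, which tracks line intervals per file while walking the commit history.

```go
package main

import (
	"fmt"
	"sort"
)

// Line is a hypothetical record of one line's lifetime.
type Line struct {
	Born int // year the line was added
	Died int // year the line was deleted or rewritten; 0 means still alive
}

// Burndown returns, for each sample year, the number of surviving
// lines grouped by the year they were born.
func Burndown(lines []Line, samples []int) map[int]map[int]int {
	out := make(map[int]map[int]int, len(samples))
	for _, t := range samples {
		row := make(map[int]int)
		for _, l := range lines {
			alive := l.Born <= t && (l.Died == 0 || l.Died > t)
			if alive {
				row[l.Born]++
			}
		}
		out[t] = row
	}
	return out
}

func main() {
	// Made-up data: two lines from 2015 (one rewritten in 2018),
	// one from 2018, one from 2021.
	lines := []Line{{Born: 2015}, {Born: 2015, Died: 2018}, {Born: 2018}, {Born: 2021}}
	samples := []int{2016, 2019, 2022}

	table := Burndown(lines, samples)
	for _, t := range samples {
		cohorts := make([]int, 0, len(table[t]))
		for born := range table[t] {
			cohorts = append(cohorts, born)
		}
		sort.Ints(cohorts)
		for _, born := range cohorts {
			fmt.Printf("at %d: %d lines from %d\n", t, table[t][born], born)
		}
	}
}
```

Plotting each birth year as a stacked band over the sample years gives the familiar burndown chart.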

4. Architecture & Design Decisions

We made a few bold choices in the rewrite.

The Split: uast vs codefang

The original hercules was a monolith. We split it in two:

  1. uast (The Parser): This tool focuses on one thing—turning source code into a standardized tree structure. It uses Tree-sitter under the hood to support practically every language you care about (Go, Python, JS, Rust, C++, Java, and even COBOL... probably).
  2. codefang (The Analyzer): This tool consumes the data. It takes UASTs for static analysis or Git repositories for history analysis.

Why UAST?

Most linters are language-specific: eslint for JS, golangci-lint for Go. Codefang uses a Universal AST instead. This means we can write a single "Complexity Analyzer" and it immediately works for Python, Go, and TypeScript.
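
To see why one analyzer can serve many languages, here is a minimal sketch in Go. The `Node` type, the branch labels, and the `Complexity` function are illustrative assumptions, not Codefang's actual UAST API (the real node structure is richer). The point is only that the analyzer looks at language-neutral node types and never at source text.

```go
package main

import "fmt"

// Node is a hypothetical, simplified UAST node: a language-neutral
// type label plus children.
type Node struct {
	Type     string
	Children []*Node
}

// branchTypes lists node types that add a decision point.
// The labels are made up for this sketch.
var branchTypes = map[string]bool{
	"If": true, "Loop": true, "Case": true, "Catch": true,
}

// Complexity returns a cyclomatic-style score: 1 plus the number of
// branch nodes anywhere in the tree.
func Complexity(root *Node) int {
	return 1 + countBranches(root)
}

func countBranches(n *Node) int {
	if n == nil {
		return 0
	}
	total := 0
	if branchTypes[n.Type] {
		total++
	}
	for _, c := range n.Children {
		total += countBranches(c)
	}
	return total
}

func main() {
	// The same tree shape could come from Go, Python, or TypeScript source.
	fn := &Node{Type: "Function", Children: []*Node{
		{Type: "If", Children: []*Node{
			{Type: "Loop"},
		}},
		{Type: "Return"},
	}}
	fmt.Println(Complexity(fn)) // prints 3
}
```

Because the walker only switches on `Type`, supporting another language means mapping its grammar to the shared node types, not writing a new analyzer.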

Pluggable Analysis

Analyzers are modular. You want to measure "Sentiment of Code Comments"? There's an analyzer for that (sentiment). You want to find "Typos in Variable Names"? There's one for that (typos).
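
A rough sketch of what "modular" can look like in Go: each analyzer registers under a name, and the caller picks analyzers by name, much like `-a` on the command line. The `Analyzer` interface, the `registry` type, and the toy `commentCounter` are assumptions made for this example; Codefang's real interfaces may differ.

```go
package main

import (
	"fmt"
	"strings"
)

// Analyzer is a hypothetical plug-in contract: a name used for
// selection and a function that turns source text into metrics.
type Analyzer interface {
	Name() string
	Analyze(src string) map[string]int
}

// registry maps analyzer names to implementations.
type registry map[string]Analyzer

// Register adds an analyzer under its own name.
func (r registry) Register(a Analyzer) { r[a.Name()] = a }

// Run executes only the requested analyzers and fails on unknown names.
func (r registry) Run(names []string, src string) (map[string]map[string]int, error) {
	out := make(map[string]map[string]int, len(names))
	for _, name := range names {
		a, ok := r[name]
		if !ok {
			return nil, fmt.Errorf("unknown analyzer %q", name)
		}
		out[name] = a.Analyze(src)
	}
	return out, nil
}

// commentCounter is a toy analyzer that counts // comment lines.
type commentCounter struct{}

func (commentCounter) Name() string { return "comments" }

func (commentCounter) Analyze(src string) map[string]int {
	n := 0
	for _, line := range strings.Split(src, "\n") {
		if strings.HasPrefix(strings.TrimSpace(line), "//") {
			n++
		}
	}
	return map[string]int{"comment_lines": n}
}

func main() {
	r := registry{}
	r.Register(commentCounter{})

	report, err := r.Run([]string{"comments"}, "// setup\nx := 1\n// done")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(report["comments"]["comment_lines"])
}
```

Adding a new analysis is then one new type plus one `Register` call; nothing else in the pipeline changes.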


5. Codefang as an AI Agent Tool

Don't just use AI to write code. Use Codefang to verify it. By giving codefang and uast to your AI agent as tools (via MCP or shell), you create a self-correcting quality loop.

1. The Self-Correcting Coder

  • Scenario: Agent generates a new function or refactors a file.
  • Action: Agent runs codefang analyze -a complexity on the new code.
  • Outcome: If complexity scores are high (e.g., > 15), the agent self-reflects: "This is too complex. I need to simplify logic before showing it to the user."
  • Result: Clean, maintainable code before it even hits the PR.
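
The loop above can be sketched in a few lines of Go. Everything here is a stand-in: `Refine`, `score`, and `simplify` are hypothetical names, and in a real agent `score` would wrap a call to codefang analyze -a complexity and read the number from its report (the report format is not assumed here), while `simplify` would be another round with the model.

```go
package main

import (
	"fmt"
	"strings"
)

// Refine re-drafts until the score is at or below the threshold or
// maxRounds re-drafts have been spent. It returns the last draft and
// whether that draft passes.
func Refine(draft string, score func(string) int, simplify func(string) string,
	threshold, maxRounds int) (string, bool) {
	for i := 0; i < maxRounds; i++ {
		if score(draft) <= threshold {
			return draft, true
		}
		draft = simplify(draft)
	}
	return draft, score(draft) <= threshold
}

func main() {
	// Stand-ins for illustration only: "complexity" is 1 plus the number
	// of "if " tokens, and "simplify" drops one branch per round.
	score := func(s string) int { return 1 + strings.Count(s, "if ") }
	simplify := func(s string) string { return strings.Replace(s, "if ", "", 1) }

	draft := "if a { if b { if c { work() } } }"
	final, ok := Refine(draft, score, simplify, 2, 5)
	fmt.Println(ok, score(final))
}
```

Capping the number of rounds matters: an agent that can never get under the threshold should report that to the user instead of looping forever.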

2. Architectural Context Injection

  • Scenario: You ask the agent about a high-level architectural change.
  • Problem: "Context Window Exceeded" or the agent hallucinates file relationships.
  • Solution: Agent runs uast parse to get the AST structure or codefang analyze -a imports to map the dependency graph. It learns the system architecture without reading every single line of text.

3. Risk-Aware Refactoring

  • Scenario: Agent is asked to refactor a legacy module.
  • Action: Agent runs codefang history -a couples.
  • Insight: "Warning: When User.go changes, Billing.go changes with it 80% of the time."
  • Result: Agent proactively checks related files for regressions, preventing bugs that a simple text-based analysis would miss.
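
For intuition about where a number like that 80% comes from, here is a minimal co-change sketch in Go. It assumes the list of files touched by each commit is already available; `Commit` and `CoChangeRatio` are hypothetical names, and Codefang's couples analyzer may compute coupling differently.

```go
package main

import "fmt"

// Commit is a hypothetical record of the files touched by one commit.
type Commit []string

// CoChangeRatio returns the fraction of commits touching file a that
// also touch file b. It returns 0 when a was never changed.
func CoChangeRatio(commits []Commit, a, b string) float64 {
	var touchedA, touchedBoth int
	for _, c := range commits {
		hasA, hasB := false, false
		for _, f := range c {
			if f == a {
				hasA = true
			}
			if f == b {
				hasB = true
			}
		}
		if hasA {
			touchedA++
			if hasB {
				touchedBoth++
			}
		}
	}
	if touchedA == 0 {
		return 0
	}
	return float64(touchedBoth) / float64(touchedA)
}

func main() {
	// Made-up history: User.go changes five times, four of them
	// together with Billing.go.
	history := []Commit{
		{"User.go", "Billing.go"},
		{"User.go", "Billing.go"},
		{"User.go", "Billing.go", "README.md"},
		{"User.go", "Billing.go"},
		{"User.go"},
		{"Billing.go"},
	}
	ratio := CoChangeRatio(history, "User.go", "Billing.go")
	fmt.Printf("%.0f%%\n", ratio*100)
}
```

Note that the ratio is directional: the share of User.go commits that include Billing.go can differ from the share of Billing.go commits that include User.go.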

4. Style & Consistency Enforcement

  • Scenario: "Make this look like the rest of the project."
  • Action: Agent analyzes cohesion and comments metrics of existing high-quality files to set a baseline.
  • Result: Agent generates code that matches the structural quality standards of your specific repo, not just generic language syntax.

Ready to lift?

Check out the full documentation for deep dives, or just start piping commands and see what breaks.

Happy Mining!

Directories

Path Synopsis
cmd
codefang command
Package main provides the entry point for the codefang CLI tool.
codefang/commands
Package commands implements CLI command handlers for codefang.
uast command
Package main provides the UAST CLI entry point.
Package main demonstrates custom UAST mapping usage.
internal
analyzers/analyze
Package analyze provides analyze functionality.
analyzers/anomaly
Package anomaly provides temporal anomaly detection over commit history.
analyzers/burndown
Package burndown provides burndown functionality.
analyzers/clones
Package clones provides clone detection analysis using MinHash and LSH.
analyzers/cohesion
Package cohesion provides cohesion functionality.
analyzers/comments
Package comments provides comments functionality.
analyzers/common
Package common provides common functionality.
analyzers/common/plotpage
Package plotpage provides HTML visualization components for analyzer output.
analyzers/common/renderer
Package renderer provides section rendering for analyzer reports.
analyzers/common/reportutil
Package reportutil provides type-safe accessors for map[string]any fields.
analyzers/common/spillstore
Package spillstore provides generic disk-backed stores that spill accumulated data to temporary files during streaming hibernation, freeing memory between chunks while preserving the full dataset for Finalize.
analyzers/common/terminal
Package terminal provides terminal rendering utilities for beautiful CLI output.
analyzers/complexity
Package complexity provides complexity functionality.
analyzers/couples
Package couples provides couples functionality.
analyzers/devs
Package devs provides devs functionality.
analyzers/file_history
Package filehistory provides file history functionality.
analyzers/halstead
Package halstead provides halstead functionality.
analyzers/imports
Package imports provides imports functionality.
analyzers/plumbing
Package plumbing provides plumbing functionality.
analyzers/quality
Package quality tracks code quality metrics (complexity, Halstead, comments, cohesion) across commit history by running static analyzers on per-commit UAST-parsed changed files.
analyzers/sentiment
Package sentiment provides sentiment functionality.
analyzers/sentiment/lexicons
Package lexicons provides multilingual sentiment dictionaries for code comment analysis.
analyzers/shotness
Package shotness provides shotness functionality.
analyzers/typos
Package typos provides typos functionality.
budget
Package budget provides memory budget calculation and auto-tuning for codefang history analysis.
burndown
Package burndown provides file-level line interval tracking for burndown analysis.
cache
Package cache provides LRU blob caching with Bloom pre-filter and cost-based eviction.
checkpoint
Package checkpoint provides state persistence for streaming analysis.
config
Package config provides YAML-based project configuration for codefang.
framework
Package framework provides orchestration for running analysis pipelines.
identity
Package identity provides identity constants and types for author tracking.
importmodel
Package importmodel defines the data model for source file import analysis.
observability
Package observability provides OpenTelemetry-based tracing, metrics, and structured logging for all Codefang application modes (CLI, MCP, server).
plumbing
Package plumbing defines shared types, constants, and test helpers for the analysis pipeline.
storage
Package storage provides filesystem utilities for safe, atomic persistence.
streaming
Package streaming provides chunked execution with analyzer hibernation for memory-bounded analysis.
pkg
alg
Package alg provides generic algorithm utilities.
alg/bloom
Package bloom provides a space-efficient probabilistic set membership filter.
alg/cms
Package cms provides a Count-Min Sketch for frequency estimation.
alg/hll
Package hll provides a HyperLogLog cardinality estimator.
alg/internal/hashutil
Package hashutil provides shared hash mixing constants and functions for probabilistic data structures (Count-Min Sketch, HyperLogLog, MinHash).
alg/interval
Package interval provides an augmented interval tree for efficient range-overlap queries.
alg/levenshtein
Package levenshtein calculates the Levenshtein edit distance between strings.
alg/lru
Package lru provides a generic thread-safe LRU cache with optional Bloom pre-filtering, size-based eviction, and cost-aware eviction sampling.
alg/lsh
Package lsh provides a Locality-Sensitive Hashing index for fast approximate nearest-neighbor retrieval of MinHash signatures.
alg/mapx
Package mapx provides generic map operations: clone, merge, and sorted-key extraction.
alg/minhash
Package minhash provides MinHash signature generation for set similarity estimation.
alg/stats
Package stats provides core statistical functions for numerical analysis.
gitlib
Package gitlib provides a unified interface for git operations using libgit2.
iosafety
Package iosafety provides defensive file-reading and terminal-output utilities for user-supplied paths and strings.
meminfo
Package meminfo provides memory information utilities.
metrics
Package metrics provides interfaces for defining self-contained, reusable metrics.
pathfilter
Package pathfilter excludes vendor, third-party, and generated files from analysis.
persist
Package persist provides codec-based file persistence for arbitrary state types.
pipeline
Package pipeline provides configuration option types for analysis pipeline items and composable building blocks for concurrent pipeline construction: RunPC (producer-consumer goroutine skeleton), Phase/RunPhases (chain-of-responsibility), Batcher (batching strategies), DispatchFunc (dispatch strategy), and Fetcher (cache decorator pattern).
safeconv
Package safeconv provides safe type conversion functions.
sigutil
Package sigutil provides signal-handling utilities for graceful cleanup.
textutil
Package textutil provides byte-level text utilities: binary detection, line counting, JSON encoding, and byte-slice reader adapters.
uast
Package uast provides a universal abstract syntax tree (UAST) representation and utilities for parsing, navigating, querying, and mutating code structure in a language-agnostic way.
uast/lsp
Package lsp provides a Language Server Protocol (LSP) server for the mapping DSL used in the UAST framework.
uast/pkg/mapping
Package mapping provides Tree-Sitter to UAST mapping rules and grammar analysis.
uast/pkg/node
Package node provides the canonical UAST node structure and operations for tree traversal, querying, and transformation.
uast/pkg/spec
Package spec provides embedded UAST schema specifications.
units
Package units provides binary size unit multipliers (1024-based).
version
Package version provides the build version information for the Codefang binary.
scripts
bench-hibernation command
bench-hibernation measures heap memory before and after Hibernate() calls during a real streaming run on a target repository.
tools
lexgen command
lexgen converts Chen-Skiena sentiment lexicon text files into an embedded Go source file.
precompgen command
Package main provides the precompilation tool for UAST mapping files.
schemagen command
Package main generates JSON schemas for analyzer ComputedMetrics structs.
