🔍

H²CE-v2

AI-powered code search that actually understands your codebase structure. Find functions by meaning, not keywords.

The Problem with Traditional Code Search

grep and ripgrep only match text. They don't understand code structure or semantics.

❌ Traditional grep

$ grep -r "validate" .

Returns:

  • "validate" in comments (noise)
  • "validate" in test names (not what you want)
  • "validate" in error messages (noise)
  • Misses: verify_token(), check_input() (semantic matches)

✅ H²CE-v2 AST-Aware Search

Query: "Where do we validate user input?"

Returns (ranked by relevance):

  • validate_user_input() [src/auth.rs:142]
  • User::validate() [src/models.rs:89]
  • impl Validate for LoginForm [src/forms.rs:34]
  • sanitize_input() [src/security.rs:67] (related)

How It Works

📂

Gitignore-Aware Discovery

Automatically skips node_modules/, target/, and .git/, and respects your .gitignore. Only the files that matter get indexed.
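To make the discovery step concrete, here is a minimal sketch in Python using only the standard library. It is not H²CE-v2's implementation: the names `ALWAYS_SKIP`, `load_ignore_patterns`, `is_ignored`, and `discover_files` are hypothetical, and the pattern matching is a deliberate simplification that handles plain names and simple globs only. Real .gitignore semantics (negation with `!`, anchored paths, `**`, nested ignore files) are richer than this.

```python
import fnmatch
import os
from pathlib import Path

# Directories skipped unconditionally (the three named in the text above).
ALWAYS_SKIP = {"node_modules", "target", ".git"}


def load_ignore_patterns(root: Path) -> list[str]:
    """Read simple glob patterns from root/.gitignore, skipping comments and blanks."""
    ignore_file = root / ".gitignore"
    if not ignore_file.is_file():
        return []
    patterns = []
    for line in ignore_file.read_text().splitlines():
        line = line.strip()
        if line and not line.startswith("#"):
            # Assumption: a trailing slash only marks a directory, so drop it.
            patterns.append(line.rstrip("/"))
    return patterns


def is_ignored(name: str, patterns: list[str]) -> bool:
    """Match one file or directory name against the simplified patterns."""
    return any(fnmatch.fnmatch(name, pattern) for pattern in patterns)


def discover_files(root: Path) -> list[Path]:
    """Return indexable files under root, pruning ignored directories early."""
    patterns = load_ignore_patterns(root)
    found = []
    for dirpath, dirnames, filenames in os.walk(root):
        # Mutating dirnames in place stops os.walk from descending into them.
        dirnames[:] = [
            d for d in dirnames
            if d not in ALWAYS_SKIP and not is_ignored(d, patterns)
        ]
        for name in filenames:
            if not is_ignored(name, patterns):
                found.append(Path(dirpath) / name)
    return sorted(found)
```

Pruning at the directory level is the important design choice here: a skipped node_modules/ tree is never walked at all, rather than being walked and filtered afterwards.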

🌳

AST Parsing

Tree-sitter extracts functions, classes, and methods with full context. Chunk boundaries follow the syntax tree, so each definition stays in one piece.
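The idea of chunking along syntax boundaries can be sketched without Tree-sitter. The example below swaps in Python's built-in `ast` module as a stand-in parser, so it only handles Python source; the `Chunk` type and `extract_chunks` function are hypothetical names for illustration, not part of H²CE-v2.

```python
import ast
from dataclasses import dataclass


@dataclass
class Chunk:
    kind: str        # "function" or "class"
    name: str        # qualified name, e.g. "User.validate"
    start_line: int  # 1-based, inclusive
    end_line: int    # 1-based, inclusive
    text: str        # exact source text of the definition


def extract_chunks(source: str) -> list[Chunk]:
    """Walk the syntax tree and emit one chunk per function, method, or class."""
    tree = ast.parse(source)
    chunks: list[Chunk] = []

    def visit(node: ast.AST, prefix: str) -> None:
        for child in ast.iter_child_nodes(node):
            if isinstance(child, (ast.FunctionDef, ast.AsyncFunctionDef)):
                kind = "function"
            elif isinstance(child, ast.ClassDef):
                kind = "class"
            else:
                # Not a definition: keep looking deeper (e.g. inside if-blocks).
                visit(child, prefix)
                continue
            qualified = f"{prefix}{child.name}"
            chunks.append(Chunk(
                kind=kind,
                name=qualified,
                start_line=child.lineno,
                end_line=child.end_lineno,
                text=ast.get_source_segment(source, child) or "",
            ))
            # Recurse so methods are recorded under their class name.
            visit(child, f"{qualified}.")

    visit(tree, "")
    return chunks
```

Because each chunk spans a whole syntax node, a function body is never cut at an arbitrary line count, which is the property the text describes.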

🔢

Batch Embeddings

A CodeBERT-style model embeds 128 chunks per batch on GPU or CPU. 40x faster than embedding chunks one at a time.
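As a rough illustration of the batching pattern, the sketch below groups chunks into batches of 128 and makes one model call per batch. Everything here is an assumption for illustration: `batched`, `embed_all`, and `fake_embed_batch` are hypothetical names, and `fake_embed_batch` is a placeholder that returns toy two-dimensional vectors rather than real embeddings.

```python
from typing import Callable, Iterable, Iterator

BATCH_SIZE = 128  # batch size quoted in the text above


def batched(items: Iterable[str], size: int) -> Iterator[list[str]]:
    """Yield consecutive lists of at most `size` items."""
    batch: list[str] = []
    for item in items:
        batch.append(item)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:
        yield batch  # final, possibly shorter, batch


def embed_all(
    chunks: Iterable[str],
    embed_batch: Callable[[list[str]], list[list[float]]],
    size: int = BATCH_SIZE,
) -> list[list[float]]:
    """Call the model once per batch instead of once per chunk."""
    vectors: list[list[float]] = []
    for batch in batched(chunks, size):
        vectors.extend(embed_batch(batch))
    return vectors


def fake_embed_batch(batch: list[str]) -> list[list[float]]:
    """Placeholder for a real model call: one toy 2-dimensional vector per chunk."""
    return [[float(len(text)), float(text.count("\n"))] for text in batch]
```

With 300 chunks this makes three model calls (128, 128, and 44 chunks) instead of 300. Any speedup in practice depends on the model and hardware.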

🔍

Hybrid Search

HNSW for semantic similarity + Tantivy BM25 for keywords. Fusion ranking combines both.
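The page does not say which fusion method is used. One common choice is reciprocal rank fusion (RRF), sketched below as an assumption rather than a description of H²CE-v2. The function name and the constant `k = 60` (a conventional default for RRF) are illustrative.

```python
from collections import defaultdict


def reciprocal_rank_fusion(
    rankings: list[list[str]], k: int = 60
) -> list[tuple[str, float]]:
    """Merge several ranked lists of result IDs into one scored ranking.

    Each list contributes 1 / (k + rank) for every result it contains,
    so results that rank well in both lists rise to the top.
    """
    scores: dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest score first; ties broken alphabetically for stable output.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical result lists from the two retrievers.
semantic_hits = ["validate_user_input", "sanitize_input", "User::validate"]
keyword_hits = ["User::validate", "validate_user_input", "validate_config"]

fused = reciprocal_rank_fusion([semantic_hits, keyword_hits])
```

RRF works on ranks alone, so it never has to reconcile cosine similarities with BM25 scores, which live on different scales.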

Incremental Updates

BLAKE3 hashing detects changed files, so only new or modified files are reindexed. Updates to a full codebase finish in seconds.
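Hash-based change detection can be sketched as follows. BLAKE3 is not in Python's standard library, so this example substitutes `hashlib.blake2b` as a stand-in; the manifest format (a dict from path to hex digest) and the names `file_digest` and `plan_reindex` are assumptions for illustration.

```python
import hashlib
from pathlib import Path


def file_digest(path: Path) -> str:
    """Hash file contents in 64 KiB blocks so large files are never fully loaded."""
    hasher = hashlib.blake2b(digest_size=32)  # stand-in for BLAKE3
    with path.open("rb") as handle:
        for block in iter(lambda: handle.read(65536), b""):
            hasher.update(block)
    return hasher.hexdigest()


def plan_reindex(
    files: list[Path], previous: dict[str, str]
) -> tuple[list[str], list[str], dict[str, str]]:
    """Compare current digests with the manifest saved by the last run.

    Returns (changed_or_new, deleted, new_manifest).
    """
    current = {str(path): file_digest(path) for path in files}
    changed = sorted(
        path for path, digest in current.items() if previous.get(path) != digest
    )
    deleted = sorted(path for path in previous if path not in current)
    return changed, deleted, current
```

Only the paths in `changed` need to be parsed and embedded again, and the paths in `deleted` can be dropped from the index. Unchanged files cost one hash each.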

💾

Memory Efficient

Under 200MB of index for a typical 50K-file project. A streaming pipeline keeps memory usage constant.

Performance Benchmarks

Metric                   | ripgrep | H²CE-v2
Search latency           | 15ms    | 35ms
Semantic understanding   | ❌ No   | ✅ Yes
MRR@10 (accuracy)        | 0.31    | 0.74
Time to index 100K files | N/A     | 4m 22s
Memory usage             | ~50MB   | 184MB
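For readers unfamiliar with the accuracy metric: MRR@10 (mean reciprocal rank at 10) averages, over all test queries, the value 1/rank of the first relevant result within the top 10, counting 0 when none appears. The sketch below shows the standard computation. The function name and input shapes are illustrative, and it says nothing about the query set behind the figures in the table.

```python
def mrr_at_k(
    results: list[list[str]], relevant: list[set[str]], k: int = 10
) -> float:
    """Mean reciprocal rank of the first relevant hit within the top k.

    results[i]  : ranked result IDs returned for query i
    relevant[i] : set of IDs judged relevant for query i
    """
    if len(results) != len(relevant):
        raise ValueError("results and relevant must have the same length")
    if not results:
        return 0.0
    total = 0.0
    for ranked, gold in zip(results, relevant):
        for rank, doc_id in enumerate(ranked[:k], start=1):
            if doc_id in gold:
                total += 1.0 / rank
                break  # only the first relevant hit counts
    return total / len(results)


# Three hypothetical queries: hit at rank 1, hit at rank 3, no hit.
score = mrr_at_k(
    results=[["a", "b"], ["x", "y", "z"], ["q"]],
    relevant=[{"a"}, {"z"}, {"missing"}],
)
```

Here `score` is (1 + 1/3 + 0) / 3, roughly 0.44. A score of 0.74 would mean the first relevant result typically lands at rank 1 or 2.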

Language Support

✅ Fully Supported

  • Rust
  • Python
  • JavaScript
  • TypeScript
  • Go

🔄 On Request

  • C++
  • Java
  • C#
  • Swift
  • Any Tree-sitter grammar

Get H²CE-v2 Today

Transform how you search code. 30-day money-back guarantee.