sgrep - Semantic Code Search

sgrep is a semantic code search tool driven by natural language queries. It matches on code meaning rather than exact text, which makes it well suited to finding concepts such as authentication logic or error handling patterns. It runs locally with embeddings, works offline, and supports language filters, file globs, and JSON output for agent-friendly use.

by Skills Guide Bot
Development · Intermediate
306/2/2026
Claude Code, Cursor, Windsurf, Copilot, Codex
#semantic-search #code-search #natural-language #code-analysis

Our review

sgrep enables semantic code search using natural language queries, allowing you to find code based on meaning rather than exact text matches.

Strengths

  • Understands code semantics, not just literal text
  • Supports language filters and glob patterns for precise searches
  • Provides JSON output for AI agent integration
  • Auto-indexing and watch mode for real-time updates

Limitations

  • Requires prior installation via curl/sh
  • Relies on a local embedding model, which can be slow or blocked in some environments
  • Search quality depends on index size and quality

When to use it

Use sgrep when you need to locate code by concept or functionality, such as authentication logic, error handling patterns, or database connection pooling.

When not to use it

Do not use it for quick exact-text searches (prefer grep) or when you cannot install additional tools.

Security analysis

Caution
Quality score: 85/100

The skill instructs users to install sgrep by piping a script fetched with curl from a GitHub URL straight into sh. While the tool itself is likely safe, this installation method carries a risk of remote code execution if the source is tampered with. No destructive commands are otherwise present.

Findings
  • Uses curl|sh to install from an external URL without signature verification, which could execute arbitrary code if the remote script is compromised.

Examples

  • Find authentication logic: Search the codebase for how user authentication is handled using sgrep.
  • Filter by language: Search for error handling patterns in Rust files only, using sgrep with a language filter.
  • Detailed JSON output: Get a JSON-formatted result for retry logic patterns in the codebase using sgrep.

name: sgrep
description: Use sgrep for semantic code search. Use when you need to find code by meaning rather than exact text matching. Perfect for finding concepts like "authentication logic", "error handling patterns", or "database connection pooling".
allowed-tools: ["Bash"]

sgrep - Semantic Code Search

Use sgrep to search code semantically using natural language queries. sgrep understands code meaning, not just text patterns.

When to Use

  • Finding code by concept or functionality ("where do we handle authentication?")
  • Discovering related code patterns ("show me retry logic")
  • Exploring codebase structure ("how is the database connection managed?")
  • Searching for implementation patterns ("where do we validate user input?")

Prerequisites

Ensure sgrep is installed:

curl -fsSL https://raw.githubusercontent.com/rika-labs/sgrep/main/scripts/install.sh | sh
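
Since the review above flags piping the installer straight into sh, one cautious alternative is to check whether sgrep is already available and, if not, download the script to a file so it can be read before it runs. This is a minimal sketch, not the project's documented procedure: the helper names `have_sgrep` and `fetch_installer` are hypothetical, and the only URL used is the one from the command above.

```shell
# URL taken from the install command above.
INSTALL_URL="https://raw.githubusercontent.com/rika-labs/sgrep/main/scripts/install.sh"

# Succeeds if the shell can resolve a command named sgrep.
have_sgrep() {
  command -v sgrep >/dev/null 2>&1
}

# Download the installer to the file given as $1 without executing it.
fetch_installer() {
  curl -fsSL "$INSTALL_URL" -o "$1"
}

# Typical flow, run by hand:
#   have_sgrep || fetch_installer ./sgrep-install.sh
#   less ./sgrep-install.sh    # read the script before running it
#   sh ./sgrep-install.sh
```

Downloading first does not verify a signature, but it does let you inspect exactly what will execute.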

Basic Usage

Search Command

sgrep search "your natural language query"

Common Patterns

Find functionality:

sgrep search "where do we handle user authentication?"

Search with filters:

sgrep search "error handling" --filters lang=rust
sgrep search "API endpoints" --glob "src/**/*.rs"

Get more results:

sgrep search "database queries" --limit 20

Show full context:

sgrep search "retry logic" --context

Command Options

  • --limit <n> or -n <n>: Maximum results (default: 10)
  • --context or -c: Show full chunk content instead of snippet
  • --path <dir> or -p <dir>: Repository path (default: current directory)
  • --glob <pattern>: File pattern filter (repeatable)
  • --filters key=value: Metadata filters like lang=rust (repeatable)
  • --json: Emit structured JSON output (agent-friendly)
  • --threads <n>: Maximum threads for parallel operations
  • --cpu-preset <preset>: CPU usage preset (auto|low|medium|high|background)
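
As a rough illustration of how these options might be combined, the hypothetical wrapper below searches a repository at a given path while keeping CPU usage low. It assumes that `--path`, `--limit`, and `--cpu-preset` can be passed together after the query, which the option list suggests but does not state outright.

```shell
# Search another checkout with a low CPU preset.
# $1: repository path
# $2: natural language query
# $3: maximum results (optional; falls back to 10, the documented default)
search_repo_quietly() {
  sgrep search "$2" --path "$1" --limit "${3:-10}" --cpu-preset low
}

# Example:
#   search_repo_quietly ~/code/my-service "where do we validate user input?" 5
```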

Indexing

If no index exists, sgrep will automatically create one on first search. To manually index:

sgrep index              # Index current directory
sgrep index --force      # Rebuild from scratch

Watch Mode

For real-time index updates during development:

sgrep watch              # Watch current repo
sgrep watch --debounce-ms 200
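
If you want the watcher running only for the duration of a script or terminal session, one option is to start it in the background and stop it on exit. The sketch below assumes `sgrep watch` runs in the foreground until it is killed; `start_watch`, `stop_watch`, and `WATCH_PID` are hypothetical names introduced here.

```shell
# Start `sgrep watch` in the background and remember its PID.
# $1: debounce in milliseconds (optional; 200 mirrors the example above)
start_watch() {
  sgrep watch --debounce-ms "${1:-200}" &
  WATCH_PID=$!
}

# Stop the watcher started by start_watch, if there is one.
stop_watch() {
  if [ -n "${WATCH_PID:-}" ]; then
    kill "$WATCH_PID" 2>/dev/null || true
    wait "$WATCH_PID" 2>/dev/null || true
    WATCH_PID=""
  fi
}

# Typical use in a script:
#   start_watch 200
#   trap stop_watch EXIT
#   ... edit code, run searches ...
```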

Configuration

Check or create embedding provider configuration:

sgrep config                    # Show current configuration
sgrep config --init             # Create default config file
sgrep config --show-model-dir   # Show model cache directory
sgrep config --verify-model     # Check if model files are present

sgrep uses local embeddings by default. Config lives at ~/.sgrep/config.toml.

If Hugging Face is blocked (e.g., in China), set the HTTPS_PROXY environment variable or see the offline installation guide.
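
One way to apply the proxy without changing your whole shell session is to set HTTPS_PROXY for a single command. In this sketch the helper name `with_https_proxy` is hypothetical and the proxy address in the example is a placeholder; which sgrep command actually triggers the model download is not stated here, so the search shown is only an assumption.

```shell
# Run one command with HTTPS_PROXY set only for that command.
# $1: proxy URL; remaining arguments: the command to run
with_https_proxy() {
  proxy_url="$1"
  shift
  env HTTPS_PROXY="$proxy_url" "$@"
}

# Example (placeholder proxy address):
#   with_https_proxy "http://127.0.0.1:8080" sgrep search "retry logic"
```

Using `env` keeps the variable out of the calling shell, so later commands are unaffected.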

Examples

Find authentication code:

sgrep search "how do we authenticate users?"

Find error handling:

sgrep search "error handling patterns" --filters lang=rust

Search specific file types:

sgrep search "API rate limiting" --glob "src/**/*.rs"

Get detailed results:

sgrep search "database connection pooling" --context --limit 5

Agent-friendly JSON output:

sgrep search --json "retry logic"
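
For automation, it can help to capture the JSON output in a file and fail early if nothing was written. The helper below is a hypothetical sketch; it deliberately does not parse any fields, because the JSON schema is not documented here.

```shell
# Run a JSON search and save the raw output to a file.
# $1: natural language query
# $2: output file
search_to_json_file() {
  sgrep search --json "$1" > "$2" || return 1
  # Treat an empty file as a failure so later steps do not parse nothing.
  [ -s "$2" ]
}

# Example:
#   search_to_json_file "retry logic" results.json && echo "saved results.json"
```

The saved file can then be inspected with any JSON tool to learn the actual field names before you write code against them.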

Understanding Results

Results show:

  • File path and line numbers: Where the code is located
  • Score: Relevance score (higher is better)
  • Semantic score: How well it matches the query meaning
  • Keyword score: Text matching score
  • Code snippet: Relevant code excerpt

Best Practices

  1. Use natural language: Ask questions like you would ask a colleague
  2. Be specific: "authentication middleware" is better than "auth"
  3. Combine with filters: Use --filters lang=rust to narrow by language
  4. Use globs: --glob "src/**/*.rs" to search specific directories
  5. Check context: Use --context when you need full function/class definitions
  6. Use JSON for automation: Use --json for structured output in scripts