P2-02: Precedent DB is exact-match only — no semantic similarity search #12

Closed
opened 2026-06-16 13:57:01 +00:00 by Artur · 0 comments
Owner

Severity: P2 (Medium)
File: decider/precedent.py

Problem

The precedent database looks up entries by SHA256 of the normalized input, so a lookup succeeds only when two inputs normalize to the same string. This means:

  • "npm install express" and "npm install lodash" have different hashes → no match
  • "restart nginx" and "reload nginx configuration" → no match
  • Only EXACT normalized duplicates are found

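The normalization (lowercase → remove punctuation → sorted words) makes the lookup insensitive to case, punctuation, and word order, but any change in wording still produces a different hash, so semantically similar situations are missed.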
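To make the limitation concrete, here is a minimal sketch of the lookup key as described above. The function names `normalize` and `precedent_key` are hypothetical, and the exact punctuation handling in `decider/precedent.py` may differ; this version assumes ASCII punctuation is simply dropped.

```python
import hashlib
import string


def normalize(text: str) -> str:
    """Lowercase, strip ASCII punctuation, and sort the words (assumed behaviour)."""
    table = str.maketrans("", "", string.punctuation)
    words = text.lower().translate(table).split()
    return " ".join(sorted(words))


def precedent_key(text: str) -> str:
    """SHA256 hex digest of the normalized input, used as the exact-match key."""
    return hashlib.sha256(normalize(text).encode("utf-8")).hexdigest()


# Word order, case, and punctuation do not matter...
same = precedent_key("Restart nginx!") == precedent_key("nginx restart")
# ...but changing a single word yields an unrelated hash.
different = precedent_key("npm install express") != precedent_key("npm install lodash")
```

Both `same` and `different` evaluate to `True`, which is exactly the behaviour described in the bullets: the hash carries no notion of "close".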

Fix

Add a fallback similarity search when exact match fails:

  1. Keep exact hash for fast lookups (O(1))
  2. On miss: use embedding-based similarity (SentenceTransformers or local model) to find the nearest precedents
  3. Return top-3 similar precedents with a similarity score
  4. If similarity > 0.85 → apply the precedent; if 0.70 < similarity ≤ 0.85 → present it as a reference only; otherwise treat it as no match
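The four steps above could be sketched roughly as follows. Everything here is illustrative: `PrecedentStore`, `toy_embed`, and the result fields are hypothetical names, cosine similarity is an assumed choice of score, and `toy_embed` is only a word-count stand-in so the sketch runs without dependencies. A real implementation would pass an embedding function backed by SentenceTransformers or a local model instead.

```python
import hashlib
import math
import string
from collections import Counter
from typing import Callable, Dict, List

Vector = Dict[str, float]

APPLY_THRESHOLD = 0.85      # above this: apply the precedent
REFERENCE_THRESHOLD = 0.70  # above this: present as reference only


def normalize(text: str) -> str:
    table = str.maketrans("", "", string.punctuation)
    return " ".join(sorted(text.lower().translate(table).split()))


def toy_embed(text: str) -> Vector:
    """Stand-in embedding (word counts); swap in a real sentence embedding."""
    return {w: float(n) for w, n in Counter(normalize(text).split()).items()}


def cosine(a: Vector, b: Vector) -> float:
    dot = sum(v * b.get(k, 0.0) for k, v in a.items())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


class PrecedentStore:
    def __init__(self, embed: Callable[[str], Vector] = toy_embed) -> None:
        self._embed = embed
        self._entries: Dict[str, dict] = {}  # hash key -> stored precedent

    @staticmethod
    def _key(text: str) -> str:
        return hashlib.sha256(normalize(text).encode("utf-8")).hexdigest()

    def add(self, text: str, decision: str) -> None:
        self._entries[self._key(text)] = {
            "text": text, "decision": decision, "vector": self._embed(text),
        }

    def lookup(self, text: str, top_k: int = 3) -> List[dict]:
        # Step 1: exact hash lookup, O(1).
        exact = self._entries.get(self._key(text))
        if exact is not None:
            return [{"text": exact["text"], "decision": exact["decision"],
                     "score": 1.0, "action": "apply"}]
        # Steps 2-3: on a miss, rank all precedents by similarity, keep top_k.
        query = self._embed(text)
        scored = sorted(
            ((cosine(query, e["vector"]), e) for e in self._entries.values()),
            key=lambda pair: pair[0], reverse=True,
        )[:top_k]
        # Step 4: map each score to an action; drop anything at or below 0.70.
        results = []
        for score, entry in scored:
            if score > APPLY_THRESHOLD:
                action = "apply"
            elif score > REFERENCE_THRESHOLD:
                action = "reference"
            else:
                continue
            results.append({"text": entry["text"], "decision": entry["decision"],
                            "score": score, "action": action})
        return results
```

With the toy embedding, a store containing "restart nginx" returns that precedent as a `"reference"` for "restart nginx now" (cosine about 0.82), while "npm install lodash" against "npm install express" scores about 0.67 and returns nothing. The fallback is a linear scan here; a larger database would likely want a vector index.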

Alternatively, implement a simpler keyword overlap approach using FTS5 in SQLite.
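A rough sketch of the FTS5 alternative, assuming the Python `sqlite3` module is built with FTS5 enabled (common, but not guaranteed). The table name `precedents`, its columns, and the helper names are hypothetical. The query ORs the input's words together and ranks matches with FTS5's built-in `bm25()` function, where lower values mean a better match.

```python
import re
import sqlite3
from typing import Iterable, List, Tuple


def build_index(rows: Iterable[Tuple[str, str]]) -> sqlite3.Connection:
    """Create an in-memory FTS5 table of (input, decision) precedents."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE VIRTUAL TABLE precedents USING fts5(input, decision UNINDEXED)"
    )
    conn.executemany(
        "INSERT INTO precedents (input, decision) VALUES (?, ?)", rows
    )
    return conn


def keyword_search(conn: sqlite3.Connection, text: str, limit: int = 3) -> List[tuple]:
    """Return up to `limit` precedents sharing at least one word with `text`."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    if not tokens:
        return []
    # Quote each token so it is treated as a literal term, then OR them together.
    query = " OR ".join(f'"{t}"' for t in tokens)
    return conn.execute(
        "SELECT input, decision, bm25(precedents) AS score "
        "FROM precedents WHERE precedents MATCH ? "
        "ORDER BY score LIMIT ?",
        (query, limit),
    ).fetchall()


conn = build_index([("restart nginx", "allow"), ("npm install express", "ask")])
hits = keyword_search(conn, "reload nginx configuration")
```

Here `hits` contains the "restart nginx" precedent because the two inputs share the word "nginx". This approach matches on shared words only, so it would not relate "restart" to "reload" on its own; BM25 scores are also not bounded to [0, 1], so the 0.85 / 0.70 thresholds would need a different calibration.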

Artur closed this issue 2026-06-16 14:18:13 +00:00
Reference
glow-all/decider#12