P1-03: No ensemble-level timeout — provider hang blocks HTTP request forever #9

Closed
opened 2026-06-16 13:57:01 +00:00 by Artur · 0 comments
Owner

Severity: P1 (High)
File: decider/ensemble.py line 60

Problem

The ensemble uses ThreadPoolExecutor without any timeout:

with ThreadPoolExecutor(max_workers=len(self.ensemble)) as pool:
    futures = {pool.submit(_call_one, cfg): cfg for cfg in self.ensemble}
    for future in as_completed(futures):
        vote = future.result()  # blocks indefinitely

If any provider hangs (network partition, API outage, dead connection), as_completed() never yields that provider's future, so the ensemble thread blocks forever → the HTTP request handler blocks forever → the worker thread is leaked. With ThreadingHTTPServer, enough hanging requests can consume ALL workers, causing a complete denial of service.
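
The blocking behaviour can be reproduced in isolation. The sketch below is not the code from decider/ensemble.py; `fast_provider`, `slow_provider` and `collect_votes` are hypothetical stand-ins, and the "hang" is a short sleep so the example terminates. It shows that the loop cannot finish until the slowest provider returns:

```python
import time
from concurrent.futures import ThreadPoolExecutor, as_completed


def fast_provider():
    # Hypothetical provider that answers immediately.
    return "allow"


def slow_provider(delay=0.5):
    # Hypothetical stand-in for a hung provider; a real hang would never return.
    time.sleep(delay)
    return "deny"


def collect_votes():
    start = time.monotonic()
    votes = []
    with ThreadPoolExecutor(max_workers=2) as pool:
        futures = [pool.submit(fast_provider), pool.submit(slow_provider)]
        # No timeout: the loop waits until every future has finished.
        for future in as_completed(futures):
            votes.append(future.result())
    return votes, time.monotonic() - start


if __name__ == "__main__":
    votes, elapsed = collect_votes()
    print(votes, f"{elapsed:.2f}s")
```

With a provider that never returns, `collect_votes()` would never return either. Exiting the `with` block also waits for all workers, because it calls `shutdown(wait=True)`.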

Fix

  1. Add an overall timeout parameter to Ensemble.decide()
  2. Use as_completed(futures, timeout=) or future.result(timeout=)
  3. Propagate timeout from config (server.decision_timeout)
  4. On timeout: log warning, return weighted result from completed providers only
  5. If no providers completed in time → fall back to ask_user
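
The steps above could look roughly like the sketch below. It is one possible shape, not the actual patch: `decide_with_timeout`, the `providers` mapping of name to `(callable, weight)`, and the logger name are assumptions made for illustration, and the real `Ensemble.decide()` would take its timeout from `server.decision_timeout`. It assumes Python 3.9+ for `cancel_futures`.

```python
import logging
from concurrent.futures import ThreadPoolExecutor, as_completed
from concurrent.futures import TimeoutError as FuturesTimeout

log = logging.getLogger("decider.ensemble")  # hypothetical logger name


def decide_with_timeout(providers, timeout):
    """Hypothetical helper: providers maps name -> (callable, weight)."""
    pool = ThreadPoolExecutor(max_workers=max(1, len(providers)))
    futures = {
        pool.submit(fn): (name, weight)
        for name, (fn, weight) in providers.items()
    }
    tally = {}
    try:
        # Overall deadline for the whole ensemble, not per provider.
        for future in as_completed(futures, timeout=timeout):
            name, weight = futures[future]
            try:
                vote = future.result()
            except Exception:
                log.warning("provider %s failed", name, exc_info=True)
                continue
            tally[vote] = tally.get(vote, 0.0) + weight
    except FuturesTimeout:
        pending = [futures[f][0] for f in futures if not f.done()]
        log.warning("ensemble timed out after %.1fs; pending: %s", timeout, pending)
    finally:
        # Avoid `with`: its exit calls shutdown(wait=True) and would block
        # on the hung workers again.
        pool.shutdown(wait=False, cancel_futures=True)
    if not tally:
        return "ask_user"  # no provider answered in time
    # Weighted result from the providers that did complete.
    return max(tally, key=tally.get)
```

Note that `shutdown(wait=False)` only stops the request from waiting. A worker that is already running cannot be cancelled, so the provider calls themselves would still need their own network timeouts to keep hung threads from accumulating.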
Artur closed this issue 2026-06-16 13:58:14 +00:00
Reference
glow-all/decider#9