How It Works
Zubbl uses a dual learning loop to continuously improve your AI agents, with no code changes beyond the initial wrap.
The Learning Loop
graph TB
A[Your Agent] -->|wrap once| B[Zubbl SDK]
B -->|1. Record| C[Extract 3 Steps Automatically]
C -->|2. Learn| D[Build Patterns by Category]
D -->|3. Score| E[Rank by Success Rate]
E -->|4. Policy| F[Build Recommendations]
F -->|5. Inject| G[~30 Token Strategy Prefix]
G -->|Next Run| A
C -->|On Failure| H[Reflexion: Analyze Why]
H -->|Partial Credit| D
What Happens When You Run wrap()
Step 1: Record (Zero Code Changes)
smart_agent = zubbl.wrap(my_agent) # One line. Never change again.
result = smart_agent("Write a sorting function")
Zubbl automatically extracts 3 steps from every call:

- Understand task: detects the task type (generate, debug, optimize, explain)
- Execute: detects the tools used from the output (code_generator, debugger, etc.)
- Validate: checks output quality
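The three-step extraction above can be pictured with a minimal sketch. This is not Zubbl's implementation: the keyword table, the `Step` type, the `text_generator` fallback label, and all function names are hypothetical stand-ins, and the detection heuristics are assumptions chosen only to make the idea concrete.

```python
from dataclasses import dataclass

# Hypothetical keyword table; the real detection logic is not documented here.
TASK_KEYWORDS = {
    "debug": ("fix", "debug", "error", "bug"),
    "optimize": ("optimize", "faster", "speed up"),
    "explain": ("explain", "describe", "why"),
    "generate": ("write", "create", "implement", "generate"),
}


@dataclass
class Step:
    name: str    # "understand", "execute", or "validate"
    detail: str  # what was detected at this step


def detect_task_type(prompt: str) -> str:
    """Return the first task type whose keywords appear in the prompt."""
    text = prompt.lower()
    for task_type, keywords in TASK_KEYWORDS.items():
        if any(keyword in text for keyword in keywords):
            return task_type
    return "generate"  # assumed default when nothing matches


def detect_tool(output: str) -> str:
    """Guess the tool from the output (assumed heuristic: looks like Python code)."""
    return "code_generator" if "def " in output else "text_generator"


def extract_steps(prompt: str, output: str) -> list:
    """Build the understand / execute / validate steps for one call."""
    return [
        Step("understand", detect_task_type(prompt)),
        Step("execute", detect_tool(output)),
        Step("validate", "ok" if output.strip() else "empty"),
    ]
```

Calling `extract_steps("Write a sorting function", agent_output)` yields one `Step` per phase, which is the kind of record the later stages would consume.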
Step 2: Extract Patterns
Similar tasks accumulate patterns under the same category. For example, "Write a function to sort" and "Write a function to reverse" are both filed under the writing category.
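One way to picture this grouping is a store keyed by category. The sketch below assumes the category is derived from the leading verb of the prompt; that rule, the `CATEGORY_BY_VERB` table, and the `PatternStore` class are illustrative assumptions, not Zubbl's documented behavior.

```python
from collections import defaultdict

# Hypothetical verb-to-category table; only "writing" is named in the text.
CATEGORY_BY_VERB = {
    "write": "writing",
    "fix": "debugging",
    "explain": "explaining",
}


def categorize(prompt: str) -> str:
    """Map a prompt to a category using its first word (assumed heuristic)."""
    words = prompt.lower().split()
    if not words:
        return "general"
    return CATEGORY_BY_VERB.get(words[0], "general")


class PatternStore:
    """Accumulates recorded patterns, grouped by category."""

    def __init__(self):
        self.by_category = defaultdict(list)

    def add(self, prompt: str, steps: list, success: bool) -> str:
        """Store one pattern and return the category it was filed under."""
        category = categorize(prompt)
        self.by_category[category].append(
            {"prompt": prompt, "steps": steps, "success": success}
        )
        return category
```

With this sketch, both example prompts from the text land in the same bucket, so their patterns can be compared against each other in later stages.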
Step 3: Score (Contrastive)
Patterns are ranked by success rate. Successful approaches score higher than failed ones. Failed trajectories still contribute — steps that worked before the failure get partial credit.
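The scoring idea can be sketched as follows. The exact formula Zubbl uses is not given here, so this example assumes a simple rule: a fully successful trajectory scores 1.0, and a failed one earns the fraction of steps that completed before the first failure. The function names are hypothetical.

```python
from __future__ import annotations


def score_trajectory(steps_ok: list[bool]) -> float:
    """Score one trajectory from per-step outcomes.

    Assumed rule: 1.0 on full success; otherwise partial credit equal to the
    share of steps that succeeded before the first failing step.
    """
    if not steps_ok:
        return 0.0
    if all(steps_ok):
        return 1.0
    first_failure = steps_ok.index(False)
    return first_failure / len(steps_ok)


def rank_patterns(runs_by_pattern: dict[str, list[list[bool]]]) -> list[tuple[str, float]]:
    """Rank patterns by mean trajectory score, highest first."""
    scored = [
        (name, sum(score_trajectory(run) for run in runs) / len(runs))
        for name, runs in runs_by_pattern.items()
        if runs  # skip patterns with no recorded runs
    ]
    return sorted(scored, key=lambda item: item[1], reverse=True)
```

Under this assumed rule, a run that fails at the last of three steps still contributes 2/3 rather than zero, which is the sense in which failed trajectories keep contributing.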
Step 4: Build Policy
Once a category has accumulated at least two patterns, Zubbl builds a policy for it with actionable recommendations.
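A minimal sketch of the threshold behavior follows. Only the "2+ patterns" threshold comes from the text; the pattern shape, the `build_policy` name, and the choice to recommend the most frequent successful step sequence are assumptions made for illustration.

```python
from collections import Counter

MIN_PATTERNS = 2  # threshold stated in the text


def build_policy(patterns: list):
    """Return a recommended step sequence, or None if there is too little data.

    Each pattern is assumed to look like:
        {"steps": [(tool, action), ...], "success": bool}
    The recommendation is the step sequence seen most often among successes.
    """
    if len(patterns) < MIN_PATTERNS:
        return None
    successful = [tuple(p["steps"]) for p in patterns if p["success"]]
    if not successful:
        return None
    best_sequence, _count = Counter(successful).most_common(1)[0]
    return list(best_sequence)
```

Returning `None` below the threshold keeps the caller's logic simple: with no policy, the prompt is passed to the agent unchanged.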
Step 5: Inject (~30 Tokens)
Next time the agent runs a similar task, the learned strategy is prepended:
[Strategy: planner→understand, code_generator→generate, checker→validate]
Write a function to sort a list
~30 tokens of overhead. The agent gets better without anyone changing any code.
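The injection step amounts to formatting the policy and prepending it to the prompt. The sketch below reproduces the prefix format shown above; the function names and the list-of-pairs policy shape are hypothetical.

```python
def format_strategy(policy: list) -> str:
    """Render (tool, action) pairs in the bracketed prefix format shown above."""
    pairs = ", ".join(f"{tool}→{action}" for tool, action in policy)
    return f"[Strategy: {pairs}]"


def inject(prompt: str, policy) -> str:
    """Prepend the strategy line to the prompt; pass through if no policy exists."""
    if not policy:
        return prompt
    return f"{format_strategy(policy)}\n{prompt}"


policy = [
    ("planner", "understand"),
    ("code_generator", "generate"),
    ("checker", "validate"),
]
augmented_prompt = inject("Write a function to sort a list", policy)
```

Because the prefix is a single short line, the original prompt reaches the agent otherwise unmodified.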
On Failure: Reflexion
When an agent fails:

1. An LLM analyzes why it failed.
2. The reflection is stored as a negative pattern.
3. Steps that worked before the failure get partial credit.
4. The next run avoids the same mistake.
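The failure path can be sketched as a function that takes the LLM as an injected callable. Everything here is an assumption for illustration: the `FailureRecord` type, the prompt wording, and the `analyze` parameter, which stands in for whichever LLM call performs the analysis.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class FailureRecord:
    task: str
    reflection: str       # the LLM's explanation, stored as a negative pattern
    credited_steps: list  # steps that completed before the failure


def reflect_on_failure(
    task: str,
    steps: list,
    failed_index: int,
    analyze: Callable[[str], str],
) -> FailureRecord:
    """Ask the analyzer why the run failed and credit the steps that worked.

    `analyze` is a hypothetical hook for an LLM call: it receives a text
    prompt and returns the reflection as a string.
    """
    question = (
        f"Task: {task}\n"
        f"Failed at step: {steps[failed_index]}\n"
        "Why did this fail?"
    )
    return FailureRecord(
        task=task,
        reflection=analyze(question),
        credited_steps=steps[:failed_index],
    )
```

Passing the analyzer in as a parameter keeps the sketch testable without a model: a plain function returning a fixed string is enough to exercise it.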
Benchmark Results
| Metric | Result |
|---|---|
| Tasks to reach confidence 1.0 | 8 per category |
| Actions learned per category | 3 |
| Token overhead | ~30 tokens |
| Code changes required | Zero |
| Success rate | 100% |
Dashboard Feedback (No Code Changes)
- Agent runs → trajectory recorded automatically
- Go to dashboard → click trajectory → rate it → add comment
- Feedback feeds into the learning loop
- Next run is better — no code changes