
How It Works

Zubbl uses a dual learning loop to continuously improve your AI agents, with no code changes after the initial wrap() call.


The Learning Loop

graph TB
    A[Your Agent] -->|wrap once| B[Zubbl SDK]
    B -->|1. Record| C[Extract 3 Steps Automatically]
    C -->|2. Learn| D[Build Patterns by Category]
    D -->|3. Score| E[Rank by Success Rate]
    E -->|4. Policy| F[Build Recommendations]
    F -->|5. Inject| G[~30 Token Strategy Prefix]
    G -->|Next Run| A

    C -->|On Failure| H[Reflexion: Analyze Why]
    H -->|Partial Credit| D

What Happens When You Run wrap()

Step 1: Record (Zero Code Changes)

import zubbl

smart_agent = zubbl.wrap(my_agent)  # One line. Never change again.
result = smart_agent("Write a sorting function")

Zubbl automatically extracts 3 steps from every call:

  - Understand task: detects task type (generate, debug, optimize, explain)
  - Execute: detects tools used from output (code_generator, debugger, etc.)
  - Validate: checks output quality
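To make the three extracted steps concrete, here is a minimal sketch of how such an extraction could work. It is an illustration only: the keyword tables, the helper names (`detect_task_type`, `detect_tools`, `extract_steps`), and the output heuristics are assumptions, not Zubbl's actual rules.

```python
# Hypothetical keyword table; Zubbl's real detection logic is not documented here.
TASK_KEYWORDS = {
    "debug": ("fix", "debug", "error", "bug"),
    "optimize": ("optimize", "faster", "speed up"),
    "explain": ("explain", "describe", "why"),
    "generate": ("write", "create", "implement", "generate"),
}


def detect_task_type(prompt: str) -> str:
    """Return the first task type whose keywords appear in the prompt."""
    text = prompt.lower()
    for task_type, keywords in TASK_KEYWORDS.items():
        if any(keyword in text for keyword in keywords):
            return task_type
    return "generate"  # assumed default when nothing matches


def detect_tools(output: str) -> list[str]:
    """Guess which tools were used from simple markers in the output (assumed heuristics)."""
    tools = []
    if "def " in output or "class " in output:
        tools.append("code_generator")
    if "traceback" in output.lower():
        tools.append("debugger")
    return tools


def extract_steps(prompt: str, output: str) -> list[dict]:
    """Build the three steps: understand the task, execute, validate."""
    return [
        {"step": "understand_task", "task_type": detect_task_type(prompt)},
        {"step": "execute", "tools": detect_tools(output)},
        # Assumed quality check: the output is simply non-empty.
        {"step": "validate", "passed": bool(output.strip())},
    ]
```

Calling `extract_steps("Write a sorting function", agent_output)` would yield one record per step, which is the shape the later sketches on this page assume.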

Step 2: Extract Patterns

Similar tasks accumulate patterns under the same category. "Write a function to sort" and "Write a function to reverse" both go under writing.
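One simple way to picture this grouping is a lookup on the leading verb of the prompt. The sketch below is hypothetical: the verb table, the "general" fallback, and the `record_pattern` helper are assumptions used for illustration, and Zubbl may categorize tasks quite differently.

```python
from collections import defaultdict

# Hypothetical verb-to-category table (not taken from Zubbl).
CATEGORY_BY_VERB = {
    "write": "writing",
    "create": "writing",
    "fix": "debugging",
    "debug": "debugging",
}

# category -> list of recorded patterns
patterns = defaultdict(list)


def categorize(prompt: str) -> str:
    """Map a prompt to a category using its first word."""
    words = prompt.lower().split()
    first_word = words[0] if words else ""
    return CATEGORY_BY_VERB.get(first_word, "general")


def record_pattern(prompt: str, tools: list[str], success: bool) -> str:
    """Store one pattern under the prompt's category and return that category."""
    category = categorize(prompt)
    patterns[category].append({"tools": tuple(tools), "success": success})
    return category
```

With this sketch, "Write a function to sort" and "Write a function to reverse" both land in `patterns["writing"]`.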

Step 3: Score (Contrastive)

Patterns are ranked by success rate. Successful approaches score higher than failed ones. Failed trajectories still contribute — steps that worked before the failure get partial credit.
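The scoring idea can be sketched as follows. This is an assumption-laden illustration rather than Zubbl's scoring formula: the 0.5 partial-credit weight, the trajectory format (a list of `(action, ok)` pairs plus an overall success flag), and the `score_actions` name are all hypothetical.

```python
from collections import defaultdict

PARTIAL_CREDIT = 0.5  # assumed weight for steps that worked before a failure


def score_actions(trajectories):
    """Rank actions by average credit across all trajectories.

    Each trajectory is (steps, succeeded), where steps is a list of
    (action, ok) pairs in execution order.
    """
    credit = defaultdict(float)
    seen = defaultdict(int)
    for steps, succeeded in trajectories:
        failed_yet = False
        for action, ok in steps:
            seen[action] += 1
            if succeeded:
                credit[action] += 1.0  # full credit in a successful run
            elif ok and not failed_yet:
                credit[action] += PARTIAL_CREDIT  # worked before the failure
            else:
                failed_yet = True  # the failing step and anything after it earn nothing
    scores = {action: credit[action] / seen[action] for action in seen}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)
```

In this sketch an action that only ever appears in failed runs can still score above zero, as long as it ran cleanly before the step that failed.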

Step 4: Build Policy

Once a category has accumulated at least two patterns, Zubbl builds a policy for it with actionable recommendations:

writing → planner, code_generator, quality_checker
debugging → planner, debugger, quality_checker
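A policy builder along these lines might look like the sketch below. The two-pattern threshold comes from the text above; everything else (the `build_policy` name, the pattern dictionaries, and picking the tool sequence with the best success rate) is an assumption for illustration.

```python
from collections import defaultdict

MIN_PATTERNS = 2  # threshold stated above: a policy needs 2+ patterns per category


def build_policy(patterns_by_category):
    """Return {category: recommended tool sequence} for categories with enough data."""
    policy = {}
    for category, patterns in patterns_by_category.items():
        if len(patterns) < MIN_PATTERNS:
            continue  # not enough evidence yet
        stats = defaultdict(lambda: [0, 0])  # tool sequence -> [successes, runs]
        for pattern in patterns:
            entry = stats[tuple(pattern["tools"])]
            entry[0] += int(pattern["success"])
            entry[1] += 1
        # Assumed ranking: highest success rate first, more runs as the tie-breaker.
        best = max(stats, key=lambda seq: (stats[seq][0] / stats[seq][1], stats[seq][1]))
        policy[category] = list(best)
    return policy
```

A category with a single recorded pattern is skipped entirely, so no recommendation is produced for it until more runs arrive.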

Step 5: Inject (~30 Tokens)

The next time the agent runs a similar task, the learned strategy is prepended to the prompt:

[Strategy: planner→understand, code_generator→generate, checker→validate]
Write a function to sort a list

~30 tokens of overhead. The agent gets better without anyone changing any code.
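The injection step amounts to string prepending. The sketch below reproduces the prefix format shown above; the `inject_strategy` name and the `(tool, purpose)` pair representation are assumptions, not part of Zubbl's documented API.

```python
def inject_strategy(prompt: str, strategy: list[tuple[str, str]]) -> str:
    """Prepend a one-line strategy prefix to the prompt.

    `strategy` is a list of (tool, purpose) pairs; an empty list leaves
    the prompt unchanged.
    """
    if not strategy:
        return prompt
    pairs = ", ".join(f"{tool}→{purpose}" for tool, purpose in strategy)
    return f"[Strategy: {pairs}]\n{prompt}"


# Example usage with the strategy from the text above.
strategy = [
    ("planner", "understand"),
    ("code_generator", "generate"),
    ("checker", "validate"),
]
prompt_with_prefix = inject_strategy("Write a function to sort a list", strategy)
```

Because the prefix is a single short line, its cost stays roughly constant regardless of how long the original prompt is.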


On Failure: Reflexion

When an agent fails:

  1. An LLM analyzes why it failed
  2. The reflection is stored as a negative pattern
  3. Steps that worked before the failure get partial credit
  4. Next run avoids the same mistake
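The failure path can be sketched as below. This is a hypothetical outline, not Zubbl's implementation: the LLM call is replaced by an `analyze` callable that you pass in, the memory is a plain list, and "similar task" is reduced to an exact match on the task string.

```python
def reflect_on_failure(task, steps, error, analyze, memory):
    """Store a reflection on a failed run as a negative pattern.

    `steps` is a list of (action, ok) pairs in execution order.
    `analyze` stands in for the LLM call and returns a text reflection.
    """
    reflection = analyze(task, steps, error)

    # Steps that ran cleanly before the first failing step earn partial credit.
    worked_before_failure = []
    for action, ok in steps:
        if not ok:
            break
        worked_before_failure.append(action)

    memory.append({
        "task": task,
        "kind": "negative",
        "reflection": reflection,
        "partial_credit": worked_before_failure,
    })
    return reflection


def lessons_for(task, memory):
    """Return stored reflections for this task so the next run can avoid them."""
    return [
        entry["reflection"]
        for entry in memory
        if entry["kind"] == "negative" and entry["task"] == task
    ]
```

On the next run of the same task, the result of `lessons_for` could be added to the prompt alongside the strategy prefix.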


Benchmark Results

Metric                         | Result
-------------------------------|----------------
Tasks to reach confidence 1.0  | 8 per category
Actions learned per category   | 3
Token overhead                 | ~30 tokens
Code changes required          | Zero
Success rate                   | 100%

Dashboard Feedback (No Code Changes)

  1. Agent runs → trajectory recorded automatically
  2. Go to dashboard → click trajectory → rate it → add comment
  3. Feedback feeds into the learning loop
  4. Next run is better — no code changes