Data Pipeline & Agent Architecture
We don’t store data.
We store meaning.
Raw signals are processed and discarded. Only distilled intelligence is stored. Every stage compresses the data into a denser, more useful form. The vector store holds briefings, action records, and annotations — not events.
The distillation pipeline
Stage 01
Raw Ingest
GDELT · Google Trends · Reddit · YouTube · Platform APIs · Financial filings
Processed in memory only. Never written to disk or database.
Extract: topic · velocity · source weight · category relevance
Raw data discarded
Stage 02
Signal Triage
Score: source reliability × velocity × persistence × category relevance
Below threshold → logged as a single-line count, then discarded
Above threshold → promoted to a candidate signal object
Corroborated → confirmed signal with confidence + timing attached
Unconfirmed signals discarded
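A minimal sketch of the Stage 02 triage score, assuming each factor is normalised to [0, 1] and combined multiplicatively so that weakness on any one factor sinks the signal. The field names and threshold are illustrative, not the system's actual API.

```python
from dataclasses import dataclass

@dataclass
class RawSignal:
    topic: str
    source_weight: float   # source reliability, 0..1
    velocity: float        # rate of mention growth, 0..1
    persistence: float     # sustained presence across windows, 0..1
    relevance: float       # category relevance, 0..1

THRESHOLD = 0.1  # illustrative cut-off

def triage_score(s: RawSignal) -> float:
    """Multiplicative score: a weak factor anywhere discards the signal."""
    return s.source_weight * s.velocity * s.persistence * s.relevance

def triage(signals: list[RawSignal]) -> tuple[list[RawSignal], int]:
    """Return candidate signals plus the count of discards.

    Only the count survives — the discarded raw signals themselves
    are never written anywhere, matching the Stage 01/02 contract.
    """
    candidates = [s for s in signals if triage_score(s) >= THRESHOLD]
    return candidates, len(signals) - len(candidates)
```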
Stage 03
Briefing Functions
First LLM call. Receives confirmed signals + retrieved brand context.
Produces structured briefing: ranked signals · brand context · rules triggered · timing windows
Output is JSON, not prose. Compact. Queryable.
Briefing stored in vector store
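A sketch of what "JSON, not prose" might look like for a briefing object. Field names are assumptions extrapolated from the Fast Track example later in this document, not a published schema.

```python
import json

# Illustrative briefing shape — every field is flat and queryable,
# so the Payload Assembler can select pieces without parsing prose.
briefing = {
    "entity": "beauty_brand",
    "generated_at": "07:00",
    "signals": [
        {"name": "purple_lip_trend", "confidence": 0.87, "peak": "T-4d"},
    ],
    "brand_context": {"trend_reactive": True, "staged_creative": 4},
    "rules_triggered": ["opt_prepeakbid_003"],
    "timing_windows": [{"window": "pre_peak", "open": True}],
    "discarded_signals": 847,
}

# Compact serialisation: no whitespace, ready for vector-store storage.
serialized = json.dumps(briefing, separators=(",", ":"))
```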
Stage 04
Payload Assembler
No LLM. Pure code. Deterministic. Queries vector store by semantic similarity · Selects relevant briefings + brand context · Injects skill + tolerances · Enforces token budget
Most critical engineering piece
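The token-budget enforcement can be sketched as a deterministic greedy pass — no LLM anywhere. This assumes briefings arrive already ranked by semantic similarity, and uses the rough four-characters-per-token heuristic as a stand-in estimator; both are assumptions, not the system's actual implementation.

```python
def assemble_payload(briefings: list[dict], budget_tokens: int,
                     estimate=lambda d: len(str(d)) // 4) -> list[dict]:
    """Greedy, deterministic selection: take briefings in ranked order
    until the token budget is exhausted. Pure code — same inputs always
    produce the same payload."""
    payload, used = [], 0
    for b in briefings:
        cost = estimate(b)
        if used + cost > budget_tokens:
            break
        payload.append(b)
        used += cost
    return payload
```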
Stage 05
Recommendation Bot
Second LLM call. Receives assembled payload. Multi-pass reasoning.
Returns structured JSON: action · confidence · reasoning · tolerance check · escalate flag
Routes to operator approval queue.
Recommendation stored
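The tolerance check and routing step might look like the following sketch. The field names, and the assumption that out-of-band or flagged recommendations go to a separate escalation queue rather than the normal approval queue, are illustrative.

```python
def route_recommendation(rec: dict, band_low: float, band_high: float) -> str:
    """Verify the proposed spend sits inside the tolerance band, then route.

    Anything outside the band, or anything the model itself flagged,
    escalates instead of entering the normal operator approval queue.
    """
    within = band_low <= rec["proposed_spend"] <= band_high
    rec["within_band"] = within
    if rec.get("escalate") or not within:
        return "escalation_queue"
    return "operator_approval_queue"
```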
Stage 06
MCP Execution
Approved instruction → MCP adapter → platform API
Platform executes · Returns current state
Duplex: “How are you doing?” → “Do this.” → “Now how are we doing?”
Action + outcome logged
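The duplex rhythm — read state, act, read state again — can be sketched with a stub adapter. The adapter class, its method names, and the instruction shape are all assumptions for illustration, not the MCP adapter's real interface.

```python
class StubDSPAdapter:
    """Stand-in for an MCP platform adapter (illustrative, not real MCP)."""
    def __init__(self):
        self.state = {"bid": 1.00, "ctr": 0.018}

    def get_state(self) -> dict:
        # "How are you doing?"
        return dict(self.state)

    def apply(self, instruction: dict) -> None:
        # "Do this."
        if instruction["action"] == "bid_adjustment":
            self.state["bid"] *= 1 + instruction["pct"] / 100

def execute(adapter, instruction: dict) -> dict:
    """Full duplex cycle: snapshot → act → snapshot.

    The returned record (instruction + before/after state) is exactly
    what gets logged as action + outcome.
    """
    before = adapter.get_state()
    adapter.apply(instruction)
    after = adapter.get_state()  # "Now how are we doing?"
    return {"instruction": instruction, "before": before, "after": after}
```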
Stage 07
Analysis Bot
Third LLM call. Receives action taken + platform outcome.
Produces annotation: what happened · vs expectation · score · verdict
Verdict: REPEAT · ADAPT · DO NOT REPEAT
Annotation stored in vector store
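Mapping the Analysis Bot's score to one of the three verdicts is a simple threshold function. The cut-off values here are assumptions; the document specifies only the verdict vocabulary, not the boundaries.

```python
def verdict(score: float) -> str:
    """Map an outcome score (0-10) to a verdict.

    REPEAT        — outcome clearly worth repeating
    ADAPT         — partially worked; adjust before reuse
    DO NOT REPEAT — failed; recorded so it is never retried blindly
    """
    if score >= 7.0:
        return "REPEAT"
    if score >= 4.0:
        return "ADAPT"
    return "DO NOT REPEAT"
```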
The six bots — each with one job
Bot 01
Signal Bot
Pulls and triages raw signals
Schedule
02:00–02:20 nightly
In
GDELT · Google Trends · Reddit · Platform APIs
Out
signal_staging · signal_run_log
No LLM
Bot 02
Briefing Functions
Turns signals into intelligence
Schedule
07:00 after operator review
In
signal_staging (elevated, reviewed)
Out
signal_briefings (vector indexed)
LLM Call #1
Bot 03
Analysis Bot
Annotates outcomes, writes verdicts
Schedule
03:00 nightly
In
Platform APIs · optimisation_log (unannotated)
Out
performance_snapshots · optimisation_log (annotated)
LLM Call #3
Bot 04
Recommendation Bot
Multi-pass reasoning, generates actions
Schedule
04:00 nightly
In
signal_briefings · performance_snapshots · optimisation_log · placement_patterns · brand_knowledge
Out
recommendations (pending operator approval)
LLM Call #2
Bot 05
Optimisation Bot
Executes approved recommendations
Schedule
After operator approval
In
recommendations (approved)
Out
optimisation_log · placements (state update)
No LLM
Bot 06
Summary Bot
Compresses hot tier to cold tier
Schedule
Sunday night weekly
In
optimisation_log · recommendations (expiring)
Out
placement_patterns · weekly_summaries · brand_knowledge
No LLM
What lives in the vector store
The vector store is small, dense, and high-signal.
Not raw signals. Not platform data. Not campaign reports. Only distilled, structured intelligence that the Payload Assembler can retrieve by semantic similarity.
Signal Briefings
Daily, weekly, monthly intelligence objects. Each a compressed summary of the signal environment for this entity at this moment.
NOT the raw signals that produced them
Brand Knowledge
Identity, positioning, products, rules, governance. Updated when action outcomes suggest a rule should evolve.
NOT a static document — living and versioned
Optimisation Log
Every instruction issued. The reasoning payload. The platform outcome. The Analysis Bot’s verdict and score.
NOT platform raw data — only decisions and outcomes
Placement Patterns
Compressed from expiring hot tier rows. REPEAT, ADAPT, STOP. The accumulated institutional memory no competitor can replicate.
NOT campaign reports — distilled behavioural intelligence
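Retrieval by semantic similarity over these four record types reduces, at its core, to cosine similarity over embeddings. A minimal sketch, assuming a flat in-memory store of `(embedding, record)` pairs; a real deployment would use a vector database, and the embedding step itself is omitted.

```python
import math

def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def retrieve(query_vec: list[float], store: list[tuple], k: int = 2) -> list:
    """Return the top-k records most similar to the query embedding."""
    ranked = sorted(store, key=lambda item: cosine(query_vec, item[0]),
                    reverse=True)
    return [record for _, record in ranked[:k]]
```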
What a daily Fast Track briefing looks like
Briefing Output — Fast Track Daily · Beauty Brand · 07:00
Structured JSON. Compact. Every field queryable by the Payload Assembler. Not prose — precision.
Signals (confirmed, ranked)
purple_lip_trend: confidence 0.87 · peak T−4 days
berry_aesthetic: confidence 0.72 · peak T−11 days
competitor_A_dark: spend drop 34% · 3 days
CPM_beauty_UK: trending down 8% · 48hrs
discarded_signals: 847 below threshold
Brand Context Retrieved
matching_products: Luxe Violet Lip · Berry Gloss
brand_permission: trend_reactive TRUE
staged_creative: 4 assets ready · violet theme
rules_triggered: 2 of 47 retrieved
mandate_progress: 1.61:1 · target 2:1
Recommended Attention
priority_1: Amazon DSP bid amplification — pre-peak window open
priority_2: Competitor dark — reallocate SOV budget
prior_pattern: REPEAT · pre-peak bid scored 9.1/10 · 4/4
daily_envelope: £4,200 · band ±£840
Briefing → Skill → Agent instruction
How a briefing becomes an Amazon DSP instruction via MCP
Briefing (vector retrieved)
purple_lip_trend · 0.87 · peak T−4
prior_pattern: REPEAT pre-peak bid
competitor_dark · CPM falling
mandate: 1.61:1 · needs acceleration
Skill: amazon_dsp_optimise_v2
You are optimising an Amazon DSP campaign.
Daily envelope: £4,200. Band: ±20%.
Return mandate: 2:1. Current: 1.61:1.
CPA floor: £4.20. Output: JSON only.
Brain Output (JSON)
action: increase_bid_violet_lip_18%
reallocate: awareness → conversion £620
pause: creative_3 (frequency exceeded)
confidence: 0.91 · within_band: true
MCP → Amazon DSP
bid_adjustment: Luxe Violet Lip +18%
budget_shift: line_item_003 +£620
creative_pause: ad_id_cr3
↩ confirmed · CTR +22% vs yesterday
Analysis Bot — the learning loop
Every execution produces an annotation. Every annotation feeds back into the rules.
Annotation Record · Amazon DSP · 14:32
Action taken: Bid +18% on Luxe Violet Lip · Reallocated £620 to conversion
Signal basis: Pre-peak window T−4 · Competitor dark · CPM falling
Outcome: CPA £4.80 → £3.90 · CTR +22% · ROAS 2.4:1 on segment
vs expectation: Outperformed · Mandate progress +0.29 in 6hrs
Pattern: Pre-peak bid amplification during competitor dark period compounds efficiency. Effect stronger when CPM also falling.
Repeat — Score 9.4 / 10
Learned Rule Update · Written to knowledge store
Rule ID: opt_prepeakbid_003
Condition: Horizon signal T−3 to T−5 AND competitor spend declining AND CPM falling
Action: Increase bid 15–20% · Shift budget from awareness to conversion
Evidence: 5/5 successful · Avg CPA improvement 27% · Avg ROAS uplift 0.6
Confidence: HIGH · Apply autonomously within Fast Track band
Exceptions: Do not apply if no_trend_reactive set · Do not apply if CPA floor at risk
Autonomous — apply without sign-off
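Evaluating whether `opt_prepeakbid_003` fires against current context can be sketched as a predicate: exceptions are checked first, then the three AND-ed conditions. The context field names are illustrative, not the stored rule format.

```python
def rule_applies(ctx: dict) -> bool:
    """Evaluate opt_prepeakbid_003 against the current context.

    Exceptions block autonomous application outright; otherwise all
    three conditions must hold: horizon signal at T-3 to T-5, competitor
    spend declining, and CPM falling.
    """
    if ctx.get("no_trend_reactive") or ctx.get("cpa_floor_at_risk"):
        return False
    return (3 <= ctx["days_to_peak"] <= 5
            and ctx["competitor_spend_declining"]
            and ctx["cpm_falling"])
```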
The compounding moat: Every campaign the system runs produces annotated action records. Every annotation refines a rule. Every refined rule makes the next campaign smarter. After 12 months of operation, the system carries institutional knowledge about this brand’s category, competitors, and optimal patterns that took years of human experience to accumulate — and that no competitor can replicate without running the same campaigns. The moat is not the algorithm. It is the annotated history.