Context engine

Last updated 5/24/2026

Colony Context Engine

Document Metadata

Field	Value
Document ID	CTX-001
Product	Colony — ZoomProp GTM Agentic Stack Command Interface
Repository	`midwestco/colony`
Methodology	PRD-driven Context Engineering v0.1
Status	v0.3 Draft — Expansion Phase
Last Revised	2026-Q2
Owner	Platform / Agent Runtime Team

1. Purpose and Scope

This document defines how Colony manages AI context across its multi-agent runtime. Colony is not a single-model application — it is an orchestrated stack of specialist agents, each requiring coherent, scoped, and verifiable context to execute GTM workflows reliably. This document answers three questions for every engineering and product decision involving LLMs:

What does each agent know when it starts, and where did that knowledge come from?
How does knowledge accumulate and persist across sessions, users, and agent handoffs?
How is context density optimized so that every LLM token budget is spent on signal, not noise?

2. Domain Profile

2.1 Product Domain

Colony operates in the B2B go-to-market automation domain for ZoomProp, a proptech product targeting healthcare organizations. The GTM motion combines outbound prospecting, deal pipeline management, call intelligence, and post-sale onboarding — all orchestrated through a conversational command interface at /overview.

2.2 Key Entities

Entity ID	Name	Description	Primary Table / Source
ENT-001	Organization	A prospect or customer company; tenant boundary in Clerk	`organizations`
ENT-002	Contact	Individual at a prospect org; Pipedrive person record	`contacts`
ENT-003	Deal	9-stage pipeline record, synced bidirectionally with Pipedrive	`deals`
ENT-004	Sequence	T1–T5 outbound cadence with 5 messaging angles	`sequences`
ENT-005	Message	Individual outbound message in a sequence step	`messages`
ENT-006	Recording	Gemini Meet Notes file from Google Drive	`recordings`
ENT-007	Knowledge Core Entry	One of 10 domain buckets; pgvector-indexed	`knowledge_core`
ENT-008	Deployment Kit	8-asset onboarding bundle auto-generated on Closed-Won	`deployment_kits`
ENT-009	Daily Brief	Morning digest of pipeline state, alerts, and queue	`daily_briefs`
ENT-010	Candidate	Prospected contact pending qualification approval	`candidates`
ENT-011	Approval Item	Queued message or candidate awaiting human sign-off	`approval_queue`
ENT-012	Content Piece	One of 6 content pillars; pipeline-attributed	`content_pieces`

2.3 Domain Terminology

Term	Definition
ICP	Ideal Customer Profile — healthcare org matching ZoomProp's target criteria
Anti-ICP	Explicit exclusion list; hard pre-filter applied before any LLM scoring (`docs/phase2/action_plans/16_anti_icp_hard_prefilter.md`)
T1–T5	Sequence tier; T1 is the highest-signal, lowest-volume touch
5 Angles	Signal-led, Pain-led, Referral-led, Pattern-break, Insight-led — the five outbound message frames
HOT queue	Reply queue for prospects who have engaged; routed ahead of cold sequences
Closed-Won	Deal stage that triggers Deployment Kit auto-generation
Knowledge Core	10-domain pgvector store that grounds all LLM generations
Deployment Kit	8 assets auto-generated on Closed-Won: pitch deck, one-pager, case-study shortlist, pricing reference, rollout calendar, stakeholder map, risk register, KPI dashboard
Circuit Breaker	Per-org rate guard that pauses outbound when bounce/reply thresholds are exceeded
Inngest	Durable function runtime that manages agent task queues
GTM Orchestrator	Primary conversational agent behind the `/overview` chat interface
Campaign Orchestrator	Sub-orchestrator for multi-turn prospecting → qualification → message-gen loops

2.4 Constraints

Constraint ID	Description
CONST-001	All LLM calls are organization-scoped; no cross-tenant context bleed
CONST-002	Anti-ICP filter (`ENT-010` pre-filter) runs before any LLM scoring to avoid wasting tokens on ineligible prospects
CONST-003	Credentials live in a KMS-encrypted per-org vault (`api_keys` table); agents receive decrypted credentials at invocation time only
CONST-004	Every agent invocation is observable via Langfuse; traces must include `org_id`, `agent_id`, and `run_id`
CONST-005	Circuit breakers (`docs/phase1/action_plans/22_circuit_breakers.md`) can halt agent execution mid-campaign; context must be serializable for resume
CONST-006	pgvector index uses 1536-dimensional embeddings (OpenAI `text-embedding-3-small` compatible dimensions)

3. Context Architecture

3.1 Structural Overview

Colony's context architecture has four layers, each with a distinct persistence model and retrieval mechanism:

┌─────────────────────────────────────────────────────────────────┐
│  LAYER 4 — Ephemeral / In-Flight                                │
│  SSE stream state · tool-call buffers · Inngest step variables  │
└──────────────────────────┬──────────────────────────────────────┘
                           │ hydrates
┌──────────────────────────▼──────────────────────────────────────┐
│  LAYER 3 — Session Context                                      │
│  Chat turn history · approval-queue state · active campaign run │
│  Stored: Postgres `chat_sessions`, `campaign_runs`              │
└──────────────────────────┬──────────────────────────────────────┘
                           │ augments
┌──────────────────────────▼──────────────────────────────────────┐
│  LAYER 2 — Org Context                                          │
│  ICP definition · sequence library · deal state · recordings    │
│  Stored: Postgres relational tables (per-org row isolation)     │
└──────────────────────────┬──────────────────────────────────────┘
                           │ grounds
┌──────────────────────────▼──────────────────────────────────────┐
│  LAYER 1 — Knowledge Core (Semantic / Persistent)               │
│  10-domain pgvector store · exemplar messages · case studies    │
│  Retrieved: cosine similarity search, top-k per domain          │
└─────────────────────────────────────────────────────────────────┘

3.2 Knowledge Core — The Semantic Foundation

The Knowledge Core is Colony's single source of truth for grounding LLM outputs. It is a pgvector-backed store (Cloud SQL Postgres 16, instance colony-39989:…:colony, extension pgvector at 1536 dimensions) partitioned into 10 domain buckets.

Every message, content piece, and brief generation first queries the Knowledge Core with a cosine similarity search and prepends the top-k exemplars to the LLM prompt. This is the primary mechanism for style and domain consistency.
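The retrieval step described above can be sketched in memory. This is only an illustration under stated assumptions: in Colony the search runs inside pgvector, and the names `KcEntry`, `cosineSimilarity`, and `topKExemplars` are hypothetical, as are the 3-dimensional toy vectors standing in for 1536-dimensional embeddings.

```typescript
// Hypothetical in-memory stand-in for a Knowledge Core row.
interface KcEntry {
  domain: string;      // e.g. "KC-08"
  content: string;
  embedding: number[]; // 1536-dim in Colony; 3-dim here for readability
}

// Cosine similarity between two vectors of equal length.
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    const x = a[i] ?? 0;
    const y = b[i] ?? 0;
    dot += x * y;
    normA += x * x;
    normB += y * y;
  }
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Return the k entries in one domain that are most similar to the query vector.
function topKExemplars(entries: KcEntry[], query: number[], domain: string, k: number): KcEntry[] {
  return entries
    .filter((e) => e.domain === domain)
    .map((e) => ({ entry: e, score: cosineSimilarity(e.embedding, query) }))
    .sort((p, q) => q.score - p.score)
    .slice(0, k)
    .map((p) => p.entry);
}

const kcSample: KcEntry[] = [
  { domain: "KC-08", content: "signal-led opener", embedding: [1, 0, 0] },
  { domain: "KC-08", content: "pain-led opener", embedding: [0, 1, 0] },
  { domain: "KC-04", content: "pricing objection", embedding: [1, 0, 0] },
];
const retrieved = topKExemplars(kcSample, [0.9, 0.1, 0], "KC-08", 1);
```

Filtering by domain before ranking mirrors the "top-k per domain" retrieval named in the Layer 1 description.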

10 Knowledge Core domains (from docs/phase1/action_plans/24_knowledge_core_editor.md):

Domain ID	Bucket
KC-01	Company narrative / positioning
KC-02	ICP profiles and scoring criteria
KC-03	Competitive intelligence
KC-04	Objection handling
KC-05	Case studies and social proof
KC-06	Product capabilities and integrations
KC-07	Pricing and commercial terms
KC-08	Outbound message exemplars (by angle)
KC-09	Post-call and recording summaries
KC-10	Onboarding playbooks and deployment guides

Notion sync (docs/phase2/action_plans/15_notion_kc_sync.md) keeps KC-01 through KC-07 current with human-maintained knowledge documents, propagating updates through a webhook-triggered re-embedding pipeline.

3.3 Postgres Schema as Context Store

The relational schema serves as the structured context store for all non-semantic data. Key context-carrying tables:

Table	Context Role
`organizations`	Tenant root; all context is scoped by `org_id`
`contacts` / `candidates`	Prospect profile, enrichment fields, ICP score
`deals`	9-stage pipeline state; source of truth for deal-stage-gated agent triggers
`sequences`	Active cadences with angle assignments and tier levels
`recordings`	Gemini Meet Notes; signal extraction stored as JSONB
`knowledge_core`	Vector + metadata for all 10 domains
`api_keys`	KMS-encrypted per-org credentials; decrypted at agent invocation
`approval_queue`	Pending human decisions that block agent advancement
`daily_briefs`	Serialized morning summaries; reused as context for the day's chat sessions
`chat_sessions`	Turn-by-turn conversation history for the GTM Orchestrator
`campaign_runs`	Inngest run state; checkpoints for circuit-breaker-interrupted campaigns

3.4 GCS as Binary Context Store

Large artifacts that exceed Postgres row limits or require signed-URL delivery are stored in gs://colony-assets (CMEK-encrypted, infrastructure/gcp/storage.tf). These include:

Deployment Kit documents (pitch deck PDFs, rollout calendar exports)
Playwright scraper screenshots and session artifacts (runners/.artifacts/)
Recording audio/video files prior to Gemini transcription

Agents receive signed URLs to these artifacts rather than raw content, keeping prompt payloads lean.

4. Agent Context Protocols

4.1 Agent Inventory and Context Scope

Each specialist agent receives a context bundle assembled by the orchestrator before invocation. The bundle follows a consistent structure:

interface AgentContextBundle {
  org_id: string;                       // Clerk org ID; always present
  run_id: string;                       // Inngest run ID, for trace correlation
  agent_role: AgentRoleEnum;            // Typed role constant
  system_prompt: string;                // Role-specific system instructions
  knowledge_core_exemplars: unknown[];  // Top-k semantic retrievals
  org_context: OrgContext;              // ICP def, sequence library, deal stage
  task_payload: Record<string, any>;    // Agent-specific inputs
  tool_definitions: ToolDefinition[];   // Available tools for this invocation
}

Agent ID	Agent Name	Context Inputs	Outputs Written To
AGT-001	GTM Orchestrator	Chat history, daily brief, org context, full tool set	`chat_sessions`, approval queue
AGT-002	Campaign Orchestrator	ICP criteria, sequence config, candidate queue state	`campaign_runs`, `candidates`
AGT-003	Prospect Agent	ICP definition, data source configs, anti-ICP list	`candidates`
AGT-004	Qualification Agent	Candidate profile, enrichment data, ICP scoring rubric	`candidates.icp_score`, `candidates.status`
AGT-005	Message Generator	KC-08 exemplars, contact profile, angle assignment, sequence tier	`messages`, `approval_queue`
AGT-006	Recording Intelligence	Gemini transcript, KC-09 exemplars, deal context	`recordings.signals`, `deals`, `knowledge_core` (KC-09)
AGT-007	Post-Call Agent	Recording signals, deal stage, KC-04 (objections), KC-09	`deals`, `contacts`, `approval_queue`
AGT-008	Content Agent	KC-01/KC-03/KC-05 exemplars, pillar assignment, attribution data	`content_pieces`
AGT-009	Onboarding Agent	Deal record (Closed-Won), KC-10, KC-05 (case studies), org context	`deployment_kits`, GCS
AGT-010	Analytics Agent	All pipeline tables (read-only), prior daily briefs	`daily_briefs`, email (Resend)

4.2 Orchestrator Context Protocol

The GTM Orchestrator (AGT-001) is the primary entry point at /overview. It manages multi-turn context using a rolling window over chat_sessions:

Turn ingestion — Each user message appends to chat_sessions with role: user.
Context hydration — Before each LLM call, the orchestrator loads: (a) the most recent N turns from chat_sessions; (b) today's daily_brief as a system-level summary; (c) active approval_queue items as injected tool-state context; (d) top-3 KC retrievals relevant to the user's intent.
Tool dispatch — When a tool call is selected, the orchestrator serializes the tool invocation to campaign_runs via Inngest, then streams the tool output back into the chat context window as an assistant message.
Approval gates — When a downstream agent produces an item requiring human approval, the orchestrator pauses the turn, writes to approval_queue, and presents the pending item to the user inline in the stream.
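The context hydration step can be sketched as a pure function over data that has already been loaded. This is a minimal sketch, not Colony's implementation: `ChatTurn`, `HydratedContext`, and `hydrateOrchestratorContext` are hypothetical names, and the database reads and KC ranking are assumed to have happened upstream.

```typescript
interface ChatTurn {
  role: "user" | "assistant" | "summary";
  content: string;
}

interface HydratedContext {
  recentTurns: ChatTurn[];    // (a) most recent N turns
  dailyBrief: string | null;  // (b) today's brief, if one exists
  pendingApprovals: string[]; // (c) active approval_queue item IDs
  kcExemplars: string[];      // (d) top-3 KC retrievals
}

// Assemble the four hydration inputs into one object ahead of the LLM call.
function hydrateOrchestratorContext(
  turns: ChatTurn[],
  dailyBrief: string | null,
  pendingApprovals: string[],
  rankedExemplars: string[],
  windowSize: number,
): HydratedContext {
  return {
    // slice(-0) would return every turn, so guard the empty-window case.
    recentTurns: windowSize > 0 ? turns.slice(-windowSize) : [],
    dailyBrief,
    pendingApprovals,
    kcExemplars: rankedExemplars.slice(0, 3),
  };
}

const sessionTurns: ChatTurn[] = [
  { role: "user", content: "Show pipeline" },
  { role: "assistant", content: "3 deals in Proposal" },
  { role: "user", content: "Draft follow-ups" },
];
const hydrated = hydrateOrchestratorContext(
  sessionTurns,
  "Pipeline: 3 open",
  ["msg-17"],
  ["ex1", "ex2", "ex3", "ex4"],
  2,
);
```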

4.3 Inngest as Context Checkpoint Engine

Inngest (docs/phase1/action_plans/04_inngest_agent_runtime.md) provides durable execution that functions as an implicit context persistence layer:

Each Inngest step function receives its inputs as serialized JSON, which constitutes the agent's full context for that step.
Step outputs are automatically persisted, enabling agents to resume after circuit-breaker pauses (CONST-005) without recomputing prior steps.
The campaign_runs table records which Inngest run ID corresponds to which campaign state, allowing the GTM Orchestrator to re-attach context after an interruption.
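The resume behavior can be illustrated with a small step-memoization sketch. It deliberately does not use the Inngest SDK: `CheckpointStore`, `runStep`, and `campaign` are hypothetical stand-ins, synchronous for brevity, that only show why persisted step outputs let a paused run continue without recomputation.

```typescript
// Hypothetical checkpoint store: step ID -> serialized JSON output.
type CheckpointStore = Map<string, string>;

// Run a step once; on replay, return the persisted output instead of recomputing.
function runStep<T>(store: CheckpointStore, stepId: string, compute: () => T): T {
  const saved = store.get(stepId);
  if (saved !== undefined) {
    return JSON.parse(saved) as T;
  }
  const result = compute();
  store.set(stepId, JSON.stringify(result)); // outputs must be serializable (CONST-005)
  return result;
}

let prospectCalls = 0;

// A two-step campaign that a circuit breaker may interrupt between steps.
function campaign(store: CheckpointStore, breakerOpen: boolean): string[] | "paused" {
  const shortlist = runStep(store, "prospect", () => {
    prospectCalls += 1;
    return ["cand-1", "cand-2"];
  });
  if (breakerOpen) return "paused"; // halted before message generation
  return runStep(store, "generate", () => shortlist.map((id) => `draft:${id}`));
}

const checkpointStore: CheckpointStore = new Map();
const firstAttempt = campaign(checkpointStore, true); // interrupted after "prospect"
const resumedRun = campaign(checkpointStore, false);  // replays "prospect" from the store
```

After both calls the prospecting step has executed only once, which is the property the durable runtime provides.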

4.4 Cross-Agent Context Handoff Pattern

When the Campaign Orchestrator (AGT-002) delegates to a specialist agent, it packages a handoff envelope:

interface HandoffEnvelope {
  parent_run_id: string;      // AGT-002's Inngest run ID
  downstream_agent: string;   // e.g. "message-generator"
  entity_ids: string[];       // IDs of contacts/deals/recordings in scope
  angle: OutboundAngle;       // Signal | Pain | Referral | PatternBreak | Insight
  tier: SequenceTier;         // T1 | T2 | T3 | T4 | T5
  kc_bucket_hints: string[];  // Which KC domains to prioritize in retrieval
}

The receiving agent uses entity_ids to hydrate structured context from Postgres and kc_bucket_hints to focus its Knowledge Core retrieval, avoiding full-corpus searches that dilute context density.
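A receiving agent's use of the envelope might look like the sketch below. The helper names `hydrateEntities` and `scopeKnowledgeCore` are hypothetical, the tables are in-memory arrays rather than Postgres queries, and falling back to the full corpus when no hints are given is an assumption of this sketch rather than documented behavior.

```typescript
type OutboundAngle = "Signal" | "Pain" | "Referral" | "PatternBreak" | "Insight";
type SequenceTier = "T1" | "T2" | "T3" | "T4" | "T5";

interface HandoffEnvelope {
  parent_run_id: string;
  downstream_agent: string;
  entity_ids: string[];
  angle: OutboundAngle;
  tier: SequenceTier;
  kc_bucket_hints: string[];
}

interface KcRow {
  domain: string;
  content: string;
}

// Load only the rows whose IDs the envelope puts in scope.
function hydrateEntities<T extends { id: string }>(table: T[], envelope: HandoffEnvelope): T[] {
  const wanted = new Set(envelope.entity_ids);
  return table.filter((row) => wanted.has(row.id));
}

// Restrict retrieval candidates to the hinted KC domains.
// Assumption: an empty hint list means "search everything".
function scopeKnowledgeCore(rows: KcRow[], envelope: HandoffEnvelope): KcRow[] {
  if (envelope.kc_bucket_hints.length === 0) return rows;
  const hinted = new Set(envelope.kc_bucket_hints);
  return rows.filter((row) => hinted.has(row.domain));
}

const envelope: HandoffEnvelope = {
  parent_run_id: "run-1",
  downstream_agent: "message-generator",
  entity_ids: ["c-2"],
  angle: "Pain",
  tier: "T2",
  kc_bucket_hints: ["KC-08", "KC-04"],
};
const contactsTable = [
  { id: "c-1", title: "CFO" },
  { id: "c-2", title: "COO" },
];
const kcRows: KcRow[] = [
  { domain: "KC-08", content: "pain-led exemplar" },
  { domain: "KC-10", content: "onboarding playbook" },
  { domain: "KC-04", content: "budget objection" },
];
const scopedContacts = hydrateEntities(contactsTable, envelope);
const scopedKc = scopeKnowledgeCore(kcRows, envelope);
```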

5. Prompt Engineering Patterns

5.1 System Prompt Architecture

Every agent system prompt follows a four-section template:

[ROLE DECLARATION]
You are the {agent_name} for ZoomProp's GTM stack. Your sole responsibility 
is {primary_task}.

[DOMAIN GROUNDING]
ZoomProp sells {product_description} to healthcare organizations matching 
the following ICP: {icp_definition}. The following are ANTI-ICP conditions 
that disqualify a prospect regardless of other signals: {anti_icp_list}.

[BEHAVIORAL CONSTRAINTS]
- Never generate content for a contact whose org_id differs from {org_id}.
- Do not fabricate company facts; use only data present in the context below.
- If a required field is absent, emit a structured "missing_data" tool call 
  rather than hallucinating a value.

[CONTEXT BLOCK]
{dynamic_context: knowledge_core_exemplars + entity_data + task_payload}

This template enforces: (1) role clarity, (2) domain grounding via ICP/Anti-ICP constraints, (3) behavioral guardrails against hallucination, and (4) a clearly delimited dynamic context section that changes per invocation.
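Filling the template can be sketched as a small renderer that reports unfilled slots instead of guessing values, in the spirit of the "missing_data" constraint. `renderTemplate` is a hypothetical helper, and the placeholder syntax is assumed to be simple `{word}` tokens as shown in the template.

```typescript
// Fill {placeholder} slots; collect any placeholder that has no supplied value.
function renderTemplate(
  template: string,
  values: Record<string, string>,
): { text: string; missing: string[] } {
  const missing: string[] = [];
  const text = template.replace(/\{(\w+)\}/g, (match: string, key: string) => {
    // Own-property check so keys like "constructor" are not resolved from the prototype.
    if (!Object.prototype.hasOwnProperty.call(values, key)) {
      missing.push(key);
      return match; // leave the slot visible rather than inventing content
    }
    return values[key] ?? match;
  });
  return { text, missing };
}

const roleSection =
  "You are the {agent_name} for ZoomProp's GTM stack. Your sole responsibility is {primary_task}.";
const rendered = renderTemplate(roleSection, { agent_name: "Message Generator" });
```

A caller could refuse to invoke the model whenever `missing` is non-empty, so an incomplete prompt never reaches the LLM.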

5.2 Outbound Message Generation Pattern (AGT-005)

The Message Generator uses a constraint-first, exemplar-guided pattern optimized for the 5 outbound angles:

SYSTEM:
You are the Message Generator for ZoomProp outbound sequences.
Angle: {angle}  |  Tier: {tier}  |  Sequence step: {step_number}

EXEMPLARS (KC-08, top-3 for this angle):
---
{exemplar_1}
---
{exemplar_2}
---
{exemplar_3}
---

CONTACT PROFILE:
Name: {contact.name} | Title: {contact.title} | Org: {contact.org_name}
Signal context: {contact.enrichment.signals}
ICP score: {contact.icp_score}

CONSTRAINTS:
- Maximum 3 sentences for T1/T2. Up to 5 sentences for T3-T5.
- No subject line longer than 8 words.
- Do not mention competitors by name.
- Reference exactly one signal from the contact profile.

OUTPUT FORMAT:
{ "subject": "...", "body": "...", "angle_used": "...", "signal_referenced": "..." }

The structured JSON output format is critical: it routes the message to approval_queue with machine-readable metadata (angle, signal referenced) that the orchestrator uses to present approval UI and log analytics.
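Before a generated message is routed to approval_queue, its shape and the countable constraints can be checked mechanically. The sketch below is an assumption about how such a check could look: `validateMessage` and `countSentences` are hypothetical, and sentence counting by terminal punctuation is a rough heuristic that will miscount abbreviations.

```typescript
interface GeneratedMessage {
  subject: string;
  body: string;
  angle_used: string;
  signal_referenced: string;
}

// Naive sentence count: split on runs of ., ! or ? and ignore empty pieces.
function countSentences(text: string): number {
  return text.split(/[.!?]+/).filter((s) => s.trim().length > 0).length;
}

// Parse the model output and check the constraints that can be verified by counting.
function validateMessage(
  raw: string,
  tier: string,
): { message: GeneratedMessage | null; errors: string[] } {
  let parsed: unknown;
  try {
    parsed = JSON.parse(raw);
  } catch {
    return { message: null, errors: ["output is not valid JSON"] };
  }
  if (typeof parsed !== "object" || parsed === null) {
    return { message: null, errors: ["output is not a JSON object"] };
  }
  const obj = parsed as Record<string, unknown>;
  const errors: string[] = [];
  for (const field of ["subject", "body", "angle_used", "signal_referenced"]) {
    if (typeof obj[field] !== "string") errors.push(`missing or non-string field: ${field}`);
  }
  if (errors.length > 0) return { message: null, errors };

  const message = obj as unknown as GeneratedMessage;
  const maxSentences = tier === "T1" || tier === "T2" ? 3 : 5;
  if (message.subject.trim().split(/\s+/).length > 8) {
    errors.push("subject exceeds 8 words");
  }
  if (countSentences(message.body) > maxSentences) {
    errors.push(`body exceeds ${maxSentences} sentences`);
  }
  return { message: errors.length === 0 ? message : null, errors };
}

const okResult = validateMessage(
  '{"subject":"Quick question on intake delays","body":"Saw your new clinic opening. Worth a chat?","angle_used":"Signal","signal_referenced":"new clinic"}',
  "T1",
);
```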

5.3 Recording Intelligence Extraction Pattern (AGT-006)

The Recording Intelligence agent processes Gemini Meet Notes transcripts. Because transcripts can be long, it uses a two-pass extraction pattern:

Pass 1 — Signal Detection (focused, short context):

Given the following meeting transcript segment, identify any of these 
signal types: [budget_signals, timeline_signals, stakeholder_signals, 
objection_signals, competitor_mentions, product_gaps]. 
Output as a structured JSON array with type, quote, and confidence score.

TRANSCRIPT SEGMENT:
{chunk}

Pass 2 — Summary and CRM Field Mapping (full signal set):

Given the following extracted signals from a sales call, generate:
1. A 3-sentence executive summary suitable for the deal record.
2. Recommended CRM field updates (deal stage, next action, close date).
3. Any Knowledge Core entries to update (specify KC domain and content).

SIGNALS:
{aggregated_signals_from_pass_1}

DEAL CONTEXT:
{deal.current_stage} | {deal.value} | {deal.next_action}

This two-pass approach prevents transcript length from overwhelming the context window while ensuring fine-grained signal detection.
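The control flow around the two prompts can be sketched as chunking followed by aggregation. In this hedged example the LLM call is replaced by a `detect` callback, and `chunkTranscript`, `aggregateSignals`, and the character-based chunk size are all assumptions for illustration.

```typescript
interface Signal {
  type: string;
  quote: string;
  confidence: number;
}

// Split a transcript on line boundaries into chunks of roughly maxChars.
// A single line longer than maxChars becomes its own oversized chunk.
function chunkTranscript(transcript: string, maxChars: number): string[] {
  const chunks: string[] = [];
  let current = "";
  for (const line of transcript.split("\n")) {
    if (current.length > 0 && current.length + line.length + 1 > maxChars) {
      chunks.push(current);
      current = "";
    }
    current = current.length > 0 ? `${current}\n${line}` : line;
  }
  if (current.length > 0) chunks.push(current);
  return chunks;
}

// Pass 1 runs once per chunk; results are merged and deduplicated for pass 2.
function aggregateSignals(
  transcript: string,
  maxChars: number,
  detect: (chunk: string) => Signal[],
): Signal[] {
  const seen = new Set<string>();
  const merged: Signal[] = [];
  for (const chunk of chunkTranscript(transcript, maxChars)) {
    for (const signal of detect(chunk)) {
      const key = `${signal.type}|${signal.quote}`;
      if (!seen.has(key)) {
        seen.add(key);
        merged.push(signal);
      }
    }
  }
  return merged;
}

// Stub detector standing in for the pass-1 LLM call.
function stubDetect(chunk: string): Signal[] {
  const found: Signal[] = [];
  if (chunk.indexOf("budget") !== -1) found.push({ type: "budget_signals", quote: chunk, confidence: 0.9 });
  if (chunk.indexOf("timeline") !== -1) found.push({ type: "timeline_signals", quote: chunk, confidence: 0.8 });
  return found;
}

const callTranscript = "A: budget is 50k\nB: ok\nA: timeline is Q3";
const callChunks = chunkTranscript(callTranscript, 20);
const callSignals = aggregateSignals(callTranscript, 20, stubDetect);
```

Only `callSignals`, not the transcript, would be passed to the pass-2 prompt.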

5.4 ICP Qualification Scoring Pattern (AGT-004)

The Qualification Agent uses a rubric-anchored scoring pattern to produce consistent, auditable ICP scores:

Score this prospect against ZoomProp's ICP. Return a JSON object with:
- total_score: integer 0-100
- dimension_scores: { size: int, vertical: int, tech_stack: int, 
                      signals: int, timing: int }
- disqualifiers: string[] (any Anti-ICP matches — auto-score 0)
- recommendation: "qualify" | "hold" | "disqualify"
- reasoning: string (max 2 sentences)

ICP RUBRIC:
{icp_rubric_from_kc_02}

ANTI-ICP CONDITIONS:
{anti_icp_list}

PROSPECT DATA:
{candidate.enrichment_payload}

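The deterministic pre-filter (CONST-002, §7.1) runs before this prompt, so ineligible prospects are normally removed before the model is called. Any Anti-ICP match that still surfaces as a disqualifier forces total_score to 0, so no further LLM tokens are spent on that prospect beyond the classification step.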
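Because a model may return a non-zero score alongside a disqualifier, the short-circuit is safest when enforced in code after parsing. The following is a sketch under that assumption; `applyDisqualifiers` is a hypothetical post-processing helper, and clamping to the 0-100 range is an added safeguard not described in the source.

```typescript
interface IcpScore {
  total_score: number;
  dimension_scores: Record<string, number>;
  disqualifiers: string[];
  recommendation: "qualify" | "hold" | "disqualify";
  reasoning: string;
}

// Force the score to 0 whenever any Anti-ICP disqualifier is present;
// otherwise keep the model's score, clamped to the 0-100 integer range.
function applyDisqualifiers(score: IcpScore): IcpScore {
  if (score.disqualifiers.length > 0) {
    return { ...score, total_score: 0, recommendation: "disqualify" };
  }
  const clamped = Math.max(0, Math.min(100, Math.round(score.total_score)));
  return { ...score, total_score: clamped };
}

const modelOutput: IcpScore = {
  total_score: 72,
  dimension_scores: { size: 20, vertical: 20, tech_stack: 12, signals: 10, timing: 10 },
  disqualifiers: ["hypothetical anti-ICP match"],
  recommendation: "qualify",
  reasoning: "Strong size and vertical fit.",
};
const enforcedScore = applyDisqualifiers(modelOutput);
```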

5.5 Daily Brief Synthesis Pattern (AGT-010)

The Analytics Agent generates the morning brief using an aggregation-then-narrative pattern:

SYSTEM:
You are the Analytics Agent. Synthesize a daily GTM brief for the 
ZoomProp founders. Be direct and data-driven. No filler sentences.

DATA SNAPSHOT (generated by SQL aggregations, not LLM):
Pipeline today: {deals_by_stage_json}
Queue: {approval_queue_count} items pending
Yesterday's output: {messages_sent}, {content_pieces_published}
Alerts: {circuit_breaker_status}, {bounce_rate}, {reply_rate}

YESTERDAY'S BRIEF (for continuity):
{prior_brief.summary}

OUTPUT:
Generate a brief with sections: Pipeline, Alerts, Queue, Yesterday, 
Today's Priorities. Max 250 words total.

The prior brief is injected for continuity — the Analytics Agent can reference trend direction ("bounce rate is down for the third consecutive day") without requiring access to a full historical database query in the prompt.

6. Memory Management

6.1 Cross-Session Persistence Model

Colony uses a tiered persistence model that matches information longevity to storage medium:

Tier	Lifetime	Storage	Examples
Ephemeral	Single request	In-process memory / SSE buffer	Tool call intermediate state, streaming chunks
Session	Active conversation	`chat_sessions` table	Turn history, pending tool calls
Campaign	Single campaign run	`campaign_runs` + Inngest state	Step outputs, approval decisions, candidate shortlists
Organizational	Account lifetime	Postgres relational tables	ICP definitions, sequence libraries, deal history
Semantic	Account lifetime + re-embedded on update	`knowledge_core` (pgvector)	Exemplars, case studies, playbooks

6.2 Context Window Management

Colony's agents are invoked within bounded context windows. The orchestrator enforces a budget allocation policy per agent type:

Budget Slot	Allocation	Content
System instructions	~800 tokens	Role declaration, domain grounding, behavioral constraints
Knowledge Core exemplars	~1,200 tokens	Top-3 to top-5 semantic retrievals (truncated by character limit)
Structured entity context	~1,500 tokens	Deal/contact/recording data serialized as compact JSON
Conversation history	~2,000 tokens (rolling)	Most recent N turns; older turns summarized (see §6.3)
Task payload	~500 tokens	Specific instructions for this invocation
Reserve (output)	Remaining	Completion tokens

These allocations are enforced by the context assembly layer before the LLM call. If entity data exceeds its budget, non-essential fields (e.g., raw enrichment HTML, full transcript text) are truncated in favor of structured signal fields.
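A minimal sketch of budget enforcement for the entity slot follows. It assumes a rough estimate of four characters per token, which is only an illustration device; a real assembler would use the target model's tokenizer. `estimateTokens`, `fitToBudget`, and `fitEntity` are hypothetical names.

```typescript
// Rough token estimate (assumption: ~4 characters per token).
function estimateTokens(text: string): number {
  return Math.ceil(text.length / 4);
}

// Slot budgets taken from the allocation table above.
const SLOT_BUDGETS = { system: 800, kc: 1200, entity: 1500, history: 2000, task: 500 };
type BudgetSlot = keyof typeof SLOT_BUDGETS;

// Hard-truncate text to a slot's budget. The result may be cut mid-value,
// so this is a last resort after field dropping.
function fitToBudget(text: string, slot: BudgetSlot): string {
  const maxChars = SLOT_BUDGETS[slot] * 4;
  return text.length <= maxChars ? text : text.slice(0, maxChars);
}

// Drop non-essential fields first; truncate only if still over budget.
function fitEntity(entity: Record<string, unknown>, nonEssential: string[]): string {
  const full = JSON.stringify(entity);
  if (estimateTokens(full) <= SLOT_BUDGETS.entity) return full;
  const trimmed: Record<string, unknown> = {};
  for (const key of Object.keys(entity)) {
    if (nonEssential.indexOf(key) === -1) trimmed[key] = entity[key];
  }
  return fitToBudget(JSON.stringify(trimmed), "entity");
}

const bulkyContact = {
  id: "c-1",
  signals: ["new clinic opening"],
  raw_enrichment_html: "<p>".repeat(3000), // 9,000 characters of low-signal markup
};
const entityBlock = fitEntity(bulkyContact, ["raw_enrichment_html"]);
```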

6.3 Conversation Summarization Strategy

When chat_sessions exceeds the rolling window token budget, the GTM Orchestrator runs a structured summarization pass that compresses the oldest turns:

Summarize the following conversation turns into a compact context block 
that preserves: (1) decisions made, (2) entities referenced by ID, 
(3) pending items not yet resolved, (4) user preferences expressed.
Discard pleasantries and redundant elaborations.
Max 300 tokens.

TURNS TO COMPRESS:
{old_turns}

The compressed summary is stored back to chat_sessions as a special role: summary entry, replacing the original turns. This ensures the GTM Orchestrator always has a coherent conversation history without unbounded growth.
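The replacement of old turns by a single summary entry can be sketched as follows. The summarizer is passed in as a callback standing in for the LLM call, and `compactSession`, `roughTokenCount`, and the `keepRecent` parameter are assumptions of this sketch.

```typescript
interface SessionEntry {
  role: "user" | "assistant" | "summary";
  content: string;
}

// Rough token estimate (assumption: ~4 characters per token).
function roughTokenCount(text: string): number {
  return Math.ceil(text.length / 4);
}

// If the session is over budget, replace everything except the most recent
// keepRecent entries with one summary entry.
function compactSession(
  entries: SessionEntry[],
  budgetTokens: number,
  keepRecent: number,
  summarize: (old: SessionEntry[]) => string,
): SessionEntry[] {
  const total = entries.reduce((sum, e) => sum + roughTokenCount(e.content), 0);
  if (total <= budgetTokens || entries.length <= keepRecent) return entries;
  const cut = entries.length - keepRecent;
  const oldTurns = entries.slice(0, cut);
  const recentTurns = entries.slice(cut);
  return [{ role: "summary", content: summarize(oldTurns) }, ...recentTurns];
}

const longSession: SessionEntry[] = [
  { role: "user", content: "a".repeat(40) },
  { role: "assistant", content: "b".repeat(40) },
  { role: "user", content: "c".repeat(40) },
  { role: "assistant", content: "d".repeat(40) },
];
const compacted = compactSession(longSession, 25, 2, (old) => `summary of ${old.length} turns`);
```

Because an earlier summary entry is itself part of `entries`, a later compaction folds it into the next summary along with the turns that followed it.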

6.4 Knowledge Core Update Lifecycle

When a Recording Intelligence or Post-Call Agent surfaces new domain knowledge (e.g., a novel objection or a case study data point), the update lifecycle is:

Agent emits a kc_update_candidate with domain, content, and confidence.
Update enters the approval_queue for human review (matching the human-in-the-loop pattern for all LLM-generated content).
On approval, the content is embedded via the re-embedding pipeline and upserted into knowledge_core with the appropriate domain bucket and org scope.
Subsequent agent invocations for that org retrieve from the updated exemplar pool.

Notion sync (docs/phase2/action_plans/15_notion_kc_sync.md) provides an alternative ingest path for human-authored knowledge updates, triggering the same re-embedding step via webhook.

7. Context Density Strategies

Context density is the ratio of decision-relevant signal to total tokens in an LLM prompt. Colony applies seven strategies to maximize this ratio:

7.1 Anti-ICP Hard Pre-Filter

Documented in docs/phase2/action_plans/16_anti_icp_hard_prefilter.md. Before any LLM is invoked for qualification or message generation, a deterministic SQL/rule-based filter eliminates prospects matching Anti-ICP conditions. This means zero LLM tokens are spent on ineligible contacts — the highest-ROI context density optimization in the stack.
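A deterministic pre-filter of this kind can be sketched as a list of predicate rules applied before any model call. The two rules below are invented purely for illustration; the actual Anti-ICP conditions live in the referenced action plan, and `CandidateRow`, `AntiIcpRule`, and `preFilter` are hypothetical names.

```typescript
interface CandidateRow {
  id: string;
  industry: string;
  employee_count: number;
}

interface AntiIcpRule {
  label: string;
  matches: (candidate: CandidateRow) => boolean;
}

// Hypothetical example rules; not Colony's real exclusion list.
const exampleAntiIcpRules: AntiIcpRule[] = [
  { label: "non-healthcare vertical", matches: (c) => c.industry !== "healthcare" },
  { label: "below minimum size", matches: (c) => c.employee_count < 50 },
];

// Partition candidates with plain predicates; no LLM is involved.
function preFilter(
  candidates: CandidateRow[],
  rules: AntiIcpRule[],
): { eligible: CandidateRow[]; rejected: { id: string; reason: string }[] } {
  const eligible: CandidateRow[] = [];
  const rejected: { id: string; reason: string }[] = [];
  for (const candidate of candidates) {
    const hit = rules.find((rule) => rule.matches(candidate));
    if (hit) {
      rejected.push({ id: candidate.id, reason: hit.label });
    } else {
      eligible.push(candidate);
    }
  }
  return { eligible, rejected };
}

const filtered = preFilter(
  [
    { id: "a", industry: "healthcare", employee_count: 200 },
    { id: "b", industry: "retail", employee_count: 500 },
    { id: "c", industry: "healthcare", employee_count: 10 },
  ],
  exampleAntiIcpRules,
);
```

Recording the matched rule label with each rejection keeps the filter auditable without spending tokens on an explanation.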

7.2 KC Bucket Hints

Agents receive kc_bucket_hints in their handoff envelope (§4.4), directing semantic search to specific Knowledge Core domains rather than full-corpus search. A message generator working on a pain-led angle queries KC-08 (outbound exemplars) and KC-04 (objections) only — not KC-10 (onboarding playbooks), which would dilute retrieval quality.

7.3 Compact JSON Entity Serialization

Structured entity data (contacts, deals, recordings) is serialized into a compact, field-selected JSON format rather than full ORM output. Non-LLM-relevant fields (internal IDs beyond org_id/entity_id, timestamps, audit columns) are stripped before context assembly.
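Field selection before serialization can be as small as the sketch below. `pickFields` and the sample deal row are hypothetical; dropping null values is an extra assumption of this sketch.

```typescript
// Keep only the listed fields, skipping null and undefined values.
function pickFields(row: Record<string, unknown>, keep: string[]): Record<string, unknown> {
  const out: Record<string, unknown> = {};
  for (const key of keep) {
    const value = row[key];
    if (value !== undefined && value !== null) out[key] = value;
  }
  return out;
}

// A full row as an ORM might return it, including audit columns.
const dealRow = {
  id: "deal-1",
  stage: "Proposal",
  value: 42000,
  created_at: "2026-04-01T09:00:00Z",
  updated_by: "user-7",
  internal_sync_cursor: 991,
};

// JSON.stringify without an indent argument emits no extra whitespace.
const compactDeal = JSON.stringify(pickFields(dealRow, ["id", "stage", "value"]));
// compactDeal is '{"id":"deal-1","stage":"Proposal","value":42000}'
```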

7.4 Two-Pass Extraction for Long Inputs

For recordings and long documents, the two-pass pattern (§5.3) ensures no single LLM call receives a full transcript. Chunked signal extraction followed by signal aggregation keeps each pass focused and within a dense context window.

7.5 Structured Output Contracts

All agents output structured JSON rather than free prose (except the Daily Brief narrative and post-call summaries). Structured outputs allow downstream agents to consume specific fields without re-parsing natural language, preventing error accumulation across the agent chain.

7.6 Daily Brief as Session Priming Document

Each morning's daily_brief is injected into the GTM Orchestrator's system context at session start. This means the orchestrator begins each conversation with a compressed, high-signal snapshot of organizational state — pipeline, alerts, queue, recent output — without the user needing to re-establish context.

7.7 Exemplar-Guided Generation

Knowledge Core exemplars provide the LLM with concrete, org-specific examples of high-quality outputs rather than abstract style instructions. Showing three approved outbound messages in the target angle is more token-efficient and more reliable than describing the desired style in natural language instructions.

8. Observability and Context Tracing

8.1 Langfuse Integration

Every LLM call emits a trace to Langfuse (platform key stored in GCP Secret Manager, infrastructure/gcp/secrets.tf). Each trace includes mandatory metadata:

{
  "org_id": "{clerk_org_id}",
  "agent_id": "{AGT-XXX}",
  "run_id": "{inngest_run_id}",
  "kc_domains_queried": ["KC-08", "KC-04"],
  "tokens_in": 5420,
  "tokens_out": 187,
  "model": "gemini-pro / gpt-4o",
  "context_budget_used": { "system": 794, "kc": 1143, "entity": 1285, "history": 1987, "task": 211 }
}

This trace structure enables retroactive analysis of context budget utilization, retrieval quality (which KC exemplars were selected vs. how the output quality rated), and per-agent token economics.
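A pre-flight check on trace metadata could look like the sketch below. It does not call the Langfuse SDK; `missingTraceFields` and `budgetTotal` are hypothetical helpers that only validate the mandatory fields from CONST-004 and sum the per-slot budget figures.

```typescript
interface TraceMetadata {
  org_id: string;
  agent_id: string;
  run_id: string;
  context_budget_used: Record<string, number>;
}

// Fields that CONST-004 requires on every trace.
const REQUIRED_TRACE_FIELDS = ["org_id", "agent_id", "run_id"] as const;

// Return the mandatory fields that are absent or empty.
function missingTraceFields(meta: Partial<TraceMetadata>): string[] {
  return REQUIRED_TRACE_FIELDS.filter((field) => {
    const value = meta[field];
    return typeof value !== "string" || value.length === 0;
  });
}

// Sum the per-slot token usage recorded in context_budget_used.
function budgetTotal(used: Record<string, number>): number {
  return Object.keys(used).reduce((sum, slot) => sum + (used[slot] ?? 0), 0);
}

const slotUsage = { system: 794, kc: 1143, entity: 1285, history: 1987, task: 211 };
const promptTokens = budgetTotal(slotUsage);
const traceGaps = missingTraceFields({ org_id: "org_1", run_id: "" });
```

Comparing `promptTokens` with the provider-reported input token count is one way to detect drift between the context assembler's estimate and actual usage.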

8.2 Sentry for Agent Runtime Errors

Agent-level errors (context assembly failures, tool call parse errors, circuit breaker triggers) are reported to Sentry with org_id and run_id as tags. This allows debugging of context-related failures without exposing org-specific prompt content in error logs.

9. Knowledge Graph Structure

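Colony's project documentation and IDs form a queryable knowledge graph through consistent cross-referencing. The graph has three node types and two core edge types; the cross-reference map in §9.4 also uses descriptive relation labels such as "reads from" and "writes to":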

9.1 Node Types

Node Type	Examples	Storage Location
Action Plan	`docs/phase1/action_plans/11_prospect_agent.md`, `docs/phase2/action_plans/16_anti_icp_hard_prefilter.md`	Git repository
Test Specification	`docs/phase2/testing/08_e2e_matching_algorithm.spec.ts`	Git repository
Runtime Artifact	`runners/.artifacts/pipedrive-batch/*/summary.json`	Local filesystem / GCS

9.2 Edge Types

Edge	Description
`implements`	Test spec implements the assertions defined in an action plan
`produces`	Agent runtime produces an artifact stored in `runners/.artifacts/` or GCS

9.3 Document ID System

Colony uses a hierarchical ID system derived from the action plan numbering:

Prefix	Scope	Example
`BR-`	Business requirement	BR-001: ICP-scored prospect discovery
`AGT-`	Agent definition	AGT-005: Message Generator
`KC-`	Knowledge Core domain	KC-08: Outbound exemplars
`ENT-`	Data entity	ENT-003: Deal
`CONST-`	System constraint	CONST-002: Anti-ICP pre-filter
`CTX-`	Context Engine document	CTX-001: This document
`TECH-`	Technical specification	TECH-001: pgvector 1536-dim index

9.4 Cross-Reference Map (Selected)

AGT-005 (Message Generator)
  ├── implements → docs/phase1/action_plans/13_message_generator_agent.md
  ├── tested by → docs/phase1/testing/04_e2e_outbound_flow.spec.md
  ├── reads from → KC-08 (ENT-007, domain "outbound exemplars")
  ├── reads from → ENT-002 (contact profile)
  ├── writes to  → ENT-011 (approval_queue)
  └── constrained by → CONST-001, CONST-002

AGT-006 (Recording Intelligence)
  ├── implements → docs/phase1/action_plans/14_recording_intelligence_agent.md
  ├── tested by → docs/phase1/testing/06_e2e_recording_intelligence.spec.md
  ├── reads from → ENT-006 (recordings), KC-09
  ├── writes to  → ENT-006.signals, ENT-003 (deal fields), KC-09
  └── triggered by → Google Drive webhook (Gemini Meet Notes)

ENT-007 (Knowledge Core)
  ├── defined in → docs/phase1/action_plans/24_knowledge_core_editor.md
  ├── tested by → docs/phase1/testing/08_e2e_knowledge_core.spec.md
  ├── synced from → Notion (docs/phase2/action_plans/15_notion_kc_sync.md)
  ├── indexed as → pgvector 1536-dim (TECH-001)
  └── queried by → AGT-004, AGT-005, AGT-006, AGT-007, AGT-008, AGT-009, AGT-010

10. Context Engineering Lifecycle Alignment

Colony's context architecture maps to the PRD lifecycle stages as follows:

Lifecycle Stage	Colony Context State	Key Documents
v0.1 Spark	Initial Knowledge Core seeding; ICP definition in KC-02	`docs/phase1/action_plans/24_knowledge_core_editor.md`
v0.2 Blueprint	Prospect + Qualification agents operational; anti-ICP filter live	`docs/phase2/action_plans/16_anti_icp_hard_prefilter.md`, `docs/phase1/action_plans/12_qualification_agent.md`
v0.3 Foundation	Message Generator + outbound sequences; approval queue live	`docs/phase1/action_plans/13_message_generator_agent.md`, `docs/phase1/action_plans/21_approval_queue.md`
v0.4 Core	Recording Intelligence