CTX: Trigger-Driven Dynamic Context Loading
for Code-Aware LLM Agents
Jeawon Jang · be2jay67@gmail.com
Code: github.com/jaytoone/CTX
arXiv: Pending Endorsement (cs.IR)
8 Strategies · 415 Queries · p<0.05
- TES Score: 0.776
- vs BM25 Baseline: 1.9×
- Token Usage: 5.2%
- Hybrid COIR R@5: 0.95
- IMPLICIT Recall@5: 1.00
- Total Queries: 415
Abstract
Large language models suffer from context dilution when processing extensive codebases — the "Lost in the Middle" problem. Standard RAG approaches treat code as flat text, ignoring the structural dependency information in import graphs.
We present CTX, a trigger-driven dynamic context loading system that classifies developer queries into four types — EXPLICIT_SYMBOL, SEMANTIC_CONCEPT, TEMPORAL_HISTORY, IMPLICIT_CONTEXT — and routes each to a specialized retrieval pipeline.
For dependency-sensitive queries, CTX performs breadth-first traversal over the codebase import graph, resolving transitive relationships invisible to keyword and embedding methods. Evaluated on a synthetic benchmark (50 files, 166 queries) and three real Python codebases (968 files total, 249 queries), CTX achieves a TES 1.9× higher than BM25 while using only 5.2% of the tokens. Statistical significance is established via McNemar and Wilcoxon tests (p<0.05) across all 415 queries.
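To make the routing step concrete, the four-way classification can be sketched with regex and keyword patterns. This is a minimal illustration: the patterns, priority order, and names below are hypothetical, not CTX's actual rules.

```python
import re

# Hypothetical trigger patterns (illustrative only, not CTX's real rules).
# Checked in order; anything unmatched falls back to semantic search.
TRIGGER_PATTERNS = {
    # Backticked identifiers, call syntax, or CamelCase names -> symbol index
    "EXPLICIT_SYMBOL": re.compile(r"`[\w.]+`|\b[a-z_]+\(\)|\b[A-Z][a-z]+[A-Z]\w*\b"),
    # References to change history -> git history retrieval
    "TEMPORAL_HISTORY": re.compile(r"\b(recent|last|changed|history|commit)\b", re.I),
    # Dependency-oriented phrasing -> import graph BFS
    "IMPLICIT_CONTEXT": re.compile(r"\b(depends? on|imports?|affected by|uses)\b", re.I),
}

def classify_query(query: str) -> str:
    """Return the first matching trigger type; default to SEMANTIC_CONCEPT."""
    for trigger, pattern in TRIGGER_PATTERNS.items():
        if pattern.search(query):
            return trigger
    return "SEMANTIC_CONCEPT"

print(classify_query("Which modules depend on config_loader?"))  # IMPLICIT_CONTEXT
print(classify_query("How does caching work here?"))             # SEMANTIC_CONCEPT
```

The key design point is that classification is cheap (no model call), so routing overhead is negligible compared to retrieval itself.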
Contributions
- Four-type trigger taxonomy — EXPLICIT_SYMBOL, SEMANTIC_CONCEPT, TEMPORAL_HISTORY, and IMPLICIT_CONTEXT, each mapped to a specialized retrieval strategy, enabling adaptive resource allocation.
- Import graph traversal — a BFS-based algorithm over the codebase import graph that resolves transitive dependencies. Recall@5 = 1.0 on dependency queries vs. 0.4 for BM25, a 150% improvement.
- TES metric — Trade-off Efficiency Score = Recall@K / ln(1 + |retrieved|), a unified measure of the accuracy-efficiency trade-off. Pearson r = 0.87 correlation with NDCG@5 (p < 0.001).
- Hybrid Dense+CTX — a two-stage pipeline combining dense neural seed selection with import graph expansion. COIR Recall@5 = 0.950 (+150% over CTX alone), validating the complementary nature of semantic and structural retrieval.
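The TES definition above can be sanity-checked in a few lines. One assumption here: for the Full Context baseline, |retrieved| is taken to be all 50 files of the synthetic benchmark.

```python
import math

def tes(recall_at_k: float, n_retrieved: float) -> float:
    """Trade-off Efficiency Score: recall discounted by log retrieval volume."""
    return recall_at_k / math.log1p(n_retrieved)

# Full Context on the synthetic benchmark (assumed |retrieved| = 50 files):
# 0.075 / ln(51) ≈ 0.019, matching the reported TES.
print(round(tes(0.075, 50), 3))  # → 0.019
```

The logarithmic denominator rewards small context sets without letting a single-file retrieval dominate the score.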
Architecture
Query Input
      │
      ▼
┌──────────────────────────────┐
│      Trigger Classifier      │  ← regex + keyword patterns
│    (EXPLICIT / SEMANTIC /    │
│     TEMPORAL / IMPLICIT)     │
└──────────────┬───────────────┘
               │
     ┌─────────┴────────┬───────────────┬─────────────────┐
     │                  │               │                 │
     ▼                  ▼               ▼                 ▼
Symbol Index      TF-IDF/Dense     History Log     Import Graph BFS
(AST lookup)      (cosine sim)     (git history)   (transitive deps)
     │                  │               │                 │
     └─────────┬────────┴───────────────┴─────────────────┘
               │
    ┌──────────▼─────────────┐
    │  Adaptive-k Selection  │  ← k = f(query_type, codebase_size)
    │      (3~10 files)      │
    └──────────┬─────────────┘
               │
               ▼
          LLM Context
         (5.2% tokens)
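The Import Graph BFS branch can be sketched as follows. The graph below is illustrative (module names echo the demo codebase mentioned later); CTX builds the real graph from AST-parsed import statements, which is not shown here.

```python
from collections import deque

# Illustrative import graph: each module maps to the modules it imports.
IMPORT_GRAPH = {
    "pipeline": ["retriever", "metrics"],
    "retriever": ["graph_builder"],
    "graph_builder": ["metrics"],
    "evaluator": ["metrics"],
    "metrics": [],
}

def bfs_context(seed: str, graph: dict, max_files: int = 10) -> list:
    """Collect transitive dependencies of `seed` in breadth-first order."""
    visited = {seed}
    queue = deque([seed])
    order = [seed]
    while queue:
        for dep in graph.get(queue.popleft(), []):
            if dep not in visited:
                visited.add(dep)
                queue.append(dep)
                order.append(dep)
                if len(order) >= max_files:  # respect the adaptive-k budget
                    return order
    return order

print(bfs_context("pipeline", IMPORT_GRAPH))
# → ['pipeline', 'retriever', 'metrics', 'graph_builder']
```

Note how `graph_builder` is reached only transitively (pipeline → retriever → graph_builder); this is exactly the relationship that keyword and embedding retrieval cannot see.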
Synthetic Benchmark (50 files, 166 queries)
| Strategy | Recall@5 | Token% | TES |
|---|---|---|---|
| Full Context | 0.075 | 100.0% | 0.019 |
| BM25 | 0.982 | 18.7% | 0.410 |
| Dense TF-IDF | 0.973 | 21.0% | 0.406 |
| LlamaIndex | 0.972 | 20.1% | 0.405 |
| Chroma Dense | 0.829 | 19.3% | 0.346 |
| GraphRAG-lite | 0.523 | 24.0% | 0.218 |
| Hybrid Dense+CTX | 0.725 | 23.6% | 0.303 |
| CTX (Ours) | 0.874 | 5.2% | 0.776 |
COIR External Benchmark (CodeSearchNet Python)
| Strategy | Recall@1 | Recall@5 | MRR |
|---|---|---|---|
| Dense Embedding (MiniLM) | 0.960 | 1.000 | 0.978 |
| Hybrid Dense+CTX | 0.930 | 0.950 | 0.940 |
| BM25 | 0.920 | 0.980 | 0.946 |
| CTX Adaptive | 0.210 | 0.380 | 0.293 |
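The Hybrid Dense+CTX row reflects the two-stage pipeline from the contributions. A minimal sketch, with toy 2-d embeddings and a simplified one-hop graph expansion (all names and vectors hypothetical):

```python
import math

def cosine(a, b):
    """Cosine similarity between two dense vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def hybrid_retrieve(query_vec, file_vecs, import_graph, n_seeds=2, k=5):
    # Stage 1: dense seed selection by embedding similarity.
    seeds = sorted(file_vecs, key=lambda f: cosine(query_vec, file_vecs[f]),
                   reverse=True)[:n_seeds]
    # Stage 2: expand each seed through its import edges (one hop here;
    # the full system uses BFS).
    context = list(seeds)
    for seed in seeds:
        for dep in import_graph.get(seed, []):
            if dep not in context and len(context) < k:
                context.append(dep)
    return context

vecs = {"retriever": [1.0, 0.1], "metrics": [0.1, 1.0], "pipeline": [0.9, 0.3]}
graph = {"pipeline": ["retriever", "metrics"], "retriever": ["graph_builder"]}
print(hybrid_retrieve([1.0, 0.2], vecs, graph))
# → ['retriever', 'pipeline', 'graph_builder', 'metrics']
```

The design intuition: dense retrieval supplies semantically relevant seeds, and graph expansion pulls in structurally coupled files the embeddings miss.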
Per-Trigger-Type Recall@5 (Synthetic)
| Trigger Type | BM25 | TF-IDF | CTX | Delta |
|---|---|---|---|---|
| EXPLICIT_SYMBOL | 0.81 | 0.73 | 0.97 | +19.8% |
| SEMANTIC_CONCEPT | 0.54 | 0.68 | 0.60 | — |
| TEMPORAL_HISTORY | 0.50 | 0.50 | 1.00 | +100% |
| IMPLICIT_CONTEXT | 0.40 | 0.40 | 1.00 | +150% |
Ablation Study
| Variant | Removed | Recall@5 | TES | IMPL_CONTEXT |
|---|---|---|---|---|
| Full CTX | — | 0.874 | 0.776 | 1.000 |
| No Graph | Import graph | 0.821 | 0.635 | 0.400 |
| No Classifier | Trigger type | 0.743 | 0.412 | 0.600 |
| Fixed-k=5 | Adaptive-k | 0.856 | 0.712 | 1.000 |
Key Findings:
- Removing the import graph → IMPLICIT_CONTEXT recall drops 60% (1.0 → 0.4)
- Removing the trigger classifier → TES drops 47% (0.776 → 0.412)
- TES–NDCG@5 Pearson r = 0.87 (p < 0.001, 28 strategy-dataset pairs)
- pass@1 with MiniMax M2.5: CTX 0.265 vs Full Context 0.102 (n=49, McNemar p<0.05)
Try CTX on a Sample Codebase
The demo runs CTX retrieval on a 10-file sample Python codebase (pipeline, retriever, evaluator, graph builder, metrics, etc.). Enter a natural language query and compare retrieval strategies.
Core Algorithm — Trigger Classifier
Import Graph BFS — Key Differentiator vs RAG
TES Metric — Trade-off Efficiency Score
Full source: github.com/jaytoone/CTX
Links
- GitHub: https://github.com/jaytoone/CTX
- arXiv: Pending (cs.IR endorsement in progress; code: HBJRI6)
Experiment Reproducibility
git clone https://github.com/jaytoone/CTX
cd CTX
pip install -r requirements.txt
# Synthetic benchmark (all 8 strategies)
python run_experiment.py --dataset-size small --strategy all
# Real codebase evaluation
python run_experiment.py --dataset-source real --project-path /path/to/project
# COIR benchmark
python run_coir_eval.py
# LLM pass@1 evaluation (requires MINIMAX_API_KEY)
python run_llm_eval_v2.py