CTX: Trigger-Driven Dynamic Context Loading
for Code-Aware LLM Agents
Jeawon Jang · be2jay67@gmail.com
⚡ Code: github.com/jaytoone/CTX 📄 arXiv: Pending Endorsement (cs.IR) 🏆 8 Strategies · 415 Queries · p<0.05
  • TES Score: 0.776
  • vs BM25 Baseline: 1.9×
  • Token Usage: 5.2%
  • Hybrid COIR R@5: 0.95
  • IMPLICIT Recall@5: 1.00
  • Total Queries: 415
Abstract
Large language models suffer from context dilution when processing extensive codebases — the "Lost in the Middle" problem. Standard RAG approaches treat code as flat text, ignoring the structural dependency information in import graphs.

We present CTX, a trigger-driven dynamic context loading system that classifies developer queries into four types — EXPLICIT_SYMBOL, SEMANTIC_CONCEPT, TEMPORAL_HISTORY, IMPLICIT_CONTEXT — and routes each to a specialized retrieval pipeline.

For dependency-sensitive queries, CTX performs breadth-first traversal over the codebase import graph, resolving transitive relationships that are invisible to keyword and embedding methods. Evaluated on a synthetic benchmark (50 files, 166 queries) and three real Python codebases (968 files total, 249 queries), CTX achieves a TES 1.9× higher than BM25 while using only 5.2% of the tokens. Statistical significance is established via McNemar and Wilcoxon tests (p<0.05) across all 415 queries.
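The breadth-first traversal described above can be sketched in a few lines of Python. This is a minimal illustrative sketch, assuming modules are keyed by name and parsed with the standard `ast` module; the helper names (`build_import_graph`, `transitive_deps`) and the toy repo are hypothetical, not CTX's published API:

```python
# Minimal sketch of import-graph BFS (illustrative; not CTX's actual API).
import ast
from collections import deque

def build_import_graph(files: dict[str, str]) -> dict[str, set[str]]:
    """Map each module name to the set of in-repo modules it imports."""
    graph = {}
    for module, source in files.items():
        deps = set()
        for node in ast.walk(ast.parse(source)):
            if isinstance(node, ast.Import):
                deps.update(alias.name for alias in node.names)
            elif isinstance(node, ast.ImportFrom) and node.module:
                deps.add(node.module)
        graph[module] = deps & files.keys()  # keep only in-repo edges
    return graph

def transitive_deps(graph: dict[str, set[str]], seed: str,
                    max_files: int = 10) -> list[str]:
    """BFS from the seed module, collecting transitive deps in hop order."""
    visited, order, queue = {seed}, [], deque([seed])
    while queue and len(order) < max_files:
        current = queue.popleft()
        order.append(current)
        for dep in sorted(graph.get(current, ())):  # deterministic order
            if dep not in visited:
                visited.add(dep)
                queue.append(dep)
    return order

# Toy repo: app imports services, which imports db, which imports config.
files = {"app": "import services", "services": "import db",
         "db": "import config", "config": ""}
graph = build_import_graph(files)
print(transitive_deps(graph, "app"))  # ['app', 'services', 'db', 'config']
```

A query about `app` thus pulls in `config` even though no keyword overlap or embedding similarity links the two files, which is exactly the transitive relationship flat-text retrieval misses.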
Contributions
  • Four-type trigger taxonomy — EXPLICIT_SYMBOL, SEMANTIC_CONCEPT, TEMPORAL_HISTORY, and IMPLICIT_CONTEXT, each mapped to a specialized retrieval strategy, enabling adaptive resource allocation.
  • Import graph traversal — a BFS-based algorithm over the codebase import graph that resolves transitive dependencies. Recall@5 = 1.0 on dependency queries vs 0.4 for BM25, a 150% improvement.
  • TES metric — the Trade-off Efficiency Score, TES = Recall@K / ln(1 + |retrieved|), a unified measure of the accuracy-efficiency trade-off. Pearson r = 0.87 correlation with NDCG@5 (p<0.001).
  • Hybrid Dense+CTX — a two-stage pipeline combining dense neural seed selection with import graph expansion. COIR Recall@5 = 0.950 (+150% over CTX alone), validating the complementary nature of semantic and structural retrieval.
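For concreteness, the TES formula from the contribution above is a one-liner; the retrieval-size numbers below are illustrative, not benchmark results:

```python
import math

def tes(recall_at_k: float, num_retrieved: int) -> float:
    """Trade-off Efficiency Score: recall discounted by log context size."""
    return recall_at_k / math.log(1 + num_retrieved)

# Perfect recall with 5 files beats perfect recall with 50 files:
print(round(tes(1.0, 5), 3))   # 0.558
print(round(tes(1.0, 50), 3))  # 0.254
```

The logarithmic denominator means doubling the retrieved set only mildly dents the score at small k, but strongly penalizes dumping large slices of the codebase into the context.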
Architecture
Query Input
    │
    ▼
┌──────────────────────────────┐
│   Trigger Classifier         │  → regex + keyword patterns
│   (EXPLICIT / SEMANTIC /     │
│    TEMPORAL / IMPLICIT)      │
└──────────────┬───────────────┘
               │
       ┌───────┴────────┬──────────────┬─────────────────┐
       │                │              │                 │
       ▼                ▼              ▼                 ▼
  Symbol Index    TF-IDF/Dense    History Log      Import Graph BFS
  (AST lookup)   (cosine sim)   (git history)    (transitive deps)
       │                │              │                 │
       └───────┬────────┴──────────────┴─────────────────┘
               │
    ┌──────────▼──────────────┐
    │   Adaptive-k Selection  │  → k = f(query_type, codebase_size)
    │   (3~10 files)          │
    └──────────┬──────────────┘
               │
               ▼
          LLM Context
          (5.2% tokens)
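The top and bottom stages of the diagram can be sketched as follows. The regex patterns and the base-k table are illustrative assumptions made for this sketch, not the rules CTX actually ships:

```python
import re

# Illustrative trigger patterns (assumed; CTX's real rules are richer).
TRIGGER_PATTERNS = [
    ("EXPLICIT_SYMBOL",
     re.compile(r"\w+\(\)|\w+\.py\b|\b(def|class)\s+\w+")),
    ("TEMPORAL_HISTORY",
     re.compile(r"\b(recent(ly)?|yesterday|last (commit|week)|changed|history)\b", re.I)),
    ("SEMANTIC_CONCEPT",
     re.compile(r"\b(how|why|where|explain|concept|logic)\b", re.I)),
]

def classify(query: str) -> str:
    """Return the first trigger type whose pattern fires; IMPLICIT is the fallback."""
    for label, pattern in TRIGGER_PATTERNS:
        if pattern.search(query):
            return label
    return "IMPLICIT_CONTEXT"  # no explicit cue in the query

def adaptive_k(query_type: str, codebase_size: int) -> int:
    """k = f(query_type, codebase_size), clamped to the 3~10 file range."""
    base = {"EXPLICIT_SYMBOL": 3, "SEMANTIC_CONCEPT": 5,
            "TEMPORAL_HISTORY": 5, "IMPLICIT_CONTEXT": 7}[query_type]
    return min(10, base + codebase_size // 500)  # widen k for large repos

print(classify("Where is parse_query() defined?"))  # EXPLICIT_SYMBOL
print(classify("what changed in auth recently"))    # TEMPORAL_HISTORY
print(adaptive_k("IMPLICIT_CONTEXT", 968))          # 8
```

Checking patterns in a fixed priority order (symbol cues first, vague queries falling through to IMPLICIT_CONTEXT) matches the routing shown in the diagram: the cheapest, most precise pipeline handles a query whenever its trigger fires.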