Caching

Hyperterse supports executor-level query caching. This means a cache hit skips connector execution and returns cached results directly. Because all query transports converge on the same executor path, the behavior is consistent across REST, ConnectRPC, and MCP.

When a query is executed:

  1. Hyperterse validates inputs.
  2. It builds the final statement after environment + input substitution.
  3. It computes a cache key.
  4. It checks the cache:
    • hit: returns cached rows
    • miss: calls connector, stores result with TTL, returns fresh rows

This keeps cache semantics deterministic and transport-agnostic.
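
The hit/miss flow above can be sketched in a few lines. This is an illustrative model, not Hyperterse's actual implementation; the class and function names (`QueryCache`, `execute_query`) are invented for the example.

```python
import time

# Minimal sketch of the executor-level cache path described above.
# Names are illustrative, not Hyperterse's actual API.
class QueryCache:
    def __init__(self):
        self._store = {}  # key -> (rows, expires_at)

    def get(self, key):
        entry = self._store.get(key)
        if entry is None:
            return None
        rows, expires_at = entry
        if time.monotonic() >= expires_at:
            del self._store[key]  # TTL expired: treat as a miss
            return None
        return rows

    def put(self, key, rows, ttl_seconds):
        self._store[key] = (rows, time.monotonic() + ttl_seconds)


def execute_query(cache, key, ttl, run_connector):
    cached = cache.get(key)
    if cached is not None:
        return cached             # hit: return cached rows, skip the connector
    rows = run_connector()        # miss: call the connector
    cache.put(key, rows, ttl)     # store result with TTL
    return rows
```

Because the check happens in one place, every transport that reaches the executor gets identical behavior.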

Caching is configured in two places:

  • Global/default settings: server.queries.cache
  • Per-query override: queries.<name>.cache
```yaml
server:
  queries:
    cache:
      enabled: true
      ttl: 120
```

| Field   | Type    | Default | Description                                          |
| ------- | ------- | ------- | ---------------------------------------------------- |
| enabled | boolean | false   | Enables caching globally                             |
| ttl     | int     | 120     | Default cache TTL (seconds) when caching is enabled  |
```yaml
queries:
  get-user-by-id:
    use: main_db
    description: 'Get one user'
    cache:
      enabled: true
      ttl: 30
    statement: |
      SELECT id, name, email
      FROM users
      WHERE id = {{ inputs.userId }}
    inputs:
      userId:
        type: int
```

| Field   | Type    | Description                                            |
| ------- | ------- | ------------------------------------------------------ |
| enabled | boolean | Required when a cache block is present                 |
| ttl     | int     | Optional TTL override in seconds (default remains 120) |

Cache policy is resolved in this order:

  1. Start with defaults:
    • enabled = false
    • ttl = 120
  2. Apply server.queries.cache values (if present).
  3. Apply queries.<name>.cache values (if present).
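
The resolution order above amounts to a shallow merge of defaults, server-level settings, and the per-query override. A minimal sketch, assuming a dict-based config (the function name `resolve_cache_policy` is illustrative, not part of Hyperterse):

```python
# Defaults mirror the documented values: caching off, ttl 120 seconds.
DEFAULTS = {"enabled": False, "ttl": 120}

def resolve_cache_policy(server_cache=None, query_cache=None):
    policy = dict(DEFAULTS)                       # 1. start with defaults
    for override in (server_cache, query_cache):  # 2. server, then 3. query
        if override:
            policy.update(override)               # later values win
    return policy
```

For example, a server config of `{"enabled": True, "ttl": 300}` combined with a query override of `{"ttl": 30}` resolves to caching enabled with a 30-second TTL.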
```yaml
server:
  queries:
    cache:
      enabled: true
```

  • All queries cache with ttl=120.
```yaml
server:
  queries:
    cache:
      enabled: true
      ttl: 300
```

  • All queries cache with ttl=300, unless a query overrides the TTL.
```yaml
server:
  queries:
    cache:
      enabled: true
      ttl: 300

queries:
  list-audit-events:
    use: main_db
    description: 'Always fetch latest events'
    cache:
      enabled: false
    statement: 'SELECT * FROM audit_events ORDER BY created_at DESC LIMIT 100'
```

  • list-audit-events bypasses the cache even though the global cache is enabled.
```yaml
server:
  queries:
    cache:
      enabled: true
      ttl: 300

queries:
  list-products:
    use: main_db
    description: 'Product catalog'
    cache:
      enabled: true
      ttl: 30
    statement: 'SELECT id, name, price FROM products'
```

  • list-products uses ttl=30.
  • Other queries use ttl=300.

Choose a TTL based on how quickly the data changes:

  • 10-30s: rapidly changing data (live metrics, dashboards)
  • 60-300s: standard app reads (catalogs, profile lookups)
  • 300-900s: slower-changing reference data
  • 0 or disabled: writes, highly sensitive reads, strict real-time requirements

Start conservative, monitor behavior, then increase where safe.

Good candidates:

  • Read-heavy lookups
  • Repeated queries with identical normalized statement output
  • Expensive aggregations that tolerate slight staleness

Avoid or disable:

  • Mutation queries
  • Security-sensitive reads requiring strict freshness
  • Highly user-specific queries with poor repeat rate

Caching is implemented at the executor level, so all transports share the same behavior:

  • REST API endpoints
  • ConnectRPC calls
  • MCP tool calls (tools/call)

No handler-specific cache configuration is needed.

  • Cache is process-local in-memory.
  • Restart clears cache contents.
  • Cache key includes final rendered statement, so different input values naturally produce distinct keys.
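
Since the key is derived from the final rendered statement, two executions with different input values can never collide. A sketch of one plausible key derivation, assuming a hash over the rendered SQL (the function `cache_key` and the exact key format are assumptions, not Hyperterse's documented scheme):

```python
import hashlib

# Illustrative key derivation: hash the final rendered statement so that
# different input values naturally produce distinct keys.
def cache_key(query_name, rendered_statement):
    digest = hashlib.sha256(rendered_statement.encode("utf-8")).hexdigest()
    return f"{query_name}:{digest}"
```

With this scheme, `WHERE id = 7` and `WHERE id = 8` render to different statements and therefore map to different cache entries.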

Common reasons a query is not served from cache:

  • Different final statement text due to changed inputs/env-substitutions
  • TTL expired
  • Query has cache.enabled: false
  • Global cache disabled and query-level cache not enabled

If cached results are staler than expected:

  • Lower the ttl for that query
  • Disable caching for strict real-time endpoints
  • Verify whether source data changes more frequently than TTL

If cache memory use or hit rate is a concern:

  • Shorten TTL values for high-cardinality queries
  • Disable caching on low hit-rate queries
  • Cache only stable, repeatable read paths