TTL-First Cache Table Patterns (FaceTheory ISR)
This document describes operationally safe patterns for TTL-based cache metadata tables used for ISR:
- how to choose a TTL attribute and window
- how to separate “freshness” from “expiry”
- how to account for DynamoDB TTL deletion lag
- operational recommendations to avoid surprises in production
Schema context: docs/facetheory/isr-cache-schema.md.
Choose a TTL attribute
Recommended:
- Use a single numeric attribute named ttl that stores epoch seconds.
- Treat it as garbage collection, not as correctness logic.
Why:
- TTL is asynchronous and can delete later than the timestamp you set.
- A stable ttl attribute name keeps Go/TypeScript/Python models aligned.
Freshness vs expiry (two clocks)
Use two separate concepts:
- Freshness window (correctness boundary):
fresh_until = generated_at + revalidate_seconds
- Expiry / GC horizon (storage boundary):
ttl = generated_at + retention_seconds (+ safety_buffer)
Rules of thumb:
- revalidate_seconds is usually minutes to hours.
- retention_seconds is usually days to weeks (enough for debuggability and rollback).
- retention_seconds SHOULD be much larger than revalidate_seconds, so that ttl falls well after fresh_until.
DynamoDB TTL deletion lag
TTL deletion is best-effort; items can remain after their ttl passes.
Implications:
- ✅ Treat ttl as “eligible for deletion”, not “will be deleted immediately”.
- ✅ Readers MUST decide fresh/stale based on generated_at and revalidate_seconds.
- ❌ Never treat “item missing” as “item never existed”; it may have been deleted earlier or later than expected.
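One way to apply these implications on the read path is sketched below, under assumptions: the Meta and Decision types and the Decide function are hypothetical. The freshness decision uses only generated_at and revalidate_seconds; the ttl comparison is kept as a separate, informational flag for items that are still readable after their ttl has passed.

```go
package main

import "fmt"

// Meta holds the metadata fields discussed above, all in Unix epoch seconds.
// (Hypothetical type for this sketch.)
type Meta struct {
	GeneratedAt       int64
	RevalidateSeconds int64
	TTL               int64
}

// Decision is a hypothetical result type for this sketch.
type Decision struct {
	Fresh   bool // now < generated_at + revalidate_seconds
	PastTTL bool // item was returned by a read although its ttl has passed
}

// Decide never consults TTL for freshness; TTL only feeds the PastTTL flag,
// which a caller might log or count as a signal of deletion lag.
func Decide(m Meta, nowUnix int64) Decision {
	return Decision{
		Fresh:   nowUnix < m.GeneratedAt+m.RevalidateSeconds,
		PastTTL: nowUnix >= m.TTL,
	}
}

func main() {
	m := Meta{GeneratedAt: 1000, RevalidateSeconds: 60, TTL: 1000 + 86400}
	for _, now := range []int64{1030, 1060, 90000} {
		fmt.Printf("now=%d %+v\n", now, Decide(m, now))
	}
}
```

Because the two flags are independent, a lingering item past its ttl is simply reported as stale, and the reader behaves the same whether or not the deletion has happened yet.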
Operational recommendations
- Hot partitions: avoid putting raw URL paths directly in pk; use a stable hash to reduce common-prefix hotspots.
- Capacity: for ISR tables, on-demand billing is often simplest; watch ThrottledRequests and adjust.
- Alarms: alert on read/write throttles, elevated UserErrors, and high latency; consider tracking TimeToLiveDeletedItemCount as a sanity signal (not a correctness signal).
- Retention: choose a retention long enough for investigations (and long enough to survive TTL lag), but not so long that storage cost grows unbounded.
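The hot-partition recommendation above could be implemented along the following lines. This is a sketch under assumptions: the CachePK function is hypothetical, the key layout is borrowed from the example key used in this document, and the choice of SHA-256 truncated to 16 hex characters is illustrative rather than prescribed by the schema.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// CachePK builds a partition key from a tenant ID and a URL path.
// The path is hashed so that keys do not share long common prefixes
// such as "/products/...". Truncating to 8 bytes (16 hex characters)
// is an assumption made to keep keys short.
func CachePK(tenantID, urlPath string) string {
	sum := sha256.Sum256([]byte(urlPath))
	return fmt.Sprintf("TENANT#%s#CACHE#%s", tenantID, hex.EncodeToString(sum[:8]))
}

func main() {
	fmt.Println(CachePK("t1", "/products/42"))
	fmt.Println(CachePK("t1", "/products/43"))
}
```

The hash is deterministic, so the same path always maps to the same key. Any normalization of the path (trailing slashes, query strings) would need to happen before hashing, since different byte sequences produce unrelated keys.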
Example: write metadata with TTL, then read and decide fresh/stale
This example uses the META row pattern (sk = "META").
Write
- Compute generated_at = now_unix.
- Set ttl = generated_at + retention_seconds.
- Store revalidate_seconds alongside the metadata.
Read
- Fetch the META item by primary key.
- Compute fresh_until = generated_at + revalidate_seconds.
- Consider it fresh iff now_unix < fresh_until.
Go example
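// Fragment: assumes it runs inside a function that returns error,
// with db being the table client and "time" imported.

// Write: record generation time, freshness window, and GC horizon.
nowUnix := time.Now().Unix()
meta := &FaceTheoryCacheMetadata{
	PK:                "TENANT#t1#CACHE#abc",
	SK:                "META",
	S3Key:             "pages/t1/abc.html",
	GeneratedAt:       nowUnix,
	RevalidateSeconds: 60,              // freshness window
	TTL:               nowUnix + 86400, // GC horizon (1 day)
}
if err := db.Model(meta).CreateOrUpdate(); err != nil {
	return err
}

// Read: fetch the META item by primary key.
var got FaceTheoryCacheMetadata
if err := db.Model(&FaceTheoryCacheMetadata{}).
	Where("PK", "=", meta.PK).
	Where("SK", "=", "META").
	First(&got); err != nil {
	return err
}

// Decide freshness from generated_at and revalidate_seconds only, never from ttl.
freshUntil := got.GeneratedAt + got.RevalidateSeconds
isFresh := time.Now().Unix() < freshUntil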