This article details the design and implementation of a "heat score" system to introduce temporal relevance into data search and prioritization. It explains the underlying mathematical model, database schema, and integration points for dynamically adjusting content relevance based on recency and frequency of access, enhancing user experience in applications managing personal data.
Read original on Dev.to #architecture
Traditional search algorithms often treat all semantically matching data as equally relevant, regardless of when it was created or last accessed. As a result, old, less useful records surface alongside current, highly pertinent ones, which makes for a poor user experience. The "heat score" system addresses this by adding a temporal dimension to relevance: applications can prioritize data that is actively used or was recently interacted with, without losing the ability to retrieve older data when it is explicitly requested.
The core of the system is a heat score, a normalized value between 0 and 1, derived from two primary signals: access frequency and access recency. The formula combines a linearly decaying access count with an exponentially decaying recency factor (using a half-life of one week). This ensures that intensely used records don't stay hot indefinitely if neglected, and recently accessed items quickly gain heat.
effective_access = access_count * max(0, 1 - (hours_since_last_increment / 4383))
raw_heat = effective_access * e^(-λ * hours_since_last_access)
# Where λ = ln(2) / 168 (half-life of one week)
Normalization scales the raw heat into the 0-1 range with a soft cap at 0.8, preventing thermal saturation and allowing differentiation even among highly active records. A hard cap at 1.0 is enforced at the database level.
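The formula and the normalization step can be sketched in Python as follows. This is a minimal illustration, not the article's implementation: the text does not give the exact normalization curve, so `REFERENCE_HEAT` and the exponential compression above the 0.8 soft cap are assumptions, as are the function names.

```python
import math

HALF_LIFE_HOURS = 168.0                       # one week, from the formula
DECAY_LAMBDA = math.log(2) / HALF_LIFE_HOURS  # λ = ln(2) / 168
COUNT_DECAY_HOURS = 4383.0                    # linear decay window from the formula
SOFT_CAP = 0.8                                # soft cap described in the text
REFERENCE_HEAT = 50.0                         # ASSUMPTION: raw heat that maps to 1.0 before compression


def raw_heat(access_count, hours_since_last_increment, hours_since_last_access):
    """Linearly decayed access count times an exponentially decayed recency factor."""
    effective_access = access_count * max(
        0.0, 1.0 - hours_since_last_increment / COUNT_DECAY_HOURS
    )
    return effective_access * math.exp(-DECAY_LAMBDA * hours_since_last_access)


def normalize(raw):
    """Scale raw heat into [0, 1].

    ASSUMPTION: linear below the soft cap, then compressed so values
    approach 1.0 asymptotically; min() applies the hard cap at 1.0.
    """
    x = max(0.0, raw) / REFERENCE_HEAT
    if x <= SOFT_CAP:
        return x
    headroom = 1.0 - SOFT_CAP
    compressed = SOFT_CAP + headroom * (1.0 - math.exp(-(x - SOFT_CAP) / headroom))
    return min(1.0, compressed)
```

With these assumed constants, a record with 10 accesses counted just now and last opened one week ago has a raw heat of about 5 (one half-life has passed), which normalizes to roughly 0.1. Records above the soft cap still rank in order of raw heat, but the gaps between them shrink.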
The heat score data is stored in a separate `record_heat` table, de-coupled from individual domain tables (e.g., notes, contacts). This cross-cutting design simplifies management, as changes to the heat logic require updates to only one table and associated services, rather than modifying numerous domain-specific schemas. The table includes fields for `domain`, `record_id`, `access_count`, `last_accessed`, `last_increment`, `heat_score`, and `memory_tier`.
CREATE TABLE record_heat (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    domain VARCHAR(30) NOT NULL,
    record_id UUID NOT NULL,
    access_count INTEGER NOT NULL DEFAULT 0,
    last_accessed TIMESTAMPTZ NOT NULL DEFAULT now(),
    last_increment TIMESTAMPTZ NOT NULL DEFAULT now(),
    heat_score REAL NOT NULL DEFAULT 0.0,
    memory_tier VARCHAR(10) NOT NULL DEFAULT 'cold',
    created_at TIMESTAMPTZ NOT NULL DEFAULT now(),
    updated_at TIMESTAMPTZ NOT NULL DEFAULT now(),
    UNIQUE(domain, record_id)
);
Heat scores are updated instantly on user access (e.g., opening a record) and periodically by a cron job (every 6 hours) to account for natural decay. The system also tracks the source of access events, allowing different weightings for user-initiated actions versus automated processes. In search, heat serves as a post-fusion tiebreaker, boosting relevant results without overriding semantic relevance. For features like digest engines, heat helps prioritize updates, surfacing changes to more active records higher up.
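One way to read "post-fusion tiebreaker" is as a small, bounded boost added after the fused relevance score has been computed. The sketch below assumes fused scores in the 0-1 range. `SearchHit`, `rerank_with_heat`, and the `HEAT_WEIGHT` value of 0.05 are hypothetical names and values, not taken from the article.

```python
from dataclasses import dataclass

HEAT_WEIGHT = 0.05  # ASSUMPTION: maximum boost heat can add to a fused score


@dataclass(frozen=True)
class SearchHit:
    record_id: str
    fused_score: float  # relevance after rank fusion, assumed to lie in [0, 1]
    heat_score: float   # value read from record_heat.heat_score, in [0, 1]


def rerank_with_heat(hits):
    """Sort hits by fused relevance plus a small heat boost, highest first."""
    return sorted(
        hits,
        key=lambda hit: hit.fused_score + HEAT_WEIGHT * hit.heat_score,
        reverse=True,
    )


hits = [
    SearchHit("a", fused_score=0.90, heat_score=0.0),
    SearchHit("b", fused_score=0.88, heat_score=1.0),
    SearchHit("c", fused_score=0.50, heat_score=1.0),
]
ranked_ids = [hit.record_id for hit in rerank_with_heat(hits)]
```

Because heat never exceeds 1.0, the boost never exceeds `HEAT_WEIGHT`. In the example, hot record "b" overtakes the nearly tied "a", while "c" stays last despite being equally hot, since its fused score is far lower.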