This article details a system design problem involving the generation of valid phone numbers based on complex regulatory rules. It highlights a common pitfall in algorithmic design where an initial O(N^M) iterative approach, suitable for small datasets, fails dramatically at scale. The core system design lesson is the transformation of a compute-intensive algorithmic problem into a data-centric problem, leveraging the power of relational databases for efficient set-based operations.
Read original on Dev.to #systemdesign

The objective was to generate lists of valid 10-digit phone numbers for specific cities in Mexico, adhering to IFT (Instituto Federal de Telecomunicaciones) regulations. These regulations involve prefixes (2-4 digits) tied to localities and states, and complex rules dictating which suffixes are valid for the remaining 6-8 digits. The challenge was to efficiently generate all valid combinations for a given city prefix.
The first implementation relied on deeply nested `for` loops. For each digit position, it iterated through all possible values (0-9) and applied `if` conditions to filter out combinations that violated the IFT rules. For a city with a 3-digit prefix (leaving 7 digits to generate), that meant on the order of 10^7 candidate combinations before any filtering logic even ran; a 2-digit prefix pushes it to 10^8. While this worked for small cities, it took hours for large cities like CDMX or Monterrey, pushing local systems to their memory and CPU limits.
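The nested-loop approach can be sketched as follows; the function and rule names here are illustrative, not the article's actual code, and `itertools.product` stands in for the hand-written loop nest:

```python
from itertools import product

def brute_force_numbers(prefix, is_valid):
    """Enumerate every 10-digit number starting with `prefix`,
    keeping only those the rule check accepts."""
    remaining = 10 - len(prefix)  # digits left to generate
    results = []
    # 10^remaining iterations -- this is the scalability trap
    for digits in product("0123456789", repeat=remaining):
        candidate = prefix + "".join(digits)
        if is_valid(candidate):  # per-candidate IFT-style rule check
            results.append(candidate)
    return results

# Toy rule for illustration: the generated suffix must not start with 0.
numbers = brute_force_numbers("5512345", lambda n: n[7] != "0")
# 9 * 10 * 10 = 900 valid numbers out of 1,000 candidates
```

With only 3 digits left to generate this finishes instantly, but each additional free digit multiplies the work by 10, which is exactly why the approach collapses for cities with short prefixes.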
Scalability Trap
This scenario highlights a crucial system design lesson: an algorithm that is logically correct can be an architectural disaster if its complexity isn't considered against expected data scale. Relying on iterative, in-memory processing for combinatorial problems quickly becomes a bottleneck.
The key insight was recognizing that the problem was not a pure algorithmic puzzle best solved with imperative code, but rather a data combination and filtering problem. Relational database engines are highly optimized for this kind of set-based operation. The paradigm shifted from "generating combinations with code" to "modeling rules as data and letting the database perform the joins and filtering."
The refined solution involved transforming the IFT rules from `if` statements into database tables (often temporary). Each table represented valid digits for a specific position, conditioned on the state and locality. The generation of valid phone numbers then became a series of `JOIN` operations across these tables, efficiently combining the valid segments and filtering out invalid ones. This leverages the database's optimized query planners and indexing capabilities, drastically reducing computation time from hours to seconds or minutes.
-- Simplified example of rule conversion and combination
CREATE TEMPORARY TABLE valid_prefix_cdmx AS
SELECT prefix FROM ift_rules WHERE city = 'CDMX';
CREATE TEMPORARY TABLE valid_suffix_part1 AS
SELECT suffix_digits FROM ift_rules_suffix WHERE position = 1;
CREATE TEMPORARY TABLE valid_suffix_part2 AS
SELECT suffix_digits FROM ift_rules_suffix WHERE position = 2;
SELECT
    p.prefix || s1.suffix_digits || s2.suffix_digits AS phone_number
FROM
    valid_prefix_cdmx p
    CROSS JOIN valid_suffix_part1 s1  -- or join on a relevant rule condition
    CROSS JOIN valid_suffix_part2 s2
WHERE
    LENGTH(p.prefix || s1.suffix_digits || s2.suffix_digits) = 10;

This approach demonstrates a powerful technique in system design: recognizing when an algorithmic problem can be reframed as a data manipulation task, allowing the use of specialized, highly optimized tools like database systems to achieve superior performance and scalability.
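The set-based technique can be demonstrated end to end with an in-memory SQLite database. The table and column names below (`prefixes`, `suffix_part1`, `suffix_part2`) are hypothetical stand-ins for the article's rule tables, and the inserted rows are toy data:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Model the rules as data: one table of valid values per number segment.
cur.execute("CREATE TABLE prefixes (prefix TEXT)")
cur.execute("CREATE TABLE suffix_part1 (digits TEXT)")
cur.execute("CREATE TABLE suffix_part2 (digits TEXT)")

cur.executemany("INSERT INTO prefixes VALUES (?)", [("5551",), ("5552",)])
cur.executemany("INSERT INTO suffix_part1 VALUES (?)",
                [(f"{i:03d}",) for i in range(100, 103)])  # 3 valid blocks
cur.executemany("INSERT INTO suffix_part2 VALUES (?)",
                [(f"{i:03d}",) for i in range(4)])         # 4 valid blocks

# The database performs the combination: a cross join of valid segments,
# filtered to full 10-digit numbers.
cur.execute("""
    SELECT p.prefix || s1.digits || s2.digits AS phone
    FROM prefixes p
    CROSS JOIN suffix_part1 s1
    CROSS JOIN suffix_part2 s2
    WHERE LENGTH(p.prefix || s1.digits || s2.digits) = 10
""")
numbers = [row[0] for row in cur.fetchall()]
# 2 prefixes x 3 blocks x 4 blocks = 24 valid numbers
```

The key design choice is that invalid combinations are never materialized: only rows representing valid segments exist, so the join's output is the answer, and the query planner, not application code, decides how to combine them.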