This article details a zero-knowledge filtering mechanism using 32-bit bitmasks to enable privacy-preserving matchmaking. It explains how categorical user data can be represented as integer bitmasks, allowing a backend server to perform matching logic (bitwise AND operations) without ever decrypting or understanding the actual user attributes. This approach ensures data privacy while efficiently filtering profiles.
The core challenge addressed is how to enable a server to perform filtering and matching operations on user data while maintaining a strict zero-knowledge principle, meaning the server never sees the plaintext data. This is crucial for applications dealing with highly sensitive user information, such as health status or personal preferences, where privacy is paramount for user safety and trust.
In traditional systems, servers store and process plaintext user data to facilitate matching. However, for applications requiring zero-knowledge, all user data is encrypted client-side before transmission. This leaves the backend unable to perform any textual comparisons, string indexing, or standard database queries for filtering. The solution must allow the server to operate on data it doesn't understand semantically.
The proposed solution uses 32-bit integers, or bitmasks, to represent categorical user attributes. Each bit within the 32-bit integer is assigned a specific meaning by the frontend client, for example, bit 0 for 'Man', bit 1 for 'Woman', bit 2 for 'Single', etc. The server only receives and stores these integers, completely unaware of the semantic meaning of individual bits or the overall mask.
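The bit assignment described above can be sketched as follows. This is a minimal illustration, not the author's implementation: in the described design the encoding happens in the frontend client, and Erlang is used here only to match the article's own snippet. The module name `mask_sketch` and the functions `bit/1` and `encode/1` are hypothetical; only the three bit positions come from the text.

```erlang
-module(mask_sketch).
-export([encode/1, demo/0]).

%% Bit positions taken from the example in the text.
%% A real dictionary would assign up to 32 positions (bits 0..31).
bit(man)    -> 0;
bit(woman)  -> 1;
bit(single) -> 2.

%% Fold a list of attribute atoms into one integer,
%% setting exactly one bit per attribute.
encode(Attributes) ->
    lists:foldl(fun(A, Mask) -> Mask bor (1 bsl bit(A)) end, 0, Attributes).

demo() ->
    5 = encode([man, single]),   % 2#101: bits 0 and 2 set
    2 = encode([woman]),         % 2#010: bit 1 set
    ok.
```

Only the resulting integer (here 5 or 2) would be sent to the server; the `bit/1` dictionary stays with the client.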
Matching is then performed with bitwise AND operations. Two users match when one user's preference mask shares at least one set bit with the other user's identity mask, and vice versa. A bitwise AND is a single, cheap CPU instruction, so thousands of profiles can be filtered rapidly.
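% I see them if my preference mask shares a bit with their identity mask.
ISeeThem = (MyPIntent band OtherIIntent) =/= 0,
% They see me if their preference mask shares a bit with my identity mask.
TheySeeMe = (OtherPIntent band MyIIntent) =/= 0,
IsMatch = ISeeThem andalso TheySeeMe.

Privacy Guarantees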
Because the bit-to-attribute dictionary exists only in the client, the server cannot map the integers it stores back to attributes. Even if the database is compromised, an attacker obtains only encrypted blobs and integers whose bit meanings are not recorded anywhere on the server, which protects sensitive user information.
The architecture involves a frontend (e.g., Vue + TweetNaCl) responsible for collecting user data, generating the bitmasks, and encrypting the full profile data. The backend (e.g., Erlang + Mnesia) stores the encrypted blobs alongside the bitmasks and performs only bitwise operations for matching. This separation of concerns ensures that sensitive data never resides in plaintext on the server.
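The backend's role can be sketched as a filter over stored records. This is a hedged illustration under stated assumptions: the tuple shape `{Id, IdentityMask, PreferenceMask, EncryptedBlob}`, the module name `match_sketch`, and the functions `is_match/2` and `candidates/2` are all hypothetical, and a plain list stands in for the Mnesia table so that no storage API is assumed.

```erlang
-module(match_sketch).
-export([is_match/2, candidates/2]).

%% Assumed record shape (not from the source):
%%   {Id, IdentityMask, PreferenceMask, EncryptedBlob}

%% Mutual match: each side's preference mask must share at least
%% one set bit with the other side's identity mask.
is_match({_, MyI, MyP, _}, {_, OtherI, OtherP, _}) ->
    (MyP band OtherI) =/= 0 andalso (OtherP band MyI) =/= 0.

%% Return the ids and still-encrypted blobs of every profile that
%% mutually matches Me. The blobs are passed through untouched.
candidates(Me = {MyId, _, _, _}, Profiles) ->
    [{Id, Blob} || P = {Id, _, _, Blob} <- Profiles,
                   Id =/= MyId,
                   is_match(Me, P)].
```

For example, with `Me = {1, 2#001, 2#010, <<"a">>}` and the other profiles `{2, 2#010, 2#001, <<"b">>}` and `{3, 2#010, 2#010, <<"c">>}`, `candidates/2` returns `[{2, <<"b">>}]`: profile 3 satisfies the first condition but not the second. The function never inspects the blob or interprets any bit, which is the property the architecture relies on.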