Netflix Tech Blog·June 19, 2026

Netflix Data Projects: Scalable Identity and Access Management for Data Assets

This article discusses Netflix's Data Projects, a system designed to address the challenges of managing data asset permissions and workload identities at Netflix's scale. A "project" is a logical container for related data assets that provides a durable, synthetic identity for scheduled workloads and simplifies access control through role-based grants. This addresses the problems caused by fine-grained ACLs and human-tied identities in a dynamic organization.


The Challenge of Managing Data Assets at Scale

At Netflix's immense scale, managing millions of tables and tens of thousands of scheduled workloads presented significant challenges in identity and access management. The traditional approach involved fine-grained Access Control Lists (ACLs) per individual asset and workloads running under human identities. This model proved unsustainable and led to two main problems:

  • Permissions not keeping up with organizational changes: Frequent team restructures necessitated manual updates of hundreds of ACLs, leading to support team overload or, worse, overly broad access grants to avoid maintenance overhead.
  • Workloads tied to human identities: When engineers changed roles or left the company, their associated workflows failed due to permission changes, resulting in a "permissions whack-a-mole" and operational fragility.

Introducing Data Projects: A Unified Approach

Netflix's solution, Data Projects, introduces a new abstraction layer to manage data assets and identities. A Data Project serves two primary functions:

  • Container for related assets: It logically groups tables, workflows, and other data assets under a single umbrella, allowing management at a project level rather than individual asset level.
  • Durable, synthetic identity: Each project is provisioned with a Netflix application identity (and optionally an AWS IAM role) that scheduled workloads can execute under. This identity is independent of any human's lifecycle, ensuring stability and auditability.
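
The two functions above can be pictured as a single record: a set of contained assets plus an identity that belongs to the project rather than to a person. The following is a minimal sketch of that shape, not Netflix's actual data model; the class name, field names, and identifier formats (`app:...`, `table:...`, `workflow:...`) are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Optional, Set


@dataclass
class DataProject:
    """Hypothetical model: a container of assets plus a synthetic identity."""
    name: str
    app_identity: str                    # application identity (format assumed)
    iam_role_arn: Optional[str] = None   # optional AWS IAM role, left unset here
    assets: Set[str] = field(default_factory=set)

    def add_asset(self, asset_id: str) -> None:
        """Place an asset (table, workflow, ...) under this project."""
        self.assets.add(asset_id)


# Illustrative names only; none of these come from the article.
project = DataProject(name="content-analytics",
                      app_identity="app:content-analytics")
project.add_asset("table:prod.viewing_summary")
project.add_asset("workflow:daily_viewing_rollup")
```

Note that no human user appears anywhere in the record: ownership and execution identity both hang off the project.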

Key Architectural Concept: Hoisting Granularity

The core architectural shift here is hoisting the granularity of management from individual assets to a logical container (the project). This simplifies permission management from hundreds of individual ACLs to a single set of project-level roles and grants. This pattern can be applied to many system design challenges where fine-grained, individual management becomes unwieldy at scale.
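
To make the effect of hoisting concrete, here is a small sketch in which access is resolved through an asset's project instead of through a per-asset ACL. The role names, team names, and the figure of 300 tables are hypothetical, chosen only to illustrate the pattern.

```python
from enum import Enum


class Role(Enum):
    VIEWER = 1   # read-only (hypothetical role name)
    EDITOR = 2   # read and write (hypothetical role name)


# Project membership: 300 tables managed as one unit.
project_of = {f"table_{i}": "analytics" for i in range(300)}

# One grant table per project replaces 300 per-table ACLs.
grants = {"analytics": {"team-growth": Role.EDITOR,
                        "team-finance": Role.VIEWER}}


def role_for(principal: str, asset: str):
    """Look up the principal's role via the asset's project, or None."""
    project = project_of.get(asset)
    return grants.get(project, {}).get(principal)


def can_read(principal: str, asset: str) -> bool:
    return role_for(principal, asset) is not None


def can_write(principal: str, asset: str) -> bool:
    return role_for(principal, asset) is Role.EDITOR


# A reorg that renames a team is one grant change, not 300 ACL edits.
grants["analytics"]["team-insights"] = grants["analytics"].pop("team-growth")
```

After the last line, `team-insights` can read and write every table in the project, and `team-growth` has lost access everywhere, without any individual table being touched.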

Gravity: Automatic Asset Association

A crucial feature is "gravity": when a workload running under a project's identity creates new assets (e.g., tables), those assets are automatically added to and contained within the project. No manual configuration is needed, so new assets inherit the project's access controls and remain organized without additional effort, which keeps access consistent and reduces operational burden.
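
One way to picture gravity is as a hook at asset-creation time that maps the caller's identity back to its project. The sketch below assumes a hypothetical in-memory `Catalog`; the article does not describe how the real system implements this, and every name here is illustrative.

```python
from dataclasses import dataclass, field


@dataclass
class Project:
    name: str
    identity: str                      # the project's synthetic identity
    assets: set = field(default_factory=set)


class Catalog:
    """Hypothetical catalog that applies 'gravity' when a table is created."""

    def __init__(self, projects):
        self._by_identity = {p.identity: p for p in projects}
        self.unassociated = set()      # tables created by non-project callers

    def create_table(self, table_name: str, caller_identity: str) -> None:
        project = self._by_identity.get(caller_identity)
        if project is not None:
            # Gravity: the new table lands in the creating workload's project.
            project.assets.add(table_name)
        else:
            self.unassociated.add(table_name)


analytics = Project(name="analytics", identity="app:analytics")
catalog = Catalog([analytics])
catalog.create_table("prod.daily_rollup", caller_identity="app:analytics")
catalog.create_table("scratch.adhoc", caller_identity="user:alice")
```

The workload never names its project explicitly; the association follows from the identity it runs under.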

Securing Workflows with Durable Identity

For workflow orchestrators like Netflix's Maestro, Data Projects provide a robust solution to the fragility of user-tied identities. Workflows now run under the project's durable application identity, which doesn't change or leave the company. This ensures that permissions are stable, auditable, and persist through organizational shifts. It also enables consistent access management for created assets and scoped secrets, leading to resilient data pipelines.
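
The fragility of user-tied identities can be shown with a toy model in which offboarding an engineer revokes their principal. This is a sketch under stated assumptions, not Maestro's API: the `Workflow` type, the `run_as` field, and the principal naming scheme are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Workflow:
    name: str
    run_as: str   # principal the scheduler executes the workflow as


TABLE = "table:prod.viewing_summary"
active_principals = {"user:alice", "app:content-analytics"}
grants = {TABLE: {"user:alice", "app:content-analytics"}}


def can_run(workflow: Workflow, table: str) -> bool:
    """A run succeeds only if its principal still exists and holds a grant."""
    principal = workflow.run_as
    return principal in active_principals and principal in grants.get(table, set())


def offboard(user: str) -> None:
    """Remove a departing user's principal and every grant they held."""
    active_principals.discard(user)
    for principals in grants.values():
        principals.discard(user)


legacy = Workflow("rollup_legacy", run_as="user:alice")
durable = Workflow("rollup", run_as="app:content-analytics")

offboard("user:alice")
```

Once `user:alice` is offboarded, the workflow that ran as her can no longer run, while the one running under the project's application identity is unaffected.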

Tags: identity management, access control, data platform, workflow orchestration, scalability, permission management, microservices, organizational changes
