Privacy-preserving digital ads infrastructure: An overview of Anonym’s technology
BRAD SMALLWOOD, SVP AND ANONYM CO-FOUNDER
GRAHAM MUDD, SVP OF PRODUCT AND ANONYM CO-FOUNDER
It's been four months since Anonym joined Mozilla. Anonym was founded with the belief that new technologies can keep digital ads effective and measurable while respecting privacy. Mozilla has long been a leader in digital privacy, so Anonym is happy to report that we are right at home as a key pillar in Mozilla's strategy to make digital advertising more private. As Laura discussed, while Mozilla's product teams focus on privacy-respecting advertising tools that are relevant to products like Firefox and Fakespot, we are in parallel focused on building a viable alternative infrastructure for the industry.
Now that we're settled in, we wanted to provide the advertising industry and the Mozilla community with an overview of the technologies we're developing and share a few examples of how they can be used to improve user privacy.
First, it's important for us to be clear about the specific problem we're trying to address. Digital advertising is highly reliant on user-level data sharing between various industry participants. A simple example: Ad platforms collect information about the browsing and buying behavior of individuals from millions of websites and apps. That information is often associated with a user's "profile" and is then used to determine which ads to show that user. This practice is referred to by a number of terms - tracking, profiling, cross-site sharing, etc.
Whatever the term, this approach typically isn't aligned with people's reasonable expectation of privacy. And it's actually not even necessary to drive ad performance. Anonym's goal is to develop a better approach for the industry.
Starting at the highest level, we believe there are a few important requirements for any privacy-preserving advertising system. The table below articulates those requirements and the approach Anonym is taking to fulfill them.
Area | Requirement | Anonym's approach |
Security | Data should be processed using confidential computing systems that reduce or eliminate the need to trust any party, including the operator(s) of the technology. | All data processed by Anonym is encrypted end-to-end. Data is processed in Trusted Execution Environments using Intel SGX. |
Privacy | The outputs of any privacy-preserving system should protect individuals' personal data. There must be technical guarantees that reduce or eliminate the possibility of individuals being re-identified. | Anonym provides aggregated insights and leverages differential privacy to prevent individuals from being singled out. |
Transparency | All parties involved should have source-code level transparency into how their data is being processed. | Anonym provides customers with access to detailed documentation and source code through our transparency portal. |
Scalability | Advertising is inherently high scale, involving large data sets and millions of businesses. Systems must be capable of processing billions of impressions repeatedly. | Anonym has developed a parallel computing approach using TEEs that can scale arbitrarily to any size job. Our system leverages the same algorithms repeatedly for an unlimited number of customers/campaigns, avoiding manual approval processes. |
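To make the Privacy row more concrete, here is a minimal Python sketch of one common way to apply differential privacy to aggregated counts: the Laplace mechanism. It is an illustration only, not Anonym's implementation; this post does not say which mechanism or parameters Anonym uses. The function names, the `epsilon` value, and the sample data are all assumptions made for the example.

```python
import random


def laplace_noise(scale: float, rng: random.Random) -> float:
    """Sample Laplace(0, scale) as the difference of two exponential draws."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def noisy_conversion_counts(conversions, creatives, epsilon, rng=None):
    """Return a noisy conversion count for every creative in `creatives`.

    `conversions` is an iterable of (user_id, creative_id) pairs and
    `creatives` is the fixed list of creatives to report on. Each user is
    counted at most once in total, so adding or removing one person changes
    the counts by at most 1 (sensitivity 1). Laplace noise with scale
    1 / epsilon is then added to every count, including zero counts.
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    if rng is None:
        rng = random.Random()

    counts = {creative: 0 for creative in creatives}
    seen_users = set()
    for user_id, creative_id in conversions:
        if creative_id not in counts or user_id in seen_users:
            continue  # unknown creative, or this user was already counted
        seen_users.add(user_id)
        counts[creative_id] += 1

    scale = 1.0 / epsilon
    return {c: n + laplace_noise(scale, rng) for c, n in counts.items()}


# Hypothetical sample data: user "u1" converts twice but is counted once.
sample = [("u1", "A"), ("u2", "A"), ("u3", "B"), ("u1", "B")]
print(noisy_conversion_counts(sample, ["A", "B", "C"], epsilon=1.0))
```

Smaller values of `epsilon` add more noise and give stronger protection; larger values give more accurate counts. Only the noisy totals are returned, never the per-user rows.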
Diving a bit deeper, the diagram below shows how data flows through Anonym's system.
- Binary Development & Approval: Before any data can be processed, Anonym develops a "binary" which includes all the code for creating a Trusted Execution Environment (TEE) and all the code that will run within it. Binaries are approved by the parties contributing data - and we hope civil society will play a role in this attestation in the future. Typically, a binary is specific to a use case (e.g. attribution) and a media platform (e.g. a social network). The same binary is used by many of that media platform's customers.
- Data Encryption and Transfer: Anonym has a number of tools and methods available to encrypt and transfer data into our environment. Each partner has their own public encryption key - the private key is only available within the TEE. Since the data can't be decrypted without the private key, it is protected while in transit and cannot be accessed by Anonym employees.
- Attestation & Decryption: Once an ephemeral TEE has been created, customer data is decrypted within its encrypted memory. The key needed for decryption is only available if the binary used by the TEE matches the cryptographic signature of the binary approved by the partner. This provides partners with full control over how Anonym processes their data.
- Data Processing & Differential Privacy: Data from two or more sources are joined using shared identifiers. Advertising algorithms such as attribution or lookalike models are run, and differential privacy is applied to limit the risk that any individual can be identified or singled out.
- Aggregated Outputs: The insights are shared with ad platforms and their customers, but no individual user data leaves the TEE. For example, Anonym's system is used to provide customers with aggregated insights such as which ad creatives are performing best, and ROI calculations for ad campaigns. These insights were previously only available if advertisers exposed user-level data directly to ad platforms.
- Data & Environment Destroyed: Once the required operations are completed in the TEE, the TEE is destroyed along with all the data within it.
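The approval and attestation steps above can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not Anonym's code: a plain SHA-256 digest stands in for the hardware-produced, hardware-signed measurement a real TEE reports, and the `KeyRelease` class, `measure_binary` function, and sample values are hypothetical names invented for the example.

```python
import hashlib
import hmac


def measure_binary(binary: bytes) -> str:
    """Return a SHA-256 hex digest standing in for the TEE's binary measurement."""
    return hashlib.sha256(binary).hexdigest()


class KeyRelease:
    """Hypothetical key service holding a partner's private key.

    The key is released only when the binary being run matches the
    measurement the partner approved in advance.
    """

    def __init__(self, approved_measurement: str, private_key: bytes):
        self._approved = approved_measurement
        self._private_key = private_key

    def request_key(self, running_binary: bytes) -> bytes:
        measured = measure_binary(running_binary)
        # Constant-time comparison of the two hex digests.
        if not hmac.compare_digest(measured, self._approved):
            raise PermissionError("binary does not match the approved measurement")
        return self._private_key


# The partner approves one specific binary (placeholder bytes here).
approved_binary = b"attribution-binary-v1"
service = KeyRelease(measure_binary(approved_binary), private_key=b"partner-secret")

# The approved binary obtains the key; a modified binary is refused.
key = service.request_key(approved_binary)
try:
    service.request_key(approved_binary + b" (modified)")
except PermissionError as err:
    print("refused:", err)
```

The point of the design is that changing even one byte of the binary changes its measurement, so data can only be decrypted by code the partner has reviewed and approved.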
We hope this is a helpful overview of the system we have developed. In the coming weeks, we'll be publishing deep dives into the components described above. While we believe the system we have developed is a meaningful step forward, we will continue to improve Anonym with feedback from our customers and the privacy community. Please don't hesitate to reach out if you have questions or would like to learn more.
The post Privacy-preserving digital ads infrastructure: An overview of Anonym's technology appeared first on The Mozilla Blog.