prashant.dhingra.website
Field Guide · Data & Privacy Engineering · Updated June 2026

Analytics on data you can't see

Encrypted data analytics covers a range of privacy-enhancing technologies that let you work with sensitive data while minimizing plaintext exposure. No single technique is superior, so this guide catalogs eight, matches them to threat models, and offers advice on selecting and combining them.

By Prashant Dhingra · ~26 min read · 8 techniques cataloged
Spectrum strip: ← reduced plaintext exposure · strict cryptographic opacity →
TEE (attested hardware) · OPE/ORE (order revealed) · Searchable (bounded leakage) · DP (output control) · FE (f(x) only) · MPC (split trust) · PHE (narrow arithmetic) · FHE (ciphertext math)

Key takeaways
  • "Encrypted analytics" is not binary. Systems vary in their capabilities to defend against different threats (such as cloud operators, collaborators, DBAs, output inference, side channels, collusion) and in their computational abilities.
  • The choice is threat-model-first, not feature-first: select the least powerful tool that effectively mitigates the specific threat you are concerned with.
  • TEEs give the broadest functionality at near-native speed; MPC suits multi-party analytics; FHE removes the most server trust; searchable/OPE trade leakage for query speed; FE is narrow; DP guards the outputs.
  • The dominant 2026 production pattern is hybridization: confidential computing for overall execution, cryptographic sub-protocols for sensitive operations, searchable encryption for query capabilities, and differential privacy for data release.
  • PETs are complementary safeguards, not compliance exemptions: pseudonymised data is still personal data under GDPR.

01 :: Scope · Definitions and scope

Encrypted analytics, in a strict cryptographic sense, encompasses homomorphic encryption, MPC, structured/searchable encryption, private set operations, and functional encryption. Additionally, in industry practice, this category expands to include trusted execution environments and confidential computing.

Encrypted data analytics

Any architecture that enables useful analytics, search, collaboration, or machine learning on sensitive information while keeping the ordinary processing environment from seeing the data in plaintext. This can be achieved by computing on encrypted or secret-shared data, querying protected indexes, executing on attested hardware, or releasing outputs under differential-privacy control.

A useful distinction is between strict cryptographic opacity and reduced plaintext exposure. FHE, MPC, PIR/PSI, and many functional-encryption constructions aim for the former. TEEs aim for the latter: data is plaintext, but only inside an attested enclave or secure VM that is designed to keep the operator, hypervisor, and surrounding platform outside the trust boundary. Simply encrypting data at rest or in transit does not qualify, because the analytical engine still handles plaintext in ordinary memory.

NIST's Privacy-Enhancing Cryptography project provides a solid base: MPC lets mutually distrusting parties compute jointly on private inputs; FHE computes functions on encrypted data; PIR retrieves items without exposing the query; structured encryption enables private queries on encrypted data structures; and functional encryption is included in the list of related tools. The scope of 'analytics' is broad: aggregation, ML inference and training (where possible), SQL-like filtering and selected joins, encrypted data search, and cross-party linkage such as private join and compute.

02 :: Threats & Law · Threat models and regulatory constraints

The first question is who the adversary is. Candidates include an honest-but-curious cloud provider, a malicious operator or hypervisor, colluding input parties in an MPC protocol, a database administrator with access to storage and logs, a side-channel attacker observing memory and microarchitectural effects, and an analyst attempting membership or reconstruction attacks. Different privacy-enhancing technologies address different parts of this list.

  • Cryptographic techniques depend on the hardness of lattice or number-theoretic problems and on the amount of leakage the scheme intentionally allows.
  • MPC depends on the corruption model and collusion threshold: passive versus malicious adversaries, honest versus dishonest majority, and behavior on abort.
  • TEEs make a narrower operational promise: only attested workloads can access keys or plaintext. That promise does not cover hardware/firmware vulnerabilities, side channels, or vendor supply-chain risks.

Regulation is nuanced. Under GDPR, pseudonymisation is still processing of personal data. Article 25 mandates data protection by design and by default, while Article 32 names pseudonymisation and encryption as example measures. The ICO's PET guidance emphasizes that PETs support data security and minimisation but are not a silver bullet: processing must still be lawful, fair, and transparent, and a case-by-case Data Protection Impact Assessment (DPIA) may still be required. U.S. sectoral regulation is similarly risk-based. For example, HIPAA references NIST controls, the FTC emphasizes understanding data flows and avoiding deceptive privacy claims, and the GLBA Safeguards Rule takes an explicitly risk-based approach.

The recurring lesson

Encrypted analytics can reduce risk, minimize processing, improve breach resilience, and enable cross-organizational sharing. It does not remove obligations around purpose limitation, transparency, data-subject rights, retention, or transfers, particularly if a controller can still decrypt outputs or re-associate results with specific individuals.

03 :: Catalog · The technique catalog

Eight families, each good at a different job. The colored spine of each card corresponds to the spectrum strip at the top of the page.

01 · Partial HE (PHE)
additive homomorphism · low functionality
Computes: Counts, sums, weighted sums, private billing, and secure aggregation; frequently embedded within broader protocols.
Caveat: No arbitrary comparisons, joins, or general SQL unless combined with other primitives.
python-paillier · LightPHE
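
To make "additive homomorphism" concrete, here is a minimal sketch of textbook Paillier written from scratch in Python. It is not the API of python-paillier or LightPHE; the function names, the demo-sized primes, and the usage figures are all assumptions chosen for illustration, and keys this small offer no security.

```python
import math
import secrets


def keygen(p: int, q: int):
    """Toy Paillier key pair from two distinct primes (demo-sized, NOT secure)."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p-1, q-1)
    mu = pow(lam, -1, n)  # valid because the generator is fixed to g = n + 1
    return n, (lam, mu)


def encrypt(n: int, m: int) -> int:
    """c = (1 + n)^m * r^n mod n^2 for a fresh random r coprime to n."""
    n_sq = n * n
    r = secrets.randbelow(n - 1) + 1
    while math.gcd(r, n) != 1:
        r = secrets.randbelow(n - 1) + 1
    return (pow(1 + n, m, n_sq) * pow(r, n, n_sq)) % n_sq


def decrypt(n: int, private_key, c: int) -> int:
    lam, mu = private_key
    u = pow(c, lam, n * n)
    return ((u - 1) // n) * mu % n


def add_encrypted(n: int, c1: int, c2: int) -> int:
    """Multiplying ciphertexts adds the underlying plaintexts."""
    return (c1 * c2) % (n * n)


def scale_encrypted(n: int, c: int, k: int) -> int:
    """Raising a ciphertext to k multiplies the plaintext by the public constant k."""
    return pow(c, k, n * n)


n, private_key = keygen(1009, 1013)          # hypothetical demo primes
usage_kwh = [52, 61, 47]                     # hypothetical private meter readings
ciphertexts = [encrypt(n, m) for m in usage_kwh]

# The server sums and prices the readings without ever decrypting them.
encrypted_total = ciphertexts[0]
for c in ciphertexts[1:]:
    encrypted_total = add_encrypted(n, encrypted_total, c)
encrypted_bill = scale_encrypted(n, encrypted_total, 3)  # 3 = public unit price

total = decrypt(n, private_key, encrypted_total)
bill = decrypt(n, private_key, encrypted_bill)
```

The sketch supports only sums and multiplication by public constants, which is exactly the narrowness the caveat describes: there is no way to compare or join on these ciphertexts.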
02 · Fully HE (FHE)
ciphertext computation · strongest server-trust reduction
Computes: Aggregation, vector operations, similarity search, low-depth arithmetic, and selected machine learning inference; multiparty variants extend it to collaboration.
Caveat: Bootstrapping remains a bottleneck; SQL joins and training are still at the research/prototype stage.
OpenFHE · SEAL · HElib · Lattigo · TFHE-rs · Concrete ML
03 · Secure MPC
distributed trust · multi-party
Computes: Aggregations, histograms, PSI, private joins, overlap-and-sum, federated analytics, and partitioned ML.
Caveat: Network- and round-bound; operationally complex; security fails if parties collude beyond the threshold.
MP-SPDZ · MPyC · MOTION · EMP · ABY3 · SecretFlow
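
The "split trust" idea behind MPC aggregation can be sketched with additive secret sharing. This is a minimal passive-security illustration, not the protocol of any framework listed above; the modulus, party count, and hospital figures are assumptions.

```python
import secrets

MODULUS = 2**61 - 1  # assumed share modulus; any modulus larger than the true sum works


def share(value: int, n_parties: int) -> list[int]:
    """Split value into n_parties random shares that sum to it modulo MODULUS."""
    shares = [secrets.randbelow(MODULUS) for _ in range(n_parties - 1)]
    shares.append((value - sum(shares)) % MODULUS)
    return shares


def reconstruct(shares: list[int]) -> int:
    return sum(shares) % MODULUS


# Hypothetical private inputs held by three organizations.
inputs = {"hospital_a": 120, "hospital_b": 75, "hospital_c": 310}

# Each owner sends one share to each of three compute parties.
all_shares = [share(value, 3) for value in inputs.values()]

# Compute party j only ever sees column j, which is uniformly random on its own.
partial_sums = [sum(column) % MODULUS for column in zip(*all_shares)]

# Only the combined partial sums reveal the aggregate.
total = reconstruct(partial_sums)
```

Any two of the three compute parties learn nothing about an individual input, but all three colluding can reconstruct every value, which is the collusion threshold the caveat refers to.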
04 · TEE / Confidential computing
attested hardware · broadest functionality
Computes: SQL, joins, arbitrary code, conventional databases, and ML training/inference, often at near-native speed (DuckDB-SGX2: <2× on TPC-H SF30).
Caveat: Bigger TCB; side-channel vulnerabilities (SGX.Fail, SEV-SNP 'Fabricked' 2026); requires …
SGX · Nitro Enclaves · SEV-SNP · TDX · Confidential Space · Gramine
05 · Searchable encryption
encrypted indexes · queryable
Computes: Equality search, keyword/document retrieval, and selected range/prefix/suffix queries over protected indexes.
Caveat: Leaks search/access patterns, frequency, result size, and structure, which leakage-abuse attacks can exploit.
MongoDB Queryable Encryption · OpenSSE · CipherSweet · Cosmian Findex · CipherStash
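
A simplified way to see both the benefit and the leakage is a deterministic keyed-hash index for equality search. The sketch below is an assumption-laden illustration and does not reproduce how any of the products above are built; the key, record IDs, and diagnoses are hypothetical.

```python
import hashlib
import hmac
from collections import Counter

INDEX_KEY = b"demo-only-index-key"  # hypothetical client-side key, never sent to the server


def blind_index(value: str) -> str:
    """Deterministic search token: equal plaintexts map to equal tokens."""
    return hmac.new(INDEX_KEY, value.lower().encode(), hashlib.sha256).hexdigest()


# The client uploads (record id, token); the diagnosis itself would be stored
# separately under ordinary randomized encryption.
rows = [("r1", "flu"), ("r2", "asthma"), ("r3", "flu"), ("r4", "flu")]
server_index = [(record_id, blind_index(diagnosis)) for record_id, diagnosis in rows]


def search(token: str) -> list[str]:
    """Server-side equality match on tokens only."""
    return [rid for rid, tag in server_index if hmac.compare_digest(tag, token)]


matches = search(blind_index("flu"))

# What the server learns without the key: how often each token repeats.
frequency_leak = Counter(tag for _, tag in server_index)
```

The server answers the query without the key, yet `frequency_leak` shows one token occurring three times. Combined with public prevalence data, that histogram is the starting point for a leakage-abuse attack.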
06 · OPE / ORE
order-revealing · range-native
Computes: Sorting, range filters, thresholding, and ORDER BY-like functionality with fast, simple indexing.
Caveat: Order is revealed by design; inference attacks can recover significant plaintext given auxiliary data.
CipherStash ore.rs · Block-ORE constructions
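
The trade described on this card can be shown with a toy order-preserving encoding: a keyed, strictly increasing lookup table over a small domain. This is not a real OPE or ORE construction (and not the ore.rs scheme); the key, the age domain, and the use of Python's non-cryptographic `random` module are assumptions made purely for illustration.

```python
import hashlib
import random


def build_ope_table(key: bytes, domain_size: int) -> list[int]:
    """Toy order-preserving encoding: a keyed, strictly increasing table."""
    seed = int.from_bytes(hashlib.sha256(key).digest(), "big")
    rng = random.Random(seed)  # NOT cryptographically secure; illustration only
    table, current = [], 0
    for _ in range(domain_size):
        current += rng.randint(1, 1000)  # random positive gap keeps the order
        table.append(current)
    return table


table = build_ope_table(b"demo-only-key", 121)  # hypothetical domain: ages 0..120
ages = [34, 71, 18, 52]                          # hypothetical plaintext column
stored = [table[age] for age in ages]            # what the server holds

# Range filter "30 <= age <= 60", evaluated by the server on encodings alone.
low, high = table[30], table[60]
hits = [i for i, code in enumerate(stored) if low <= code <= high]

# The same property leaks the full ranking of rows to anyone who sees the column.
leaked_ranking = sorted(range(len(stored)), key=stored.__getitem__)
```

The range filter needs no decryption, and for the same reason `leaked_ranking` exposes the exact ordering of the rows. With a known age distribution, an observer can map ranks back to approximate values.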
07 · Functional encryption
keys bound to functions · f(x) only
Computes: Inner products, selected linear/quadratic functions, scoring, and machine learning components; the analyst learns f(x) and nothing else about the underlying data.
Caveat: Limited functionality, minimal tooling, and a strong key authority is required; promising but not yet mature for enterprise use.
Fentec libraries · research prototypes
08 · DP + encrypted execution
output control · pairs with any PET
Computes: Not a computation technique; it limits what released statistics or models can disclose about any individual. Works well with secure aggregation.
Caveat: Privacy accounting, contribution bounding, sampling assumptions, and utility loss are all hard to get right.
OpenDP · Google secure aggregation · AWS Clean Rooms DP · Prio/DAP
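
Output control can be sketched with the Laplace mechanism on a bounded sum. This is a minimal illustration, assuming each person contributes exactly one value and that neighboring datasets differ by adding or removing one person; the function names and spend figures are hypothetical, and a production release should rely on a vetted library such as OpenDP rather than floating-point noise like this.

```python
import random


def laplace_noise(scale: float, rng: random.Random) -> float:
    """Laplace(0, scale) sampled as the difference of two exponential draws."""
    rate = 1.0 / scale
    return rng.expovariate(rate) - rng.expovariate(rate)


def dp_bounded_sum(values, lower, upper, epsilon, rng):
    """Epsilon-DP sum: clamp each contribution, then add calibrated noise."""
    clamped = [min(max(v, lower), upper) for v in values]   # contribution bounding
    sensitivity = max(abs(lower), abs(upper))               # one person's max effect
    return sum(clamped) + laplace_noise(sensitivity / epsilon, rng)


rng = random.Random()        # illustration only; not a cryptographic RNG
spend = [10, 20, 500]        # hypothetical per-person values; 500 is an outlier
noisy_total = dp_bounded_sum(spend, lower=0, upper=100, epsilon=1.0, rng=rng)
```

Clamping caps the outlier at 100 before noise is added, so the released value is centered on 130 rather than 530. That bias is part of the utility loss the caveat mentions, and every additional release spends more of the privacy budget.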

04 :: Tradeoffs · Comparative tradeoffs

A qualitative synthesis. "Security level" is not a concrete benchmark; it indicates how much trust is removed from the execution environment when the stated assumptions hold.

Technique | Security level | Supported analytics | Performance | Best fit | Primary caveat
Partial HE | High for narrow arithmetic | Counts, sums, weighted sums, aggregation | High (no bootstrapping) | Simple outsourced arithmetic | Functionality too narrow for rich queries
Full HE | Very high trust reduction | Aggregation, vector ops, similarity, ML inference | Low–medium; often slowest | Single-owner outsourced compute | Ciphertext blow-up, slow bootstrapping, limited SQL/training
MPC | Very high within collusion thresholds | Aggregation, joins, PSI/PJC, partitioned ML | Medium; network/round-bound | Cross-org collaboration, no trusted hardware | Operational complexity & collusion assumptions
TEE / confidential computing | High if HW/firmware/attestation hold | Broadest: SQL, joins, arbitrary code, ML | High; often closest to native | Lift-and-shift confidential analytics | Side channels, larger TCB, HW vulnerabilities
Searchable encryption | Medium–high, leakage-prone | Equality, keyword, some range/prefix/suffix | High | Queryable encrypted databases | Search/access/frequency leakage
OPE / ORE | Low–medium (order leaks) | Sorting, range filters, thresholding | Very high | Fast range search when leakage acceptable | Inference attacks recover plaintext structure
Functional encryption | High for supported functions | Inner products, selected linear/quadratic | Medium for narrow tasks | Fine-grained delegated analytics | Narrow functionality, low ecosystem maturity
DP + encrypted execution | High vs output inference | Aggregates, telemetry, federated learning | High for DP step | Safe result release after protected compute | Utility/privacy tradeoff & budget accounting
Hybrid | Potentially strongest overall | Broadest practical coverage | Medium–high if well partitioned | Real-world enterprise deployments | Compositional proofs & operating complexity


05 :: Hybrid · Hybrid architectures

Hybrids are becoming the standard in production because they match each technique to the task it handles best. SecretFlow integrates MPC, HE, and TEE into a single framework; Duality's AWS work adds Nitro Enclaves to its existing FHE, federated learning, and DP techniques; Decentriq combines Azure confidential computing with other privacy technologies, such as DP. A well-partitioned stack looks like this:

  • Searchable: narrow lookup
  • TEE: general SQL / serving
  • MPC: cross-party joins
  • FHE/PHE: sensitive arithmetic
  • DP: anything released

The value is clear: a hybrid often beats any individual primitive on the joint goal of security, functionality, and cost. The drawback is just as clear: security proofs become compositional rather than monolithic, and each added layer raises operational complexity by introducing new assumptions, observability requirements, and failure modes.

06 :: Deployments · Deployments and vendor landscape

Confidential-computing and clean-room deployments show the most production maturity today. For encrypted analytics with broad functionality, the current leaders in practice are TEE-centric and hybrid architectures.

Confidential computing & clean rooms

Google Confidential Space (and Google Ads confidential matching), Azure Confidential Computing with Decentriq clean rooms, and AWS Nitro Enclaves with Duality (including cross-border cancer research).

Pure cryptography (FHE/MPC)

IBM HElayers (an Intesa Sanpaolo digital-transaction deployment), Duality for healthcare/finance/government, Zama (TFHE-rs, Concrete, Concrete ML), and Inpher for MPC/HE/federated learning.

Queryable encrypted databases

MongoDB Queryable Encryption (the most prominent mainstream example), plus CipherSweet, OpenSSE, Cosmian Findex, and CipherStash for application-level searchable encryption.

Private aggregation & telemetry

Prio and descendants: Mozilla's Prio-based DAP in Firefox and Divvi Up (Prio3), plus Google federated learning with secure aggregation and AWS Clean Rooms Differential Privacy.

In terms of cryptography, the market is real but highly selective. MongoDB Queryable Encryption stands out as a prime example of a queryable database, with support for equality and range queries in production, while prefix/suffix/substring queries are still in public preview in version 8.2 (expected General Availability in 2026). MongoDB also highlights the tangible costs associated with queryability, including additional storage requirements, impacts on query performance, and reduced observability due to the redaction of logs in encrypted collections.

07 :: Signals · 2026 signals

What's moving right now

  • Confidential computing went to the GPU. NVIDIA's Confidential Computing on Hopper and Blackwell GPUs enables practical TEE-based private AI inference and training with encrypted VRAM, attestable alongside a CPU TEE. This is a significant step for the family with the broadest functionality.
  • TEE risk kept evolving. The 2024 SGX.Fail and the 2026 'Fabricked' SEV-SNP incidents show that TEE security must be reassessed continually as hardware, attestation, and patch guidance change.
  • FHE commercialized and accelerated. Zama became the first FHE unicorn in 2025 and is targeting 500–1,000 TPS with GPU acceleration. The FHE Benchmarking Suite has become a widely accepted way to evaluate latency, throughput, memory usage, storage expansion, communication, and accuracy degradation.
  • Queryable encryption broadened. The MongoDB QE team introduced production range queries and transitioned prefix/suffix/substring into public preview (8.2), aiming to achieve general availability by 2026 and bridging the divide between 'encrypted at rest' and 'queryable in use.'
  • Private telemetry scaled. Deployments of Prio/DAP (Firefox, Divvi Up) demonstrate that encrypted analytics goes beyond just databases and model serving to include secure measurement at a large scale within populations.

08 :: Choose · Selection criteria & deployment checklist

The first decision factor is neither the vendor nor the algorithm; it is the trust boundary you are trying to move. The second is workload shape. A concise decision rule: choose the least powerful tool that effectively neutralizes the threat you are concerned with, and combine two or more primitives rather than forcing one to do everything.

  • Cloud operator untrusted, hardware root of trust accepted, broad existing software needed → TEEs.
  • Multiple organizations, no single operator may see data → MPC / PJC.
  • One owner outsourcing computation, server fully untrusted → FHE / PHE.
  • Mostly equality/range retrieval in a database → searchable / queryable encryption.
  • Sensitive outputs, not just inputs → add differential privacy.
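
The decision rule above can be written down as a small lookup, which is useful as a checklist aid rather than an automated selector. The field names and the `shortlist` function are hypothetical; the mapping simply mirrors the five bullets.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Workload:
    """Hypothetical flags, one per bullet in the decision rule."""
    operator_untrusted_hardware_trusted: bool = False
    multi_org_no_single_operator: bool = False
    single_owner_untrusted_server: bool = False
    mostly_equality_or_range_lookup: bool = False
    sensitive_outputs: bool = False


def shortlist(workload: Workload) -> list[str]:
    """Return candidate techniques; more than one result suggests a hybrid."""
    candidates = []
    if workload.operator_untrusted_hardware_trusted:
        candidates.append("TEE")
    if workload.multi_org_no_single_operator:
        candidates.append("MPC / PJC")
    if workload.single_owner_untrusted_server:
        candidates.append("FHE / PHE")
    if workload.mostly_equality_or_range_lookup:
        candidates.append("searchable / queryable encryption")
    if workload.sensitive_outputs:
        candidates.append("differential privacy")  # added on top, never alone
    return candidates


# Example: two organizations measuring overlap and publishing an aggregate.
joint_campaign = Workload(multi_org_no_single_operator=True, sensitive_outputs=True)
plan = shortlist(joint_campaign)
```

When the shortlist has several entries, that is a signal to partition the workload across techniques instead of stretching a single primitive.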

Stage-gate deployment checklist

Stage | What to do | Pass condition
Problem framing | Classify data, outputs, parties, and exact operators | You know if it's aggregation, search, join, inference, or training
Threat model | Write down adversaries, collusion assumptions, unacceptable leakages | Named threat model approved by security/legal
Technique shortlist | Map workload to 2–3 candidate architectures | ≥1 cryptographic and ≥1 operationally efficient option considered
Key & identity design | Define key custody, attestation flow, or share-holder governance | Keys/shares are never ad hoc
Prototype | Benchmark on representative data at realistic security levels | Meets p95 latency, throughput, and cost guardrails
Leakage review | Document observable metadata, patterns, or outputs | Explicit acceptance or rejection of the leakage profile
Release controls | Add DP, quotas, or query governance if outputs leave the trust boundary | Output policy defined and testable
Red-team & compliance | Test side channels, patching, logging, legal claims | Findings resolved before rollout


09 :: Metrics · What to actually measure

The benchmarking program needs to be explicit. The FHE Benchmarking Suite is a solid model: it covers latency, throughput, memory, storage expansion, communication complexity, and quality loss. Extend it per technique:

  • TEE-based SQL: attestation time, EPC/enclave paging behavior, cache-miss amplification, and end-to-end overhead on realistic OLAP workloads.
  • Searchable encryption: index size, query selectivity, token-generation cost, and a well-documented leakage profile.
  • DP-based releases: epsilon, delta, contribution bounding, privacy-budget depletion rate, and utility loss.

An effective cross-technique suite should consist of a minimum of five workload families: aggregations on wide tables, private join / PSI-plus-sum on skewed identifiers, search with equality and range predicates, SQL analytics on a TPC-H-like subset with one or two joins, and ML with one classical and one compact neural model. Each workload should be evaluated based on metrics such as p50/p95 latency, throughput, ciphertext/share expansion, network bytes, RAM/VRAM usage, accuracy degradation, deployment time, and operator effort. If a solution requires custom parameter adjustments or specialized circuits that your team cannot support, consider this a significant cost factor, not just a minor detail.
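
A minimal sketch of how one workload run might be summarized against a guardrail follows. The function names, the nearest-rank percentile choice, and all sample numbers are assumptions for illustration, not measurements from any system discussed here.

```python
def percentile(samples: list[float], pct: int) -> float:
    """Nearest-rank percentile for an integer pct in 1..100."""
    ordered = sorted(samples)
    rank = max(1, (pct * len(ordered) + 99) // 100)  # integer ceil(pct * n / 100)
    return ordered[rank - 1]


def summarize(latencies_ms, plaintext_bytes, protected_bytes, p95_guardrail_ms):
    """Collapse one benchmark run into the headline numbers for a stage gate."""
    p95 = percentile(latencies_ms, 95)
    return {
        "p50_ms": percentile(latencies_ms, 50),
        "p95_ms": p95,
        "expansion": protected_bytes / plaintext_bytes,  # ciphertext/share blow-up
        "passes_p95": p95 <= p95_guardrail_ms,
    }


# Hypothetical run: ten query latencies and storage sizes before/after protection.
latencies = [12, 15, 11, 40, 13, 14, 16, 12, 13, 90]
report = summarize(latencies, plaintext_bytes=1_000, protected_bytes=32_000,
                   p95_guardrail_ms=100)
```

Reporting p95 next to p50 matters here because the tail (the 90 ms query in this made-up run) is where paging, bootstrapping, or network rounds tend to show up.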

10 :: Limits · Open questions and limitations

Parts of this field move fast enough that a static guide cannot stay accurate for long. General-purpose FHE for SQL and large-model training has improved, but the most reliable evidence still favors selective inference and limited analytics over fully encrypted datastores. Searchable-encryption leakage remains an open design debate: 'acceptable leakage' is hard to quantify, and vendor and academic perspectives often clash. TEE risk is not stable, as recent SGX and SEV-SNP results show. And functional encryption still has less production evidence than the other families, so focused pilot projects are likely a better bet than large-scale commitments unless the function family is an exceptionally good fit.

11 :: FAQ · Frequently asked questions

What is encrypted data analytics?
It is the use of privacy-enhancing technologies such as homomorphic encryption, MPC, searchable/structured encryption, private set operations, functional encryption, and trusted execution environments to compute over sensitive data while minimizing ordinary plaintext exposure. Plain encryption at rest or in transit does not provide the same protection, because the analytic engine still processes plaintext in normal memory.
Which PET should I choose for analytics?
Start from the threat model: TEEs for fast analytics with a trusted hardware root; MPC for collaborative analysis across multiple organizations without a trusted operator; FHE for outsourced computation on an untrusted server; searchable/queryable encryption for specific search needs; OPE/ORE for quick range searches where order leakage is acceptable; functional encryption for delegated functions; and DP for aggregate outputs that leave the trusted boundary. Choose the least powerful tool that still addresses the relevant threat.
Does encrypted analytics make data exempt from GDPR or HIPAA?
No. Pseudonymisation is still processing of personal data under GDPR, and PETs are additional safeguards, not a complete solution. Article 25 mandates data protection by design, while Article 32 mentions pseudonymisation and encryption as possible measures. HIPAA, the FTC, and the GLBA Safeguards Rule follow a risk-based approach. Encrypted analytics can help reduce risks and facilitate data sharing, but it does not eliminate requirements related to purpose limitation, transparency, data-subject rights, or output governance.
Is FHE practical for analytics in 2026?
For selected workloads, yes. The most mature applications are selected aggregation, vector operations, similarity search, low-depth arithmetic, and machine learning inference, rather than general OLAP. Broad SQL, joins, and large-model training over fully homomorphic encryption are still mainly at the research or prototype stage. Bootstrapping remains a major bottleneck, so the most effective uses are single-owner outsourced computation and hybrid pipelines in which FHE protects only the most sensitive steps.
What's the risk with searchable and order-preserving encryption?
Both trade leakage for speed. Efficient searchable encryption permits leakage such as search patterns, frequency, result size, and structure, which leakage-abuse attacks can exploit. OPE/ORE disclose order by design, which can enable inference attacks when auxiliary distribution data is available. Both can be suitable for equality, range, and search operations, but only when the leakage profile is clearly defined, accepted, and constrained by access controls.
What is the most common production pattern?
Hybridization: a confidential-computing layer for overall execution, cryptographic sub-protocols (MPC/FHE) for highly sensitive steps, searchable encryption for limited query capabilities, and differential privacy for releasing results. Frameworks and products such as SecretFlow, Duality on AWS Nitro Enclaves, Decentriq on Azure, MongoDB Queryable Encryption, AWS Clean Rooms DP, and Prio/DAP systems embody these principles.

12 :: Sources · Primary sources

Derived from industry standards, regulatory requirements, vendor documentation, peer-reviewed studies, and upcoming system developments from 2024 to 202