Identity resolution—what Salesforce now markets as harmonization—is the function that turns a pile of source records into a unified profile graph. It is the headline value proposition of Data Cloud and, simultaneously, the single largest source of unexpected consumption costs that customers encounter in the first 12 months of a Data Cloud deployment.
The proposal narrative is straightforward: Data Cloud ingests records from your source systems, runs match rules across them, and produces a unified profile that downstream applications consume. The credit consumption mechanics behind that narrative are anything but straightforward. Harmonization is metered on a combination of records processed, match rules evaluated, and the frequency at which the unification job runs. Each of those dimensions can be aggressively over-specified by an enthusiastic implementation team, with little visibility into the financial consequences until the first quarterly true-up.
How harmonization actually consumes credits
Data Cloud harmonization is built on a small set of primitives. Source records are ingested into a Data Lake Object. Match rules, written either as deterministic key matches or as fuzzy probabilistic rules using attributes like name, email, phone, and address, are run against the ingested data. The output is a Unified Individual or Unified Account record, depending on the entity model. Each step in this pipeline accrues credits, and the credits accrue at different rates.
| Step | Credit basis | Typical share of harmonization spend |
|---|---|---|
| Ingestion | Per row processed | 15-20% |
| Match rule evaluation | Per rule per row | 35-45% |
| Profile materialization | Per unified profile created/updated | 20-30% |
| Activation prep | Per profile segment | 10-15% |
The match rule evaluation step is the dominant cost in nearly every customer environment. The cost is roughly linear in the number of match rules times the number of rows evaluated, with a multiplier for the complexity of the rule. Fuzzy probabilistic rules cost meaningfully more than exact deterministic rules.
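The linear relationship described above can be sketched as a small estimator. This is an illustration under stated assumptions: `MatchRule`, `match_evaluation_credits`, the base rate of 0.001 credits per rule per row, and the complexity multipliers are all hypothetical and are not published Salesforce rates.

```python
from dataclasses import dataclass


@dataclass
class MatchRule:
    name: str
    complexity: float  # hypothetical multiplier: 1.0 for exact, higher for fuzzy


def match_evaluation_credits(rows: int, rules: list[MatchRule],
                             credits_per_rule_row: float) -> float:
    """Estimate credits as rows x base rate x sum of per-rule multipliers."""
    return rows * credits_per_rule_row * sum(rule.complexity for rule in rules)


rules = [
    MatchRule("exact_email", 1.0),
    MatchRule("exact_customer_id", 1.0),
    MatchRule("fuzzy_name_address", 4.0),  # assumed multiplier for a fuzzy rule
]

# 1,000,000 rows at an assumed 0.001 credits per rule per row
estimate = match_evaluation_credits(1_000_000, rules, 0.001)
print(f"Estimated match-evaluation credits: {estimate:,.0f}")
```

In this toy model the single fuzzy rule accounts for two thirds of the estimate, which is why rule count and rule type are the first levers to examine.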
The four sources of harmonization overspend
1. Too many match rules
Implementation partners frequently configure Data Cloud with 12 to 25 match rules across an entity, because Salesforce's documentation encourages broad rule coverage. In production, fewer than half of those rules typically contribute meaningful matches; the rest fire constantly and consume credits without changing the unified profile. A periodic rule effectiveness review—measuring matches contributed per rule per period—typically allows pruning 30-50% of rules with no loss of match quality.
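A rule effectiveness review of this kind can start from a simple table of matches contributed per rule. The sketch below is one possible approach, not a Data Cloud feature: the function name, the 2% share threshold, and the sample counts are assumptions, and the match counts would have to come from whatever reporting your environment exposes.

```python
def prune_candidates(matches_by_rule: dict[str, int],
                     min_share: float = 0.02) -> list[str]:
    """Return rules whose share of all contributed matches is below min_share."""
    total = sum(matches_by_rule.values())
    if total == 0:
        # No rule contributed anything, so every rule is a candidate.
        return sorted(matches_by_rule)
    return sorted(name for name, count in matches_by_rule.items()
                  if count / total < min_share)


# Hypothetical matches contributed per rule over one review period
period_matches = {
    "exact_email": 41_000,
    "exact_customer_id": 38_500,
    "fuzzy_name_address": 6_200,
    "fuzzy_phone_partial_email": 310,
    "exact_loyalty_id": 0,
}

candidates = prune_candidates(period_matches)
print(candidates)
```

A rule flagged here is only a candidate: a low-volume rule may still be the sole source of matches for an important segment, so the list is a starting point for review rather than an automatic deletion list.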
2. Unification cadence set higher than the business need
Unification jobs can be configured to run on schedules ranging from on-demand to every fifteen minutes. The default in many implementations is hourly. For most customers, daily unification is operationally sufficient; the use cases that genuinely require sub-hour identity resolution are rare and well-bounded. Stepping the cadence from hourly to daily reduces the number of runs from 24 per day to one and cuts unification credit consumption by approximately 90% without affecting most downstream use cases.
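The arithmetic behind a cadence change is easy to check. The helper below is a hypothetical sketch that counts runs per day for a given interval. Run count is only a proxy for credits: a daily run typically processes a larger batch than each hourly run, so the credit saving can be smaller than the drop in run count.

```python
from datetime import timedelta


def runs_per_day(interval: timedelta) -> float:
    """Number of scheduled runs in 24 hours for a fixed interval."""
    return timedelta(days=1) / interval


hourly = runs_per_day(timedelta(hours=1))
daily = runs_per_day(timedelta(days=1))
quarter_hourly = runs_per_day(timedelta(minutes=15))

# Fractional reduction in run count when moving from hourly to daily
run_reduction = 1 - daily / hourly
print(f"hourly={hourly:.0f}, daily={daily:.0f}, reduction={run_reduction:.1%}")
```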
3. Full rebuilds when incremental would do
Data Cloud supports both full rebuilds and incremental unification. Full rebuilds are required occasionally (after a major rule change, for instance), but in steady state, incremental is the right pattern. Many implementations default to full rebuilds because they are simpler to reason about. The credit consumption difference is approximately 8-15x in favor of incremental.
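One way to see where a multiple of that size comes from is to compare rows touched per run. The sketch below is a deliberately simplified model, assuming credits scale with rows processed and ignoring any fixed per-run overhead; the function name, the 5 million row count, and the 10% change rate are illustrative assumptions.

```python
def rows_processed(total_rows: int, changed_fraction: float,
                   incremental: bool) -> int:
    """Rows touched by one run: only changed rows if incremental, else all rows."""
    if incremental:
        return round(total_rows * changed_fraction)
    return total_rows


total = 5_000_000
full = rows_processed(total, changed_fraction=0.10, incremental=False)
incr = rows_processed(total, changed_fraction=0.10, incremental=True)

ratio = full / incr
print(f"full={full:,} incremental={incr:,} ratio={ratio:.0f}x")
```

Under this model the multiple is simply the inverse of the change rate, so a source where roughly 7-12% of rows change between runs would land in the 8-15x range quoted above.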
4. Stale source connections
Source system connections frequently outlive their usefulness. A connection set up for a one-time data migration project, never decommissioned, can quietly continue to ingest records and trigger unification credits months or years later. An audit of active connections is a high-yield, low-effort optimization in nearly every customer environment.
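A connection audit can be run from a plain inventory export. The sketch below assumes a hypothetical `SourceConnection` record with an owner and a last sign-off date, and a 90-day review window; none of these names correspond to a Data Cloud API.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class SourceConnection:
    name: str
    owner: Optional[str]           # business owner, if anyone has signed off
    last_reviewed: Optional[date]  # date of the most recent owner sign-off


def flag_for_review(connections, today, max_age_days=90):
    """Return (name, reason) for connections with no owner or a stale sign-off."""
    flagged = []
    for conn in connections:
        if not conn.owner:
            flagged.append((conn.name, "no business owner"))
        elif (conn.last_reviewed is None
              or (today - conn.last_reviewed).days > max_age_days):
            flagged.append((conn.name, "sign-off overdue"))
    return flagged


# Hypothetical inventory
inventory = [
    SourceConnection("crm_core", "sales-ops", date(2024, 5, 1)),
    SourceConnection("migration_2022_oneoff", None, None),
    SourceConnection("web_events", "marketing", date(2023, 11, 15)),
]

flagged = flag_for_review(inventory, today=date(2024, 6, 1))
for name, reason in flagged:
    print(f"{name}: {reason}")
```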
How to negotiate the contract
Cap the unit price, not the total credits
Salesforce typically negotiates Data Cloud as a committed credit pool—X million credits per year for Y dollars. The headline negotiation is on the dollars-per-credit unit price. That negotiation is necessary but not sufficient. Equally important is capping the unit price for the next renewal term, in dollars per credit, with a maximum permitted uplift in writing. Salesforce can and does adjust credit unit pricing at renewal; the order form should foreclose that path.
Negotiate a true-down right
The most consistent surprise in Data Cloud first renewals is the realization that the committed credit pool was significantly over-sized. Customers commit to capacity for an aspirational ingestion volume, ingest less than projected, and arrive at renewal with a credit pool that is 30-50% larger than they need. A true-down right—the contractual ability to reduce the committed quantity at the anniversary based on actual usage—is the most valuable single clause to negotiate. Salesforce will resist it; customers who tie the true-down to a multi-year commitment and a credible competitive baseline get a version of it granted in roughly half of recent deals.
Decouple harmonization from activation
Data Cloud's pricing model bundles harmonization and activation into the same credit pool, which obscures the unit economics of each. Insist that the order form specify, by line item, the expected credit consumption for harmonization versus activation. That visibility allows you to optimize each independently and to negotiate each independently at the next renewal.
What good looks like operationally
The Data Cloud deployments with the healthiest harmonization economics share a small set of operational practices. A monthly rule effectiveness review measures matches contributed per rule and prunes rules that do not earn their credit consumption. The unification cadence is set deliberately per entity, with documented justification for any cadence faster than daily. Incremental unification is the default; full rebuilds are scheduled and approved events, not steady-state behavior. Source connections are inventoried quarterly, with explicit owner sign-off on each active connection's business value.
None of these practices is technically demanding. They are governance practices that turn out to have meaningful financial consequences in a consumption-priced product. The customers we advise who treat Data Cloud harmonization with the same operational discipline they apply to other consumption-priced infrastructure (cloud compute, observability tooling, search infrastructure) consistently achieve cost reductions of 30-50%. The customers who treat it as a CRM module and leave the defaults in place do not.
Benchmark data
For a mid-market enterprise with approximately 5 million unified individuals and a mature multi-source data environment, the median negotiated annual Data Cloud spend falls between $250,000 and $420,000. Harmonization typically represents $90,000-$220,000 of that spend. Top-quartile buyers, who instrumented harmonization usage and renegotiated with usage data in hand, sit in the $55,000-$120,000 range for harmonization at the same scope. The difference is not contract terms alone; it is contract terms plus the operational discipline described above.
Where to begin
If you have a Data Cloud implementation in production today, the most immediately useful step is instrumentation. Stand up a credit consumption dashboard segmented by ingestion, harmonization (identity resolution), and activation. Pull the rule effectiveness data and rank match rules by matches contributed per credit consumed. Audit the active source connections and flag any without a clear business owner. Those three artifacts, taken to the next renewal conversation, are worth more in negotiating leverage than any pure commercial argument. The 34% average reduction we see across consumption-priced Salesforce contracts is built on usage data, not on persuasion.
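The first two artifacts reduce to two small aggregations over exported usage data. The sketch below assumes a hypothetical export of `(category, credits)` rows and a per-rule table of `(matches, credits)`; the category labels, rule names, and figures are placeholders, not a Data Cloud schema.

```python
from collections import defaultdict


def credits_by_category(usage_rows):
    """Sum credits per category from (category, credits) rows."""
    totals = defaultdict(float)
    for category, credits in usage_rows:
        totals[category] += credits
    return dict(totals)


def rank_rules(rule_stats):
    """Rank rules by matches per credit, best first.

    rule_stats maps rule name -> (matches contributed, credits consumed).
    """
    scored = {name: (matches / credits if credits else 0.0)
              for name, (matches, credits) in rule_stats.items()}
    return sorted(scored.items(), key=lambda item: item[1], reverse=True)


# Hypothetical usage export
usage = [
    ("ingestion", 1200.0),
    ("harmonization", 3100.0),
    ("ingestion", 300.0),
    ("activation", 900.0),
]
totals = credits_by_category(usage)

# Hypothetical per-rule statistics for one period
rule_stats = {
    "exact_email": (41_000, 500.0),
    "fuzzy_name_address": (6_200, 2000.0),
    "fuzzy_phone_partial_email": (310, 1800.0),
}
ranking = rank_rules(rule_stats)

print(totals)
for name, matches_per_credit in ranking:
    print(f"{name}: {matches_per_credit:.2f} matches per credit")
```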
The deterministic-vs-probabilistic match question
One of the most consequential decisions in a Data Cloud harmonization configuration is the mix of deterministic and probabilistic match rules. Deterministic rules match on exact key fields—email address, customer ID, phone number. Probabilistic rules match on fuzzy attributes—name plus address, phone plus partial email, behavioral fingerprints. Each rule type has different cost characteristics and different operational implications.
Deterministic rules are cheap to evaluate and produce high-confidence matches but cover only the population where the deterministic keys overlap across source systems. Probabilistic rules are expensive to evaluate—often 3-5x the credit cost of deterministic rules—but cover the long tail of records that lack matching deterministic keys. The optimal mix depends on the underlying data quality and the operational requirement for match coverage.
Many implementations default to running probabilistic rules across the entire dataset, which is operationally tidy but expensive. A tiered approach—deterministic rules across the entire dataset, probabilistic rules only against the residual that did not match deterministically—typically reduces total harmonization credit consumption by 25-40% with no loss in match coverage. The tiering is a configuration choice in Data Cloud, not a contract negotiation, but the savings show up in the credit pool consumption, which compounds into a smaller required commitment at the next renewal.
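The tiering logic itself is simple to illustrate outside Data Cloud. The sketch below is a generic Python illustration, not Data Cloud configuration: it uses exact email equality as the deterministic rule and `difflib.SequenceMatcher` on name plus address as a stand-in for a fuzzy rule. The record layout, the 0.85 threshold, and all function names are assumptions.

```python
from collections import defaultdict
from difflib import SequenceMatcher
from itertools import combinations


def deterministic_pairs(records):
    """Pair records that share an exact, normalized email address."""
    by_email = defaultdict(list)
    for rec in records:
        if rec["email"]:
            by_email[rec["email"].strip().lower()].append(rec["id"])
    pairs = set()
    for ids in by_email.values():
        pairs.update(combinations(sorted(ids), 2))
    return pairs


def _fuzzy_key(rec):
    return f'{rec["name"]} {rec["address"]}'.lower()


def fuzzy_pairs(candidates, pool, threshold=0.85):
    """Compare each candidate against the pool (the expensive pass)."""
    pairs = set()
    for a in candidates:
        for b in pool:
            if a["id"] == b["id"]:
                continue
            score = SequenceMatcher(None, _fuzzy_key(a), _fuzzy_key(b)).ratio()
            if score >= threshold:
                pairs.add(tuple(sorted((a["id"], b["id"]))))
    return pairs


def tiered_match(records):
    """Run the exact rule on everything, the fuzzy rule only on the residual."""
    exact = deterministic_pairs(records)
    matched_ids = {rec_id for pair in exact for rec_id in pair}
    residual = [r for r in records if r["id"] not in matched_ids]
    return exact | fuzzy_pairs(residual, records), len(residual)


records = [
    {"id": 1, "email": "ana@example.com", "name": "Ana Ruiz", "address": "12 Elm St"},
    {"id": 2, "email": "ANA@example.com ", "name": "A. Ruiz", "address": "12 Elm Street"},
    {"id": 3, "email": None, "name": "Jon Smith", "address": "9 Oak Ave"},
    {"id": 4, "email": "jsmith@example.com", "name": "Jon Smith", "address": "9 Oak Ave"},
    {"id": 5, "email": None, "name": "Mei Chen", "address": "4 Pine Rd"},
]

pairs, residual_count = tiered_match(records)
print(pairs, residual_count)
```

Here only three of the five records reach the fuzzy pass. On real data the saving depends on how much of the population the deterministic keys already cover.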
The entity-graph design question
Data Cloud supports unified profile construction at the Individual entity level and at the Account entity level, with the option to model the relationship between them. The richness of the entity graph design has cost consequences. A customer who models Individuals, Accounts, the Account-to-Individual relationship, and the Account hierarchy creates more harmonization work than a customer who models Individuals alone.
The richer model is usually the operationally correct choice for B2B use cases, but the cost should be planned for. Each entity in the unified profile model has its own harmonization run, its own match rule set, and its own credit consumption profile. The Data Cloud capacity sizing should reflect the full entity-graph scope, not just the headline Individual count.
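A sizing worksheet that follows this advice sums harmonization work over every modeled entity rather than over Individuals alone. The sketch below uses hypothetical row counts and rule weights, and expresses work in abstract units (rows times total rule weight) rather than actual credits.

```python
def harmonization_units(entities: dict[str, dict[str, int]]) -> dict[str, int]:
    """Work units per entity: rows x combined weight of that entity's match rules."""
    return {name: spec["rows"] * spec["rule_weight"]
            for name, spec in entities.items()}


# Hypothetical entity graph for a B2B deployment
entity_graph = {
    "Individual": {"rows": 5_000_000, "rule_weight": 6},
    "Account": {"rows": 400_000, "rule_weight": 3},
    "AccountContactRelationship": {"rows": 5_500_000, "rule_weight": 1},
}

units = harmonization_units(entity_graph)
total_units = sum(units.values())
individual_share = units["Individual"] / total_units

print(f"total={total_units:,} individual share={individual_share:.0%}")
```

In this example, sizing on the Individual entity alone would understate the harmonization workload by roughly 18%.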
The third-party data ingestion pattern
Many Data Cloud deployments ingest data from third-party sources—data appends, intent data, firmographic enrichment, behavioral signal vendors—in addition to first-party source systems. Third-party data ingestion has two cost implications. First, the volume of records ingested is typically much larger than the first-party volume; for some intent-data vendors, the daily ingestion volume exceeds the customer's entire first-party record count. Second, the harmonization of third-party data against the first-party profile graph is typically run on a more aggressive cadence to maintain freshness, which compounds the credit consumption.
Customers planning to ingest meaningful third-party data should size the Data Cloud capacity against the combined volume, not just the first-party volume. The cost shadow of third-party ingestion is one of the most consistent surprises in second-year Data Cloud bills.
The cross-region replication question
Data Cloud tenants increasingly need to support multi-region operations—a North American tenant, a European tenant, an Asia-Pacific tenant—with selective data replication between them. The cross-region replication has its own credit consumption pattern and, in some configurations, its own networking cost.
For customers with multi-region requirements, the right contract structure is usually a single Data Cloud capacity pool with explicit allocation rules for each regional tenant, rather than separate per-region pools that may not flex against each other. The fungibility is meaningful: regional volumes vary, and the ability to shift capacity from a quieter region to a busier one without an order-form amendment is a real operational benefit.
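The value of fungibility shows up as soon as one region runs over its allocation while another runs under. The sketch below compares overage under separate and pooled capacity; the region names, allocations, and usage figures are hypothetical, in arbitrary credit units.

```python
def overage(allocations: dict[str, int], usage: dict[str, int],
            pooled: bool) -> int:
    """Credits consumed beyond capacity, with or without cross-region pooling."""
    if pooled:
        return max(0, sum(usage.values()) - sum(allocations.values()))
    return sum(max(0, usage[region] - allocations[region]) for region in usage)


allocations = {"NA": 500, "EU": 300, "APAC": 200}
usage = {"NA": 420, "EU": 360, "APAC": 190}

separate_overage = overage(allocations, usage, pooled=False)
pooled_overage = overage(allocations, usage, pooled=True)

print(f"separate pools: {separate_overage}, single pool: {pooled_overage}")
```

With separate pools the European tenant is 60 units over even though the customer as a whole is under its total commitment; with a single pool there is no overage.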
The activation-target mix
Harmonization credit consumption is also influenced by the activation-target mix on the downstream side. Each activation target has its own data freshness requirement, which interacts with the harmonization cadence. A Marketing Cloud journey may need profile data refreshed nightly; a real-time scoring API may need it refreshed in minutes; a paid-media destination may need it refreshed weekly. The harmonization cadence should be set per use case against that use case's own requirement, not globally against the tightest one; using a single tight cadence for all use cases wastes credits on use cases that do not need the freshness.
A common pattern in mature deployments is to run two harmonization cadences in parallel: a slow cadence (daily or weekly) for the bulk profile, and a fast cadence (minutes or seconds) for a defined subset of profile attributes that genuinely need real-time freshness. The bifurcated approach can reduce total harmonization credit consumption by 40-50% versus a single fast cadence, while preserving the freshness for the use cases that actually require it.
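A toy model makes the trade-off concrete. The sketch below assumes harmonization work scales with the share of profile work multiplied by runs per day, with a fast cadence of every fifteen minutes and a slow cadence of once a day; the function name and the 50% fast share are illustrative assumptions, chosen only to show the shape of the calculation.

```python
def daily_work(fast_share: int, fast_runs: int, slow_runs: int,
               total_share: int = 100) -> int:
    """Work units per day: fast subset at fast cadence, remainder at slow cadence."""
    slow_share = total_share - fast_share
    return fast_share * fast_runs + slow_share * slow_runs


FAST_RUNS = 96  # every fifteen minutes
SLOW_RUNS = 1   # once a day

# Everything on the fast cadence versus half of the work moved to daily
single_fast = daily_work(fast_share=100, fast_runs=FAST_RUNS, slow_runs=SLOW_RUNS)
bifurcated = daily_work(fast_share=50, fast_runs=FAST_RUNS, slow_runs=SLOW_RUNS)

saving = 1 - bifurcated / single_fast
print(f"single={single_fast} bifurcated={bifurcated} saving={saving:.1%}")
```

In this model the saving is driven almost entirely by how much work stays on the fast cadence, so the first design question is how narrowly the real-time attribute subset can be defined.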
The maturity curve
The Data Cloud deployments with the healthiest harmonization economics follow a recognizable maturity curve. In Year 1, the focus is on getting the deployment operational, with sub-optimal but functional configuration choices. In Year 2, instrumentation matures and the optimization patterns described above start to be applied. In Year 3, the deployment runs at near-optimal harmonization cost, and the negotiation conversation shifts from "how much capacity do we need" to "how do we use the renewal to lock the unit economics we have achieved." Customers who plan the contract structure to accommodate this curve—shorter initial term, true-down rights, explicit measurement obligations—land much better than customers who lock long-term commitments before the optimization work has begun.