About Exact Data Match (EDM)

Exact Data Match (EDM) index templates allow the Zscaler service to identify a record from a structured data source that matches predefined criteria. For example, your organization might want to prevent personally identifiable information (PII) from being lost, or might want to let employees share their own PII using a personal email or file sharing account. In either case, it is crucial to identify and correlate the multiple tokens that make up a particular record in order to determine who owns that data.

In the Index Tool, creating an EDM template allows you to define these tokens (i.e., criteria) for your data records by importing a CSV file. Once the data is defined and submitted, you can apply the template to a custom DLP dictionary or engine, which uses the criteria to match against your records. The Zscaler service then evaluates your outbound traffic against the EDM-based DLP rule and applies the appropriate action.

Workflow Diagram for Exact Data Match Index Template Creation

To learn more, see About the Index Tool and Creating an Exact Data Match Template.

Understanding Exact Data Match Index Templates

When creating an EDM index template, you must define the tokens (i.e., criteria) for your data records and specify at least one primary field. The primary field is the key that your DLP policy rules are based on. It is required, and its values must be unique across your data records.

Evaluating Your Data and DLP Policies

Zscaler recommends considering the following before creating an EDM index template:

  • Review the DLP policy you want to create and the data you want to protect.
  • As you review the DLP policy, consider the data that must be included in your EDM index template.
  • Try to create a template that requires your data records to be indexed only once, so that you avoid re-indexing whenever possible.
  • Review your data records to avoid potential duplication.
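One way to review records for duplication before importing them is to count repeated values in a candidate primary field. The following is a minimal sketch, not part of the Index Tool: the function name `find_duplicate_values`, the column names, and the sample rows are all hypothetical, and a real check would read your own CSV export from a file.

```python
import csv
import io
from collections import Counter


def find_duplicate_values(csv_text, column):
    """Return the values that appear more than once in the given column."""
    reader = csv.DictReader(io.StringIO(csv_text))
    counts = Counter(row[column].strip() for row in reader)
    return sorted(value for value, n in counts.items() if n > 1)


# Hypothetical sample data used only for illustration.
sample = """SSN,CCN,FName,LName
111-11-1111,4000000000000001,Ana,Lopez
222-22-2222,4000000000000002,Ben,Ng
111-11-1111,4000000000000003,Ana,Lopez
"""

# Any value listed here would need attention before the column
# could serve as a unique key.
print(find_duplicate_values(sample, "SSN"))
print(find_duplicate_values(sample, "CCN"))
```

An empty result for a column suggests that its values are unique within the file that was checked.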

Let's use the following example:

You're a bank with an employee database and you want to protect your employees' personally identifiable information (PII) as well as their company credit card information.

Your database records contain the following data fields: First Name (FName), Last Name (LName), Social Security Number (SSN), Credit Card Number (CCN), Mobile Phone Number, Postal Code, Street Address, and so on.

The DLP dictionaries or engines you need to create within ZIA, which can then be used in your DLP policies, must cover a series of field combinations to adequately protect your employees' information. So, based on your records in this example, any of the following data field combinations could be used to create a DLP dictionary:

  • SSN, FName, LName
  • CCN, FName, LName
  • SSN, CCN, LName
  • SSN, CCN, FName, LName

However, the EDM index template you create using the Index Tool must allow the dictionary to cover the field combination you require. You can do this by selecting a primary field based on the data field combination you need.

Identifying Primary Fields Within Your EDM Index Template

Using the example described in the previous section, specifying a primary field allows you to create a single EDM index template to protect your employees' information, where:

  • all of the data field combinations you require for an employee PII DLP dictionary and associated policies are covered.
  • all of the data field combinations you require for a credit card DLP dictionary and associated policies are covered, whenever a company credit card is issued to an employee.
  • your employee data records only need to be indexed once.

So, using the Index Tool, you would create an EDM index template that includes the following fields:

  • SSN
  • CCN
  • FName
  • LName

To create the employee PII DLP dictionary you require, you'd select SSN as a primary field. However, to create the company-issued employee credit card DLP dictionary using the same template, you also need to select CCN as a second primary field. The other included fields (i.e., FName, LName) will be applied as Secondary Fields for both dictionaries.

Finally, in this example, BankNum (another field present in the data records) is not required for the DLP policies we want to create later, so it is not included in the template.

Zscaler Index Tool on EDM Index Template field selection page
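To preview which columns of a source file would end up in the template described above, the field selection can be mimicked outside the Index Tool. This is only an illustrative sketch: the helper `project_columns` and the sample rows are hypothetical, and the Index Tool performs its own field selection, so this step is optional.

```python
import csv
import io

# Fields chosen for the template in this example.
TEMPLATE_FIELDS = ["SSN", "CCN", "FName", "LName"]


def project_columns(csv_text, fields):
    """Return CSV text that keeps only the listed columns, in that order."""
    reader = csv.DictReader(io.StringIO(csv_text))
    out = io.StringIO()
    # extrasaction="ignore" drops any column not named in `fields`.
    writer = csv.DictWriter(
        out, fieldnames=fields, extrasaction="ignore", lineterminator="\n"
    )
    writer.writeheader()
    for row in reader:
        writer.writerow(row)
    return out.getvalue()


# Hypothetical source data that also carries a BankNum column.
source = """SSN,CCN,FName,LName,BankNum
111-11-1111,4000000000000001,Ana,Lopez,900001
"""

print(project_columns(source, TEMPLATE_FIELDS))
```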

When you create the PII-specific DLP dictionary within the Admin Portal, it needs to cover the data field combination that includes SSN with FName and LName. So, you would select Exact Data Match as your Dictionary Type and add the following definition based on the EDM index template you created:

  • Employee PII Definition
    • SSN selected as the Primary Field
    • FName and LName selected as the Secondary Fields
    • All Fields selected for Secondary Match On

Add DLP Dictionary window with Exact Data Match selected

The Zscaler service assumes an AND Boolean operation between the Primary Field and the Secondary Fields. Also, for Secondary Match On, "Any" (i.e., Any 1 Field, Any 2 Fields) assumes an OR operation, while "All" (i.e., All Fields) assumes an AND operation.

In other words, the definition above is evaluated by the Zscaler service as: "SSN AND (FName AND LName)".

Because CCN is included as your second primary field within the EDM index template, you can now use the same template for your company credit card-specific DLP dictionary. In this example, it needs to cover the data field combination that includes CCN with FName or LName. So, you can add the following definition without having to re-index your employees' data records:

  • Company-issued Employee CCN Definition
    • CCN selected as the Primary Field
    • FName and LName selected as the Secondary Fields
    • Any 1 Field selected for Secondary Match On

Add DLP Dictionary window with Exact Data Match selected

In other words, the definition above is evaluated by the Zscaler service as: "CCN AND (FName OR LName)".
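The Boolean logic of the two definitions above can be sketched in a few lines. This is a simplified model for reasoning about the rules, not the Zscaler service's implementation: the names `EdmDefinition` and `matches` are hypothetical, plain substring search stands in for the real matching, and "Any N Fields" is assumed here to mean at least N secondary fields.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EdmDefinition:
    primary: str
    secondary: tuple
    # Number of secondary fields that must be present:
    # len(secondary) models "All Fields", 1 models "Any 1 Field".
    min_secondary: int


def matches(definition, record, content):
    """Primary value AND at least `min_secondary` secondary values in content."""
    if record[definition.primary] not in content:
        return False
    hits = sum(1 for field in definition.secondary if record[field] in content)
    return hits >= definition.min_secondary


# Hypothetical indexed record.
record = {
    "SSN": "111-11-1111",
    "CCN": "4000000000000001",
    "FName": "Ana",
    "LName": "Lopez",
}

# SSN AND (FName AND LName)
pii = EdmDefinition("SSN", ("FName", "LName"), min_secondary=2)
# CCN AND (FName OR LName)
ccn = EdmDefinition("CCN", ("FName", "LName"), min_secondary=1)

print(matches(pii, record, "Ana Lopez 111-11-1111"))
print(matches(pii, record, "Ana 111-11-1111"))
print(matches(ccn, record, "card 4000000000000001 for Lopez"))
```

In this model, the PII definition fails when only one of the two names accompanies the SSN, while the credit card definition succeeds with either name alongside the CCN.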

When these two DLP dictionaries are added to a DLP engine, which you then use to create your DLP policy rules, the service can evaluate the following data field combinations:

  • SSN AND (FName AND LName)
  • CCN AND (FName OR LName)

To learn more, see About DLP Engines.