About IPSec VPNs

Internet Protocol Security (IPSec) is a suite of protocols that provide network-layer security to a Virtual Private Network (VPN). A VPN is a virtual network connection that provides a secure communication path between two peers in a public network. The peers can be two hosts, a remote host and a network gateway, or the gateways of two networks, such as the gateway of your corporate network and a Zscaler Enforcement Node (ZEN).

IPSec provides the following types of protection:

  • Confidentiality: Ensures that data cannot be read by unauthorized parties.
  • Integrity: Verifies that data was not modified during transit.
  • Authentication: Verifies the identity of the peers.

IPSec provides a number of options for applying each type of protection. The peers in the IPSec VPN use a negotiation process called Internet Key Exchange (IKE) to define the security mechanisms they will use to protect their communications. There are two versions of IKE: Internet Key Exchange Version 1 (IKEv1) and Internet Key Exchange Version 2 (IKEv2). Zscaler recommends using IKEv2 because it's faster than IKEv1 and fixes IKEv1 vulnerabilities.

Supported IPSec VPN Parameters

Following are the supported IPSec VPN parameters for IKEv2 and IKEv1:

Zscaler Interoperability List

Following are the interoperability lists for IKEv2 and IKEv1:

About the IPSec Security Components

Following are the types of protection that IPSec provides and their corresponding algorithms:

About the IPSec Components

Following is general information about the different IPSec components:

To learn how to configure an IPSec VPN tunnel with the Zscaler service and see a list of configuration examples, see Configuring an IPSec VPN Tunnel.

Zscaler supports NAT-Traversal if the device initiating the IPSec VPN is behind another firewall or router performing NAT. Zscaler recommends disabling Perfect Forward Secrecy (PFS) for Phase 2. When PFS is enabled, each Child SA or IPSec SA generates a new shared secret in a Diffie-Hellman exchange.

If you use SHA-384 or SHA-512 for Phase 1 data integrity, you must use Diffie-Hellman group 14. If you use 3DES encryption for Phase 1, you must use Diffie-Hellman group 2 and SHA-1 or SHA-256 for data integrity. If you want to use AES encryption for Phase 2, you must purchase a separate subscription.
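The Phase 1 constraints above can be expressed as a small validation check. The following is a minimal sketch in Python; the function name `check_ikev2_phase1` and the string labels are hypothetical and are not part of any Zscaler API.

```python
def check_ikev2_phase1(encryption, integrity, dh_group):
    """Return a list of rule violations for an IKEv2 Phase 1 proposal.

    The rules mirror the notes above; names and labels are illustrative.
    """
    problems = []
    # SHA-384 and SHA-512 integrity require Diffie-Hellman group 14.
    if integrity in ("SHA-384", "SHA-512") and dh_group != 14:
        problems.append("SHA-384/SHA-512 requires Diffie-Hellman group 14")
    # 3DES requires Diffie-Hellman group 2 and SHA-1 or SHA-256 integrity.
    if encryption == "3DES":
        if dh_group != 2:
            problems.append("3DES requires Diffie-Hellman group 2")
        if integrity not in ("SHA-1", "SHA-256"):
            problems.append("3DES requires SHA-1 or SHA-256")
    return problems


ok = check_ikev2_phase1("AES-256", "SHA-512", 14)
bad = check_ikev2_phase1("3DES", "SHA-512", 14)
```

Here `ok` is an empty list, while `bad` reports both the Diffie-Hellman group and the integrity algorithm as incompatible with 3DES.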

The following table shows the supported IPSec VPN parameters for IKEv2. Zscaler recommends using the bolded parameters in blue.

IKEv2 Supported Parameters
Components | Phase 1 | Phase 2
Confidentiality | AES-256, AES-192, AES-128, 3DES | Null, AES-256, AES-192, AES-128
Integrity | SHA-512, SHA-384, SHA-256, SHA-1 | MD5, SHA-512, SHA-384, SHA-256, SHA-1
Authentication | Pre-shared key (PSK) | N/A
Protocol | N/A | AH, ESP
Encapsulation Mode | N/A | Tunnel Mode
Key Exchange Method | Diffie-Hellman | Diffie-Hellman
Diffie-Hellman Group | 2, 5, 14 | 2, 5, 14
Total Child SAs Supported | N/A | 8
SA Lifetime | 24 Hours | 8 Hours
SA Lifebytes | Unlimited | Unlimited
NAT-Traversal | Enabled, Disabled | N/A
NAT Keepalive Interval | 20 Seconds | N/A
Dead Peer Detection (DPD) | Enabled, Disabled | N/A
DPD Timeout Interval | 20 Seconds | N/A
DPD Maximum Retries | 5 | N/A
Perfect Forward Secrecy (PFS) | N/A | Enabled, Disabled
Maximum Transmission Unit (MTU) | N/A | Your Optimal MTU, 1460 Bytes
Maximum Segment Size (MSS) | N/A | 1388 Bytes
VPN Type | N/A | Route-Based VPN, Policy-Based VPN

If you use a pre-shared key (PSK) for authentication and an FQDN for the peer, you must use Aggressive mode. If you use a PSK for authentication and a static IP address for the peer, you must use Main mode. You can use extended authentication (XAUTH) for remote users; however, Zscaler doesn't recommend doing so.
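The mode rule for PSK authentication can be written as a short lookup. This is only a sketch: the function name and the `"FQDN"` and `"STATIC_IP"` labels are hypothetical, and the rule covers only the PSK cases described above.

```python
def ikev1_phase1_mode(auth_method, peer_id_type):
    """Pick the IKEv1 Phase 1 mode required for PSK authentication.

    auth_method: "PSK" (other methods are not covered by the note above).
    peer_id_type: "FQDN" or "STATIC_IP" (hypothetical labels).
    """
    if auth_method != "PSK":
        raise ValueError("only the PSK rules are described here")
    if peer_id_type == "FQDN":
        return "Aggressive"   # peer identified by name, address may change
    if peer_id_type == "STATIC_IP":
        return "Main"         # peer identified by a fixed IP address
    raise ValueError(f"unknown peer identifier type: {peer_id_type}")
```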

The following table shows the supported IPSec VPN parameters for IKEv1. Zscaler recommends using the bolded parameters in blue.

IKEv1 Supported Parameters
Components | Phase 1 | Phase 2
IKE Mode | Main, Aggressive | Quick
Confidentiality | AES-256, AES-192, AES-128, 3DES | Null, AES-256, AES-192, AES-128
Integrity | SHA-1 | MD5, SHA-1
Authentication | Pre-shared key (PSK), RSA Digital Signature, External Authentication with PSK, External Authentication with RSA | N/A
Protocol | N/A | AH, ESP
Encapsulation Mode | N/A | Tunnel Mode
Key Exchange Method | Diffie-Hellman | Diffie-Hellman
Diffie-Hellman Group | 2 | 2
Total IPSec SAs Supported | N/A | 8
SA Lifetime | 24 Hours | 8 Hours
SA Lifebytes | Unlimited | Unlimited
NAT-Traversal | Enabled, Disabled | N/A
NAT Keepalive Interval | 20 Seconds | N/A
Dead Peer Detection (DPD) | Enabled, Disabled | N/A
DPD Timeout Interval | 20 Seconds | N/A
DPD Maximum Retries | 5 | N/A
Perfect Forward Secrecy (PFS) | N/A | Enabled, Disabled
Maximum Transmission Unit (MTU) | N/A | Your Optimal MTU, 1460 Bytes
Maximum Segment Size (MSS) | N/A | 1388 Bytes
VPN Type | N/A | Route-Based VPN, Policy-Based VPN

Zscaler has verified the following vendors and tested the software versions for IKEv2.

Vendor | Model | Software Version
Cisco | ASA | 9.0
Cisco | ISR 881 | 15.4(3)M3, 15.1.1T
Juniper | SRX220 | 11.4R3.7
Juniper | SSG 20 | 6.2.0r1.0

Zscaler has verified the following vendors and tested the software versions for IKEv1.

Vendor | Model | Software Version
Cisco | ASA | 9.2, 9.0, 8.3, 8.2(5)
Cisco | ISR 881 | 15.1(3)T, 15.0
Cisco | ISR 2821 | 12.4(16)
Juniper | SSG 20 | 6.2.0r1.0
FortiGate | 60D | 5.2.1
Palo Alto Networks | PA-200 | 4.1.16
SonicWall | TZ 100 | 5.6.0.11-61

IPSec uses algorithms such as Advanced Encryption Standard (AES) to encrypt IP packets. These algorithms use symmetric key cryptography to provide encryption.

In this type of cryptography, the peers use the same key to encrypt and decrypt packets. When peer A sends a packet to peer B, it first encrypts the data by dividing it into blocks, and then uses the key and data blocks to perform multiple rounds of cryptographic operations. When peer B receives the packet, it uses the same key and performs the same operations in reverse order to decrypt the data.

AES uses a 128-bit block size and supports key lengths of 128, 192, and 256 bits.

IPSec provides authentication and integrity protection through a hash-based message authentication code (HMAC) algorithm, such as Message Digest Algorithm-5 (MD5) or Secure Hash Algorithm (SHA). This type of algorithm generates a hash, or message digest, from the message and a key known to both peers. When peer A sends a message to peer B, it generates the hash and adds it to the packet it sends to peer B. When peer B receives the packet, it uses the shared key to generate the hash and verifies the authenticity and integrity of the packet when the two hashes match.

SHA-1 and SHA-2 are generally considered more secure than MD5 because they generate a larger hash. MD5 generates a 128-bit hash, SHA-1 generates a 160-bit hash, and SHA-2 is a set of four algorithms whose names refer to the size of the hashes they produce: SHA-224, SHA-256, SHA-384, and SHA-512.
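The keyed-hash check described above can be sketched with Python's standard `hmac` and `hashlib` modules. The key and message values are made up for illustration, and the sketch shows the general HMAC technique rather than the exact construction an IPSec implementation applies to a packet.

```python
import hashlib
import hmac


def make_tag(key: bytes, message: bytes, algorithm: str = "sha256") -> bytes:
    """Compute an HMAC over the message using the shared key."""
    return hmac.new(key, message, algorithm).digest()


def verify_tag(key: bytes, message: bytes, tag: bytes, algorithm: str = "sha256") -> bool:
    """Recompute the HMAC and compare it with the received tag in constant time."""
    return hmac.compare_digest(make_tag(key, message, algorithm), tag)


shared_key = b"example-shared-key"   # hypothetical key known to both peers
packet = b"payload from peer A"      # hypothetical message

tag = make_tag(shared_key, packet)                     # peer A attaches this
accepted = verify_tag(shared_key, packet, tag)         # peer B: hashes match
tampered = verify_tag(shared_key, b"modified", tag)    # peer B: hashes differ

# Hash sizes in bits for the algorithms named above.
digest_bits = {
    name: hashlib.new(name).digest_size * 8
    for name in ("md5", "sha1", "sha256", "sha384", "sha512")
}
```

`digest_bits` reproduces the sizes quoted in the text: 128 bits for MD5, 160 for SHA-1, and 256, 384, and 512 for the corresponding SHA-2 variants.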

IPSec peers can use the following methods to authenticate each other:

  • PSK (Pre-Shared Keys): This type of authentication uses a key that the peers agree on beforehand. The key, also known as a secret, is a text string similar to a password. Peer A uses the pre-shared key and additional data to generate a hash value. Peer B uses the same key and additional data to generate a hash value. Peer B authenticates peer A when the two hash values match. Zscaler supports PSKs for IKEv1 and IKEv2.
  • Digital Certificates: Each peer has a digital certificate that contains a public key. In this type of authentication, peer A generates a hash value and encrypts the hash with its private key. The encrypted hash is its digital signature. Peer A then sends the certificate with its digital signature to peer B. Peer B generates another hash and uses the public key to decrypt the digital signature. Peer B compares the decrypted digest with the digest it generated to verify that the source of the message is peer A. RSA is typically used as the digital signature algorithm. Zscaler supports digital certificates for IKEv1.
  • External Authentication: This adds another layer of protection by authenticating the actual users. An external server, such as a Kerberos server or AD server, authenticates the user by their user ID and password. It is used in addition to one of the other authentication methods available with the Zscaler service (e.g., form-based authentication or SAML). Zscaler supports external authentication for IKEv1.

IPSec has two main protocols: Authentication Header (AH) and Encapsulating Security Payload (ESP). The IPSec peers determine which protocol they will use to encode the data packets in Phase 2 of the IKE negotiations. The selected protocol then uses the algorithms and authentication method defined in the IPSec SA to encode the data packets.

AH provides authentication and integrity protection through a keyed hash algorithm, as described in Verifies Packet Integrity. ESP encrypts IP packets as described in Ensures Confidentiality. The earlier version of ESP did not provide authentication and integrity protection, so most IPSec implementations used AH and ESP. But since the current version of ESP can also use a keyed hash algorithm to verify the authenticity and integrity of packets, most IPSec implementations use ESP, but not necessarily AH.

ESP can operate in either of two modes: transport mode or tunnel mode.

A diagram showing how packets are encoded in ESP Transport Mode and ESP Tunnel Mode.

As shown in the diagram, ESP adds a header, a trailer, and if authentication is used, an authentication section at the end. The ESP header contains a Security Parameter Index (SPI) value, which is a unique identifier, and a sequence number. The ESP trailer contains fields such as additional bytes for padding and the padding length.

In transport mode, ESP encrypts the data payload and ESP trailer. It uses the original IP header with the original source and destination IP addresses. In implementations that involve communications from or to a gateway, the source and/or destination IP addresses need to be changed to the gateway IP addresses. Since transport mode does not alter the IP header, this mode is used specifically for host-to-host communications.

In tunnel mode, ESP encapsulates the entire packet, including the original IP header. It adds a new IP header that lists the IPSec peers as the source and destination of the packet. ESP tunnel mode is used in VPNs that include at least one gateway, because the gateway address can be specified as the source and/or destination in the new IP header.
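The difference between the two modes comes down to the order of the fields on the wire. The sketch below lists that order for each mode and packs an ESP header; the function names are hypothetical, and the 32-bit widths of the SPI and sequence number come from the ESP specification (RFC 4303) rather than from this article.

```python
import struct


def esp_header(spi: int, sequence: int) -> bytes:
    """Pack the ESP header: a 32-bit SPI followed by a 32-bit sequence number."""
    return struct.pack("!II", spi, sequence)


def esp_layout(mode: str, authenticated: bool = True) -> list:
    """Return the field order of an ESP-protected packet for the given mode."""
    if mode == "transport":
        # The original IP header stays in front; only the payload is wrapped.
        fields = ["original IP header", "ESP header", "payload", "ESP trailer"]
    elif mode == "tunnel":
        # A new IP header is added and the whole original packet is wrapped.
        fields = ["new IP header", "ESP header", "original IP header",
                  "payload", "ESP trailer"]
    else:
        raise ValueError(f"unknown ESP mode: {mode}")
    if authenticated:
        fields.append("ESP authentication data")
    return fields


header = esp_header(spi=0x1000, sequence=1)
tunnel = esp_layout("tunnel")
transport = esp_layout("transport")
```

In `tunnel`, the original IP header appears after the ESP header, so it is part of the protected data; in `transport`, it remains the outermost header.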

IKE is an IPSec protocol that establishes and maintains Security Associations (SAs) to protect peer communication and performs mutual authentication. There are two versions of IKE: IKEv1 and IKEv2.

During the IKE negotiations, the peers agree on the Diffie-Hellman group number that they use to generate the shared key. Diffie-Hellman is a method for peers to generate a shared key in a secure manner without having to exchange shared secrets in the first place. Diffie-Hellman specifies group numbers that correspond to a key length and an encryption generator type. To learn more, see RFC 2631.
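A toy Diffie-Hellman exchange shows how both peers arrive at the same key without ever sending it. The numbers below are deliberately tiny and chosen only for illustration; real groups are far larger (for example, group 2 uses a 1024-bit prime and group 14 a 2048-bit prime).

```python
# Public values agreed on by both peers (toy sizes, not a real group).
p = 23   # prime modulus
g = 5    # generator

# Each peer keeps its private value to itself (illustrative values).
a_private = 6
b_private = 15

# Each peer computes a public value and sends it to the other peer.
a_public = pow(g, a_private, p)
b_public = pow(g, b_private, p)

# Each peer combines its own private value with the other's public value.
a_shared = pow(b_public, a_private, p)
b_shared = pow(a_public, b_private, p)

# Both peers now hold the same shared secret.
same_key = a_shared == b_shared
```

Only `a_public` and `b_public` cross the network; the shared secret is computed independently on each side.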

Zscaler recommends using IKEv2 because it's faster and simpler than IKEv1 and fixes IKEv1 vulnerabilities.

IKEv2 is a fast, less complicated control protocol to negotiate IPSec VPN tunnels. IKEv2 improves on IKEv1 and simplifies the SA negotiation process. IKEv2 has two initial exchanges and two later exchanges. 

The Initial Exchanges

The two initial IKEv2 exchanges are IKE_SA_INIT and IKE_AUTH. The initial exchanges are equivalent to the IKEv1 Phase 1 exchange.

The Later Exchanges

The two later IKEv2 exchanges are CREATE_CHILD_SA and INFORMATIONAL.

To learn more about IKEv2, see RFC 7296.

In the IKE_SA_INIT exchange, the peers negotiate cryptographic algorithms for the IKE SA, exchange nonces, and exchange Diffie-Hellman keys. They negotiate the following IKE SA parameters:

  • encryption algorithm
  • hash function
  • authentication method
  • Diffie-Hellman group
  • Pseudo-Random Function

The initiator sends a list of supported IKE SA proposals, its Diffie-Hellman value, and its nonce. The responder then chooses an IKE SA proposal, sends a Diffie-Hellman value to complete the Diffie-Hellman exchange, and sends its nonce. If the IKE_SA_INIT exchange is successful, the peers can independently generate the IKE SA keying information. This keying information is used in later exchanges to authenticate the peers, authenticate or encrypt IKE SA messages, and establish the first Child SA.

Unlike IKEv1, the peers can choose their own authentication method, IKE SA lifetime, and IKE SA lifesize.

This exchange has one request-response pair and isn't encrypted. After the peers establish a secure connection, all other exchanges are encrypted.

Diagram of the IKE_SA_INIT exchange
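The proposal selection in the IKE_SA_INIT exchange can be sketched as follows. This is a simplification under stated assumptions: proposals are modeled as plain tuples of (encryption, integrity, PRF, Diffie-Hellman group), the responder is assumed to take the first initiator proposal it supports, and `choose_proposal` is a hypothetical name. Real IKEv2 proposals are structured as described in RFC 7296.

```python
def choose_proposal(initiator_proposals, responder_supported):
    """Return the first initiator proposal the responder supports, or None."""
    for proposal in initiator_proposals:
        if proposal in responder_supported:
            return proposal
    return None


# Illustrative proposals: (encryption, integrity, PRF, Diffie-Hellman group).
offered = [
    ("AES-256", "SHA-512", "PRF-SHA-512", 14),
    ("AES-256", "SHA-256", "PRF-SHA-256", 14),
    ("AES-128", "SHA-1", "PRF-SHA-1", 2),
]
supported = {
    ("AES-256", "SHA-256", "PRF-SHA-256", 14),
    ("AES-128", "SHA-1", "PRF-SHA-1", 2),
}

selected = choose_proposal(offered, supported)
no_match = choose_proposal([("3DES", "MD5", "PRF-MD5", 1)], supported)
```

When `choose_proposal` returns `None`, there is no common proposal and the negotiation cannot proceed with the offered set.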

In the IKE_AUTH exchange, the peers authenticate the messages in the IKE_SA_INIT exchange, exchange their identities and certificates, and create the first Child SA.

To activate the IKE SA, the initiator sends an IKE_AUTH request with its identity and the authentication information defined in the IKE_SA_INIT exchange. It also includes the SA proposals and traffic selectors for the first Child SA. The initiator doesn't send the Diffie-Hellman key information or nonce in the request. The peers use the keying information and nonce defined in the IKE_SA_INIT exchange for the first Child SA. If a signature-based authentication method is used, they can exchange certificates. The responder then sends its identity information, authenticates the initiator, and activates the first Child SA. This exchange has one encrypted request-response pair.

Diagram of the IKE_AUTH exchange

The CREATE_CHILD_SA exchange is used to create new Child SAs or rekey IKE SAs and Child SAs. The peers negotiate the following SA parameters:

  • encryption algorithm
  • hash function
  • encapsulation mode
  • protocol
  • Diffie-Hellman group

The initiator sends the SA proposals, traffic selectors, and nonce in the request. When creating new Child SAs, the initiator can optionally include a Diffie-Hellman value. The responder then activates the Child SA and completes the Diffie-Hellman exchange. Using the Diffie-Hellman exchange provides perfect forward secrecy (PFS) for the Child SA and ensures the Child SA and IKE SA are derived independently. The CREATE_CHILD_SA exchange has one encrypted request-response pair and is equivalent to the IKEv1 Phase 2 exchange (Quick mode). It repeats for every rekey or new SA.

Diagram of the CREATE_CHILD_SA exchange

In the INFORMATIONAL exchange, peers can share control messages regarding errors or notifications of specific events during the operation of an IKE SA. The messages in an INFORMATIONAL exchange can include any of the following payloads:

  • Notification Payload: Carries error or status information regarding the SAs.
  • Delete Payload: Informs the peer that the initiator deleted one or more of its SAs. The responder must delete the SAs and send a response message with the Delete payloads.
  • Configuration Payload: Negotiates configuration data between the peers.

The INFORMATIONAL exchange message can include any combination of the payloads above. The responder must reply with a response, or the initiator assumes that the message was lost and retransmits it. The response can also be an empty message with no payloads. This is a common method for a peer to check the other peer's liveness. This exchange has one encrypted request-response pair.

Diagram of the INFORMATIONAL exchange

IKEv1 has two phases: Phase 1 and Phase 2. In Phase 1, the peers negotiate the security parameters used to communicate for Phase 2. This first set of parameters is referred to as the ISAKMP SA. The ISAKMP SA is bi-directional, so only one SA is established for both directions of traffic. In Phase 2, the peers negotiate the parameters used to protect the actual exchange of IP packets. The second set of parameters is referred to as the IPSec SA. The IPSec SA is uni-directional, so one SA is established for each connection. To learn more, see the following:

IKEv1 Phase 1 can operate in either main mode or aggressive mode. Main mode uses three pairs of messages (six messages in total), and aggressive mode uses three messages. Main mode proceeds as follows:

  1. In the first pair of messages, the peers negotiate the following:
    • encryption algorithm
    • hash algorithm
    • authentication method
    • The Diffie-Hellman group that the peers use to generate a shared key.
    • The SA lifetime, which is the time period that an SA is valid. Peers must establish a new SA when it expires.
  2. In the second pair of messages, the peers exchange the Diffie-Hellman keys.
  3. In the third pair of messages, the two peers authenticate each other.

A network diagram of main mode in IKE Phase 1.

Because Main mode uses the IP address as part of the exchange for identification, it cannot be used in a configuration where the IP address of the peer may change.

Aggressive mode uses the following three messages:

  1. In the first message, peer A sends the security parameters, its Diffie-Hellman key, a pseudo-random number, and its IKE identity to peer B.
  2. In the second message, peer B confirms the security parameters and sends its Diffie-Hellman key, a pseudo-random number, its IKE identity, and authentication parameters.
  3. In the third message, peer A sends its authentication parameters.

A network diagram of aggressive mode in IKE Phase 1.

Aggressive Mode is useful when the IP address of the remote device is not known beforehand.

For Phase 1, Zscaler supports AES-128 for the encryption algorithm and SHA-1 or MD5 for the authentication algorithm. Zscaler recommends using AES-128 with SHA-1.

IKEv1 Phase 2 establishes an SA for each direction of traffic. It operates only in Quick mode, which uses three messages. The negotiations in Phase 2 are protected by the IKE SA.

The Phase 2 negotiations are similar to those in Phase 1: the peers negotiate security parameters that include the encryption and keyed hash algorithms and the authentication method. Additionally, in this phase, the peers negotiate the IPSec protocol to be applied to the IP packets. They determine whether to use AH, ESP and AH, or ESP. As stated earlier, most VPNs today use ESP.

After the IPSec SA is established, the peers then exchange the IP packets using the security parameters defined in the IPSec SA.

For Phase 2, Zscaler supports null or AES for the encryption algorithm and MD5 for the authentication algorithm. If you'd like to use AES, you must purchase a separate subscription. Zscaler recommends using null encryption with MD5.
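Tallying the message counts given in this article offers a rough comparison of how many messages each IKE version needs before the first data SA exists. The sketch uses only the counts stated above and ignores retransmissions and any additional exchanges; the constant and function names are hypothetical.

```python
# Message counts as stated in this article.
IKEV1_PHASE1 = {"main": 6, "aggressive": 3}
IKEV1_QUICK_MODE = 3
# IKEv2 initial exchanges: one request-response pair each.
IKEV2_INITIAL = {"IKE_SA_INIT": 2, "IKE_AUTH": 2}


def ikev1_setup_messages(phase1_mode: str) -> int:
    """Messages for IKEv1 Phase 1 in the given mode plus Phase 2 Quick mode."""
    return IKEV1_PHASE1[phase1_mode] + IKEV1_QUICK_MODE


def ikev2_setup_messages() -> int:
    """Messages for the IKEv2 initial exchanges, which create the first Child SA."""
    return sum(IKEV2_INITIAL.values())


totals = {
    "IKEv1 main + quick": ikev1_setup_messages("main"),
    "IKEv1 aggressive + quick": ikev1_setup_messages("aggressive"),
    "IKEv2 initial exchanges": ikev2_setup_messages(),
}
```

Under these counts, IKEv1 needs nine messages with main mode or six with aggressive mode, while IKEv2 needs four.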

Network Address Translation Traversal (NAT-Traversal) provides a method for passing IPSec traffic between two peers when one peer is behind a NAT device. When NAT-Traversal is enabled, the peers detect if there is a NAT device between them and verify that they both support NAT-Traversal during the IKE Phase 1 negotiations.

After both peers determine that they support NAT-Traversal and that it is required, they encapsulate the IPSec packets in UDP. The new IP header retains the data from the original IPSec packet, except that the protocol changes to User Datagram Protocol (UDP). In the UDP header, the source port is set to 4500 and the destination port is that of the IPSec peer. The NAT device therefore processes the encapsulated packet as a UDP packet. The IPSec peer then removes the UDP header and processes the packet as an IPSec packet.

A diagram showing how IPSec packets are encapsulated in NAT-T.

NAT-T is integrated in IKEv2 but is an optional extension for IKEv1. To learn more about NAT-T, see RFC 3947.
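The UDP encapsulation step can be sketched by prefixing an ESP packet with an 8-byte UDP header. The function names are hypothetical and the ESP bytes are a placeholder. The default port of 4500 and the zero checksum follow RFC 3948, which defines UDP encapsulation of ESP packets; both ports are parameters so that they can be adjusted.

```python
import struct

NAT_T_PORT = 4500  # UDP port for encapsulated ESP, per RFC 3948


def udp_encapsulate(esp_packet: bytes,
                    src_port: int = NAT_T_PORT,
                    dst_port: int = NAT_T_PORT) -> bytes:
    """Prefix an ESP packet with a UDP header so a NAT device can process it."""
    length = 8 + len(esp_packet)   # UDP length covers the header and the data
    checksum = 0                   # sent as zero for IPv4 per RFC 3948
    header = struct.pack("!HHHH", src_port, dst_port, length, checksum)
    return header + esp_packet


def udp_decapsulate(datagram: bytes) -> bytes:
    """Remove the 8-byte UDP header and return the original ESP packet."""
    return datagram[8:]


# Placeholder ESP packet: SPI, sequence number, then opaque encrypted data.
esp = struct.pack("!II", 0x1000, 1) + b"encrypted-data"
datagram = udp_encapsulate(esp)
recovered = udp_decapsulate(datagram)
```

The receiving peer gets back exactly the ESP packet that was sent, so the rest of IPSec processing is unchanged.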

Dead Peer Detection (DPD) is a method for detecting whether an IKE peer is online that scales better than periodic keepalives. With DPD, the peers don't periodically exchange IKE keepalive messages to detect the liveness of a peer. Instead, a peer requests proof that the other peer is online only when it needs to send traffic. If there is ongoing IPSec traffic between the two peers, there is no need to check for liveness. DPD decreases the number of messages needed to determine whether a peer is alive. Each peer defines its own DPD interval, which is implementation specific. To learn more, see RFC 3706.

The DPD behavior is the same for the IKEv1 and IKEv2 protocols. DPD is integrated in IKEv2 but is an optional extension for IKEv1.
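The on-demand nature of DPD can be sketched as a single decision: probe the peer only when there is traffic to send and nothing has been received from it recently. The function name and parameters are hypothetical, and the 20-second default is borrowed from the DPD timeout interval in the parameter tables; actual intervals are implementation specific.

```python
def should_send_dpd_probe(now: float,
                          last_inbound: float,
                          has_outbound_traffic: bool,
                          idle_threshold: float = 20.0) -> bool:
    """Decide whether to ask the peer for proof of liveness.

    now and last_inbound are timestamps in seconds. Recent inbound IPSec
    traffic already proves the peer is alive, so no probe is needed.
    """
    if not has_outbound_traffic:
        return False   # nothing to send, so liveness doesn't matter yet
    return (now - last_inbound) >= idle_threshold


# Traffic arrived 5 seconds ago: the peer is evidently alive.
recent = should_send_dpd_probe(now=100.0, last_inbound=95.0, has_outbound_traffic=True)
# Nothing received for 30 seconds and there is traffic to send: probe.
stale = should_send_dpd_probe(now=100.0, last_inbound=70.0, has_outbound_traffic=True)
# Nothing received for 30 seconds, but nothing to send either: no probe.
idle = should_send_dpd_probe(now=100.0, last_inbound=70.0, has_outbound_traffic=False)
```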