Monitoring Private Service Edge Performance

Monitoring Private Service Edge performance involves several considerations. Most importantly, make sure your Private Service Edges are deployed according to Zscaler's recommendations for image sizing and scalability, supported platform requirements, deployment best practices, networking configurations, and other essential guidelines. To learn more, see Private Service Edge Deployment Prerequisites and Networking Deployed Private Service Edges.

Within the Admin Portal, you can use the Private Service Edge Dashboard to help monitor your organization's Private Service Edges.

When monitoring Private Service Edges directly, consider the following:

CPU Utilization
Peak sustained CPU utilization should be less than 70%. The value needs to take into account all Private Service Edges in the group. The group should have enough spare CPU capacity that each remaining Private Service Edge still operates under 70% utilization if one of the Private Service Edges becomes unavailable. For example, if you have only two Private Service Edges and both are at 50%, each is below the 70% threshold. However, if one Private Service Edge goes down, the other must absorb its load and runs above 70%.
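The failover reasoning above can be checked with simple arithmetic. The following is a minimal sketch, not a Zscaler tool: the function names are hypothetical, and it assumes that the load of a failed Private Service Edge is spread evenly across the remaining ones, which may not match how users are actually redistributed. The threshold is passed as the first argument, so the same check can be reused for other limits.

```shell
# Hypothetical helper: projected utilization (%) of each remaining edge
# if one edge in the group fails. Arguments: current utilization of every
# edge in the group, as whole-number percentages.
# Assumption: the failed edge's load is spread evenly over the survivors.
projected_util_after_failure() {
  count=$#
  if [ "$count" -lt 2 ]; then
    echo "need utilization values for at least two edges" >&2
    return 2
  fi
  sum=0
  for util in "$@"; do
    sum=$((sum + util))
  done
  # Round up so the estimate is never optimistic.
  echo $(( (sum + count - 2) / (count - 1) ))
}

# Succeeds (exit status 0) only if the projected utilization stays below
# the threshold given as the first argument.
has_failover_headroom() {
  threshold=$1
  shift
  projected=$(projected_util_after_failure "$@") || return 2
  [ "$projected" -lt "$threshold" ]
}
```

For example, `has_failover_headroom 70 50 50` fails because the surviving Private Service Edge would run at 100%, while `has_failover_headroom 70 40 40 40` succeeds with a projected 60%.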
To monitor CPU utilization:
- Review Private Service Edge Status Logs in the Log Streaming Service
  Private Service Edge Status logs can be streamed to a SIEM or a syslog server using the Log Streaming Service (LSS). Private Service Edge Status logs are generated periodically for each Private Service Edge, and the field CPUUtilization displays the Private Service Edge host's CPU utilization at that instant. For more information, see About Private Service Edge Metrics Log Fields.
- Review Private Service Edge Status Logs on the Host
  Private Service Edge logs can be viewed on the Private Service Edge host whenever the Private Service Edge is running. The information displayed in the Private Service Edge logs includes the log output of the Private Service Edge to the local system logging facilities.
  To reduce and balance the load among the VMs:
  Ensure the Hypervisor is healthy.
  Change the Hypervisor setup accordingly.
- Use CLI Commands on the Private Service Edge Host
  Linux provides several utilities to view CPU utilization using Command Line Interface (CLI) commands. For example:
  The top command reports the sum of utilization across all cores on all processors.
  The sysstat package contains tools such as mpstat and sar for reporting performance and usage statistics.
  You can also create a script that periodically reports the Private Service Edge host’s CPU utilization to a separate statistics collection server.
  If the reported CPU utilization is more than 100%, the Private Service Edge host likely has multiple CPU cores. For example, with 4 cores, the top output can show usage of up to 400%. Use the lscpu command to check the number of CPU cores, and normalize the peak CPU value by dividing it by the number of cores.
- Use Simple Network Management Protocol
  Simple Network Management Protocol (SNMP) provides a well-defined interface for periodically monitoring performance metrics on host servers. CPU information on the Private Service Edge host can be read using SNMP object identifiers (OIDs).
  Some of the well-known SNMP OIDs for monitoring CPU utilization on a Linux OS are:
  Percentage of user CPU time: .1.3.6.1.4.1.2021.11.9.0
  Raw user CPU time: .1.3.6.1.4.1.2021.11.50.0
  Percentage of system CPU time: .1.3.6.1.4.1.2021.11.10.0
  Raw system CPU time: .1.3.6.1.4.1.2021.11.52.0
  Percentage of idle CPU time: .1.3.6.1.4.1.2021.11.11.0
  Raw idle CPU time: .1.3.6.1.4.1.2021.11.53.0
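When a tool such as top reports CPU usage summed across cores, the value can be normalized before comparing it against the 70% threshold. The sketch below uses a hypothetical helper name, `normalize_cpu_percent`, and assumes the raw value is a sum over all cores. The core count defaults to the output of `nproc` when it is not supplied.

```shell
# Hypothetical helper: convert a CPU percentage summed across cores
# (for example, 280 on a 4-core host) into a 0-100 scale.
# Usage: normalize_cpu_percent RAW_PERCENT [CORE_COUNT]
normalize_cpu_percent() {
  raw=$1
  cores=${2:-$(nproc)}   # fall back to the number of available cores
  awk -v raw="$raw" -v cores="$cores" 'BEGIN {
    if (cores <= 0) { exit 1 }
    printf "%.1f\n", raw / cores
  }'
}
```

For example, `normalize_cpu_percent 280 4` prints 70.0, which is right at the recommended limit.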
To reduce the CPU utilization on individual Private Service Edge hosts, consider the following:
- Deploy an additional Private Service Edge into the Private Service Edge Group to distribute users across additional Private Service Edges.
- Deploy dedicated Private Service Edge Groups for specific application segments to reduce the number of applications that are requested through each Private Service Edge Group.
The CPU steal percentage is the CPU time stolen from the VM by a Hypervisor and is reported every minute in the status log. When the CPU steal percentage is high, the Hypervisor is busy, resulting in degraded performance for the Private Service Edge.
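On a Linux guest, CPU steal time is also exposed as the eighth counter on the aggregate `cpu` line of /proc/stat, so it can be sampled directly on the host. The following is a minimal sketch with a hypothetical function name. It takes two samples of that line and reports the steal share of the elapsed CPU time. How the status log computes its own value is not stated here, so the two figures may differ.

```shell
# Hypothetical helper: percentage of CPU time stolen by the hypervisor
# between two samples of the aggregate "cpu" line from /proc/stat.
# Usage: steal_percent "<first cpu line>" "<second cpu line>"
steal_percent() {
  awk -v first="$1" -v second="$2" 'BEGIN {
    split(first, a, " ")
    split(second, b, " ")
    # Tokens 2-9 are user, nice, system, idle, iowait, irq, softirq, steal.
    # Guest time is already included in user and nice, so it is skipped.
    for (i = 2; i <= 9; i++) {
      total += b[i] - a[i]
    }
    if (total <= 0) { print "0.0"; exit }
    printf "%.1f\n", 100 * (b[9] - a[9]) / total
  }'
}

# Example on a live host (10-second window):
#   s1=$(grep '^cpu ' /proc/stat); sleep 10; s2=$(grep '^cpu ' /proc/stat)
#   steal_percent "$s1" "$s2"
```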
Memory Utilization
Peak sustained memory utilization should be less than 80%. The value needs to take into account all Private Service Edges in the group. The group should have enough spare memory that each remaining Private Service Edge still operates under 80% utilization if one of the Private Service Edges becomes unavailable. For example, if you have only two Private Service Edges and both are at 50%, each is below the 80% threshold. However, if one Private Service Edge goes down, the other must absorb its load and runs above 80%.
To monitor memory:
- Review Private Service Edge Status Logs in the Log Streaming Service
  Private Service Edge Status logs can be streamed to a SIEM or a syslog server using the Log Streaming Service (LSS). Private Service Edge Status logs are generated periodically for each Private Service Edge, and the field MemUtilization displays the Private Service Edge host’s memory utilization at that instant. For more information, see About Private Service Edge Status Log Fields.
- Review Private Service Edge Status Logs on the Host
  Private Service Edge logs can be viewed on the Private Service Edge host whenever the Private Service Edge is running. The information displayed in the Private Service Edge logs includes the log output of the Private Service Edge to the local system logging facilities.
  To reduce and balance the load among the VMs:
  Ensure the Hypervisor is healthy.
  Change the Hypervisor setup accordingly.
- Use CLI Commands on the Private Service Edge Host
  Linux provides several utilities such as free, top, cat /proc/meminfo, htop, and others to view memory utilization using CLI commands. You can create a script that periodically reports the Private Service Edge host’s memory utilization to a separate statistics collection server.
- Use Simple Network Management Protocol
  Simple Network Management Protocol (SNMP) provides a well-defined interface for periodically monitoring performance metrics on host servers. Memory information on the Private Service Edge host can be read using SNMP object identifiers (OIDs).
  Some of the well-known SNMP OIDs for monitoring memory on a Linux OS are:
  Total RAM in machine: .1.3.6.1.4.1.2021.4.5.0
  Total RAM used: .1.3.6.1.4.1.2021.4.6.0
  Total RAM Free: .1.3.6.1.4.1.2021.4.11.0
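A reporting script of the kind mentioned above could derive a utilization percentage from /proc/meminfo. The following is a minimal sketch with a hypothetical function name. It assumes that "used" memory means MemTotal minus MemAvailable, which may not match how the MemUtilization log field is calculated.

```shell
# Hypothetical helper: memory utilization (%) from /proc/meminfo content
# supplied on standard input.
# Assumption: "used" means MemTotal minus MemAvailable.
mem_util_percent() {
  awk '
    /^MemTotal:/     { total = $2 }
    /^MemAvailable:/ { avail = $2 }
    END {
      if (total <= 0) { exit 1 }
      printf "%.1f\n", 100 * (total - avail) / total
    }'
}

# Example on a live host:
#   mem_util_percent < /proc/meminfo
```

Reading from standard input keeps the function easy to test with saved meminfo snapshots as well as on a running host.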
To reduce memory utilization on individual Private Service Edge hosts, consider the following:
- Increase the amount of available memory on the Private Service Edge host.
- Deploy additional Private Service Edges into the Private Service Edge Group to distribute users across additional Private Service Edges.
- Deploy dedicated Private Service Edge Groups for specific application segments to reduce the number of applications that are requested through each Private Service Edge Group.
Data Throughput
Peak sustained throughput based on recommended Private Service Edge specifications should be less than 500 Mbps.
To monitor Private Service Edge throughput:
- Review Private Service Edge Status Logs in the Log Streaming Service
  Private Service Edge Status logs can be streamed to a SIEM or a syslog server using the Log Streaming Service (LSS). Private Service Edge Status logs are generated periodically for each Private Service Edge, and the fields TotalBytesRx and TotalBytesTx display the Private Service Edge host’s throughput at that instant. For more information, see About Private Service Edge Status Log Fields.
- Review Private Service Edge Status Logs on the Host
  Private Service Edge logs can be viewed on the Private Service Edge host whenever the Private Service Edge is running. The information displayed in the Private Service Edge logs includes the log output of the Private Service Edge to the local system logging facilities. The CPU steal percentage is the CPU time stolen from the VM by a Hypervisor and is reported every minute in the status log. When the CPU steal percentage is high, the Hypervisor is busy, resulting in degraded performance for the Private Service Edge.
  To reduce and balance the load among the VMs:
  Ensure the Hypervisor is healthy.
  Change the Hypervisor setup accordingly.
- Use CLI Commands on the Private Service Edge Host
  Linux provides several utilities such as nload, iftop, nethogs, bmon, and others to view bandwidth statistics on the Private Service Edge host interface using CLI commands. You can create a script that periodically reports the host’s bandwidth utilization to a separate statistics collection server.
- Use Simple Network Management Protocol
  Simple Network Management Protocol (SNMP) provides a well-defined interface for periodically monitoring performance metrics on host servers. Data throughput information on the Private Service Edge host can be read using SNMP object identifiers (OIDs).
  Some of the well-known SNMP OIDs for monitoring Private Service Edge throughput on a Linux OS are:
  The total number of bytes received on the interface: .1.3.6.1.2.1.2.2.1.10
  The total number of bytes transmitted on the interface: .1.3.6.1.2.1.2.2.1.16
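Interface byte counters, such as the SNMP values above or the Linux per-interface statistics files, are cumulative, so throughput has to be derived from two readings. The sketch below uses a hypothetical function name and assumes 1 Mbps equals 1,000,000 bits per second. It does not handle a counter that resets or wraps between readings.

```shell
# Hypothetical helper: average throughput in Mbps between two readings of
# a cumulative byte counter taken SECONDS apart.
# Usage: mbps_from_counters BYTES_BEFORE BYTES_AFTER SECONDS
mbps_from_counters() {
  awk -v before="$1" -v after="$2" -v secs="$3" 'BEGIN {
    # A counter that went backwards (reset or wrap) is not handled here.
    if (secs <= 0 || after < before) { exit 1 }
    printf "%.1f\n", (after - before) * 8 / secs / 1000000
  }'
}

# Example on a live host; eth0 is a placeholder interface name:
#   rx1=$(cat /sys/class/net/eth0/statistics/rx_bytes); sleep 10
#   rx2=$(cat /sys/class/net/eth0/statistics/rx_bytes)
#   mbps_from_counters "$rx1" "$rx2" 10
```

For example, 625,000,000 bytes received over 10 seconds works out to 500.0 Mbps, which is the recommended peak.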
To reduce data throughput on individual Private Service Edge hosts, consider the following:
- Deploy an additional Private Service Edge into the Private Service Edge Group to distribute users across additional Private Service Edges.
- Deploy dedicated Private Service Edge Groups for specific application segments to reduce the number of applications that are requested through each Private Service Edge Group.
Source Port Consumption
The number of used and free ports can be obtained from the Private Service Edge host by using the following command. In the following example, 11 is the number of UDP ports and 13 is the number of TCP ports in use.
```
[root@ip-10-0-0-116 admin]# sudo su
[root@ip-10-0-0-116 admin]# ss -uln | wc -l && ss -tn state connected | wc -l
11
13
```
If the in-use count for either TCP or UDP ports is equal or close to the available ports, the Private Service Edge host is likely to experience port exhaustion and the Private Service Edge may not be able to connect to the applications.
The following example shows the local port range configured on the host. The difference, 60999 - 32768 = 28231, indicates the number of ports available for use on this Private Service Edge host. The count of in-use UDP and TCP ports must be lower than 28231.
```
[root@ip-10-0-0-116 admin]# sysctl net.ipv4.ip_local_port_range
net.ipv4.ip_local_port_range = 32768 60999
[root@ip-10-0-0-116 admin]#
```
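To track how close the host is to port exhaustion, the in-use count can be expressed as a share of the configured range. The following is a minimal sketch with a hypothetical function name. It computes the range size as the difference between the upper and lower bounds, matching the arithmetic above.

```shell
# Hypothetical helper: percentage of the local port range that is in use.
# Usage: port_usage_percent IN_USE LOW HIGH
# The range size is computed as HIGH - LOW, matching the arithmetic above.
port_usage_percent() {
  awk -v used="$1" -v low="$2" -v high="$3" 'BEGIN {
    size = high - low
    if (size <= 0) { exit 1 }
    printf "%.1f\n", 100 * used / size
  }'
}

# Example on a live host, reusing the commands shown above:
#   read -r low high < /proc/sys/net/ipv4/ip_local_port_range
#   port_usage_percent "$(ss -tn state connected | wc -l)" "$low" "$high"
```

The percentage at which to raise an alert is a local decision; this section does not define one.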
To adjust port usage, consider the following:
- Increase the Available Maximum Number of Ports
  For a temporary increase in ports, which does not persist across a restart of the Private Service Edge host, use one of the following options. The port range shown is an example:
  Use echo
  echo 32768 65000 > /proc/sys/net/ipv4/ip_local_port_range
  Use sysctl
  sysctl -w net.ipv4.ip_local_port_range="32768 65000"
  For a permanent increase in ports, add the following to /etc/sysctl.conf. The port range shown is an example:
  net.ipv4.ip_local_port_range = 32768 65000
Host Application Access
Application performance may degrade if an excessive number of applications are accessed through a small number of Private Service Edges within a group. Too many applications accessed from a given Private Service Edge may result in increased application latency or higher CPU or memory usage on the Private Service Edge.
To reduce the number of applications per Private Service Edge, consider the following:
- Deploy dedicated Private Service Edge Groups for specific application segments to reduce the number of applications that are requested through each Private Service Edge Group.
- Add Private Service Edges to the existing Private Service Edge Groups to support excess applications. Zscaler recommends this option.
Disk Space Utilization
Private Service Edges generate logs that are stored on the disk during the normal course of operation. To ensure adequate disk space is available for Private Service Edge logs and software updates, monitor disk space consumption. The Private Service Edge process restarts if less than 100 MB of disk space is left, and writes a log entry to the /var/log/messages file that records the cause of the restart. This ensures that the software updates for the Private Service Edge can download successfully.
If the Private Service Edge runs other software, it can generate large volumes of logs that can fill up the disk space.
To monitor disk space:
- Review the Amount of Free Disk Space
  The sudo df -h command displays free disk space on the Private Service Edge host. For example:
  [admin@ip-10-0-0-228 ~]$ sudo df -h
  Filesystem      Size  Used Avail Use% Mounted on
  /dev/nvme0n1p1  8.0G  1.7G  6.4G  21% /
  devtmpfs        1.9G     0  1.9G   0% /dev
  tmpfs           1.9G     0  1.9G   0% /dev/shm
  tmpfs           1.9G  181M  1.7G  10% /run
  tmpfs           1.9G     0  1.9G   0% /sys/fs/cgroup
  tmpfs           375M     0  375M   0% /run/user/1000
- Check the Top Process that Consumes Disk Space
  Use the following command to list the 20 largest files and directories, which helps identify what is consuming disk space:
  sudo du -a / | sort -n -r | head -n 20
- Review the Number of System Log Lines Generated
  Use the following commands to get a rough estimate of the number of system log lines written by other processes and by the Private Service Edge process, respectively:
  sudo grep -v "zpa-service-edge" /var/log/messages | wc -l
  sudo grep "zpa-service-edge" /var/log/messages | wc -l
- Use Simple Network Management Protocol
  Simple Network Management Protocol (SNMP) provides a well-defined interface for periodically monitoring performance metrics on host servers. Disk space information on the Private Service Edge host can be read using SNMP object identifiers (OIDs).
  Some of the well-known SNMP OIDs for monitoring disk space utilization on a Linux OS are:
  Available space on the disk: .1.3.6.1.4.1.2021.9.1.7.1
  Used space on the disk: .1.3.6.1.4.1.2021.9.1.8.1
  Percentage of space used on disk: .1.3.6.1.4.1.2021.9.1.9.1
  Percentage of inodes used on disk: .1.3.6.1.4.1.2021.9.1.10.1
  Path where the disk is mounted: .1.3.6.1.4.1.2021.9.1.2.1
  Path of the device for the partition: .1.3.6.1.4.1.2021.9.1.3.1
  Total size of the disk/partition (kBytes): .1.3.6.1.4.1.2021.9.1.6.1
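Because the Private Service Edge process restarts when less than 100 MB of disk space is left, it can help to flag low free space before that point. The following is a minimal sketch with a hypothetical function name. It reads `df -Pk` output and prints the mount points whose available space is below a minimum given in kilobytes. The 1 GB value in the example is an arbitrary early-warning level, not a Zscaler recommendation.

```shell
# Hypothetical helper: list mount points whose free space is below a
# minimum, given `df -Pk` output on standard input.
# Usage: df -Pk | mounts_below_free_kb MIN_FREE_KB
mounts_below_free_kb() {
  # Column 4 is available space in 1024-byte blocks; column 6 is the
  # mount point. Mount points containing spaces are not handled.
  awk -v min="$1" 'NR > 1 && $4 + 0 < min + 0 { print $6 }'
}

# Example: warn when less than 1 GB (1048576 KB) is free, well before the
# 100 MB restart threshold:
#   df -Pk | mounts_below_free_kb 1048576
```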
To reduce disk space usage on individual Private Service Edge hosts, use the following options:
- Archive the Files
  Use a Linux utility like logrotate to archive files on a regular basis (e.g., daily, weekly, or monthly). Its behavior is configured in the /etc/logrotate.conf file.
  Some useful logrotate directives for archiving are:
  - Archive log files daily: daily
  - Keep 7 days' worth of logs: rotate 7
  - Create new (empty) log files after archiving old ones: create
  - Use a date as a suffix of the archived file: dateext
  - Compress archived log files (uncomment the directive): compress
- Remove the Files
  Run a cron job to delete log files, or use the rm command to remove any files you no longer need, including archived files. For example, deleting /var/log/messages has no functional impact. Before deleting, check whether you need the files for Private Service Edge troubleshooting.
  If you archive files to reduce disk space, you do not need to remove files as well.
- Create More Log Space
  To free space in /var/log/journal, use the journalctl --vacuum-size or --vacuum-time options. For example:
```
sudo journalctl --vacuum-size=100M
sudo journalctl --vacuum-time=1d
```
At a minimum, Zscaler recommends disabling rsyslogd to reduce disk space usage.
File Descriptors Exhaustion
If the number of file descriptors is not sufficient on the Private Service Edge host, transactions can fail due to file descriptor exhaustion. To address this, increase the SYSTEM_FD limit by executing the following command:
```
sudo sysctl -w fs.file-max=1000000 
```
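Before raising the limit, it can be useful to see how much of it is in use. On Linux, /proc/sys/fs/file-nr holds three numbers: allocated file handles, allocated but unused handles, and the system-wide maximum. The following is a minimal sketch with a hypothetical function name that turns those numbers into a percentage. It covers only the system-wide limit, not per-process limits.

```shell
# Hypothetical helper: percentage of the system-wide file handle limit in
# use, given the content of /proc/sys/fs/file-nr on standard input.
# That file holds three numbers: allocated, allocated-but-unused, maximum.
fd_usage_percent() {
  awk '{
    if ($3 <= 0) { exit 1 }
    printf "%.1f\n", 100 * ($1 - $2) / $3
  }'
}

# Example on a live host:
#   fd_usage_percent < /proc/sys/fs/file-nr
```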
To learn more, see About the Private Service Edge Dashboard.