Processing element problems when the maximum number of file descriptors or the maximum number of user processes is exceeded

When the number of processing elements (PEs) or the number of external connections for a PE exceeds the maximum number of open file descriptors or the maximum number of user processes, the PEs might fail to start or to connect correctly.

You might see any of the following symptoms: the PE starts and then terminates, the PE fails to start, the PE starts but fails to reach a healthy state, or the PE starts and reaches an unknown state. You might also see a message in the PE trace file that contains the following information:
  • errno=Too many open files
  • Exception trying to start a new thread
The possible causes include:
  • The number of external connections for a PE exceeds the maximum number of file descriptors that are configured for Linux. A Linux system limits the number of file descriptors that a Linux process can use. A PE requires a file descriptor for each Teracloud® Streams connection that it opens, and therefore can reach the current limit when the application topology requires many connections.

  • The total number of PEs for the instance causes either the maximum number of open file descriptors or the maximum number of user processes to be exceeded.
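
To tell which of these causes applies, it can help to compare the limits in effect with the number of file descriptors that a process holds. The following is a minimal sketch that assumes a Linux host with /proc mounted. It inspects only the process that runs it, and the function names are hypothetical; they are not part of Teracloud® Streams.

```python
import os
import resource


def describe_limits():
    """Return the (soft, hard) limits for open files and user processes."""
    return {
        "nofile": resource.getrlimit(resource.RLIMIT_NOFILE),
        "nproc": resource.getrlimit(resource.RLIMIT_NPROC),
    }


def open_fd_count():
    """Count the file descriptors that this process currently holds.

    Reads /proc/self/fd, so the count includes the descriptor that is
    opened briefly to list the directory itself.
    """
    return len(os.listdir("/proc/self/fd"))


def fd_headroom():
    """Return how many more descriptors can be opened, or None if unlimited."""
    soft, _hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    if soft == resource.RLIM_INFINITY:
        return None
    return soft - open_fd_count()


if __name__ == "__main__":
    print("limits:", describe_limits())
    print("open file descriptors:", open_fd_count())
    print("headroom:", fd_headroom())
```

A small or negative headroom suggests that the file descriptor limit is the constraint. To inspect a running PE rather than the current process, the same information is available in /proc/<pid>/limits and /proc/<pid>/fd for that process ID.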
To resolve these problems:
  • Verify that your ulimit settings satisfy Teracloud® Streams requirements and update your settings, if needed. For more information, see Guidelines for configuring Linux ulimit settings for Teracloud® Streams.

  • Reduce the number of connections that are required for each PE by configuring your application to use additional Split operators. For example, rather than connecting one PE to 1,200 PEs, add ten Split operators that each connect to 120 PEs.

  • Increase the per-process limit on open file descriptors so that it is greater than the number of external connections for a PE. To tune the file descriptor parameters of the kernel, contact your Linux system administrator.

  • On the system that has the problem, reduce the number of PEs that run in an instance by specifying how operators in applications are fused into PEs.
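
The arithmetic behind the Split operator suggestion can be sketched as follows. This is an illustrative calculation only: the function names are hypothetical, and it assumes a single tier of Split operators that each run in their own PE, so that the upstream PE needs one connection per Split operator.

```python
import math


def splits_needed(target_pes, max_connections_per_pe):
    """Return the smallest number of Split operators that keeps every
    Split operator at or below max_connections_per_pe connections."""
    if target_pes < 1 or max_connections_per_pe < 1:
        raise ValueError("both arguments must be at least 1")
    return math.ceil(target_pes / max_connections_per_pe)


def fanout_plan(target_pes, max_connections_per_pe):
    """Distribute the target PEs as evenly as possible across the splits.

    Returns a list with one entry per Split operator; each entry is the
    number of downstream PEs that the operator connects to.
    """
    splits = splits_needed(target_pes, max_connections_per_pe)
    base, extra = divmod(target_pes, splits)
    # The first `extra` splits take one additional connection each.
    return [base + 1 if i < extra else base for i in range(splits)]


if __name__ == "__main__":
    plan = fanout_plan(1200, 120)
    print(len(plan), "Split operators, largest fan-out:", max(plan))
```

For the example in the list above, 1,200 downstream PEs with at most 120 connections per Split operator gives ten Split operators. The upstream PE then holds ten connections instead of 1,200. The per-PE connection count that you plan for must still stay below the file descriptor limit, with room for the other files that the PE opens.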