Question

I am seeing a lot of "too many open files" exceptions during the execution of my program. Typically they occur in the following form:

org.jboss.netty.channel.ChannelException: Failed to create a selector.

...
Caused by: java.io.IOException: Too many open files

However, those are not the only exceptions. I have observed similar ones (also caused by "too many open files"), but they are much less frequent.

Strangely enough, I have set the open files limit of the screen session (from which I launch my programs) to 1M:

root@s11:~/fabiim-cbench# ulimit -a
core file size          (blocks, -c) 0
data seg size           (kbytes, -d) unlimited
scheduling priority             (-e) 20
file size               (blocks, -f) unlimited
pending signals                 (-i) 16382
max locked memory       (kbytes, -l) 64
max memory size         (kbytes, -m) unlimited
open files                      (-n) 1000000
pipe size            (512 bytes, -p) 8
POSIX message queues     (bytes, -q) 819200
real-time priority              (-r) 0
stack size              (kbytes, -s) 8192
cpu time               (seconds, -t) unlimited
max user processes              (-u) unlimited
virtual memory          (kbytes, -v) unlimited
file locks                      (-x) unlimited

Moreover, the output of lsof -p shows no more than 1111 open files (sockets, pipes, files) before the exceptions are thrown.

Question: What is wrong, and how can I dig deeper into this problem?

Extra: I am currently integrating Floodlight with bft-smart. In a nutshell, the Floodlight process is the one crashing with "too many open files" exceptions when executing a stress test launched by a benchmark program. This benchmark program maintains 64 TCP connections to the Floodlight process, which in turn should maintain at least 64 * 3 TCP connections to the bft-smart replicas. Both programs use Netty to manage these connections.


Solution

First thing to check: can you run ulimit from inside your Java process to make sure that the file limit is the same there? Code like this should work:

    import java.io.InputStream;

    // The child shell inherits the JVM's resource limits, so its
    // "ulimit -a" output shows the limits the Java process runs under.
    InputStream is = Runtime.getRuntime()
            .exec(new String[] {"bash", "-c", "ulimit -a"}).getInputStream();
    int c;
    while ((c = is.read()) != -1) {
        System.out.write(c);
    }
    System.out.flush();
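
Alternatively, on Linux you can skip the child shell and read /proc/self/limits, which lists the resource limits of the process that reads it. A minimal sketch, assuming a Linux /proc filesystem (the class name SelfLimits is just for illustration):

    import java.io.IOException;
    import java.nio.charset.StandardCharsets;
    import java.nio.file.Files;
    import java.nio.file.Paths;

    public class SelfLimits {
        public static void main(String[] args) throws IOException {
            // /proc/self/limits lists the resource limits of the reading
            // process, so run this inside the JVM you care about.
            for (String line : Files.readAllLines(
                    Paths.get("/proc/self/limits"), StandardCharsets.UTF_8)) {
                System.out.println(line);
            }
        }
    }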

If the limit still shows 1 million, well, you're in for some hard debugging.

Here are a couple of things that I would look into if I had to debug this:

  1. Are you running out of TCP port numbers? What does netstat -an show when you hit this error?

  2. Use strace to find out exactly which system call, with which parameters, is failing with this error (e.g. attach with strace -f -p <pid>). EMFILE is errno 24; strace reports it by name.

  3. The “Too many open files” EMFILE error can actually be thrown by a number of different system calls for a number of different reasons:

    $ cd /usr/share/man/man2
    $ zgrep -A 2 EMFILE *
    accept.2.gz:.B EMFILE
    accept.2.gz:The per-process limit of open file descriptors has been reached.
    accept.2.gz:.TP
    accept.2.gz:--
    accept.2.gz:.\" EAGAIN, EBADF, ECONNABORTED, EINTR, EINVAL, EMFILE,
    accept.2.gz:.\" ENFILE, ENOBUFS, ENOMEM, ENOTSOCK, EOPNOTSUPP, EPROTO, EWOULDBLOCK.
    accept.2.gz:.\" In addition, SUSv2 documents EFAULT and ENOSR.
    dup.2.gz:.B EMFILE
    dup.2.gz:The process already has the maximum number of file
    dup.2.gz:descriptors open and tried to open a new one.
    epoll_create.2.gz:.B EMFILE
    epoll_create.2.gz:The per-user limit on the number of epoll instances imposed by
    epoll_create.2.gz:.I /proc/sys/fs/epoll/max_user_instances
    eventfd.2.gz:.B EMFILE
    eventfd.2.gz:The per-process limit on open file descriptors has been reached.
    eventfd.2.gz:.TP
    execve.2.gz:.B EMFILE
    execve.2.gz:The process has the maximum number of files open.
    execve.2.gz:.TP
    execve.2.gz:--
    execve.2.gz:.\" document ETXTBSY, EPERM, EFAULT, ELOOP, EIO, ENFILE, EMFILE, EINVAL,
    execve.2.gz:.\" EISDIR or ELIBBAD error conditions.
    execve.2.gz:.SH NOTES
    fcntl.2.gz:.B EMFILE
    fcntl.2.gz:For
    fcntl.2.gz:.BR F_DUPFD ,
    getrlimit.2.gz:.BR EMFILE .
    getrlimit.2.gz:(Historically, this limit was named
    getrlimit.2.gz:.B RLIMIT_OFILE
    inotify_init.2.gz:.B EMFILE
    inotify_init.2.gz:The user limit on the total number of inotify instances has been reached.
    inotify_init.2.gz:.TP
    mmap.2.gz:.\" SUSv2 documents additional error codes EMFILE and EOVERFLOW.
    mmap.2.gz:.SH AVAILABILITY
    mmap.2.gz:On POSIX systems on which
    mount.2.gz:.B EMFILE
    mount.2.gz:(In case no block device is required:)
    mount.2.gz:Table of dummy devices is full.
    open.2.gz:.B EMFILE
    open.2.gz:The process already has the maximum number of files open.
    open.2.gz:.TP
    pipe.2.gz:.B EMFILE
    pipe.2.gz:Too many file descriptors are in use by the process.
    pipe.2.gz:.TP
    shmop.2.gz:.\" SVr4 documents an additional error condition EMFILE.
    shmop.2.gz:
    shmop.2.gz:In SVID 3 (or perhaps earlier)
    signalfd.2.gz:.B EMFILE
    signalfd.2.gz:The per-process limit of open file descriptors has been reached.
    signalfd.2.gz:.TP
    socket.2.gz:.B EMFILE
    socket.2.gz:Process file table overflow.
    socket.2.gz:.TP
    socketpair.2.gz:.B EMFILE
    socketpair.2.gz:Too many descriptors are in use by this process.
    socketpair.2.gz:.TP
    spu_create.2.gz:.B EMFILE
    spu_create.2.gz:The process has reached its maximum open files limit.
    spu_create.2.gz:.TP
    timerfd_create.2.gz:.B EMFILE
    timerfd_create.2.gz:The per-process limit of open file descriptors has been reached.
    timerfd_create.2.gz:.TP
    truncate.2.gz:.\" error conditions EMFILE, EMULTIHP, ENFILE, ENOLINK.  SVr4 documents for
    truncate.2.gz:.\" .BR ftruncate ()
    truncate.2.gz:.\" an additional EAGAIN error condition.
    

    If you check out these manpages by hand, you may find something interesting. For example, epoll_create, the system call underlying NIO selectors on Linux, will return EMFILE “Too many open files” if:

    The per-user limit on the number of epoll instances imposed by /proc/sys/fs/epoll/max_user_instances was encountered. See epoll(7) for further details.

    Now that filename doesn’t actually exist on my system, but there are limits defined in files under /proc/sys/fs/epoll and /proc/sys/fs/inotify that you might be hitting, especially if you’re running multiple instances of the same test on the same machine. Figuring out whether that’s the case is a chore in itself; you could start by checking syslog for any messages, or by dumping those limits directly, as in the sketch below.
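
To dump those limits, here is a minimal sketch, assuming a Linux /proc filesystem; the exact set of limit files varies between kernel versions, and the class name EpollLimits is just for illustration:

    import java.io.File;
    import java.io.IOException;
    import java.nio.file.Files;

    public class EpollLimits {
        public static void main(String[] args) throws IOException {
            // Dump whichever epoll/inotify limit files this kernel exposes.
            for (String dir : new String[] {"/proc/sys/fs/epoll",
                                            "/proc/sys/fs/inotify"}) {
                File[] files = new File(dir).listFiles();
                if (files == null) continue;  // directory absent on this kernel
                for (File f : files) {
                    String value = new String(Files.readAllBytes(f.toPath())).trim();
                    System.out.println(f + " = " + value);
                }
            }
        }
    }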
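
More generally, it can help to watch the descriptor count from inside the JVM while the benchmark runs, to see how close the process actually gets to its limit when the exceptions start. This sketch relies on com.sun.management.UnixOperatingSystemMXBean, which is specific to HotSpot-derived JVMs (the class name FdMonitor is illustrative):

    import java.lang.management.ManagementFactory;
    import java.lang.management.OperatingSystemMXBean;
    import com.sun.management.UnixOperatingSystemMXBean;

    public class FdMonitor {
        public static void main(String[] args) throws InterruptedException {
            OperatingSystemMXBean os = ManagementFactory.getOperatingSystemMXBean();
            if (!(os instanceof UnixOperatingSystemMXBean)) {
                System.err.println("FD counts not exposed by this JVM/OS");
                return;
            }
            UnixOperatingSystemMXBean unix = (UnixOperatingSystemMXBean) os;
            // Poll once a second; in a real test you would run this loop in a
            // daemon thread inside the Floodlight process itself.
            while (true) {
                System.out.printf("open fds: %d / max: %d%n",
                        unix.getOpenFileDescriptorCount(),
                        unix.getMaxFileDescriptorCount());
                Thread.sleep(1000);
            }
        }
    }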

Good luck!

Licensed under: CC-BY-SA with attribution