First thing to check: can you run `ulimit` from inside your Java process, to make sure that the file limit is the same there? Code like this should work:

```java
InputStream is = Runtime.getRuntime()
        .exec(new String[] {"bash", "-c", "ulimit -a"})
        .getInputStream();
int c;
while ((c = is.read()) != -1) {
    System.out.write(c);
}
System.out.flush();
```
If the limit still shows 1 million, well, you're in for some hard debugging.
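Another way to read the limit from inside the JVM, without spawning a shell: on most Unix JVMs the `OperatingSystemMXBean` can be cast to `com.sun.management.UnixOperatingSystemMXBean`, which exposes both the current and the maximum file-descriptor counts. A minimal sketch, assuming a JVM that ships the `com.sun.management` classes:

```java
import java.lang.management.ManagementFactory;
import java.lang.management.OperatingSystemMXBean;
import com.sun.management.UnixOperatingSystemMXBean;

public class FdLimits {
    public static void main(String[] args) {
        OperatingSystemMXBean os = ManagementFactory.getOperatingSystemMXBean();
        if (os instanceof UnixOperatingSystemMXBean) {
            UnixOperatingSystemMXBean unix = (UnixOperatingSystemMXBean) os;
            // Descriptors currently open vs. the limit this process actually sees:
            System.out.println("open fds: " + unix.getOpenFileDescriptorCount());
            System.out.println("max fds:  " + unix.getMaxFileDescriptorCount());
        } else {
            System.out.println("Not a Unix JVM; fd counts unavailable.");
        }
    }
}
```

If the "max fds" figure is much lower than what your interactive shell reports, the JVM was started under a tighter limit than you think.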
Here are a couple of things that I would look into if I had to debug this:

- Are you running out of TCP port numbers? What does `netstat -an` show when you hit this error?
- Use `strace` to find out exactly which system call, with which parameters, is causing this error to be thrown. `EMFILE` is errno 24.

The "Too many open files" (`EMFILE`) error can actually be thrown by a number of different system calls for a number of different reasons:

```
$ cd /usr/share/man/man2
$ zgrep -A 2 EMFILE *
accept.2.gz:.B EMFILE
accept.2.gz:The per-process limit of open file descriptors has been reached.
accept.2.gz:.TP
accept.2.gz:--
accept.2.gz:.\" EAGAIN, EBADF, ECONNABORTED, EINTR, EINVAL, EMFILE,
accept.2.gz:.\" ENFILE, ENOBUFS, ENOMEM, ENOTSOCK, EOPNOTSUPP, EPROTO, EWOULDBLOCK.
accept.2.gz:.\" In addition, SUSv2 documents EFAULT and ENOSR.
dup.2.gz:.B EMFILE
dup.2.gz:The process already has the maximum number of file
dup.2.gz:descriptors open and tried to open a new one.
epoll_create.2.gz:.B EMFILE
epoll_create.2.gz:The per-user limit on the number of epoll instances imposed by
epoll_create.2.gz:.I /proc/sys/fs/epoll/max_user_instances
eventfd.2.gz:.B EMFILE
eventfd.2.gz:The per-process limit on open file descriptors has been reached.
eventfd.2.gz:.TP
execve.2.gz:.B EMFILE
execve.2.gz:The process has the maximum number of files open.
execve.2.gz:.TP
execve.2.gz:--
execve.2.gz:.\" document ETXTBSY, EPERM, EFAULT, ELOOP, EIO, ENFILE, EMFILE, EINVAL,
execve.2.gz:.\" EISDIR or ELIBBAD error conditions.
execve.2.gz:.SH NOTES
fcntl.2.gz:.B EMFILE
fcntl.2.gz:For
fcntl.2.gz:.BR F_DUPFD ,
getrlimit.2.gz:.BR EMFILE .
getrlimit.2.gz:(Historically, this limit was named
getrlimit.2.gz:.B RLIMIT_OFILE
inotify_init.2.gz:.B EMFILE
inotify_init.2.gz:The user limit on the total number of inotify instances has been reached.
inotify_init.2.gz:.TP
mmap.2.gz:.\" SUSv2 documents additional error codes EMFILE and EOVERFLOW.
mmap.2.gz:.SH AVAILABILITY
mmap.2.gz:On POSIX systems on which
mount.2.gz:.B EMFILE
mount.2.gz:(In case no block device is required:)
mount.2.gz:Table of dummy devices is full.
open.2.gz:.B EMFILE
open.2.gz:The process already has the maximum number of files open.
open.2.gz:.TP
pipe.2.gz:.B EMFILE
pipe.2.gz:Too many file descriptors are in use by the process.
pipe.2.gz:.TP
shmop.2.gz:.\" SVr4 documents an additional error condition EMFILE.
shmop.2.gz:
shmop.2.gz:In SVID 3 (or perhaps earlier)
signalfd.2.gz:.B EMFILE
signalfd.2.gz:The per-process limit of open file descriptors has been reached.
signalfd.2.gz:.TP
socket.2.gz:.B EMFILE
socket.2.gz:Process file table overflow.
socket.2.gz:.TP
socketpair.2.gz:.B EMFILE
socketpair.2.gz:Too many descriptors are in use by this process.
socketpair.2.gz:.TP
spu_create.2.gz:.B EMFILE
spu_create.2.gz:The process has reached its maximum open files limit.
spu_create.2.gz:.TP
timerfd_create.2.gz:.B EMFILE
timerfd_create.2.gz:The per-process limit of open file descriptors has been reached.
timerfd_create.2.gz:.TP
truncate.2.gz:.\" error conditions EMFILE, EMULTIHP, ENFILE, ENOLINK. SVr4 documents for
truncate.2.gz:.\" .BR ftruncate ()
truncate.2.gz:.\" an additional EAGAIN error condition.
```
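Whatever call turns out to be failing, it's also worth looking at what the process actually has open. On Linux you can do that through `/proc` without touching the JVM at all. A sketch, using `$$` (the current shell) as a stand-in for your Java process's pid:

```shell
# Substitute your Java process's pid for $$ (find it with jps or pgrep -f java).
pid=$$

# How many descriptors are open right now, and what are they?
echo "open fds: $(ls /proc/$pid/fd | wc -l)"
ls -l /proc/$pid/fd | head

# The limits that actually apply to *that* process, which can differ
# from what ulimit reports in your interactive shell:
grep 'Max open files' /proc/$pid/limits
```

If the fd count is nowhere near the limit when the error fires, that points at one of the other `EMFILE` causes above rather than a plain descriptor leak.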
If you check out all these manpages by hand, you may find something interesting. For example, I think it's interesting that `epoll_create`, the underlying system call that is used by NIO channels, will return `EMFILE` ("Too many open files") if:

> The per-user limit on the number of epoll instances imposed by /proc/sys/fs/epoll/max_user_instances was encountered. See epoll(7) for further details.
Now, that filename doesn't actually exist on my system, but there are some limits defined in files under `/proc/sys/fs/epoll` and `/proc/sys/fs/inotify` that you might be hitting, especially if you're running multiple instances of the same test on the same machine. Figuring out whether that's the case is a chore in itself; you could start by checking syslog for any messages…
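A quick way to see which of those per-user limit files exist on a given kernel, and what they're set to (a sketch; the exact set of files under `/proc/sys/fs/epoll` and `/proc/sys/fs/inotify` varies by kernel version):

```shell
# Print whichever epoll/inotify limit files this kernel actually has.
for f in /proc/sys/fs/epoll/* /proc/sys/fs/inotify/*; do
    [ -r "$f" ] && printf '%s = %s\n' "$f" "$(cat "$f")"
done
true  # a missing file shouldn't make the loop's exit status nonzero
```

Remember that these are per-user limits: every JVM running as the same user draws from the same pool, so several parallel test instances can hit them together even though each one looks fine in isolation.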
Good luck!