Question

Is there a clean way to resolve a DNS query (get an IP address by hostname) in Java asynchronously, in a non-blocking way (i.e. as a state machine, not 1 query = 1 thread)? I'd like to run tens of thousands of queries simultaneously without running tens of thousands of threads.

What I've found so far:

  • The standard InetAddress.getByName() implementation is blocking, and the standard Java libraries appear to lack any non-blocking implementation.
  • The question "Resolving DNS in bulk" discusses a similar problem, but the only solution found there is a multi-threaded approach (i.e. each thread works on only 1 query at any given moment), which is not really scalable.
  • The dnsjava library is also blocking only.
  • There are ancient non-blocking extensions to dnsjava dating from 2006; they lack any modern Java concurrency features such as the Future paradigm and, alas, offer only a very limited queue-based implementation.
  • The dnsjnio project is also an extension to dnsjava, but it also works in a threaded model (i.e. 1 query = 1 thread).
  • asyncorg seems to be the best available solution I've found so far targeting this issue, but:
    • it's also from 2007 and looks abandoned
    • lacks almost any documentation/javadoc
    • uses lots of non-standard techniques such as Fun class

Any other ideas/implementations I've missed?

Clarification. I have a fairly large volume of logs (several TB per day). Every log line has a host name that can be from pretty much anywhere on the internet, and I need an IP address for that host name for my further statistics calculations. The order of lines doesn't really matter, so, basically, my idea is to start 2 threads. The first iterates over the lines:

  • Read a line, parse it, get the host name
  • Send a query to DNS server to resolve a given host name, don't block for answer
  • Store the line and DNS query socket handle in some buffer in memory
  • Go to the next line

And a second thread that will:

  • Wait for DNS server to answer any query (using epoll / kqueue like technique)
  • Read the answer, find which line it was for in a buffer
  • Write line with resolved IP to the output
  • Proceed to waiting for the next answer
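
The buffer shared by the two threads above can be sketched as follows. This is a minimal sketch, not a tested production design: it assumes all queries go out over one UDP socket, so the "handle" stored with each line is the 16-bit DNS transaction ID rather than a socket. The class name InFlightLines and its methods are hypothetical.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical buffer mapping 16-bit DNS query IDs to the log lines awaiting an answer. */
class InFlightLines {
    private final ConcurrentHashMap<Integer, String> pending = new ConcurrentHashMap<>();
    private final AtomicInteger nextId = new AtomicInteger();

    /**
     * Called by the reader thread before it sends a query.
     * Returns the ID to place in the DNS header, or -1 if all 65536 IDs are in use
     * (the caller should then pause reading until some answers arrive).
     */
    int register(String logLine) {
        for (int attempt = 0; attempt < 0x10000; attempt++) {
            int id = nextId.getAndIncrement() & 0xFFFF; // wrap into the 16-bit ID space
            if (pending.putIfAbsent(id, logLine) == null) {
                return id; // slot was free, the line is now parked under this ID
            }
        }
        return -1;
    }

    /**
     * Called by the receiver thread with the ID read from an answer.
     * Returns the parked line, or null for an unknown or duplicate answer.
     */
    String complete(int id) {
        return pending.remove(id & 0xFFFF);
    }

    /** Number of queries currently waiting for an answer. */
    int size() {
        return pending.size();
    }
}
```

With a single socket, the 16-bit ID space caps the number of in-flight queries at 65536; a real implementation would also need to expire entries whose answers never arrive.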

A simple model implementation in Perl using AnyEvent shows me that my idea is generally correct and that I can easily achieve speeds of 15-20K queries per second this way (a naive blocking implementation gets about 2-3 queries per second, just for the sake of comparison, so that's about 4 orders of magnitude difference). Now I need to implement the same in Java, and I'd like to skip rolling out my own DNS implementation ;)


Solution

It may be that the Apache Directory Services implementation of DNS on top of MINA is what you're looking for. The JavaDocs and other useful guides are on that page, in the left-hand side-bar.

OTHER TIPS

There is some work on non-blocking DNS in netty, but it's still a work in progress and will probably be released only in 5.0.

You will, I think, have to implement the DNS client protocol yourself on top of raw UDP using base sockets support, or on top of TCP using NIO channels.
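
To give a sense of how small the client side of the protocol is, here is a rough sketch of the wire format for a plain A-record query, following the RFC 1035 message layout. The class DnsWire and its method names are made up for illustration; it handles only ASCII host names and IPv4 answers, does not check the response code, and lets malformed packets fail with a runtime exception.

```java
import java.io.ByteArrayOutputStream;
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

/** Hypothetical helper for the DNS wire format (A records over UDP only). */
class DnsWire {
    /** Encodes a standard recursive query for the A record of an ASCII host name. */
    static byte[] encodeAQuery(int id, String hostname) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        writeU16(out, id);     // transaction ID, echoed back in the response
        writeU16(out, 0x0100); // flags: standard query, recursion desired
        writeU16(out, 1);      // QDCOUNT: one question
        writeU16(out, 0);      // ANCOUNT
        writeU16(out, 0);      // NSCOUNT
        writeU16(out, 0);      // ARCOUNT
        for (String label : hostname.split("\\.")) {
            byte[] bytes = label.getBytes(StandardCharsets.US_ASCII);
            if (bytes.length == 0 || bytes.length > 63) {
                throw new IllegalArgumentException("bad label in " + hostname);
            }
            out.write(bytes.length);           // each label is length-prefixed
            out.write(bytes, 0, bytes.length);
        }
        out.write(0);     // root label terminates the name
        writeU16(out, 1); // QTYPE = A
        writeU16(out, 1); // QCLASS = IN
        return out.toByteArray();
    }

    /** Reads the transaction ID from the first two bytes of a packet. */
    static int transactionId(byte[] packet) {
        return ((packet[0] & 0xFF) << 8) | (packet[1] & 0xFF);
    }

    /** Returns the first IPv4 address in the answer section as dotted text, or null if none. */
    static String firstARecord(byte[] packet) {
        ByteBuffer buf = ByteBuffer.wrap(packet); // big-endian by default, like DNS
        buf.position(4);
        int qdCount = buf.getShort() & 0xFFFF;
        int anCount = buf.getShort() & 0xFFFF;
        buf.position(12); // end of the fixed 12-byte header
        for (int i = 0; i < qdCount; i++) {
            skipName(buf);
            buf.position(buf.position() + 4); // QTYPE + QCLASS
        }
        for (int i = 0; i < anCount; i++) {
            skipName(buf);
            int type = buf.getShort() & 0xFFFF;
            buf.getShort(); // class
            buf.getInt();   // TTL
            int rdLength = buf.getShort() & 0xFFFF;
            if (type == 1 && rdLength == 4) {
                return (buf.get() & 0xFF) + "." + (buf.get() & 0xFF) + "."
                        + (buf.get() & 0xFF) + "." + (buf.get() & 0xFF);
            }
            buf.position(buf.position() + rdLength); // e.g. a CNAME, skip it
        }
        return null;
    }

    /** Advances past a name, which ends at a zero label or a 2-byte compression pointer. */
    private static void skipName(ByteBuffer buf) {
        while (true) {
            int len = buf.get() & 0xFF;
            if (len == 0) return;
            if ((len & 0xC0) == 0xC0) { buf.get(); return; }
            buf.position(buf.position() + len);
        }
    }

    private static void writeU16(ByteArrayOutputStream out, int value) {
        out.write((value >> 8) & 0xFF);
        out.write(value & 0xFF);
    }
}
```

The byte arrays produced and consumed here could be passed through a non-blocking java.nio.channels.DatagramChannel registered with a Selector; that part is left out of the sketch.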

I don't have an answer to your question (I don't know if there is a DNS library that will operate in the async mode that you want) and this is too long for a comment.

But, you should be able to quickly produce an async one without having to write the full DNS handler yourself. Warning, I haven't done this so I could be all wrong.

Starting with the dnsjava code, you ought to be able to implement your own resolver that gives you both a sender and a receiver method. Check out SimpleResolver and look at the send method. You ought to be able to break this method up into two:

  • one to send your request, which runs up to the call to either the TCPClient or the UDPClient (you would handle the actual on-the-wire sending at this point, as you described, with your first thread);
  • one to receive, which would be called by your second thread in response to a socket read and would handle parsing the response.

To get at the code you need, you have three choices: copy all of the code from SimpleResolver (there are lots of private methods that you'll need, and the licensing allows for it); create your own version of the class and load it ahead of the packaged JAR's copy in your classpath; or use reflection to reach the methods in question and set them accessible.
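
Independently of dnsjava's internals, the shape of such a split resolver might look like the sketch below. It is only an illustration of the send/receive division: SplitResolver is a hypothetical class, packets are treated as opaque byte arrays whose first two bytes are the DNS transaction ID, and the transport is whatever callback performs the actual socket write.

```java
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Consumer;

/** Hypothetical resolver split into a non-waiting send half and a receive half. */
class SplitResolver {
    private final Map<Integer, CompletableFuture<byte[]>> pending = new ConcurrentHashMap<>();
    private final Consumer<byte[]> transport; // assumed to write the packet to the socket

    SplitResolver(Consumer<byte[]> transport) {
        this.transport = transport;
    }

    /** Send half: parks a future under the query ID, hands off the bytes, never waits. */
    CompletableFuture<byte[]> send(byte[] queryPacket) {
        int id = idOf(queryPacket);
        CompletableFuture<byte[]> future = new CompletableFuture<>();
        if (pending.putIfAbsent(id, future) != null) {
            future.completeExceptionally(
                    new IllegalStateException("query ID already in flight: " + id));
            return future;
        }
        transport.accept(queryPacket);
        return future;
    }

    /**
     * Receive half: called by the I/O thread for every datagram read from the socket.
     * Returns false if no query was waiting for this ID (late or duplicate answer).
     */
    boolean receive(byte[] responsePacket) {
        CompletableFuture<byte[]> future = pending.remove(idOf(responsePacket));
        return future != null && future.complete(responsePacket);
    }

    /** The DNS transaction ID is the first two bytes of the message, big-endian. */
    private static int idOf(byte[] packet) {
        return ((packet[0] & 0xFF) << 8) | (packet[1] & 0xFF);
    }
}
```

A caller would chain the parsing step onto the returned future (for example with thenApply), so no thread is tied up while a query is outstanding. Timeouts and retries are not covered here.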

You can quickly build the network client side with either netty or mina. I prefer netty for the docs.

If you do go down this path and can/want to open source it, I can set aside some time to help if you get into trouble.

Linux has an asynchronous DNS lookup function: http://www.imperialviolet.org/2005/06/01/asynchronous-dns-lookups-with-glibc.html

If you are on Linux you just need to wrap that up in some JNI.

You have multiple options

Option 1: Java 5 Executors

  1. A Fixed thread pool: Executors.newFixedThreadPool(int)
  2. Future: A Future represents the result of an asynchronous computation. Methods are provided to check if the computation is complete, to wait for its completion, and to retrieve the result of the computation.
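
A minimal sketch of Option 1, assuming a hypothetical helper class PooledLookup. Note that each in-flight lookup still occupies one pool thread while InetAddress.getByName blocks, so this bounds the thread count rather than removing the one-query-per-thread model the question wants to avoid.

```java
import java.net.InetAddress;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

/** Hypothetical helper: blocking lookups fanned out over a fixed thread pool. */
class PooledLookup {
    /** Resolves every distinct host; a host whose lookup fails maps to null. */
    static Map<String, String> resolveAll(List<String> hosts, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            // Submit everything first so lookups overlap, then collect the results.
            Map<String, Future<InetAddress>> futures = new LinkedHashMap<>();
            for (String host : hosts) {
                if (futures.containsKey(host)) continue; // resolve each host once
                Future<InetAddress> f = pool.submit(() -> InetAddress.getByName(host));
                futures.put(host, f);
            }
            Map<String, String> result = new LinkedHashMap<>();
            for (Map.Entry<String, Future<InetAddress>> e : futures.entrySet()) {
                try {
                    result.put(e.getKey(), e.getValue().get().getHostAddress());
                } catch (ExecutionException ex) {
                    result.put(e.getKey(), null); // e.g. UnknownHostException
                } catch (InterruptedException ex) {
                    Thread.currentThread().interrupt();
                    result.put(e.getKey(), null);
                }
            }
            return result;
        } finally {
            pool.shutdown();
        }
    }
}
```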

Option 2: JMS with MessageListener

  1. Requires dependency on JMS Provider etc.

Option 3: Actor based framework

This approach can scale well. Look at Akka.

Licensed under: CC-BY-SA with attribution