Question

I need to watch a file: whenever content is appended, I want to read the new line and process it. The file's length never decreases (it is in fact the Tomcat access log).

I use the following code:


import java.io.FileNotFoundException;
import java.io.IOException;
import java.io.RandomAccessFile;

import org.apache.log4j.Logger;

import com.zjswkj.analyser.ddao.LogEntryDao;
import com.zjswkj.analyser.model.LogEntry;
import com.zjswkj.analyser.parser.LogParser;

public class ListenTest {
    private RandomAccessFile    raf;
    private long                lastPosition;
    private String              logEntryPattern = "^([\\d.]+) (\\S+) (\\S+) \\[([\\w:/]+\\s[+\\-]\\d{4})\\] \"(.+?)\" (\\d{3}) (\\S+) \"([^\"]+)\" \"([^\"]+)\"";
    private static Logger       log             = Logger.getLogger(ListenTest.class);

    public void startListenLogOfCurrentDay() {

        try {
            if (raf == null)
                raf = new RandomAccessFile(
                        "/tmp/logs/localhost_access_log.2010-12-20.txt",
                        "r");
            String line;
            while (true) {
                raf.seek(lastPosition);
                while ((line = raf.readLine()) != null) {
                    if (!line.matches(logEntryPattern)) {
                        // not a complete line,roll back
                        lastPosition = raf.getFilePointer() - line.getBytes().length;
                        log.debug("roll back:" + line.getBytes().length + " bytes");
                        if (line.equals(""))
                            continue;
                        log.warn("broken line:[" + line + "]");
                        Thread.sleep(2000);
                    } else {
                        // save it
                        LogEntry le = LogParser.parseLog(line);
                        LogEntryDao.saveLogEntry(le);
                        lastPosition = raf.getFilePointer();
                    }
                }
            }
        } catch (FileNotFoundException e) {
            log.error("can not find log file of today");
        } catch (IOException e) {
            log.error("IO Exception:" + e.getMessage());
        } catch (InterruptedException e) {
            e.printStackTrace();
        }
    }

    public static void main(String[] args) {
        new ListenTest().startListenLogOfCurrentDay();
    }
}

My problem is that if the line currently being written at the end of the file is not yet complete, an infinite loop occurs.

For example, suppose Tomcat is writing this new line to the file:

10.33.2.45 - - [08/Dec/2010:08:44:43 +0800] "GET /poi.txt HTTP/1.1" 200 672 "-" "Mozilla/5.0 (Windows; U; Windows NT 6.1; en-US; rv:1.9.2.8) Gecko/20100722 Firefox/3.6.8"

Suppose only part of the line has been written so far (for example: <10.33.2.45 - - [08/Dec/2010:08:44:43 +0800] "GET /poi.txt HTTP/1.1" 200 672>). It cannot match the pattern I defined, which means Tomcat has not finished writing it, so I roll back the file pointer, sleep for 2 seconds, and then read again.

During the sleep, the rest of the line may get written (in my test I write it by hand rather than letting Tomcat do it). I expected RandomAccessFile to then read a full line that matches the pattern, but that does not seem to happen.

Can anyone check the code?

NOTE: the log file uses the "combined" format, like this:

10.33.2.45 - - [08/Dec/2010:08:44:43 +0800] "GET /poi.txt HTTP/1.1" 200 672 "-" "Mozilla/5.0 (Windows; U; Windows NT 6.1; en-US; rv:1.9.2.8) Gecko/20100722 Firefox/3.6.8"

Solution

I see (from your code) that your main objective is to filter the log entries/events and then write the filtered logs to a database. You have 2 options.

Option 1: The best and proper way, provided you are able to change the log4j config file that comes with Tomcat

If this is the case, the best way to do this is to use log4j's predefined extension points. In your case the tapping point is the Appender.

Log4j already comes with the DBAppender, which you might want to extend: filter the logs with your regular expression and then delegate the rest to DBAppender, as it is well tested. Below is an example of how to configure the custom appender:

log4j.rootLogger=DEBUG, S

log4j.appender.S=com.gurock.smartinspect.log4j.MyCustomAppender

log4j.appender.S.layout=org.apache.log4j.SimpleLayout

I suggest you also look at using the AsyncAppender and DBAppender if you want to improve the performance.

Option 2: Fallback option if you don't have access to Tomcat's log4j config file

Instead of writing your own file change listener, look at this post on SO and choose the approach that best matches your needs. You are then only left with writing the code for filtering the log and persisting it in the DB. You can use this link as an example for dealing with RandomAccessFile.

OTHER TIPS

I don't think this is a good way to detect newly added lines. I recommend writing a custom appender for log4j. With a custom appender you receive every newly added line as an event. There is a sample here

You can also search Google for "custom appender".

The first thing I would do in this situation is separate the issue of reading a growing file from the issue of processing the lines.

Create a class GrowingFileReader whose readLine method does what you want. Then the rest of the code becomes simpler.
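A minimal sketch of such a class follows. Only the name GrowingFileReader comes from the suggestion above; the constructor, the UTF-8 assumption, and the rule "a line counts only once its terminating '\n' has been written" are illustrative choices, not something the answer specifies. It reads one byte at a time for clarity, so it is not tuned for speed.

```java
import java.io.ByteArrayOutputStream;
import java.io.Closeable;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.charset.StandardCharsets;

/** Hypothetical helper: hands out only lines that already end with '\n'. */
public class GrowingFileReader implements Closeable {
    private final RandomAccessFile raf;
    // Offset just past the last complete line that was returned.
    private long position;

    public GrowingFileReader(String path) throws IOException {
        this.raf = new RandomAccessFile(path, "r");
    }

    /**
     * Returns the next complete line without its terminator, or null if the
     * file currently ends in the middle of a line (or has no new data).
     */
    public String readLine() throws IOException {
        raf.seek(position);
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        int b;
        while ((b = raf.read()) != -1) {
            if (b == '\n') {
                // Advance only after the terminator has been seen.
                position = raf.getFilePointer();
                String line = new String(buf.toByteArray(), StandardCharsets.UTF_8);
                // Tolerate Windows-style "\r\n" terminators.
                return line.endsWith("\r") ? line.substring(0, line.length() - 1) : line;
            }
            buf.write(b);
        }
        // Hit end of file before a newline: position is left unchanged,
        // so the same bytes are re-read on the next call.
        return null;
    }

    @Override
    public void close() throws IOException {
        raf.close();
    }
}
```

The caller can then poll readLine(), sleep whenever it returns null, and pass every non-null result straight to the parser, without any rollback arithmetic.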

In the case of a failed match, why do you update lastPosition at all? Shouldn't it be left as is?
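As a rough illustration of that point, the sketch below advances the saved position only after a line has matched, and stops at the first line that does not. The class name TailOnce and the method consume are made up for this example. Note the trade-off: a complete but genuinely malformed line (or a blank line) would block progress forever under this rule, so real code would need a way to tell "incomplete" from "invalid".

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.util.List;
import java.util.regex.Pattern;

public class TailOnce {
    /**
     * Reads lines starting at lastPosition and appends every matching line
     * to accepted. Returns the new position: the offset just past the last
     * line that matched. A non-matching line leaves the position untouched.
     */
    public static long consume(RandomAccessFile raf, long lastPosition,
                               Pattern pattern, List<String> accepted) throws IOException {
        raf.seek(lastPosition);
        String line;
        while ((line = raf.readLine()) != null) {
            if (!pattern.matcher(line).matches()) {
                // Possibly a half-written line: do not move lastPosition,
                // so the next call re-reads it from its first byte.
                break;
            }
            accepted.add(line);
            lastPosition = raf.getFilePointer();
        }
        return lastPosition;
    }
}
```

The caller would invoke consume in a loop, sleeping between calls and feeding the returned position back in.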

RAF's readLine is a blocking method and is inefficient (it reads byte by byte and makes many system calls). Also note that in your code line.getBytes().length cannot be used to compute the rollback accurately, because readLine strips the newline/carriage return characters, so the returned string is shorter than the number of bytes actually consumed.

To use BufferedReader on RAF check my answer here https://stackoverflow.com/a/19867481/1282907
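A different way to address both concerns, shown here only as a sketch and not as the approach from the linked answer, is to fetch the newly appended bytes with a single read call and split them on '\n' in memory. The position is then computed from the byte offsets themselves, so no line-length arithmetic is needed. The names ChunkTail and readCompleteLines, the 64 KB chunk size, and the UTF-8 assumption are all illustrative.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.charset.StandardCharsets;
import java.util.List;

public class ChunkTail {
    private static final int MAX_CHUNK = 64 * 1024;

    /**
     * Reads up to MAX_CHUNK bytes starting at offset from, appends every
     * newline-terminated line to out, and returns the offset just past the
     * last complete line. Bytes after the last '\n' are left for a later call.
     * Limitation: a single line longer than MAX_CHUNK is never returned.
     */
    public static long readCompleteLines(RandomAccessFile raf, long from,
                                         List<String> out) throws IOException {
        long available = raf.length() - from;
        if (available <= 0) {
            return from; // nothing new has been appended
        }
        byte[] chunk = new byte[(int) Math.min(available, MAX_CHUNK)];
        raf.seek(from);
        int n = raf.read(chunk); // one bulk read instead of one call per byte
        if (n <= 0) {
            return from;
        }
        int start = 0; // index in chunk where the current line begins
        for (int i = 0; i < n; i++) {
            if (chunk[i] == '\n') {
                // Drop a preceding '\r' so "\r\n" files work too.
                int end = (i > start && chunk[i - 1] == '\r') ? i - 1 : i;
                out.add(new String(chunk, start, end - start, StandardCharsets.UTF_8));
                start = i + 1;
            }
        }
        return from + start;
    }
}
```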

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow