Question

I'm implementing an LRU cache for users' photos, using Commons Collections LRUMap (which is basically a LinkedHashMap with small modifications). The findPhoto method can be called several hundred times within a few seconds.

public class CacheHandler {
    private static final int MAX_ENTRIES = 1000;
    private static Map<Long, Photo> photoCache = Collections.synchronizedMap(new LRUMap(MAX_ENTRIES));

    public static Map<Long, Photo> getPhotoCache() {
        return photoCache;
    }
}

Usage:

public Photo findPhoto(Long userId){
    User user = userDAO.find(userId);
    if (user != null) {
        Map<Long, Photo> cache = CacheHandler.getPhotoCache();

        Photo photo = cache.get(userId);
        if (photo == null) {
            if (user.isFromAD()) {
                try {
                    photo = LDAPService.getInstance().getPhoto(user.getLogin());
                } catch (LDAPSearchException e) {
                    throw new EJBException(e);
                }
            } else {
                log.debug("Fetching photo from DB for external user: " + user.getLogin());
                UserFile file = userDAO.findUserFile(user.getPhotoId());
                if (file != null) {
                    photo = new Photo(file.getFilename(), "image/png", file.getFileData());
                }
            }
            cache.put(userId, photo);
        } else {
            log.debug("Fetching photo from cache, user: " + user.getLogin());
        }
        return photo;

    } else {
        return null;
    }
}

As you can see, I'm not using synchronized blocks. I'm assuming that the worst-case scenario here is a race condition that causes two threads to run cache.put(userId, photo) for the same userId. But the data will be the same for both threads, so that is not an issue.

Is my reasoning here correct? If not, is there a way to use a synchronized block without taking a large performance hit? Allowing only one thread to access the map at a time feels like overkill.

Solution 2

Yes, you are right: if the photo creation is idempotent (always returns the same photo), the worst thing that can happen is that you will fetch it more than once and put it into the map more than once.
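That benign race can be shown in a small self-contained sketch. This is not your code: a plain HashMap stands in for LRUMap, a String stands in for Photo, and find/load are hypothetical names, purely to keep the example dependency-free.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicInteger;

public class BenignRace {
    // Stand-in for the synchronized LRUMap from the question.
    static final Map<Long, String> cache = Collections.synchronizedMap(new HashMap<>());
    static final AtomicInteger loads = new AtomicInteger();

    // Idempotent load: always produces the same value for the same key.
    static String load(Long userId) {
        loads.incrementAndGet();
        return "photo-" + userId;
    }

    static String find(Long userId) {
        String photo = cache.get(userId);
        if (photo == null) {
            photo = load(userId);     // two threads may both get here...
            cache.put(userId, photo); // ...but they put identical values, so the cache stays correct
        }
        return photo;
    }

    public static void main(String[] args) throws InterruptedException {
        Thread t1 = new Thread(() -> find(42L));
        Thread t2 = new Thread(() -> find(42L));
        t1.start(); t2.start();
        t1.join(); t2.join();
        // loads.get() may be 1 or 2 depending on timing, but the mapping is always correct.
        System.out.println(cache.get(42L)); // prints "photo-42"
    }
}
```

The only cost of the race is an occasional duplicate load; the map never ends up holding a wrong value.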

Other tips

Assylias is right that what you've got will work fine.

However, if you want to avoid fetching images more than once, that is also possible with a bit more work. The insight is this: when a thread misses the cache and starts loading an image, any second thread that wants the same image before the first has finished should wait for the first thread rather than go and load it again itself.

This is fairly easy to coordinate using some of Java's simpler concurrency classes.

Firstly, let me refactor your example to pull out the interesting bit. Here's what you wrote:

public Photo findPhoto(User user) {
    Map<Long, Photo> cache = CacheHandler.getPhotoCache();

    Photo photo = cache.get(user.getId());
    if (photo == null) {
        photo = loadPhoto(user);
        cache.put(user.getId(), photo);
    }
    return photo;
}

Here, loadPhoto is a method which does the actual nitty-gritty of loading a photo, which isn't relevant here. I assume that the validation of the user is done in another method which calls this one. Other than that, this is your code.

What we do instead is this:

public Photo findPhoto(final User user) throws InterruptedException, ExecutionException {
    Map<Long, Future<Photo>> cache = CacheHandler.getPhotoCache();

    Future<Photo> photo;
    FutureTask<Photo> task;

    synchronized (cache) {
        photo = cache.get(user.getId());
        if (photo == null) {
            task = new FutureTask<Photo>(new Callable<Photo>() {
                @Override
                public Photo call() throws Exception {
                    return loadPhoto(user);
                }
            });
            photo = task;
            cache.put(user.getId(), photo);
        }
        else {
            task = null;
        }
    }

    if (task != null) task.run();

    return photo.get();
}

Note that you need to change the type of CacheHandler.photoCache to accommodate the wrapping FutureTasks. And since this code does explicit locking, you can remove the synchronizedMap from it. You could also use a ConcurrentMap for the cache, which would allow the use of putIfAbsent, a more concurrent alternative to the lock/get/check for null/put/unlock sequence.
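The putIfAbsent alternative mentioned above could look roughly like this. It's only a sketch: Photo, User and loadPhoto are minimal stand-ins (not your real classes), it assumes Java 8+ for the lambda, and note that a plain ConcurrentHashMap has no size bound, so the LRU eviction you get from LRUMap is lost.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Future;
import java.util.concurrent.FutureTask;
import java.util.concurrent.atomic.AtomicInteger;

public class PhotoMemoizer {
    // Minimal stand-ins for the types in the answer (assumptions, not the real classes).
    static class Photo { final long userId; Photo(long userId) { this.userId = userId; } }
    static class User { final long id; User(long id) { this.id = id; } Long getId() { return id; } }

    private final ConcurrentMap<Long, Future<Photo>> cache = new ConcurrentHashMap<>();
    final AtomicInteger loads = new AtomicInteger(); // load counter, for the demo only

    private Photo loadPhoto(User user) {
        loads.incrementAndGet(); // pretend this is the expensive LDAP/DB fetch
        return new Photo(user.getId());
    }

    public Photo findPhoto(final User user) throws InterruptedException, ExecutionException {
        Future<Photo> photo = cache.get(user.getId());
        if (photo == null) {
            FutureTask<Photo> task = new FutureTask<>(() -> loadPhoto(user));
            // Atomic check-then-insert: returns the existing Future if another
            // thread beat us to it, or null if our task was installed.
            photo = cache.putIfAbsent(user.getId(), task);
            if (photo == null) {
                photo = task;
                task.run(); // we won the race, so we do the load ourselves
            }
        }
        return photo.get(); // blocks until the winning thread's load completes
    }
}
```

This avoids holding any lock around the get/check/put sequence entirely; the atomicity comes from putIfAbsent instead.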

Hopefully, what is happening here is fairly obvious. The basic pattern of getting something from the cache, checking whether you got null, and if so putting something back in is still there. But instead of putting in a Photo, you put in a Future, which is essentially a placeholder for a Photo that may or may not be there yet, but which will become available later. The get method on Future returns the thing the place is being held for, blocking until it arrives if necessary.

This code uses FutureTask as an implementation of Future; it takes a Callable capable of producing a Photo as a constructor argument, and calls it when its run method is called. The call to run is guarded by the if (task != null) test, which essentially recapitulates the earlier if (photo == null) test, but outside the synchronized block (because, as you realised, you really don't want to be loading photos while holding the cache lock).

This is a pattern I've seen or needed a few times. It's a shame it's not built into the standard library somewhere.
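As a footnote: since Java 8, something close to this is built into the standard library. ConcurrentHashMap.computeIfAbsent runs the mapping function at most once per key, and other callers for that key block until it completes. A minimal sketch, with Photo and loadPhoto again as hypothetical stand-ins:

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.atomic.AtomicInteger;

public class ComputeIfAbsentCache {
    static class Photo { final long userId; Photo(long userId) { this.userId = userId; } }

    static final ConcurrentMap<Long, Photo> cache = new ConcurrentHashMap<>();
    static final AtomicInteger loads = new AtomicInteger(); // for the demo only

    static Photo loadPhoto(Long userId) {
        loads.incrementAndGet(); // pretend this is the expensive fetch
        return new Photo(userId);
    }

    static Photo findPhoto(Long userId) {
        // computeIfAbsent locks per key while the mapping function runs, so
        // concurrent callers for the same userId wait instead of loading twice.
        return cache.computeIfAbsent(userId, ComputeIfAbsentCache::loadPhoto);
    }
}
```

Two caveats: the mapping function runs under an internal lock, so it should be kept short and simple, and a plain ConcurrentHashMap still has no size bound. For a bounded, loading, LRU-style cache, libraries such as Guava's CacheBuilder or Caffeine provide this out of the box.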

Licensed under: CC-BY-SA with attribution
Not affiliated with Stack Overflow