Gracefully finalizing the SoftReference referent

https://stackoverflow.com/questions/1638859

08-07-2019

Question

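I am using a search library that advises keeping the search handle object open, because this benefits the query cache. Over time I have observed that the cache tends to get bloated (a few hundred megabytes and still growing), and OOMs have started to kick in. There is no way to enforce a limit on this cache, nor to plan how much memory it can use. So I increased the Xmx limit, but that is only a temporary fix.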

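Eventually I am thinking of making this object the referent of a java.lang.ref.SoftReference. That way, if the system runs low on free memory, it would let the object go and a new one would be created on demand. This would cost some speed after each fresh start, but it is a much better alternative than hitting OOM.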
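The idea in the paragraph above can be sketched roughly as follows. This is only a minimal illustration, not the search library's API: SoftHolder and its factory parameter are hypothetical names, and the sketch deliberately does nothing about closing a handle that the collector has cleared, which is exactly the gap the rest of this question is about.

```java
import java.lang.ref.SoftReference;
import java.util.function.Supplier;

/** Hypothetical helper: holds a value softly and rebuilds it on demand. */
final class SoftHolder<T> {
    private final Supplier<T> factory;                         // assumed way to (re)create the handle
    private SoftReference<T> ref = new SoftReference<>(null);  // starts out empty

    SoftHolder(Supplier<T> factory) {
        this.factory = factory;
    }

    /** Returns the cached value, recreating it if the GC has cleared it. */
    synchronized T get() {
        T value = ref.get();
        if (value == null) {
            value = factory.get();
            ref = new SoftReference<>(value);
        }
        return value;
    }
}
```

A caller would keep one SoftHolder around and call get() for every query, holding the returned handle in a local variable only for the duration of that query.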

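The only problem I see with SoftReferences is that there is no clean way of getting their referents finalized. In my case, before the search handle is destroyed I need to close it, otherwise the system might run out of file descriptors. Obviously, I can wrap this handle in another object, write a finalizer on it (or hook onto a ReferenceQueue/PhantomReference) and let go. But hey, every single article on this planet advises against using finalizers, and especially against finalizers for freeing file handles (e.g. Effective Java, 2nd ed., p. 27).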

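So I am somewhat puzzled. Should I carefully ignore all this advice and go on? Otherwise, are there any other viable alternatives? Thanks in advance.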

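EDIT #1: The text below was added after testing some code suggested by Tom Hawtin. To me, it appears that either the suggestion isn't working or I am missing something. Here's the code: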

class Bloat {  // just a heap filler really
   private double a,b,c,d,e,f,g,h,i,j,k,l,m,n,o,p,q,r,s,t,u,v,w,x,y,z;

   private final int ii;

   public Bloat(final int ii) {
      this.ii = ii;
   }
}

// as recommended by Tom Hawtin
class MyReference<T> extends SoftReference<T> {
   private final T hardRef;

   MyReference(T referent, ReferenceQueue<? super T> q) {
      super(referent, q);
      this.hardRef = referent;
   }
}

//...meanwhile, somewhere in the neighbouring galaxy...
{
   ReferenceQueue<Bloat> rq = new ReferenceQueue<Bloat>();
   Set<SoftReference<Bloat>> set = new HashSet<SoftReference<Bloat>>();
   int i=0;

   while(i<50000) {
//      set.add(new MyReference<Bloat>(new Bloat(i), rq));
      set.add(new SoftReference<Bloat>(new Bloat(i), rq));

//      MyReference<Bloat> polled = (MyReference<Bloat>) rq.poll();
      SoftReference<Bloat> polled = (SoftReference<Bloat>) rq.poll();

      if (polled != null) {
         Bloat polledBloat = polled.get();
         if (polledBloat == null) {
           System.out.println("is null :(");
         } else {
           System.out.println("is not null!");
         }
      }
      i++;
   }
}

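If I run the snippet above with -Xmx10m and SoftReferences (as in the code above), I get tons of "is null :(" printed. But if I replace the code with MyReference (uncommenting the two lines with MyReference and commenting out the two with SoftReference), I always get OOM.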

As I understood from the advice, having a hard reference inside MyReference should not prevent the object from hitting the ReferenceQueue, right?

Solution

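Tom's answer is the correct one. However, the code that was added to the question is not the same as what Tom proposed. What Tom was proposing looks more like this: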

class Bloat {  // just a heap filler really
    public Reader res;
    private double a,b,c,d,e,f,g,h,i,j,k,l,m,n,o,p,q,r,s,t,u,v,w,x,y,z;

    private final int ii;

    public Bloat(final int ii, Reader res) {
       this.ii = ii;
       this.res = res;
    }
 }

 // as recommended by Tom Hawtin
 class MySoftBloatReference extends SoftReference<Bloat> {
    public final Reader hardRef;

    MySoftBloatReference(Bloat referent, ReferenceQueue<Bloat> q) {
       super(referent, q);
       this.hardRef = referent.res;
    }
 }

 //...meanwhile, somewhere in the neighbouring galaxy...
 {
    ReferenceQueue<Bloat> rq = new ReferenceQueue<Bloat>();
    Set<SoftReference<Bloat>> set = new HashSet<SoftReference<Bloat>>();
    int i=0;

    while(i<50000) {
        set.add(new MySoftBloatReference(new Bloat(i, new StringReader("test")), rq));

        MySoftBloatReference polled = (MySoftBloatReference) rq.poll();

        if (polled != null) {
            // close the reference that we are holding on to
            try {
                polled.hardRef.close();
            } catch (IOException e) {
                e.printStackTrace();
            }
        }
        i++;
    }
}

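Note that the big difference here is that the hard reference points to the object that needs to be closed, not to the referent itself. The surrounding object can, and will, be garbage collected, so you won't hit OOM, yet you still get a chance to close the resource. Once you leave the loop, that too will be garbage collected. Of course, in the real world you probably wouldn't make res a public instance member.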

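That said, if you are holding on to open file handles, you run a very real risk of running out of those before you run out of memory. You probably also want an LRU cache to ensure that you keep no more than (sticks finger in the air) 500 open files. These can also be of type MyReference, so that they too can be garbage collected if need be.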

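To clarify a little how MySoftBloatReference works: the base class, SoftReference, still holds the reference to the object that is hogging all the memory. This is the object you need freed to prevent the OOM from happening. However, if that object is freed, you still need to free the resources that Bloat is using. In other words, Bloat uses two kinds of resource, memory and a file handle, and both need to be freed, or you will run out of one or the other. SoftReference handles the pressure on memory by freeing the object, but you also need to release the other resource, the file handle. Because Bloat has already been freed, we cannot use it to release the related resource, so MySoftBloatReference keeps a hard reference to the internal resource that needs to be closed. Once it has been informed that the Bloat has been freed, i.e. once the reference turns up in the ReferenceQueue, MySoftBloatReference can close the related resource through the hard reference it holds.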

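EDIT: Updated the code so that it compiles when dropped into a class. It uses a StringReader to illustrate how to close the Reader, which stands in for an external resource that needs to be freed. In this particular case closing that stream is effectively a no-op, so it is not needed, but it shows how to do it when it is.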

Other answers

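For a finite number of resources: subclass SoftReference. The soft reference should point to the enclosing object. A strong reference in the subclass should reference the resource, so that it is always strongly reachable. When the reference is read through ReferenceQueue poll, the resource can be closed and removed from the cache. The cache needs to be released correctly (if a SoftReference itself is garbage collected, it cannot be enqueued on a ReferenceQueue).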

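Be careful that only a finite number of unreleased resources sit in the cache: evict old entries (indeed, with a bounded cache you can drop the soft references altogether, if that suits your situation). It is usually the non-memory resource that matters more, in which case an LRU-eviction cache with no exotic reference objects should be sufficient.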
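As a rough sketch of the LRU-eviction cache mentioned above, assuming the cached resources implement java.io.Closeable: ClosingLruCache and its maxEntries parameter are hypothetical names chosen for illustration, built on the standard LinkedHashMap access-order mode. It is one possible way to bound the number of open handles, not the code of any answer here.

```java
import java.io.Closeable;
import java.io.IOException;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical bounded cache: closes the least recently used resource on eviction. */
final class ClosingLruCache<K, V extends Closeable> extends LinkedHashMap<K, V> {
    private final int maxEntries; // assumed upper bound on simultaneously open resources

    ClosingLruCache(int maxEntries) {
        super(16, 0.75f, true); // accessOrder = true: iteration runs from least to most recently used
        this.maxEntries = maxEntries;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        if (size() <= maxEntries) {
            return false; // still within the bound, keep everything
        }
        try {
            eldest.getValue().close(); // release the non-memory resource before dropping it
        } catch (IOException e) {
            e.printStackTrace();
        }
        return true; // tell LinkedHashMap to remove the eldest entry
    }
}
```

Only entries pushed out by put() are closed here; entries removed explicitly, or still present at shutdown, would have to be closed by the caller.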

(My answer #1000. Posted from London DevDay.)

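Ahm. As far as I know, you can't hold the stick at both ends. Either you hold on to your information, or you let it go.
However... you can hold on to some key information that would enable you to finalize. Of course, the key information must be significantly smaller than the "real information", and it must not have the real information in its reachable object graph (weak references might help you there). Building on the existing example (pay attention to the key information field):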

public class Test1 {
    static class Bloat {  // just a heap filler really
        private double a, b, c, d, e, f, g, h, i, j, k, l, m, n, o, p, q, r, s, t, u, v, w, x, y, z;

        private final int ii;

        public Bloat(final int ii) {
            this.ii = ii;
        }
    }

    // as recommended by Tom Hawtin
    static class MyReference<T, K> extends SoftReference<T> {
        private final K keyInformation;

        MyReference(T referent, K keyInformation, ReferenceQueue<? super T> q) {
            super(referent, q);
            this.keyInformation = keyInformation;
        }

        public K getKeyInformation() {
            return keyInformation;
        }
    }

    //...meanwhile, somewhere in the neighbouring galaxy...
    public static void main(String[] args) throws InterruptedException {
        ReferenceQueue<Bloat> rq = new ReferenceQueue<Bloat>();
        Set<SoftReference<Bloat>> set = new HashSet<SoftReference<Bloat>>();
        int i = 0;

        while (i < 50000) {
            set.add(new MyReference<Bloat, Integer>(new Bloat(i), i, rq));

            final Reference<? extends Bloat> polled = rq.poll();

            if (polled != null) {
                if (polled instanceof MyReference) {
                    final Object keyInfo = ((MyReference) polled).getKeyInformation();
                    System.out.println("not null, got key info: " + keyInfo + ", finalizing...");
                } else {
                    System.out.println("null, can't finalize.");
                }
                rq.remove();
                System.out.println("removed reference");
            }
        }
    }
}

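Edit: I want to elaborate on "either hold on to your information or let it go". Assume you had some way of holding on to your information. That would force the GC to unmark your data, so the data would actually be cleaned up only after you are done with it, in a second GC cycle. This is possible, and it is exactly what finalize() is for. Since you stated that you don't want the second cycle to occur, you can't hold on to your information (if a -> b, then !b -> !a), which means you must let it go.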

Edit 2: Actually, a second cycle would occur, but for your "key data", not for your "major bloat data". The actual data would be cleared in the first cycle.

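Edit 3: Obviously, the real solution would use a separate thread for removing references from the queue (don't poll(); call remove(), which blocks, on the dedicated thread).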
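The dedicated-thread approach from Edit 3 might look something like the sketch below. ResourceRef and QueueCleaner are hypothetical names, and the resource is assumed to be a java.io.Closeable; ReferenceQueue.remove() is the standard blocking call. The references themselves still have to be kept strongly reachable somewhere (for example in a Set, as in the examples above), otherwise they are never enqueued.

```java
import java.io.Closeable;
import java.io.IOException;
import java.lang.ref.ReferenceQueue;
import java.lang.ref.SoftReference;

/** Soft reference to a heavy object plus a strong reference to the resource to close. */
final class ResourceRef<T> extends SoftReference<T> {
    final Closeable resource; // must not reference the referent, or it can never be collected

    ResourceRef(T referent, Closeable resource, ReferenceQueue<? super T> queue) {
        super(referent, queue);
        this.resource = resource;
    }
}

/** Hypothetical cleaner: blocks on the queue and closes each enqueued resource. */
final class QueueCleaner<T> implements Runnable {
    private final ReferenceQueue<T> queue;

    QueueCleaner(ReferenceQueue<T> queue) {
        this.queue = queue;
    }

    @Override
    public void run() {
        try {
            while (true) {
                // remove() blocks until a reference has been put on the queue
                // (assumes every reference registered with this queue is a ResourceRef)
                ResourceRef<?> ref = (ResourceRef<?>) queue.remove();
                try {
                    ref.resource.close();
                } catch (IOException e) {
                    e.printStackTrace();
                }
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // interruption is the stop signal
        }
    }

    /** Starts the cleaner on a daemon thread so it never keeps the JVM alive. */
    static <T> Thread start(ReferenceQueue<T> queue) {
        Thread thread = new Thread(new QueueCleaner<>(queue), "ref-cleaner");
        thread.setDaemon(true);
        thread.start();
        return thread;
    }
}
```

With this in place the main loop only creates ResourceRef instances and stores them; it no longer needs to call poll() itself. The cache should also drop its own entry for a cleaned reference, which this sketch leaves out.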

@Paul - thanks a lot for the answer and the clarification.

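@Ran - I think the i++ is missing at the end of the loop in your current code. Also, you don't need to call rq.remove() in the loop, since rq.poll() already removes the top reference, doesn't it?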

A few points:

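1) I had to add a Thread.sleep(1) statement after i++ in the loop (for both Paul's and Ran's solutions) to avoid OOM, but that is irrelevant to the big picture and is also platform dependent. My machine has a quad-core CPU and runs Sun JDK 1.6.0_16 on Linux.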

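2) After looking at these solutions, I think I'll stick with finalizers. Bloch's book gives the following reasons against them, each followed here by my response:

There is no guarantee that finalizers will be executed promptly, so never do anything time-critical in a finalizer. Nor are there any such guarantees for SoftReferences!
Never depend on a finalizer to update critical persistent state. I am not.
There is a severe performance penalty for using finalizers. In my worst case I would be finalizing about one object per minute or so. I think I can live with that.
Use try/finally. Oh yes, I definitely will!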

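Having to create an enormous amount of scaffolding just for what seems to be a simple task doesn't look reasonable to me. I mean, literally, the WTF-per-minute rate would be quite high for anybody else looking at such code.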

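3) Sadly, there is no way to split the points between Paul, Tom and Ran :( I hope Tom won't mind, as he has plenty of them already :) Judging between Paul and Ran was much harder. I am setting the accept flag on Paul's answer only because it was rated higher (and has a more detailed explanation), but Ran's solution is not bad at all and would probably be my choice if I decided to implement this with SoftReferences. Thanks, guys!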

Licensed under: CC-BY-SA with attribution

Not affiliated with StackOverflow