QWebView's memory (cache) management

https://stackoverflow.com/questions/20136006

03-08-2022
Question

Here is the code that downloads the same page 10 times:

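import threading

# Assumes PyQt5 built with QtWebKit; under PyQt4 these classes live in
# PyQt4.QtCore, PyQt4.QtGui and PyQt4.QtWebKit instead.
from PyQt5.QtCore import QUrl
from PyQt5.QtWebKit import QWebSettings
from PyQt5.QtWebKitWidgets import QWebView
from PyQt5.QtWidgets import QApplication

app = QApplication([])
event = threading.Event()

def load(url):
  def _load_finished(ok):
    event.set()

  web_view = QWebView()
  web_view.loadFinished.connect(_load_finished)
  event.clear()
  web_view.setUrl(QUrl(url))
  # Pump the Qt event loop until loadFinished fires
  while not event.wait(.05):
    app.processEvents()
  web_view.loadFinished.disconnect(_load_finished)
  return web_view.page().mainFrame().documentElement()

# Try to switch off every in-memory cache
QWebSettings.setMaximumPagesInCache(0)
QWebSettings.setObjectCacheCapacities(0, 0, 0)

if __name__ == '__main__':
  for i in range(10):
    load('http://www.huffingtonpost.com/')
    QWebSettings.clearMemoryCaches()
    QWebSettings.clearIconDatabase()
    print(i)
  app.exec_()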

And here is Process Explorer's snapshot after the 7th download:

Memory increase from 50MB to 170MB

By the 10th download, memory reaches 270MB. Is this normal? How do I fix it?

Oddly enough, depending on the address, consumption may fluctuate but stay below a certain threshold (here it's 90MB):

Memory stays within 70..90MB

Solution

I stumbled onto this answer. Quoting a comment in the Qt sources:

Dead resources in the cache are kept in non-purgeable memory.

When we prune dead resources, instead of freeing them, we mark their memory as purgeable and keep the resources until the kernel reclaims the purgeable memory.

By leaving the in-cache dead resources in dirty resident memory, we decrease the likelihood of the kernel claiming that memory and forcing us to refetch the resource (for example when a user presses back).

This sort of settles it... and relieves my restless soul.

Following bms20's advice, I run the QtWebKit code in a separate process (using subprocess.Popen), so its memory is returned to the OS when that process exits, and I cache web resources on disk (PyQt5.QtNetwork.QNetworkDiskCache) to save traffic:

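import os
from subprocess import Popen, PIPE

def ExecuteCode(code):
  # Optional: make the child interpreter use UTF-8 for its standard streams
  os.environ['PYTHONIOENCODING'] = 'utf-8'
  # Start a fresh interpreter that reads its program from stdin
  proc = Popen('python.exe', stdin=PIPE)
  # Send the source and block until the child exits
  proc.communicate(code.encode())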
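If the parent needs the child's output back (for example the page HTML), a variant along these lines may help. It is only a sketch: the name run_isolated and its timeout parameter are hypothetical, and it uses sys.executable instead of 'python.exe' on the assumption that the child should run under the same interpreter as the parent.

```python
import os
import subprocess
import sys


def run_isolated(code, timeout=60):
    """Run `code` in a fresh Python process and return its stdout as text.

    Whatever the child allocated is handed back to the OS when it exits.
    """
    # Copy the environment instead of mutating the parent's os.environ.
    env = dict(os.environ, PYTHONIOENCODING="utf-8")
    result = subprocess.run(
        [sys.executable, "-"],          # "-" makes Python read the program from stdin
        input=code.encode("utf-8"),
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        env=env,
        timeout=timeout,
    )
    if result.returncode != 0:
        # Surface the child's traceback rather than failing silently.
        raise RuntimeError(result.stderr.decode("utf-8", errors="replace"))
    return result.stdout.decode("utf-8", errors="replace")
```

Calling run_isolated("print(1 + 1)") should give back the text the child printed. Anything the child writes to stderr is only reported when it exits with a non-zero status.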

Part of the content of code (the source string passed to ExecuteCode):

cache = QNetworkDiskCache()
cache.setCacheDirectory('cache')
web_view = QWebView()
# Route the page's network traffic through the on-disk cache
web_view.page().networkAccessManager().setCache(cache)
# Do stuff with web_view
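For completeness, here is one way the whole child script could be assembled as a string before it is handed to a helper such as ExecuteCode. This is a sketch under assumptions, not the original author's code: CHILD_TEMPLATE and build_child_script are hypothetical names, the imports assume a PyQt5 build that still ships QtWebKit, and printing the page HTML to stdout is just one possible way to return a result.

```python
# Source of the child program; {url!r} and {cache_dir!r} are filled in below.
CHILD_TEMPLATE = '''\
import sys
from PyQt5.QtCore import QUrl
from PyQt5.QtNetwork import QNetworkDiskCache
from PyQt5.QtWebKitWidgets import QWebView
from PyQt5.QtWidgets import QApplication

app = QApplication([])
web_view = QWebView()

# Keep fetched resources on disk so later child processes can reuse them.
cache = QNetworkDiskCache()
cache.setCacheDirectory({cache_dir!r})
web_view.page().networkAccessManager().setCache(cache)

# Leave the event loop as soon as the page has finished loading.
web_view.loadFinished.connect(app.quit)
web_view.load(QUrl({url!r}))
app.exec_()

sys.stdout.write(web_view.page().mainFrame().toHtml())
'''


def build_child_script(url, cache_dir="cache"):
    """Return Python source that loads `url` once with a disk cache enabled."""
    # repr() formatting yields valid string literals inside the generated code.
    return CHILD_TEMPLATE.format(url=url, cache_dir=cache_dir)
```

Running one such script per page keeps WebKit's in-memory caches confined to short-lived processes, while the shared cache directory avoids downloading the same resources again.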

Licensed under: CC-BY-SA with attribution