Question

I have a system that sometimes needs to use a pretrained machine learning model. That model is about 10 GB on disk, and uses about 10 GB of RAM when loaded.

Loading it from disk takes a nontrivial amount of time, so in general I want to avoid doing it too often, and certainly not on every function call against it.

Right now I am using a lazy-loading(-ish) pattern: the first time a function call is made against the model, it is loaded and then stored in a global variable.

This is nice because on some runs of my system the model is never needed, so loading it lazily saves a couple of minutes on those runs.

However, at other times my system runs as a long-running process (exposed via a web API). In those cases I don't want to be using 10 GB of RAM the whole time: it might be days (or weeks) between people using the API methods that rely on that model, then it might be used 1000 times over an hour, and then go unused for days again.

There are other programs (and other users) on this system, so I don't want this one program hogging all the resources while they are not being used.

So my idea is that if no API call has used the model for a certain amount of time, I will trigger some code to unload it (letting the memory be garbage-collected), leaving it to be lazy-loaded again the next time it is needed.
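
Concretely, I am imagining something along these lines. This is only a rough Python sketch; load_model stands in for my actual loading code, and the timeout is a made-up number:

    import gc
    import threading
    import time

    IDLE_SECONDS = 30 * 60   # unload after 30 minutes without use (arbitrary)

    _model = None
    _last_used = 0.0
    _lock = threading.Lock()

    def get_model():
        global _model, _last_used
        with _lock:
            if _model is None:
                _model = load_model()   # slow: reads ~10 GB from disk
            _last_used = time.monotonic()
            return _model

    def _unload_when_idle():
        global _model
        while True:
            time.sleep(60)              # check once a minute
            with _lock:
                if _model is not None and time.monotonic() - _last_used > IDLE_SECONDS:
                    _model = None       # drop the only strong reference
                    gc.collect()        # encourage the memory to be reclaimed

    threading.Thread(target=_unload_when_idle, daemon=True).start()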

  • Is this a sensible plan?
  • Is it a well-known pattern?
  • Or is it perhaps not required, and I should just trust my OS to swap it out to disk?

This is related to Is there a name for the counterpart of the lazy loading pattern? However, that question seems unclear as to whether it is actually just asking about memory management patterns in general.

Solution 3

You don't need to do anything.

As long as your system has sufficient virtual memory (i.e. a page file or swap space), the OS will swap memory that hasn't been accessed in a long time out to disk.

You are basically describing implementing this manually, which on any normal system will not be required.

(This answer expands on the comment by @Caleth)

Other tips

The normal approach to caching is to put things into the cache when you need them, and to remove something when you try to put a new thing into the cache but don't have enough memory to do so.

You then have a variety of ways to specify which of the things in the cache you want to remove first: oldest, least recently used, biggest, and so on.

You don't say which language you are using, but I would be surprised if there were not already some caching libraries available for you to use.
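
For illustration, here is a minimal sketch of that approach: a least-recently-used (LRU) cache with a fixed entry budget, written in Python (the class name and budget are made up for the example; Python even ships a ready-made decorator, functools.lru_cache, for the common case):

    from collections import OrderedDict

    class LRUCache:
        """Evicts the least recently used entry once the budget is exceeded."""

        def __init__(self, max_entries=2):
            self.max_entries = max_entries
            self._entries = OrderedDict()   # insertion order tracks recency

        def get(self, key, loader):
            if key in self._entries:
                self._entries.move_to_end(key)      # mark as most recently used
                return self._entries[key]
            value = loader()                        # e.g. load the model from disk
            self._entries[key] = value
            if len(self._entries) > self.max_entries:
                self._entries.popitem(last=False)   # evict the least recently used
            return value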

Many garbage-collected languages have the concept of a WeakReference<T>, including Java and .NET. This allows the code that loads the data to hold a reference whose target can still be garbage collected. As long as something in your code has a strong reference to the data, it remains in memory.

This allows a fairly flexible way to lazy-load data on demand, while letting the collector purge things that are no longer strongly referenced (in Java, SoftReference is the variant that specifically holds on until memory pressure forces a purge).

A typical use case would be something along these lines; here it is as a minimal Python sketch using the standard weakref module, with load_model standing in for your own loading code:

    import weakref

    _model_ref = None   # weak reference to the loaded model, if any

    def get_model():
        global _model_ref
        # Get the value from the weak reference, if we have one
        model = _model_ref() if _model_ref is not None else None
        if model is None:
            # Not loaded yet, or already reclaimed: load it and re-set the reference
            model = load_model()             # your expensive load-from-disk step
            _model_ref = weakref.ref(model)  # assumes the object supports weak refs
        return model

The exact mechanism depends on your language (in CPython, for instance, a weakly referenced object is reclaimed as soon as its last strong reference disappears, so callers must hold the returned reference while using it). What this means is that you don't have to build your own garbage-collection mechanism, and it allows data to be unloaded to make room for new data.

However, there are many alternatives to explore:

  • Flyweight pattern with instance pooling (see the sketch after this list)
    • Allows fewer object references in memory, since you are likely only referencing a handful of individual records at once
    • Makes sense when object creation is expensive but setting data on an existing object is fast, which can be true of any object when there are enough of them in RAM at once
  • Caching
  • Manual garbage collection
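
As a rough illustration of the flyweight idea, here is a small instance-pooling sketch in Python; the Record class and the pool size are invented for the example:

    class Record:
        __slots__ = ("data",)   # constructing Records is assumed to be the expensive part

        def __init__(self):
            self.data = None

    class RecordPool:
        def __init__(self, size=8):   # pool size is a made-up example value
            self._free = [Record() for _ in range(size)]

        def acquire(self, data):
            record = self._free.pop()   # reuse an existing instance (assumes callers
            record.data = data          # release records before the pool runs dry)
            return record

        def release(self, record):
            record.data = None
            self._free.append(record)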
Licensed under: CC-BY-SA with attribution