Con @functools.lru_cache con il dizionario di argomenti

https://stackoverflow.com/questions/6358481

28-10-2019
|

Domanda

Ho un metodo che prende, tra gli altri, un dizionario come argomento.Il metodo di analisi delle stringhe e il dizionario fornisce sostituzione di alcuni sottostringhe, quindi, non deve essere modificabile.

Questa funzione viene chiamata molto spesso, e su elementi ridondanti così ho pensato che la cache di migliorare la sua efficienza.

Ma, come si può immaginare, dal momento che dict è mutevole, e quindi non hashable, @functools.lru_cache non è possibile decorare la mia funzione.Allora, come posso ovviare a questo?

Punto di Bonus se si ha bisogno solo di standard di libreria di classi e metodi.Idealmente, se esiste un qualche tipo di frozendict in libreria standard che non ho visto renderebbe la mia giornata.

PS: namedtuple solo in ultima istanza, dal momento che avrebbe bisogno di un grande sintassi turno.

Soluzione

Invece di utilizzare un custom hashable dizionario, di utilizzare questo e evitare di reinventare la ruota!Si tratta di un surgelati dizionario che tutti hashable.

https://pypi.org/project/frozendict/

Codice:

def freezeargs(func):
    """Transform mutable dictionnary
    Into immutable
    Useful to be compatible with cache
    """

    @functools.wraps(func)
    def wrapped(*args, **kwargs):
        args = tuple([frozendict(arg) if isinstance(arg, dict) else arg for arg in args])
        kwargs = {k: frozendict(v) if isinstance(v, dict) else v for k, v in kwargs.items()}
        return func(*args, **kwargs)
    return wrapped

e poi

@freezeargs
@lru_cache
def func(...):
    pass

Codice preso da @fast_cen 's risposta

Nota:questo non funziona in ricorsiva datastructures;per esempio, si potrebbe avere un argomento un elenco, che è unhashable.Siete invitati a fare il wrapping ricorsiva, in modo tale che si scende in profondità la struttura dei dati e rende ogni dict congelati e ogni list tupla.

(So che OP nolonger vogliono una soluzione, ma sono venuto qui in cerca per la stessa soluzione, in questo modo, lasciando questo per le generazioni future)

Altri suggerimenti

Che ne dici di creare un hashable dict Classe come così:

class HDict(dict):
    def __hash__(self):
        return hash(frozenset(self.items()))

substs = HDict({'foo': 'bar', 'baz': 'quz'})
cache = {substs: True}

Ecco un decoratore che usa il trucco @mhyfritz.

def hash_dict(func):
    """Transform mutable dictionnary
    Into immutable
    Useful to be compatible with cache
    """
    class HDict(dict):
        def __hash__(self):
            return hash(frozenset(self.items()))

    @functools.wraps(func)
    def wrapped(*args, **kwargs):
        args = tuple([HDict(arg) if isinstance(arg, dict) else arg for arg in args])
        kwargs = {k: HDict(v) if isinstance(v, dict) else v for k, v in kwargs.items()}
        return func(*args, **kwargs)
    return wrapped

Basta aggiungerlo prima del tuo LRU_CACHE.

@hash_dict
@functools.lru_cache()
def your_function():
    ...

Che ne dici di sottoclasse namedtuple e aggiungere accesso da x["key"]?

class X(namedtuple("Y", "a b c")):
    def __getitem__(self, item):
        if isinstance(item, int):
            return super(X, self).__getitem__(item)
        return getattr(self, item)

Ecco un decoratore che può essere usato come functools.lru_cache. Ma questo è mirato alle funzioni che prendono solo un argomento il quale è un mappatura piatta insieme a Valori hashable e ha un fisso maxsize di 64. Per il tuo caso d'uso dovresti adattare questo esempio o il codice client. Inoltre, per impostare il maxsize Individualmente, uno ha dovuto implementare un altro decoratore, ma non ho avvolto la testa intorno a questo poiché non ne avevo bisogno.

from functools import (_CacheInfo, _lru_cache_wrapper, lru_cache,
                       partial, update_wrapper)
from typing import Any, Callable, Dict, Hashable

def lru_dict_arg_cache(func: Callable) -> Callable:
    def unpacking_func(func: Callable, arg: frozenset) -> Any:
        return func(dict(arg))

    _unpacking_func = partial(unpacking_func, func)
    _cached_unpacking_func = \
        _lru_cache_wrapper(_unpacking_func, 64, False, _CacheInfo)

    def packing_func(arg: Dict[Hashable, Hashable]) -> Any:
        return _cached_unpacking_func(frozenset(arg.items()))

    update_wrapper(packing_func, func)
    packing_func.cache_info = _cached_unpacking_func.cache_info
    return packing_func


@lru_dict_arg_cache
def uppercase_keys(arg: dict) -> dict:
    """ Yelling keys. """
    return {k.upper(): v for k, v in arg.items()}


assert uppercase_keys.__name__ == 'uppercase_keys'
assert uppercase_keys.__doc__ == ' Yelling keys. '
assert uppercase_keys({'ham': 'spam'}) == {'HAM': 'spam'}
assert uppercase_keys({'ham': 'spam'}) == {'HAM': 'spam'}
cache_info = uppercase_keys.cache_info()
assert cache_info.hits == 1
assert cache_info.misses == 1
assert cache_info.maxsize == 64
assert cache_info.currsize == 1
assert uppercase_keys({'foo': 'bar'}) == {'FOO': 'bar'}
assert uppercase_keys({'foo': 'baz'}) == {'FOO': 'baz'}
cache_info = uppercase_keys.cache_info()
assert cache_info.hits == 1
assert cache_info.misses == 3
assert cache_info.currsize == 3

Per un approccio più generico si potrebbe usare il decoratore @cachetools.cache da una libreria di terze parti con una funzione appropriata set come key.

Dopo aver deciso di abbandonare LRU Cache per il nostro caso d'uso per ora, abbiamo comunque escogitato una soluzione. Questo decoratore usa JSON per serializzare e deserializzare gli Args/Kwargs inviati alla cache. Funziona con un numero qualsiasi di Args. Usalo come decoratore su una funzione invece di @lru_cache. La dimensione massima è impostata su 1024.

def hashable_lru(func):
    cache = lru_cache(maxsize=1024)

    def deserialise(value):
        try:
            return json.loads(value)
        except Exception:
            return value

    def func_with_serialized_params(*args, **kwargs):
        _args = tuple([deserialise(arg) for arg in args])
        _kwargs = {k: deserialise(v) for k, v in kwargs.items()}
        return func(*_args, **_kwargs)

    cached_function = cache(func_with_serialized_params)

    @wraps(func)
    def lru_decorator(*args, **kwargs):
        _args = tuple([json.dumps(arg, sort_keys=True) if type(arg) in (list, dict) else arg for arg in args])
        _kwargs = {k: json.dumps(v, sort_keys=True) if type(v) in (list, dict) else v for k, v in kwargs.items()}
        return cached_function(*_args, **_kwargs)
    lru_decorator.cache_info = cached_function.cache_info
    lru_decorator.cache_clear = cached_function.cache_clear
    return lru_decorator

Autorizzato sotto: CC-BY-SA insieme a attribuzione

Non affiliato a StackOverflow