what the author of nedtries means by “in-place”?

https://stackoverflow.com/questions/4692075

11-10-2019
|

Question

I. Just implemented a kind of bitwise trie (based on nedtries), but my code does lot Of memory allocation (for each node). Contrary to my implemetation, nedtries are claimed to be fast , among othet things, Because of their small number of memory allocation (if any). The author claim his implementation to be "in-place", but what does it really means in this context ? And how does nedtries achieve such a small number of dynamic memory allocation ?

Ps: I know that the sources are available, but the code is pretty hard to follow and I cannot figure how it works

Solution

I took a look at the nedtrie.h source code. It seems that the reason it is "in-place" is that you have to add the trie bookkeeping data to the items that you want to store.

You use the NEDTRIE_ENTRY macro to add parent/child/next/prev links to your data structure, and you can then pass that data structure to the various trie routines, which will extract and use those added members.

So it is "in-place" in the sense that you augment your existing data structures and the trie code piggybacks on that.

At least that's what it looks like. There's lots of macro goodness in that code so I could have gotten myself confused (:

OTHER TIPS

I'm the author, so this is for the benefit of the many according to Google who are similarly having difficulties in using nedtries. I would like to thank the people here on stackflow for not making unpleasant comments about me personally which some other discussions about nedtries do.

I am afraid I don't understand the difficulties with knowing how to use it. Usage is exceptionally easy - simply copy the example in the Readme.html file:

typedef struct foo_s foo_t;
struct foo_s {
  NEDTRIE_ENTRY(foo_t) link;
  size_t key;
};
typedef struct foo_tree_s foo_tree_t;
NEDTRIE_HEAD(foo_tree_s, foo_t);
static foo_tree_t footree;

static size_t fookeyfunct(const foo_t *RESTRICT r)
{
  return r->key;
}

NEDTRIE_GENERATE(static, foo_tree_s, foo_s, link, fookeyfunct, NEDTRIE_NOBBLEZEROS(foo_tree_s));

int main(void)
{
  foo_t a, b, c, *r;
  NEDTRIE_INIT(&footree);
  a.key=2;
  NEDTRIE_INSERT(foo_tree_s, &footree, &a);
  b.key=6;
  NEDTRIE_INSERT(foo_tree_s, &footree, &b);
  r=NEDTRIE_FIND(foo_tree_s, &footree, &b);
  assert(r==&b);
  c.key=5;
  r=NEDTRIE_NFIND(foo_tree_s, &footree, &c);
  assert(r==&b); /* NFIND finds next largest. Invert the key function to invert this */
  NEDTRIE_REMOVE(foo_tree_s, &footree, &a);
  NEDTRIE_FOREACH(r, foo_tree_s, &footree)
  {
    printf("%p, %u\n", r, r->key);
  }
  NEDTRIE_PREV(foo_tree_s, &footree, &a);
  return 0;
}

You declare your item type - here it's struct foo_s. You need the NEDTRIE_ENTRY() inside it otherwise it can contain whatever you like. You also need a key generating function. Other than that, it's pretty boilerplate.

I wouldn't have chosen this system of macro based initialisation myself! But it's for compatibility with the BSD rbtree.h so nedtries is very easy to swap in to anything using BSD rbtree.h.

Regarding my usage of "in place" algorithms, well I guess my lack of computer science training shows here. What I would call "in place" is when you only use the memory passed into a piece of code, so if you hand 64 bytes to an in place algorithm it will only touch that 64 bytes i.e. it won't make use of extra metadata, or allocate some extra memory, or indeed write to global state. A good example is an "in place" sort implementation where only the collection being sorted (and I suppose the thread stack) gets touched.

Hence no, nedtries doesn't need a memory allocator. It stores all the data it needs in the NEDTRIE_ENTRY and NEDTRIE_HEAD macro expansions. In other words, when you allocate your struct foo_s, you do all the memory allocation for nedtries.
Regarding understanding the "macro goodness", it's far easier to understand the logic if you compile it as C++ and then debug it :). The C++ build uses templates and the debugger will cleanly show you state at any given time. In fact, all debugging from my end happens in a C++ build and I meticulously transcribe the C++ changes into macroised C.
Lastly, before a new release, I search Google for people having problems with my software to see if I can fix things and I am typically amazed what someone people say about me and my free software. Firstly, why didn't those people having difficulties ask me directly for help? If I know that there is something wrong with the docs, then I can fix them - equally, asking on stackoverflow doesn't let me know immediately that there is a docs problem bur rather relies on me to find it next release. So all I would say is that if anyone finds a problem with my docs, please do email me and say so, even if there is a discussion say like here on stackflow.

Niall

In-place means you operate on the original (input) data, so the input data becomes the output data. Not-in-place means that you have separate input and output data, and the input data is not modified. In-place operations have a number of advantages - smaller cache/memory footprint, lower memory bandwidth, hence typically better performance, etc, but they have the disadvantage that they are destructive, i.e. you lose the original input data (which may or may not matter, depending on the use case).

In-place means to operate on the input data and (possibly) update it. The implication is that there no copying and/moving of the input data. This may result in loosing the input data original values which you will need to consider if it is relevant for your particular case.

Licensed under: CC-BY-SA with attribution

Not affiliated with StackOverflow