Question

Both PyTorch and TensorFlow Fold are deep learning frameworks meant to deal with situations where the input data has non-uniform length or dimensions (that is, situations where dynamic computation graphs are useful or needed).

I would like to know how they compare: the paradigms they rely on (e.g. dynamic batching) and their implications, what can and cannot be implemented in each one, and their respective strengths and weaknesses.

I intend to use this info to choose one of them to start exploring dynamic computation graphs, but I have no specific task in mind.

Note 1: other dynamic computation graph frameworks like DyNet or Chainer are also welcome in the comparison, but I'd like to focus on PyTorch and TensorFlow Fold because I think they are (or will be) the most widely used ones.

Note 2: I have found this Hacker News thread on PyTorch with some sparse info, but not much.

Note 3: another relevant Hacker News thread, about TensorFlow Fold, that contains some info about how the two compare.

Note 4: relevant Reddit thread.

Note 5: a relevant issue in TensorFlow Fold's GitHub repository that identifies an important limitation: the impossibility of conditional branching during evaluation (the first sketch after these notes shows the PyTorch contrast).

Note 6: a discussion on the PyTorch forum about variable-length inputs in relation to the algorithms used (e.g. dynamic batching); the second sketch below shows one common PyTorch approach.
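
To make the point in Note 5 concrete, here is a minimal PyTorch sketch (the module, sizes, and threshold are invented for illustration): because PyTorch rebuilds the graph on every forward pass, an ordinary Python `if` can branch on a value computed during evaluation, which is exactly what the Fold issue says is impossible there.

    import torch
    import torch.nn as nn

    class BranchingNet(nn.Module):
        def __init__(self):
            super().__init__()
            self.small = nn.Linear(4, 4)
            self.large = nn.Linear(4, 4)

        def forward(self, x):
            # Branch on a value computed during evaluation; autograd
            # records whichever path actually ran.
            if x.norm() > 1.0:
                return self.large(x)
            return self.small(x)

    net = BranchingNet()
    out = net(torch.randn(4))  # which Linear runs depends on the input
    out.sum().backward()       # gradients flow through the taken branch only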
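And on Note 6: as far as I understand, PyTorch does not do Fold-style automatic dynamic batching; for variable-length sequences the usual approach is to pad the batch and then pack it so the RNN skips the padding. A hedged sketch (sequence lengths and sizes are arbitrary):

    import torch
    from torch.nn.utils.rnn import pad_sequence, pack_padded_sequence

    # Three sequences of different lengths (5, 3, 2), feature size 8.
    seqs = [torch.randn(5, 8), torch.randn(3, 8), torch.randn(2, 8)]
    lengths = [s.size(0) for s in seqs]

    padded = pad_sequence(seqs, batch_first=True)               # shape (3, 5, 8)
    packed = pack_padded_sequence(padded, lengths, batch_first=True)

    rnn = torch.nn.LSTM(input_size=8, hidden_size=16, batch_first=True)
    output, (h_n, c_n) = rnn(packed)  # h_n holds one final state per sequence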


Solution

There are a couple of good threads on Reddit right now (here and here).

I haven't used either of these frameworks, but from reading around and talking to users I gather that support for dynamic graphs in PyTorch is a 'top-down design principle', whereas TensorFlow Fold is bolted onto the original TensorFlow framework. So if you're doing anything reasonably complicated with TensorFlow Fold, you're probably going to end up doing a lot more hacking around than if you're using PyTorch.
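
To make that 'design principle' point concrete, here is a hedged sketch of the kind of model Fold's dynamic batching was built for, a recursive network over trees, written as plain Python in PyTorch (the class, dimensions, and vocabulary size are invented for illustration):

    import torch
    import torch.nn as nn

    class TreeRNN(nn.Module):
        """Composes a vector for a binary tree; the graph shape follows the tree shape."""
        def __init__(self, dim=16):
            super().__init__()
            self.embed = nn.Embedding(1000, dim)   # hypothetical vocabulary size
            self.compose = nn.Linear(2 * dim, dim)

        def forward(self, tree):
            # A tree is either a token id (leaf) or a (left, right) pair.
            if isinstance(tree, tuple):
                left = self.forward(tree[0])
                right = self.forward(tree[1])
                return torch.tanh(self.compose(torch.cat([left, right], dim=-1)))
            return self.embed(torch.tensor(tree))

    model = TreeRNN()
    vec = model(((1, 2), (3, (4, 5))))  # every input builds a different graph

The trade-off is that this processes one tree at a time; Fold's dynamic batching exists precisely to batch many such irregular graphs together for speed.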

Licensed under: CC-BY-SA with attribution
Not affiliated with datascience.stackexchange