Question

I'm captivated by autoencoders and really like the idea of convolution. It seems, though, that both Theano and TensorFlow only support conv2d going from a batch of 2D RGB images (n 3D arrays) to a batch of higher-depth images. That makes sense from the traditional tensor-product math, c_ijlm = sum_k(a_ijk * b_klm), but it means it's hard to 'de-convolve' an image.

In both cases, if I have an image (in #batch, depth, height, width form), I can do a conv to get (#batch, num_filters, height/k, width/k). I'd really like to do the opposite, like going from (#batch, some_items, height/k, width/k) to (#batch, depth, height, width).

TensorFlow had the hidden deconv2d function for a while (in 0.6, I think, undocumented), but I'd like to know if there's a math trick I can use to get a bigger output in the last two dimensions after a convolution than the input. I'd settle for a series of differentiable operations, like conv -> resize, but I want to avoid just doing a dense matrix multiplication -> resize like I've been doing so far.

EDIT: As of today (2016/02/17) TensorFlow 0.7 has the tf.depth_to_space method, which helps greatly in this endeavor. (https://www.tensorflow.org/api_docs/python/tf/depth_to_space) I would still love a Theano based solution, too, to complete my understanding of the material.
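To make the depth_to_space idea concrete, here is a minimal sketch in plain Python (nested lists, no TensorFlow). It assumes a single image in (height, width, depth) layout and assumes the sub-pixel ordering in which depth index d maps to block offset d // out_depth, which I believe matches TensorFlow's documented NHWC behaviour. The function name and its arguments are my own, not the library's signature.

```python
def depth_to_space(image, block):
    """Rearrange an (H, W, D) nested list into (H*block, W*block, D // block**2).

    Illustrative only: assumes depth index d = (i * block + j) * out_depth + c,
    where (i, j) is the position inside the block and c the output channel.
    """
    height, width, depth = len(image), len(image[0]), len(image[0][0])
    if depth % (block * block) != 0:
        raise ValueError("depth must be divisible by block * block")
    out_depth = depth // (block * block)
    out = [[[None] * out_depth for _ in range(width * block)]
           for _ in range(height * block)]
    for h in range(height):
        for w in range(width):
            for d in range(depth):
                offset, c = divmod(d, out_depth)  # which sub-pixel, which channel
                i, j = divmod(offset, block)      # row and column inside the block
                out[h * block + i][w * block + j][c] = image[h][w][d]
    return out


pixel = [[[1, 2, 3, 4]]]         # H=1, W=1, depth=4
print(depth_to_space(pixel, 2))  # [[[1], [2]], [[3], [4]]]
```

So a conv that produces block*block times as many filters as you need, followed by this rearrangement, gives an output that is larger in the last two spatial dimensions, and every step stays differentiable.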

Solution

Things have changed in TensorFlow since this question was asked, but here is a link to doing conv2d_transpose. I think that's what you are looking for.
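For intuition about what a transposed convolution computes, here is a rough single-channel sketch in plain Python. It is not the tf.nn.conv2d_transpose signature: the function name, the nested-list layout, and the lack of padding, batching, and channels are all simplifying assumptions of mine. Each input pixel scatters a scaled copy of the kernel into the output, so the output has size (in - 1) * stride + kernel_size.

```python
def conv2d_transpose(image, kernel, stride):
    """Single-channel transposed convolution on nested lists (no padding).

    Illustrative sketch only; real library ops also handle batches,
    channels, and padding modes.
    """
    in_h, in_w = len(image), len(image[0])
    k_h, k_w = len(kernel), len(kernel[0])
    out_h = (in_h - 1) * stride + k_h
    out_w = (in_w - 1) * stride + k_w
    out = [[0.0] * out_w for _ in range(out_h)]
    for y in range(in_h):
        for x in range(in_w):
            # Scatter a copy of the kernel, scaled by this input pixel.
            for ky in range(k_h):
                for kx in range(k_w):
                    out[y * stride + ky][x * stride + kx] += image[y][x] * kernel[ky][kx]
    return out


small = [[1.0, 2.0], [3.0, 4.0]]
ones = [[1.0, 1.0], [1.0, 1.0]]
big = conv2d_transpose(small, ones, stride=2)
print(len(big), len(big[0]))  # 4 4
```

With a stride smaller than the kernel size the scattered copies overlap and add up, which is where the learned kernel gets to blend neighbouring input values.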

OTHER TIPS

Maybe have a look at this post. You can do a convolution which produces an output of similar size and then "unpool" those feature maps.
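As a rough illustration of "unpooling", here is one simple variant in plain Python: each value is repeated over a factor x factor block (nearest-neighbour upsampling). I'm not claiming this is the method from the linked post; other variants put the value in one corner of the block and fill the rest with zeros. The function name is hypothetical.

```python
def unpool_nearest(feature_map, factor):
    """Upsample a 2D nested list by repeating each value factor x factor times."""
    out = []
    for row in feature_map:
        # Repeat every value along the width...
        wide = [value for value in row for _ in range(factor)]
        # ...then repeat the widened row along the height.
        out.extend(list(wide) for _ in range(factor))
    return out


print(unpool_nearest([[1, 2], [3, 4]], 2))
# [[1, 1, 2, 2], [1, 1, 2, 2], [3, 3, 4, 4], [3, 3, 4, 4]]
```

Following this with an ordinary size-preserving convolution gives the conv -> resize style pipeline the question mentions.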

Not sure if you're looking for the filter weights in the deconvolutional layer to be tied to those of the corresponding convolutional layer, but either is possible in Lasagne, which runs on Theano. Here is an untied implementation of a deconvolutional layer that outputs an image larger than its input: https://groups.google.com/forum/?hl=en#!topic/lasagne-users/9H6-mmnkHX0

Licensed under: CC-BY-SA with attribution
Not affiliated with datascience.stackexchange