Question

I have done some research on autoencoders, and I have come to understand that they can also be used for feature extraction (see this question on this site as an example). Most of the examples out there seem to focus on autoencoders applied to image data, but I would like to apply them to a more general data set.

Therefore, I have implemented an autoencoder using the Keras framework in Python. For simplicity, and to test my program, I have run it against the Iris data set, telling it to compress my original data from 4 features down to 2 to see how it would behave.

The encoder seems to be doing its job in compressing the data (the output of the encoder layer does indeed show only two columns). However, the values of these two columns do not appear in the original dataset, which makes me think that the autoencoder is doing something in the background, selecting/combining the features in order to get to the compressed representation.

Here is the complete working example:

from pandas import read_csv
from numpy.random import seed
from sklearn.model_selection import train_test_split
from keras.layers import Input, Dense
from keras.models import Model

# Get input data and separate features from labels
df = read_csv("iris.data")
Y = df.iloc[:,4]
X = df.iloc[:, : 4]

# Split data set in train and test data
seed(1234)  # seed numpy's RNG so the weight initialisation is reproducible
X_train, X_test, Y_train, Y_test = train_test_split(X, Y, train_size=0.5, random_state=1234)

# Input information
col_num = X.shape[1]
input_layer = Input(shape=(col_num,))

# Encoding information
encoding_dim = 2
encoded = Dense(encoding_dim, activation='relu')(input_layer)
# Decoding information
decoded = Dense(col_num, activation='sigmoid')(encoded)
# Autoencoder information (encoder + decoder)
autoencoder = Model(inputs=input_layer, outputs=decoded)

# Train the autoencoder
autoencoder.compile(optimizer='adadelta', loss='mean_squared_error')
autoencoder.fit(X_train, X_train, epochs=50, batch_size=100, shuffle=True, validation_data=(X_test, X_test))

# Encoder information for feature extraction
encoder = Model(inputs=input_layer, outputs=encoded)
encoded_output = encoder.predict(X_test)

# Show the encoded values
print(encoded_output[:5])

This is the output from this example:

[[ 0.28065908  6.151131  ]
 [ 0.8104178   5.042427  ]
 [-0.          6.4602194 ]
 [ 3.0278277   2.7351477 ]
 [ 0.06134868  5.064625  ]]

Basically, my idea was to use the autoencoder to extract the most relevant features from the original data set. However, so far I have only managed to get the autoencoder to compress the data, without really understanding which features are the most important.

My question is therefore this: is there any way to understand which features are being considered by the autoencoder to compress the data, and how exactly they are used to get to the 2-column compressed representation?

Solution

You are using a dense (fully connected) neural network layer to do the encoding. This layer computes a linear combination of the inputs followed by the specified non-linearity (ReLU in your case).
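
As a minimal sketch of what that means (assuming the trained encoder model and X_test from your code), you can reproduce the encoder's output by hand from its learned weights:

import numpy as np

# Minimal sketch: the single Dense encoding layer computes relu(X @ W + b)
W, b = encoder.layers[-1].get_weights()          # W has shape (4, 2), b has shape (2,)
manual_encoding = np.maximum(0, X_test.values @ W + b)

print(manual_encoding[:5])  # matches encoder.predict(X_test)[:5] up to floating-point error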

It is important to note that autoencoders perform feature extraction, not feature selection. They take information represented in the original space and transform it into another space. The compression works because there is some redundancy in the input representation for this specific task, and the transformation removes that redundancy. The original features are lost; you end up with features in the new space.

Which input features are being used by the encoder? The answer is all of them. How exactly are they used? You can inspect the weights that the network has assigned to the input-to-Dense-layer transformation to get some idea. You can probably build some intuition from those weights (for example: if encoded feature 1 puts high weights on input features 2 and 3, the encoder has combined features 2 and 3 into a single feature). But there is a non-linearity (ReLU) involved, so the result is not a simple linear combination of the inputs.
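
For example, here is a rough sketch of inspecting those weights (assuming the trained encoder from your code; the layer index may differ for other architectures):

# encoder.layers[-1] is the single Dense encoding layer in the question's model
weights, biases = encoder.layers[-1].get_weights()

# weights has shape (4, 2): one row per input feature, one column per encoded feature
feature_names = ["sepal length", "sepal width", "petal length", "petal width"]
for j in range(weights.shape[1]):
    print(f"Encoded feature {j} (bias {biases[j]:.3f}):")
    for name, w in zip(feature_names, weights[:, j]):
        print(f"  {name}: {w:.3f}")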

If your aim is a qualitative understanding of how features can be combined, you can use a simpler method such as Principal Component Analysis (PCA). The factor loadings in PCA's output tell you how the input features are combined. If the aim is to find the most effective feature transformation for accuracy, a neural-network-based encoder is useful, but you lose some interpretability of the feature extraction/transformation.
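
A comparable sketch of the PCA route on the same X (standardising the features first is my assumption, not part of your original code):

from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

# Scale the features, then project onto 2 principal components
X_scaled = StandardScaler().fit_transform(X)
pca = PCA(n_components=2)
X_pca = pca.fit_transform(X_scaled)

print(pca.components_)                # loadings: rows = components, columns = original features
print(pca.explained_variance_ratio_)  # fraction of variance captured by each component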

Licensed under: CC-BY-SA with attribution
Not affiliated with datascience.stackexchange