Problem

I am working on research where I need to classify each event into one of three classes, WINNER = (win, draw, lose). A sample of the data:

WINNER  LEAGUE  HOME  AWAY  MATCH_HOME  MATCH_DRAW  MATCH_AWAY  MATCH_U2_50  MATCH_O2_50
3       13      550   571   1.86        3.34        4.23        1.66         2.11
3       7       322   334   7.55        4.1         1.4         2.17         1.61

My current model is:

from keras.models import Sequential
from keras.layers import Dense, Dropout

def build_model(input_dim, output_classes):
    model = Sequential()
    # Hidden layer: 12 units, ReLU (the activation must be the string 'relu')
    model.add(Dense(12, input_dim=input_dim, activation='relu'))
    model.add(Dropout(0.5))
    # One output unit per class; softmax for mutually-exclusive classes
    model.add(Dense(output_classes, activation='softmax'))
    model.compile(loss='categorical_crossentropy', optimizer='adadelta')
    return model
1. I am not sure this is the correct setup for multi-class classification.
2. What is the best setup for binary classification?

EDIT: For #2, something like this?

model.add(Dense(12, input_dim=input_dim, activation='sigmoid'))
model.add(Dropout(0.5))
model.add(Dense(output_classes, activation='softmax'))
model.compile(loss='binary_crossentropy', optimizer='adadelta')

Solution

Your choice of activation='softmax' in the last layer, together with loss='categorical_crossentropy' in compile(), is correct for a model that predicts multiple mutually-exclusive classes.
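One detail worth checking alongside that: categorical_crossentropy expects one-hot encoded targets. A minimal sketch, assuming your WINNER column is coded 1–3 as in the sample rows above (keras.utils.to_categorical does the encoding):

import numpy as np
from keras.utils import to_categorical

# WINNER coded 1..3 as in the sample rows above
winner = np.array([3, 3, 1, 2])

# Shift to 0-based class indices, then one-hot encode
y = to_categorical(winner - 1, num_classes=3)
print(y.shape)  # (4, 3): one column per class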

Regarding more general choices, there is rarely a "right" way to construct the architecture. Instead, treat it as something you test with different hyperparameters (such as layer sizes, number of layers, amount of dropout), letting the results drive your choices, within whatever limits you have on training time, memory use, etc.

Use a cross-validation set to help choose a suitable architecture. Once that is done, use a separate test set to get a more accurate measure of your model's general performance; this should be data held out from your training set, separate from the CV set. A reasonable split might be 60/20/20 train/CV/test, depending on how much data you have and how accurate a final figure you need to report.
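A minimal sketch of that 60/20/20 split, using scikit-learn's train_test_split twice; the random arrays here are placeholders standing in for your real match features and one-hot labels:

import numpy as np
from keras.utils import to_categorical
from sklearn.model_selection import train_test_split

# Placeholder data standing in for the real match features/labels
rng = np.random.RandomState(0)
X = rng.rand(1000, 7)
y = to_categorical(rng.randint(0, 3, size=1000), num_classes=3)

# 20% held out as the final test set ...
X_rest, X_test, y_rest, y_test = train_test_split(X, y, test_size=0.20)
# ... then 0.25 of the remaining 80% becomes the CV set (0.25 * 0.8 = 0.2)
X_train, X_cv, y_train, y_cv = train_test_split(X_rest, y_rest, test_size=0.25)

# Compare architectures on the CV set; touch the test set only once, at the end
model = build_model(X_train.shape[1], 3)  # build_model from the question above
model.fit(X_train, y_train, validation_data=(X_cv, y_cv), epochs=20, batch_size=32)
print(model.evaluate(X_test, y_test))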

For question #2, you can either keep two outputs with a final softmax, much as you have now, or use a final layer with a single output, activation='sigmoid', and loss='binary_crossentropy'. Note that your EDIT mixes the two: binary_crossentropy with a multi-output softmax is not the standard pairing.
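A minimal sketch of the single-output variant (the hidden-layer size is illustrative, not tuned):

from keras.models import Sequential
from keras.layers import Dense, Dropout

def build_binary_model(input_dim):
    model = Sequential()
    model.add(Dense(12, input_dim=input_dim, activation='relu'))
    model.add(Dropout(0.5))
    # Single sigmoid output gives P(positive class); targets are 0/1, not one-hot
    model.add(Dense(1, activation='sigmoid'))
    model.compile(loss='binary_crossentropy', optimizer='adadelta')
    return model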

Purely as a gut feeling about what might work with this data, I would suggest trying 'tanh' or 'sigmoid' activations in the hidden layer instead of 'relu', increasing the number of hidden neurons (e.g. 100), and reducing the dropout (e.g. 0.2). Caveat: gut feeling about neural network architecture is not scientific. Try it, and test it.
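For concreteness, that suggestion as code; treat it as a starting point to test against your CV set, not a recommendation:

from keras.models import Sequential
from keras.layers import Dense, Dropout

def build_model_v2(input_dim, output_classes):
    model = Sequential()
    # Wider tanh hidden layer and lighter dropout, per the gut-feel suggestion above
    model.add(Dense(100, input_dim=input_dim, activation='tanh'))
    model.add(Dropout(0.2))
    model.add(Dense(output_classes, activation='softmax'))
    model.compile(loss='categorical_crossentropy', optimizer='adadelta')
    return model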

License: CC-BY-SA with attribution. Not affiliated with datascience.stackexchange.