Need a little help understanding how to build models in Keras
I am trying to build a CNN in Keras. To check that the model is valid, I am training it on the MNIST dataset so that I can be sure everything works. Unfortunately, the model is barely training, and I suspect that nothing is being updated.
My model is:
model = Sequential()
# conv1_1
model.add(Conv2D(128, kernel_size=3, strides=1,
                 padding='SAME', use_bias=False,
                 activation='relu', name='conv1_1', input_shape=(28, 28, 1)))
# conv1_2
model.add(Conv2D(128, kernel_size=3, strides=1,
                 padding='SAME', use_bias=False,
                 activation='relu', name='conv1_2'))
model.add(MaxPooling2D(pool_size=2, strides=2))
# conv2_1
model.add(Conv2D(64, kernel_size=3, strides=1,
                 padding='SAME', use_bias=False,
                 activation='relu', name='conv2_1'))
# conv2_2
model.add(Conv2D(64, kernel_size=3, strides=1,
                 padding='SAME', use_bias=False,
                 activation='relu', name='conv2_2'))
model.add(MaxPooling2D(pool_size=2, strides=2))
model.add(Flatten())
model.add(Dense(1024, activation='relu', name='Dense1'))
model.add(Dropout(0.2))
model.add(Dense(512, activation='relu', name='Dense2'))
model.add(Dense(10, activation='softmax', name='output'))
Compiled with:
model.compile(loss='categorical_crossentropy', optimizer='adadelta', metrics=['accuracy'])
model.fit(X_train,y_train,batch_size=10,validation_split=0.2,epochs=10)
My X_train and y_train look like:
plt.imshow(X_train[0].reshape(28,28))
plt.show()
y_train[0]
array([0., 0., 0., 0., 0., 1., 0., 0., 0., 0.])
Here are the results of the first 4 epochs:
Epoch 1/10
48000/48000 [==============================] - 45s 927us/step - loss: 14.2813 - acc: 0.1140 - val_loss: 14.4096 - val_acc: 0.1060
Epoch 2/10
48000/48000 [==============================] - 44s 915us/step - loss: 14.2813 - acc: 0.1140 - val_loss: 14.4096 - val_acc: 0.1060
Epoch 3/10
48000/48000 [==============================] - 44s 924us/step - loss: 14.2813 - acc: 0.1140 - val_loss: 14.4096 - val_acc: 0.1060
Epoch 4/10
48000/48000 [==============================] - 45s 930us/step - loss: 14.2813 - acc: 0.1140 - val_loss: 14.4096 - val_acc: 0.1060
This is my first Keras model, and I think I am missing something important here.
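A quick check of the log above, offered as a hedged diagnostic rather than a confirmed cause: a loss frozen at this value is what categorical cross-entropy gives when the softmax outputs are saturated, that is, when the network puts probability 1 on a single class for every image. The sketch assumes Keras clips predicted probabilities to an epsilon of 1e-7 before taking the logarithm (the default backend epsilon, as far as I know). The helper name `saturated_loss` is hypothetical.

```python
import math

EPSILON = 1e-7  # assumption: Keras backend default used to clip probabilities


def saturated_loss(accuracy, eps=EPSILON):
    """Mean cross-entropy when every prediction is one-hot.

    Correctly classified samples contribute about 0;
    misclassified samples contribute -log(eps) each.
    """
    return (1.0 - accuracy) * -math.log(eps)


train_loss = saturated_loss(0.1140)  # acc reported in the log
val_loss = saturated_loss(0.1060)    # val_acc reported in the log
print(round(train_loss, 2), round(val_loss, 2))
```

Both values agree with the logged `loss` and `val_loss` to about two decimals, which suggests the outputs saturated during the very first updates. Unscaled 0-255 inputs or a too-large step size can cause that, but the question does not show its preprocessing, so this remains a guess.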
keras cnn mnist
asked May 4 at 4:26
Muhammad Fasiurrehman Sohi
2 Answers
I suspect two things. First, the dropout rate at the last layer seems way too high; it is better to use a lower dropout rate after each CNN layer. Second, you should use a bias in your CNN layers.
Try this code as a starting point; you can then tune your model from there.
Load the data
from keras.datasets import mnist
import numpy as np
(x_train, y_train), (x_test, y_test) = mnist.load_data()
x_train = x_train.astype('float32') / 255.
x_test = x_test.astype('float32') / 255.
print('Training data shape: ', x_train.shape)
print('Testing data shape : ', x_test.shape)
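The division by 255 is worth a second look, because the question does not show how `X_train` was prepared. A minimal sketch with synthetic data (random `uint8` arrays standing in for MNIST, so nothing is downloaded) shows what the scaling step does to the dtype and value range:

```python
import numpy as np

# Synthetic stand-in for MNIST: 4 grayscale 28x28 images with uint8 pixels.
rng = np.random.default_rng(0)
raw = rng.integers(0, 256, size=(4, 28, 28), dtype=np.uint8)
raw[0, 0, 0], raw[0, 0, 1] = 0, 255  # make sure both extremes are present

# Same scaling as in the answer: cast to float32, then divide by 255.
scaled = raw.astype('float32') / 255.

print(raw.dtype, raw.min(), raw.max())
print(scaled.dtype, scaled.min(), scaled.max())
```

Inputs in the range 0 to 1 keep the first-layer activations small, which usually makes training less likely to blow up in the first few updates.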
Import Keras stuff
import keras
from keras.datasets import mnist
from keras.models import Sequential
from keras.layers import Dense, Dropout, Flatten
from keras.layers import Conv2D, MaxPooling2D
from keras.callbacks import ModelCheckpoint
from keras.models import model_from_json
from keras import backend as K
Now we reshape the data so that it fits the TensorFlow backend, which expects the channel to be the last dimension. We also one-hot encode the labels.
# The known number of output classes.
num_classes = 10
# Input image dimensions
img_rows, img_cols = 28, 28
# Channels go last for TensorFlow backend
x_train_reshaped = x_train.reshape(x_train.shape[0], img_rows, img_cols, 1)
x_test_reshaped = x_test.reshape(x_test.shape[0], img_rows, img_cols, 1)
input_shape = (img_rows, img_cols, 1)
# Convert class vectors to binary class matrices. This uses 1 hot encoding.
y_train_binary = keras.utils.to_categorical(y_train, num_classes)
y_test_binary = keras.utils.to_categorical(y_test, num_classes)
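To make the one-hot step concrete, here is a small NumPy-only sketch of the kind of matrix `to_categorical` produces. The helper `one_hot` is a hypothetical stand-in written for illustration, not the Keras implementation, and the labels are arbitrary example values.

```python
import numpy as np


def one_hot(labels, num_classes):
    """Return a (len(labels), num_classes) float32 matrix with a single 1 per row."""
    labels = np.asarray(labels, dtype=int)
    # Row i of the identity matrix is the one-hot vector for class i.
    return np.eye(num_classes, dtype='float32')[labels]


encoded = one_hot([5, 0, 4], num_classes=10)
print(encoded[0])
```

The first row has its 1 at index 5, which is the same vector the question shows for `y_train[0]`.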
Define the model
model = Sequential()
model.add(Conv2D(32, kernel_size=(3, 3),
                 activation='relu',
                 input_shape=input_shape))
model.add(Conv2D(64, (3, 3), activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.25))
model.add(Flatten())
model.add(Dense(128, activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(num_classes, activation='softmax'))
model.compile(loss=keras.losses.categorical_crossentropy,
              optimizer=keras.optimizers.Adadelta(),
              metrics=['accuracy'])
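As a sanity check on this architecture, the layer output sizes and the parameter count can be worked out by hand. The sketch below assumes the Keras defaults that the model relies on (unpadded "valid" convolutions, stride 1, and a bias in every layer); the helper names are hypothetical.

```python
def conv_out(size, kernel, stride=1):
    """Output size along one axis of an unpadded ('valid') convolution or pooling."""
    return (size - kernel) // stride + 1


def conv_params(kernel, in_ch, out_ch):
    """Weights plus one bias per output channel."""
    return kernel * kernel * in_ch * out_ch + out_ch


def dense_params(n_in, n_out):
    """Weights plus one bias per output unit."""
    return n_in * n_out + n_out


side = 28
side = conv_out(side, 3)            # Conv2D(32, 3x3)  -> 26
side = conv_out(side, 3)            # Conv2D(64, 3x3)  -> 24
side = conv_out(side, 2, stride=2)  # MaxPooling2D(2x2) -> 12
flat = side * side * 64             # Flatten -> 9216 features

total = (conv_params(3, 1, 32) + conv_params(3, 32, 64)
         + dense_params(flat, 128) + dense_params(128, 10))
print(side, flat, total)
```

If `model.summary()` reports a different total, one of the assumed defaults probably does not hold in your Keras version.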
Train the model
epochs = 10
batch_size = 128
# Fit the model weights.
model.fit(x_train_reshaped, y_train_binary,
          batch_size=batch_size,
          epochs=epochs,
          verbose=1,
          validation_data=(x_test_reshaped, y_test_binary))
Evaluate the model
score = model.evaluate(x_test_reshaped, y_test_binary, verbose=0)
print('Model accuracy:')
print('Test loss:', score[0])
print('Test accuracy:', score[1])
answered May 4 at 5:22
JahKnows
I implemented your model and, to my astonishment, the problem is a very minute one that is hard to notice.
I was able to get better accuracy by changing the optimizer to SGD or Adam.
You used Adadelta, which is an extension of the Adagrad optimizer. Adagrad performs well on sparse data and when training large-scale neural networks, but its monotonically decreasing learning rate usually proves too aggressive and stops learning too early.
Refer to this link for more background on optimizers.
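The remark about the monotonic learning rate refers to how Adagrad divides the base learning rate by the root of a running sum of squared gradients, a sum that can only grow. Below is a minimal scalar sketch of that rule. It is not the Keras implementation, and the function name and the constant gradient of 1.0 are assumptions made for illustration.

```python
import numpy as np


def adagrad_step_sizes(grads, lr=0.01, eps=1e-7):
    """Effective learning rate per step: lr / (sqrt(sum of squared grads) + eps)."""
    accum = 0.0
    sizes = []
    for g in grads:
        accum += g * g  # the accumulator never decreases
        sizes.append(lr / (np.sqrt(accum) + eps))
    return sizes


sizes = adagrad_step_sizes([1.0] * 5)
print([round(float(s), 4) for s in sizes])
```

Each step is smaller than the one before it, even though the gradient never changes. Adadelta replaces the running sum with a decaying average, which is meant to avoid this shrinkage.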
answered May 4 at 7:27
Swapnil Pote
Do note that the MNIST dataset is sparse.
– JahKnows, May 4 at 17:04