Multilayer Perceptrons
- Hidden Layers
- Activation Functions
- Sigmoid Function
- Tanh Function
- Implement Multilayer Perceptrons from Scratch
- Initializing Parameters
- Activation Function
- Model
- Loss Function
- Training
The multilayer perceptron is the simplest deep network: it consists of multiple layers of neurons, each fully connected to those in the layer below (from which it receives input) and those in the layer above (which it, in turn, influences).
By incorporating one or more hidden layers, we can overcome the limitations of linear models and handle a more general class of functions.
In order to realize the potential of multilayer architectures, we need one more key ingredient: a nonlinear activation function applied to each hidden unit following the affine transformation.
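Concretely, for a minibatch of inputs $\mathbf{X}$, a one-hidden-layer MLP computes
$$\mathbf{H} = \sigma\left(\mathbf{X} \mathbf{W}^{(1)} + \mathbf{b}^{(1)}\right), \qquad \mathbf{O} = \mathbf{H} \mathbf{W}^{(2)} + \mathbf{b}^{(2)},$$
where $\sigma$ denotes the activation function applied elementwise. One common choice is the sigmoid function, which squashes any input into a value in the interval (0, 1):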
$$\operatorname{sigmoid}(x) = \frac{1}{1 + \exp(-x)}.$$
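We can plot the sigmoid to see this squashing behavior. The snippet below is a minimal sketch that assumes `tensorflow` has been imported as `tf` and `seaborn` as `sns`; the input range is only illustrative.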
x = tf.range(-8.0, 8.0, 0.1)  # evenly spaced sample inputs
y = tf.nn.sigmoid(x)
sns.lineplot(x=x.numpy(), y=y.numpy()).set(xlabel='x', ylabel='Sigmoid')
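Like the sigmoid, the tanh (hyperbolic tangent) function squashes its inputs, but it maps them into the interval (-1, 1):
$$\operatorname{tanh}(x) = \frac{1 - \exp(-2x)}{1 + \exp(-2x)}.$$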
y = tf.nn.tanh(x)
sns.lineplot(x=x.numpy(), y=y.numpy()).set(xlabel='x', ylabel='Tanh')
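With these activation functions in hand, we can now implement a multilayer perceptron from scratch and train it on the Fashion-MNIST image classification dataset.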
from d2l import tensorflow as d2l
import tensorflow as tf
batch_size = 256
train_iter, test_iter = d2l.load_data_fashion_mnist(batch_size)
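Fashion-MNIST contains 10 classes, and each example is a 28×28 grayscale image, so flattening an image yields 28 × 28 = 784 input features. We implement an MLP with a single hidden layer of 256 units, initializing the weights from a Gaussian with a small standard deviation.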
num_inputs, num_outputs, num_hiddens = 784, 10, 256

# Hidden-layer parameters: 784 -> 256
W1 = tf.Variable(tf.random.normal(
    shape=(num_inputs, num_hiddens), mean=0, stddev=0.01))
b1 = tf.Variable(tf.zeros(num_hiddens))
# Output-layer parameters: 256 -> 10
W2 = tf.Variable(tf.random.normal(
    shape=(num_hiddens, num_outputs), mean=0, stddev=0.01))
b2 = tf.Variable(tf.random.normal([num_outputs], stddev=0.01))

params = [W1, b1, W2, b2]
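For the hidden layer we use the ReLU activation, $\operatorname{ReLU}(x) = \max(x, 0)$, implemented here directly with the maximum function rather than by calling the built-in relu.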
def relu(X):
    return tf.math.maximum(X, 0)
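Because we disregard the spatial structure of the images, each two-dimensional image is reshaped into a flat vector of length num_inputs before the hidden and output layers are applied.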
def net(X):
    X = tf.reshape(X, (-1, num_inputs))  # flatten each image into a vector
    H = relu(tf.matmul(X, W1) + b1)      # hidden layer with ReLU activation
    return tf.matmul(H, W2) + b2         # output layer returns logits
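As a quick sanity check, a sketch assuming the data iterator yields batches of 28×28×1 images, the network should map a minibatch to one logit per class:

X = tf.random.normal((4, 28, 28, 1))  # a fake minibatch of four images
print(net(X).shape)                   # expected: (4, 10)

For the objective we use the built-in softmax cross-entropy loss and pass in the logits directly, which avoids the numerical instability of computing the softmax ourselves.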
def loss(y_hat, y):
    return tf.losses.sparse_categorical_crossentropy(
        y, y_hat, from_logits=True)
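We train for 10 epochs with a learning rate of 0.1. The training loop is the same as for softmax regression, so we reuse the d2l minibatch stochastic gradient descent updater and the train_ch3 helper.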
num_epochs, lr = 10, 0.1
updater = d2l.Updater([W1, W2, b1, b2], lr)
d2l.train_ch3(net, train_iter, test_iter, loss, num_epochs, updater)
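The updater simply applies minibatch stochastic gradient descent to the parameters. In case d2l.Updater is unavailable, a minimal stand-in might look like the sketch below; it assumes, as in the d2l training loop, that the gradients passed in are summed over the minibatch, so the step is scaled by the batch size.

class SGDUpdater:
    """Minimal minibatch SGD updater (hypothetical stand-in for d2l.Updater)."""
    def __init__(self, params, lr):
        self.params, self.lr = params, lr

    def __call__(self, batch_size, grads):
        # Average the summed gradients over the batch and take one SGD step.
        for param, grad in zip(self.params, grads):
            param.assign_sub(self.lr * grad / batch_size)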