Linear Regression
Linear Regression from scratch
Regression is a set of methods for estimating the relationship between an outcome variable and one or more features.
When our input consists of $d$ features, we express our prediction $\hat{y}$ as $$\hat{y} = w_1 x_1 + \dots + w_d x_d + b$$
Collecting all features into a vector $\mathbf{x} \in \mathbb{R}^d$ and all weights into a vector $\mathbf{w} \in \mathbb{R}^d$, we can express our model compactly using a dot product: $$ \hat{y} = \mathbf{w}^T\mathbf{x} + b$$
The vector $\mathbf{x}$ corresponds to the features of a single data example. To represent the whole dataset we use the design matrix $\mathbf{X} \in \mathbb{R}^{n\times d}$, which contains one row for every example and one column for every feature. The predictions $\mathbf{\hat{y}} \in \mathbb{R}^n$ can then be expressed as a matrix-vector product: $$ \mathbf{\hat{y}} = \mathbf{X}\mathbf{w} + b$$
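As a quick illustration of the vectorized form, here is a minimal sketch; the weights, bias, and the two toy examples below are made up purely for illustration:

import tensorflow as tf

w = tf.constant([[2.0], [-3.4]])          # (d, 1) column of weights
b = 4.2
X = tf.constant([[1.0, 2.0],
                 [3.0, 4.0]])             # (n, d): one row per example
y_hat = tf.matmul(X, w) + b               # (n, 1): one prediction per example
print(y_hat)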
The loss function quantifies the distance between the real and predicted value of the target. For a single example we use the squared error $(y - \hat{y})^2$; summed over the whole dataset, and absorbing the bias $b$ into $\mathbf{w}$ by appending a constant feature of 1 to every example, the loss becomes $$ \|\mathbf{y} - \mathbf{X}\mathbf{w}\|^2 $$
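In code, the dataset-level loss is a single reduction. The tensors below are made up solely to show the shapes involved; note the appended column of ones that plays the role of the bias:

import tensorflow as tf

X = tf.constant([[1.0, 2.0], [3.0, 4.0]])              # (n, d) features
y = tf.constant([[0.5], [-1.0]])                       # (n, 1) targets
Xb = tf.concat([X, tf.ones((2, 1))], axis=1)           # (n, d+1): bias absorbed via ones column
w = tf.constant([[2.0], [-3.4], [4.2]])                # (d+1, 1): last entry acts as b
loss = tf.reduce_sum(tf.square(y - tf.matmul(Xb, w)))  # ||y - Xw||^2
print(loss)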
We need to find the $\mathbf{w}$ that minimizes this loss. The analytic solution is: $$ \mathbf{w}^* = (\mathbf{X}^T\mathbf{X})^{-1}\mathbf{X}^T\mathbf{y}$$
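This formula follows from a standard least-squares argument: setting the gradient of the loss with respect to $\mathbf{w}$ to zero yields the normal equations,
$$ \nabla_{\mathbf{w}} \|\mathbf{y} - \mathbf{X}\mathbf{w}\|^2 = -2\mathbf{X}^T(\mathbf{y} - \mathbf{X}\mathbf{w}) = 0 \implies \mathbf{X}^T\mathbf{X}\,\mathbf{w} = \mathbf{X}^T\mathbf{y} \implies \mathbf{w}^* = (\mathbf{X}^T\mathbf{X})^{-1}\mathbf{X}^T\mathbf{y}, $$
which assumes $\mathbf{X}^T\mathbf{X}$ is invertible.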
import tensorflow as tf

def synthetic_data(w, b, num_examples):
    """Generate y = Xw + b + noise."""
    # Draw features from a standard normal distribution
    X = tf.zeros((num_examples, w.shape[0]))
    X += tf.random.normal(shape=X.shape)
    # Noise-free targets, plus a small amount of Gaussian noise
    y = tf.matmul(X, tf.reshape(w, (-1, 1))) + b
    y += tf.random.normal(shape=y.shape, stddev=0.01)
    y = tf.reshape(y, (-1, 1))
    return X, y
true_w = tf.constant([2, -3.4])
true_b = 4.2
X, y = synthetic_data(true_w, true_b, 1000)
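A quick sanity check on the generated data; the printed values will vary from run to run:

print(X.shape, y.shape)                      # (1000, 2) (1000, 1)
print('features:', X[0].numpy(), '\nlabel:', y[0].numpy())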
def solution(X, y):
    """Solve the least-squares problem via the normal equations."""
    # Append a column of ones so the bias b is learned as the last weight
    X = tf.concat([X, tf.ones(y.shape)], 1)
    # w* = (X^T X)^{-1} X^T y
    A = tf.linalg.inv(tf.matmul(tf.transpose(X), X))
    B = tf.matmul(tf.transpose(X), y)
    return tf.matmul(A, B)
solution(X, y)
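To check the result against the parameters used to generate the data (a small sketch; the exact numbers differ slightly from run to run because of the noise):

w_hat = solution(X, y)
# The first two entries estimate true_w, the last entry estimates true_b
print('estimated w:', tf.reshape(w_hat[:2], (-1,)).numpy())   # roughly [2, -3.4]
print('estimated b:', w_hat[2, 0].numpy())                    # roughly 4.2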