# Linear Regression: Basis Functions, Vectorization

## Preview text

These slides were assembled by Byron Boots, with grateful acknowledgement to Eric Eaton and the many others who made their course materials freely available online. Feel free to reuse or adapt these slides for your own academic purposes, provided that you include proper attribution.
Robot Image Credit: Viktoriya Sukhanova © 123RF.com

Last Time: Linear Regression

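• Hypothesis:

$$y = \theta_0 + \theta_1 x_1 + \theta_2 x_2 + \dots + \theta_d x_d = \sum_{j=0}^{d} \theta_j x_j$$

where $x_0 = 1$, so that the sum includes the intercept $\theta_0$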

• Fit model by minimizing sum of squared errors


Figures are courtesy of Greg Shakhnarovich


• Initialize $\theta$
• Repeat until convergence:

$$\theta_j \leftarrow \theta_j - \alpha \frac{\partial}{\partial \theta_j} J(\theta)$$

simultaneous update for j = 0 ... d
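
The update rule above can be sketched in vectorized NumPy. This is a minimal illustration, not the course's reference code: it assumes the cost is $J(\theta) = \frac{1}{2n} \sum_i (h_\theta(x^{(i)}) - y^{(i)})^2$, and the function name `gradient_descent`, the step size, the iteration count, and the toy data are all hypothetical choices.

```python
import numpy as np

def gradient_descent(X, y, alpha=0.5, num_iters=2000):
    """Batch gradient descent for linear regression.

    X is an (n, d) array of inputs. A column of ones is prepended so that
    theta[0] plays the role of the intercept theta_0 (i.e. x_0 = 1).
    Assumes J(theta) = (1 / 2n) * sum of squared errors.
    """
    n = X.shape[0]
    Xb = np.hstack([np.ones((n, 1)), X])   # add the constant feature x_0 = 1
    theta = np.zeros(Xb.shape[1])          # initialize theta
    for _ in range(num_iters):             # fixed iteration budget instead of a convergence test
        residual = Xb @ theta - y          # h_theta(x) - y for every example at once
        grad = Xb.T @ residual / n         # all partial derivatives dJ/dtheta_j in one product
        theta = theta - alpha * grad       # simultaneous update for j = 0 ... d
    return theta

# Noise-free toy data generated from y = 1 + 2x (illustrative only)
X = np.linspace(0.0, 1.0, 20).reshape(-1, 1)
y = 1.0 + 2.0 * X[:, 0]
theta = gradient_descent(X, y)
```

Computing the whole gradient as `Xb.T @ residual` is what makes the update simultaneous: every component of `theta` is changed using the same old value of `theta`.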

[Figure: $J(\theta)$ plotted against $\theta$, annotated with $\alpha$]

Regression

Given:

– Data $X = \{x^{(1)}, \dots, x^{(n)}\}$ where $x^{(i)} \in \mathbb{R}^d$

– Corresponding labels $y = \{y^{(1)}, \dots, y^{(n)}\}$ where $y^{(i)} \in \mathbb{R}$

[Figure: September Arctic Sea Ice Extent (1,000,000 sq km) versus Year, 1975 to 2015, with Linear Regression and Quadratic Regression fits]

Data from G. Witt. Journal of Statistics Education, Volume 21, Number 1 (2013)


Extending Linear Regression to More Complex Models
• The inputs X for linear regression can be:
– Original quantitative inputs
– Transformation of quantitative inputs
  • e.g. log, exp, square root, square, etc.
– Polynomial transformation
  • example: $y = b_0 + b_1 x + b_2 x^2 + b_3 x^3$
– Basis expansions
– Dummy coding of categorical inputs
– Interactions between variables
  • example: $x_3 = x_1 \times x_2$
This allows use of linear regression techniques to fit non-linear datasets.
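
As a rough sketch of how such transformed inputs might be assembled, the snippet below builds a feature matrix from two raw inputs. The function name `expand_features` and the particular choice of columns are hypothetical; any combination of the transformations listed above would work the same way.

```python
import numpy as np

def expand_features(X):
    """Map two raw inputs (x1, x2) to an illustrative set of transformed inputs."""
    x1, x2 = X[:, 0], X[:, 1]
    return np.column_stack([
        np.ones(len(X)),   # constant column for the intercept
        x1,                # original quantitative inputs
        x2,
        np.log(x1),        # transformation of a quantitative input (requires x1 > 0)
        x1 ** 2,           # polynomial term
        x1 * x2,           # interaction between variables
    ])

X = np.array([[1.0, 2.0], [2.0, 3.0], [4.0, 5.0]])
F = expand_features(X)     # F has one row per example and six columns
```

The model stays linear in its parameters: each column of `F` simply gets its own weight, even though some columns are non-linear functions of the raw inputs.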

Linear Basis Function Models
• Generally,

$$h_\theta(x) = \sum_{j=0}^{d} \theta_j \phi_j(x)$$

where $\phi_j(x)$ is a basis function
• Typically, $\phi_0(x) = 1$ so that $\theta_0$ acts as a bias
• In the simplest case, we use linear basis functions: $\phi_j(x) = x_j$
Based on slide by Christopher Bishop (PRML)

Linear Basis Function Models
• Polynomial basis functions: $\phi_j(x) = x^j$
– These are global; a small change in x affects all basis functions
• Gaussian basis functions: $\phi_j(x) = \exp\left(-\frac{(x - \mu_j)^2}{2 s^2}\right)$
– These are local; a small change in x only affects nearby basis functions. $\mu_j$ and $s$ control location and scale (width).
Based on slide by Christopher Bishop (PRML)

Linear Basis Function Models
• Sigmoidal basis functions: $\phi_j(x) = \sigma\left(\frac{x - \mu_j}{s}\right)$, where $\sigma(a) = \frac{1}{1 + \exp(-a)}$
– These are also local; a small change in x only affects nearby basis functions. $\mu_j$ and $s$ control location and scale (slope).
Based on slide by Christopher Bishop (PRML)
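
The three families can be evaluated on a scalar input with a few lines of NumPy. This sketch assumes the standard forms from Bishop's PRML: $x^j$ for the polynomial basis, $\exp(-(x - \mu_j)^2 / (2 s^2))$ for the Gaussian basis, and the logistic sigmoid of $(x - \mu_j)/s$ for the sigmoidal basis. The function names, centers, and widths are hypothetical.

```python
import numpy as np

def polynomial_basis(x, degree):
    """Columns x**0, x**1, ..., x**degree."""
    return np.vander(x, degree + 1, increasing=True)

def gaussian_basis(x, centers, s):
    """Columns exp(-(x - mu_j)**2 / (2 * s**2)), one per center mu_j."""
    return np.exp(-(x[:, None] - centers[None, :]) ** 2 / (2.0 * s ** 2))

def sigmoidal_basis(x, centers, s):
    """Columns sigma((x - mu_j) / s), with sigma the logistic sigmoid."""
    return 1.0 / (1.0 + np.exp(-(x[:, None] - centers[None, :]) / s))

x = np.linspace(-1.0, 1.0, 5)             # five sample inputs
centers = np.array([-0.5, 0.0, 0.5])      # illustrative locations mu_j
P = polynomial_basis(x, degree=3)         # shape (5, 4)
G = gaussian_basis(x, centers, s=0.2)     # shape (5, 3)
S = sigmoidal_basis(x, centers, s=0.1)    # shape (5, 3)
```

Each function returns a matrix with one row per input and one column per basis function, so the output can be passed to a linear regression fit in place of the raw inputs.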

Example of Fitting a Polynomial Curve with a Linear Model
$$y = \theta_0 + \theta_1 x + \theta_2 x^2 + \dots + \theta_p x^p = \sum_{j=0}^{p} \theta_j x^j$$
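
A minimal sketch of this idea follows. It solves the least-squares problem directly with `np.linalg.lstsq` rather than with gradient descent, which is a substitution made here for brevity. The function name `fit_polynomial` and the toy data are hypothetical.

```python
import numpy as np

def fit_polynomial(x, y, p):
    """Least-squares fit of y = sum_j theta_j * x**j for j = 0 ... p."""
    Phi = np.vander(x, p + 1, increasing=True)        # columns 1, x, ..., x**p
    theta, *_ = np.linalg.lstsq(Phi, y, rcond=None)   # minimizes ||Phi @ theta - y||^2
    return theta

# Noise-free toy data from a known quadratic: y = 1 - 2x + 3x^2
x = np.linspace(-1.0, 1.0, 30)
y = 1.0 - 2.0 * x + 3.0 * x ** 2
theta = fit_polynomial(x, y, p=2)                     # recovers the generating coefficients
```

The curve is non-linear in `x`, but the fit is an ordinary linear regression on the columns of `Phi`.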

Linear Basis Function Models

• Basic linear model:

$$h_\theta(x) = \sum_{j=0}^{d} \theta_j x_j$$

• More general linear model:

$$h_\theta(x) = \sum_{j=0}^{d} \theta_j \phi_j(x)$$

• Once we have replaced the data by the outputs of the basis functions, fitting the generalized model is exactly the same problem as fitting the basic model
– Unless we use the kernel trick – more on that when we cover support vector machines
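
To illustrate that the fitting step does not change, the sketch below passes two different feature matrices to the same routine: one holding the raw input plus a constant, the other holding the outputs of Gaussian basis functions. The helper name `fit_linear`, the use of `np.linalg.lstsq`, the centers, the width 0.1, and the sine-wave data are all assumptions made for this example.

```python
import numpy as np

def fit_linear(F, y):
    """Least-squares weights for any feature matrix F, raw or transformed."""
    theta, *_ = np.linalg.lstsq(F, y, rcond=None)
    return theta

x = np.linspace(0.0, 1.0, 50)
y = np.sin(2.0 * np.pi * x)                      # a clearly non-linear target

# Basic model: features are 1 and x.
F_basic = np.column_stack([np.ones_like(x), x])
theta_basic = fit_linear(F_basic, y)

# Generalized model: features are 1 and nine Gaussian basis functions.
centers = np.linspace(0.0, 1.0, 9)               # hypothetical locations mu_j
s = 0.1                                          # hypothetical width
F_gauss = np.column_stack([
    np.ones_like(x),
    np.exp(-(x[:, None] - centers[None, :]) ** 2 / (2.0 * s ** 2)),
])
theta_gauss = fit_linear(F_gauss, y)             # same routine, different features

# Sum of squared errors for each model on the training data
sse_basic = np.sum((F_basic @ theta_basic - y) ** 2)
sse_gauss = np.sum((F_gauss @ theta_gauss - y) ** 2)
```

Only the construction of the feature matrix differs between the two models; `fit_linear` never needs to know which basis functions produced its input.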

Based on slide by Geoff Hinton
