
Commit 5b873e7

xhlulu authored and committed
Added 3 sections, drafted out 2 sections
1 parent 8e15e60 commit 5b873e7

1 file changed: +155 -2 lines changed

doc/python/ml-regression.md

Lines changed: 155 additions & 2 deletions
@@ -1,7 +1,91 @@
1-
# Regression
1+
---
2+
jupyter:
3+
jupytext:
4+
notebook_metadata_filter: all
5+
text_representation:
6+
extension: .md
7+
format_name: markdown
8+
format_version: '1.1'
9+
jupytext_version: 1.1.1
10+
kernelspec:
11+
display_name: Python 3
12+
language: python
13+
name: python3
14+
language_info:
15+
codemirror_mode:
16+
name: ipython
17+
version: 3
18+
file_extension: .py
19+
mimetype: text/x-python
20+
name: python
21+
nbconvert_exporter: python
22+
pygments_lexer: ipython3
23+
version: 3.7.6
24+
plotly:
25+
description: Visualize regression in scikit-learn with Plotly
26+
display_as: ai_ml
27+
language: python
28+
layout: base
29+
name: ML Regression
30+
order: 2
31+
page_type: example_index
32+
permalink: python/ml-regression/
33+
thumbnail: thumbnail/knn-classification.png
34+
---
235

36+
## Basic linear regression
337

4-
### Visualizing kNN Regression
38+
This example shows how to train a simple linear regression model from `sklearn` to predict the tips servers will receive based on the total bill (the dataset is included in `px.data`).
39+
40+
```python
41+
import numpy as np
42+
import plotly.express as px
43+
import plotly.graph_objects as go
44+
from sklearn.linear_model import LinearRegression
45+
46+
df = px.data.tips()
47+
X = df.total_bill.values.reshape(-1, 1)
48+
49+
model = LinearRegression()
50+
model.fit(X, df.tip)
51+
52+
x_range = np.linspace(X.min(), X.max(), 100)
53+
y_range = model.predict(x_range.reshape(-1, 1))
54+
55+
fig = px.scatter(df, x='total_bill', y='tip', opacity=0.65)
56+
fig.add_traces(go.Scatter(x=x_range, y=y_range, name='Regression Fit'))
57+
fig.show()
58+
```
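
As a rough illustration of what the fitted line represents, the sketch below recovers a slope and intercept by ordinary least squares using only NumPy. The data here is synthetic (an assumed linear relation plus noise, standing in for `total_bill` and `tip`), and the variable names are chosen for illustration; it is not the tips dataset itself.

```python
import numpy as np

# Synthetic stand-in for (total_bill, tip): an assumed linear trend plus noise.
# These values are illustrative only and do not come from px.data.tips().
rng = np.random.default_rng(0)
total_bill = rng.uniform(5, 50, size=200)
tip = 1.0 + 0.1 * total_bill + rng.normal(0, 0.5, size=200)

# Ordinary least squares with a single feature: fit tip ~ slope * total_bill + intercept.
slope, intercept = np.polyfit(total_bill, tip, deg=1)

# Evaluate the fitted line on an evenly spaced grid, as done with x_range above.
x_range = np.linspace(total_bill.min(), total_bill.max(), 100)
y_range = slope * x_range + intercept

print(f"slope={slope:.3f}, intercept={intercept:.3f}")
```

With the scikit-learn model, the corresponding values are available as `model.coef_` and `model.intercept_` after fitting.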
59+
60+
## Model generalization on unseen data
61+
62+
```python
63+
import numpy as np
64+
import plotly.express as px
65+
import plotly.graph_objects as go
66+
from sklearn.linear_model import LinearRegression
67+
from sklearn.model_selection import train_test_split
68+
69+
df = px.data.tips()
70+
X = df.total_bill.values.reshape(-1, 1)
71+
X_train, X_test, y_train, y_test = train_test_split(X, df.tip, random_state=0)
72+
73+
model = LinearRegression()
74+
model.fit(X_train, y_train)
75+
76+
x_range = np.linspace(X.min(), X.max(), 100)
77+
y_range = model.predict(x_range.reshape(-1, 1))
78+
79+
80+
fig = go.Figure([
81+
go.Scatter(x=X_train.squeeze(), y=y_train, name='train', mode='markers'),
82+
go.Scatter(x=X_test.squeeze(), y=y_test, name='test', mode='markers'),
83+
go.Scatter(x=x_range, y=y_range, name='prediction')
84+
])
85+
fig.show()
86+
```
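
One way to put a number on generalization is to compare the coefficient of determination (R²) on the training split with the value on the held-out split. The sketch below does this with NumPy on synthetic data (an assumed linear relation plus noise, not the tips dataset); the helper names `r2` and `predict` are hypothetical, and the 25% hold-out mirrors the default test size of `train_test_split`.

```python
import numpy as np

def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return 1.0 - ss_res / ss_tot

# Synthetic stand-in data (illustrative only).
rng = np.random.default_rng(0)
x = rng.uniform(5, 50, size=200)
y = 1.0 + 0.1 * x + rng.normal(0, 0.5, size=200)

# Shuffle the indices and hold out 25% of the samples for testing.
idx = rng.permutation(len(x))
n_test = len(x) // 4
test_idx, train_idx = idx[:n_test], idx[n_test:]

# Fit a line on the training split only.
slope, intercept = np.polyfit(x[train_idx], y[train_idx], deg=1)

def predict(values):
    """Apply the fitted line to new inputs."""
    return slope * values + intercept

r2_train = r2(y[train_idx], predict(x[train_idx]))
r2_test = r2(y[test_idx], predict(x[test_idx]))
print(f"train R^2={r2_train:.3f}, test R^2={r2_test:.3f}")
```

A test score far below the training score would suggest overfitting. With the scikit-learn model above, `model.score(X_test, y_test)` reports the same metric.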
87+
88+
## Comparing different kNN model parameters
589

690
```python
791
import numpy as np
@@ -27,6 +111,75 @@ fig.add_traces(go.Scatter(x=x_range, y=y_dist, name='Weights: Distance'))
27111
fig.show()
28112
```
29113

114+
## 3D regression surface with `px.scatter_3d` and `go.Surface`
115+
116+
```python
117+
import numpy as np
118+
import plotly.express as px
119+
import plotly.graph_objects as go
120+
from sklearn.neighbors import KNeighborsRegressor
121+
122+
mesh_size = .02
123+
margin = 0
124+
125+
df = px.data.iris()
126+
features = ["sepal_width", "sepal_length", "petal_width"]
127+
128+
X = df[['sepal_width', 'sepal_length']]
129+
y = df['petal_width']
130+
131+
# Condition the model on sepal width and length, predict the petal width
132+
knn = KNeighborsRegressor(10, weights='distance')
133+
knn.fit(X, y)
134+
135+
# Create a mesh grid on which we will run our model
136+
x_min, x_max = X.sepal_width.min() - margin, X.sepal_width.max() + margin
137+
y_min, y_max = X.sepal_length.min() - margin, X.sepal_length.max() + margin
138+
xrange = np.arange(x_min, x_max, mesh_size)
139+
yrange = np.arange(y_min, y_max, mesh_size)
140+
xx, yy = np.meshgrid(xrange, yrange)
141+
142+
# Run kNN
143+
pred = knn.predict(np.c_[xx.ravel(), yy.ravel()])
144+
pred = pred.reshape(xx.shape)
145+
146+
# Generate the plot
147+
fig = px.scatter_3d(df, x='sepal_width', y='sepal_length', z='petal_width')
148+
fig.update_traces(marker=dict(size=5))
149+
fig.add_traces(go.Surface(x=xrange, y=yrange, z=pred, name='pred_surface'))
150+
fig.show()
151+
```
152+
153+
## Label polynomial fits with LaTeX
154+
155+
```python
156+
157+
```
158+
159+
## Prediction Error Plots
160+
161+
162+
### Simple Prediction Error
163+
164+
```python
165+
166+
```
167+
168+
### Augmented Prediction Error plot using `px`
169+
170+
```python
171+
172+
```
173+
174+
### Grid Search Visualization using `px.scatter_matrix`
175+
176+
177+
## Residual Plots
178+
179+
```python
180+
181+
```
182+
30183
### Reference
31184

32185
Learn more about `px` here:
