Linear Regression and the Cost Function: Step-by-Step Guide with Calculations

As someone who is just beginning to explore the field of artificial intelligence, I’ve realized that it’s not enough to focus solely on large models — I also need to understand the foundational building blocks behind them. That’s why I’m documenting my learning process step by step, creating content that serves both as a personal reference and a resource for others on a similar path.

In this post, we’ll explore one of the core components of linear regression: the cost function, using a hands-on, example-based approach with detailed calculations.

Linear Regression with One Variable – What Is the Cost Function?

Let’s say you’re given a list:

Each house has a square meter (m²) value
And each house has a known price

Now you want to do something like this:

“If I know the square meter, can I write an equation that can roughly estimate the price?”

That equation is what we call linear regression — in other words, you’re trying to draw a straight line that fits the data.

But here’s the problem:
This line can’t perfectly pass through every data point.
The actual prices and your estimated prices won’t match exactly.

So naturally, the following question arises:

“How wrong are my predictions? How far off am I?”

The thing that measures this difference is called the cost function.

Why Do We Need a Cost Function?

Because you need a concrete way to measure how good (or bad) your model’s predictions are.

If your predictions are very close to the actual values → Low cost
If your predictions are far off → High cost

Goal: Minimize this cost.

What Does the Cost Function Do?

It checks each prediction (e.g., for 100 houses) one by one.

For every house:

It calculates the difference between the predicted price and the actual price
It takes the square of this difference (so negative and positive errors don’t cancel each other)
Then it finds the average of all these squared differences

And finally, it says:

“Your average error is this much.”

That value is called the cost.

What is Linear Regression?

Linear regression models the relationship between variables using a straight line.

$$
\hat{y} = \theta_1 x + \theta_0
$$

And describe terms like this:

$ \hat{y} $ — predicted value

$ x $ — input (e.g., house size)

$ \theta_1 $ — slope (weight)

$ \theta_0 $ — intercept (bias)

What Is the Cost Function?

The cost function measures how accurate predictions are. The most common is Mean Squared Error (MSE):

$$
\text{MSE} = \frac{1}{n} \sum_{i=1}^{n}(y_i – \hat{y}_i)^2
$$

Training form (used in gradient descent):

$$
J(\theta) = \frac{1}{2n} \sum_{i=1}^{n}(y_i – \hat{y}_i)^2
$$

Example Dataset: 10 Houses

House	Size (m²)	Price ($1000)
x1	80	200
x2	90	220
x3	100	240
x4	110	265
x5	120	280
x6	130	300
x7	140	320
x8	150	340
x9	160	360
x10	170	385

Initial Model y=1.9x−5

Error Calculations

House	x	Prediction $$y^\hat{y}y^$$	Actual y	Error	Error²
x1	80	147.0	200	-53.0	2809.0
x2	90	166.0	220	-54.0	2916.0
x3	100	185.0	240	-55.0	3025.0
x4	110	204.0	265	-61.0	3721.0
x5	120	223.0	280	-57.0	3249.0
x6	130	242.0	300	-58.0	3364.0
x7	140	261.0	320	-59.0	3481.0
x8	150	280.0	340	-60.0	3600.0
x9	160	299.0	360	-61.0	3721.0
x10	170	318.0	385	-67.0	4489.0

Prediction function:
y = 1.9x - 5

Then:

$$
2809 + 2916 + 3025 + 3721 + 3249 + 3364 + 3481 + 3600 + 3721 + 4489 = 34375
$$

MSE: $$
\frac{34375}{10} = 3437.5
$$

Cost function: $$
J(\theta) = \frac{1}{2} \times 3437.5 = 1718.75
$$

Learned Model y=2.31x−9.92

Error Calculations

House	x	Prediction $$y^\hat{y}y^$$	Actual y	Error	Error²
x1	80	174.88	200	-25.12	630.98
x2	90	198.98	220	-21.02	442.02
x3	100	223.08	240	-16.92	286.20
x4	110	247.18	265	-17.82	317.53
x5	120	271.28	280	-8.72	76.03
x6	130	295.38	300	-4.62	21.34
x7	140	319.48	320	-0.52	0.27
x8	150	343.58	340	3.58	12.81
x9	160	367.68	360	7.68	59.01
x10	170	391.78	385	6.78	45.97

Prediction function:
y = 2.31x - 9.92

$$
630.98 + 442.02 + 286.20 + 317.53 + 76.03 + 21.34 + 0.27 + 12.81 + 59.01 + 45.97 = 2220.91
$$

MSE: $$
\frac{2220.91}{10} = 222.09
$$

Cost function: $$
J(\theta) = \frac{1}{2} \times 222.09 = 111.05
$$

The learned model reduces cost by more than 15x.

Graph Summary

Black dots: actual prices
Red dashed line: initial model
Green line: learned model

Understanding the Cost Function in Linear Regression with Real Examples

Linear Regression and the Cost Function: Step-by-Step Guide with Calculations

Linear Regression with One Variable – What Is the Cost Function?

Why Do We Need a Cost Function?

What Does the Cost Function Do?

What is Linear Regression?

What Is the Cost Function?

Example Dataset: 10 Houses

Initial Model y=1.9x−5

Error Calculations

Learned Model y=2.31x−9.92

Error Calculations

Graph Summary

Leave a ReplyCancel Reply

House	Size (m²)	Price ($1000)
x1	80	200
x2	90	220
x3	100	240
x4	110	265
x5	120	280
x6	130	300
x7	140	320
x8	150	340
x9	160	360
x10	170	385

House	Size (m²)	Price ($1000)
x1	80	200
x2	90	220
x3	100	240
x4	110	265
x5	120	280
x6	130	300
x7	140	320
x8	150	340
x9	160	360
x10	170	385

House	Size (m²)	Price ($1000)
x1	80	200
x2	90	220
x3	100	240
x4	110	265
x5	120	280
x6	130	300
x7	140	320
x8	150	340
x9	160	360
x10	170	385