Understanding the Cost Function in Linear Regression with Real Examples

Linear Regression and the Cost Function: Step-by-Step Guide with Calculations

As someone who is just beginning to explore the field of artificial intelligence, I’ve realized that it’s not enough to focus solely on large models — I also need to understand the foundational building blocks behind them. That’s why I’m documenting my learning process step by step, creating content that serves both as a personal reference and a resource for others on a similar path.

In this post, we’ll explore one of the core components of linear regression: the cost function, using a hands-on, example-based approach with detailed calculations.

Linear Regression with One Variable – What Is the Cost Function?

Let’s say you’re given a list:

  • Each house has a size in square meters (m²)
  • Each house has a known price

Now you want to do something like this:

“If I know the square meter, can I write an equation that can roughly estimate the price?”

That equation is what we call linear regression — in other words, you’re trying to draw a straight line that fits the data.

But here’s the problem:
This line can’t perfectly pass through every data point.
The actual prices and your estimated prices won’t match exactly.

So naturally, the following question arises:

“How wrong are my predictions? How far off am I?”

The thing that measures this difference is called the cost function.

Why Do We Need a Cost Function?

Because you need a concrete way to measure how good (or bad) your model’s predictions are.

  • If your predictions are very close to the actual values → Low cost
  • If your predictions are far off → High cost

Goal: Minimize this cost.

What Does the Cost Function Do?

It checks each prediction (e.g., for 100 houses) one by one.

For every house:

  • It calculates the difference between the predicted price and the actual price
  • It takes the square of this difference (so negative and positive errors don’t cancel each other)
  • Then it finds the average of all these squared differences

And finally, it says:

“Your average error is this much.”

That value is called the cost.
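The three steps above can be sketched directly in Python (the prices here are toy values for illustration, not the dataset used later in the post):

```python
# Toy example of the cost computation described above.
actual = [200.0, 220.0, 240.0]     # actual prices
predicted = [190.0, 225.0, 235.0]  # model's predicted prices

# 1) difference, 2) square it, 3) average the squares
squared_errors = [(p - a) ** 2 for p, a in zip(predicted, actual)]
cost = sum(squared_errors) / len(squared_errors)
print(cost)  # 50.0
```

Squaring keeps the under-prediction (−10) and the over-prediction (+5) from cancelling each other out.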

What Is Linear Regression?

Linear regression models the relationship between variables using a straight line.

$$
\hat{y} = \theta_1 x + \theta_0
$$

where:

\( \hat{y} \) — predicted value

\( x \) — input (e.g., house size)

\( \theta_1 \) — slope (weight)

\( \theta_0 \) — intercept (bias)
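The model above is just one line of code. A minimal sketch (the function name is my own):

```python
def predict(x, theta1, theta0):
    """Linear model: y_hat = theta1 * x + theta0."""
    return theta1 * x + theta0

# e.g. a 100 m² house with slope 1.9 and intercept -5
# (the initial model used later in this post)
print(predict(100, 1.9, -5))  # 185.0
```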

What Is the Cost Function?

The cost function measures how accurate predictions are. The most common is Mean Squared Error (MSE):

$$
\text{MSE} = \frac{1}{n} \sum_{i=1}^{n}(y_i - \hat{y}_i)^2
$$

Training form (used in gradient descent):

$$
J(\theta) = \frac{1}{2n} \sum_{i=1}^{n}(y_i - \hat{y}_i)^2
$$
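Both formulas translate almost one-to-one into Python. A minimal sketch (function names are my own):

```python
def mse(y_true, y_pred):
    """Mean Squared Error: the average of the squared residuals."""
    n = len(y_true)
    return sum((y - y_hat) ** 2 for y, y_hat in zip(y_true, y_pred)) / n

def cost_j(y_true, y_pred):
    """Training form J(theta) = MSE / 2. The extra 1/2 cancels the factor
    of 2 produced by differentiating the square, which keeps the gradient
    descent update rule tidy; it does not change which theta is optimal."""
    return mse(y_true, y_pred) / 2
```

For example, `mse([2.0, 4.0], [1.0, 3.0])` is 1.0 and `cost_j` on the same values is 0.5.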

Example Dataset: 10 Houses

| House | Size (m²) | Price ($1000) |
|-------|-----------|---------------|
| x1    | 80        | 200           |
| x2    | 90        | 220           |
| x3    | 100       | 240           |
| x4    | 110       | 265           |
| x5    | 120       | 280           |
| x6    | 130       | 300           |
| x7    | 140       | 320           |
| x8    | 150       | 340           |
| x9    | 160       | 360           |
| x10   | 170       | 385           |

Initial Model: \( \hat{y} = 1.9x - 5 \)

Error Calculations

| House | x   | Prediction \( \hat{y} \) | Actual y | Error | Error² |
|-------|-----|--------------------------|----------|-------|--------|
| x1    | 80  | 147.0                    | 200      | -53.0 | 2809.0 |
| x2    | 90  | 166.0                    | 220      | -54.0 | 2916.0 |
| x3    | 100 | 185.0                    | 240      | -55.0 | 3025.0 |
| x4    | 110 | 204.0                    | 265      | -61.0 | 3721.0 |
| x5    | 120 | 223.0                    | 280      | -57.0 | 3249.0 |
| x6    | 130 | 242.0                    | 300      | -58.0 | 3364.0 |
| x7    | 140 | 261.0                    | 320      | -59.0 | 3481.0 |
| x8    | 150 | 280.0                    | 340      | -60.0 | 3600.0 |
| x9    | 160 | 299.0                    | 360      | -61.0 | 3721.0 |
| x10   | 170 | 318.0                    | 385      | -67.0 | 4489.0 |

Prediction function:
y = 1.9x - 5

Then:

$$
2809 + 2916 + 3025 + 3721 + 3249 + 3364 + 3481 + 3600 + 3721 + 4489 = 34375
$$

MSE: $$
\frac{34375}{10} = 3437.5
$$

Cost function: $$
J(\theta) = \frac{1}{2} \times 3437.5 = 1718.75
$$
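These numbers can be reproduced with a few lines of Python using the dataset above:

```python
sizes = [80, 90, 100, 110, 120, 130, 140, 150, 160, 170]
prices = [200, 220, 240, 265, 280, 300, 320, 340, 360, 385]  # in $1000

# Initial model: y = 1.9x - 5
predictions = [1.9 * x - 5 for x in sizes]
squared_errors = [(p - y) ** 2 for p, y in zip(predictions, prices)]

mse_initial = sum(squared_errors) / len(squared_errors)
j_initial = mse_initial / 2
print(mse_initial, j_initial)  # 3437.5 1718.75
```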

Learned Model: \( \hat{y} = 2.31x - 9.92 \)

Error Calculations

| House | x   | Prediction \( \hat{y} \) | Actual y | Error  | Error² |
|-------|-----|--------------------------|----------|--------|--------|
| x1    | 80  | 174.88                   | 200      | -25.12 | 631.01 |
| x2    | 90  | 197.98                   | 220      | -22.02 | 484.88 |
| x3    | 100 | 221.08                   | 240      | -18.92 | 357.97 |
| x4    | 110 | 244.18                   | 265      | -20.82 | 433.47 |
| x5    | 120 | 267.28                   | 280      | -12.72 | 161.80 |
| x6    | 130 | 290.38                   | 300      | -9.62  | 92.54  |
| x7    | 140 | 313.48                   | 320      | -6.52  | 42.51  |
| x8    | 150 | 336.58                   | 340      | -3.42  | 11.70  |
| x9    | 160 | 359.68                   | 360      | -0.32  | 0.10   |
| x10   | 170 | 382.78                   | 385      | -2.22  | 4.93   |

Prediction function:
y = 2.31x - 9.92

$$
631.01 + 484.88 + 357.97 + 433.47 + 161.80 + 92.54 + 42.51 + 11.70 + 0.10 + 4.93 = 2220.91
$$

MSE: $$
\frac{2220.91}{10} = 222.09
$$

Cost function: $$
J(\theta) = \frac{1}{2} \times 222.09 = 111.05
$$

The learned model cuts the cost from 1718.75 down to 111.05 — more than a 15x reduction.
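The comparison can be checked end to end. A small sketch that computes \( J(\theta) \) for both models on the dataset above:

```python
sizes = [80, 90, 100, 110, 120, 130, 140, 150, 160, 170]
prices = [200, 220, 240, 265, 280, 300, 320, 340, 360, 385]  # in $1000

def cost_j(theta1, theta0):
    """J(theta) = (1/2n) * sum of squared errors for the line y = theta1*x + theta0."""
    squared = [((theta1 * x + theta0) - y) ** 2 for x, y in zip(sizes, prices)]
    return sum(squared) / (2 * len(squared))

j_initial = cost_j(1.9, -5)      # ~1718.75
j_learned = cost_j(2.31, -9.92)  # ~111.05
print(j_initial / j_learned)     # ratio of more than 15
```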

Graph Summary

  • Black dots: actual prices
  • Red dashed line: initial model
  • Green line: learned model
