Welcome to one of the most important concepts in Machine Learning and AI.

So far you've learned:

Data Cleaning ✅
Feature Engineering ✅
Regression ✅
Classification ✅
Clustering ✅

Today you'll learn:

How to reduce data complexity while keeping most of the information.

This is exactly what PCA does.

🎯 Goal of Day-15

You will:

✅ Understand dimensionality reduction
✅ Learn PCA basics
✅ Reduce features intelligently
✅ Visualize high-dimensional data

🧠 Why PCA Exists

Imagine a dataset:

Age	Experience	Salary	Bonus	Performance
25	2	50000	5000	Good

Now imagine the same dataset with:

100 columns
1000 columns
5000 columns

Problems:

❌ Slower training
❌ More memory usage
❌ Harder visualization
❌ More noise

🧠 Real World Example

Consider:

Customer Age
Years Experience

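These are often strongly correlated: people who are older tend to have more years of experience.

Instead of keeping both columns, PCA can combine them into one new feature, for example: Experience_Score (the name is only for illustration; PCA itself just calls it PC1)

that captures most of the variance the two columns share.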
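To make the idea concrete, here is a minimal sketch. The data is synthetic and made up for this example (the age range, the noise level, and the variable names are all assumptions), so treat it as an illustration rather than a real customer dataset.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

# Synthetic, made-up data: experience grows with age, plus a little noise.
rng = np.random.default_rng(0)
age = rng.uniform(22, 60, size=200)
experience = (age - 22) + rng.normal(0, 2, size=200)

X = np.column_stack([age, experience])
X_scaled = StandardScaler().fit_transform(X)

# Collapse the two correlated columns into a single component.
pca = PCA(n_components=1)
combined = pca.fit_transform(X_scaled)

print(combined.shape)                 # one column instead of two
print(pca.explained_variance_ratio_)  # share of variance the single component keeps
```

Because the two columns move together, the single component should keep most of the variance. With weakly related columns it would keep much less.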

🚀 Part 1 – Import Libraries

import pandas as pd
import matplotlib.pyplot as plt

from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

🚀 Part 2 – Create Dataset

data = {
    "Hours": [1, 2, 3, 4, 5, 6, 7, 8],
    "Marks": [40, 45, 50, 55, 70, 80, 90, 95],
    "Attendance": [60, 65, 70, 75, 80, 85, 90, 95]
}

df = pd.DataFrame(data)

print(df.head())

🚀 Part 3 – Scale Data

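PCA works best when data is scaled. It looks for the directions of largest variance, so a column with a wide numeric range (like Marks) would otherwise dominate a column with a narrow range (like Hours). StandardScaler rescales every column to mean 0 and standard deviation 1.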

scaler = StandardScaler()

scaled_data = scaler.fit_transform(df)
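If you want to see what scaling actually did, a quick check like the one below can help. It is a self-contained sketch that rebuilds the same toy dataset, then compares the raw column variances with the mean and standard deviation after scaling.

```python
import pandas as pd
from sklearn.preprocessing import StandardScaler

df = pd.DataFrame({
    "Hours": [1, 2, 3, 4, 5, 6, 7, 8],
    "Marks": [40, 45, 50, 55, 70, 80, 90, 95],
    "Attendance": [60, 65, 70, 75, 80, 85, 90, 95],
})

# Raw variances differ a lot between columns.
print(df.var(ddof=0))

scaled_data = StandardScaler().fit_transform(df)

# After scaling, every column has mean ~0 and standard deviation ~1.
print(scaled_data.mean(axis=0).round(6))
print(scaled_data.std(axis=0).round(6))
```

Once every column has the same spread, no single feature can dominate the principal components just because of its units.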

🚀 Part 4 – Apply PCA

Reduce 3 features → 2 features

pca = PCA(n_components=2)

reduced_data = pca.fit_transform(scaled_data)

print(reduced_data)

🧠 What Happened?

Original:

Hours
Marks
Attendance

3 dimensions

Now:

PC1
PC2

2 dimensions
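So what are PC1 and PC2 made of? One way to look is the components_ attribute of the fitted PCA object, which holds the weight each original column gets in each component. The sketch below rebuilds the toy dataset so it runs on its own; the "loadings" table name is just a label chosen for this example.

```python
import pandas as pd
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

df = pd.DataFrame({
    "Hours": [1, 2, 3, 4, 5, 6, 7, 8],
    "Marks": [40, 45, 50, 55, 70, 80, 90, 95],
    "Attendance": [60, 65, 70, 75, 80, 85, 90, 95],
})

scaled_data = StandardScaler().fit_transform(df)
pca = PCA(n_components=2).fit(scaled_data)

# Rows are components, columns are the original features.
# Each value is the weight of that feature in that component.
loadings = pd.DataFrame(
    pca.components_,
    columns=df.columns,
    index=["PC1", "PC2"],
)

print(loadings)
```

Each row is a unit-length direction in the original 3-dimensional space. A large absolute weight means that feature contributes strongly to the component.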

🚀 Part 5 – Create PCA DataFrame

pca_df = pd.DataFrame(
    reduced_data,
    columns=["PC1", "PC2"]
)

print(pca_df.head())

🚀 Part 6 – Visualize PCA

plt.scatter(
    pca_df["PC1"],
    pca_df["PC2"]
)

plt.xlabel("Principal Component 1")
plt.ylabel("Principal Component 2")
plt.title("PCA Visualization")

plt.show()
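The toy dataset only has 3 columns, so the plot above is not very dramatic. As a hedged sketch of visualizing genuinely higher-dimensional data, the example below uses scikit-learn's built-in Iris dataset (4 features), which is not part of this lesson's data and is brought in only for illustration. The output file name iris_pca.png is also an arbitrary choice.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; remove this line to use plt.show() instead
import matplotlib.pyplot as plt
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

# Iris: 150 flowers, 4 measurements each, 3 species.
iris = load_iris()
X_scaled = StandardScaler().fit_transform(iris.data)

# Project 4 dimensions down to 2 so the data fits on a flat plot.
points = PCA(n_components=2).fit_transform(X_scaled)

fig, ax = plt.subplots()
ax.scatter(points[:, 0], points[:, 1], c=iris.target)  # color by species
ax.set_xlabel("Principal Component 1")
ax.set_ylabel("Principal Component 2")
ax.set_title("Iris: 4 features shown in 2 dimensions")

fig.savefig("iris_pca.png")
```

Coloring the points by a known label is a handy way to see whether the two components separate the groups.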

🚀 Part 7 – Explained Variance

This is the MOST IMPORTANT PART: it tells you how much of the original variance each component keeps.

print(pca.explained_variance_ratio_)

Example output (illustrative values; the numbers for your data will differ):

[0.95, 0.04]

Meaning:

PC1 captures 95% of the variance
PC2 captures 4% of the variance

Total: 99% of the variance retained
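In practice you often want to ask the reverse question: how many components do I need to keep a given share of the variance? The sketch below shows two ways, assuming the same toy dataset. The 0.95 threshold is an arbitrary choice for this example. Note that in this toy data Attendance is an exact linear function of Hours (55 + 5 × Hours), so two components already retain essentially all the variance.

```python
import numpy as np
import pandas as pd
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

df = pd.DataFrame({
    "Hours": [1, 2, 3, 4, 5, 6, 7, 8],
    "Marks": [40, 45, 50, 55, 70, 80, 90, 95],
    "Attendance": [60, 65, 70, 75, 80, 85, 90, 95],
})
scaled_data = StandardScaler().fit_transform(df)

# Option 1: fit with all components and inspect the running total.
pca_full = PCA().fit(scaled_data)
ratios = pca_full.explained_variance_ratio_
cumulative = np.cumsum(ratios)
print(ratios.round(4))
print(cumulative.round(4))

# Option 2: pass a float between 0 and 1 and let scikit-learn keep
# the fewest components that reach that share of the variance.
pca_95 = PCA(n_components=0.95, svd_solver="full").fit(scaled_data)
print(pca_95.n_components_)
```

The cumulative view is useful for plotting a "variance kept vs. number of components" curve, while the float form is convenient inside a pipeline.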

🧠 Interview Question

What is PCA?

Answer:

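PCA is a dimensionality reduction technique that transforms the original features into a smaller set of new, uncorrelated features (principal components), ordered so that each one captures as much of the remaining variance (information) as possible.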
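A common follow-up question is "how does it work under the hood?". One standard way to describe it is: center the data, compute the covariance matrix, and keep the eigenvectors with the largest eigenvalues. Below is a minimal from-scratch sketch of that recipe, checked against scikit-learn on made-up random data. The function name pca_from_scratch and the data are assumptions for this example, and scikit-learn itself uses an SVD-based implementation rather than this exact code.

```python
import numpy as np
from sklearn.decomposition import PCA


def pca_from_scratch(X, n_components):
    """Illustrative PCA via eigendecomposition of the covariance matrix."""
    X_centered = X - X.mean(axis=0)                    # 1. center each column
    cov = np.cov(X_centered, rowvar=False)             # 2. feature covariance matrix
    eigvals, eigvecs = np.linalg.eigh(cov)             # 3. eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1][:n_components]   # 4. indices of the largest ones
    components = eigvecs[:, order].T                   #    one direction per row
    projected = X_centered @ components.T              # 5. project data onto them
    variance_ratio = eigvals[order] / eigvals.sum()
    return projected, variance_ratio


# Made-up data: 4 columns with deliberately different spreads.
rng = np.random.default_rng(42)
X = rng.normal(size=(100, 4)) * np.array([5.0, 3.0, 1.0, 0.5])

projected, variance_ratio = pca_from_scratch(X, n_components=2)

sk_pca = PCA(n_components=2)
sk_projected = sk_pca.fit_transform(X)

# Compare with scikit-learn. The sign of a component is arbitrary,
# so the projections are compared by absolute value.
print(np.allclose(variance_ratio, sk_pca.explained_variance_ratio_))
print(np.allclose(np.abs(projected), np.abs(sk_projected)))
```

The sign flip is worth remembering in interviews: a principal component pointing "left" or "right" along the same line describes the same direction of variance.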

🧠 Real AI Uses

PCA is used in:

Face Recognition
Image Compression
Fraud Detection
Recommendation Systems
Data Visualization

⚠ Important Concept

PCA does NOT select columns.

It creates:

New Features

called:

Principal Components

Each principal component is a weighted combination of all the original columns.
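You can verify this yourself. The sketch below, again on the toy dataset, rebuilds the PCA output by hand: it multiplies the scaled data by the component weights and checks that the result matches what fit_transform returned. The variable name manual is just a label for this example.

```python
import numpy as np
import pandas as pd
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

df = pd.DataFrame({
    "Hours": [1, 2, 3, 4, 5, 6, 7, 8],
    "Marks": [40, 45, 50, 55, 70, 80, 90, 95],
    "Attendance": [60, 65, 70, 75, 80, 85, 90, 95],
})
scaled_data = StandardScaler().fit_transform(df)

pca = PCA(n_components=2)
reduced_data = pca.fit_transform(scaled_data)

# Every new feature is a weighted sum of ALL original (scaled) columns:
# subtract the mean PCA learned, then multiply by the component weights.
manual = (scaled_data - pca.mean_) @ pca.components_.T

print(np.allclose(manual, reduced_data))
```

No original column survives unchanged in the output, which is the difference between PCA and feature selection.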

🧠 Real AI Engineer Insight

For large datasets:

1000 features
↓
50 PCA components

Training often becomes:

Faster
Less noisy
More memory-efficient
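Here is a hedged sketch of what that can look like in code. The dataset is generated with make_classification, so the sample count, the number of informative features, and the choice of LogisticRegression as the model are all assumptions made for this example. Putting the scaler and PCA inside a pipeline means they are fitted on the training split only.

```python
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic data: 500 rows, 1000 features, only a few of them informative.
X, y = make_classification(
    n_samples=500,
    n_features=1000,
    n_informative=20,
    random_state=0,
)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Scale -> reduce 1000 features to 50 components -> train a classifier.
model = make_pipeline(
    StandardScaler(),
    PCA(n_components=50),
    LogisticRegression(max_iter=1000),
)
model.fit(X_train, y_train)

# model[:-1] is the pipeline without the final classifier,
# so this shows what the classifier actually receives.
reduced = model[:-1].transform(X_test)
print(reduced.shape)
print(model.score(X_test, y_test))
```

Keep in mind that PCA ignores the target column, so a smaller feature set does not guarantee better accuracy. It is worth comparing against a model trained on the original features.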

🎯 End of Day-15 Goals

You now:

✅ Understand PCA
✅ Reduce dimensions
✅ Visualize transformed data
✅ Interpret explained variance

Github Link: https://github.com/dotnetfullstackdeveloper/ai-engineer-journey/blob/main/Week-02-Machine-Learning/Day-15%20-%20PCA%20%2B%20Dimensionality%20Reduction

Day-15 – PCA (Principal Component Analysis) + Dimensionality Reduction
