Welcome to Day-3 – NumPy + Pandas (Real Data Handling Begins).
Today is very important because AI/ML is mostly about data.
Think of it like this:

- Backend developer → works with APIs
- Database engineer → works with tables
- AI engineer → works with datasets

Today we'll learn the two most important Python libraries:

- NumPy → numerical computing
- Pandas → table-like data analysis
NumPy works like a high-performance array engine for Python.
Example 1 – Create Array
In Colab run:
import numpy as np
numbers = np.array([10, 20, 30, 40, 50])
print(numbers)
Output:
[10 20 30 40 50]
Example 2 – Basic Operations
Run:
print("Mean:", np.mean(numbers))
print("Max:", np.max(numbers))
print("Min:", np.min(numbers))
print("Sum:", np.sum(numbers))
This is how ML libraries process numerical data.
Mean (average) is one of the most common statistics in data analysis:
mean = (x1 + x2 + ... + xn) / n
In AI, averages like this are used for:

- model evaluation
- normalization
- feature engineering
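As a quick sketch of the second use case, here is the mean formula computed by hand and then used for z-score normalization (subtract the mean, divide by the standard deviation) — a common preprocessing step; the sample numbers are made up for illustration:

```python
import numpy as np

scores = np.array([10, 20, 30, 40, 50])

# Mean computed manually, matching the formula above
manual_mean = np.sum(scores) / len(scores)
print(manual_mean)  # 30.0

# A common ML use: z-score normalization (mean 0, std 1)
normalized = (scores - np.mean(scores)) / np.std(scores)
print(normalized)
```

After normalization, the array has mean 0 — exactly what many ML models expect as input.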
Example 3 – Vector Operations
Run:
a = np.array([1, 2, 3])
b = np.array([4, 5, 6])
print(a + b)
print(a * b)
Output:
[5 7 9]
[4 10 18]
This vectorized computation is why NumPy is powerful.
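To see vectorization at work a bit more, here is a short sketch (same arrays as above) showing broadcasting — a scalar applied to every element with no Python loop — and a dot product built from the element-wise operations we just used:

```python
import numpy as np

a = np.array([1, 2, 3])
b = np.array([4, 5, 6])

# Broadcasting: the scalar is applied to every element at once,
# with no explicit Python loop
print(a * 10)   # [10 20 30]
print(a + 100)  # [101 102 103]

# Dot product: element-wise multiply, then sum
print(np.dot(a, b))  # 1*4 + 2*5 + 3*6 = 32
```

These loops run in optimized C inside NumPy, which is why vectorized code is much faster than an equivalent Python for-loop.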
🧠 Part 2 – Pandas (Working With Data Tables)
Pandas is like Excel + SQL inside Python.
Step 1 – Import Pandas
"Name": ["Alice", "Bob", "Charlie"],
"Age": [25, 30, 35],
"Salary": [50000, 60000, 70000]
}
df = pd.DataFrame(data)
print(df)
This structure is called a DataFrame.
Think of it like a SQL table or an Excel sheet.
Step 3 – Basic Data Exploration
Run:
print(df.head())
Shows the first five rows.
Step 4 – Get Column
print(df["Salary"])
Step 5 – Calculate Statistics
print(df["Salary"].mean())
🧠 Part 3 – Filtering Data (Very Important)
Filtering works like a SQL WHERE clause.
high_salary = df[df["Salary"] > 55000]
print(high_salary)
Equivalent SQL:
SELECT * FROM employees
WHERE salary > 55000
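Conditions can also be combined, just like AND/OR in SQL. A minimal sketch using the same example DataFrame (note: each condition needs its own parentheses, and Pandas uses & and | instead of and/or):

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["Alice", "Bob", "Charlie"],
    "Age": [25, 30, 35],
    "Salary": [50000, 60000, 70000],
})

# Equivalent to: WHERE salary > 55000 AND age < 33
result = df[(df["Salary"] > 55000) & (df["Age"] < 33)]
print(result)  # only Bob matches
```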
🧠 Part 4 – Load CSV Dataset
AI work usually starts with CSV datasets.
There are multiple ways to load a CSV file in Colab:
✅ Method 1 – Upload CSV from Your Laptop (Easiest):
Step 1: Run this in Colab
from google.colab import files
uploaded = files.upload()
👉 It will open a file picker → select your .csv file
Step 2: Read CSV using Pandas
import pandas as pd
df = pd.read_csv("your_file_name.csv")
print(df.head())
Note: Use exact file name (case-sensitive)
🧠 Example
If your file is: Test.csv then
df = pd.read_csv("Test.csv")
df.head()
✅ Method 2 – Upload from Left Sidebar (UI Way)
In Colab:
- Left side → Click folder icon 📁
- Click Upload
- Select CSV file
- File appears in /content/
Then:
df = pd.read_csv("/content/test.csv")
✅ Method 3 – Load from URL (Advanced)
If dataset is online:
url = "https://example.com/data.csv"
df = pd.read_csv(url)
df.head()
🔍 Useful Commands After Loading
df.head() # First 5 rows
df.tail() # Last 5 rows
df.info() # Structure of data
df.describe() # Statistics
df.columns # Column names
Example:
df = pd.read_csv("data.csv")
(In Colab, upload the file from the sidebar first.)
Then explore:
df.head()
df.info()
df.describe()
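One more check worth running right after loading: counting missing values per column. Since no real CSV ships with this lesson, the sketch below builds a small DataFrame in memory to stand in for a loaded file (np.nan marks a missing cell):

```python
import numpy as np
import pandas as pd

# Stand-in for a freshly loaded CSV; np.nan marks a missing value
df = pd.DataFrame({
    "Name": ["A", "B", "C"],
    "Marks": [70, np.nan, 90],
})

# Missing values per column: a routine first check after loading data
print(df.isnull().sum())

# Shape: (rows, columns)
print(df.shape)  # (3, 2)
```

Real datasets almost always have gaps, so this check is usually the first thing done after df.head().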
🎯 Mini Practice (Do This)
Create dataset:
import pandas as pd

data = {
    "Name": ["A", "B", "C", "D"],
    "Marks": [70, 85, 90, 60],
    "Age": [20, 21, 19, 22]
}
df = pd.DataFrame(data)
Now try:
1️⃣ Print students with Marks > 80
2️⃣ Find average marks
3️⃣ Find maximum age
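Try the three tasks yourself first; one possible solution sketch, using only the operations covered above:

```python
import pandas as pd

data = {
    "Name": ["A", "B", "C", "D"],
    "Marks": [70, 85, 90, 60],
    "Age": [20, 21, 19, 22],
}
df = pd.DataFrame(data)

# 1. Students with Marks > 80
print(df[df["Marks"] > 80])   # B and C

# 2. Average marks: (70 + 85 + 90 + 60) / 4 = 76.25
print(df["Marks"].mean())

# 3. Maximum age
print(df["Age"].max())        # 22
```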
🎯 End of Day-3 Goals
You should now understand:
✅ NumPy arrays
✅ Basic statistics
✅ Pandas DataFrame
✅ Filtering datasets
✅ Loading data
These are core AI data skills.
💡 Important insight for developers:
Most AI projects spend 70–80% time on data processing, not model building.
So mastering Pandas early is a huge advantage.
If you have any queries, please let me know. Thanks.