Day-3 – NumPy + Pandas (Real Data Handling Begins)

Welcome to Day-3 – NumPy + Pandas (Real Data Handling Begins).


Today is very important because AI/ML is mostly about data.
Think of it like this:

  1. Backend developer → works with APIs

  2. Database engineer → works with tables

  3. AI engineer → works with datasets


Today we’ll learn the two most important Python libraries:

  1. NumPy → numerical computing

  2. Pandas → table-like data analysis


This session is split into 4 parts:

🧠 Part 1 – NumPy Basics (Arrays for AI) 

NumPy works like a high-performance array engine for Python.

Example 1 – Create Array

In Colab run:

import numpy as np

numbers = np.array([10, 20, 30, 40, 50])

print(numbers)


Output:

[10 20 30 40 50]


Example 2 – Basic Operations

Run:

print("Mean:", np.mean(numbers))
print("Max:", np.max(numbers))
print("Min:", np.min(numbers))
print("Sum:", np.sum(numbers))

ML libraries rely on the same kind of array-wide numerical operations when they process data.

Mean (average) is one of the most common statistics in data analysis:

mean = (x1 + x2 + ... + xn) / n


In AI, averages like this are used for:

  1. model evaluation

  2. normalization

  3. feature engineering
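To make the normalization use case concrete, here is a minimal sketch that mean-centers and standardizes the `numbers` array from Example 1. The variable names `centered` and `standardized` are only illustrative, and this is just one common way to normalize data.

```python
import numpy as np

numbers = np.array([10, 20, 30, 40, 50])

mean = np.mean(numbers)   # (10 + 20 + 30 + 40 + 50) / 5
std = np.std(numbers)     # population standard deviation

# Subtract the mean from every element, then rescale by the standard deviation.
centered = numbers - mean
standardized = centered / std

print("Mean:", mean)
print("Centered:", centered)
print("Standardized:", standardized)
```

After this step the values are centered around 0, which is a form many ML algorithms work well with.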


Example 3 – Vector Operations

Run:

a = np.array([1,2,3])

b = np.array([4,5,6])

print(a + b)

print(a * b)


Output:

[5 7 9]
[ 4 10 18]

This vectorized computation is why NumPy is powerful.
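To see what "vectorized" means in practice, the sketch below does the same addition twice: once with an explicit Python loop and once with a single NumPy expression. The names `loop_sum`, `vector_sum` and `doubled` are illustrative only.

```python
import numpy as np

a = np.array([1, 2, 3])
b = np.array([4, 5, 6])

# Plain-Python version: visit each pair of elements explicitly.
loop_sum = [int(x) + int(y) for x, y in zip(a, b)]

# Vectorized version: one expression, applied element by element.
vector_sum = a + b

# A single number is applied to every element as well (broadcasting).
doubled = a * 2

print(loop_sum)
print(vector_sum)
print(doubled)
```

Both approaches give the same numbers, but the vectorized form is shorter and, on large arrays, usually much faster because the loop runs inside NumPy instead of in Python.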


🧠 Part 2 – Pandas (Working With Data Tables)

Pandas is like Excel + SQL inside Python.

Step 1 – Import Pandas

import pandas as pd

Step 2 – Create DataFrame

data = {
"Name": ["Alice", "Bob", "Charlie"],
"Age": [25, 30, 35],
"Salary": [50000, 60000, 70000]
}

df = pd.DataFrame(data)

print(df)


Output:

      Name  Age  Salary
0    Alice   25   50000
1      Bob   30   60000
2  Charlie   35   70000


This structure is called a DataFrame.

Think of it like:

SQL Table

OR

Excel Sheet


Step 3 – Basic Data Exploration

Run:

print(df.head())

This shows the first 5 rows. Our DataFrame has only 3 rows, so all of them are printed.


Step 4 – Get Column

print(df["Salary"])


Step 5 – Calculate Statistics

print(df["Salary"].mean())
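The same pattern works for other statistics. Here is a small sketch, rebuilding the DataFrame from Step 2 so it runs on its own, that checks the table size and computes a few more values.

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["Alice", "Bob", "Charlie"],
    "Age": [25, 30, 35],
    "Salary": [50000, 60000, 70000],
})

print(df.shape)                       # (number of rows, number of columns)
print(df["Salary"].max())             # highest salary
print(df["Age"].min())                # lowest age
print(df[["Age", "Salary"]].mean())   # mean of each selected numeric column
```

Passing a list of column names, as in the last line, returns a smaller DataFrame, so the statistic is calculated per column.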


🧠 Part 3 – Filtering Data (Very Important)

Filtering in Pandas works like a SQL WHERE clause.

high_salary = df[df["Salary"] > 55000]

print(high_salary)


Equivalent SQL:

SELECT * FROM employees
WHERE salary > 55000
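Real filters often need more than one condition, like `WHERE ... AND ...` in SQL. The sketch below shows one way to do that in Pandas. The variable names `senior_high` and `names` are illustrative, and the DataFrame is rebuilt so the snippet runs on its own.

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["Alice", "Bob", "Charlie"],
    "Age": [25, 30, 35],
    "Salary": [50000, 60000, 70000],
})

# Combine conditions with & (and) or | (or).
# Each condition needs its own parentheses.
senior_high = df[(df["Salary"] > 55000) & (df["Age"] > 30)]

# Filter rows and pick columns together,
# similar to SELECT Name, Salary ... WHERE salary > 55000.
names = df.loc[df["Salary"] > 55000, ["Name", "Salary"]]

print(senior_high)
print(names)
```

Note that Pandas uses `&` and `|` here instead of Python's `and` and `or`, because the comparison is done on whole columns at once.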


🧠 Part 4 – Load CSV Dataset

Many AI projects start from a dataset stored as a CSV file.

There are several ways to get a CSV file into Colab:


✅ Method 1 – Upload CSV from Your Laptop (Easiest)

Step 1: Run this in Colab

from google.colab import files
uploaded = files.upload()

👉 This opens a file picker → select your .csv file


Step 2: Read CSV using Pandas

import pandas as pd

df = pd.read_csv("your_file_name.csv")

print(df.head())

Note: Use the exact file name (it is case-sensitive).


🧠 Example

If your file is named test.csv:

df = pd.read_csv("test.csv")
df.head()


✅ Method 2 – Upload from Left Sidebar (UI Way)

In Colab:

  1. Left side → Click folder icon 📁

  2. Click Upload

  3. Select CSV file

  4. File appears in /content/

Then:

df = pd.read_csv("/content/test.csv")


✅ Method 3 – Load from URL (Advanced)

If the dataset is online:

url = "https://example.com/data.csv"
df = pd.read_csv(url)

df.head()


🔍 Useful Commands After Loading

df.head() # First 5 rows
df.tail() # Last 5 rows
df.info() # Structure of data
df.describe() # Statistics
df.columns # Column names


Example:

df = pd.read_csv("data.csv")

In Colab, you can upload the file from the sidebar (Method 2).

Then explore:

df.head()
df.info()
df.describe()
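If you want to try these commands before uploading anything, here is a minimal sketch that reads CSV text from memory instead of a file. The CSV content in `csv_text` is made up for illustration; `io.StringIO` simply lets `pd.read_csv` treat a string as if it were a file.

```python
import io
import pandas as pd

# Hypothetical CSV content, used here instead of a real file.
csv_text = """Name,Age,Salary
Alice,25,50000
Bob,30,60000
Charlie,35,70000
"""

df = pd.read_csv(io.StringIO(csv_text))

print(df.head())             # first rows
df.info()                    # column types and non-null counts (prints directly)
print(df.describe())         # statistics for the numeric columns
print(df.columns.tolist())   # column names as a plain list
```

Once this works, replace `io.StringIO(csv_text)` with the path or URL of your own CSV file.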


🎯 Mini Practice (Do This)

Create dataset:

data = {
"Name": ["A", "B", "C", "D"],
"Marks": [70, 85, 90, 60],
"Age": [20, 21, 19, 22]
}

df = pd.DataFrame(data)

Now try:

1️⃣ Print students with Marks > 80
2️⃣ Find average marks
3️⃣ Find maximum age


🎯 End of Day-3 Goals

You should now understand:

✅ NumPy arrays
✅ Basic statistics
✅ Pandas DataFrame
✅ Filtering datasets
✅ Loading data

These are core AI data skills.


💡 Important insight for developers:

A commonly quoted rule of thumb is that AI projects spend most of their time (often put at 70–80%) on data processing, not model building.

So mastering Pandas early is a huge advantage.


Github Link: https://github.com/dotnetfullstackdeveloper/ai-engineer-journey/blob/main/Week-01-AI-Foundations/Day-3:%20NumPy%20+%20Pandas



