Now we upgrade from a single tree → multiple trees working together.
🎯 Goal of Day-8
You will:
✅ Understand ensemble learning
✅ Build Random Forest model
✅ Improve accuracy over Decision Tree
✅ Learn an industry-level concept
🧠 What is Random Forest?
Instead of 1 decision tree → use many trees.
Each tree gives a prediction → the final answer is the majority vote
🌳 Concept Visualization
👉 Think of it like this:
Tree 1 → Pass
Tree 2 → Fail
Tree 3 → Pass
Final → Pass (majority)
🧠 Why Random Forest is Powerful
The problem with a single Decision Tree:
- Overfitting (it memorizes the training data too specifically)
How Random Forest fixes it:
- Each tree is trained on a random sample of the data, so individual mistakes average out → better generalization
- More stable predictions
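Before touching scikit-learn, here is a minimal sketch of the majority-vote idea by itself. The three "tree" functions are hypothetical stand-ins (hand-made hour thresholds), not real fitted trees:
from collections import Counter

# Three hypothetical "trees" — hand-written threshold rules, not real decision trees
def tree_1(hours): return 1 if hours > 3 else 0
def tree_2(hours): return 1 if hours > 5 else 0
def tree_3(hours): return 1 if hours > 4 else 0

def forest_predict(hours):
    votes = [tree_1(hours), tree_2(hours), tree_3(hours)]
    return Counter(votes).most_common(1)[0][0]  # most common vote wins

print(forest_predict(4.5))  # votes are [1, 0, 1] → majority → 1 (Pass)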
🚀 Part 1 – Import Libraries
from sklearn.ensemble import RandomForestClassifier
import pandas as pd
🚀 Part 2 – Dataset
# Hours studied vs. exam result (1 = Pass, 0 = Fail)
data = {
    "Hours": [1, 2, 3, 4, 5, 6, 7, 8],
    "Pass":  [0, 0, 0, 0, 1, 1, 1, 1]
}
df = pd.DataFrame(data)
🚀 Part 3 – Prepare Data
X = df[["Hours"]]  # feature matrix (a DataFrame with one column)
y = df["Pass"]     # target labels
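One note: to keep Day-8 simple we train and predict on the same 8 rows. In a real project you would hold out a test set first — a quick sketch using the same X and y:
from sklearn.model_selection import train_test_split

# Keep 25% of rows aside for testing; stratify=y preserves the Pass/Fail ratio
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=42
)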
🚀 Part 4 – Train Model
model = RandomForestClassifier(n_estimators=10, random_state=42)  # random_state makes results reproducible
model.fit(X, y)
👉 n_estimators=10 means the forest contains 10 trees (scikit-learn's default is 100)
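After fitting, scikit-learn stores the individual trees in the estimators_ attribute, so you can actually watch them vote. A small sketch using the model above (the inner trees were fitted on raw arrays, hence .values):
sample = pd.DataFrame({"Hours": [4.5]})
for i, tree in enumerate(model.estimators_):
    print(f"Tree {i}:", tree.predict(sample.values))  # each tree's individual vote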
🚀 Part 5 – Predict
new_data = pd.DataFrame({"Hours": [3.5]})  # a new student who studied 3.5 hours
prediction = model.predict(new_data)
print("Prediction:", prediction)
🚀 Part 6 – Compare Models (IMPORTANT)
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier
log_model = LogisticRegression()
tree_model = DecisionTreeClassifier()
rf_model = RandomForestClassifier()
# Fit all three models on the same data
log_model.fit(X, y)
tree_model.fit(X, y)
rf_model.fit(X, y)
test = pd.DataFrame({"Hours": [3.5]})
print("Logistic:", log_model.predict(test))
print("Decision Tree:", tree_model.predict(test))
print("Random Forest:", rf_model.predict(test))
🧠 Key Differences
| Model | Strength |
|---|---|
| Logistic Regression | Simple, fast |
| Decision Tree | Easy to understand |
| Random Forest | High accuracy, stable |
If you have any queries, please let me know. Thanks.