Upload 61 files

Just lazy, so uploading via browser...
This view is limited to 50 files because it contains too many changes.
See raw diff
- .gitattributes +3 -0
- .gitignore +0 -0
- AlphanumericQR.ttf +3 -0
- BallpointPenNHK-Regular.ttf +3 -0
- Carnival.ttf +0 -0
- Chesilin.ttf +0 -0
- Daemon.otf +0 -0
- Daemon_Training.png +0 -0
- EULA BallpointPenNHK.pdf +3 -0
- FortuneTeller.ttf +0 -0
- Jajapanan.ttf +0 -0
- Kakuji.ttf +0 -0
- Ling_Ling_-_Cost_vs_Accuracy.png +0 -0
- Ling_Ling_-_Six_Predictions.png +0 -0
- LogisticRegression.md +22 -0
- MusicNotation.ttf +0 -0
- Photograph Signature.ttf +0 -0
- PyTorchClass.zip +3 -0
- README.md +608 -14
- ShallowNeuralNetwork.md +86 -0
- SoftmaxRegression.md +16 -0
- UpperLower.svg +309 -0
- ZXX Bold.otf +0 -0
- ZXX Camo.otf +0 -0
- ZXX False.otf +0 -0
- ZXX Noise.otf +0 -0
- ZXX Sans.otf +0 -0
- ZXX Xed.otf +0 -0
- after_logistic.png +0 -0
- after_shallow.png +0 -0
- after_softmax.png +0 -0
- after_training_words.png +0 -0
- after_training_words_Chesilin.png +0 -0
- app.py +219 -0
- before_logistic.png +0 -0
- before_shallow.png +0 -0
- before_softmax.png +0 -0
- bullets4.ttf +0 -0
- convolutional_neural_networks.md +703 -0
- convolutional_neural_networks.py +69 -0
- dataset_loader.py +69 -0
- deep_networks.md +356 -0
- deep_networks.py +80 -0
- final_project.md +29 -0
- final_project.py +260 -0
- font.py +138 -0
- install.bat +7 -0
- install.sh +7 -0
- logistic_regression.py +65 -0
- metrics.png +0 -0
.gitattributes
CHANGED
@@ -33,3 +33,6 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
 *.zip filter=lfs diff=lfs merge=lfs -text
 *.zst filter=lfs diff=lfs merge=lfs -text
 *tfevents* filter=lfs diff=lfs merge=lfs -text
+AlphanumericQR.ttf filter=lfs diff=lfs merge=lfs -text
+BallpointPenNHK-Regular.ttf filter=lfs diff=lfs merge=lfs -text
+EULA[[:space:]]BallpointPenNHK.pdf filter=lfs diff=lfs merge=lfs -text
.gitignore
ADDED
File without changes
AlphanumericQR.ttf
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:c16ec58d9b2ece98c013a72666a031d4b13b7349d33b19e306b9966303121977
+size 105364
BallpointPenNHK-Regular.ttf
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:82108314e1dcd918c09403edc6ec88218f56af0628c95bfc1d475386d7b4ae83
+size 309684
Carnival.ttf
ADDED
Binary file (13.3 kB)

Chesilin.ttf
ADDED
Binary file (31.8 kB)

Daemon.otf
ADDED
Binary file (23.3 kB)

Daemon_Training.png
ADDED
Binary image file
EULA BallpointPenNHK.pdf
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:fa81b3b9cb5e2ea540922ade44185601ba9b2a745ce477f563dddf5351c51889
+size 643136
FortuneTeller.ttf
ADDED
Binary file (38.4 kB)

Jajapanan.ttf
ADDED
Binary file (21 kB)

Kakuji.ttf
ADDED
Binary file (26.3 kB)

Ling_Ling_-_Cost_vs_Accuracy.png
ADDED
Binary image file

Ling_Ling_-_Six_Predictions.png
ADDED
Binary image file
LogisticRegression.md
ADDED
@@ -0,0 +1,22 @@
7️⃣ **Cross Entropy Loss: Teaching AI to Learn Better** 🎯
- AI makes mistakes, so we **measure how bad they are** using a **loss function**.
- Cross Entropy Loss helps AI **learn from its mistakes** and get better.
- Instead of guessing randomly, the AI **adjusts itself to improve its answers**.

8️⃣ **Backpropagation: AI Fixing Its Own Mistakes** 🔄
- AI learns by **guessing, checking, and fixing mistakes**.
- It uses **backpropagation** to update itself, just like **learning from practice**.
- This helps AI **get smarter every time it trains**.

9️⃣ **Multi-Class Neural Networks: Picking the Best Answer** 🎨
- AI doesn't always choose between **just two things**; sometimes it picks from **many choices**!
- It uses **Softmax** to figure out which answer is **most likely**.
- This helps in **image recognition, language processing, and more**!

🔟 **Activation Functions: Helping AI Think Faster** ⚡
- AI uses **activation functions** to **decide which patterns matter**.
- Three important ones:
  - **Sigmoid** → squashes values into probabilities.
  - **Tanh** → like Sigmoid, but centered around zero.
  - **ReLU** → the fastest and most widely used!
- These make AI **learn faster and make better decisions**!
MusicNotation.ttf
ADDED
Binary file (47.9 kB)

Photograph Signature.ttf
ADDED
Binary file (37.3 kB)
PyTorchClass.zip
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:3d1c7a965cb19f8338176d8377fa077731290baeaa2844144c5ac4e0174cf0e5
+size 22431051
README.md
CHANGED
@@ -1,14 +1,608 @@
# 🧠 **What is Logistic Regression?**
Imagine you have a **robot** that tries to guess if a fruit is an 🍎 **apple** or a 🍌 **banana**.
- The robot uses **Logistic Regression** to make its guess.
- It looks at things like the fruit's **color**, **shape**, and **size** to decide.
- The robot gives a score from **0 to 1**:
  - 0 → Definitely a banana 🍌
  - 1 → Definitely an apple 🍎
  - 0.5 → The robot is unsure 🤖

## 🔥 **What does the notebook do?**
1. **Makes fake data** → It creates pretend fruits with made-up colors and sizes.
2. **Builds the Logistic Regression model** → This is the robot that learns how to guess.
3. **Trains the robot** → It lets the robot practice guessing until it gets better.
4. **Shows why bad initialization is bad** → If the robot starts with **wrong guesses**, it takes a long time to learn.
   - Good start ➡️ 🟢 The robot learns fast.
   - Bad start ➡️ 🔴 The robot takes forever or never learns properly.
5. **Shows how to fix bad initialization** → We can **reinitialize** the robot with **random weights** so it starts fresh with reasonable guesses.

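The robot above can be sketched as a tiny PyTorch model — a minimal illustration with made-up two-feature "fruit" data; the names and sizes are assumptions, not taken from the repo's notebooks:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Fake "fruit" data: 2 features (say, color score and size); label 1 = apple, 0 = banana
X = torch.randn(100, 2)
y = (X[:, 0] + X[:, 1] > 0).float().unsqueeze(1)

# Logistic regression = one linear layer followed by a sigmoid (score from 0 to 1)
model = nn.Sequential(nn.Linear(2, 1), nn.Sigmoid())
criterion = nn.BCELoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

for epoch in range(200):
    optimizer.zero_grad()
    loss = criterion(model(X), y)
    loss.backward()
    optimizer.step()

with torch.no_grad():
    accuracy = ((model(X) > 0.5).float() == y).float().mean().item()
print(f"training accuracy: {accuracy:.2f}")
```

Scores above 0.5 are read as "apple", below 0.5 as "banana" — exactly the 0-to-1 scale described above.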
# 🧠 **What is Cross-Entropy?**
Imagine you are playing a **guessing game** with a 🦉 **wise owl**.
- The owl has to guess if a fruit is an 🍎 **apple** or a 🍌 **banana**.
- The owl makes a **prediction** (for example: 90% sure it's an apple).
- If the owl is **right**, it gets a ⭐️.
- If the owl is **wrong**, it gets a 👎.

**Cross-Entropy** is like a **scorekeeper**:
- If the owl guesses correctly ➡️ **low score** 🟢 (good)
- If the owl guesses wrong ➡️ **high score** 🔴 (bad)

## 🔥 **What does the notebook do?**
1. **Makes fake fruit data** → It creates pretend fruits with random colors and shapes.
2. **Builds the Logistic Regression model** → This is the owl's brain that makes guesses.
3. **Trains the model with Cross-Entropy** → It helps the owl learn by keeping score.
4. **Improves accuracy** → The owl gets better at guessing with practice by trying to lower its Cross-Entropy score.

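The scorekeeper's rule is simple math: for a true label y ∈ {0, 1} and a predicted probability p, the binary cross-entropy is −[y·log(p) + (1−y)·log(1−p)]. A tiny sketch of that formula:

```python
import math

def binary_cross_entropy(y, p):
    """Score for one guess: low when p matches the true label y, high when it doesn't."""
    return -(y * math.log(p) + (1 - y) * math.log(1 - p))

# Confident and correct -> low score (about 0.105)
print(binary_cross_entropy(1, 0.9))
# Confident and wrong -> high score (about 2.303)
print(binary_cross_entropy(1, 0.1))
```

Training "with Cross-Entropy" just means nudging the model's weights in whatever direction lowers this score.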
# 🧠 **What is Softmax?**
Imagine you have a bag of colorful candies. Each candy represents a possible answer (like cat, dog, or bird). The **Softmax function** is like a magical machine that takes all the candies and tells you the **probability** of each one being picked.

For example:
- 🍬 → 😺 **Cat** → 70% chance
- 🍬 → 🐶 **Dog** → 20% chance
- 🍬 → 🐦 **Bird** → 10% chance

Softmax makes sure that all the probabilities add up to **100%**, so the scores can be read as one probability for each possible answer.

## 🔥 **What does the notebook do?**
1. **Makes fake data** → It creates some pretend candies (data points) to practice with.
2. **Builds the Softmax classifier** → This is the machine that guesses which candy you will pick based on its features.
3. **Trains the model** → It lets the machine practice guessing so it gets better at it.
4. **Shows the results** → It checks how good the machine is at guessing the correct candy.

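The "magical machine" is just exponentiation and normalization: softmax(zᵢ) = exp(zᵢ) / Σⱼ exp(zⱼ). A minimal sketch, with raw scores chosen to roughly mirror the candy example:

```python
import math

def softmax(scores):
    # Subtract the max for numerical stability, then exponentiate and normalize
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

probs = softmax([2.0, 0.7, 0.0])  # raw scores for cat, dog, bird
print(probs)       # approximately [0.71, 0.19, 0.10]
print(sum(probs))  # approximately 1.0 — the probabilities add up to 100%
```

Whatever raw scores go in, the outputs are always between 0 and 1 and always sum to 1, which is what lets us read them as probabilities.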
# 📚 Understanding Softmax and MNIST 🖊️

## 1️⃣ What are we doing?
We want to teach a computer how to recognize numbers (0-9) by looking at images. Just like how you can tell the difference between a "2" and a "5", we want the computer to do the same!

## 2️⃣ What is MNIST? 🤔
MNIST is a big collection of handwritten numbers. People have written digits (0-9) on paper, and all those images were put into a dataset for computers to learn from.

## 3️⃣ What is a Softmax Classifier? 🤖
A **Softmax Classifier** is like a decision-maker. When it sees a number, it checks **how sure** it is that the number is a 0, 1, 2, etc. It picks the number it is most confident about.

Think of it like:
- You see a blurry animal. 🐶🐱🐭
- You think: "It **looks** like a dog, but **maybe** a cat."
- You decide: "I'm **80% sure** it's a dog, **15% sure** it's a cat, and **5% sure** it's a mouse."
- You pick the one you're most sure about → 🐶 Dog!

That's exactly how Softmax works, but with numbers instead of animals!

## 4️⃣ How do we train the computer? 🎓
1. We **show** the computer many images of numbers. 📸
2. It **tries to guess** what number is in the image. 🔢
3. If it's wrong, we **correct** it and help it learn. 📚
4. After training, it becomes **really good** at recognizing numbers! 🚀

## 5️⃣ What will we do in the notebook? 📝
- Load the MNIST dataset. 📊
- Build a Softmax Classifier. 🏗️
- Train it to recognize numbers. 🏋️‍♂️
- Test if it works! ✅

Let's start teaching our computer to recognize numbers! 🧠💡

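The show-guess-correct loop above can be sketched as follows. This is a stand-in, not the repo's notebook: random tensors replace the real MNIST download (in practice you would load `torchvision.datasets.MNIST`), and all sizes are illustrative:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Stand-in for MNIST: random 28x28 "images" with random digit labels 0-9
images = torch.randn(256, 1, 28, 28)
labels = torch.randint(0, 10, (256,))

# A softmax classifier is a single linear layer from pixels to 10 class scores;
# CrossEntropyLoss applies log-softmax to those scores internally.
model = nn.Linear(28 * 28, 10)
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

for epoch in range(20):
    optimizer.zero_grad()
    logits = model(images.view(-1, 28 * 28))   # guess: one score per digit
    loss = criterion(logits, labels)            # correct: measure how wrong
    loss.backward()
    optimizer.step()                            # learn: adjust the weights

# The final prediction is the digit with the highest softmax probability
pred = model(images.view(-1, 28 * 28)).argmax(dim=1)
print("accuracy on the toy data:", (pred == labels).float().mean().item())
```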
# 🧠 Building a Simple Neural Network! 🤖

## 1️⃣ What are we doing? 🎯
We are teaching a computer to recognize patterns! It will learn from examples and make smart guesses, just like how you learn from practice.

## 2️⃣ What is a Neural Network? 🕸️
A **neural network** is like a **tiny brain** inside a computer. It looks at data, finds patterns, and makes decisions.

Imagine your brain trying to recognize your best friend:
- Your **eyes** see their face. 👀
- Your **brain** processes what you see. 🧠
- You **decide**: "Hey, that's my friend!" 🎉

A neural network does the same thing but with numbers!

## 3️⃣ What is a Hidden Layer? 🤔
A **hidden layer** is like a smart helper inside the network. It helps break down complex problems step by step.

Think of it like:
- 🏠 A house → **Too big to understand at once!**
- 🧱 A hidden layer **breaks it down**: first walls, then windows, then doors!
- 🏗️ This makes it easier to recognize and understand!

## 4️⃣ How do we train the computer? 🎓
1. We **show** it some data (like numbers or pictures). 👀
2. It **guesses** what it sees. 🤔
3. If it's **wrong**, we **correct** it! ✏️
4. After **practicing a lot**, it becomes **really good** at guessing. 🚀

## 5️⃣ What will we do in the notebook? 📝
- **Build a simple neural network** with **one hidden layer**. 🏗️
- **Give it some data** to learn from. 📊
- **Train it** so it gets better. 🏋️‍♂️
- **Test it** to see if it works! ✅

By the end, our computer will be **smarter** and ready to recognize patterns! 🧠💡

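A one-hidden-layer network like the one described above is only a few lines of PyTorch — the layer sizes here (2 inputs, 5 hidden helpers, 1 output) are illustrative choices, not the notebook's:

```python
import torch
import torch.nn as nn

# A "tiny brain" with one hidden layer: input -> hidden helpers -> output guess
model = nn.Sequential(
    nn.Linear(2, 5),   # input layer -> hidden layer (5 helper neurons)
    nn.Sigmoid(),      # activation: lets the network learn non-linear patterns
    nn.Linear(5, 1),   # hidden layer -> output
    nn.Sigmoid(),      # squash the output into a 0-to-1 "guess"
)

x = torch.randn(3, 2)   # three example points, 2 features each
out = model(x)
print(out.shape)        # torch.Size([3, 1]): one guess per example
```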
# 🤖 Making a Smarter Neural Network! 🧠

## 1️⃣ What are we doing? 🎯
We are making a **better and smarter brain** for the computer! Instead of just one smart helper (neuron), we will have **many neurons working together**!

## 2️⃣ What are Neurons? ⚡
Neurons are like **tiny workers** inside a neural network. They take information, process it, and pass it along. The more neurons we have, the **smarter** our network becomes!

Think of it like:
- 🏗️ A simple house = **one worker** 🛠️ (slow)
- 🏙️ A big city = **many workers** 🏗️ (faster & better!)

## 3️⃣ Why More Neurons? 🤔
More neurons mean:
- ✅ The network **understands more details**.
- ✅ It **learns better** and makes **fewer mistakes**.
- ✅ It can solve **harder problems**!

Imagine:
- One person trying to solve a big puzzle 🧩 = **hard**
- A team of people working together = **faster & easier!**

## 4️⃣ How do we train it? 🎓
1. **Give it some data** 📊
2. **Let the neurons think** 🧠
3. **If it's wrong, we correct it** 📚
4. **After practice, it gets really smart!** 🚀

## 5️⃣ What will we do in the notebook? 📝
- **Build a bigger neural network** with more neurons! 🏗️
- **Feed it data to learn from** 📊
- **Train it to get better** 🏋️‍♂️
- **Test it to see how smart it is!** ✅

By the end, our computer will be **super smart** at recognizing patterns! 🧠💡

# 🤖 Teaching a Computer to Solve XOR! 🧠

## 1️⃣ What are we doing? 🎯
We are teaching a computer to understand a special kind of problem called **XOR**. It's like a puzzle where the answer is only "Yes" when things are different.

## 2️⃣ What is XOR? ❌🔄✅
XOR is a rule that works like this:
- If two things are the **same** → ❌ NO
- If two things are **different** → ✅ YES

Example:

| Input 1 | Input 2 | XOR Output |
|---------|---------|------------|
| 0       | 0       | 0 ❌       |
| 0       | 1       | 1 ✅       |
| 1       | 0       | 1 ✅       |
| 1       | 1       | 0 ❌       |

It's like a **light switch** setup that only turns the light on if exactly one switch is flipped!

## 3️⃣ Why is XOR tricky for computers? 🤔
A single-layer network **can't split XOR's answers with one straight line**. It needs a **hidden layer** with **multiple neurons** to figure it out!

## 4️⃣ What do we do in this notebook? 📝
- **Create a neural network** with one hidden layer 🏗️
- **Train it** to learn the XOR rule 🎓
- **Try different numbers of neurons** (1, 2, 3...) to see what works best! ⚡

By the end, our computer will **solve the XOR puzzle** and be smarter! 🧠🚀

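The XOR experiment can be sketched like this — a minimal version with a 3-neuron hidden layer; the seed, learning rate, and epoch count are illustrative, not the notebook's settings:

```python
import torch
import torch.nn as nn

torch.manual_seed(1)

# The four XOR cases and their answers
X = torch.tensor([[0., 0.], [0., 1.], [1., 0.], [1., 1.]])
y = torch.tensor([[0.], [1.], [1.], [0.]])

# One hidden layer with a few neurons is enough to bend the decision boundary
model = nn.Sequential(nn.Linear(2, 3), nn.Sigmoid(), nn.Linear(3, 1), nn.Sigmoid())
criterion = nn.BCELoss()
optimizer = torch.optim.SGD(model.parameters(), lr=1.0)

for epoch in range(5000):
    optimizer.zero_grad()
    loss = criterion(model(X), y)
    loss.backward()
    optimizer.step()

print(model(X).detach().round().flatten())  # predictions for the four cases
print("final loss:", loss.item())
```

To repeat the notebook's experiment, change the hidden size from 3 down to 1 (which cannot represent XOR) and back up, and watch the final loss.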
# 🧠 Teaching a Computer to Read Numbers! 🔢🤖

## 1️⃣ What are we doing? 🎯
We are training a **computer brain** to look at pictures of numbers (0-9) and guess what they are!

## 2️⃣ What is the MNIST Dataset? 📸
MNIST is a **big collection of handwritten numbers** that we use to teach computers how to recognize digits.

## 3️⃣ How does the Computer Learn? 🏗️
- The computer looks at **lots of examples** of numbers. 👀
- It tries to guess what number each image shows. 🤔
- If it's **wrong**, we help it learn and get better! 📚
- After **lots of practice**, it becomes really smart! 🚀

## 4️⃣ What's Special About This Network? 🤔
We are using a **simple neural network** with **one hidden layer**. This layer helps the computer **understand patterns** in the numbers!

## 5️⃣ What Will We Do in This Notebook? 📝
- **Build a simple neural network** with **one hidden layer**. 🏗️
- **Train it** to recognize numbers. 🎓
- **Test it** to see how smart it is! ✅

By the end, our computer will **read numbers just like you!** 🧠💡

# ⚡ Making the Computer Think Better! 🧠

## 1️⃣ What are we doing? 🎯
We are learning about **activation functions** – special rules that help a computer **decide things**!

## 2️⃣ What is an Activation Function? 🤔
Think of a **light switch**! 💡
- If you turn it **ON**, the light shines.
- If you turn it **OFF**, the light is dark.

Activation functions help a computer **decide** what to focus on, just like flipping a switch!

## 3️⃣ Types of Activation Functions 🔢
We will learn about:
- **Sigmoid**: A soft switch that changes gradually from 0 to 1.
- **Tanh**: Like Sigmoid, but centered around zero (from -1 to 1).
- **ReLU**: The simplest switch, and usually the fastest for learning!

## 4️⃣ What Will We Do in This Notebook? 📝
- **Learn about different activation functions** ⚡
- **Try them in a neural network** 🏗️
- **See which one works best** ✅

By the end, we'll know how computers **make smart choices!** 🤖

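The three switches are just small formulas, easy to compare side by side (a minimal sketch using only the standard library):

```python
import math

def sigmoid(x):
    return 1 / (1 + math.exp(-x))   # soft switch: output between 0 and 1

def tanh(x):
    return math.tanh(x)             # like sigmoid, but between -1 and 1

def relu(x):
    return max(0.0, x)              # hard switch: negatives become 0

for x in (-2.0, 0.0, 2.0):
    print(f"x={x:+.0f}  sigmoid={sigmoid(x):.3f}  tanh={tanh(x):+.3f}  relu={relu(x):.1f}")
```

Notice that sigmoid(0) = 0.5 and tanh(0) = 0 — tanh is the "centered around zero" version, while ReLU simply cuts off everything below zero.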
# 🔢 Helping a Computer Read Numbers Better! 🧠🤖

## 1️⃣ What are we doing? 🎯
We are testing **three different activation functions** to see which one helps the computer **read numbers the best!**

## 2️⃣ What is an Activation Function? 🤔
An activation function helps the computer **decide things**!
It's like a **brain switch** that turns information **ON or OFF** so the computer can learn better.

## 3️⃣ What Activation Functions Are We Testing? ⚡
- **Sigmoid**: Soft decision-making. 🧐
- **Tanh**: Like Sigmoid, but centered around zero. 🔥
- **ReLU**: The fastest and most powerful! ⚡

## 4️⃣ What Will We Do in This Notebook? 📝
- **Train a computer** to read handwritten numbers! 🔢
- **Use different activation functions** and compare them. ⚡
- **See which one works best** for accuracy! ✅

By the end, we'll know which function helps the computer **think the smartest!** 🧠🚀

# 🧠 What is a Deep Neural Network? 🤖

## 1️⃣ What are we doing? 🎯
We are building a **Deep Neural Network (DNN)** to help a computer **understand and recognize numbers**!

## 2️⃣ What is a Deep Neural Network? 🤔
A Deep Neural Network is a **super smart computer brain** with **many layers**.
Each layer **learns something new** and helps the computer make better decisions.

Think of it like:
- 👶 **A baby** trying to recognize a cat 🐱 → It might get confused!
- 👦 **A child** learning from books 📚 → Gets better at it!
- 🧑 **An expert** who has seen many cats 🏆 → Can recognize them instantly!

A **Deep Neural Network** works the same way—it **learns step by step**!

## 3️⃣ Why is a Deep Neural Network better? 🚀
- ✅ **More layers** = **More learning!**
- ✅ Can understand **complex patterns**.
- ✅ Can make **smarter decisions**!

## 4️⃣ What Will We Do in This Notebook? 📝
- **Build a Deep Neural Network** with multiple layers 🏗️
- **Train it** to recognize handwritten numbers 🔢
- **Try different activation functions** (Sigmoid, Tanh, ReLU) ⚡
- **See which one works best!** ✅

By the end, our computer will be **super smart** at recognizing patterns! 🧠🚀

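Stacking layers is how "deep" happens in code. A minimal sketch (layer sizes are illustrative; swap `nn.ReLU()` for `nn.Tanh()` or `nn.Sigmoid()` to try the other activations):

```python
import torch
import torch.nn as nn

# A deep network: several hidden layers, each building on the previous one
model = nn.Sequential(
    nn.Linear(28 * 28, 128), nn.ReLU(),   # layer 1: raw pixels -> simple patterns
    nn.Linear(128, 64), nn.ReLU(),        # layer 2: simple patterns -> bigger shapes
    nn.Linear(64, 10),                    # layer 3: shapes -> one score per digit
)

x = torch.randn(5, 28 * 28)   # five fake flattened 28x28 images
out = model(x)
print(out.shape)              # torch.Size([5, 10])
```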
# 🌀 Teaching a Computer to See Spirals! 🤖

## 1️⃣ What are we doing? 🎯
We are teaching a **computer brain** to look at points in a spiral shape and **figure out which group they belong to**!

## 2️⃣ Why is this tricky? 🤔
The points are **twisted into spirals** 🌀, so the computer needs to be **really smart** to tell them apart.
It needs a **deep neural network** to **understand the swirl**!

## 3️⃣ How does the Computer Learn? 🏗️
- It looks at **many points** 👀
- It **guesses** which spiral they belong to ❓
- If it's **wrong**, we help it fix mistakes! 🚀
- After **lots of practice**, it gets really good at sorting them! ✅

## 4️⃣ What's Special About This Network? 🧠
- We use **ReLU activation** ⚡ to make learning **faster and better**!
- We **train it** to separate the spiral points into **different colors**! 🎨

## 5️⃣ What Will We Do in This Notebook? 📝
- **Build a deep neural network** with **many layers** 🏗️
- **Train it** to separate spirals 🌀
- **Check if it gets them right**! ✅

By the end, our computer will **see the spirals just like us!** 🧠✨

# 🎓 Teaching a Computer to Be Smarter with Dropout! 🤖

## 1️⃣ What are we doing? 🎯
We are training a **computer brain** to make better predictions by using **Dropout**!

## 2️⃣ What is Dropout? 🤔
Dropout is like **playing a game with one eye closed**! 👀
- It makes the computer **forget** some parts of what it learned **on purpose**!
- This helps it **not get stuck** memorizing the training examples.
- Instead, it learns to **think better** and make **stronger predictions**!

## 3️⃣ Why is Dropout Important? 🧠
Imagine learning math but only using the same **five problems** over and over.
- You'll **memorize** them but struggle with new ones! 😕
- Dropout **mixes things up** so the computer learns **general rules**, not just examples! 🚀

## 4️⃣ What Will We Do in This Notebook? 📝
- **Make some data** to train our computer. 📊
- **Build a neural network** and use Dropout. 🏗️
- **Train it using Batch Gradient Descent** (a way to help the computer learn step by step). 🏃
- **See how Dropout helps prevent overfitting!** ✅

By the end, our computer will **make smarter decisions** instead of just memorizing! 🧠✨

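In PyTorch, Dropout is a layer you place between the others; it randomly zeroes activations during training and switches itself off at evaluation time. A minimal sketch (sizes and the drop probability are illustrative):

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# p=0.5 means half of the hidden activations are randomly "forgotten"
# on every training pass
model = nn.Sequential(
    nn.Linear(2, 16), nn.ReLU(),
    nn.Dropout(p=0.5),
    nn.Linear(16, 1),
)

x = torch.randn(4, 2)

model.train()                 # training mode: dropout is active
out_train = model(x)

model.eval()                  # evaluation mode: dropout is switched off
with torch.no_grad():
    out_eval = model(x)

print(out_train.shape, out_eval.shape)
```

Forgetting to call `model.eval()` before testing is a classic bug: predictions stay noisy because dropout keeps firing.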
# 📉 Teaching a Computer to Predict Numbers with Dropout! 🤖

## 1️⃣ What is Regression? 🤔
Regression is when a computer **learns from past numbers** to **predict future numbers**!
For example:
- If you save **$5 every week**, how much will you have in **10 weeks**? 💰
- The computer **looks at patterns** and **makes a smart guess**!

## 2️⃣ Why Do We Need Dropout? 🚀
Sometimes, the computer **memorizes too much** and doesn't learn the real pattern. 😵
Dropout **randomly turns off** parts of the computer's learning, so it **thinks smarter** instead of just remembering numbers.

## 3️⃣ What's Happening in This Notebook? 📝
- **We make number data** for the computer to learn from. 📊
- **We build a model** using PyTorch to predict numbers. 🏗️
- **We add Dropout** to stop the model from memorizing. ❌🧠
- **We check if Dropout helps the model predict better!** ✅

By the end, our computer will be **smarter at guessing numbers!** 🧠✨

# 🏗️ Why Can't We Start with the Same Weights? 🤖

## 1️⃣ What is Weight Initialization? 🤔
When a computer **learns** using a neural network, it starts with **random numbers** (weights) and adjusts them over time to get better.

## 2️⃣ What Happens if We Use the Same Weights? 🚨
If all the starting weights are **the same**, the computer gets **confused**! 😵
- Every neuron learns **the exact same thing** → No variety!
- The network **doesn't improve**, and learning **gets stuck**.

## 3️⃣ What Will We Do in This Notebook? 📝
- **Make a simple neural network** to test this. 🏗️
- **Initialize all weights the same way** to see what happens. ⚖️
- **Try using different random weights** and compare the results! 🎯

By the end, we'll see why **random weight initialization is important** for a smart neural network! 🧠✨

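The "no variety" problem is easy to demonstrate: give two neurons identical starting weights and they compute identical outputs, so they also receive identical gradients and can never grow apart. A minimal sketch:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# A two-neuron layer where every weight starts at the same constant
layer = nn.Linear(2, 2)
nn.init.constant_(layer.weight, 0.5)
nn.init.constant_(layer.bias, 0.0)

x = torch.randn(1, 2)
out = layer(x)

# Both neurons compute exactly the same value — they are stuck being twins
print(out)
```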
# 🎯 Helping a Computer Learn Better with Xavier Initialization! 🤖

## 1️⃣ What is Weight Initialization? 🤔
When a neural network **starts learning**, it needs to begin with **some numbers** (called weights).
If we **pick bad starting numbers**, the network **won't learn well**!

## 2️⃣ What is Xavier Initialization? ⚖️
Xavier Initialization is a **smart way** to pick these starting numbers.
It **balances** them so they're **not too big** or **too small**.
This helps the computer **learn faster** and **make better decisions**! 🚀

## 3️⃣ What Will We Do in This Notebook? 📝
- **Build a neural network** to recognize handwritten numbers. 🔢
- **Use Xavier Initialization** to set up good starting weights. 🎯
- **Compare** how well the network learns! ✅

By the end, we'll see why **starting right** helps a neural network **become smarter!** 🧠✨

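Concretely, Xavier (Glorot) uniform initialization draws each weight from a range scaled by the layer's size: ±√(6 / (fan_in + fan_out)). A minimal sketch (the 784→100 layer is an illustrative choice, roughly an MNIST input layer):

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

layer = nn.Linear(784, 100)
# Scale the starting weights by the layer's fan-in/fan-out so signals
# are neither squashed toward zero nor blown up as they pass through
nn.init.xavier_uniform_(layer.weight)
nn.init.zeros_(layer.bias)

limit = (6 / (784 + 100)) ** 0.5
print("largest weight magnitude:", layer.weight.abs().max().item())
print("Xavier limit:", limit)
```

The bigger the layer, the smaller the starting weights — that is the "balanced, not too big or too small" idea in one formula.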
# 🚀 Helping a Computer Learn Faster with Momentum! 🤖

## 1️⃣ What is a Polynomial Function? 📈
A polynomial function is a math equation with **powers** (like squared or cubed numbers).
For example:
- \( y = x^2 + 3x + 5 \)
- \( y = x^3 - 2x^2 + x \)

These are tricky for a computer to learn! 😵

## 2️⃣ What is Momentum? ⚡
Imagine rolling a ball down a hill. ⛰️🏀
- If the ball **stops at every step**, it takes **a long time** to reach the bottom.
- But if we give it **momentum**, it **keeps going** and moves faster! 🚀

Momentum helps a neural network **move in the right direction** without getting stuck.

## 3️⃣ What Will We Do in This Notebook? 📝
- **Teach a computer to learn polynomial functions.** 📊
- **Use Momentum** to help it learn faster. 🏃
- **Compare it to normal learning** and see why Momentum is better! ✅

By the end, we'll see how **Momentum helps a neural network** learn tricky math problems **faster and smarter!** 🧠✨

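The ball-on-a-hill picture maps directly onto the optimizer: with momentum, each step keeps a fraction of the previous step's velocity. A minimal sketch that rolls down the simplest hill, y = x² (learning rate and momentum values are illustrative):

```python
import torch

# Start the "ball" at x = 4 and roll it toward the minimum of y = x^2 at x = 0
x = torch.tensor([4.0], requires_grad=True)
optimizer = torch.optim.SGD([x], lr=0.1, momentum=0.9)

for step in range(100):
    optimizer.zero_grad()
    y = x ** 2
    y.backward()        # gradient = slope of the hill at the current point
    optimizer.step()    # move, carrying over 90% of the previous velocity

print(x.item())  # close to 0, the bottom of the hill
```

Setting `momentum=0` gives plain gradient descent for comparison; with the same learning rate it creeps downhill noticeably more slowly.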
# 🏃‍♂️ Helping a Neural Network Learn Faster with Momentum! 🚀

## 1️⃣ What is a Neural Network? 🤖
A neural network is a **computer brain** that learns by **adjusting numbers (weights)** to make good predictions.

## 2️⃣ What is Momentum? ⚡
Imagine pushing a heavy box. 📦
- If you **push and stop**, it moves slowly. 😴
- But if you **keep pushing**, it **gains speed** and moves **faster**! 🚀

Momentum helps a neural network **keep moving in the right direction** without getting stuck!

## 3️⃣ What Will We Do in This Notebook? 📝
- **Train a neural network** to recognize patterns. 🎯
- **Use Momentum** to help it learn faster. 🏃‍♂️
- **Compare it to normal learning** and see why Momentum is better! ✅

By the end, we'll see how **Momentum helps a neural network** become **faster and smarter!** 🧠✨

# 🚀 Helping a Neural Network Learn Better with Batch Normalization! 🤖

## 1️⃣ What is a Neural Network? 🧠
A neural network is like a **computer brain** that learns by adjusting **numbers (weights)** to make smart decisions.

## 2️⃣ What is Batch Normalization? ⚖️
Imagine a race where everyone starts at **different speeds**. Some are too slow, and some are too fast. 🏃‍♂️💨
Batch Normalization **balances the speeds** so everyone runs **smoothly together**!

For a neural network, this means:
- **Making learning faster** 🚀
- **Stopping extreme values** that cause bad learning ❌
- **Helping the network work better** with deep layers! 🏗️

## 3️⃣ What Will We Do in This Notebook? 📝
- **Train a neural network** to recognize patterns. 🎯
- **Use Batch Normalization** to help it learn better. ⚖️
- **Compare it to normal learning** and see the difference! ✅

By the end, we'll see why **Batch Normalization** makes neural networks **faster and smarter!** 🧠✨

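"Balancing the speeds" means rescaling each feature across a batch to roughly zero mean and unit variance. A minimal sketch of what a BatchNorm layer does to wildly-scaled inputs (the data here is made up):

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# BatchNorm1d normalizes each of the 3 features across the batch
bn = nn.BatchNorm1d(num_features=3)

x = torch.randn(8, 3) * 10 + 5   # features running at wildly different "speeds"
out = bn(x)                       # fresh module defaults to training mode

print(out.mean(dim=0))  # near 0 for every feature
print(out.std(dim=0))   # near 1 for every feature
```

After this rescaling no single feature dominates the gradients, which is a large part of why training becomes faster and more stable.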
# 👀 How Do Computers See? Understanding Convolution! 🤖

## 1️⃣ What is Convolution? 🔍
Convolution is like **giving a computer glasses** to help it focus on parts of an image! 🕶️
- It **looks at small parts** of a picture instead of the whole thing at once. 🖼️
- It **finds patterns**, like edges, shapes, or textures. 🔲

## 2️⃣ Why Do We Use It? 🎯
Imagine finding **Waldo** in a giant picture! 🔎👦
- Instead of looking at everything at once, we **scan** small parts at a time.
- Convolution helps computers **scan images smartly** to recognize objects! 🏆

## 3️⃣ What Will We Do in This Notebook? 📝
- **Learn how convolution works** step by step. 🛠️
- **See how it helps computers find patterns** in images. 🖼️
- **Understand why convolution is used in AI** for image recognition! 🤖✅

By the end, we'll see how convolution helps computers **see and understand pictures like humans!** 🧠✨

482 |
+
# 🖼️ How Do Computers See Images? Understanding Activation & Max Pooling! 🤖

## 1️⃣ What is an Activation Function? ⚡
Activation functions **help the computer make smart decisions**! 🧠
- They decide **which patterns are important** in an image.
- Without them, the computer wouldn’t know what to focus on! 🎯

## 2️⃣ What is Max Pooling? 🔍
Max Pooling is like **shrinking an image** while keeping the best parts!
- It **takes the most important details** and removes extra noise. 🎛️
- This makes the computer **faster and better at recognizing objects!** 🚀

## 3️⃣ What Will We Do in This Notebook? 📝
- **See how activation functions work** to find patterns. 🔎
- **Learn how max pooling makes images smaller but useful.** 📉
- **Understand why these tricks make AI smarter!** 🤖✅

By the end, we’ll see how **activation & pooling help computers "see" images like we do!** 🧠✨
# 🌈 How Do Computers See Color? Understanding Multiple Channel Convolution! 🤖

## 1️⃣ What is a Channel in an Image? 🎨
Think of a picture on your screen. 🖼️
- A **black & white** image has **1 channel** (just light & dark). ⚫⚪
- A **color image** has **3 channels**: **Red, Green, and Blue (RGB)!** 🌈

Computers **combine these channels** to see full-color pictures!

## 2️⃣ What is Multiple Channel Convolution? 🔍
- Instead of looking at just one channel, the computer **processes all 3 (RGB)** at the same time. 🔴🟢🔵
- This helps it **find edges, textures, and patterns in color images**! 🎯

## 3️⃣ What Will We Do in This Notebook? 📝
- **See how convolution works on multiple channels.** 👀
- **Understand how computers recognize colors & details.** 🖼️
- **Learn why this is important for AI and image recognition!** 🤖✅

By the end, we’ll see how **computers process full-color images like we do!** 🧠✨
# 🖼️ How Do Computers Recognize Pictures? Understanding CNNs! 🤖

## 1️⃣ What is a Convolutional Neural Network (CNN)? 🧠
A CNN is a special **computer brain** designed to **look at pictures** and find patterns! 🔍
- It **scans an image** like our eyes do. 👀
- It learns to recognize **shapes, edges, and objects**. 🎯
- This helps AI **identify things in pictures**, like cats 🐱, dogs 🐶, or numbers 🔢!

## 2️⃣ How Does a CNN Work? ⚙️
A CNN has **layers** that help it learn step by step:
1. **Convolution Layer** – Finds small details like edges and corners. 🔲
2. **Pooling Layer** – Shrinks the image but keeps the important parts. 📉
3. **Fully Connected Layer** – Makes the final decision! ✅

## 3️⃣ What Will We Do in This Notebook? 📝
- **Build a simple CNN** that can recognize images. 🏗️
- **See how each layer helps the computer "see" better.** 👀
- **Understand why CNNs are great at image recognition!** 🚀

By the end, we’ll see how **CNNs help computers recognize pictures just like humans do!** 🧠✨

---
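The three layers listed above can be sketched as a tiny PyTorch model (the layer sizes here are illustrative, not from the course notebooks):

```python
import torch
from torch import nn

# conv -> pool -> fully connected, mirroring the three CNN layers
class SimpleCNN(nn.Module):
    def __init__(self, num_classes=10):
        super().__init__()
        self.conv = nn.Conv2d(1, 8, kernel_size=3, padding=1)  # finds edges
        self.pool = nn.MaxPool2d(2)                            # shrinks the map
        self.fc = nn.Linear(8 * 14 * 14, num_classes)          # final decision

    def forward(self, x):
        x = self.pool(torch.relu(self.conv(x)))
        return self.fc(x.flatten(1))

model = SimpleCNN()
scores = model(torch.randn(4, 1, 28, 28))  # batch of 4 grayscale 28x28 images
print(scores.shape)  # torch.Size([4, 10])
```

Each image ends up as 10 scores, one per class; the highest score is the network's guess.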
# 🖼️ Teaching a Computer to See Small Pictures! 🤖

## 1️⃣ What is a CNN? 🧠
A **Convolutional Neural Network (CNN)** is a special AI that **looks at pictures and finds patterns**! 🔍
- It scans images **piece by piece** like a puzzle. 🧩
- It learns to recognize **shapes, edges, and objects**. 🎯
- CNNs help AI recognize **faces, animals, and numbers**! 🐱🔢👀

## 2️⃣ Why Small Images? 📏
Small images are **harder to understand** because they have **fewer details**!
- A CNN needs to **work extra hard** to find important features. 💪
- We use **smaller filters and layers** to capture details. 🎛️

## 3️⃣ What Will We Do in This Notebook? 📝
- **Train a CNN on small images.** 🏗️
- **See how it learns to recognize patterns.** 🔎
- **Understand why CNNs work well, even with tiny pictures!** 🚀

By the end, we’ll see how **computers can recognize even small images with AI!** 🧠✨

---
# 🖼️ Teaching a Computer to See Small Pictures with Batches! 🤖

## 1️⃣ What is a CNN? 🧠
A **Convolutional Neural Network (CNN)** is a special AI that **looks at pictures and learns patterns**! 🔍
- It **finds shapes, edges, and objects** in an image. 🎯
- It helps AI recognize **faces, animals, and numbers**! 🐱🔢👀

## 2️⃣ What is a Batch? 📦
Instead of looking at **one image at a time**, the computer looks at **a group (batch) of images** at once!
- This **makes learning faster**. 🚀
- It helps the CNN **understand patterns better**. 🧠✅

## 3️⃣ Why Small Images? 📏
Small images have **fewer details**, so the CNN must **work harder to find patterns**. 💪
- We **train in batches** to help the computer **learn faster and better**. 🎛️

## 4️⃣ What Will We Do in This Notebook? 📝
- **Train a CNN on small images using batches.** 🏗️
- **See how it learns to recognize objects better.** 🔎
- **Understand why batching helps AI train efficiently!** ⚡

By the end, we’ll see how **CNNs learn faster and smarter with batches!** 🧠✨

---
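Batching is what PyTorch's `DataLoader` handles for us; a minimal sketch with fake data (the image size and batch size here are illustrative):

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# 100 fake 8x8 "small images" with random labels, just for illustration
images = torch.randn(100, 1, 8, 8)
labels = torch.randint(0, 10, (100,))
dataset = TensorDataset(images, labels)

# batch_size=20 -> the model sees 20 images per training step instead of 1
loader = DataLoader(dataset, batch_size=20, shuffle=True)

for batch_images, batch_labels in loader:
    print(batch_images.shape)  # torch.Size([20, 1, 8, 8])
    break
```

With 100 samples and a batch size of 20, one pass over the data takes 5 steps instead of 100.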
# 🖼️ Teaching a Computer to Recognize Handwritten Numbers! 🤖

## 1️⃣ What is a CNN? 🧠
A **Convolutional Neural Network (CNN)** is a smart AI that **looks at pictures and learns patterns**! 🔍
- It **finds shapes, lines, and curves** in images. 🔢
- It helps AI recognize **digits and handwritten numbers**! ✏️

## 2️⃣ Why Handwritten Numbers? 🔢
Handwritten numbers are **tricky** because everyone writes differently!
- A CNN must **learn the different ways** people write the same number.
- This helps it **recognize digits** even if they are messy. 💡

## 3️⃣ What Will We Do in This Notebook? 📝
- **Train a CNN to classify images of handwritten numbers.** 🏗️
- **See how it learns to recognize different digits.** 🔎
- **Understand how AI can analyze images of handwritten numbers!** 🚀

By the end, we’ll see how **computers can recognize handwritten numbers just like we do!** 🧠✨
ShallowNeuralNetwork.md
ADDED
@@ -0,0 +1,86 @@
1️⃣ **Introduction to Neural Networks (One Hidden Layer)** 🤖
- A neural network is like a **thinking machine** that makes decisions.
- It **learns from data** and gets better over time.
- We build a network with **one hidden layer** to help it **think smarter**.

2️⃣ **More Neurons, Better Learning!** 🧠
- If a network **isn’t smart enough**, we add **more neurons**!
- More neurons = **better decision-making**.
- We train the network to **recognize patterns more accurately**.

3️⃣ **Neural Networks with Multiple Inputs** 🔢
- Instead of just **one piece of data**, we give the network **many inputs**.
- This helps it **understand more complex problems**.
- Too many neurons = **overfitting (too specific)**, too few = **underfitting (too simple)**.

4️⃣ **Multi-Class Neural Networks** 🎨
- Instead of choosing between **two options**, the network can choose **many!**
- It learns to **classify things into multiple groups**, like recognizing **different animals**.
- The Softmax function helps it **pick the best answer**.

5️⃣ **Backpropagation: Learning from Mistakes** 🔄
- The network **makes a guess**, checks if it’s right, and **fixes itself**.
- It does this using **backpropagation**, which adjusts the neurons.
- This is how AI **gets smarter with time**!

6️⃣ **Activation Functions: Helping AI Decide** ⚡
- Activation functions **control how neurons react**.
- Three common types:
  - **Sigmoid** → Good for probabilities.
  - **Tanh** → Helps balance data.
  - **ReLU** → Fastest and most useful!
- These functions help the network **learn efficiently**.
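The one-hidden-layer network described in point 1 can be sketched in PyTorch (the layer sizes are illustrative):

```python
import torch
from torch import nn

# One hidden layer with a sigmoid activation -- a "shallow" network
class ShallowNet(nn.Module):
    def __init__(self, input_size=2, hidden_size=5, output_size=2):
        super().__init__()
        self.hidden = nn.Linear(input_size, hidden_size)
        self.out = nn.Linear(hidden_size, output_size)

    def forward(self, x):
        x = torch.sigmoid(self.hidden(x))  # hidden layer "thinks"
        return self.out(x)                 # output layer decides

net = ShallowNet()
print(net(torch.randn(3, 2)).shape)  # torch.Size([3, 2])
```

Adding more neurons (point 2) is just raising `hidden_size`; adding more inputs (point 3) is raising `input_size`.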
# 📖 AI Terms and Definitions (Based on the Videos) 🤖

### 🧠 **Neural Network**
A **computer brain** that learns by adjusting numbers (weights) to make decisions.

### 🎯 **Classification**
Teaching AI to **sort things into groups**, like recognizing cats 🐱 and dogs 🐶 in pictures.

### ⚡ **Activation Function**
A rule that helps AI **decide which information is important**. Examples:
- **Sigmoid** → Soft decision-making.
- **Tanh** → Balances positive and negative values.
- **ReLU** → Fast and effective!

### 🔄 **Backpropagation**
AI’s way of **fixing mistakes** by looking at errors and adjusting itself.

### 📉 **Loss Function**
A **score** that tells AI **how wrong** it was, so it can improve.

### 🚀 **Gradient Descent**
A method that helps AI **learn step by step** by making small changes to improve.

### 🏗️ **Hidden Layer**
A **middle part of a neural network** that helps process complex information.

### 🌀 **Softmax Function**
Helps AI **choose the best answer** when there are multiple choices.

### ⚖️ **Cross Entropy Loss**
A way to measure **how well AI is learning** when making choices.

### 📊 **Multi-Class Neural Networks**
AI models that can **choose from many options**, not just two.

### 🏎️ **Momentum**
A trick that helps AI **learn faster** by keeping track of past updates.

### 🔍 **Overfitting**
When AI **memorizes too much** and struggles with new data.

### 😕 **Underfitting**
When AI **doesn’t learn enough** and makes bad predictions.

### 🎨 **Convolutional Neural Network (CNN)**
A special AI for **understanding images**, used in things like face recognition.

### 📦 **Batch Processing**
Instead of training on **one piece of data at a time**, AI looks at **many pieces at once** to learn faster.

### 🏗️ **PyTorch**
A tool that helps build and train neural networks easily.
SoftmaxRegression.md
ADDED
@@ -0,0 +1,16 @@
🔢 **Softmax: Helping AI Pick the Best Answer** 🎯
- AI sometimes has **many choices**, not just **yes or no**.
- The **Softmax function** helps AI **decide which answer is most likely**.
- It looks at **all possibilities** and picks the **best one**!

🖼️ **Softmax for Images: Teaching AI to Recognize Pictures** 🧠
- AI can look at a picture and **guess what it is**.
- It checks **different parts** of the image and compares them.
- Softmax helps AI **pick the right label** (like "cat" or "dog").

📊 **Softmax in PyTorch: Making AI Smarter** ⚙️
- AI needs **training** to get better at choosing answers.
- PyTorch helps AI **use Softmax** to **learn from mistakes**.
- After training, AI **makes better predictions**!

By the end, AI will **think smarter** and **recognize things better**! 🚀
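A minimal sketch of how Softmax turns raw scores into a "best answer" (the class names are made up for illustration):

```python
import torch

# Raw scores ("logits") for three classes: cat, dog, bird
logits = torch.tensor([2.0, 1.0, 0.1])

probs = torch.softmax(logits, dim=0)  # turns scores into probabilities
print(probs.sum())     # probabilities always add up to 1
print(probs.argmax())  # index 0 -> "cat" is the most likely answer
```

Because the probabilities sum to 1, the model's confidence in each class can be compared directly.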
UpperLower.svg
ADDED
ZXX Bold.otf
ADDED
Binary file (11.8 kB)
ZXX Camo.otf
ADDED
Binary file (31 kB)
ZXX False.otf
ADDED
Binary file (13.7 kB)
ZXX Noise.otf
ADDED
Binary file (18.8 kB)
ZXX Sans.otf
ADDED
Binary file (11.7 kB)
ZXX Xed.otf
ADDED
Binary file (21.2 kB)
after_logistic.png
ADDED
after_shallow.png
ADDED
after_softmax.png
ADDED
after_training_words.png
ADDED
after_training_words_Chesilin.png
ADDED
app.py
ADDED
@@ -0,0 +1,219 @@
import os

import gradio as gr
import torch
import matplotlib.pyplot as plt
from matplotlib import font_manager
from torch import nn, optim
from torch.utils.data import DataLoader
import torchvision.transforms as transforms

# Import your modules
from logistic_regression import LogisticRegressionModel
from softmax_regression import SoftmaxRegressionModel
from shallow_neural_network import ShallowNeuralNetwork
import convolutional_neural_networks
from dataset_loader import CustomMNISTDataset
from final_project import train_final_model, get_dataset_options, FinalCNN


def number_to_char(number):
    """Map class indices 0-61 to the characters 0-9, a-z, A-Z."""
    if 0 <= number <= 9:
        return str(number)       # 0-9
    elif 10 <= number <= 35:
        return chr(number + 87)  # a-z (10 -> 'a', 35 -> 'z')
    elif 36 <= number <= 61:
        return chr(number + 29)  # A-Z (36 -> 'A', 61 -> 'Z')
    else:
        return ''


def visualize_predictions_svg(model, train_loader, stage):
    """Visualize predictions and return an SVG string for Gradio display."""
    # Load the Daemon font
    font_path = './Daemon.otf'  # Path to your Daemon font
    prop = font_manager.FontProperties(fname=font_path)

    fig, ax = plt.subplots(6, 3, figsize=(12, 16))  # 6 rows and 3 columns for 18 images

    model.eval()
    images, labels = next(iter(train_loader))
    images, labels = images[:18], labels[:18]  # Get 18 images and labels

    with torch.no_grad():
        outputs = model(images)
        _, predictions = torch.max(outputs, 1)

    for i in range(18):  # Iterate over 18 images
        ax[i // 3, i % 3].imshow(images[i].squeeze(), cmap='gray')

        # Convert predictions and labels to characters
        pred_char = number_to_char(predictions[i].item())
        label_char = number_to_char(labels[i].item())

        # Display = or != based on prediction
        if pred_char == label_char:
            title_text = f"{pred_char} = {label_char}"
            color = 'green'  # Green if correct
        else:
            title_text = f"{pred_char} != {label_char}"
            color = 'red'  # Red if incorrect

        # Set title with Daemon font and color
        ax[i // 3, i % 3].set_title(title_text, fontproperties=prop, fontsize=12, color=color)
        ax[i // 3, i % 3].axis('off')

    # Convert the figure to SVG
    svg_str = figure_to_svg(fig)
    save_svg_to_output_folder(svg_str, f"{stage}_predictions.svg")  # Save SVG to output folder
    plt.close(fig)

    return svg_str


def figure_to_svg(fig):
    """Convert a matplotlib figure to an SVG string."""
    from io import StringIO
    from matplotlib.backends.backend_svg import FigureCanvasSVG
    canvas = FigureCanvasSVG(fig)
    output = StringIO()
    canvas.print_svg(output)
    return output.getvalue()


def save_svg_to_output_folder(svg_str, filename):
    """Save the SVG string to the output folder."""
    os.makedirs('./output', exist_ok=True)  # Ensure the output folder exists
    output_path = f'./output/{filename}'
    with open(output_path, 'w') as f:
        f.write(svg_str)


def plot_metrics_svg(losses, accuracies):
    """Generate training metrics as an SVG string."""
    fig, ax = plt.subplots(1, 2, figsize=(12, 5))

    ax[0].plot(losses, label='Loss', color='red')
    ax[0].set_title('Training Loss')
    ax[0].set_xlabel('Epoch')
    ax[0].set_ylabel('Loss')
    ax[0].legend()

    ax[1].plot(accuracies, label='Accuracy', color='green')
    ax[1].set_title('Training Accuracy')
    ax[1].set_xlabel('Epoch')
    ax[1].set_ylabel('Accuracy')
    ax[1].legend()

    plt.tight_layout()
    svg_str = figure_to_svg(fig)
    save_svg_to_output_folder(svg_str, "training_metrics.svg")  # Save metrics SVG to output folder
    plt.close(fig)

    return svg_str


def train_model_interface(module, dataset_name, epochs=100, lr=0.01):
    """Train the selected model with the chosen dataset."""
    transform = transforms.Compose([
        transforms.Resize((28, 28)),
        transforms.Grayscale(num_output_channels=1),
        transforms.ToTensor(),
        transforms.Normalize(mean=[0.5], std=[0.5])
    ])

    # Load dataset using CustomMNISTDataset
    train_dataset = CustomMNISTDataset(os.path.join("data", dataset_name, "raw"), transform=transform)
    train_loader = DataLoader(train_dataset, batch_size=64, shuffle=True)

    # Select Model
    if module == "Logistic Regression":
        model = LogisticRegressionModel(input_size=1)
    elif module == "Softmax Regression":
        model = SoftmaxRegressionModel(input_size=2, num_classes=2)
    elif module == "Shallow Neural Networks":
        model = ShallowNeuralNetwork(input_size=2, hidden_size=5, output_size=2)
    elif module == "Deep Networks":
        import deep_networks
        model = deep_networks.DeepNeuralNetwork(input_size=10, hidden_sizes=[20, 10], output_size=2)
    elif module == "Convolutional Neural Networks":
        model = convolutional_neural_networks.ConvolutionalNeuralNetwork()
    elif module == "AI Calligraphy":
        model = FinalCNN()
    else:
        # Invalid module selection: return the same arity as the success path
        return None, None, None, None, None, None

    # Visualize before training
    before_svg = visualize_predictions_svg(model, train_loader, "Before")

    # Train the model
    criterion = nn.CrossEntropyLoss()
    optimizer = optim.SGD(model.parameters(), lr=lr)

    losses, accuracies = train_final_model(model, criterion, optimizer, train_loader, epochs)

    # Visualize after training
    after_svg = visualize_predictions_svg(model, train_loader, "After")

    # Metrics SVG
    metrics_svg = plot_metrics_svg(losses, accuracies)

    return model, losses, accuracies, before_svg, after_svg, metrics_svg


def list_datasets():
    """List all available datasets dynamically."""
    dataset_options = get_dataset_options()
    if not dataset_options:
        return ["No datasets found"]
    return dataset_options


### 🎯 Gradio Interface ###
def run_module(module, dataset_name, epochs, lr):
    """Gradio interface callback."""
    # Train model
    model, losses, accuracies, before_svg, after_svg, metrics_svg = train_model_interface(
        module, dataset_name, epochs, lr
    )

    if model is None:
        return "Error: Invalid selection.", None, None, None

    # Pass the raw SVG strings to Gradio's gr.HTML components for rendering
    return (
        f"Training completed for {module} with {epochs} epochs.",
        before_svg,   # Raw SVG for before training
        after_svg,    # Raw SVG for after training
        metrics_svg   # Training metrics SVG
    )


### 🌟 Gradio UI ###
with gr.Blocks() as app:
    with gr.Tab("Techniques"):
        gr.Markdown("### 🧠 Select Model to Train")

        module_select = gr.Dropdown(
            choices=[
                "AI Calligraphy"
            ],
            label="Select Module"
        )

        dataset_list = gr.Dropdown(choices=list_datasets(), label="Select Dataset")
        epochs = gr.Slider(10, 1024, value=100, step=10, label="Epochs")
        lr = gr.Slider(0.001, 0.1, value=0.01, step=0.001, label="Learning Rate")

        train_button = gr.Button("Train Model")

        output = gr.Textbox(label="Training Output")
        before_svg = gr.HTML(label="Before Training Predictions")
        after_svg = gr.HTML(label="After Training Predictions")
        metrics_svg = gr.HTML(label="Metrics")

        train_button.click(
            run_module,
            inputs=[module_select, dataset_list, epochs, lr],
            outputs=[output, before_svg, after_svg, metrics_svg]
        )

# Launch Gradio app
app.launch(server_name="127.0.0.1", server_port=5555, share=True)
before_logistic.png
ADDED
before_shallow.png
ADDED
before_softmax.png
ADDED
bullets4.ttf
ADDED
Binary file (88.7 kB)
convolutional_neural_networks.md
ADDED
@@ -0,0 +1,703 @@
🎵 **Music Playing**

👋 **Welcome!** Today, we’re learning about **Convolution** in Neural Networks! 🧠🖼️

## 🤔 What is Convolution?
Convolution helps computers **understand pictures** by looking at **patterns** instead of exact positions! 🖼️🔍

Imagine you have **two images** that look almost the same, but one is a little **moved**.
A computer might think they are totally **different**! 😲
**Convolution fixes this problem!** ✅

---

## 🛠️ How Convolution Works

We use something called a **kernel** (a small filter 🔲) that slides over an image.
It **checks different parts** of the picture and creates a new image called an **activation map**!

1️⃣ The **image** is a grid of numbers 🖼️
2️⃣ The **kernel** is a small grid 🔳 that moves across the image
3️⃣ It **multiplies** numbers in the image with the numbers in the kernel ✖️
4️⃣ The results are **added together** ➕
5️⃣ We move to the next spot and **repeat!** 🔄
6️⃣ The final result is the **activation map** 🎯

---

## 📏 How Big is the Activation Map?

The size of the **activation map** depends on:
- **M (image size)** 📏
- **K (kernel size)** 🔳
- **Stride** (how far the kernel moves) 👣

Formula (for a stride of 1):
```
New size = (Image size - Kernel size) + 1
```

Example:
- **4×4 image** 📷
- **2×2 kernel** 🔳
- Activation map = **3×3** ✅
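The size formula (and the effect of stride, covered next) can be checked directly with PyTorch's `nn.Conv2d`:

```python
import torch
from torch import nn

# 4x4 image, 2x2 kernel, stride 1  ->  (4 - 2) + 1 = 3, so a 3x3 map
image = torch.randn(1, 1, 4, 4)  # (batch, channels, height, width)
conv = nn.Conv2d(1, 1, kernel_size=2, stride=1)
print(conv(image).shape)  # torch.Size([1, 1, 3, 3])

# A bigger stride shrinks the map: (4 - 2) // 2 + 1 = 2
conv2 = nn.Conv2d(1, 1, kernel_size=2, stride=2)
print(conv2(image).shape)  # torch.Size([1, 1, 2, 2])
```

Trying different `kernel_size` and `stride` values and predicting the shape first is a good way to internalize the formula.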
## 👣 What is Stride?

Stride is **how far** the kernel moves each time!
- **Stride = 1** ➝ Moves **one step** at a time 🐢
- **Stride = 2** ➝ Moves **two steps** at a time 🚶♂️
- **Bigger stride** = **Smaller** activation map! 📏

---

## 🛑 What is Zero Padding?

Sometimes, the kernel **doesn’t fit** perfectly in the image. 😕
So, we **add extra rows and columns of zeros** around the image! 0️⃣0️⃣0️⃣

This makes sure the **kernel covers everything**! ✅

Formula:
```
New Image Size = Old Size + 2 × Padding
```

---

## 🎨 What About Color Images?

For **black & white** images, we use **Conv2D** with **one channel** (grayscale). 🌑
For **color images**, we use **three channels** (Red, Green, Blue - RGB)! 🎨🌈

---

## 🏆 Summary

✅ Convolution helps computers **find patterns** in images!
✅ We use a **kernel** to create an **activation map**!
✅ **Stride & padding** change how the convolution works!
✅ This is how computers **"see"** images! 👀🤖

---

🎉 **Great job!** Now, let’s try convolution in the lab! 🏗️🤖✨

-----------------------------------------------------------------
🎵 **Music Playing**
|
91 |
+
|
92 |
+
👋 **Welcome!** Today, we’re learning about **Activation Functions** and **Max Pooling**! 🚀🔢
|
93 |
+
|
94 |
+
## 🤖 What is an Activation Function?
|
95 |
+
|
96 |
+
Activation functions help a neural network **decide** what’s important! 🧠
|
97 |
+
They change the values in the activation map to **help the model learn better**.
|
98 |
+
|
99 |
+
---
|
100 |
+
|
101 |
+
## 🔥 Example: ReLU Activation Function
|
102 |
+
|
103 |
+
1️⃣ We take an **input image** 🖼️
|
104 |
+
2️⃣ We apply **convolution** to create an **activation map** 📊
|
105 |
+
3️⃣ We apply **ReLU (Rectified Linear Unit)**:
|
106 |
+
- **If a value is negative** ➝ Change it to **0** ❌
|
107 |
+
- **If a value is positive** ➝ Keep it ✅
|
108 |
+
|
109 |
+
### 🛠 Example Calculation
|
110 |
+
|
111 |
+
| Before ReLU | After ReLU |
|
112 |
+
|-------------|------------|
|
113 |
+
| -4 | 0 |
|
114 |
+
| 0 | 0 |
|
115 |
+
| 4 | 4 |
|
116 |
+
|
117 |
+
All **negative numbers** become **zero**! ✨
|
118 |
+
|
119 |
+
In PyTorch, we apply the ReLU function **after convolution**:
|
120 |
+
|
121 |
+
```python
|
122 |
+
import torch.nn.functional as F
|
123 |
+
|
124 |
+
output = F.relu(conv_output)
|
125 |
+
```
|
126 |
+
|
127 |
+
---
|
128 |
+
|
129 |
+
## 🌊 What is Max Pooling?
|
130 |
+
|
131 |
+
Max Pooling helps the network **focus on important details** while making images **smaller**! 📏🔍
|
132 |
+
|
133 |
+
### 🏗 How It Works
|
134 |
+
|
135 |
+
1️⃣ We **divide** the image into small regions (e.g., **2×2** squares)
|
136 |
+
2️⃣ We **keep only the largest value** in each region
|
137 |
+
3️⃣ We **move the window** and repeat until we’ve covered the whole image
|
138 |
+
|
139 |
+
### 📊 Example: 2×2 Max Pooling
|
140 |
+
|
141 |
+
| Before Pooling | After Pooling |
|
142 |
+
|--------------|--------------|
|
143 |
+
| 1, **6**, 2, 3 | **6**, **8** |
|
144 |
+
| 5, **8**, 7, 4 | **9**, **7** |
|
145 |
+
| **9**, 2, 3, **7** | |
|
146 |
+
|
147 |
+
**Only the biggest number** in each section is kept! ✅
|
148 |
+
|
149 |
+
---

## 🏆 Why Use Max Pooling?

✅ **Reduces image size** ➝ Makes training faster! 🚀
✅ **Ignores small changes** in images ➝ More stable results! 🔄
✅ **Helps find important features** in the picture! 🖼️

In PyTorch, we apply **Max Pooling** like this:

```python
import torch.nn.functional as F

output = F.max_pool2d(activation_map, kernel_size=2, stride=2)
```
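Here is a self-contained sketch of 2×2 max pooling on a small 4×4 map (the input values are arbitrary):

```python
import torch
import torch.nn.functional as F

# A 4×4 activation map, shaped (batch=1, channel=1, height=4, width=4)
activation_map = torch.tensor([[[[1., 6., 2., 3.],
                                 [5., 8., 7., 4.],
                                 [9., 2., 3., 7.],
                                 [4., 1., 6., 5.]]]])

# Keeps only the maximum of each non-overlapping 2×2 region
output = F.max_pool2d(activation_map, kernel_size=2, stride=2)
print(output)  # tensor([[[[8., 7.], [9., 7.]]]])
```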
---

🎉 **Great job!** Now, let’s try using activation functions and max pooling in our own models! 🏗️🤖✨

------------------------------------------------------------------------------------------------------
🎵 **Music Playing**

👋 **Welcome!** Today, we’re learning about **Convolution with Multiple Channels**! 🖼️🤖

## 🤔 What’s a Channel?
A **channel** is like a layer of an image! 🌈
- **Black & White Images** ➝ **1 channel** (grayscale) 🏳️
- **Color Images** ➝ **3 channels** (Red, Green, Blue - RGB) 🎨

Neural networks **see** images by looking at these channels separately! 👀

---

## 🎯 1. Multiple Output Channels

Usually, we use **one kernel** to create **one activation map** 📊
But what if we want to detect **different things** in an image? 🤔
- **Solution:** Use **multiple kernels**! Each kernel **finds different features**! 🔍

### 🔥 Example: Detecting Lines
1️⃣ A **vertical line kernel** finds **vertical edges** 📏
2️⃣ A **horizontal line kernel** finds **horizontal edges** 📐

**More kernels = More ways to see the image!** 👀✅

---

## 🎨 2. Multiple Input Channels

Color images have **3 channels** (Red, Green, Blue).
To process them, we use **a separate kernel for each channel**! 🎨

1️⃣ Apply a **Red kernel** to the Red part of the image 🔴
2️⃣ Apply a **Green kernel** to the Green part of the image 🟢
3️⃣ Apply a **Blue kernel** to the Blue part of the image 🔵
4️⃣ **Add the results together** to get one activation map!

This helps the neural network understand **colors and patterns**! 🌈
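The "add the results together" step can be verified directly. A minimal sketch with random values (the shapes here are arbitrary):

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
x = torch.randn(1, 3, 5, 5)        # one 3-channel "image"
kernels = torch.randn(1, 3, 3, 3)  # one output channel: a 3×3 kernel per input channel

# Convolve all 3 channels at once...
combined = F.conv2d(x, kernels)

# ...equals convolving each channel separately and adding the results
separate = sum(F.conv2d(x[:, c:c+1], kernels[:, c:c+1]) for c in range(3))
print(torch.allclose(combined, separate))  # True
```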
---

## 🔄 3. Multiple Input & Output Channels

Now, let’s **combine everything**! 🚀
- **Multiple input channels** (like RGB images)
- **Multiple output channels** (different filters detecting different things)

Each output channel gets its own **set of kernels** for each input channel.
We **apply the kernels, add the results**, and get multiple **activation maps**! 🎯

---

## 🏗 Example in PyTorch

```python
import torch.nn as nn

conv = nn.Conv2d(in_channels=3, out_channels=5, kernel_size=3)
```
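A quick shape check of such a layer (the image size and batch size here are made up):

```python
import torch
import torch.nn as nn

conv = nn.Conv2d(in_channels=3, out_channels=5, kernel_size=3)

x = torch.randn(1, 3, 8, 8)  # one RGB-like image, 8×8 pixels
out = conv(x)
print(out.shape)  # torch.Size([1, 5, 6, 6]): five activation maps, each shrunk by kernel_size - 1
```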
This means:
✅ **3 input channels** (Red, Green, Blue)
✅ **5 output channels** (5 different filters detecting different things)

---

## 🏆 Why is This Important?

✅ Helps the neural network find **different patterns** 🎨
✅ Works for **color images** and **complex features** 🤖
✅ Makes the network **more powerful**! 💪

---

🎉 **Great job!** Now, let’s try convolution with multiple channels in our own models! 🏗️🤖✨

-----------------------------------------------------------------------------------------------
🎵 **Music Playing**

👋 **Welcome!** Today, we’re building a **CNN for MNIST**! 🏗️🔢
MNIST is a dataset of **handwritten numbers (0-9)**. ✍️🖼️

---

## 🏗 CNN Structure

📏 **Image Size:** 16×16 (to make training faster)
🔄 **Layers:**
- **First Convolution Layer** ➝ 16 output channels
- **Second Convolution Layer** ➝ 32 output channels
- **Final Layer** ➝ 10 output neurons (one for each digit)
---

## 🛠 Building the CNN in PyTorch

### 📌 Step 1: Define the CNN

```python
import torch.nn as nn

class CNN(nn.Module):
    def __init__(self):
        super(CNN, self).__init__()
        self.conv1 = nn.Conv2d(in_channels=1, out_channels=16, kernel_size=5, padding=2)
        self.pool = nn.MaxPool2d(kernel_size=2)
        self.conv2 = nn.Conv2d(in_channels=16, out_channels=32, kernel_size=5, padding=2)
        self.fc = nn.Linear(32 * 4 * 4, 10)  # Fully connected layer (512 inputs, 10 outputs)

    def forward(self, x):
        x = self.pool(nn.ReLU()(self.conv1(x)))  # First layer: Conv + ReLU + Pool
        x = self.pool(nn.ReLU()(self.conv2(x)))  # Second layer: Conv + ReLU + Pool
        x = x.view(-1, 512)  # Flatten the 4×4×32 output to 1D (512 elements)
        x = self.fc(x)       # Fully connected layer for classification
        return x
```
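As a sanity check, the same architecture can be written compactly with `nn.Sequential` and run on a dummy batch (the batch size is arbitrary):

```python
import torch
import torch.nn as nn

# Same layers as above: two Conv + ReLU + Pool blocks, then flatten and classify
model = nn.Sequential(
    nn.Conv2d(1, 16, kernel_size=5, padding=2), nn.ReLU(), nn.MaxPool2d(2),
    nn.Conv2d(16, 32, kernel_size=5, padding=2), nn.ReLU(), nn.MaxPool2d(2),
    nn.Flatten(),
    nn.Linear(32 * 4 * 4, 10),
)

x = torch.randn(4, 1, 16, 16)  # a dummy batch of four 16×16 grayscale images
out = model(x)
print(out.shape)  # torch.Size([4, 10]): one score per digit class
```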
---

## 🔍 Understanding the Output Shape

After the second **Max Pooling** step, the image shrinks to **4×4 pixels** (16 ➝ 8 ➝ 4).
Since we have **32 channels**, the total output is:
```
4 × 4 × 32 = 512 elements
```
Each neuron in the final layer gets **512 inputs**, and since we have **10 digits (0-9)**, we use **10 neurons**.
---

## 🔄 Forward Step

1️⃣ **Apply First Convolution Layer** ➝ Activation ➝ Max Pooling
2️⃣ **Apply Second Convolution Layer** ➝ Activation ➝ Max Pooling
3️⃣ **Flatten the Output (4×4×32 → 512)**
4️⃣ **Apply the Final Output Layer (10 Neurons for 10 Digits)**

---

## 🏋️♂️ Training the Model

Check the **lab** to see how we train the CNN using:
✅ **Backpropagation**
✅ **Stochastic Gradient Descent (SGD)**
✅ **Loss Function & Accuracy Check**

---

🎉 **Great job!** Now, let’s train our CNN to recognize handwritten digits! 🏗️🔢🤖

------------------------------------------------------------------------------------
🎵 **Music Playing**

👋 **Welcome!** Today, we’re learning about **Convolutional Neural Networks (CNNs)!** 🤖🖼️

## 🤔 What is a CNN?
A **Convolutional Neural Network (CNN)** is a special type of neural network that **understands images!** 🎨
It learns to find patterns, like:
✅ **Edges** (lines & shapes)
✅ **Textures** (smooth or rough areas)
✅ **Objects** (faces, animals, letters)

---

## 🏗 How Does a CNN Work?

A CNN is made of **three main steps**:

1️⃣ **Convolution Layer** 🖼️➝🔍
   - Uses **kernels** (small filters) to **detect patterns** in an image
   - Creates an **activation map** that highlights important features

2️⃣ **Pooling Layer** 🔄➝📏
   - **Shrinks** the activation map to keep only the most important parts
   - **Max Pooling** picks the **biggest** values in each small region

3️⃣ **Fully Connected Layer** 🏗️➝🎯
   - The final layer makes a **decision** (like cat 🐱 or dog 🐶)

---

## 🎨 Example: Detecting Lines

We train a CNN to recognize **horizontal** and **vertical** lines:

1️⃣ **Input Image (X)**
2️⃣ **First Convolution Layer**
   - Uses **two kernels** to create two **activation maps**
   - Applies **ReLU** (activation function) to remove negative values
   - Uses **Max Pooling** to make learning easier

3️⃣ **Second Convolution Layer**
   - Takes **two input channels** from the first layer
   - Uses **two new kernels** to create **one activation map**
   - Again, applies **ReLU + Max Pooling**

4️⃣ **Flattening** ➝ Turns the 2D image into **1D data**
5️⃣ **Final Prediction** ➝ Uses a **fully connected layer** to decide:
   - `0` = **Vertical Line**
   - `1` = **Horizontal Line**

---
## 🔄 How to Build a CNN in PyTorch

### 🏗 CNN Constructor
```python
import torch.nn as nn

class CNN(nn.Module):
    def __init__(self):
        super(CNN, self).__init__()
        self.conv1 = nn.Conv2d(in_channels=1, out_channels=2, kernel_size=3, padding=1)
        self.pool = nn.MaxPool2d(kernel_size=2, stride=2)
        self.conv2 = nn.Conv2d(in_channels=2, out_channels=1, kernel_size=3, padding=1)
        self.fc = nn.Linear(49, 2)  # Fully connected layer (49 inputs, 2 outputs)

    def forward(self, x):
        x = self.pool(nn.ReLU()(self.conv1(x)))  # First layer: Conv + ReLU + Pool
        x = self.pool(nn.ReLU()(self.conv2(x)))  # Second layer: Conv + ReLU + Pool
        x = x.view(-1, 49)  # Flatten to 1D (one 7×7 map after pooling a 28×28 input twice)
        x = self.fc(x)      # Fully connected layer
        return x
```
---

## 🏋️♂️ Training the CNN

We train the CNN using **backpropagation** and **gradient descent**:

1️⃣ **Load the dataset** (images of lines) 📊
2️⃣ **Create a CNN model** 🏗️
3️⃣ **Define a loss function** (to measure mistakes) ❌
4️⃣ **Choose an optimizer** (to improve learning) 🔄
5️⃣ **Train the model** until it **gets better**! 🚀

As training progresses:
📉 **Loss goes down** ➝ Model makes fewer mistakes!
📈 **Accuracy goes up** ➝ Model gets better at predictions!
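These steps map onto the standard PyTorch training loop. A minimal runnable sketch on synthetic data (the data, the tiny stand-in model, and the hyperparameters are all placeholders):

```python
import torch
import torch.nn as nn
import torch.optim as optim

torch.manual_seed(0)

# Placeholder data: 64 tiny 8×8 "images" with labels 0 (vertical) or 1 (horizontal)
x = torch.randn(64, 1, 8, 8)
y = torch.randint(0, 2, (64,))

model = nn.Sequential(nn.Flatten(), nn.Linear(64, 2))  # tiny stand-in for the CNN
criterion = nn.CrossEntropyLoss()                      # loss function: measures mistakes
optimizer = optim.SGD(model.parameters(), lr=0.1)      # optimizer: improves the weights

losses = []
for epoch in range(50):
    optimizer.zero_grad()          # clear old gradients
    loss = criterion(model(x), y)  # forward pass + loss
    loss.backward()                # backpropagation
    optimizer.step()               # update weights
    losses.append(loss.item())

print(losses[0] > losses[-1])  # True: loss goes down as training progresses
```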
---

## 🏆 Why Use CNNs?

✅ **Finds patterns** in images 🔍
✅ **Works with real-world data** (faces, animals, objects) 🖼️
✅ **More efficient** than regular neural networks 💡

---

🎉 **Great job!** Now, let’s build and train our own CNN! 🏗️🤖✨

----------------------------------------------------------------------
🎵 **Music Playing**

👋 **Welcome!** Today, we’re learning how to use **Pretrained TorchVision Models**! 🤖🖼️

## 🤔 What is a Pretrained Model?

A **pretrained model** is a neural network that has already been **trained by experts** on a large dataset.
✅ **Saves time** (no need to train from scratch) ⏳
✅ **Works better** (already optimized) 🎯
✅ **We only train the final layer** for our own images! 🔄

---

## 🔄 Using ResNet18 (A Pretrained Model)

We will use **ResNet18**, a powerful model trained on **color images**. 🎨
It has **skip connections** (we won’t go into the details, but they help learning).

We only **replace the last layer** to match our dataset! 🔁

---

## 🛠 Steps to Use a Pretrained Model

### 📌 Step 1: Load the Pretrained Model
```python
import torchvision.models as models

model = models.resnet18(pretrained=True)  # Load pretrained ResNet18
```
### 📌 Step 2: Normalize Images (Required for ResNet18)
```python
import torchvision.transforms as transforms

transform = transforms.Compose([
    transforms.Resize((224, 224)),  # Resize image
    transforms.ToTensor(),          # Convert to tensor
    transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])  # Normalize
])
```

### 📌 Step 3: Prepare the Dataset
Create a **dataset object** for your own images with **training and testing data**. 📊

### 📌 Step 4: Replace the Output Layer
- The **last hidden layer** has **512 neurons**
- We create a **new output layer** for **our dataset**

Example: **If we have 7 classes**, we create a layer with **7 outputs**:
```python
import torch.nn as nn

for param in model.parameters():
    param.requires_grad = False  # Freeze pretrained layers

model.fc = nn.Linear(512, 7)  # Replace output layer (512 inputs → 7 outputs)
```
---

## 🏋️♂️ Training the Model

### 📌 Step 5: Create Data Loaders
```python
import torch

train_loader = torch.utils.data.DataLoader(train_dataset, batch_size=15, shuffle=True)
test_loader = torch.utils.data.DataLoader(test_dataset, batch_size=10, shuffle=False)
```

### 📌 Step 6: Set Up Training
```python
import torch.nn as nn
import torch.optim as optim

criterion = nn.CrossEntropyLoss()  # Loss function
optimizer = optim.Adam(model.fc.parameters(), lr=0.001)  # Optimizer (only for the last layer)
```
### 📌 Step 7: Train the Model
1️⃣ **Set model to training mode** 🏋️
```python
model.train()
```
2️⃣ Train for **20 epochs**
3️⃣ **Set model to evaluation mode** when predicting 📊
```python
model.eval()
```

---

## 🏆 Why Use Pretrained Models?

✅ **Saves time** (no need to train from scratch)
✅ **Works better** (pretrained on millions of images)
✅ **We only change one layer** for our dataset!

---

🎉 **Great job!** Now, try using a pretrained model for your own images! 🏗️🤖✨

---------------------------------------------------------------------------------
🎵 **Music Playing**

👋 **Welcome!** Today, we’re learning how to use **GPUs in PyTorch**! 🚀💻

## 🤔 Why Use a GPU?
A **Graphics Processing Unit (GPU)** can **train models MUCH faster** than a CPU!
✅ Faster computation ⏩
✅ Better for large datasets 📊
✅ Helps train deep learning models efficiently 🤖

---

## 🔥 What is CUDA?
CUDA is a **special tool** made by **NVIDIA** that allows us to use **GPUs for AI tasks**. 🎮🚀
In **PyTorch**, we use **torch.cuda** to work with GPUs.

---

## 🛠 Step 1: Check if a GPU is Available

```python
import torch

# Check if a GPU is available
torch.cuda.is_available()  # Returns True if a GPU is detected
```

---

## 🎯 Step 2: Set Up the GPU

```python
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
```

- `"cuda:0"` = First available GPU 🎮
- `"cpu"` = Use the CPU if no GPU is found

---

## 🏗 Step 3: Sending Tensors to the GPU

In PyTorch, **data is stored in Tensors**.
To move data to the GPU, use `.to(device)`.

```python
tensor = torch.randn(3, 3)  # Create a random tensor
tensor = tensor.to(device)  # Move it to the GPU
```
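This pattern is safe to run with or without a GPU:

```python
import torch

device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")

tensor = torch.randn(3, 3).to(device)  # lands on the GPU if present, stays on the CPU otherwise
print(tensor.device.type)  # "cuda" on a GPU machine, "cpu" otherwise
```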
✅ **Faster processing on the GPU!** ⚡

---

## 🔄 Step 4: Using a GPU with a CNN

You **don’t need to change** your CNN code! Just **move the model to the GPU** after creating it:

```python
model = CNN()     # Create CNN model
model.to(device)  # Move the model to the GPU
```

This **converts** all layers to **CUDA tensors** for GPU computation! 🎮

---

## 🏋️♂️ Step 5: Training a Model on a GPU

Training is the same, but **you must send your data to the GPU**!

```python
for images, labels in train_loader:
    images, labels = images.to(device), labels.to(device)  # Move data to GPU
    optimizer.zero_grad()              # Clear gradients
    outputs = model(images)            # Forward pass (on GPU)
    loss = criterion(outputs, labels)  # Compute loss
    loss.backward()                    # Backpropagation
    optimizer.step()                   # Update weights
```
✅ **The model trains much faster!** 🚀

---

## 🎯 Step 6: Testing the Model

For testing, **only move the images** (not the labels) to the GPU:

```python
for images, labels in test_loader:
    images = images.to(device)  # Move images to GPU
    outputs = model(images)     # Get predictions
```

✅ **Saves memory and speeds up testing!** ⚡

---

## 🏆 Summary

✅ **GPUs make training faster** 🎮
✅ Use **torch.cuda** to work with GPUs
✅ Move **data & models** to the GPU with `.to(device)`
✅ Training & testing are the same, but data **must be on the GPU**

---

🎉 **Great job!** Now, try training a model using a GPU in PyTorch! 🏗️🚀
convolutional_neural_networks.py
ADDED
@@ -0,0 +1,69 @@
```python
import torch
import torch.nn as nn
import torch.optim as optim
import numpy as np
import matplotlib.pyplot as plt

# Convolutional Neural Network Model
class ConvolutionalNeuralNetwork(nn.Module):
    def __init__(self):
        super(ConvolutionalNeuralNetwork, self).__init__()
        self.conv1 = nn.Conv2d(1, 16, kernel_size=5)
        self.conv2 = nn.Conv2d(16, 32, kernel_size=5)
        self.pool = nn.MaxPool2d(kernel_size=2, stride=2)
        self.fc1 = nn.Linear(32 * 4 * 4, 120)
        self.fc2 = nn.Linear(120, 84)
        self.fc3 = nn.Linear(84, 10)

    def forward(self, x):
        x = self.pool(torch.relu(self.conv1(x)))  # 28 -> 24 -> 12
        x = self.pool(torch.relu(self.conv2(x)))  # 12 -> 8 -> 4
        x = x.view(-1, 32 * 4 * 4)                # Flatten to 512 elements
        x = torch.relu(self.fc1(x))
        x = torch.relu(self.fc2(x))
        return self.fc3(x)  # Return raw logits without applying softmax


# Training Function
def train_model(model, criterion, optimizer, x_train, y_train, epochs=100):
    for epoch in range(epochs):
        model.train()
        optimizer.zero_grad()
        y_pred = model(x_train)
        loss = criterion(y_pred, y_train)
        loss.backward()
        optimizer.step()
        if (epoch + 1) % 10 == 0:
            print(f'Epoch [{epoch+1}/{epochs}], Loss: {loss.item():.4f}')


# Example Usage
if __name__ == "__main__":
    # Sample data shaped like MNIST (random values, for demonstration)
    x_train = torch.randn(100, 1, 28, 28)
    y_train = torch.randint(0, 10, (100,))

    # Plotting the input data
    for i in range(6):
        plt.subplot(2, 3, i + 1)
        plt.imshow(x_train[i].squeeze(), cmap='gray')
        plt.title(f'Label: {y_train[i].item()}')
        plt.axis('off')
    plt.show()

    model = ConvolutionalNeuralNetwork()

    # Plotting the (untrained) model's predictions
    y_pred = model(x_train).detach().numpy()
    plt.figure(figsize=(12, 6))
    for i in range(6):
        plt.subplot(2, 3, i + 1)
        plt.imshow(x_train[i].squeeze(), cmap='gray')
        plt.title(f'Predicted: {np.argmax(y_pred[i])}')
        plt.axis('off')
    plt.show()

    criterion = nn.CrossEntropyLoss()
    optimizer = optim.SGD(model.parameters(), lr=0.01)

    train_model(model, criterion, optimizer, x_train, y_train)
```
dataset_loader.py
ADDED
@@ -0,0 +1,69 @@
```python
import os
import gzip

import numpy as np
from PIL import Image

class CustomMNISTDataset:
    def __init__(self, dataset_path, transform=None):
        self.dataset_path = dataset_path
        self.transform = transform
        self.images, self.labels = self.load_dataset()

    def load_dataset(self):
        image_paths = []
        label_paths = []

        # The dataset path is expected to contain the raw MNIST .gz files
        for file in os.listdir(self.dataset_path):
            if 'train-images-idx3-ubyte.gz' in file:
                image_paths.append(os.path.join(self.dataset_path, file))
            elif 'train-labels-idx1-ubyte.gz' in file:
                label_paths.append(os.path.join(self.dataset_path, file))

        if not image_paths or not label_paths:
            raise ValueError(f"❌ Missing image or label files in {self.dataset_path}")

        images = []
        labels = []

        # Usually one image file and one label file
        for img_path, label_path in zip(image_paths, label_paths):
            images_data, labels_data = self.load_mnist_data(img_path, label_path)
            images.extend(images_data)
            labels.extend(labels_data)

        return images, labels

    def load_mnist_data(self, img_path, label_path):
        """Load MNIST data from .gz files."""
        with gzip.open(img_path, 'rb') as f:
            f.read(16)  # Skip the magic number and metadata
            img_data = np.frombuffer(f.read(), dtype=np.uint8)
            img_data = img_data.reshape(-1, 28, 28)  # Reshape to 28x28 images

        with gzip.open(label_path, 'rb') as f:
            f.read(8)  # Skip the magic number and metadata
            label_data = np.frombuffer(f.read(), dtype=np.uint8)

        images = [Image.fromarray(img) for img in img_data]  # Convert each image to a PIL Image

        # Apply any transformation (e.g., torchvision transforms) here
        if self.transform:
            images = [self.transform(img) for img in images]

        return images, label_data

    def __len__(self):
        """Return the total number of images in the dataset."""
        return len(self.images)

    def __getitem__(self, idx):
        """Return a single image and its label at the given index."""
        image = self.images[idx]
        label = self.labels[idx]
        return image, label
```
deep_networks.md
ADDED
@@ -0,0 +1,356 @@
🎵 **Music Playing**

👋 **Welcome!** Today, we’re learning about **Deep Neural Networks**—a cool way computers learn! 🧠💡

## 🤖 What is a Neural Network?
Imagine a brain made of tiny switches called **neurons**. These neurons work together to make smart decisions!

### 🟢 Input Layer
This is where we give the network information, like pictures or numbers.

### 🔵 Hidden Layers
These layers are like **magic helpers** that figure out patterns!
- More neurons = better learning 🤓
- Too many neurons = can be **confusing** (overfitting) 😵

### 🔴 Output Layer
This is where the network **gives us answers!** 🏆

---

## 🏗 Building a Deep Neural Network in PyTorch

We can **build a deep neural network** using PyTorch, a tool that helps computers learn. 🖥️

### 🛠 Layers of Our Network
1️⃣ **First Hidden Layer:** Has `H1` neurons.
2️⃣ **Second Hidden Layer:** Has `H2` neurons.
3️⃣ **Output Layer:** Decides the final answer! 🎯

---

## 🔄 How Does It Work?
1️⃣ **Start with an input (x).**
2️⃣ **Pass through each layer:**
   - Apply **math functions** (like `sigmoid`, `tanh`, or `ReLU`).
   - These help the network understand better! 🧩
3️⃣ **Get the final answer!** ✅

---

## 🎨 Different Activation Functions
Activation functions help the network **think better!** 🧠
- **Sigmoid** ➝ Good for small problems 🤏
- **Tanh** ➝ Works better for deeper networks 🌊
- **ReLU** ➝ Super strong for big tasks! 🚀
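A quick look at all three activations on the same inputs:

```python
import torch

x = torch.tensor([-2.0, 0.0, 2.0])
print(torch.sigmoid(x))  # squashes values into (0, 1)
print(torch.tanh(x))     # squashes values into (-1, 1), centered at zero
print(torch.relu(x))     # zeroes out negatives, keeps positives
```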
---

## 🔢 Example: Recognizing Handwritten Numbers
We train the network with **MNIST**, a dataset of handwritten numbers. 📝🔢
- **Input:** 784 pixels (28x28 images) 📸
- **Hidden Layers:** 50 neurons each 🤖
- **Output:** 10 neurons (digits 0-9) 🔟

---

## 🚀 Training the Network
We use **Stochastic Gradient Descent (SGD)** to teach the network! 📚
- **Loss Function:** Helps the network learn from mistakes. ❌➡✅
- **Validation Accuracy:** Checks how well the network is doing! 🎯

---

## 🏆 What We Learned
✅ Deep Neural Networks have **many hidden layers**.
✅ Different **activation functions** help improve performance.
✅ The more layers we add, the **smarter** the network can become! 💡

---

🎉 **Great job!** Now, let's build and train our own deep neural networks! 🏗️🤖✨

-----------------------------------------------------------------------------------

🎵 **Music Playing**

👋 **Welcome!** Today, we’ll learn how to **build a deep neural network** in PyTorch using `nn.ModuleList`. 🧠💡

## 🤖 Why Use `nn.ModuleList`?
Instead of adding layers **one by one** (which takes a long time ⏳), we can **automate** the process! 🚀

---

## 🏗 Building the Neural Network

We create a **list** called `layers` 📋:
- **First item:** Input size (e.g., `2` features).
- **Second item:** Neurons in the **first hidden layer** (e.g., `3`).
- **Third item:** Neurons in the **second hidden layer** (e.g., `4`).
- **Fourth item:** Output size (number of classes, e.g., `3`).

---

## 🔄 Constructing the Network

### 🔹 Step 1: Create Layers
- We loop through the list, taking **two elements at a time**:
  - **First element:** Input size 🎯
  - **Second element:** Output size (number of neurons) 🧩

### 🔹 Step 2: Connecting Layers
- First **hidden layer** ➝ Input size = `2`, Neurons = `3`
- Second **hidden layer** ➝ Input size = `3`, Neurons = `4`
- **Output layer** ➝ Input size = `4`, Output size = `3`

---

## ⚡ Forward Function

We **pass data** through the network:
1️⃣ **Apply a linear transformation** at each layer ➝ Does the calculations 🧮
2️⃣ **Apply an activation function** (`ReLU`) ➝ Helps the network learn 📈
3️⃣ **For the last layer**, apply only the **linear transformation** (since it's a classification task 🎯).
---
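The list-and-loop construction above can be sketched as follows — a minimal illustration using the example sizes from the text (the class and variable names here are ours, not the course lab's):

```python
import torch
import torch.nn as nn

class Net(nn.Module):
    def __init__(self, layers):
        super(Net, self).__init__()
        self.hidden = nn.ModuleList()
        # Take the size list two elements at a time: (2, 3), (3, 4), (4, 3)
        for in_size, out_size in zip(layers, layers[1:]):
            self.hidden.append(nn.Linear(in_size, out_size))

    def forward(self, x):
        last = len(self.hidden) - 1
        for i, layer in enumerate(self.hidden):
            if i < last:
                x = torch.relu(layer(x))  # hidden layers: linear + ReLU
            else:
                x = layer(x)              # output layer: linear only (logits)
        return x

model = Net([2, 3, 4, 3])       # input 2, hidden layers of 3 and 4, 3 classes
out = model(torch.randn(5, 2))  # a batch of 5 samples, 2 features each
```

Feeding `out` to `nn.CrossEntropyLoss` then trains this exactly like the earlier models.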

## 🎯 Training the Network

The **training process** is similar to before! We:
- Use a **dataset** 📊
- Try **different combinations** of neurons and layers 🤖
- See which setup gives the **best performance**! 🏆

---

🎉 **Awesome!** Now, let’s explore ways to make these networks even **better!** 🚀

-----------------------------------------------------------------------------------
🎵 **Music Playing**

👋 **Welcome!** Today, we’re learning about **weight initialization** in Neural Networks! 🧠⚡

## 🤔 Why Does Weight Initialization Matter?

If we **don’t** choose good starting weights, our neural network **won’t learn properly**! 🚨
If **all neurons** in a layer start with the **same weights**, they all compute the same thing — and that causes problems.

---

## 🚀 How PyTorch Handles Weights

PyTorch **automatically** picks starting weights, but we can also set them **ourselves**! 🔧
Let’s see what happens when we:
- Set **all weights to 1** and **bias to 0** ➝ ❌ **Bad idea!**
- Randomly choose weights from a **uniform distribution** ➝ ✅ **Better!**

---

## 🔄 The Problem with Random Weights

We use a **uniform distribution** (random values between -1 and 1). But:
- **Too small?** ➝ Weights barely change the signal 🤏
- **Too large?** ➝ **Vanishing gradient** problem 😵

### 📉 What’s a Vanishing Gradient?

If weights are **too big**, activations saturate, and the **gradient shrinks toward zero**.
That means the network **stops learning**! 🚫

---

## 🛠 Fixing the Problem

### 🎯 Solution: Scale Weights Based on Neurons

We scale the weight range based on **how many input neurons** a layer has:
- **2 neurons?** ➝ Scale by **1/2**
- **4 neurons?** ➝ Scale by **1/4**
- **100 neurons?** ➝ Scale by **1/100**

This prevents the vanishing gradient issue! ✅

---
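As a rough sketch of that scaling rule (our own illustration, not PyTorch's exact default): draw the weights uniformly, shrinking the range as the number of input neurons grows.

```python
import torch

def scaled_uniform(n_in, n_out):
    # Range [-1/n_in, 1/n_in]: more input neurons -> smaller starting weights
    bound = 1.0 / n_in
    return torch.empty(n_out, n_in).uniform_(-bound, bound)

w_narrow = scaled_uniform(2, 3)   # 2 inputs   -> weights in [-0.5, 0.5]
w_wide = scaled_uniform(100, 3)   # 100 inputs -> weights in [-0.01, 0.01]
```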

## 🔬 Different Weight Initialization Methods

### 🏗 **1. Default PyTorch Method**
- PyTorch **automatically** picks a range:
  - **Lower bound:** `-1 / sqrt(L_in)`
  - **Upper bound:** `+1 / sqrt(L_in)`

### 🔵 **2. Xavier Initialization**
- Best for **tanh** activation
- Uses the **number of input and output neurons**
- We apply `xavier_uniform_()` to set the weights

### 🔴 **3. He Initialization**
- Best for **ReLU** activation
- Scales the range by the **number of input neurons**
- We apply `kaiming_uniform_()` (PyTorch’s name for He initialization) to set the weights

---
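Both initializers live in `torch.nn.init` (note that PyTorch names the He method `kaiming_uniform_`). A minimal sketch of applying each to a linear layer:

```python
import torch.nn as nn

tanh_layer = nn.Linear(4, 3)
relu_layer = nn.Linear(4, 3)

# Xavier: bound depends on fan-in AND fan-out -> suited to tanh
nn.init.xavier_uniform_(tanh_layer.weight)

# He: bound depends on fan-in only, with a ReLU gain -> suited to ReLU
nn.init.kaiming_uniform_(relu_layer.weight, nonlinearity='relu')
```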

## 🏆 Which One is Best?

We compare:
✅ **PyTorch Default**
✅ **Xavier Method** (tanh)
✅ **He Method** (ReLU)

The **Xavier and He methods** help the network **learn faster**! 🚀

---

🎉 **Great job!** Now, let’s try different weight initializations and see what works best! 🏗️🔬

------------------------------------------------------------------------------------------------
🎵 **Music Playing**

👋 **Welcome!** Today, we’re learning about **Gradient Descent with Momentum**! 🚀🔄

## 🤔 What’s the Problem?

Sometimes, when training a neural network, the model can get **stuck**:
- **Saddle Points** ➝ Flat areas where learning stops 🏔️
- **Local Minima** ➝ Not the best solution, but we get trapped 😞

---

## 🏃‍♂️ What is Momentum?

Momentum helps the model **keep moving** even when it gets stuck! 💨
It’s like rolling a ball downhill:
- **Gradient (Force)** ➝ Tells us where to go 🏀
- **Momentum (Mass)** ➝ Helps us keep moving even on flat surfaces ⚡

---

## 🔄 How Does It Work?

### 🔹 Step 1: Compute Velocity
- New velocity (`v_k+1`) = Momentum term (`ρ`) × Old velocity (`v_k`) + Gradient
- The **momentum term** (`ρ`) controls how much of the past velocity we keep.

### 🔹 Step 2: Update Weights
- New weight (`w_k+1`) = Old weight (`w_k`) - Learning rate × New velocity (`v_k+1`)

The bigger the **momentum**, the harder it is to stop moving! 🏃‍♂️💨

---
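The two steps can be traced by hand with plain tensors (our own sketch, following the same convention PyTorch's SGD uses). With a constant gradient, the effective step grows each iteration instead of staying fixed:

```python
import torch

rho, lr = 0.9, 0.1        # momentum term and learning rate
w = torch.tensor([1.0])   # weight
v = torch.zeros(1)        # velocity starts at rest
step_sizes = []

for _ in range(3):
    g = torch.tensor([2.0])   # pretend gradient at w (constant, for illustration)
    v = rho * v + g           # Step 1: accumulate velocity
    w = w - lr * v            # Step 2: move the weight along the velocity
    step_sizes.append((lr * v).item())
```

Each step is larger than the last (0.2, then 0.38, then 0.542), which is exactly what carries the model across flat regions.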

## ⚠️ Why Does It Help?

### 🏔️ **Saddle Points**
- **Without Momentum** ➝ Model **stops** moving in flat areas ❌
- **With Momentum** ➝ Keeps moving **past** the flat spots ✅

### ⬇ **Local Minima**
- **Without Momentum** ➝ Gets **stuck** in a bad spot 😖
- **With Momentum** ➝ Pushes through and **finds a better solution!** 🎯

---

## 🏆 Picking the Right Momentum

- **Too Small?** ➝ Model gets **stuck** 😕
- **Too Large?** ➝ Model **overshoots** the best answer 🚀
- **Best Choice?** ➝ We test different values and pick what works! 🔬

---

## 🛠 Using Momentum in PyTorch

Just add the **momentum** value to the optimizer!

```python
optimizer = torch.optim.SGD(model.parameters(), lr=0.01, momentum=0.5)
```

In the lab, we test **different momentum values** on a dataset and see how they affect learning! 📊

---

🎉 **Great job!** Now, let’s experiment with momentum and see how it helps our model! 🏗️⚡

------------------------------------------------------------------------------------------
🎵 **Music Playing**

👋 **Welcome!** Today, we’re learning about **Batch Normalization**! 🚀🔄

## 🤔 What’s the Problem?

When training a neural network, the activations (outputs) can vary a lot, making learning **slower** and **unstable**. 😖
Batch Normalization **fixes this** by:
✅ Making activations more consistent
✅ Helping the network learn faster
✅ Reducing problems like vanishing gradients

---

## 🔄 How Does Batch Normalization Work?

### 🏗 Step 1: Normalize Each Mini-Batch
For each neuron in a layer:
1️⃣ Compute the **mean** (`μ`) and **variance** (`σ²`) of its activations. 📊
2️⃣ Normalize the outputs using:
\[
z' = \frac{z - \mu}{\sqrt{\sigma^2 + \epsilon}}
\]
(We add a **small** value `ε` inside the square root to avoid division by zero.)

### 🏗 Step 2: Scale and Shift
- Instead of leaving activations at mean 0 and variance 1, we **scale** and **shift** them:
\[
z'' = \gamma \cdot z' + \beta
\]
- **γ (scale) and β (shift)** are **learned** during training! 🏋️‍♂️

---
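These two formulas can be checked directly against `nn.BatchNorm1d`: at initialization γ = 1 and β = 0, so in training mode the layer's output should match a hand computation that uses the batch mean and (biased) batch variance.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
x = torch.randn(8, 3)      # mini-batch: 8 samples, 3 neurons

bn = nn.BatchNorm1d(3)     # gamma starts at 1, beta at 0
bn.train()                 # training mode: use batch statistics
out = bn(x)

# Hand computation per neuron (column): z' = (z - mean) / sqrt(var + eps)
mean = x.mean(dim=0)
var = x.var(dim=0, unbiased=False)
z_prime = (x - mean) / torch.sqrt(var + bn.eps)
```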

## 🔬 Example: Normalizing Activations

- **First Mini-Batch (X1)** ➝ Compute mean & std for each neuron, normalize, then scale & shift
- **Second Mini-Batch (X2)** ➝ Repeat for new batch! ♻
- **Next Layer** ➝ Apply batch normalization again! 🔄

### 🏆 Prediction Time
- During **training**, we compute the mean & std for **each batch**.
- During **testing**, we use the **population mean & std** instead. 📊

---

## 🛠 Using Batch Normalization in PyTorch

```python
import torch.nn as nn

class NeuralNetwork(nn.Module):
    def __init__(self):
        super(NeuralNetwork, self).__init__()
        self.fc1 = nn.Linear(10, 3)   # First layer (10 inputs, 3 neurons)
        self.bn1 = nn.BatchNorm1d(3)  # Batch Norm for first layer
        self.fc2 = nn.Linear(3, 4)    # Second layer (3 inputs, 4 neurons)
        self.bn2 = nn.BatchNorm1d(4)  # Batch Norm for second layer

    def forward(self, x):
        x = self.bn1(self.fc1(x))  # Apply Batch Norm
        x = self.bn2(self.fc2(x))  # Apply Batch Norm again
        return x
```

- **Training?** Set the model to **train mode** 🏋️‍♂️
```python
model.train()
```
- **Predicting?** Use **evaluation mode** 📈
```python
model.eval()
```

---

## 🚀 Why Does Batch Normalization Work?

### ✅ Helps Gradient Descent Work Better
- Normalized data = **smoother** loss function 🎯
- Gradients point in the **right** direction = Faster learning! 🚀

### ✅ Reduces Vanishing Gradient Problem
- Sigmoid & Tanh activations suffer from small gradients 😢
- Normalization **keeps activations in a good range** 📊

### ✅ Allows Higher Learning Rates
- Networks can **train faster** without getting unstable ⏩

### ✅ Reduces Need for Dropout
- Some studies show **Batch Norm can replace Dropout** 🤯

---

🎉 **Great job!** Now, let’s try batch normalization in our own models! 🏗️📈
deep_networks.py
ADDED
@@ -0,0 +1,80 @@
import torch
import torch.nn as nn
import torch.optim as optim
import matplotlib.pyplot as plt
import numpy as np

# 🔥 Deep Neural Network Model
class DeepNeuralNetwork(nn.Module):
    def __init__(self, input_size, hidden_sizes, output_size):
        super(DeepNeuralNetwork, self).__init__()
        layers = []
        in_size = input_size

        for hidden_size in hidden_sizes:
            layers.append(nn.Linear(in_size, hidden_size))
            layers.append(nn.ReLU())
            in_size = hidden_size

        layers.append(nn.Linear(in_size, output_size))
        self.model = nn.Sequential(*layers)

    def forward(self, x):
        return self.model(x)  # ✅ Apply the model layers properly


# 🔥 Training Function
def train_model(model, criterion, optimizer, x_train, y_train, epochs=100):
    model.train()
    for epoch in range(epochs):
        optimizer.zero_grad()

        # Forward pass
        y_pred = model(x_train)

        # Loss calculation
        loss = criterion(y_pred, y_train)

        # Backward pass
        loss.backward()
        optimizer.step()

        if (epoch + 1) % 10 == 0:
            print(f'Epoch [{epoch + 1}/{epochs}], Loss: {loss.item():.4f}')


# ✅ Example Usage
if __name__ == "__main__":
    # 🔥 Sample Data
    x_train = torch.randn(100, 10, requires_grad=True)  # ✅ Require gradient tracking
    y_train = torch.randint(0, 2, (100,), dtype=torch.long)  # ✅ Ensure LongTensor for CrossEntropyLoss

    # Plotting the input data
    plt.scatter(x_train[:, 0].detach().numpy(), x_train[:, 1].detach().numpy(), c=y_train.numpy(), cmap='viridis')
    plt.title('Deep Neural Network Input Data')
    plt.xlabel('Input Feature 1')
    plt.ylabel('Input Feature 2')
    plt.colorbar(label='Output Class')
    plt.show()

    # Initialize Model
    model = DeepNeuralNetwork(input_size=10, hidden_sizes=[20, 10], output_size=2)

    # Criterion and Optimizer
    criterion = nn.CrossEntropyLoss()
    optimizer = optim.SGD(model.parameters(), lr=0.01)

    # Train the model
    train_model(model, criterion, optimizer, x_train, y_train, epochs=100)

    # ✅ Plotting the predictions with softmax
    model.eval()
    with torch.no_grad():
        y_pred = torch.softmax(model(x_train), dim=1).detach().numpy()

    plt.scatter(x_train[:, 0].detach().numpy(), x_train[:, 1].detach().numpy(), c=np.argmax(y_pred, axis=1), cmap='viridis')
    plt.title('Deep Neural Network Predictions')
    plt.xlabel('Input Feature 1')
    plt.ylabel('Input Feature 2')
    plt.colorbar(label='Predicted Class')
    plt.show()
final_project.md
ADDED
@@ -0,0 +1,29 @@
# 🧠 **Simple Summary of the Program**

1. **Loads and Prepares Data:**
   - Uses the **MNIST dataset**, which contains images of handwritten digits (0-9).
   - Resizes the images and converts them to tensors.
   - Creates a **data loader** to batch the images and shuffle them for training.

2. **Defines a CNN Model:**
   - The **FinalCNN** model processes the images through layers:
     - **Conv1:** Finds simple features like edges.
     - **Pool1:** Reduces the size to focus on important features.
     - **Conv2:** Finds more complex patterns.
     - **Pool2:** Reduces the size again.
     - **Flattening:** Converts the features into a single line of numbers.
     - **Fully Connected Layers:** Make predictions about what digit is in the image.

3. **Trains the Model:**
   - Uses **Cross-Entropy Loss** to measure how far the predictions are from the real digit labels.
   - Uses **Stochastic Gradient Descent (SGD)** to adjust the model parameters and make better predictions.
   - Runs the training for **256 epochs**, slowly improving the accuracy.

4. **Displays Predictions:**
   - Shows **6 sample images** with the model's predictions and the actual labels.
   - Prints the accuracy and loss for each epoch.

5. **GPU Acceleration:**
   - Uses **CUDA** if available, making the training faster by running on the GPU.

✅ This program is like a smart detective that learns to recognize handwritten numbers by studying lots of examples and gradually improving its guesses.
final_project.py
ADDED
@@ -0,0 +1,260 @@
import torch
import torch.nn as nn
import torch.optim as optim
from torchvision import datasets, transforms
from torch.utils.data import DataLoader
import matplotlib.pyplot as plt
from tqdm import tqdm
from dataset_loader import CustomMNISTDataset
import os
import matplotlib.font_manager as fm

# CNN Model with output layer for 62 categories
class FinalCNN(nn.Module):
    def __init__(self):
        super(FinalCNN, self).__init__()
        self.conv1 = nn.Conv2d(1, 16, kernel_size=5, stride=1, padding=0)
        self.conv2 = nn.Conv2d(16, 32, kernel_size=5, stride=1, padding=0)
        self.pool = nn.MaxPool2d(kernel_size=2, stride=2)
        self.fc1 = nn.Linear(32 * 4 * 4, 120)
        self.fc2 = nn.Linear(120, 84)
        self.fc3 = nn.Linear(84, 62)  # Output layer with 62 units for (0-9, a-z, A-Z)

    def forward(self, x):
        x = torch.relu(self.conv1(x))
        x = self.pool(x)
        x = torch.relu(self.conv2(x))
        x = self.pool(x)
        x = x.view(-1, 32 * 4 * 4)
        x = torch.relu(self.fc1(x))
        x = torch.relu(self.fc2(x))
        x = self.fc3(x)  # Final output
        return x


def plot_loss_accuracy(losses, accuracies):
    """Plots Loss vs Accuracy on the same graph."""
    plt.figure(figsize=(10, 6))

    # Plot Loss
    plt.plot(losses, color='red', label='Loss (Cost)', linestyle='-', marker='o')

    # Plot Accuracy
    plt.plot(accuracies, color='blue', label='Accuracy', linestyle='-', marker='x')

    plt.title('Training Loss and Accuracy', fontsize=14)
    plt.xlabel('Epochs', fontsize=12)
    plt.ylabel('Value', fontsize=12)
    plt.legend(loc='best')
    plt.grid(True)

    # Save the plot
    plt.savefig("plot.svg")


# 🔥 Function to choose the dataset dynamically
def choose_dataset(dataset_name):
    """Choose and load a custom dataset dynamically."""
    # ✅ Dynamic path generation
    base_path = './data'
    dataset_path = os.path.join(base_path, dataset_name, 'raw')

    # Validate dataset path
    if not os.path.exists(dataset_path):
        raise ValueError(f"❌ Dataset {dataset_name} not found at {dataset_path}")

    # ✅ Locate image and label files dynamically
    image_file = None
    label_file = None

    for file in os.listdir(dataset_path):
        if 'images' in file:
            image_file = os.path.join(dataset_path, file)
        elif 'labels' in file:
            label_file = os.path.join(dataset_path, file)

    # Ensure both image and label files are found
    if not image_file or not label_file:
        raise ValueError(f"❌ Missing image or label files in {dataset_path}")

    transform = transforms.Compose([
        transforms.ToTensor(),
        transforms.Normalize((0.5,), (0.5,))  # Normalize between -1 and 1
    ])

    # ✅ Load the custom dataset with file paths
    dataset = CustomMNISTDataset(dataset_path=dataset_path, transform=transform)

    return dataset


# Print activation details once
def print_activation_details(model, sample_batch):
    """Print activation map sizes once before training."""
    with torch.no_grad():
        x = sample_batch
        print("\n--- CNN Activation Details (One-time) ---")

        x = model.conv1(x)
        print(f"Conv1: {x.shape}")

        x = model.pool(x)
        print(f"Pool1: {x.shape}")

        x = model.conv2(x)
        print(f"Conv2: {x.shape}")

        x = model.pool(x)
        print(f"Pool2: {x.shape}")

        x = x.view(-1, 32 * 4 * 4)
        print(f"Flattened: {x.shape}")

        x = model.fc1(x)
        print(f"FC1: {x.shape}")

        x = model.fc2(x)
        print(f"FC2: {x.shape}")

        x = model.fc3(x)
        print(f"Output (Logits): {x.shape}\n")


# Training Function
def train_final_model(model, criterion, optimizer, train_loader, epochs=256):
    losses = []
    accuracies = []

    # Print activation details once before training
    sample_batch, _ = next(iter(train_loader))
    print_activation_details(model, sample_batch)

    model.train()

    for epoch in range(epochs):
        epoch_loss = 0.0
        correct, total = 0, 0

        # tqdm progress bar
        with tqdm(train_loader, desc=f'Epoch {epoch + 1}/{epochs}', unit='batch') as t:
            for images, labels in t:
                optimizer.zero_grad()
                outputs = model(images)
                loss = criterion(outputs, labels)
                loss.backward()
                optimizer.step()

                # Update metrics
                epoch_loss += loss.item()
                _, predicted = torch.max(outputs, 1)
                total += labels.size(0)
                correct += (predicted == labels).sum().item()

                t.set_postfix(loss=loss.item())

        # Store epoch loss and accuracy
        losses.append(epoch_loss / len(train_loader))
        accuracy = 100 * correct / total
        accuracies.append(accuracy)

        print(f"Epoch [{epoch+1}/{epochs}], Loss: {epoch_loss / len(train_loader):.4f}, Accuracy: {accuracy:.2f}%")

    # After training, plot the loss and accuracy
    plot_loss_accuracy(losses, accuracies)

    return losses, accuracies


def get_dataset_options(base_path='./data'):
    """List all subdirectories in the data directory."""
    try:
        # List all subdirectories in the base_path (data folder)
        options = [folder for folder in os.listdir(base_path) if os.path.isdir(os.path.join(base_path, folder))]
        return options
    except FileNotFoundError:
        print(f"❌ Directory {base_path} not found!")
        return []


def number_to_char(number):
    if 0 <= number <= 9:
        return str(number)  # 0-9
    elif 10 <= number <= 35:
        return chr(number + 87)  # a-z (10 -> 'a', 35 -> 'z')
    elif 36 <= number <= 61:
        return chr(number + 29)  # A-Z (36 -> 'A', 61 -> 'Z')
    else:
        return ''


def display_predictions(model, data_loader, output_name, num_samples=6, font_path='./Daemon.otf'):
    """Displays sample images with predicted labels"""
    model.eval()

    # Load custom font
    prop = fm.FontProperties(fname=font_path)

    images, labels = next(iter(data_loader))
    with torch.no_grad():
        outputs = model(images)
        _, predictions = torch.max(outputs, 1)

    # Displaying 6 samples
    plt.figure(figsize=(12, 6))

    for i in range(num_samples):
        plt.subplot(2, 3, i + 1)
        plt.imshow(images[i].squeeze(), cmap='gray')

        # Convert predicted number to corresponding character
        predicted_char = number_to_char(predictions[i].item())
        actual_char = number_to_char(labels[i].item())

        # Title with 'Predicted' and 'Actual' both in custom font
        if predicted_char == actual_char:
            plt.title(f'{predicted_char} = {actual_char}', fontsize=84, fontproperties=prop)
        else:
            plt.title(f'{predicted_char} != {actual_char}', fontsize=84, fontproperties=prop)

        plt.axis('off')

    plt.savefig(output_name)


if __name__ == "__main__":
    # Choose Dataset
    dataset_options = get_dataset_options()

    if dataset_options:
        # Dynamically display dataset options
        print("Available datasets:")
        for i, option in enumerate(dataset_options, 1):
            print(f"{i}. {option}")

        # User input to choose a dataset
        dataset_index = int(input(f"Enter the number corresponding to the dataset (1-{len(dataset_options)}): ")) - 1

        # Ensure valid selection
        if 0 <= dataset_index < len(dataset_options):
            dataset_name = dataset_options[dataset_index]
            print(f"You selected: {dataset_name}")
        else:
            print("❌ Invalid selection.")
            dataset_name = None
    else:
        print("❌ No datasets found in the data folder.")
        dataset_name = None

    train_dataset = choose_dataset(dataset_name)
    train_loader = DataLoader(train_dataset, batch_size=64, shuffle=True)

    # Model, Criterion, and Optimizer
    model = FinalCNN()
    criterion = nn.CrossEntropyLoss()
    optimizer = optim.SGD(model.parameters(), lr=0.005)

    display_predictions(model, train_loader, output_name="before.svg")
    # Train the Model
    losses, accuracies = train_final_model(model, criterion, optimizer, train_loader, epochs=256)

    # Display sample predictions
    display_predictions(model, train_loader, output_name="after.svg")
font.py
ADDED
@@ -0,0 +1,138 @@
import os
import struct
import numpy as np
import torch
import gzip
from PIL import Image, ImageFont, ImageDraw
import cv2
import random
import string

# 📝 Define the HandwrittenFontDataset class
class HandwrittenFontDataset(torch.utils.data.Dataset):
    def __init__(self, font_path, num_samples):
        self.font_path = font_path
        self.num_samples = num_samples
        self.font = ImageFont.truetype(self.font_path, 32)  # Font size
        self.characters = string.digits + string.ascii_uppercase + string.ascii_lowercase

    def __len__(self):
        return self.num_samples

    def __getitem__(self, index):
        # Randomly choose a character
        char = random.choice(self.characters)

        # Create image with that character
        img = Image.new('L', (64, 64), color=255)  # Create a blank image (grayscale)
        draw = ImageDraw.Draw(img)
        draw.text((10, 10), char, font=self.font, fill=0)  # Draw the character

        # Convert image to numpy array (resize to 28x28 for MNIST format)
        img = np.array(img)
        img = preprocess_for_mnist(img)

        # Convert character to label (integer)
        label = self.characters.index(char)

        return torch.tensor(img, dtype=torch.uint8), label

# 📄 Resize and preprocess images for MNIST format
def preprocess_for_mnist(img):
    """Resize image to 28x28 and normalize to 0-255 range."""
    img = cv2.resize(img, (28, 28), interpolation=cv2.INTER_AREA)
    img = img.astype(np.uint8)  # Convert to unsigned byte
    return img

# 📄 Write images to idx3-ubyte format
def write_idx3_ubyte(images, file_path):
    """Write images to idx3-ubyte format."""
    with open(file_path, 'wb') as f:
        # Magic number (2051 = 0x00000803 for image files)
        f.write(struct.pack(">IIII", 2051, len(images), 28, 28))

        # Write image data as unsigned bytes (each pixel in range [0, 255])
        for image in images:
            f.write(image.tobytes())

# 📄 Write labels to idx1-ubyte format
def write_idx1_ubyte(labels, file_path):
    """Write labels to idx1-ubyte format."""
    with open(file_path, 'wb') as f:
        # Magic number (2049 = 0x00000801 for label files)
        f.write(struct.pack(">II", 2049, len(labels)))

        # Write each label as a byte
        for label in labels:
            f.write(struct.pack("B", label))

# 📄 Compress file to .gz format
def compress_file(input_path, output_path):
    """Compress the idx3 and idx1 files to .gz format."""
    with open(input_path, 'rb') as f_in:
        with gzip.open(output_path, 'wb') as f_out:
            f_out.writelines(f_in)

# 📊 Save dataset in MNIST format
def save_mnist_format(images, labels, output_dir):
    """Save the dataset in MNIST format to raw/ directory."""
    raw_dir = os.path.join(output_dir, "raw")
    os.makedirs(raw_dir, exist_ok=True)

    # Prepare file paths
    train_images_path = os.path.join(raw_dir, "train-images-idx3-ubyte")
    train_labels_path = os.path.join(raw_dir, "train-labels-idx1-ubyte")

    # Write uncompressed idx3 and idx1 files
    write_idx3_ubyte(images, train_images_path)
    write_idx1_ubyte(labels, train_labels_path)

    # Compress idx3 and idx1 files into .gz format
    compress_file(train_images_path, f"{train_images_path}.gz")
    compress_file(train_labels_path, f"{train_labels_path}.gz")

    print(f"Dataset saved in MNIST format at {raw_dir}")

# ✅ Generate and save the dataset
def create_mnist_dataset(font_path, num_samples=4096):
    """Generate dataset and save in MNIST format."""
    # Get font name without extension
    font_name = os.path.splitext(os.path.basename(font_path))[0]
    output_dir = os.path.join("./data", font_name)

    # Ensure the directory exists
    os.makedirs(output_dir, exist_ok=True)

    dataset = HandwrittenFontDataset(font_path, num_samples)

    images = []
    labels = []

    for i in range(num_samples):
        img, label = dataset[i]
        images.append(img.numpy())
        labels.append(label)

    # Save in MNIST format
    save_mnist_format(images, labels, output_dir)

# 🔥 Example usage
def choose_font_and_create_dataset():
    # List all TTF and OTF files in the root directory
    font_files = [f for f in os.listdir("./") if f.endswith(".ttf") or f.endswith(".otf")]

    # Display available fonts for user to choose
    print("Available fonts:")
    for i, font_file in enumerate(font_files):
        print(f"{i+1}. {font_file}")

    # Get user's choice
    choice = int(input(f"Choose a font (1-{len(font_files)}): "))
    chosen_font = font_files[choice - 1]

    print(f"Creating dataset using font: {chosen_font}")
    create_mnist_dataset(chosen_font)

# Run the font selection and dataset creation
choose_font_and_create_dataset()
install.bat
ADDED
@@ -0,0 +1,7 @@
@echo off
REM Create a virtual environment
python -m venv .pytorchenv
REM Activate the virtual environment
call .pytorchenv\Scripts\activate
REM Install required Python packages
pip install -r requirements.txt
install.sh
ADDED
@@ -0,0 +1,7 @@
+#!/bin/bash
+# Create a virtual environment
+python -m venv .pytorchenv
+# Activate the virtual environment
+source .pytorchenv/bin/activate
+# Install required Python packages
+pip install -r requirements.txt
logistic_regression.py
ADDED
@@ -0,0 +1,65 @@
+import torch
+import torch.nn as nn
+import torch.optim as optim
+import numpy as np
+from torch.utils.data import DataLoader, TensorDataset
+
+
+# Logistic Regression Model
+class LogisticRegressionModel(nn.Module):
+    def __init__(self, input_size):
+        super(LogisticRegressionModel, self).__init__()
+        self.linear = nn.Linear(input_size, 1)
+
+    def forward(self, x):
+        return torch.sigmoid(self.linear(x))
+
+# Cross-Entropy Loss Function
+def cross_entropy_loss(y_pred, y_true):
+    return -torch.mean(y_true * torch.log(y_pred) + (1 - y_true) * torch.log(1 - y_pred))
+
+# Training Function
+def train_model(model, criterion, optimizer, x_train, y_train, epochs=100):
+    print("Initial model parameters:", list(model.parameters()))
+    for epoch in range(epochs):
+        print(f'Epoch {epoch+1}/{epochs}')
+
+        model.train()
+        optimizer.zero_grad()
+        y_pred = model(x_train)
+        loss = criterion(y_pred, y_train)
+        loss.backward()
+        optimizer.step()
+        # Removed cross_entropy calculation as it is redundant
+
+
+
+
+        print(f'Loss: {loss.item():.4f}, Learning Rate: {optimizer.param_groups[0]["lr"]}')
+
+        if (epoch+1) % 10 == 0:
+
+            print(f'Epoch [{epoch+1}/{epochs}], Loss: {loss.item():.4f}')
+
+# Example Usage
+if __name__ == "__main__":
+    # Sample Data
+    x_data = torch.tensor(np.random.rand(100, 2), dtype=torch.float32)
+    y_data = torch.tensor(np.random.randint(0, 2, (100, 1)), dtype=torch.float32)
+
+    # Create Dataset and DataLoader
+    dataset = TensorDataset(x_data, y_data)
+    train_loader = DataLoader(dataset, batch_size=10, shuffle=True)
+
+    model = LogisticRegressionModel(input_size=2)
+
+    # Initialize criterion and optimizer
+    criterion = nn.BCELoss()
+    optimizer = optim.SGD(model.parameters(), lr=0.01)
+
+    # Call the training function
+    train_model(model, criterion, optimizer, x_data, y_data, epochs=100)
+
+    with torch.no_grad():
+        sample_predictions = model(x_data)
+        print("Sample predictions after training:", sample_predictions)
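`logistic_regression.py` leaves the gradient step to torch's autograd and `optim.SGD`. The same update each epoch performs can be sketched in plain numpy (the synthetic data, seed, and learning rate below are illustrative; note the clip guarding `log(0)`, which the file's manual `cross_entropy_loss` omits):

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data mirroring the torch example: 100 points, 2 features
X = rng.random((100, 2))
y = (X @ np.array([2.0, -3.0]) + 0.5 > 0).astype(float)  # linearly separable labels

w = np.zeros(2)
b = 0.0
lr = 0.5

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def bce(p, y):
    p = np.clip(p, 1e-12, 1 - 1e-12)  # clamp to avoid log(0)
    return -np.mean(y * np.log(p) + (1 - y) * np.log(1 - p))

initial = bce(sigmoid(X @ w + b), y)
for _ in range(200):
    p = sigmoid(X @ w + b)
    grad_w = X.T @ (p - y) / len(y)  # gradient of mean BCE w.r.t. weights
    grad_b = np.mean(p - y)          # gradient w.r.t. bias
    w -= lr * grad_w
    b -= lr * grad_b

final = bce(sigmoid(X @ w + b), y)
print(final < initial)  # True: full-batch gradient descent reduces the loss
```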
metrics.png
ADDED