Building an autograd engine: tinytorch 01
This blog used to live on pythonstuff.com; this is its new home 🤗.

I'm gonna show you how to build your own tiny PyTorch. Yes, for real.

It's the weekend, and I am going to build my own autograd engine. I have done this once before, so it shouldn't be a problem. I started with an empty git repo, and I will keep committing without any rebases, so anyone can go back and see everything.

This is a dev blog; I'll try to write explanations so you can recreate it. I'm not used to writing tutorials, so DM me on twitter (X now) @shxf0072 if I mess something up.

All the code will be at github.com/joey00072/tinytorch. I'll leave commit IDs in the blog, so you can `git checkout COMMIT_ID` to go back to that point.
tinytorch
Let's start with a Tensor wrapper around numpy:
```python
import numpy as np


class Tensor:
    def __init__(self, data):
        self.data = data if isinstance(data, np.ndarray) else np.array(data)

    def __add__(self, other):
        return Tensor(self.data + other.data)

    def __mul__(self, other):
        return Tensor(self.data * other.data)

    def __repr__(self):
        return f"tensor({self.data})"


if __name__ == "__main__":
    x = Tensor([8])
    y = Tensor([5])
    z = x + y
    print(z)
```
Yay, now we can add two tensors. `__add__` is what adds `+` to the Tensor class; if you don't know about this, search for "magic methods in Python".
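Quick sanity check of those magic methods (a minimal sketch, assuming the Tensor class above):

```python
x = Tensor([8])
y = Tensor([5])

# x + y is just sugar for x.__add__(y); same story for * and __mul__
print(x + y)         # tensor([13])
print(x.__add__(y))  # tensor([13]), the exact same call
print(x * y)         # tensor([40])
```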
My friends called me up for Valorant, so I'll be back in 2hr or so (20:02 19-08-2023).

Back (23:02 19-08-2023).
Add & MUL
Don't worry about the math, the code is easy.

The derivative of anything with respect to itself is 1: $\frac{dx}{dx} = 1$.

Let's start with addition. If you have the function

$$f(x) = x + 10$$

its derivative will be 1, since the derivative of $x$ is 1 and the derivative of a constant (the 10) is 0. So $1 + 0 = 1$.

For two variables, $f(x, y) = x + y$:

with respect to $x$: $\frac{\partial f}{\partial x} = 1$

with respect to $y$: $\frac{\partial f}{\partial y} = 1$

So if $x = 10$ and $y = 20$, both gradients are still just 1; the values don't matter for addition.

Noice, adding gives equal gradient back to both nodes. Since $z$ has gradient 1, $x$ and $y$ both get gradient 1. This will be useful in residual connections in transformers.
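You can sanity-check this numerically with a finite difference. A little sketch (the `numerical_grad` helper is just for illustration, not part of tinytorch):

```python
def numerical_grad(f, at, idx, eps=1e-6):
    # nudge argument `idx` by a tiny eps and watch how f moves
    args = list(at)
    args[idx] += eps
    return (f(*args) - f(*at)) / eps

add = lambda x, y: x + y
print(numerical_grad(add, (10.0, 20.0), 0))  # ~1.0 -> df/dx
print(numerical_grad(add, (10.0, 20.0), 1))  # ~1.0 -> df/dy
```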
MUL
Now, let's consider multiplication. If you have the function

$$f(x) = 10 \cdot x$$

its derivative will be 10, since $\frac{dx}{dx} = 1$ and the derivative of the constant 10 is 0, so by the product rule $10 \cdot 1 + x \cdot 0 = 10$.

For two variables, $f(x, y) = x \cdot y$:

with respect to $x$: $\frac{\partial f}{\partial x} = y$

with respect to $y$: $\frac{\partial f}{\partial y} = x$

Noice, in this case the derivative with respect to $x$ has the value of $y$ (20), and the derivative with respect to $y$ has the value of $x$ (10).
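Same finite-difference check for multiplication, reusing the `numerical_grad` sketch from the addition section:

```python
mul = lambda x, y: x * y
print(numerical_grad(mul, (10.0, 20.0), 0))  # ~20.0 -> df/dx = y
print(numerical_grad(mul, (10.0, 20.0), 1))  # ~10.0 -> df/dy = x
```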
Let's code.

We will create `Add`, `Mul`, and a `Function` class, move the operation logic into the `forward` method of each class, and store the args in `Function.args` for the backward pass.
```python
class Function:
    def __init__(self, op, *args):
        self.op = op
        self.args = args


class Add:
    @staticmethod
    def forward(x, y):
        return Tensor(x.data + y.data)  # z = x + y

    @staticmethod
    def backward(ctx, grad):
        x, y = ctx.args
        return Tensor([1]), Tensor([1])  # dz/dx, dz/dy


class Mul:
    @staticmethod
    def forward(x, y):
        return Tensor(x.data * y.data)  # z = x*y

    @staticmethod
    def backward(ctx, grad):
        x, y = ctx.args
        return Tensor(y.data), Tensor(x.data)  # dz/dx, dz/dy
```
The `Function` class stores which function/operation we applied, so if we add `x = Tensor([10])` and `y = Tensor([20])`, the result's function object will have `fn.op = Add` and `fn.args = (x, y)`.

We pass the function object as context (`ctx`) to `backward`, so we can get the original args back when we do the backward pass.
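To make that concrete, here's a hypothetical manual walkthrough, assuming the classes above (we'll wire this into `__add__`/`__mul__` next):

```python
x = Tensor([10])
y = Tensor([20])

fn = Function(Mul, x, y)  # remember which op ran and what its inputs were
z = Mul.forward(x, y)     # tensor([200])

# backward receives the Function as ctx, so it can unpack the original args
dx, dy = Mul.backward(fn, Tensor([1]))
print(dx, dy)             # tensor([20]) tensor([10])
```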
Let's modify `__add__` and `__mul__`:
```python
class Tensor:
    def __init__(self, data):
        self.data = data if isinstance(data, np.ndarray) else np.array(data)
        self._ctx = None

    def __add__(self, other):
        fn = Function(Add, self, other)
        result = Add.forward(self, other)
        result._ctx = fn
        return result

    def __mul__(self, other):
        fn = Function(Mul, self, other)
        result = Mul.forward(self, other)
        result._ctx = fn
        return result

    def __repr__(self):
        return f"tensor({self.data})"
```
So when you do some op:

- first, store all the info related to that op in a Function object
- then do the op's `forward`
- store that Function as context on the result node
- return the result
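You can see all of this sitting on the result node; a quick sketch, assuming the updated Tensor class:

```python
x = Tensor([8])
y = Tensor([5])
z = x * y

print(z)            # tensor([40])
print(z._ctx.op)    # the Mul class, the op that made z
print(z._ctx.args)  # (tensor([8]), tensor([5])), the inputs saved for backward
```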
If you want to see this graph, create a new visualize.py file.

```bash
pip install graphviz
sudo apt-get install -y graphviz  # IDK what to do for Windows, I use WSL
```
```python
import graphviz
from tinytorch import *

G = graphviz.Digraph(format='png')
G.clear()


def visit_nodes(G: graphviz.Digraph, node: Tensor):
    uid = str(id(node))
    G.node(uid, f"Tensor: {str(node.data)}")
    if node._ctx:
        ctx_uid = str(id(node._ctx))
        G.node(ctx_uid, f"Context: {str(node._ctx.op.__name__)}")
        G.edge(uid, ctx_uid)
        for child in node._ctx.args:
            G.edge(ctx_uid, str(id(child)))
            visit_nodes(G, child)


if __name__ == "__main__":
    x = Tensor([8])
    y = Tensor([5])
    z = x + y
    visit_nodes(G, z)
    G.render(directory="vis", view=True)
    print(z)
    print(len(G.body))
```
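With a bigger expression, the recursion pays off; a hypothetical example, assuming the same visualize.py setup:

```python
x = Tensor([8])
y = Tensor([5])
w = (x + y) * y  # two ops: a Mul node sitting on top of an Add node

visit_nodes(G, w)
G.render(directory="vis", view=True)
```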
Full tinytorch.py so far:

```python
import numpy as np


class Tensor:
    def __init__(self, data):
        self.data = data if isinstance(data, np.ndarray) else np.array(data)
        self._ctx = None

    def __add__(self, other):
        fn = Function(Add, self, other)
        result = Add.forward(self, other)
        result._ctx = fn
        return result

    def __mul__(self, other):
        fn = Function(Mul, self, other)
        result = Mul.forward(self, other)
        result._ctx = fn
        return result

    def __repr__(self):
        return f"tensor({self.data})"


class Function:
    def __init__(self, op, *args):
        self.op = op
        self.args = args


class Add:
    @staticmethod
    def forward(x, y):
        return Tensor(x.data + y.data)

    @staticmethod
    def backward(ctx, grad):
        x, y = ctx.args
        return Tensor([1]), Tensor([1])


class Mul:
    @staticmethod
    def forward(x, y):
        return Tensor(x.data * y.data)  # z = x*y

    @staticmethod
    def backward(ctx, grad):
        x, y = ctx.args
        return Tensor(y.data), Tensor(x.data)  # dz/dx, dz/dy


if __name__ == "__main__":
    x = Tensor([8])
    y = Tensor([5])
    z = x * y
    print(z)
```
Till commit dc11629
https://github.com/joey00072/tinytorch

Sleeping now, backprop tomorrow.