Build a Deep Learning Library

(zekcrates.quarto.pub)

124 points | by butanyways 1 day ago

8 comments

  • megadragon9 23 hours ago
    Thanks for sharing! It's inspiring to see more people "reinventing for insight" in the age of AI. This reminds me of a similar project from a year ago, when I built an entire PyTorch-style machine learning library [1] from scratch, using nothing but Python and NumPy. I started with a tiny autograd engine, then gradually created layer modules, optimizers, data loaders, etc. I simply wanted to learn machine learning from first principles. Along the way I attempted to reproduce everything from classical convnets [2] all the way to a toy GPT-2 [3] using the library I built. It definitely helped me understand how machine learning works under the hood without all the fancy abstractions that PyTorch/TensorFlow provide. I eventually wrote a blog post [4] about this journey.

    [1] https://github.com/workofart/ml-by-hand

    [2] https://github.com/workofart/ml-by-hand/blob/main/examples/c...

    [3] https://github.com/workofart/ml-by-hand/blob/main/examples/g...

    [4] https://www.henrypan.com/blog/2025-02-06-ml-by-hand/

    • butanyways 10 hours ago
      Yes, I read your blog posts way back then. Nice work with the GPT-2!!
    • RestartKernel 22 hours ago
      During my Bachelor's, I wrote a small "immutable" algebraic machine learning library based on just NumPy. This made it easy to play around with combining weights, for example by simply summing two networks, or by applying whatever other operations are supported on normal NumPy arrays.

      ... turns out, this is only useful in some very specific scenarios, and it's probably not worth the extreme memory overhead.
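
A minimal sketch of the "immutable, summable network" idea described above. The class name `FrozenNet` and its methods are hypothetical, not taken from the commenter's library. Plain floats stand in for the weight arrays to keep the sketch dependency-free; since only `+` and `*` are used, NumPy arrays should work in their place. Every operation returns a new object, which is where the memory overhead mentioned above comes from.

```python
from dataclasses import dataclass
from typing import Any, Callable, Tuple


@dataclass(frozen=True)
class FrozenNet:
    """Hypothetical immutable container for a network's weights."""
    weights: Tuple[Any, ...]

    def _zip_with(self, other: "FrozenNet",
                  op: Callable[[Any, Any], Any]) -> "FrozenNet":
        if len(self.weights) != len(other.weights):
            raise ValueError("networks must have the same number of weight entries")
        # Build a brand-new network; neither operand is modified.
        return FrozenNet(tuple(op(a, b) for a, b in zip(self.weights, other.weights)))

    def __add__(self, other: "FrozenNet") -> "FrozenNet":
        return self._zip_with(other, lambda a, b: a + b)

    def __mul__(self, scalar: float) -> "FrozenNet":
        return FrozenNet(tuple(w * scalar for w in self.weights))


net_a = FrozenNet((1.0, 2.0))
net_b = FrozenNet((3.0, 6.0))
average = (net_a + net_b) * 0.5  # allocates two new networks along the way
```

Here `average.weights` is a fresh tuple, and `net_a` and `net_b` are left untouched.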

  • amitav1 1 day ago
    This is cool! This summer I made something similar but in C++. The goal was to build an entire LLM, but I only got as far as neural networks. GitHub repo here: https://github.com/amitav-krishna/llm-from-scratch. I have a few blog posts on this project on my website (https://amitav.net/building-lists.html, https://amitav.net/building-vectors.html, https://amitav.net/building-matrices.html (incomplete)). I hope to finish that series eventually, but some other projects have stolen the spotlight! It probably would have made more sense to write it in Python because I had no C++ experience.
  • csantini 1 day ago
    Did something similar a while back [1]; it's the best way to learn neural nets and backprop. Using just NumPy also makes sure you get the math right without having to deal with higher-level frameworks or C++ libraries.

    [1] https://github.com/santinic/claudioflow

    • butanyways 1 day ago
      It's nice! Yeah, a lot of the heavy lifting is done by NumPy.
  • silentsea90 1 day ago
    Isn't this what Karpathy does as well in the Zero to Hero lecture series on YT? I am sure this is great as well!
    • butanyways 1 day ago
      If you are asking about the "micrograd" video, then yes, a little bit. "micrograd" is for scalars, and we use tensors in the book. If you are reading the book, I would recommend first completing the series, or at least the "micrograd" video.
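
To make the scalar case concrete, here is a minimal sketch of reverse-mode autodiff on scalars. It is written in the spirit of what the comment describes, but it is not micrograd's actual code and not the book's; the `Value` class and its attributes are illustrative assumptions. A tensor version would keep arrays in `data` and `grad`, and its backward rules would additionally have to handle shapes and broadcasting.

```python
class Value:
    """A scalar that remembers how it was computed (illustrative name)."""

    def __init__(self, data, parents=()):
        self.data = float(data)
        self.grad = 0.0
        self._parents = parents
        self._backward = lambda: None  # leaf nodes have nothing to propagate

    def __add__(self, other):
        out = Value(self.data + other.data, (self, other))

        def _backward():
            # d(a + b)/da = 1 and d(a + b)/db = 1
            self.grad += out.grad
            other.grad += out.grad

        out._backward = _backward
        return out

    def __mul__(self, other):
        out = Value(self.data * other.data, (self, other))

        def _backward():
            # d(a * b)/da = b and d(a * b)/db = a
            self.grad += other.data * out.grad
            other.grad += self.data * out.grad

        out._backward = _backward
        return out

    def backward(self):
        # Topologically order the graph so that each node is processed
        # only after every node that consumes it.
        order, seen = [], set()

        def visit(node):
            if node not in seen:
                seen.add(node)
                for parent in node._parents:
                    visit(parent)
                order.append(node)

        visit(self)
        self.grad = 1.0
        for node in reversed(order):
            node._backward()


x = Value(2.0)
w = Value(3.0)
b = Value(1.0)
y = x * w + b   # y = 7
y.backward()    # dy/dx = w = 3, dy/dw = x = 2, dy/db = 1
```

Gradients are accumulated with `+=` so that a value used more than once in the expression receives the sum of its contributions.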
  • yunnpp 1 day ago
    It's alright, but a C version would be even better for fully grasping the implementation details of tensors etc. Shelling out to NumPy isn't particularly exciting.
    • butanyways 1 day ago
      I agree! What NumPy is doing is actually quite beautiful. I was thinking of writing a custom C++ backend for this thing. Let's see what happens this year.
      • p1esk 1 day ago
        If someone is interested in low-level tensor implementation details, they could benefit from a course/book “let’s build numpy in C”. No need to complicate DL library design discussion with that stuff.
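
For readers wondering what kind of low-level tensor detail this subthread is about, one central idea is strided storage: a tensor is a flat buffer plus a shape and per-axis strides. The sketch below is in Python rather than C, to match the rest of the thread, and the names `StridedView` and `row_major_strides` are hypothetical. Strides here count elements; NumPy's `ndarray.strides` counts bytes.

```python
from typing import List, Tuple


def row_major_strides(shape: Tuple[int, ...]) -> Tuple[int, ...]:
    """Elements to skip in the flat buffer per step along each axis."""
    strides = []
    step = 1
    for dim in reversed(shape):
        strides.append(step)
        step *= dim
    return tuple(reversed(strides))


class StridedView:
    """Hypothetical minimal tensor: a flat buffer plus shape and strides."""

    def __init__(self, buffer: List[float], shape, strides=None):
        self.buffer = buffer
        self.shape = tuple(shape)
        self.strides = (tuple(strides) if strides is not None
                        else row_major_strides(self.shape))

    def __getitem__(self, index: Tuple[int, ...]) -> float:
        for i, dim in zip(index, self.shape):
            if not 0 <= i < dim:
                raise IndexError("index out of range")
        # The multi-dimensional index collapses to a single flat offset.
        offset = sum(i * s for i, s in zip(index, self.strides))
        return self.buffer[offset]

    def transpose(self) -> "StridedView":
        # Reverse the axes by reversing shape and strides;
        # the buffer is shared, not copied.
        return StridedView(self.buffer, self.shape[::-1], self.strides[::-1])


# A 2x3 matrix stored row by row: [[0, 1, 2], [3, 4, 5]]
m = StridedView([0.0, 1.0, 2.0, 3.0, 4.0, 5.0], (2, 3))
t = m.transpose()
```

In this sketch `m[1, 2]` and `t[2, 1]` read the same buffer slot, which is why a transpose can be done without moving any data.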
  • grandimam 1 day ago
    This is good. It's well positioned for software engineers to understand DL stuff beyond the frameworks.
  • opan 1 day ago
    Perhaps obvious to some, but this does not seem to be about learning in the traditional sense, nor a library in the book sense, unfortunately.