March 1, 2026
During my PhD I worked on probabilistic computing and other hardware implementations of Monte Carlo algorithms, and I wanted to understand what different hardware can deliver on a well-defined workload. This post goes through twelve CUDA versions, starting from a single-threaded kernel doing about 30 million samples per second and ending at around 550 billion samples per second on a Tesla T4, an 18,200× improvement over the baseline. All source code is on GitHub.
December 12, 2024
In this blog post, we want to find out how fast we can run a 2D Ising model on a laptop. We are interested both in the optimization process itself and in finding a minimal implementation that does the trick. All code can be found on GitHub. Each implementation is described at a high level and builds on the one from the previous section, adding one further optimization.
November 23, 2024
In the months after finishing my PhD, I spent a lot of time thinking about the state of computing and where it's headed. During this time, I wrote down my thoughts to help organize my view of the computing landscape, and what I ended up with turned out a little longer than I had initially anticipated. Now I'm sharing it here because I think it would have helped me if I had seen it laid out clearly before I started my PhD journey.