On generative LLMs

I’m going to detox from generative LLMs for a while. To be clear, I think generative LLMs (erroneously referred to as “AI”) are incredible tools, and I do believe there is merit in kn...

Oct 01, 2024

An Undergraduate Understanding of System Security

Before I get too far removed from the classroom, I’m going to reflect on the fundamentals of information security that I learned in undergrad and applied in a system security course I...

Apr 20, 2024

It works on my machine

“But it worked on my machine!” is usually the fault of the programmer or some quirk of configuration working behind the scenes to foil what would otherwise be smoothly executing softw...

Apr 01, 2024

Making Models Smaller

When we think about solving problems using machine learning, usually our first concern is building a model that is good at whatever it is we want it to do. For object detection system...

Dec 27, 2023

Accelerating Network Training

This post is a (non-exhaustive) summary of the techniques I’ve used to accelerate neural network training. I’m going to focus on the most fundamental adaptations to improve the traini...

Dec 25, 2023

Unconventional Training

The “conventional” approach to supervised machine learning usually follows this recipe: Some fixed dataset \(\mathcal{D}\) of labeled data \((x, y)\) is split into train, validation,...

Dec 23, 2023

ML Without Neural Networks

Neural networks have taken over ML, and I don’t see that changing in the near-term. With everyone focused on transformers and generative AI, it’s worthwhile to take a step back and ex...

Dec 15, 2023

Papers I like

MotionLM: Multi-Agent Motion Forecasting as Language Modeling

Using LLMs to forecast driver behavior

Llama 2: Open Foundation and Fine-Tuned Chat Models

Open source LLM courtesy of Meta

An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale

Vision transformers!

Communication-Efficient Learning of Deep Networks from Decentralized Data

Cool federated learning application
