July 16, 2025
At ADI, AI means developing world-class models that overcome latency, power, and cost constraints to deliver insights. Nothing beats reading arXiv's latest or leafing through a stats tome at the beach, the pool, or lakeside. At least for us. Here are the papers our team members are recommending for July, covering all things physical intelligence:
Stochastic Resetting Mitigates Latent Gradient Bias of SGD from Label Noise | https://bit.ly/bc-0725-8 | Recommended by: Yakov Shkolnikov
Why it's worth reading: If you're interested in making deep neural networks (DNNs) more robust to messy, unreliable data, this paper shows that periodically resetting the network's weights to an earlier state during training can actually help the model generalize better and avoid overfitting to noisy labels. The authors break down why this works, linking it to hidden biases in the training process and drawing inspiration from statistical physics. What's great is that their method is simple to implement and works well alongside other techniques for handling noisy data.
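To make the idea concrete, here is a minimal PyTorch-style sketch of SGD training with periodic resetting to an early checkpoint. The hyperparameters (reset_period, checkpoint_epoch) and the reset-to-snapshot recipe are illustrative choices on our part, not the paper's exact protocol.

```python
import copy
import torch

def train_with_resetting(model, loader, loss_fn, epochs=50,
                         reset_period=10, checkpoint_epoch=2, lr=0.1):
    """Plain SGD with periodic resetting to an early-training snapshot.

    reset_period and checkpoint_epoch are illustrative hyperparameters,
    not the paper's settings.
    """
    opt = torch.optim.SGD(model.parameters(), lr=lr)
    snapshot = None
    for epoch in range(epochs):
        for x, y in loader:
            opt.zero_grad()
            loss = loss_fn(model(x), y)
            loss.backward()
            opt.step()
        if epoch == checkpoint_epoch:
            # remember an early, not-yet-overfit state of the network
            snapshot = copy.deepcopy(model.state_dict())
        if snapshot is not None and (epoch + 1) % reset_period == 0:
            # periodically jump back to that state and keep training
            model.load_state_dict(snapshot)
    return model
```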
KernelBench: Can LLMs Write Efficient GPU Kernels? | https://bit.ly/bc-0725-1 | Recommended by: Solomon Garber
Why it's worth reading: In this paper, Stanford and Princeton researchers test whether LLMs can generate GPU kernels that match or beat standard PyTorch kernels on 250 diverse workloads, ranging from single operators to full ML architectures. They present a new metric, fast_p, to measure both correctness and speed. Their findings are eye-opening: out-of-the-box LLMs offered improvement on fewer than 20% of tasks. However, by leveraging execution and profiling feedback, performance jumps dramatically, demonstrating real potential to automate a once labor-intensive, expertise-driven workflow.
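As a rough illustration of the metric, here is a toy computation of fast_p under the reading that a task counts only if the generated kernel is both functionally correct and more than p times faster than the PyTorch baseline; the real KernelBench harness measures correctness and timing far more carefully than this sketch.

```python
from typing import Sequence

def fast_p(correct: Sequence[bool], speedups: Sequence[float], p: float = 1.0) -> float:
    """Fraction of tasks whose generated kernel is correct AND more than
    p times faster than the reference kernel (speedup = ref_time / gen_time)."""
    hits = sum(1 for ok, s in zip(correct, speedups) if ok and s > p)
    return hits / len(correct)

# Toy example: 3 of 5 kernels are correct, 2 of those beat the baseline (p = 1.0)
print(fast_p([True, True, True, False, False],
             [1.4, 0.9, 2.0, 3.0, 0.5]))  # -> 0.4
```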
Build a Large Language Model (From Scratch) | https://bit.ly/bc-0725-2-5 | Recommended by: Marc Light
Why it's worth reading: This book by University of Wisconsin professor Sebastian Raschka, PhD, goes through the code involved in building an LLM. As you progress through the book's chapters, you get an intimate view of what's going on under the hood of these models. I especially enjoyed the gradual walkthrough of attention mechanisms, from a basic version to a fully trainable implementation. Not an easy read, but especially illuminating if you can get through the book in as close to one sitting as possible. The recent edition of his code also moves from GPT-2 to Qwen3, making it especially relevant.
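To give a flavor of the attention build-up the book walks through, here is a compressed single-head self-attention module with trainable query/key/value projections; it leaves out causal masking, dropout, and multiple heads, and is a simplification rather than the book's exact code.

```python
import torch
import torch.nn as nn

class SelfAttention(nn.Module):
    """Single-head self-attention with trainable projections (simplified)."""
    def __init__(self, d_in, d_out):
        super().__init__()
        self.W_q = nn.Linear(d_in, d_out, bias=False)
        self.W_k = nn.Linear(d_in, d_out, bias=False)
        self.W_v = nn.Linear(d_in, d_out, bias=False)

    def forward(self, x):                      # x: (batch, seq_len, d_in)
        q, k, v = self.W_q(x), self.W_k(x), self.W_v(x)
        scores = q @ k.transpose(-2, -1)       # pairwise token similarities
        weights = torch.softmax(scores / k.shape[-1] ** 0.5, dim=-1)
        return weights @ v                     # (batch, seq_len, d_out)

x = torch.randn(1, 4, 8)                       # a toy batch of 4 token embeddings
print(SelfAttention(8, 16)(x).shape)           # torch.Size([1, 4, 16])
```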
LLM & Transformer Interview Essentials A-Z, Part 1: Architecture Fundamentals | https://bit.ly/bc-0725-5 | Recommended by: Adam Rowell
Why it's worth reading: Written by my grad school friend Xavier Fang, the book goes through foundational LLM concepts, distilling each into a 4-6 page chapter. It strikes a good mix of breadth and depth, covering the math and core concepts AI software engineers need, up through advanced LLM techniques. Wrapped in the guise of an interview guide, the book provides the foundations you need to work efficiently with LLMs.
The AI Scientist-v2: Workshop-Level Automated Scientific Discovery via Agentic Tree Search | https://bit.ly/bc-0725-7 | Recommended by: Debanjan Ghosh
Why it's worth reading: Can an AI-authored workshop paper be accepted to ICLR? The authors built an agentic system with that goal in mind. It automates scientific discovery end to end, demonstrating hypothesis formation, experiment design and execution (e.g., code generation), data analysis, and manuscript writing. Over the course of this long-term project, the Sakana team moved from a linear pipeline to tree search-based planning, expanding and evaluating a "branching space" of candidate hypotheses and experiment plans. A worthy read for anyone working on agentic architectures.
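For intuition about planning over a branching space of candidates, here is a toy best-first tree search; propose and evaluate are placeholders standing in for LLM-driven hypothesis generation and experiment execution, not Sakana's actual interfaces or scoring.

```python
import heapq
from dataclasses import dataclass, field

@dataclass(order=True)
class Node:
    neg_score: float                  # heapq is a min-heap, so store -score
    plan: str = field(compare=False)  # candidate hypothesis / experiment plan

def tree_search(root_plan, propose, evaluate, budget=20, branching=3):
    """Repeatedly expand the most promising plan into `branching` children."""
    frontier = [Node(-evaluate(root_plan), root_plan)]
    best = frontier[0]
    for _ in range(budget):
        if not frontier:
            break
        node = heapq.heappop(frontier)            # most promising plan so far
        for child_plan in propose(node.plan, branching):
            child = Node(-evaluate(child_plan), child_plan)
            heapq.heappush(frontier, child)
            if child.neg_score < best.neg_score:  # higher score = better
                best = child
    return best.plan

# Toy demo with stand-in functions
def propose(plan, k): return [f"{plan}->v{i}" for i in range(k)]
def evaluate(plan): return len(plan) / 100        # dummy "experiment result"
print(tree_search("seed", propose, evaluate, budget=5))
```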
This is the second instalment of our list. If you're working on edge AI, talk to us. If you've read something recently that aligns with these themes, feel free to share it with us in the comments!