Home
nGPT Normalized Transformer with Representation Learning on the Hypersphere
Published at: 10/10/24, 10:09 AM
Revealing the Utilized Rank of Subspaces of Learning in Neural Networks
Published at: 8/11/24, 2:26 PM
Memorization Through the Lens of Curvature of Loss Function Around Samples
Published at: 8/11/24, 12:21 PM
AWQ Activation aware Weight Quantization for LLM Compression and Acceleration
Published at: 8/2/24, 6:49 PM
Exact Conversion of In Context Learning to Model Weights in Linearized Attention Transformers
Published at: 8/2/24, 6:49 PM
Hydra Bidirectional State Space Models Through Generalized Matrix Mixers
Published at: 8/2/24, 6:49 PM
Battle of the Backbones A Large Scale Comparison of Pretrained Models across Computer Vision Tasks
Published at: 7/4/24, 6:39 AM
On Good Practices for Task Specific Distillation of Large Pretrained Visual Models
Published at: 7/4/24, 6:39 AM
ViDT An Efficient and Effective Fully Transformer based Object Detector
Published at: 7/4/24, 6:39 AM
LRP QViT Mixed Precision Vision Transformer Quantization via Layer wise Relevance Propagation
Published at: 7/1/24, 3:29 PM
SimPLR A Simple and Plain Transformer for Scaling Efficient Object Detection and Segmentation
Published at: 7/1/24, 3:29 PM
A survey of quantization methods for efficient neural network inference
Published at: 6/29/24, 3:18 PM
Building on Efficient Foundations Effectively Training LLMs with Structured Feedforward Layers
Published at: 6/29/24, 3:18 PM
EfficientViT SAM Accelerated Segment Anything Model Without Accuracy Loss
Published at: 6/29/24, 3:18 PM
Grokked Transformers are Implicit Reasoners A Mechanistic Journey to the Edge of Generalization
Published at: 6/29/24, 3:18 PM
Model Compression in Practice Lessons Learned from Practitioners Creating On device Machine Learning Experiences
Published at: 6/29/24, 3:18 PM
ProxylessNAS Direct Neural Architecture Search on Target Task and Hardware
Published at: 6/29/24, 3:18 PM
Using Degeneracy in the Loss Landscape for Mechanistic Interpretability
Published at: 6/29/24, 3:18 PM
An Image is Worth More Than 16x16 Patches Exploring Transformers on Individual Pixels
Published at: 6/15/24, 5:50 PM
MobileCLIP Fast Image Text Models through Multi Modal Reinforced Training
Published at: 6/15/24, 5:50 PM
Retrospective EIE Efficient Inference Engine on Sparse and Compressed Neural Network
Published at: 6/15/24, 5:50 PM
Parameter Efficient Fine tuning of Self supervised ViTs without Catastrophic Forgetting
Published at: 6/13/24, 1:28 PM
Parameter Efficient Fine Tuning for Pre Trained Vision Models A Survey
Published at: 6/13/24, 1:28 PM
SAM CLIP Merging Vision Foundation Models towards Semantic and Spatial Understanding
Published at: 6/13/24, 1:28 PM
Simultaneous linear connectivity of neural networks modulo permutation
Published at: 6/13/24, 1:28 PM
Surgical DINO Adapter Learning of Foundation Models for Depth Estimation in Endoscopic Surgery
Published at: 6/13/24, 1:28 PM
Talaria Interactively Optimizing Machine Learning Models for Efficient Inference
Published at: 6/13/24, 1:28 PM
Block Transformer Global to Local Language Modeling for Fast Inference
Published at: 6/6/24, 10:01 AM
Discovering Symmetry Breaking in Physical Systems with Relaxed Group Convolution
Published at: 6/5/24, 12:46 PM
A Hierarchy of Graph Neural Networks Based on Learnable Local Features
Published at: 6/4/24, 2:11 PM
G SGD Optimizing ReLU Neural Networks in its Positively Scale Invariant Space
Published at: 6/4/24, 2:11 PM
Harmonics of Learning Universal Fourier Features Emerge in Invariant Networks
Published at: 6/4/24, 2:11 PM
Knowledge Transfer from Vision Foundation Models for Efficient Training of Small Task specific Models
Published at: 6/4/24, 2:11 PM
Neural Mechanics Symmetry and Broken Conservation Laws in Deep Learning Dynamics
Published at: 6/4/24, 2:11 PM
On the Symmetries of Deep Learning Models and their Internal Representations
Published at: 6/4/24, 2:11 PM
OpenELM An Efficient Language Model Family with Open source Training and Inference Framework
Published at: 6/4/24, 2:11 PM
Relaxed Octahedral Group Convolution for Learning Symmetry Breaking in 3D Physical Systems
Published at: 6/4/24, 2:11 PM
Vision Mamba Efficient Visual Representation Learning with Bidirectional State Space Model
Published at: 6/4/24, 2:11 PM
Scaling (Down) CLIP A Comprehensive Analysis of Data, Architecture, and Training Strategies
Published at: 4/15/24, 7:15 AM
An Investigation into Neural Net Optimization via Hessian Eigenvalue Density
Published at: 4/13/24, 8:57 PM
An image is worth 16x16 words Transformers for image recognition at scale
Published at: 4/13/24, 8:57 PM
Approximation Generalization Trade offs under (Approximate) Group Equivariance
Published at: 4/13/24, 8:57 PM
ConViT Improving Vision Transformers with Soft Convolutional Inductive Biases
Published at: 4/13/24, 8:57 PM
Fast, Expressive SE(n) Equivariant Networks through Weight Sharing in Position Orientation Space
Published at: 4/13/24, 8:57 PM
MobileViT light weight, general purpose, and mobile friendly vision transformer
Published at: 4/13/24, 8:57 PM
Relaxing Equivariance Constraints with Non stationary Continuous Filters
Published at: 4/13/24, 8:57 PM
Self Supervised Detection of Perfect and Partial Input Dependent Symmetries
Published at: 4/13/24, 8:57 PM
Exploiting Redundancy Separable Group Convolutional Networks on Lie Groups
Published at: 4/11/24, 3:52 PM