Residual stream

Published at: 8/11/24, 2:26 PM

A Mathematical Framework for Transformer Circuits

Published at: 8/11/24, 2:26 PM

Revealing the Utilized Rank of Subspaces of Learning in Neural Networks

Published at: 8/11/24, 2:26 PM

Apple Intelligence Foundation Language Models

Published at: 8/11/24, 12:21 PM

Memorization Through the Lens of Curvature of Loss Function Around Samples

Published at: 8/11/24, 12:21 PM

Linear Quantization

Published at: 8/2/24, 6:49 PM

Neural Network Quantization

Published at: 8/2/24, 6:49 PM

Positive Logic Programs

Published at: 8/2/24, 6:49 PM

AWQ Activation aware Weight Quantization for LLM Compression and Acceleration

Published at: 8/2/24, 6:49 PM

Exact Conversion of In Context Learning to Model Weights in Linearized Attention Transformers

Published at: 8/2/24, 6:49 PM

Are less inductive biases better or worse?

Published at: 7/2/24, 11:31 AM

Masked Image Modelling

Published at: 7/2/24, 11:31 AM

Non translationally equivariant convolutions

Published at: 7/2/24, 11:31 AM

CKConv Continuous Kernel Convolution For Sequential Data

Published at: 7/2/24, 11:31 AM

DINOv2 Learning Robust Visual Features without Supervision

Published at: 7/2/24, 11:31 AM

Emerging Properties in Self Supervised Vision Transformers

Published at: 7/2/24, 11:31 AM

FlexiViT One Model for All Patch Sizes

Published at: 7/2/24, 11:31 AM

Learning with Unmasked Tokens Drives Stronger Vision Learners

Published at: 7/2/24, 11:31 AM

Do Vision Foundation models exist?

Published at: 7/1/24, 3:29 PM

BoxeR Box Attention for 2D and 3D Transformers

Published at: 7/1/24, 3:29 PM

Hardware specific structured pruning

Published at: 6/15/24, 5:50 PM

Maximal pruning and functional recovery

Published at: 6/15/24, 5:50 PM

A Brief Review of Hypernetworks in Deep Learning

Published at: 6/15/24, 5:50 PM

An Image is Worth More Than 16x16 Patches Exploring Transformers on Individual Pixels

Published at: 6/15/24, 5:50 PM

Learning both Weights and Connections for Efficient Neural Networks

Published at: 6/15/24, 5:50 PM

MobileCLIP Fast Image Text Models through Multi Modal Reinforced Training

Published at: 6/15/24, 5:50 PM

Optimal Brain Damage

Published at: 6/15/24, 5:50 PM

Retrospective EIE Efficient Inference Engine on Sparse and Compressed Neural Network

Published at: 6/15/24, 5:50 PM

Bit Palettization

Published at: 6/13/24, 1:28 PM

Block Expansion

Published at: 6/13/24, 1:28 PM

Grokking

Published at: 6/13/24, 1:28 PM

K Means based Quantization

Published at: 6/13/24, 1:28 PM

KV Cache

Published at: 6/13/24, 1:28 PM

LoRA Adapter

Published at: 6/13/24, 1:28 PM

LoRA Low Rank Adaptation of Large Language Models

Published at: 6/13/24, 1:28 PM

Parameter Efficient Fine tuning of Self supervised ViTs without Catastrophic Forgetting

Published at: 6/13/24, 1:28 PM

Parameter Efficient Fine Tuning for Pre Trained Vision Models A Survey

Published at: 6/13/24, 1:28 PM

SAM CLIP Merging Vision Foundation Models towards Semantic and Spatial Understanding

Published at: 6/13/24, 1:28 PM

Simultaneous linear connectivity of neural networks modulo permutation

Published at: 6/13/24, 1:28 PM

Surgical Fine Tuning Improves Adaptation to Distribution Shifts

Published at: 6/13/24, 1:28 PM

Discovering Symmetry Breaking in Physical Systems with Relaxed Group Convolution

Published at: 6/5/24, 12:46 PM

Optimization Dynamics of Equivariant and Augmented Neural Networks

Published at: 6/5/24, 12:46 PM

Understanding Deep Learning Chapter 10

Published at: 6/5/24, 12:46 PM

Representation (Group Theory)

Published at: 6/4/24, 2:11 PM

A ConvNet for the 2020s

Published at: 6/4/24, 2:11 PM

A Hierarchy of Graph Neural Networks Based on Learnable Local Features

Published at: 6/4/24, 2:11 PM

A general theory of correct, incorrect, and extrinsic equivariance

Published at: 6/4/24, 2:11 PM

Approximately equivariant networks for imperfectly symmetric dynamics

Published at: 6/4/24, 2:11 PM

Deep Learning Book

Published at: 6/4/24, 2:11 PM

Efficient Modulation for Vision Networks

Published at: 6/4/24, 2:11 PM

Mamba Linear Time Sequence Modeling with Selective State Spaces

Published at: 4/13/24, 8:57 PM

MobileViT light weight, general purpose, and mobile friendly vision transformer

Published at: 4/13/24, 8:57 PM

On the Relationship between Self Attention and Convolutional Layers

Published at: 4/13/24, 8:57 PM

Relaxing Equivariance Constraints with Non stationary Continuous Filters

Published at: 4/13/24, 8:57 PM

Self Supervised Detection of Perfect and Partial Input Dependent Symmetries

Published at: 4/13/24, 8:57 PM

Stand Alone Self Attention in Vision Models

Published at: 4/13/24, 8:57 PM

Convergence rate and Hessian spectra

Published at: 4/11/24, 3:52 PM

Depthwise separable convolutions

Published at: 4/11/24, 3:52 PM

Group Axioms

Published at: 4/11/24, 3:52 PM

Group direct product

Published at: 4/11/24, 3:52 PM

Total 137 posts.