Timezone: Europe/Vienna

SUN 18 JUL
2 p.m.
Expo Workshop:
(ends 6:00 PM)
3 p.m.
6 p.m.
Expo Demonstration:
(ends 7:00 PM)
7 p.m.
Expo Talk Panel:
(ends 8:00 PM)

TUE 20 JUL
2 a.m.
3 a.m.
Affinity Poster Session:
(ends 5:00 AM)
5 a.m.
Tutorial:
(ends 7:45 AM)
2 p.m.
Orals 2:00-2:20
[2:00] BORE: Bayesian Optimization by Density-Ratio Estimation
Spotlights 2:20-2:45
[2:20] AutoSampling: Search for Effective Data Sampling Schedules
[2:25] HardCoRe-NAS: Hard Constrained diffeRentiable Neural Architecture Search
[2:30] Bias-Robust Bayesian Optimization via Dueling Bandits
[2:35] Zeroth-Order Non-Convex Learning via Hierarchical Dual Averaging
[2:40] Sparsifying Networks via Subdifferential Inclusion
Q&As 2:45-2:50
[2:45] Q&A
(ends 3:00 PM)
Orals 2:00-2:20
[2:00] Relative Positional Encoding for Transformers with Linear Complexity
Spotlights 2:20-2:50
[2:20] A Free Lunch From ANN: Towards Efficient, Accurate Spiking Neural Networks Calibration
[2:25] A Unified Lottery Ticket Hypothesis for Graph Neural Networks
[2:30] Generative Adversarial Transformers
[2:35] Evolving Attention with Residual Convolutions
[2:40] Zoo-Tuning: Adaptive Transfer from A Zoo of Models
[2:45] UnICORNN: A recurrent model for learning very long time dependencies
Q&As 2:50-2:55
[2:50] Q&A
(ends 3:00 PM)
Orals 2:00-2:20
[2:00] Attention is not all you need: pure attention loses rank doubly exponentially with depth
Spotlights 2:20-2:50
[2:20] Navigation Turing Test (NTT): Learning to Evaluate Human-Like Navigation
[2:25] Efficient Generative Modelling of Protein Structure Fragments using a Deep Markov Model
[2:30] Exploiting structured data for learning contagious diseases under incomplete testing
[2:35] Strategic Classification Made Practical
[2:40] Large-Margin Contrastive Learning with Distance Polarization Regularizer
[2:45] SPADE: A Spectral Method for Black-Box Adversarial Robustness Evaluation
Q&As 2:50-2:55
[2:50] Q&A
(ends 3:00 PM)
Orals 2:00-2:20
[2:00] Size-Invariant Graph Representations for Graph Classification Extrapolations
Spotlights 2:20-2:50
[2:20] Consistent Nonparametric Methods for Network Assisted Covariate Estimation
[2:25] Explainable Automated Graph Representation Learning with Hyperparameter Importance
[2:30] Breaking the Limits of Message Passing Graph Neural Networks
[2:35] From Local Structures to Size Generalization in Graph Neural Networks
[2:40] Interpretable Stability Bounds for Spectral Graph Filters
[2:45] Learning Node Representations Using Stationary Flow Prediction on Large Payment and Cash Transaction Networks
Q&As 2:50-2:55
[2:50] Q&A
(ends 3:00 PM)
Orals 2:00-2:20
[2:00] Scalable Evaluation of Multi-Agent Reinforcement Learning with Melting Pot
Spotlights 2:20-2:45
[2:20] UneVEn: Universal Value Exploration for Multi-Agent Reinforcement Learning
[2:25] A Policy Gradient Algorithm for Learning to Learn in Multiagent Reinforcement Learning
[2:30] Multi-Agent Training beyond Zero-Sum with Correlated Equilibrium Meta-Solvers
[2:35] PC-MLP: Model-based Reinforcement Learning with Policy Cover Guided Exploration
[2:40] Imitation by Predicting Observations
Q&As 2:45-2:50
[2:45] Q&A
(ends 3:00 PM)
Orals 2:00-2:20
[2:00] Scalable Computations of Wasserstein Barycenter via Input Convex Neural Networks
Spotlights 2:20-2:50
[2:20] Outlier-Robust Optimal Transport
[2:25] Dataset Dynamics via Gradient Flows in Probability Space
[2:30] Sliced Iterative Normalizing Flows
[2:35] Low-Rank Sinkhorn Factorization
[2:40] Unbalanced minibatch Optimal Transport; applications to Domain Adaptation
[2:45] Making transport more robust and interpretable by moving data through a small number of anchor points
Q&As 2:50-2:55
[2:50] Q&A
(ends 3:00 PM)
Orals 2:00-2:20
[2:00] Optimal Complexity in Decentralized Training
Spotlights 2:20-2:50
[2:20] Stochastic Sign Descent Methods: New Algorithms and Better Theory
[2:25] Bias-Variance Reduced Local SGD for Less Heterogeneous Federated Learning
[2:30] A Hybrid Variance-Reduced Method for Decentralized Stochastic Non-Convex Optimization
[2:35] Asynchronous Decentralized Optimization With Implicit Stochastic Variance Reduction
[2:40] Newton Method over Networks is Fast up to the Statistical Precision
[2:45] Federated Learning under Arbitrary Communication Patterns
Q&As 2:50-2:55
[2:50] Q&A
(ends 3:00 PM)
Orals 2:00-2:20
[2:00] Phasic Policy Gradient
Spotlights 2:20-2:50
[2:20] Reinforcement Learning with Prototypical Representations
[2:25] Diversity Actor-Critic: Sample-Aware Entropy Regularization for Sample-Efficient Exploration
[2:30] Muesli: Combining Improvements in Policy Optimization
[2:35] Unsupervised Learning of Visual 3D Keypoints for Control
[2:40] Learning Task Informed Abstractions
[2:45] State Entropy Maximization with Random Encoders for Efficient Exploration
Q&As 2:50-2:55
[2:50] Q&A
(ends 3:00 PM)
Orals 2:00-2:20
[2:00] Deeply-Debiased Off-Policy Interval Estimation
Spotlights 2:20-2:45
[2:20] Offline Contextual Bandits with Overparameterized Models
[2:25] Demonstration-Conditioned Reinforcement Learning for Few-Shot Imitation
[2:30] A New Representation of Successor Features for Transfer across Dissimilar Environments
[2:35] Preferential Temporal Difference Learning
[2:40] On the Optimality of Batch Policy Optimization Algorithms
Q&As 2:45-2:50
[2:45] Q&A
(ends 3:00 PM)
3 p.m.
Orals 3:00-3:20
[3:00] Neural Architecture Search without Training
Spotlights 3:20-3:50
[3:20] Is Space-Time Attention All You Need for Video Understanding?
[3:25] A Probabilistic Approach to Neural Network Pruning
[3:30] KNAS: Green Neural Architecture Search
[3:35] Efficient Lottery Ticket Finding: Less Data is More
[3:40] ConViT: Improving Vision Transformers with Soft Convolutional Inductive Biases
[3:45] Provably Strict Generalisation Benefit for Equivariant Models
Q&As 3:50-3:55
[3:50] Q&A
(ends 4:00 PM)
Orals 3:00-3:20
[3:00] Leveraging Sparse Linear Layers for Debuggable Deep Networks
Spotlights 3:20-3:50
[3:20] Voice2Series: Reprogramming Acoustic Models for Time Series Classification
[3:25] Self-Tuning for Data-Efficient Deep Learning
[3:30] How Framelets Enhance Graph Neural Networks
[3:35] Federated Continual Learning with Weighted Inter-client Transfer
[3:40] Self Normalizing Flows
[3:45] Loss Surface Simplexes for Mode Connecting Volumes and Fast Ensembling
Q&As 3:50-3:55
[3:50] Q&A
(ends 4:00 PM)
Orals 3:00-3:20
[3:00] What Are Bayesian Neural Network Posteriors Really Like?
Spotlights 3:20-3:50
[3:20] Scalable Marginal Likelihood Estimation for Model Selection in Deep Learning
[3:25] Amortized Conditional Normalized Maximum Likelihood: Reliable Out of Distribution Uncertainty Estimation
[3:30] Deep kernel processes
[3:35] Global inducing point variational posteriors for Bayesian neural networks and deep Gaussian processes
[3:40] Bayesian Deep Learning via Subnetwork Inference
[3:45] Generative Particle Variational Inference via Estimation of Functional Gradients
Q&As 3:50-3:55
[3:50] Q&A
(ends 4:00 PM)
Orals 3:00-3:20
[3:00] Principled Simplicial Neural Networks for Trajectory Prediction
Spotlights 3:20-3:50
[3:20] Efficient Differentiable Simulation of Articulated Bodies
[3:25] On Monotonic Linear Interpolation of Neural Network Parameters
[3:30] Connecting Sphere Manifolds Hierarchically for Regularization
[3:35] Controlling Graph Dynamics with Reinforcement Learning and Graph Neural Networks
[3:40] Thinking Like Transformers
[3:45] Federated Learning of User Verification Models Without Sharing Embeddings
Q&As 3:50-3:55
[3:50] Q&A
(ends 4:00 PM)
Orals 3:00-3:20
[3:00] Oops I Took A Gradient: Scalable Sampling for Discrete Distributions
Spotlights 3:20-3:50
[3:20] Multiscale Invertible Generative Networks for High-Dimensional Bayesian Inference
[3:25] GraphDF: A Discrete Flow Model for Molecular Graph Generation
[3:30] Hierarchical VAEs Know What They Don’t Know
[3:35] Order Matters: Probabilistic Modeling of Node Sequence for Graph Generation
[3:40] Generative Video Transformer: Can Objects be the Words?
[3:45] Poisson-Randomised DirBN: Large Mutation is Needed in Dirichlet Belief Networks
Q&As 3:50-3:55
[3:50] Q&A
(ends 4:00 PM)
Orals 3:00-3:20
[3:00] Let's Agree to Degree: Comparing Graph Convolutional Networks in the Message-Passing Framework
Spotlights 3:20-3:50
[3:20] Fundamental Tradeoffs in Distributionally Adversarial Training
[3:25] Towards Understanding Learning in Neural Networks with Linear Teachers
[3:30] Continual Learning in the Teacher-Student Setup: Impact of Task Similarity
[3:35] A Functional Perspective on Learning Symmetric Functions with Neural Networks
[3:40] Weisfeiler and Lehman Go Topological: Message Passing Simplicial Networks
[3:45] On the Random Conjugate Kernel and Neural Tangent Kernel
Q&As 3:50-3:55
[3:50] Q&A
(ends 4:00 PM)
Orals 3:00-3:20
[3:00] PAGE: A Simple and Optimal Probabilistic Gradient Estimator for Nonconvex Optimization
Spotlights 3:20-3:50
[3:20] Projection Robust Wasserstein Barycenters
[3:25] Efficient Message Passing for 0–1 ILPs with Binary Decision Diagrams
[3:30] Distributionally Robust Optimization with Markovian Data
[3:35] Acceleration via Fractal Learning Rate Schedules
[3:40] A Novel Sequential Coreset Method for Gradient Descent Algorithms
[3:45] Scalable Optimal Transport in High Dimensions for Graph Distances, Embedding Alignment, and More
Q&As 3:50-3:55
[3:50] Q&A
(ends 4:00 PM)
Orals 3:00-3:20
[3:00] Variance Reduction via Primal-Dual Accelerated Dual Averaging for Nonsmooth Convex Finite-Sums
Spotlights 3:20-3:50
[3:20] Dueling Convex Optimization
[3:25] Global Optimality Beyond Two Layers: Training Deep ReLU Networks via Convex Programs
[3:30] Parameter-free Locally Accelerated Conditional Gradients
[3:35] Principal Component Hierarchy for Sparse Quadratic Programs
[3:40] One-sided Frank-Wolfe algorithms for saddle problems
[3:45] ConvexVST: A Convex Optimization Approach to Variance-stabilizing Transformation
Q&As 3:50-3:55
[3:50] Q&A
(ends 4:00 PM)
Orals 3:00-3:20
[3:00] Risk-Sensitive Reinforcement Learning with Function Approximation: A Debiasing Approach
Spotlights 3:20-3:50
[3:20] Model-Free Reinforcement Learning: from Clipped Pseudo-Regret to Sample Complexity
[3:25] Neuro-algorithmic Policies Enable Fast Combinatorial Generalization
[3:30] PID Accelerated Value Iteration Algorithm
[3:35] Provably Efficient Learning of Transferable Rewards
[3:40] Reinforcement Learning for Cost-Aware Markov Decision Processes
[3:45] Value Alignment Verification
Q&As 3:50-3:55
[3:50] Q&A
(ends 4:00 PM)
4 p.m.
Orals 4:00-4:20
[4:00] OmniNet: Omnidirectional Representations from Transformers
Spotlights 4:20-4:45
[4:20] Boosting the Throughput and Accelerator Utilization of Specialized CNN Inference Beyond Increasing Batch Size
[4:25] E(n) Equivariant Graph Neural Networks
[4:30] Grid-Functioned Neural Networks
[4:35] MSA Transformer
[4:40] Parallelizing Legendre Memory Unit Training
Q&As 4:45-4:50
[4:45] Q&A
(ends 5:00 PM)
Orals 4:00-4:20
[4:00] ViLT: Vision-and-Language Transformer Without Convolution or Region Supervision
Spotlights 4:20-4:50
[4:20] Learning Curves for Analysis of Deep Networks
[4:25] GLSearch: Maximum Common Subgraph Detection via Learning to Search
[4:30] Learning Intra-Batch Connections for Deep Metric Learning
[4:35] Simultaneous Similarity-based Self-Distillation for Deep Metric Learning
[4:40] Unifying Vision-and-Language Tasks via Text Generation
[4:45] DeepWalking Backwards: From Embeddings Back to Graphs
Q&As 4:50-4:55
[4:50] Q&A
(ends 5:00 PM)
Orals 4:00-4:20
[4:00] Spectral Smoothing Unveils Phase Transitions in Hierarchical Variational Autoencoders
Spotlights 4:20-4:50
[4:20] Riemannian Convex Potential Maps
[4:25] Autoencoding Under Normalization Constraints
[4:30] PixelTransformer: Sample Conditioned Signal Generation
[4:35] Generative Adversarial Networks for Markovian Temporal Dynamics: Stochastic Continuous Data Generation
[4:40] Autoencoder Image Interpolation by Shaping the Latent Space
[4:45] Improved Denoising Diffusion Probabilistic Models
Q&As 4:50-4:55
[4:50] Q&A
(ends 5:00 PM)
Orals 4:00-4:20
[4:00] Directional Graph Networks
Spotlights 4:20-4:50
[4:20] Winograd Algorithm for AdderNet
[4:25] LieTransformer: Equivariant Self-Attention for Lie Groups
[4:30] "Hey, that's not an ODE": Faster ODE Adjoints via Seminorms
[4:35] Graph Mixture Density Networks
[4:40] Momentum Residual Neural Networks
[4:45] Better Training using Weight-Constrained Stochastic Dynamics
Q&As 4:50-4:55
[4:50] Q&A
(ends 5:00 PM)
Orals 4:00-4:20
[4:00] Coach-Player Multi-agent Reinforcement Learning for Dynamic Team Composition
Spotlights 4:20-4:50
[4:20] Tesseract: Tensorised Actors for Multi-Agent Reinforcement Learning
[4:25] A New Formalism, Method and Open Issues for Zero-Shot Coordination
[4:30] Targeted Data Acquisition for Evolving Negotiation Agents
[4:35] Inverse Constrained Reinforcement Learning
[4:40] Counterfactual Credit Assignment in Model-Free Reinforcement Learning
[4:45] Interactive Learning from Activity Description
Q&As 4:50-4:55
[4:50] Q&A
(ends 5:00 PM)
Orals 4:00-4:20
[4:00] Stability and Convergence of Stochastic Gradient Clipping: Beyond Lipschitz Continuity and Smoothness
Spotlights 4:20-4:50
[4:20] Positive-Negative Momentum: Manipulating Stochastic Gradient Noise to Improve Generalization
[4:25] Variational Data Assimilation with a Learned Inverse Observation Operator
[4:30] Fast Projection Onto Convex Smooth Constraints
[4:35] Decomposable Submodular Function Minimization via Maximum Flow
[4:40] Multiplicative Noise and Heavy Tails in Stochastic Optimization
[4:45] Distributed Second Order Methods with Fast Rates and Compressed Communication
Q&As 4:50-4:55
[4:50] Q&A
(ends 5:00 PM)
Orals 4:00-4:20
[4:00] Not All Memories are Created Equal: Learning to Forget by Expiring
Spotlights 4:20-4:50
[4:20] Learning Bounds for Open-Set Learning
[4:25] Perceiver: General Perception with Iterative Attention
[4:30] Synthesizer: Rethinking Self-Attention for Transformer Models
[4:35] Slot Machines: Discovering Winning Combinations of Random Weights in Neural Networks
[4:40] What's in the Box? Exploring the Inner Life of Neural Networks with Robust Rules
[4:45] Neural-Pull: Learning Signed Distance Function from Point clouds by Learning to Pull Space onto Surface
Q&As 4:50-4:55
[4:50] Q&A
(ends 5:00 PM)
Orals 4:00-4:20
[4:00] World Model as a Graph: Learning Latent Landmarks for Planning
Spotlights 4:20-4:45
[4:20] Revisiting Rainbow: Promoting more insightful and inclusive deep reinforcement learning research
[4:25] Deep Reinforcement Learning amidst Continual Structured Non-Stationarity
[4:30] Offline Reinforcement Learning with Pseudometric Learning
[4:35] EMaQ: Expected-Max Q-Learning Operator for Simple Yet Effective Offline and Online RL
[4:40] Decision-Making Under Selective Labels: Optimal Finite-Domain Policies and Beyond
Q&As 4:45-4:50
[4:45] Q&A
(ends 5:00 PM)
Orals 4:00-4:20
[4:00] Skill Discovery for Exploration and Planning using Deep Skill Graphs
Spotlights 4:20-4:50
[4:20] Learning Routines for Effective Off-Policy Reinforcement Learning
[4:25] PODS: Policy Optimization via Differentiable Simulation
[4:30] Learning and Planning in Complex Action Spaces
[4:35] Model-Based Reinforcement Learning via Latent-Space Collocation
[4:40] Vector Quantized Models for Planning
[4:45] LTL2Action: Generalizing LTL Instructions for Multi-Task RL
Q&As 4:50-4:55
[4:50] Q&A
(ends 5:00 PM)
5 p.m.
Affinity Workshop:
(ends 2:00 AM)
Invited Talk:
Daphne Koller
(ends 6:00 PM)
6 p.m.
Posters 6:00-8:00
(ends 8:00 PM)
8 p.m.
Town Hall:
(ends 9:00 PM)

WED 21 JUL
2 a.m.
Orals 2:00-2:20
[2:00] A Tale of Two Efficient and Informative Negative Sampling Distributions
Spotlights 2:20-2:50
[2:20] TeraPipe: Token-Level Pipeline Parallelism for Training Large-Scale Language Models
[2:25] Quantization Algorithms for Random Fourier Features
[2:30] Rethinking Neural vs. Matrix-Factorization Collaborative Filtering: the Theoretical Perspectives
[2:35] Concentric mixtures of Mallows models for top-$k$ rankings: sampling and identifiability
[2:40] Heterogeneity for the Win: One-Shot Federated Clustering
[2:45] Cross-Gradient Aggregation for Decentralized Learning from Non-IID Data
Q&As 2:50-2:55
[2:50] Q&A
(ends 3:00 AM)
Orals 2:00-2:20
[2:00] A Practical Method for Constructing Equivariant Multilayer Perceptrons for Arbitrary Matrix Groups
Spotlights 2:20-2:50
[2:20] Accelerate CNNs from Three Dimensions: A Comprehensive Pruning Framework
[2:25] The Earth Mover's Pinball Loss: Quantiles for Histogram-Valued Regression
[2:30] Signatured Deep Fictitious Play for Mean Field Games with Common Noise
[2:35] Equivariant message passing for the prediction of tensorial properties and molecular spectra
[2:40] Improving Breadth-Wise Backpropagation in Graph Neural Networks Helps Learning Long-Range Dependencies
[2:45] LARNet: Lie Algebra Residual Network for Face Recognition
Q&As 2:50-2:55
[2:50] Q&A
(ends 3:00 AM)
Orals 2:00-2:20
[2:00] PsiPhi-Learning: Reinforcement Learning with Demonstrations using Successor Features and Inverse Temporal Difference Learning
Spotlights 2:20-2:50
[2:20] Safe Reinforcement Learning with Linear Function Approximation
[2:25] Shortest-Path Constrained Reinforcement Learning for Sparse Reward Tasks
[2:30] Offline Reinforcement Learning with Fisher Divergence Critic Regularization
[2:35] Recomposing the Reinforcement Learning Building Blocks with Hypernetworks
[2:40] OptiDICE: Offline Policy Optimization via Stationary Distribution Correction Estimation
[2:45] Discovering symbolic policies with deep reinforcement learning
Q&As 2:50-2:55
[2:50] Q&A
(ends 3:00 AM)
Orals 2:00-2:20
[2:00] Characterizing Structural Regularities of Labeled Data in Overparameterized Models
Spotlights 2:20-2:50
[2:20] Stabilizing Equilibrium Models by Jacobian Regularization
[2:25] On the Predictability of Pruning Across Scales
[2:30] Lottery Ticket Preserves Weight Correlation: Is It Desirable or Not?
[2:35] LIME: Learning Inductive Bias for Primitives of Mathematical Reasoning
[2:40] Dense for the Price of Sparse: Improved Performance of Sparsely Initialized Networks via a Subspace Offset
[2:45] Learning Neural Network Subspaces
Q&As 2:50-2:55
[2:50] Q&A
(ends 3:00 AM)
Orals 2:00-2:20
[2:00] On the price of explainability for some clustering problems
Spotlights 2:20-2:50
[2:20] Instance Specific Approximations for Submodular Maximization
[2:25] Adapting to Delays and Data in Adversarial Multi-Armed Bandits
[2:30] Structured Convolutional Kernel Networks for Airline Crew Scheduling
[2:35] Online Graph Dictionary Learning
[2:40] Stochastic Iterative Graph Matching
[2:45] Training Quantized Neural Networks to Global Optimality via Semidefinite Programming
Q&As 2:50-2:55
[2:50] Q&A
(ends 3:00 AM)
Orals 2:00-2:20
[2:00] Robust Asymmetric Learning in POMDPs
Spotlights 2:20-2:50
[2:20] Differentiable Spatial Planning using Transformers
[2:25] Convex Regularization in Monte-Carlo Tree Search
[2:30] On-Policy Deep Reinforcement Learning for the Average-Reward Criterion
[2:35] Multi-Task Reinforcement Learning with Context-based Representations
[2:40] High Confidence Generalization for Reinforcement Learning
[2:45] Locally Persistent Exploration in Continuous Control Tasks with Sparse Rewards
Q&As 2:50-2:55
[2:50] Q&A
(ends 3:00 AM)
Orals 2:00-2:20
[2:00] Randomized Entity-wise Factorization for Multi-Agent Reinforcement Learning
Spotlights 2:20-2:50
[2:20] Re-understanding Finite-State Representations of Recurrent Policy Networks
[2:25] Emergent Social Learning via Multi-agent Reinforcement Learning
[2:30] From Poincaré Recurrence to Convergence in Imperfect Information Games: Finding Equilibrium via Regularization
[2:35] Actionable Models: Unsupervised Offline Reinforcement Learning of Robotic Skills
[2:40] Trajectory Diversity for Zero-Shot Coordination
[2:45] FOP: Factorizing Optimal Joint Policy of Maximum-Entropy Multi-Agent Reinforcement Learning
Q&As 2:50-2:55
[2:50] Q&A
(ends 3:00 AM)
Orals 2:00-2:20
[2:00] NeRF-VAE: A Geometry Aware 3D Scene Generative Model
Spotlights 2:20-2:50
[2:20] Quantitative Understanding of VAE as a Non-linearly Scaled Isometric Embedding
[2:25] Soft then Hard: Rethinking the Quantization in Neural Image Compression
[2:30] Improved Contrastive Divergence Training of Energy-Based Models
[2:35] Deep Generative Learning via Schrödinger Bridge
[2:40] Partially Observed Exchangeable Modeling
[2:45] Understanding Failures in Out-of-Distribution Detection with Deep Generative Models
Q&As 2:50-2:55
[2:50] Q&A
(ends 3:00 AM)
Orals 2:00-2:20
[2:00] CATE: Computation-aware Neural Architecture Encoding with Transformers
Spotlights 2:20-2:50
[2:20] What Does Rotation Prediction Tell Us about Classifier Accuracy under Varying Testing Environments?
[2:25] Towards Domain-Agnostic Contrastive Learning
[2:30] Joining datasets via data augmentation in the label space for neural networks
[2:35] Differentiable Sorting Networks for Scalable Sorting and Ranking Supervision
[2:40] Meta-StyleSpeech: Multi-Speaker Adaptive Text-to-Speech Generation
[2:45] Poolingformer: Long Document Modeling with Pooling Attention
Q&As 2:50-2:55
[2:50] Q&A
(ends 3:00 AM)
3 a.m.
Orals 3:00-3:20
[3:00] Network Inference and Influence Maximization from Samples
Spotlights 3:20-3:50
[3:20] Regularized Submodular Maximization at Scale
[3:25] Marginal Contribution Feature Importance - an Axiomatic Approach for Explaining Data
[3:30] Connecting Interpretability and Robustness in Decision Trees through Separation
[3:35] Light RUMs
[3:40] Submodular Maximization subject to a Knapsack Constraint: Combinatorial Algorithms with Near-optimal Adaptive Complexity
[3:45] CombOptNet: Fit the Right NP-Hard Problem by Learning Integer Programming Constraints
Q&As 3:50-3:55
[3:50] Q&A
(ends 4:00 AM)
Orals 3:00-3:20
[3:00] A Wasserstein Minimax Framework for Mixed Linear Regression
Spotlights 3:20-3:50
[3:20] Weight-covariance alignment for adversarially robust neural networks
[3:25] Rethinking Rotated Object Detection with Gaussian Wasserstein Distance Loss
[3:30] Communication-Efficient Distributed SVD via Local Power Iterations
[3:35] A Riemannian Block Coordinate Descent Method for Computing the Projection Robust Wasserstein Distance
[3:40] Probabilistic Sequential Shrinking: A Best Arm Identification Algorithm for Stochastic Bandits with Corruptions
[3:45] Leveraging Language to Learn Program Abstractions and Search Heuristics
Q&As 3:50-3:55
[3:50] Q&A
(ends 4:00 AM)
Orals 3:00-3:20
[3:00] Decoupling Value and Policy for Generalization in Reinforcement Learning
Spotlights 3:20-3:50
[3:20] Prioritized Level Replay
[3:25] SECANT: Self-Expert Cloning for Zero-Shot Generalization of Visual Policies
[3:30] GMAC: A Distributional Perspective on Actor-Critic Framework
[3:35] Goal-Conditioned Reinforcement Learning with Imagined Subgoals
[3:40] Policy Gradient Bayesian Robust Optimization for Imitation Learning
[3:45] Reinforcement Learning of Implicit and Explicit Control Flow Instructions
Q&As 3:50-3:55
[3:50] Q&A
(ends 4:00 AM)
Orals 3:00-3:20
[3:00] Unbiased Gradient Estimation in Unrolled Computation Graphs with Persistent Evolution Strategies
Spotlights 3:20-3:45
[3:20] EfficientNetV2: Smaller Models and Faster Training
[3:25] Causal Curiosity: RL Agents Discovering Self-supervised Experiments for Causal Representation Learning
[3:30] LAMDA: Label Matching Deep Domain Adaptation
[3:35] Temporally Correlated Task Scheduling for Sequence Learning
[3:40] Information Obfuscation of Graph Neural Networks
Q&As 3:45-3:50
[3:45] Q&A
(ends 4:00 AM)
Spotlights 3:00-3:15
[3:00] iDARTS: Differentiable Architecture Search with Stochastic Implicit Gradients
[3:05] Accurate Post Training Quantization With Small Calibration Sets
[3:10] Optimal Transport Kernels for Sequential and Parallel Neural Architecture Search
Orals 3:15-3:35
[3:15] Few-Shot Neural Architecture Search
Spotlights 3:35-3:50
[3:35] AutoAttend: Automated Attention Representation Search
[3:40] Think Global and Act Local: Bayesian Optimisation over High-Dimensional Categorical and Mixed Search Spaces
[3:45] Auto-NBA: Efficient and Effective Search Over the Joint Space of Networks, Bitwidths, and Accelerators
Q&As 3:50-3:55
[3:50] Q&A
(ends 4:00 AM)
Orals 3:00-3:20
[3:00] The Emergence of Individuality
Spotlights 3:20-3:45
[3:20] DFAC Framework: Factorizing the Value Function via Quantile Mixture for Multi-Agent Distributional Q-Learning
[3:25] From Local to Global Norm Emergence: Dissolving Self-reinforcing Substructures with Incremental Social Instruments
[3:30] Learning While Playing in Mean-Field Games: Convergence and Optimality
[3:35] Learning Fair Policies in Decentralized Cooperative Multi-Agent Reinforcement Learning
[3:40] Augmented World Models Facilitate Zero-Shot Dynamics Generalization From a Single Offline Environment
Q&As 3:45-3:50
[3:45] Q&A
(ends 4:00 AM)
Orals 3:00-3:20
[3:00] PEBBLE: Feedback-Efficient Interactive Reinforcement Learning via Relabeling Experience and Unsupervised Pre-training
Spotlights 3:20-3:50
[3:20] Uncertainty Weighted Actor-Critic for Offline Reinforcement Learning
[3:25] Keyframe-Focused Visual Imitation Learning
[3:30] Learning and Planning in Average-Reward Markov Decision Processes
[3:35] Towards Better Laplacian Representation in Reinforcement Learning with Generalized Graph Drawing
[3:40] Low-Precision Reinforcement Learning: Running Soft Actor-Critic in Half Precision
[3:45] Emphatic Algorithms for Deep Reinforcement Learning
Q&As 3:50-3:55
[3:50] Q&A
(ends 4:00 AM)
Orals 3:00-3:20
[3:00] The Power of Adaptivity for Stochastic Submodular Cover
Spotlights 3:20-3:50
[3:20] The Heavy-Tail Phenomenon in SGD
[3:25] Federated Composite Optimization
[3:30] On Estimation in Latent Variable Models
[3:35] Asynchronous Distributed Learning: Adapting to Gradient Delays without Prior Knowledge
[3:40] Randomized Algorithms for Submodular Function Maximization with a $k$-System Constraint
[3:45] BASGD: Buffered Asynchronous SGD for Byzantine Learning
Q&As 3:50-3:55
[3:50] Q&A
(ends 4:00 AM)
Orals 3:00-3:20
[3:00] Generating images with sparse representations
Spotlights 3:20-3:50
[3:20] An Identifiable Double VAE For Disentangled Representations
[3:25] A Unified Generative Adversarial Network Training via Self-Labeling and Self-Attention
[3:30] On Characterizing GAN Convergence Through Proximal Duality Gap
[3:35] Scalable Normalizing Flows for Permutation Invariant Densities
[3:40] Parallel and Flexible Sampling from Autoregressive Models via Langevin Dynamics
[3:45] Zero-Shot Text-to-Image Generation
Q&As 3:50-3:55
[3:50] Q&A
(ends 4:00 AM)
4 a.m.
Orals 4:00-4:20
[4:00] Cooperative Exploration for Multi-Agent Deep Reinforcement Learning
Spotlights 4:20-4:50
[4:20] A Deep Reinforcement Learning Approach to Marginalized Importance Sampling with the Successor Representation
[4:25] Learning to Weight Imperfect Demonstrations
[4:30] DouZero: Mastering DouDizhu with Self-Play Deep Reinforcement Learning
[4:35] MURAL: Meta-Learning Uncertainty-Aware Rewards for Outcome-Driven Reinforcement Learning
[4:40] RRL: Resnet as representation for Reinforcement Learning
[4:45] SCC: an efficient deep reinforcement learning agent mastering the game of StarCraft II
Q&As 4:50-4:55
[4:50] Q&A
(ends 5:00 AM)
Orals 4:00-4:20
[4:00] AlphaNet: Improved Training of Supernets with Alpha-Divergence
Spotlights 4:20-4:50
[4:20] Catformer: Designing Stable Transformers via Sensitivity Analysis
[4:25] A Receptor Skeleton for Capsule Neural Networks
[4:30] Explore Visual Concept Formation for Image Classification
[4:35] K-shot NAS: Learnable Weight-Sharing for NAS with K-shot Supernets
[4:40] High-Performance Large-Scale Image Recognition Without Normalization
[4:45] Lipschitz normalization for self-attention layers with application to graph neural networks
Q&As 4:50-4:55
[4:50] Q&A
(ends 5:00 AM)
Orals 4:00-4:20
[4:00] Accelerated Algorithms for Smooth Convex-Concave Minimax Problems with O(1/k^2) Rate on Squared Gradient Norm
Spotlights 4:20-4:50
[4:20] Communication-Efficient Distributed Optimization with Quantized Preconditioners
[4:25] Optimal regret algorithm for Pseudo-1d Bandit Convex Optimization
[4:30] Fast Stochastic Bregman Gradient Methods: Sharp Analysis and Variance Reduction
[4:35] Moreau-Yosida $f$-divergences
[4:40] Affine Invariant Analysis of Frank-Wolfe on Strongly Convex Sets
[4:45] On a Combination of Alternating Minimization and Nesterov's Momentum
Q&As 4:50-4:55
[4:50] Q&A
(ends 5:00 AM)
Orals 4:00-4:20
[4:00] Inverse Decision Modeling: Learning Interpretable Representations of Behavior
Spotlights 4:20-4:50
[4:20] On Proximal Policy Optimization's Heavy-tailed Gradients
[4:25] Policy Information Capacity: Information-Theoretic Measure for Task Complexity in Deep Reinforcement Learning
[4:30] Variational Empowerment as Representation Learning for Goal-Conditioned Reinforcement Learning
[4:35] Is Pessimism Provably Efficient for Offline RL?
[4:40] Beyond Variance Reduction: Understanding the True Impact of Baselines on Policy Optimization
[4:45] Density Constrained Reinforcement Learning
Q&As 4:50-4:55
[4:50] Q&A
(ends 5:00 AM)
Orals 4:00-4:20
[4:00] Sequential Domain Adaptation by Synthesizing Distributionally Robust Experts
Spotlights 4:20-4:50
[4:20] Oblivious Sketching-based Central Path Method for Linear Programming
[4:25] Bayesian Optimization over Hybrid Spaces
[4:30] Variational (Gradient) Estimate of the Score Function in Energy-based Latent Variable Models
[4:35] Compositional Video Synthesis with Action Graphs
[4:40] Neural Pharmacodynamic State Space Modeling
[4:45] Three Operator Splitting with a Nonconvex Loss Function
Q&As 4:50-4:55
[4:50] Q&A
(ends 5:00 AM)
Orals 4:00-4:20
[4:00] ActNN: Reducing Training Memory Footprint via 2-Bit Activation Compressed Training
Spotlights 4:20-4:50
[4:20] Householder Sketch for Accurate and Accelerated Least-Mean-Squares Solvers
[4:25] Accumulated Decoupled Learning with Gradient Staleness Mitigation for Convolutional Neural Networks
[4:30] Training Graph Neural Networks with 1000 Layers
[4:35] 1-bit Adam: Communication Efficient Large-Scale Training with Adam's Convergence Speed
[4:40] Federated Deep AUC Maximization for Hetergeneous Data with a Constant Communication Complexity
[4:45] Ditto: Fair and Robust Federated Learning Through Personalization
Q&As 4:50-4:55
[4:50] Q&A
(ends 5:00 AM)
Orals 4:00-4:20
[4:00] Out-of-Distribution Generalization via Risk Extrapolation (REx)
Spotlights 4:20-4:50
[4:20] What Makes for End-to-End Object Detection?
[4:25] On Explainability of Graph Neural Networks via Subgraph Explorations
[4:30] Just How Toxic is Data Poisoning? A Unified Benchmark for Backdoor and Data Poisoning Attacks
[4:35] Data Augmentation for Meta-Learning
[4:40] Understanding Invariance via Feedforward Inversion of Discriminatively Trained Classifiers
[4:45] Neural Symbolic Regression that scales
Q&As 4:50-4:55
[4:50] Q&A
(ends 5:00 AM)
Orals 4:00-4:20
[4:00] Hyperparameter Selection for Imitation Learning
Spotlights 4:20-4:50
[4:20] Revisiting Peng's Q($\lambda$) for Modern Reinforcement Learning
[4:25] Monotonic Robust Policy Optimization with Model Discrepancy
[4:30] Taylor Expansion of Discount Factors
[4:35] Generalizable Episodic Memory for Deep Reinforcement Learning
[4:40] Representation Matters: Offline Pretraining for Sequential Decision Making
[4:45] Reinforcement Learning Under Moral Uncertainty
Q&As 4:50-4:55
[4:50] Q&A
(ends 5:00 AM)
Orals 4:00-4:20
[4:00] Just Train Twice: Improving Group Robustness without Training Group Information
Spotlights 4:20-4:50
[4:20] Optimization of Graph Neural Networks: Implicit Acceleration by Skip Connections and More Depth
[4:25] GraphNorm: A Principled Approach to Accelerating Graph Neural Network Training
[4:30] A Bit More Bayesian: Domain-Invariant Learning with Uncertainty
[4:35] Neural Rough Differential Equations for Long Time Series
[4:40] Whitening and Second Order Optimization Both Make Information in the Dataset Unusable During Training, and Can Reduce or Prevent Generalization
[4:45] Data augmentation for deep learning based accelerated MRI reconstruction with limited data
Q&As 4:50-4:55
[4:50] Q&A
(ends 5:00 AM)
5 a.m.
Invited Talk:
Xiao Cunde · Qin Dahe
(ends 6:00 AM)
6 a.m.
Posters 6:00-8:00
(ends 8:00 AM)
2 p.m.
Orals 2:00-2:20
[2:00] Cross-domain Imitation from Observations
Spotlights 2:20-2:50
[2:20] SUNRISE: A Simple Unified Framework for Ensemble Learning in Deep Reinforcement Learning
[2:25] Decoupling Exploration and Exploitation for Meta-Reinforcement Learning without Sacrifices
[2:30] Active Feature Acquisition with Generative Surrogate Models
[2:35] Characterizing the Gap Between Actor-Critic and Policy Gradient
[2:40] Spectral Normalisation for Deep Reinforcement Learning: An Optimisation Perspective
[2:45] Accelerating Safe Reinforcement Learning with Constraint-mismatched Baseline Policies
Q&As 2:50-2:55
[2:50] Q&A
(ends 3:00 PM)
Orals 2:00-2:20
[2:00] Near Optimal Reward-Free Reinforcement Learning
Spotlights 2:20-2:50
[2:20] Batch Value-function Approximation with Only Realizability
[2:25] Adversarial Combinatorial Bandits with General Non-linear Reward Functions
[2:30] Model-Free and Model-Based Policy Evaluation when Causality is Uncertain
[2:35] Bootstrapping Fitted Q-Evaluation for Off-Policy Inference
[2:40] On Learnability via Gradient Method for Two-Layer ReLU Neural Networks in Teacher-Student Setting
[2:45] Spectral vertex sparsifiers and pair-wise spanners over distributed graphs
Q&As 2:50-2:55
[2:50] Q&A
(ends 3:00 PM)
Orals 2:00-2:20
[2:00] On Energy-Based Models with Overparametrized Shallow Neural Networks
Spotlights 2:20-2:50
[2:20] Uncertainty Principles of Encoding GANs
[2:25] On the Proof of Global Convergence of Gradient Descent for Deep ReLU Networks with Linear Widths
[2:30] Tight Bounds on the Smallest Eigenvalue of the Neural Tangent Kernel for Deep ReLU Networks
[2:35] Functional Space Analysis of Local GAN Convergence
[2:40] Exact Gap between Generalization Error and Uniform Convergence in Random Feature Models
[2:45] Provable Generalization of SGD-trained Neural Networks of Any Width in the Presence of Adversarial Label Noise
Q&As 2:50-2:55
[2:50] Q&A
(ends 3:00 PM)
Orals 2:00-2:20
[2:00] APS: Active Pretraining with Successor Features
Spotlights 2:20-2:50
[2:20] Guided Exploration with Proximal Policy Optimization using a Single Demonstration
[2:25] Self-Paced Context Evaluation for Contextual Reinforcement Learning
[2:30] Unsupervised Skill Discovery with Bottleneck Option Learning
[2:35] TeachMyAgent: a Benchmark for Automatic Curriculum Learning in Deep RL
[2:40] Grounding Language to Entities and Dynamics for Generalization in Reinforcement Learning
[2:45] Data-efficient Hindsight Off-policy Option Learning
Q&As 2:50-2:55
[2:50] Q&A
(ends 3:00 PM)
Orals 2:00-2:20
[2:00] The Limits of Min-Max Optimization Algorithms: Convergence to Spurious Non-Critical Sets
Spotlights 2:20-2:50
[2:20] Theory of Spectral Method for Union of Subspaces-Based Random Geometry Graph
[2:25] Approximating a Distribution Using Weight Queries
[2:30] Estimating $\alpha$-Rank from A Few Entries with Low Rank Matrix Completion
[2:35] Revenue-Incentive Tradeoffs in Dynamic Reserve Pricing
[2:40] Towards the Unification and Robustness of Perturbation and Gradient Based Explanations
[2:45] Classifying high-dimensional Gaussian mixtures: Where kernel methods fail and neural networks succeed
Q&As 2:50-2:55
[2:50] Q&A
(ends 3:00 PM)
Orals 2:00-2:20
[2:00] Optimizing persistent homology based functions
Spotlights 2:20-2:50
[2:20] Debiasing a First-order Heuristic for Approximate Bi-level Optimization
[2:25] SMG: A Shuffling Gradient-Based Method with Momentum
[2:30] Regret Minimization in Stochastic Non-Convex Learning via a Proximal-Gradient Approach
[2:35] MARINA: Faster Non-Convex Distributed Learning with Compression
[2:40] Bilevel Optimization: Convergence Analysis and Enhanced Design
[2:45] Learning from History for Byzantine Robust Optimization
Q&As 2:50-2:55
[2:50] Q&A
(ends 3:00 PM)
Orals 2:00-2:20
[2:00] When All We Need is a Piece of the Pie: A Generic Framework for Optimizing Two-way Partial AUC
Spotlights 2:20-2:50
[2:20] SiameseXML: Siamese Networks meet Extreme Classifiers with 100M Labels
[2:25] Disentangling Sampling and Labeling Bias for Learning in Large-output Spaces
[2:30] Learning Randomly Perturbed Structured Predictors for Direct Loss Minimization
[2:35] Improving Molecular Graph Neural Network Explainability with Orthonormalization and Induced Sparsity
[2:40] Evaluating Robustness of Predictive Uncertainty Estimation: Are Dirichlet-based Models Reliable?
[2:45] Meta-learning Hyperparameter Performance Prediction with Neural Processes
Q&As 2:50-2:55
[2:50] Q&A
(ends 3:00 PM)
Orals 2:00-2:20
[2:00] Improving Lossless Compression Rates via Monte Carlo Bits-Back Coding
Spotlights 2:20-2:50
[2:20] Sawtooth Factorial Topic Embeddings Guided Gamma Belief Network
[2:25] Kernel Continual Learning
[2:30] XOR-CD: Linearly Convergent Constrained Structure Generation
[2:35] ARMS: Antithetic-REINFORCE-Multi-Sample Gradient for Binary Variables
[2:40] Composing Normalizing Flows for Inverse Problems
[2:45] Nonparametric Hamiltonian Monte Carlo
Q&As 2:50-2:55
[2:50] Q&A
(ends 3:00 PM)
Orals 2:00-2:20
[2:00] Robust Density Estimation from Batches: The Best Things in Life are (Nearly) Free
Spotlights 2:20-2:50
[2:20] Generalization Bounds in the Presence of Outliers: a Median-of-Means Study
[2:25] Meta Learning for Support Recovery in High-dimensional Precision Matrix Estimation
[2:30] Robust Inference for High-Dimensional Linear Models via Residual Randomization
[2:35] Don’t Just Blame Over-parametrization for Over-confidence: Theoretical Analysis of Calibration in Binary Classification
[2:40] Generalization Guarantees for Neural Architecture Search with Train-Validation Split
[2:45] Optimal Estimation of High Dimensional Smooth Additive Function Based on Noisy Observations
Q&As 2:50-2:55
[2:50] Q&A
(ends 3:00 PM)
3 p.m.
Orals 3:00-3:20
[3:00] Regret and Cumulative Constraint Violation Analysis for Online Convex Optimization with Long Term Constraints
Spotlights 3:20-3:50
[3:20] Near-Optimal Confidence Sequences for Bounded Random Variables
[3:25] Joint Online Learning and Decision-making via Dual Mirror Descent
[3:30] Online A-Optimal Design and Active Linear Regression
[3:35] Fairness and Bias in Online Selection
[3:40] ChaCha for Online AutoML
[3:45] An Algorithm for Stochastic and Adversarial Bandits with Switching Costs
Q&As 3:50-3:55
[3:50] Q&A
(ends 4:00 PM)
Orals 3:00-3:20
[3:00] Elastic Graph Neural Networks
Spotlights 3:20-3:50
[3:20] How could Neural Networks understand Programs?
[3:25] ProGraML: A Graph-based Program Representation for Data Flow Analysis and Compiler Optimizations
[3:30] How Do Adam and Training Strategies Help BNNs Optimization
[3:35] Quantifying and Reducing Bias in Maximum Likelihood Estimation of Structured Anomalies
[3:40] Learning from Nested Data with Ornstein Auto-Encoders
[3:45] Kernel-Based Reinforcement Learning: A Finite-Time Analysis
Q&As 3:50-3:55
[3:50] Q&A
(ends 4:00 PM)
Orals 3:00-3:20
[3:00] Agnostic Learning of Halfspaces with Gradient Descent via Soft Margins
Spotlights 3:20-3:50
[3:20] Two-way kernel matrix puncturing: towards resource-efficient PCA and spectral clustering
[3:25] A Lower Bound for the Sample Complexity of Inverse Reinforcement Learning
[3:30] Estimation and Quantization of Expected Persistence Diagrams
[3:35] Post-selection inference with HSIC-Lasso
[3:40] Provable Robustness of Adversarial Training for Learning Halfspaces with Noise
[3:45] Distribution-Free Calibration Guarantees for Histogram Binning without Sample Splitting
Q&As 3:50-3:55
[3:50] Q&A
(ends 4:00 PM)
Orals 3:00-3:20
[3:00] Reserve Price Optimization for First Price Auctions in Display Advertising
Spotlights 3:20-3:50
[3:20] Align, then memorise: the dynamics of learning with feedback alignment
[3:25] Connecting Optimal Ex-Ante Collusion in Teams to Extensive-Form Correlation: Faster Algorithms and Positive Complexity Results
[3:30] Learning to Price Against a Moving Target
[3:35] Fast Algorithms for Stackelberg Prediction Game with Least Squares Loss
[3:40] Approximate Group Fairness for Clustering
[3:45] Incentivizing Compliance with Algorithmic Instruments
Q&As 3:50-3:55
[3:50] Q&A
(ends 4:00 PM)
Orals 3:00-3:20
[3:00] Bilinear Classes: A Structural Framework for Provable Generalization in RL
Spotlights 3:20-3:50
[3:20] Risk Bounds and Rademacher Complexity in Batch Reinforcement Learning
[3:25] Online Policy Gradient for Model Free Learning of Linear Quadratic Regulators with √T Regret
[3:30] Reward Identification in Inverse Reinforcement Learning
[3:35] Online Optimization in Games via Control Theory: Connecting Regret, Passivity and Poincaré Recurrence
[3:40] Efficient Performance Bounds for Primal-Dual Reinforcement Learning from Demonstrations
[3:45] Robust Reinforcement Learning using Least Squares Policy Iteration with Provable Performance Guarantees
Q&As 3:50-3:55
[3:50] Q&A
(ends 4:00 PM)
Orals 3:00-3:20
[3:00] Provably Efficient Fictitious Play Policy Optimization for Zero-Sum Markov Games with Structured Transitions
Spotlights 3:20-3:50
[3:20] Megaverse: Simulating Embodied Agents at One Million Experiences per Second
[3:25] Scaling Multi-Agent Reinforcement Learning with Selective Parameter Sharing
[3:30] Towards Open Ad Hoc Teamwork Using Graph-based Policy Learning
[3:35] Off-Belief Learning
[3:40] On Reinforcement Learning with Adversarial Corruption and Its Application to Block MDP
[3:45] Combining Pessimism with Optimism for Robust and Efficient Model-Based Deep Reinforcement Learning
Q&As 3:50-3:55
[3:50] Q&A
(ends 4:00 PM)
Orals 3:00-3:20
[3:00] Dynamic Game Theoretic Neural Optimizer
Spotlights 3:20-3:50
[3:20] Zero-Shot Knowledge Distillation from a Decision-Based Black-Box Model
[3:25] A Modular Analysis of Provable Acceleration via Polyak's Momentum: Training a Wide ReLU Network and a Deep Linear Network
[3:30] Descending through a Crowded Valley - Benchmarking Deep Learning Optimizers
[3:35] Tractable structured natural-gradient descent using local parameterizations
[3:40] Towards Rigorous Interpretations: a Formalisation of Feature Attribution
[3:45] Distributed Nyström Kernel Learning with Communications
Q&As 3:50-3:55
[3:50] Q&A
(ends 4:00 PM)
Orals 3:00-3:20
[3:00] Model-based Reinforcement Learning for Continuous Control with Posterior Sampling
Spotlights 3:20-3:50
[3:20] Principled Exploration via Optimistic Bootstrapping and Backward Induction
[3:25] Ensemble Bootstrapping for Q-Learning
[3:30] Finite-Sample Analysis of Off-Policy Natural Actor-Critic Algorithm
[3:35] A Regret Minimization Approach to Iterative Learning Control
[3:40] TempoRL: Learning When to Act
[3:45] State Relevance for Off-Policy Evaluation
Q&As 3:50-3:55
[3:50] Q&A
(ends 4:00 PM)
Orals 3:00-3:20
[3:00] Understanding Instance-Level Label Noise: Disparate Impacts and Treatments
Spotlights 3:20-3:50
[3:20] Selecting Data Augmentation for Simulating Interventions
[3:25] Training Data Subset Selection for Regression with Controlled Generalization Error
[3:30] Opening the Blackbox: Accelerating Neural Differential Equations by Regularizing Internal Solver Heuristics
[3:35] Learning from Noisy Labels with No Change to the Training Process
[3:40] What does LIME really see in images?
[3:45] Narrow Margins: Classification, Margins and Fat Tails
Q&As 3:50-3:55
[3:50] Q&A
(ends 4:00 PM)
3:40 p.m.
Affinity Workshop:
(ends 5:45 AM)
4 p.m.
Orals 4:00-4:20
[4:00] High-dimensional Experimental Design and Kernel Bandits
Spotlights 4:20-4:50
[4:20] Dichotomous Optimistic Search to Quantify Human Perception
[4:25] Improved Confidence Bounds for the Linear Logistic Model and Applications to Bandits
[4:30] Stochastic Multi-Armed Bandits with Unrestricted Delay Distributions
[4:35] Deciding What to Learn: A Rate-Distortion Approach
[4:40] No-regret Algorithms for Capturing Events in Poisson Point Processes
[4:45] Parametric Graph for Unimodal Ranking Bandit
Q&As 4:50-4:55
[4:50] Q&A
(ends 5:00 PM)
Orals 4:00-4:40
[4:00] The Logical Options Framework
[4:20] On Reward-Free RL with Kernel and Neural Function Approximations: Single-Agent MDP and Markov Game
Spotlights 4:40-4:50
[4:40] Adversarial Option-Aware Hierarchical Imitation Learning
[4:45] Value Iteration in Continuous Actions, States and Time
Q&As 4:50-4:55
[4:50] Q&A
(ends 5:00 PM)
Orals 4:00-4:20
[4:00] PAC-Learning for Strategic Classification
Spotlights 4:20-4:50
[4:20] Learning from Biased Data: A Semi-Parametric Approach
[4:25] Learning in Nonzero-Sum Stochastic Games with Potentials
[4:30] Guarantees for Tuning the Step Size using a Learning-to-Learn Approach
[4:35] Large-Scale Multi-Agent Deep FBSDEs
[4:40] Multi-group Agnostic PAC Learnability
[4:45] One for One, or All for All: Equilibria and Optimality of Collaboration in Federated Learning
Q&As 4:50-4:55
[4:50] Q&A
(ends 5:00 PM)
Spotlights 4:00-4:45
[4:00] Instabilities of Offline RL with Pre-Trained Neural Representation
[4:05] Path Planning using Neural A* Search
[4:10] Group-Sparse Matrix Factorization for Transfer Learning of Word Embeddings
[4:15] Tightening the Dependence on Horizon in the Sample Complexity of Q-Learning
[4:20] Solving Challenging Dexterous Manipulation Tasks With Trajectory Optimisation and Reinforcement Learning
[4:25] Continuous-time Model-based Reinforcement Learning
[4:30] Bayesian Optimistic Optimisation with Exponentially Decaying Regret
[4:35] Best Model Identification: A Rested Bandit Formulation
[4:40] Global Convergence of Policy Gradient for Linear-Quadratic Mean-Field Control/Game in Continuous Time
Q&As 4:45-4:50
[4:45] Q&A
(ends 5:00 PM)
Orals 4:00-4:20
[4:00] Modelling Behavioural Diversity for Learning in Open-Ended Games
Spotlights 4:20-4:50
[4:20] Follow-the-Regularized-Leader Routes to Chaos in Routing Games
[4:25] How to Learn when Data Reacts to Your Model: Performative Gradient Descent
[4:30] Continuous Coordination As a Realistic Scenario for Lifelong Learning
[4:35] Efficient Deviation Types and Learning for Hindsight Rationality in Extensive-Form Games
[4:40] Collaborative Bayesian Optimization with Fair Regret
[4:45] Exponentially Many Local Minima in Quantum Neural Networks
Q&As 4:50-4:55
[4:50] Q&A
(ends 5:00 PM)
Orals 4:00-4:20
[4:00] On the Implicit Bias of Initialization Shape: Beyond Infinitesimal Mirror Descent
Spotlights 4:20-4:50
[4:20] A statistical perspective on distillation
[4:25] The Lipschitz Constant of Self-Attention
[4:30] Revealing the Structure of Deep Neural Networks via Convex Duality
[4:35] Representational aspects of depth and conditioning in normalizing flows
[4:40] Toward Understanding the Feature Learning Process of Self-supervised Contrastive Learning
[4:45] The Hintons in your Neural Network: a Quantum Field Theory View of Deep Learning
Q&As 4:50-4:55
[4:50] Q&A
(ends 5:00 PM)
Orals 4:00-4:20
[4:00] Inferring Latent Dynamics Underlying Neural Population Activity via Neural Differential Equations
Spotlights 4:20-4:50
[4:20] Learning Queueing Policies for Organ Transplantation Allocation using Interpretable Counterfactual Survival Analysis
[4:25] Deep Continuous Networks
[4:30] SimAM: A Simple, Parameter-Free Attention Module for Convolutional Neural Networks
[4:35] Factor-analytic inverse regression for high-dimension, small-sample dimensionality reduction
[4:40] On-Off Center-Surround Receptive Fields for Accurate and Robust Image Classification
[4:45] AGENT: A Benchmark for Core Psychological Reasoning
Q&As 4:50-4:55
[4:50] Q&A
(ends 5:00 PM)
Orals 4:00-4:20
[4:00] Inferring serial correlation with dynamic backgrounds
Spotlights 4:20-4:50
[4:20] Variance Reduced Training with Stratified Sampling for Forecasting Models
[4:25] Necessary and sufficient conditions for causal feature selection in time series with latent common causes
[4:30] Multiplying Matrices Without Multiplying
[4:35] The Power of Log-Sum-Exp: Sequential Density Ratio Matrix Estimation for Speed-Accuracy Optimization
[4:40] Data-driven Prediction of General Hamiltonian Dynamics via Learning Exactly-Symplectic Maps
[4:45] Learning Stochastic Behaviour from Aggregate Data
Q&As 4:50-4:55
[4:50] Q&A
(ends 5:00 PM)
Orals 4:00-4:20
[4:00] Kernel Stein Discrepancy Descent
Spotlights 4:20-4:50
[4:20] Conditional Distributional Treatment Effect with Kernel Conditional Mean Embeddings and U-Statistic Regression
[4:25] Generalised Lipschitz Regularisation Equals Distributional Robustness
[4:30] Interpretable Stein Goodness-of-fit Tests on Riemannian Manifold
[4:35] An exact solver for the Weston-Watkins SVM subproblem
[4:40] Proximal Causal Learning with Kernels: Two-Stage Estimation and Moment Restriction
[4:45] Faster Kernel Matrix Algebra via Density Estimation
Q&As 4:50-4:55
[4:50] Q&A
(ends 5:00 PM)
Orals 4:00-4:20
[4:00] Measuring Robustness in Deep Learning Based Compressive Sensing
Spotlights 4:20-4:50
[4:20] Instance-Optimal Compressed Sensing via Posterior Sampling
[4:25] A Nullspace Property for Subspace-Preserving Recovery
[4:30] Homomorphic Sensing: Sparsity and Noise
[4:35] Active Deep Probabilistic Subsampling
[4:40] Prior Image-Constrained Reconstruction using Style-Based Generative Models
[4:45] Intermediate Layer Optimization for Inverse Problems using Deep Generative Models
Q&As 4:50-4:55
[4:50] Q&A
(ends 5:00 PM)
6 p.m.
Posters 6:00-8:00
(ends 8:00 PM)
10 p.m.

THU 22 JUL
12:30 a.m.
2 a.m.
Orals 2:00-2:20
[2:00] The Symmetry between Arms and Knapsacks: A Primal-Dual Approach for Bandits with Knapsacks
Spotlights 2:20-2:50
[2:20] Dynamic Planning and Learning under Recovering Rewards
[2:25] Best Arm Identification in Graphical Bilinear Bandits
[2:30] Achieving Near Instance-Optimality and Minimax-Optimality in Stochastic and Adversarial Linear Bandits Simultaneously
[2:35] Incentivized Bandit Learning with Self-Reinforcing User Preferences
[2:40] Approximation Theory Based Methods for RKHS Bandits
[2:45] Dynamic Balancing for Model Selection in Bandits and RL
Q&As 2:50-2:55
[2:50] Q&A
(ends 3:00 AM)
Orals 2:00-2:20
[2:00] The Implicit Bias for Adaptive Optimization Algorithms on Homogeneous Neural Networks
Spotlights 2:20-2:50
[2:20] Asymmetric Heavy Tails and Implicit Bias in Gaussian Noise Injections
[2:25] Understanding Noise Injection in GANs
[2:30] FL-NTK: A Neural Tangent Kernel-based Framework for Federated Learning Analysis
[2:35] Improved OOD Generalization via Adversarial Training and Pretraining
[2:40] WGAN with an Infinitely Wide Generator Has No Spurious Stationary Points
[2:45] Besov Function Approximation and Binary Classification on Low-Dimensional Manifolds Using Convolutional Residual Networks
Q&As 2:50-2:55
[2:50] Q&A
(ends 3:00 AM)
Orals 2:00-2:20
[2:00] Online Unrelated Machine Load Balancing with Predictions Revisited
Spotlights 2:20-2:50
[2:20] MOTS: Minimax Optimal Thompson Sampling
[2:25] Regularized Online Allocation Problems: Fairness and Beyond
[2:30] Near-Optimal Representation Learning for Linear Bandits and Linear RL
[2:35] Improved Corruption Robust Algorithms for Episodic Reinforcement Learning
[2:40] DriftSurf: Stable-State / Reactive-State Learning under Concept Drift
[2:45] Online Submodular Resource Allocation with Applications to Rebalancing Shared Mobility Systems
Q&As 2:50-2:55
[2:50] Q&A
(ends 3:00 AM)
Orals 2:00-2:20
[2:00] Understanding self-supervised learning dynamics without contrastive pairs
Spotlights 2:20-2:50
[2:20] Learning by Turning: Neural Architecture Aware Optimisation
[2:25] Consensus Control for Decentralized Deep Learning
[2:30] Selfish Sparse RNN Training
[2:35] Catastrophic Fisher Explosion: Early Phase Fisher Matrix Impacts Generalization
[2:40] Quasi-global Momentum: Accelerating Decentralized Deep Learning on Heterogeneous Data
[2:45] Understanding the Dynamics of Gradient Flow in Overparameterized Linear models
Q&As 2:50-2:55
[2:50] Q&A
(ends 3:00 AM)
Orals 2:00-2:20
[2:00] Exponential Lower Bounds for Batch Reinforcement Learning: Batch RL can be Exponentially Harder than Online RL
Spotlights 2:20-2:50
[2:20] Provably Efficient Reinforcement Learning for Discounted MDPs with Feature Mapping
[2:25] Confidence-Budget Matching for Sequential Budgeted Learning
[2:30] Sample Efficient Reinforcement Learning In Continuous State Spaces: A Perspective Beyond Linearity
[2:35] Sparse Feature Selection Makes Batch Reinforcement Learning More Sample Efficient
[2:40] Robust Policy Gradient against Strong Data Corruption
[2:45] Logarithmic Regret for Reinforcement Learning with Linear Function Approximation
Q&As 2:50-2:55
[2:50] Q&A
(ends 3:00 AM)
Orals 2:00-2:20
[2:00] UCB Momentum Q-learning: Correcting the bias without forgetting
Spotlights 2:20-2:45
[2:20] Non-Exponentially Weighted Aggregation: Regret Bounds for Unbounded Loss Functions
[2:25] Adversarial Dueling Bandits
[2:30] Fast active learning for pure exploration in reinforcement learning
[2:35] Leveraging Non-uniformity in First-order Non-convex Optimization
[2:40] Probabilistic Programs with Stochastic Conditioning
Q&As 2:45-2:50
[2:45] Q&A
(ends 3:00 AM)
Orals 2:00-2:20
[2:00] Label Distribution Learning Machine
Spotlights 2:20-2:50
[2:20] Representation Matters: Assessing the Importance of Subgroup Allocations in Training Data
[2:25] Heterogeneous Risk Minimization
[2:30] Optimizing Black-box Metrics with Iterative Example Weighting
[2:35] A theory of high dimensional regression with arbitrary correlations between input features and target functions: sample complexity, multiple descent curves and a hierarchy of phase transitions
[2:40] How Does Loss Function Affect Generalization Performance of Deep Learning? Application to Human Age Estimation
[2:45] Implicit rate-constrained optimization of non-decomposable objectives
Q&As 2:50-2:55
[2:50] Q&A
(ends 3:00 AM)
Orals 2:00-2:20
[2:00] Learning Optimal Auctions with Correlated Valuations from Samples
Spotlights 2:20-2:50
[2:20] Alternative Microfoundations for Strategic Classification
[2:25] Multi-Receiver Online Bayesian Persuasion
[2:30] Online Learning for Load Balancing of Unknown Monotone Resource Allocation Games
[2:35] Compressed Maximum Likelihood
[2:40] Consistent regression when oblivious outliers overwhelm
[2:45] Asymptotics of Ridge Regression in Convolutional Models
Q&As 2:50-2:55
[2:50] Q&A
(ends 3:00 AM)
Orals 2:00-2:20
[2:00] Rate-Distortion Analysis of Minimum Excess Risk in Bayesian Learning
Spotlights 2:20-2:50
[2:20] Near-Optimal Linear Regression under Distribution Shift
[2:25] Detection of Signal in the Spiked Rectangular Models
[2:30] A Distribution-dependent Analysis of Meta Learning
[2:35] How Important is the Train-Validation Split in Meta-Learning?
[2:40] Robust Unsupervised Learning via L-statistic Minimization
[2:45] A Theory of Label Propagation for Subpopulation Shift
Q&As 2:50-2:55
[2:50] Q&A
(ends 3:00 AM)
2:15 a.m.
3 a.m.
Orals 3:00-3:20
[3:00] Resource Allocation in Multi-armed Bandit Exploration: Overcoming Sublinear Scaling with Adaptive Parallelism
Spotlights 3:20-3:50
[3:20] Optimal Streaming Algorithms for Multi-Armed Bandits
[3:25] Top-k eXtreme Contextual Bandits with Arm Hierarchy
[3:30] Improved Regret Bounds of Bilinear Bandits using Action Space Analysis
[3:35] Interaction-Grounded Learning
[3:40] Almost Optimal Anytime Algorithm for Batched Multi-Armed Bandits
[3:45] Pure Exploration and Regret Minimization in Matching Bandits
Q&As 3:50-3:55
[3:50] Q&A
(ends 4:00 AM)
Orals 3:00-3:20
[3:00] Dissecting Supervised Contrastive Learning
Spotlights 3:20-3:50
[3:20] Noise and Fluctuation of Finite Learning Rate Stochastic Gradient Descent
[3:25] Tensor Programs IV: Feature Learning in Infinite-Width Neural Networks
[3:30] Scaling Properties of Deep Residual Networks
[3:35] Contrastive Learning Inverts the Data Generating Process
[3:40] Tensor Programs IIb: Architectural Universality Of Neural Tangent Kernel Training Dynamics
[3:45] Do We Actually Need Dense Over-Parameterization? In-Time Over-Parameterization in Sparse Training
Q&As 3:50-3:55
[3:50] Q&A
(ends 4:00 AM)
Orals 3:00-3:20
[3:00] Provably Efficient Algorithms for Multi-Objective Competitive RL
Spotlights 3:20-3:45
[3:20] Online Learning in Unknown Markov Games
[3:25] A Sharp Analysis of Model-based Reinforcement Learning with Self-Play
[3:30] Doubly Robust Off-Policy Actor-Critic: Convergence and Optimality
[3:35] Towards Tight Bounds on the Sample Complexity of Average-reward MDPs
[3:40] Finding the Stochastic Shortest Path with Low Regret: the Adversarial Cost and Unknown Transition Case
Q&As 3:45-3:50
[3:45] Q&A
(ends 4:00 AM)
Orals 3:00-3:20
[3:00] Task-Optimal Exploration in Linear Dynamical Systems
Spotlights 3:20-3:45
[3:20] Gaussian Process-Based Real-Time Learning for Safety Critical Applications
[3:25] CRPO: A New Approach for Safe Reinforcement Learning with Convergence Guarantee
[3:30] Randomized Exploration in Reinforcement Learning with General Value Function Approximation
[3:35] Deep Coherent Exploration for Continuous Control
[3:40] Towards Distraction-Robust Active Visual Tracking
Q&As 3:45-3:50
[3:45] Q&A
(ends 4:00 AM)
Orals 3:00-3:20
[3:00] Discriminative Complementary-Label Learning with Weighted Loss
Spotlights 3:20-3:50
[3:20] GRAD-MATCH: Gradient Matching based Data Subset Selection for Efficient Deep Model Training
[3:25] Fair Classification with Noisy Protected Attributes: A Framework with Provable Guarantees
[3:30] Learning Deep Neural Networks under Agnostic Corrupted Supervision
[3:35] Trees with Attention for Set Prediction Tasks
[3:40] Model Performance Scaling with Multiple Data Sources
[3:45] Solving Inverse Problems with a Flow-based Noise Model
Q&As 3:50-3:55
[3:50] Q&A
(ends 4:00 AM)
Orals 3:00-3:20
[3:00] Cyclically Equivariant Neural Decoders for Cyclic Codes
Spotlights 3:20-3:50
[3:20] KO codes: inventing nonlinear encoding and decoding for reliable wireless communication via deep-learning
[3:25] An Information-Geometric Distance on the Space of Tasks
[3:30] On Perceptual Lossy Compression: The Cost of Perceptual Reconstruction and An Optimal Training Framework
[3:35] Discrete-Valued Latent Preference Matrix Estimation with Graph Side Information
[3:40] A Novel Method to Solve Neural Knapsack Problems
[3:45] Chebyshev Polynomial Codes: Task Entanglement-based Coding for Distributed Matrix Multiplication
Q&As 3:50-3:55
[3:50] Q&A
(ends 4:00 AM)
Orals 3:00-3:20
[3:00] RATT: Leveraging Unlabeled Data to Guarantee Generalization
Spotlights 3:20-3:50
[3:20] Approximation Theory of Convolutional Architectures for Time Series Modelling
[3:25] On the Generalization Power of Overfitted Two-Layer Neural Tangent Kernel Models
[3:30] Geometry of the Loss Landscape in Overparameterized Neural Networks: Symmetries and Invariances
[3:35] On the Explicit Role of Initialization on the Convergence and Implicit Bias of Overparametrized Linear Networks
[3:40] Relative Deviation Margin Bounds
[3:45] Accuracy on the Line: on the Strong Correlation Between Out-of-Distribution and In-Distribution Generalization
Q&As 3:50-3:55
[3:50] Q&A
(ends 4:00 AM)
Orals 3:00-3:20
[3:00] Stability and Generalization of Stochastic Gradient Methods for Minimax Problems
Spotlights 3:20-3:50
[3:20] Outside the Echo Chamber: Optimizing the Performative Risk
[3:25] Asymptotic Normality and Confidence Intervals for Prediction Risk of the Min-Norm Least Squares Estimator
[3:30] Provable Meta-Learning of Linear Representations
[3:35] Sample Complexity of Robust Linear Classification on Separated Data
[3:40] The Impact of Record Linkage on Learning from Feature Partitioned Data
[3:45] Train simultaneously, generalize better: Stability of gradient-based minimax learners
Q&As 3:50-3:55
[3:50] Q&A
(ends 4:00 AM)
Orals 3:00-3:20
[3:00] Conformal prediction interval for dynamic time-series
Spotlights 3:20-3:50
[3:20] End-to-End Learning of Coherent Probabilistic Forecasts for Hierarchical Time Series
[3:25] Segmenting Hybrid Trajectories using Latent ODEs
[3:30] Z-GCNETs: Time Zigzags at Graph Convolutional Networks for Time Series Forecasting
[3:35] Event Outlier Detection in Continuous Time
[3:40] Autoregressive Denoising Diffusion Models for Multivariate Probabilistic Time Series Forecasting
[3:45] Cumulants of Hawkes Processes are Robust to Observation Noise
Q&As 3:50-3:55
[3:50] Q&A
(ends 4:00 AM)
4 a.m.
Orals 4:00-4:20
[4:00] Can Subnetwork Structure Be the Key to Out-of-Distribution Generalization?
Spotlights 4:20-4:50
[4:20] DORO: Distributional and Outlier Robust Optimization
[4:25] AdaXpert: Adapting Neural Architecture for Growing Data
[4:30] Neural SDEs as Infinite-Dimensional GANs
[4:35] Exact Optimization of Conformal Predictors via Incremental and Decremental Learning
[4:40] Mandoline: Model Evaluation under Distribution Shift
[4:45] How and Why to Use Experimental Data to Evaluate Methods for Observational Causal Inference
Q&As 4:50-4:55
[4:50] Q&A
(ends 5:00 AM)
Spotlights 4:00-4:45
[4:00] Deep Latent Graph Matching
[4:05] Asymmetric Loss Functions for Learning with Noisy Labels
[4:10] Clusterability as an Alternative to Anchor Points When Learning with Noisy Labels
[4:15] More Powerful and General Selective Inference for Stepwise Feature Selection using Homotopy Method
[4:20] Training Recurrent Neural Networks via Forward Propagation Through Time
[4:25] Deep Learning for Functional Data Analysis with Adaptive Basis Layers
[4:30] An Integer Linear Programming Framework for Mining Constraints from Data
[4:35] Classification with Rejection Based on Cost-sensitive Classification
[4:40] Versatile Verification of Tree Ensembles
Q&As 4:45-4:50
[4:45] Q&A
(ends 5:00 AM)
Orals 4:00-4:20
[4:00] Break-It-Fix-It: Unsupervised Learning for Program Repair
Spotlights 4:20-4:50
[4:20] Policy Analysis using Synthetic Controls in Continuous-Time
[4:25] MC-LSTM: Mass-Conserving LSTM
[4:30] HyperHyperNetwork for the Design of Antenna Arrays
[4:35] SAINT-ACC: Safety-Aware Intelligent Adaptive Cruise Control for Autonomous Vehicles Using Deep Reinforcement Learning
[4:40] 12-Lead ECG Reconstruction via Koopman Operators
[4:45] A large-scale benchmark for few-shot program induction and synthesis
Q&As 4:50-4:55
[4:50] Q&A
(ends 5:00 AM)
Orals 4:00-4:20
[4:00] Multi-layered Network Exploration via Random Walks: From Offline Optimization to Online Learning
Spotlights 4:20-4:50
[4:20] Combinatorial Blocking Bandits with Stochastic Delays
[4:25] Sparsity-Agnostic Lasso Bandit
[4:30] Quantile Bandits for Best Arms Identification
[4:35] Beyond $\log^2(T)$ regret for decentralized bandits in matching markets
[4:40] Massively Parallel and Asynchronous Tsetlin Machine Architecture Supporting Almost Constant-Time Scaling
[4:45] Adapting to misspecification in contextual bandits with offline regression oracles
Q&As 4:50-4:55
[4:50] Q&A
(ends 5:00 AM)
Orals 4:00-4:20
[4:00] Learning Gradient Fields for Molecular Conformation Generation
Spotlights 4:20-4:50
[4:20] An End-to-End Framework for Molecular Conformation Generation via Bilevel Programming
[4:25] SagaNet: A Small Sample Gated Network for Pediatric Cancer Diagnosis
[4:30] ACE: Explaining cluster from an adversarial perspective
[4:35] Fold2Seq: A Joint Sequence(1D)-Fold(3D) Embedding-based Generative Model for Protein Design
[4:40] Accelerating Feedforward Computation via Parallel Nonlinear Equation Solving
[4:45] Non-Autoregressive Electron Redistribution Modeling for Reaction Prediction
Q&As 4:50-4:55
[4:50] Q&A
(ends 5:00 AM)
Orals 4:00-4:20
[4:00] Improved Regret Bound and Experience Replay in Regularized Policy Iteration
Spotlights 4:20-4:50
[4:20] Near-Optimal Model-Free Reinforcement Learning in Non-Stationary Episodic MDPs
[4:25] Optimal Off-Policy Evaluation from Multiple Logging Policies
[4:30] Provably Correct Optimization and Exploration with Non-linear Policies
[4:35] Safe Reinforcement Learning Using Advantage-Based Intervention
[4:40] Robust Pure Exploration in Linear Bandits with Limited Budget
[4:45] Of Moments and Matching: A Game-Theoretic Framework for Closing the Imitation Gap
Q&As 4:50-4:55
[4:50] Q&A
(ends 5:00 AM)
Orals 4:00-4:40
[4:00] Multi-Dimensional Classification via Sparse Label Encoding
[4:20] Latent Programmer: Discrete Latent Codes for Program Synthesis
Spotlights 4:40-4:50
[4:40] LEGO: Latent Execution-Guided Reasoning for Multi-Hop Question Answering on Knowledge Graphs
[4:45] SpreadsheetCoder: Formula Prediction from Semi-structured Context
Q&As 4:50-4:55
[4:50] Q&A
(ends 5:00 AM)
Orals 4:00-4:20
[4:00] RNN with Particle Flow for Probabilistic Spatio-temporal Forecasting
Spotlights 4:20-4:50
[4:20] A Differentiable Point Process with Its Application to Spiking Neural Networks
[4:25] Diffusion Source Identification on Networks with Statistical Confidence
[4:30] Quantifying Ignorance in Individual-Level Causal-Effect Estimates under Hidden Confounding
[4:35] Making Paper Reviewing Robust to Bid Manipulation Attacks
[4:40] Model Distillation for Revenue Optimization: Interpretable Personalized Pricing
[4:45] Learning Generalized Intersection Over Union for Dense Pixelwise Prediction
Q&As 4:50-4:55
[4:50] Q&A
(ends 5:00 AM)
Orals 4:00-4:20
[4:00] A Precise Performance Analysis of Support Vector Regression
Spotlights 4:20-4:50
[4:20] Lower-Bounded Proper Losses for Weakly Supervised Classification
[4:25] On Variational Inference in Biclustering Models
[4:30] Infinite-Dimensional Optimization for Zero-Sum Games via Variational Transport
[4:35] Dropout: Explicit Forms and Capacity Control
[4:40] Finding Relevant Information via a Discrete Fourier Expansion
[4:45] On the Inherent Regularization Effects of Noise Injection During Training
Q&As 4:50-4:55
[4:50] Q&A
(ends 5:00 AM)
Orals 4:00-4:20
[4:00] Analysis of stochastic Lanczos quadrature for spectrum approximation
Spotlights 4:20-4:50
[4:20] Sample-Optimal PAC Learning of Halfspaces with Malicious Noise
[4:25] On Robust Mean Estimation under Coordinate-level Corruption
[4:30] Multidimensional Scaling: Approximation and Complexity
[4:35] Toward Better Generalization Bounds with Locally Elastic Stability
[4:40] Adaptive Newton Sketch: Linear-time Optimization with Quadratic Convergence and Effective Hessian Dimensionality
[4:45] Interpreting and Disentangling Feature Components of Various Complexity from DNNs
Q&As 4:50-4:55
[4:50] Q&A
(ends 5:00 AM)
5 a.m.
Invited Talk:
Edward Chang
(ends 6:00 AM)
6 a.m.
Posters 6:00-8:00
(ends 8:00 AM)
2 p.m.
Orals 2:00-2:20
[2:00] Coded-InvNet for Resilient Prediction Serving Systems
Spotlights 2:20-2:50
[2:20] Memory-Efficient Pipeline-Parallel DNN Training
[2:25] Putting the "Learning" into Learning-Augmented Algorithms for Frequency Estimation
[2:30] Robust Testing and Estimation under Manipulation Attacks
[2:35] Optimization Planning for 3D ConvNets
[2:40] Robust Learning-Augmented Caching: An Experimental Study
[2:45] Parallel Droplet Control in MEDA Biochips using Multi-Agent Reinforcement Learning
Q&As 2:50-2:55
[2:50] Q&A
(ends 3:00 PM)
Orals 2:00-2:20
[2:00] Tighter Bounds on the Log Marginal Likelihood of Gaussian Process Regression Using Conjugate Gradients
Spotlights 2:20-2:50
[2:20] Isometric Gaussian Process Latent Variable Model for Dissimilarity Data
[2:25] Variational Auto-Regressive Gaussian Processes for Continual Learning
[2:30] Sparse within Sparse Gaussian Processes using Neighbor Information
[2:35] SigGPDE: Scaling Sparse Gaussian Processes on Sequential Data
[2:40] On Signal-to-Noise Ratio Issues in Variational Inference for Deep Gaussian Processes
[2:45] Scalable Variational Gaussian Processes via Harmonic Kernel Decomposition
Q&As 2:50-2:55
[2:50] Q&A
(ends 3:00 PM)
Orals 2:00-2:20
[2:00] Differentiable Particle Filtering via Entropy-Regularized Optimal Transport
Spotlights 2:20-2:50
[2:20] DAGs with No Curl: An Efficient DAG Structure Learning Approach
[2:25] Generalized Doubly Reparameterized Gradient Estimators
[2:30] Whittle Networks: A Deep Likelihood Model for Time Series
[2:35] On the Convergence of Hamiltonian Monte Carlo with Stochastic Gradients
[2:40] Addressing Catastrophic Forgetting in Few-Shot Problems
[2:45] Bayesian Algorithm Execution: Estimating Computable Properties of Black-box Functions Using Mutual Information
Q&As 2:50-2:55
[2:50] Q&A
(ends 3:00 PM)
Orals 2:00-2:20
[2:00] Tilting the playing field: Dynamical loss functions for machine learning
Spotlights 2:20-2:50
[2:20] Adversarial Robustness Guarantees for Random Deep Neural Networks
[2:25] Implicit Bias of Linear RNNs
[2:30] Analyzing the tree-layer structure of Deep Forests
[2:35] Quantifying the Benefit of Using Differentiable Learning over Tangent Kernels
[2:40] Implicit Regularization in Tensor Factorization
[2:45] Uniform Convergence, Adversarial Spheres and a Simple Remedy
Q&As 2:50-2:55
[2:50] Q&A
(ends 3:00 PM)
Orals 2:00-2:20
[2:00] Fair Selective Classification Via Sufficiency
Spotlights 2:20-2:50
[2:20] Learning Representations by Humans, for Humans
[2:25] Strategic Classification in the Dark
[2:30] Fairness for Image Generation with Uncertain Sensitive Attributes
[2:35] Characterizing Fairness Over the Set of Good Models Under Selective Labels
[2:40] GANMEX: One-vs-One Attributions using GAN-based Model Explainability
[2:45] Directional Bias Amplification
Q&As 2:50-2:55
[2:50] Q&A
(ends 3:00 PM)
Orals 2:00-2:20
[2:00] Modularity in Reinforcement Learning via Algorithmic Independence in Credit Assignment
Spotlights 2:20-2:50
[2:20] Dataset Condensation with Differentiable Siamese Augmentation
[2:25] PACOH: Bayes-Optimal Meta-Learning with PAC-Guarantees
[2:30] Parameterless Transductive Feature Re-representation for Few-Shot Learning
[2:35] Sharing Less is More: Lifelong Learning in Deep Networks with Selective Layer Transfer
[2:40] Memory Efficient Online Meta Learning
[2:45] Detecting Rewards Deterioration in Episodic Reinforcement Learning
Q&As 2:50-2:55
[2:50] Q&A
(ends 3:00 PM)
Orals 2:00-2:20
[2:00] Scaling Up Visual and Vision-Language Representation Learning With Noisy Text Supervision
Spotlights 2:20-2:50
[2:20] GRAND: Graph Neural Diffusion
[2:25] On Linear Identifiability of Learned Representations
[2:30] Learning disentangled representations via product manifold projection
[2:35] A Collective Learning Framework to Boost GNN Expressiveness for Node Classification
[2:40] Directed Graph Embeddings in Pseudo-Riemannian Manifolds
[2:45] Aggregating From Multiple Target-Shifted Sources
Q&As 2:50-2:55
[2:50] Q&A
(ends 3:00 PM)
Orals 2:00-2:20
[2:00] Confidence Scores Make Instance-dependent Label-noise Learning Possible
Spotlights 2:20-2:50
[2:20] Non-Negative Bregman Divergence Minimization for Deep Direct Density Ratio Estimation
[2:25] Self-Damaging Contrastive Learning
[2:30] Class2Simi: A Noise Reduction Perspective on Learning with Noisy Labels
[2:35] GNNAutoScale: Scalable and Expressive Graph Neural Networks via Historical Embeddings
[2:40] Neural Transformation Learning for Deep Anomaly Detection Beyond Images
[2:45] Wasserstein Distributional Normalization For Robust Distributional Certification of Noisy Labeled Data
Q&As 2:50-2:55
[2:50] Q&A
(ends 3:00 PM)
Orals 2:00-2:20
[2:00] Local Algorithms for Finding Densely Connected Clusters
Spotlights 2:20-2:50
[2:20] Systematic Analysis of Cluster Similarity Indices: How to Validate Validation Measures
[2:25] Local Correlation Clustering with Asymmetric Classification Errors
[2:30] Near-Optimal Algorithms for Explainable k-Medians and k-Means
[2:35] BasisDeVAE: Interpretable Simultaneous Dimensionality Reduction and Feature-Level Clustering with Derivative-Based Variational Autoencoders
[2:40] Hierarchical Clustering of Data Streams: Scalable Algorithms and Approximation Guarantees
[2:45] A Scalable Second Order Method for Ill-Conditioned Matrix Completion from Few Samples
Q&As 2:50-2:55
[2:50] Q&A
(ends 3:00 PM)
3 p.m.
Orals 3:00-3:20
[3:00] Improved, Deterministic Smoothing for L_1 Certified Robustness
Spotlights 3:20-3:50
[3:20] Mixed Nash Equilibria in the Adversarial Examples Game
[3:25] Learning to Generate Noise for Multi-Attack Robustness
[3:30] Query Complexity of Adversarial Attacks
[3:35] Training Adversarially Robust Sparse Networks via Bayesian Connectivity Sampling
[3:40] Efficient Training of Robust Decision Trees Against Adversarial Examples
[3:45] Scalable Certified Segmentation via Randomized Smoothing
Q&As 3:50-3:55
[3:50] Q&A
(ends 4:00 PM)
Orals 3:00-3:20
[3:00] Delving into Deep Imbalanced Regression
Spotlights 3:20-3:50
[3:20] HAWQ-V3: Dyadic Neural Network Quantization
[3:25] Nondeterminism and Instability in Neural Network Optimization
[3:30] Phase Transitions, Distance Functions, and Implicit Neural Representations
[3:35] PipeTransformer: Automated Elastic Pipelining for Distributed Training of Large-scale Models
[3:40] Conservative Objective Models for Effective Offline Model-Based Optimization
[3:45] TFix: Learning to Fix Coding Errors with a Text-to-Text Transformer
Q&As 3:50-3:55
[3:50] Q&A
(ends 4:00 PM)
Orals 3:00-3:20
[3:00] Deep Adaptive Design: Amortizing Sequential Bayesian Experimental Design
Spotlights 3:20-3:50
[3:20] Beyond the Pareto Efficient Frontier: Constraint Active Search for Multiobjective Experimental Design
[3:25] Finite mixture models do not reliably learn the number of components
[3:30] Evaluating the Implicit Midpoint Integrator for Riemannian Hamiltonian Monte Carlo
[3:35] Streaming Bayesian Deep Tensor Factorization
[3:40] Active Learning of Continuous-time Bayesian Networks through Interventions
[3:45] Bayesian Attention Belief Networks
Q&As 3:50-3:55
[3:50] Q&A
(ends 4:00 PM)
Orals 3:00-3:20
[3:00] I-BERT: Integer-only BERT Quantization
Spotlights 3:20-3:50
[3:20] SparseBERT: Rethinking the Importance Analysis in Self-attention
[3:25] Learning to Rehearse in Long Sequence Memorization
[3:30] Grad-TTS: A Diffusion Probabilistic Model for Text-to-Speech
[3:35] Linear Transformers Are Secretly Fast Weight Programmers
[3:40] Predict then Interpolate: A Simple Algorithm to Learn Stable Classifiers
[3:45] Expressive 1-Lipschitz Neural Networks for Robust Multiple Graph Learning against Adversarial Attacks
Q&As 3:50-3:55
[3:50] Q&A
(ends 4:00 PM)
Orals 3:00-3:20
[3:00] Differentially Private Query Release Through Adaptive Projection
Spotlights 3:20-3:50
[3:20] Differentially Private Quantiles
[3:25] PAPRIKA: Private Online False Discovery Rate Control
[3:30] Privacy-Preserving Video Classification with Convolutional Neural Networks
[3:35] Clustered Sampling: Low-Variance and Improved Representativity for Clients Selection in Federated Learning
[3:40] Differentially Private Correlation Clustering
[3:45] Accuracy, Interpretability, and Differential Privacy via Explainable Boosting
Q&As 3:50-3:55
[3:50] Q&A
(ends 4:00 PM)
Orals 3:00-3:20
[3:00] Benchmarks, Algorithms, and Metrics for Hierarchical Disentanglement
Spotlights 3:20-3:50
[3:20] Whitening for Self-Supervised Representation Learning
[3:25] Feature Clustering for Support Identification in Extreme Regions
[3:30] Symmetric Spaces for Graph Embeddings: A Finsler-Riemannian Approach
[3:35] Robust Representation Learning via Perceptual Similarity Metrics
[3:40] Decoupling Representation Learning from Reinforcement Learning
[3:45] ShaRF: Shape-conditioned Radiance Fields from a Single View
Q&As 3:50-3:55
[3:50] Q&A
(ends 4:00 PM)
Orals 3:00-3:20
[3:00] Temporal Difference Learning as Gradient Splitting
Spotlights 3:20-3:50
[3:20] First-Order Methods for Wasserstein Distributionally Robust MDP
[3:25] Off-Policy Confidence Sequences
[3:30] Adaptive Sampling for Best Policy Identification in Markov Decision Processes
[3:35] Quantum algorithms for reinforcement learning with a generative model
[3:40] Posterior Value Functions: Hindsight Baselines for Policy Gradient Methods
[3:45] Learning Interaction Kernels for Agent Systems on Riemannian Manifolds
Q&As 3:50-3:55
[3:50] Q&A
(ends 4:00 PM)
Orals 3:00-3:20
[3:00] Learning Noise Transition Matrix from Only Noisy Labels via Total Variation Regularization
Spotlights 3:20-3:50
[3:20] On the Power of Localized Perceptron for Label-Optimal Learning of Halfspaces with Adversarial Noise
[3:25] CLOCS: Contrastive Learning of Cardiac Signals Across Space, Time, and Patients
[3:30] Disambiguation of Weak Supervision leading to Exponential Convergence rates
[3:35] Active Covering
[3:40] Mediated Uncoupled Learning: Learning Functions without Direct Input-output Correspondences
[3:45] Principal Bit Analysis: Autoencoding with Schur-Concave Loss
Q&As 3:50-3:55
[3:50] Q&A
(ends 4:00 PM)
Orals 3:00-3:20
[3:00] Dash: Semi-Supervised Learning with Dynamic Thresholding
Spotlights 3:20-3:45
[3:20] In-Database Regression in Input Sparsity Time
[3:25] Binary Classification from Multiple Unlabeled Datasets via Surrogate Set Classification
[3:30] Transfer-Based Semantic Anomaly Detection
[3:35] Graph Convolution for Semi-Supervised Classification: Improved Linear Separability and Out-of-Distribution Generalization
[3:40] Sinkhorn Label Allocation: Semi-Supervised Classification via Annealed Self-Training
Q&As 3:45-3:50
[3:45] Q&A
(ends 4:00 PM)
4 p.m.
Orals 4:00-4:20
[4:00] CARTL: Cooperative Adversarially-Robust Transfer Learning
Spotlights 4:20-4:50
[4:20] Skew Orthogonal Convolutions
[4:25] Lower Bounds on Cross-Entropy Loss in the Presence of Test-time Adversaries
[4:30] Defense against backdoor attacks via robust covariance estimation
[4:35] Adversarial Purification with Score-based Generative Models
[4:40] Knowledge Enhanced Machine Learning Pipeline against Diverse Adversarial Attacks
[4:45] To be Robust or to be Fair: Towards Fairness in Adversarial Training
Q&As 4:50-4:55
[4:50] Q&A
(ends 5:00 PM)
Orals 4:00-4:20
[4:00] Annealed Flow Transport Monte Carlo
Spotlights 4:20-4:50
[4:20] Nonparametric Decomposition of Sparse Tensors
[4:25] Parallel tempering on optimized paths
[4:30] Sparse Bayesian Learning via Stepwise Regression
[4:35] Geometric convergence of elliptical slice sampling
[4:40] Bayesian Quadrature on Riemannian Data Manifolds
[4:45] A Gradient Based Strategy for Hamiltonian Monte Carlo Hyperparameter Optimization
Q&As 4:50-4:55
[4:50] Q&A
(ends 5:00 PM)
Orals 4:00-4:20
[4:00] DG-LMC: A Turn-key and Scalable Synchronous Distributed MCMC Algorithm via Langevin Monte Carlo within Gibbs
Spotlights 4:20-4:50
[4:20] Nonmyopic Multifidelity Active Search
[4:25] Active Testing: Sample-Efficient Model Evaluation
[4:30] Oblivious Sketching for Logistic Regression
[4:35] SGLB: Stochastic Gradient Langevin Boosting
[4:40] Flow-based Attribution in Graphical Models: A Recursive Shapley Approach
[4:45] On the difficulty of unbiased alpha divergence minimization
Q&As 4:50-4:55
[4:50] Q&A
(ends 5:00 PM)
Orals 4:00-4:20
[4:00] Domain Generalization using Causal Matching
Spotlights 4:20-4:50
[4:20] Unified Robust Semi-Supervised Variational Autoencoder
[4:25] Representation Subspace Distance for Domain Adaptation Regression
[4:30] Personalized Federated Learning using Hypernetworks
[4:35] f-Domain Adversarial Learning: Theory and Algorithms
[4:40] Few-Shot Conformal Prediction with Auxiliary Tasks
[4:45] Learning a Universal Template for Few-shot Dataset Generalization
Q&As 4:50-4:55
[4:50] Q&A
(ends 5:00 PM)
Orals 4:00-4:20
[4:00] Modeling Hierarchical Structures with Continuous Recursive Neural Networks
Spotlights 4:20-4:50
[4:20] Revisiting Point Cloud Shape Classification with a Simple and Effective Baseline
[4:25] Differentiable Dynamic Quantization with Mixed Precision and Adaptive Resolution
[4:30] Rissanen Data Analysis: Examining Dataset Characteristics via Description Length
[4:35] Matrix Completion with Model-free Weighting
[4:40] PHEW: Constructing Sparse Networks that Learn Fast and Generalize Well without Training Data
[4:45] EL-Attention: Memory Efficient Lossless Attention for Generation
Q&As 4:50-4:55
[4:50] Q&A
(ends 5:00 PM)
Orals 4:00-4:20
[4:00] Locally Private k-Means in One Round
Spotlights 4:20-4:50
[4:20] Matrix Sketching for Secure Collaborative Machine Learning
[4:25] Markpainting: Adversarial Machine Learning meets Inpainting
[4:30] Differentially-Private Clustering of Easy Instances
[4:35] Inference for Network Regression Models with Community Structure
[4:40] DeepReDuce: ReLU Reduction for Fast Private Inference
[4:45] Label-Only Membership Inference Attacks
Q&As 4:50-4:55
[4:50] Q&A
(ends 5:00 PM)
Orals 4:00-4:20
[4:00] Private Alternating Least Squares: Practical Private Matrix Completion with Tighter Rates
Spotlights 4:20-4:50
[4:20] Enhancing Robustness of Neural Networks through Fourier Stabilization
[4:25] Quantifying Availability and Discovery in Recommender Systems via Stochastic Reachability
[4:30] Byzantine-Resilient High-Dimensional SGD with Local Iterations on Heterogeneous Data
[4:35] RNNRepair: Automatic RNN Repair via Model-based Analysis
[4:40] Adversarial Policy Learning in Two-player Competitive Games
[4:45] Fairness of Exposure in Stochastic Bandits
Q&As 4:50-4:55
[4:50] Q&A
(ends 5:00 PM)
Orals 4:00-4:20
[4:00] On Disentangled Representations Learned from Correlated Data
Spotlights 4:20-4:50
[4:20] Training data-efficient image transformers & distillation through attention
[4:25] SketchEmbedNet: Learning Novel Concepts by Imitating Drawings
[4:30] GeomCA: Geometric Evaluation of Data Representations
[4:35] Online Limited Memory Neural-Linear Bandits with Likelihood Matching
[4:40] Environment Inference for Invariant Learning
[4:45] Neural Feature Matching in Implicit 3D Representations
Q&As 4:50-4:55
[4:50] Q&A
(ends 5:00 PM)
Orals 4:00-4:20
[4:00] Graph Contrastive Learning Automated
Spotlights 4:20-4:50
[4:20] Barlow Twins: Self-Supervised Learning via Redundancy Reduction
[4:25] Pointwise Binary Classification with Pairwise Confidence Comparisons
[4:30] Learning from Similarity-Confidence Data
[4:35] Unsupervised Co-part Segmentation through Assembly
[4:40] Randomized Dimensionality Reduction for Facility Location and Single-Linkage Clustering
[4:45] Learning Binary Decision Trees by Argmin Differentiation
Q&As 4:50-4:55
[4:50] Q&A
(ends 5:00 PM)
5 p.m.
Invited Talk:
Cecilia Clementi
(ends 6:00 PM)
6 p.m.
Posters 6:00-8:00
(ends 8:00 PM)

FRI 23 JUL
2 a.m.
Orals 2:00-2:20
[2:00] Sparse and Imperceptible Adversarial Attack via a Homotopy Algorithm
Spotlights 2:20-2:50
[2:20] Maximum Mean Discrepancy Test is Aware of Adversarial Attacks
[2:25] Learning Diverse-Structured Networks for Adversarial Robustness
[2:30] PopSkipJump: Decision-Based Attack for Probabilistic Classifiers
[2:35] Towards Better Robust Generalization with Shift Consistency Regularization
[2:40] Robust Learning for Data Poisoning Attacks
[2:45] Mind the Box: $l_1$-APGD for Sparse Adversarial Attacks on Image Classifiers
Q&As 2:50-2:55
[2:50] Q&A
(ends 3:00 AM)
Orals 2:00-2:20
[2:00] Probabilistic Generating Circuits
Spotlights 2:20-2:50
[2:20] Model Fusion for Personalized Learning
[2:25] Pareto GAN: Extending the Representational Power of GANs to Heavy-Tailed Distributions
[2:30] Run-Sort-ReRun: Escaping Batch Size Limitations in Sliced Wasserstein Generative Models
[2:35] Statistical Estimation from Dependent Data
[2:40] Context-Aware Online Collective Inference for Templated Graphical Models
[2:45] Causality-aware counterfactual confounding adjustment as an alternative to linear residualization in anticausal prediction tasks based on linear learners
Q&As 2:50-2:55
[2:50] Q&A
(ends 3:00 AM)
Orals 2:00-2:20
[2:00] Exponential Reduction in Sample Complexity with Learning of Ising Model Dynamics
Spotlights 2:20-2:50
[2:20] Objective Bound Conditional Gaussian Process for Bayesian Optimization
[2:25] Automatic variational inference with cascading flows
[2:30] Estimating Identifiable Causal Effects on Markov Equivalence Class through Double Machine Learning
[2:35] Bias-Free Scalable Gaussian Processes via Randomized Truncations
[2:40] SG-PALM: a Fast Physically Interpretable Tensor Graphical Model
[2:45] Black-box density function estimation using recursive partitioning
Q&As 2:50-2:55
[2:50] Q&A
(ends 3:00 AM)
Orals 2:00-2:20
[2:00] Global Prosody Style Transfer Without Text Transcriptions
Spotlights 2:20-2:50
[2:20] SoundDet: Polyphonic Moving Sound Event Detection and Localization from Raw Waveform
[2:25] EfficientTTS: An Efficient and High-Quality Text-to-Speech Architecture
[2:30] Conditional Variational Autoencoder with Adversarial Learning for End-to-End Text-to-Speech
[2:35] Learning de-identified representations of prosody from raw audio
[2:40] UniSpeech: Unified Speech Representation Learning with Labeled and Unlabeled Data
[2:45] You Only Sample (Almost) Once: Linear Cost Self-Attention Via Bernoulli Sampling
Q&As 2:50-2:55
[2:50] Q&A
(ends 3:00 AM)
Orals 2:00-2:20
[2:00] Straight to the Gradient: Learning to Use Novel Tokens for Neural Text Generation
Spotlights 2:20-2:50
[2:20] Backpropagated Neighborhood Aggregation for Accurate Training of Spiking Neural Networks
[2:25] Crystallization Learning with the Delaunay Triangulation
[2:30] Group Fisher Pruning for Practical Network Compression
[2:35] BASE Layers: Simplifying Training of Large, Sparse Models
[2:40] STRODE: Stochastic Boundary Ordinary Differential Equation
[2:45] A Zeroth-Order Block Coordinate Descent Algorithm for Huge-Scale Black-Box Optimization
Q&As 2:50-2:55
[2:50] Q&A
(ends 3:00 AM)
Orals 2:00-2:20
[2:00] Gradient Disaggregation: Breaking Privacy in Federated Learning by Reconstructing the User Participant Matrix
Spotlights 2:20-2:50
[2:20] Provable Lipschitz Certification for Generative Models
[2:25] HEMET: A Homomorphic-Encryption-Friendly Privacy-Preserving Mobile Neural Network Architecture
[2:30] Explanations for Monotonic Classifiers
[2:35] Lossless Compression of Efficient Private Local Randomizers
[2:40] CRFL: Certifiably Robust Federated Learning against Backdoor Attacks
[2:45] Grey-box Extraction of Natural Language Models
Q&As 2:50-2:55
[2:50] Q&A
(ends 3:00 AM)
Orals 2:00-2:20
[2:00] Commutative Lie Group VAE for Disentanglement Learning
Spotlights 2:20-2:50
[2:20] Self-supervised Graph-level Representation Learning with Local and Global Structure
[2:25] Generalization Error Bound for Hyperbolic Ordinal Embedding
[2:30] Neighborhood Contrastive Learning Applied to Online Patient Monitoring
[2:35] Simple and Effective VAE Training with Calibrated Decoders
[2:40] Decomposed Mutual Information Estimation for Contrastive Representation Learning
[2:45] Structured World Belief for Reinforcement Learning in POMDP
Q&As 2:50-2:55
[2:50] Q&A
(ends 3:00 AM)
Orals 2:00-2:20
[2:00] Leveraged Weighted Loss for Partial Label Learning
Spotlights 2:20-2:50
[2:20] Unitary Branching Programs: Learnability and Lower Bounds
[2:25] Provably End-to-end Label-noise Learning without Anchor Points
[2:30] MorphVAE: Generating Neural Morphologies from 3D-Walks using a Variational Autoencoder with Spherical Latent Space
[2:35] Unsupervised Embedding Adaptation via Early-Stage Feature Reconstruction for Few-Shot Classification
[2:40] Improved Algorithms for Agnostic Pool-based Active Classification
[2:45] Adversarial Multi Class Learning under Weak Supervision with Performance Guarantees
Q&As 2:50-2:55
[2:50] Q&A
(ends 3:00 AM)
Orals 2:00-2:20
[2:00] Composed Fine-Tuning: Freezing Pre-Trained Denoising Autoencoders for Improved Generalization
Spotlights 2:20-2:50
[2:20] KD3A: Unsupervised Multi-Source Decentralized Domain Adaptation via Knowledge Distillation
[2:25] REPAINT: Knowledge Transfer in Deep Reinforcement Learning
[2:30] Exploiting Shared Representations for Personalized Federated Learning
[2:35] Large-Scale Meta-Learning with Continual Trajectory Shifting
[2:40] Exploration in Approximate Hyper-State Space for Meta Reinforcement Learning
[2:45] LogME: Practical Assessment of Pre-trained Models for Transfer Learning
Q&As 2:50-2:55
[2:50] Q&A
(ends 3:00 AM)
3 a.m.
Orals 3:00-3:20
[3:00] Integer Programming for Causal Structure Learning in the Presence of Latent Variables
Spotlights 3:20-3:50
[3:20] Online Selection Problems against Constrained Adversary
[3:25] SGA: A Robust Algorithm for Partial Recovery of Tree-Structured Graphical Models with Noisy Samples
[3:30] Efficient Online Learning for Dynamic k-Clustering
[3:35] On Recovering from Modeling Errors Using Testing Bayesian Networks
[3:40] Towards Practical Mean Bounds for Small Samples
[3:45] Monte Carlo Variational Auto-Encoders
Q&As 3:50-3:55
[3:50] Q&A
(ends 4:00 AM)
Orals 3:00-3:20
[3:00] Solving high-dimensional parabolic PDEs using the tensor train format
Spotlights 3:20-3:50
[3:20] Large Scale Private Learning via Low-rank Reparametrization
[3:25] Breaking the Deadly Triad with a Target Network
[3:30] Average-Reward Off-Policy Policy Evaluation with Function Approximation
[3:35] Decentralized Single-Timescale Actor-Critic on Zero-Sum Two-Player Stochastic Games
[3:40] Optimal Non-Convex Exact Recovery in Stochastic Block Model via Projected Power Method
[3:45] Optimal Counterfactual Explanations in Tree Ensembles
Q&As 3:50-3:55
[3:50] Q&A
(ends 4:00 AM)
Orals 3:00-3:20
[3:00] WILDS: A Benchmark of in-the-Wild Distribution Shifts
Spotlights 3:20-3:45
[3:20] Improving Generalization in Meta-learning via Task Augmentation
[3:25] Improving Predictors via Combination Across Diverse Task Categories
[3:30] MetaCURE: Meta Reinforcement Learning with Empowerment-Driven Exploration
[3:35] Offline Meta-Reinforcement Learning with Advantage Weighting
[3:40] Bridging Multi-Task Learning and Meta-Learning: Towards Efficient Training and Effective Adaptation
Q&As 3:45-3:50
[3:45] Q&A
(ends 4:00 AM)
Orals 3:00-3:20
[3:00] Learn-to-Share: A Hardware-friendly Transfer Learning Framework Exploiting Computation and Parameter Sharing
Spotlights 3:20-3:50
[3:20] Latent Space Energy-Based Model of Symbol-Vector Coupling for Text Generation and Classification
[3:25] Policy Caches with Successor Features
[3:30] Meta-Thompson Sampling
[3:35] Integrated Defense for Resilient Graph Matching
[3:40] Supervised Tree-Wasserstein Distance
[3:45] Which transformer architecture fits my data? A vocabulary bottleneck in self-attention
Q&As 3:50-3:55
[3:50] Q&A
(ends 4:00 AM)
Orals 3:00-3:20
[3:00] Calibrate Before Use: Improving Few-shot Performance of Language Models
Spotlights 3:20-3:45
[3:20] On-the-fly Rectification for Robust Large-Vocabulary Topic Inference
[3:25] Towards Understanding and Mitigating Social Biases in Language Models
[3:30] Disentangling syntax and semantics in the brain with deep networks
[3:35] Cross-model Back-translated Distillation for Unsupervised Machine Translation
[3:40] Few-shot Language Coordination by Modeling Theory of Mind
Q&As 3:45-3:50
[3:45] Q&A
(ends 4:00 AM)
Orals 3:00-3:20
[3:00] Private Stochastic Convex Optimization: Optimal Rates in L1 Geometry
Spotlights 3:20-3:50
[3:20] Differentially Private Aggregation in the Shuffle Model: Almost Central Accuracy in Almost a Single Message
[3:25] Model-Targeted Poisoning Attacks with Provable Convergence
[3:30] Practical and Private (Deep) Learning Without Sampling or Shuffling
[3:35] Leveraging Public Data for Practical Private Query Release
[3:40] Private Adaptive Gradient Methods for Convex Optimization
[3:45] Oneshot Differentially Private Top-k Selection
Q&As 3:50-3:55
[3:50] Q&A
(ends 4:00 AM)
Orals 3:00-3:20
[3:00] Unsupervised Representation Learning via Neural Activation Coding
Spotlights 3:20-3:50
[3:20] Demystifying Inductive Biases for (Beta-)VAE Based Architectures
[3:25] Examining and Combating Spurious Features under Distribution Shift
[3:30] Unsupervised Part Representation by Flow Capsules
[3:35] Image-Level or Object-Level? A Tale of Two Resampling Strategies for Long-Tailed Detection
[3:40] Efficient Iterative Amortized Inference for Learning Symmetric and Disentangled Multi-Object Representations
[3:45] Temporal Predictive Coding For Model-Based Planning In Latent Space
Q&As 3:50-3:55
[3:50] Q&A
(ends 4:00 AM)
Orals 3:00-3:20
[3:00] Label Inference Attacks from Log-loss Scores
Spotlights 3:20-3:50
[3:20] Generative Causal Explanations for Graph Neural Networks
[3:25] Watermarking Deep Neural Networks with Greedy Residuals
[3:30] Differentially Private Bayesian Inference for Generalized Linear Models
[3:35] Globally-Robust Neural Networks
[3:40] When Does Data Augmentation Help With Membership Inference Attacks?
[3:45] Correcting Exposure Bias for Link Recommendation
Q&As 3:50-3:55
[3:50] Q&A
(ends 4:00 AM)
Orals 3:00-3:20
[3:00] Dimensionality Reduction for the Sum-of-Distances Metric
Spotlights 3:20-3:50
[3:20] A Sampling-Based Method for Tensor Ring Decomposition
[3:25] CountSketches, Feature Hashing and the Median of Three
[3:30] Single Pass Entrywise-Transformed Low Rank Approximation
[3:35] Active Slices for Sliced Stein Discrepancy
[3:40] Projection techniques to update the truncated SVD of evolving matrices with applications
[3:45] Fixed-Parameter and Approximation Algorithms for PCA with Outliers
Q&As 3:50-3:55
[3:50] Q&A
(ends 4:00 AM)
4 a.m.
Orals 4:00-4:20
[4:00] A General Framework For Detecting Anomalous Inputs to DNN Classifiers
Spotlights 4:20-4:50
[4:20] Towards Defending against Adversarial Examples via Attack-Invariant Features
[4:25] Towards Certifying L-infinity Robustness using Neural Networks with L-inf-dist Neurons
[4:30] Uncovering the Connections Between Adversarial Transferability and Knowledge Transferability
[4:35] Improving Gradient Regularization using Complex-Valued Neural Networks
[4:40] Double-Win Quant: Aggressively Winning Robustness of Quantized Deep Neural Networks via Random Precision Training and Inference
[4:45] Progressive-Scale Boundary Blackbox Attack via Projective Gradient Estimation
Q&As 4:50-4:55
[4:50] Q&A
(ends 5:00 AM)
Spotlights 4:00-4:45
[4:00] Explaining Time Series Predictions with Dynamic Masks
[4:05] Neural Tangent Generalization Attacks
[4:10] Understanding and Mitigating Accuracy Disparity in Regression
[4:15] Backdoor Scanning for Deep Neural Networks through K-Arm Optimization
[4:20] DANCE: Enhancing saliency maps using decoys
[4:25] Blind Pareto Fairness and Subgroup Robustness
[4:30] Testing DNN-based Autonomous Driving Systems under Critical Environmental Conditions
[4:35] On the Problem of Underranking in Group-Fair Ranking
[4:40] Testing Group Fairness via Optimal Transport Projections
Q&As 4:45-4:50
[4:45] Q&A
(ends 5:00 AM)
Orals 4:00-4:20
[4:00] Correlation Clustering in Constant Many Parallel Rounds
Spotlights 4:20-4:50
[4:20] A Scalable Deterministic Global Optimization Algorithm for Clustering Problems
[4:25] One Pass Late Fusion Multi-view Clustering
[4:30] Data-Free Knowledge Distillation for Heterogeneous Federated Learning
[4:35] Sharper Generalization Bounds for Clustering
[4:40] Active Learning for Distributionally Robust Level-Set Estimation
[4:45] Dual Principal Component Pursuit for Robust Subspace Learning: Theory and Algorithms for a Holistic Approach
Q&As 4:50-4:55
[4:50] Q&A
(ends 5:00 AM)
Orals 4:00-4:20
[4:00] Additive Error Guarantees for Weighted Low Rank Approximation
Spotlights 4:20-4:45
[4:20] Near-Optimal Entrywise Anomaly Detection for Low-Rank Matrices with Sub-Exponential Noise
[4:25] Fast Sketching of Polynomial Kernels of Polynomial Degree
[4:30] Finding $k$ in Latent $k$-polytope
[4:35] HoroPCA: Hyperbolic Dimensionality Reduction via Horospherical Projections
[4:40] Streaming and Distributed Algorithms for Robust Column Subset Selection
Q&As 4:45-4:50
[4:45] Q&A
(ends 5:00 AM)
Orals 4:00-4:40
[4:00] Graph Cuts Always Find a Global Optimum for Potts Models (With a Catch)
[4:20] SKIing on Simplices: Kernel Interpolation on the Permutohedral Lattice for Scalable Gaussian Processes
Spotlights 4:40-4:50
[4:40] Prediction-Centric Learning of Independent Cascade Dynamics from Partial Observations
[4:45] Marginalized Stochastic Natural Gradients for Black-Box Variational Inference
Q&As 4:50-4:55
[4:50] Q&A
(ends 5:00 AM)
Orals 4:00-4:20
[4:00] Learning Transferable Visual Models From Natural Language Supervision
Spotlights 4:20-4:50
[4:20] Two Heads are Better Than One: Hypergraph-Enhanced Graph Reasoning for Visual Event Ratiocination
[4:25] A Representation Learning Perspective on the Importance of Train-Validation Splitting in Meta-Learning
[4:30] Meta-Learning Bidirectional Update Rules
[4:35] Function Contrastive Learning of Transferable Meta-Representations
[4:40] A Discriminative Technique for Multiple-Source Adaptation
[4:45] Debiasing Model Updates for Improving Personalized Federated Training
Q&As 4:50-4:55
[4:50] Q&A
(ends 5:00 AM)
Orals 4:00-4:40
[4:00] Order-Agnostic Cross Entropy for Non-Autoregressive Machine Translation
[4:20] Mixed Cross Entropy Loss for Neural Machine Translation
Spotlights 4:40-4:50
[4:40] Fused Acoustic and Text Encoding for Multimodal Bilingual Pretraining and Speech Translation
[4:45] Self-supervised and Supervised Joint Training for Resource-rich Machine Translation
Q&As 4:50-4:55
[4:50] Q&A
(ends 5:00 AM)
Orals 4:00-4:20
[4:00] Differentially Private Sliced Wasserstein Distance
Spotlights 4:20-4:45
[4:20] Differentially Private Densest Subgraph Detection
[4:25] Machine Unlearning for Random Forests
[4:30] A Framework for Private Matrix Analysis in Sliding Window Model
[4:35] The Distributed Discrete Gaussian Mechanism for Federated Learning with Secure Aggregation
[4:40] Privacy-Preserving Feature Selection with Secure Multiparty Computation
Q&As 4:45-4:50
[4:45] Q&A
(ends 5:00 AM)
Orals 4:00-4:20
[4:00] Graph Neural Networks Inspired by Classical Iterative Algorithms
Spotlights 4:20-4:50
[4:20] FILTRA: Rethinking Steerable CNN by Filter Transform
[4:25] Link Prediction with Persistent Homology: An Interactive View
[4:30] Conjugate Energy-Based Models
[4:35] Implicit-PDF: Non-Parametric Representation of Probability Distributions on the Rotation Manifold
[4:40] Equivariant Networks for Pixelized Spheres
[4:45] Efficient Statistical Tests: A Neural Tangent Kernel Approach
Q&As 4:50-4:55
[4:50] Q&A
(ends 5:00 AM)
Orals 4:00-4:20
[4:00] Crowdsourcing via Annotator Co-occurrence Imputation and Provable Symmetric Nonnegative Matrix Factorization
Spotlights 4:20-4:50
[4:20] Message Passing Adaptive Resonance Theory for Online Active Semi-supervised Learning
[4:25] Object Segmentation Without Labels with Large-Scale Generative Models
[4:30] SinIR: Efficient General Image Manipulation with Single Image Reconstruction
[4:35] GBHT: Gradient Boosting Histogram Transform for Density Estimation
[4:40] Hierarchical Agglomerative Graph Clustering in Nearly-Linear Time
[4:45] Improving Ultrametrics Embeddings Through Coresets
Q&As 4:50-4:55
[4:50] Q&A
(ends 5:00 AM)
5:30 a.m.
Spotlights 5:30-5:55
[5:30] Conditional Temporal Neural Processes with Covariance Loss
[5:35] Fast margin maximization via dual acceleration
[5:40] ASAM: Adaptive Sharpness-Aware Minimization for Scale-Invariant Learning of Deep Neural Networks
[5:45] Diffusion Earth Mover's Distance and Distribution Embeddings
[5:50] Learn2Hop: Learned Optimization on Rough Landscapes
Q&As 5:55-6:00
[5:55] Q&A
(ends 6:00 AM)
Spotlights 5:30-5:55
[5:30] BANG: Bridging Autoregressive and Non-autoregressive Generation with Large Scale Pretraining
[5:35] Reasoning Over Virtual Knowledge Bases With Open Predicate Relations
[5:40] Recovering AES Keys with a Deep Cold Boot Attack
[5:45] Overcoming Catastrophic Forgetting by Bayesian Generative Regularization
[5:50] Bayesian Structural Adaptation for Continual Learning
Q&As 5:55-6:00
[5:55] Q&A
(ends 6:00 AM)
Spotlights 5:30-5:55
[5:30] Decentralized Riemannian Gradient Descent on the Stiefel Manifold
[5:35] ADOM: Accelerated Decentralized Optimization Method for Time-Varying Networks
[5:40] A Second look at Exponential and Cosine Step Sizes: Simplicity, Adaptivity, and Performance
[5:45] A Value-Function-based Interior-point Method for Non-convex Bi-level Optimization
[5:50] Accelerating Gossip SGD with Periodic Global Averaging
Q&As 5:55-6:00
[5:55] Q&A
(ends 6:00 AM)
Spotlights 5:30-5:55
[5:30] Lenient Regret and Good-Action Identification in Gaussian Process Bandits
[5:35] Equivariant Learning of Stochastic Fields: Gaussian Processes and Steerable Conditional Neural Processes
[5:40] Value-at-Risk Optimization with Gaussian Processes
[5:45] High-Dimensional Gaussian Process Inference with Derivatives
[5:50] GP-Tree: A Gaussian Process Classifier for Few-Shot Incremental Learning
Q&As 5:55-6:00
[5:55] Q&A
(ends 6:00 AM)
Spotlights 5:30-5:50
[5:30] Learning Online Algorithms with Distributional Advice
[5:35] Boosting for Online Convex Optimization
[5:40] Online Learning with Optimism and Delay
[5:45] Learner-Private Convex Optimization
Q&As 5:50-5:55
[5:50] Q&A
(ends 6:00 AM)
Spotlights 5:30-5:55
[5:30] On Lower Bounds for Standard and Robust Gaussian Process Bandit Optimization
[5:35] Optimal Thompson Sampling strategies for support-aware CVaR bandits
[5:40] On Limited-Memory Subsampling Strategies for Bandits
[5:45] Problem Dependent View on Structured Thresholding Bandit Problems
[5:50] Leveraging Good Representations in Linear Contextual Bandits
Q&As 5:55-6:00
[5:55] Q&A
(ends 6:00 AM)
Spotlights 5:30-5:55
[5:30] A Proxy Variable View of Shared Confounding
[5:35] Budgeted Heterogeneous Treatment Effect Estimation
[5:40] Permutation Weighting
[5:45] Valid Causal Inference with (Some) Invalid Instruments
[5:50] Operationalizing Complex Causes: A Pragmatic View of Mediation
Q&As 5:55-6:00
[5:55] Q&A
(ends 6:00 AM)
Spotlights 5:30-5:55
[5:30] Discretization Drift in Two-Player Games
[5:35] Elementary superexpressive activations
[5:40] Regularizing towards Causal Invariance: Linear Models with Proxies
[5:45] A Language for Counterfactual Generative Models
[5:50] How rotational invariance of common kernels prevents generalization in high dimensions
Q&As 5:55-6:00
[5:55] Q&A
(ends 6:00 AM)
Spotlights 5:30-5:55
[5:30] Learning Self-Modulating Attention in Continuous Time Space with Applications to Sequential Recommendation
[5:35] Meta-Cal: Well-controlled Post-hoc Calibration by Ranking
[5:40] Towards Open-World Recommendation: An Inductive Model-based Collaborative Filtering Approach
[5:45] Smooth $p$-Wasserstein Distance: Structure, Empirical Approximation, and Statistical Applications
[5:50] Locally Adaptive Label Smoothing Improves Predictive Churn
Q&As 5:55-6:00
[5:55] Q&A
(ends 6:00 AM)
Spotlights 5:30-5:55
[5:30] T-SCI: A Two-Stage Conformal Inference Algorithm with Guaranteed Coverage for Cox-MLP
[5:35] Self-Improved Retrosynthetic Planning
[5:40] A Structured Observation Distribution for Generative Biological Sequence Prediction and Forecasting
[5:45] CURI: A Benchmark for Productive Concept Learning Under Uncertainty
[5:50] CIFS: Improving Adversarial Robustness of CNNs via Channel-wise Importance-based Feature Selection
Q&As 5:55-6:00
[5:55] Q&A
(ends 6:00 AM)
6 a.m.
Posters 6:00-8:00
(ends 8:00 AM)
2:45 p.m.
3:15 p.m.
5:45 p.m.
11 p.m.

SAT 24 JUL
2 a.m.
1:15 p.m.
2:40 p.m.
2:43 p.m.
4:50 p.m.
5:45 p.m.
Workshop:
(ends 3:05 AM)
6 p.m.
