Title: Computing the Kreiss Constant of a Matrix
Abstract: As famously introduced by H.-O. Kreiss about six decades ago, the quantity now known as the Kreiss constant of a square matrix bounds the magnitude of the transient behavior of the associated system of ordinary differential (or difference) equations. Although asymptotically stable systems must eventually decay to a steady state, they may nevertheless incur surprisingly large transients before settling down, and such behavior can have consequences ranging from undesirable to catastrophic. For example, large transients are inherently power inefficient and may pose comfort or safety concerns, e.g., in transportation applications. Moreover, if the underlying physical system is nonlinear, large transient behavior in a stable linear model of that system may end up feeding nonlinear instabilities, which in turn could destabilize the actual physical process!
Although Kreiss constants theoretically allow us to robustly predict how severe the transient behavior of a linear dynamical system can be, until recently no methods existed to actually compute them. Instead, Kreiss constants have historically only been approximated using crude and ad hoc techniques that may not provide even a single digit of accuracy. A sticking point for better methods has been that the value of a Kreiss constant is formulated via certain global continuous optimization problems that are often nonconvex and have multiple local optimizers, and general nonconvex optimization is NP-hard. However, by taking advantage of special structure specific to our problem of interest, we present the first algorithms to compute Kreiss constants to arbitrary accuracy. In fact, using two very different techniques, we give two different classes of algorithms to reliably compute Kreiss constants.
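For reference, and independent of this talk's new algorithms, the standard textbook definitions show where the global nonconvex optimization comes from: the Kreiss constant is a supremum of a scaled resolvent norm over an unbounded region of the complex plane.

```latex
% Standard definitions (background only, not the talk's contribution).
% Continuous-time Kreiss constant of A with spectrum in the open left half-plane:
\mathcal{K}(A) \;=\; \sup_{\operatorname{Re} z \,>\, 0} \operatorname{Re}(z)\,\bigl\|(zI - A)^{-1}\bigr\|
% Discrete-time analogue, with spectrum in the open unit disk:
\mathcal{K}_d(A) \;=\; \sup_{|z| \,>\, 1} \bigl(|z| - 1\bigr)\,\bigl\|(zI - A)^{-1}\bigr\|
% Kreiss matrix theorem (A of size n x n): the transient peak is sandwiched as
\mathcal{K}(A) \;\le\; \sup_{t \ge 0} \bigl\| e^{tA} \bigr\| \;\le\; e\,n\,\mathcal{K}(A)
```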
Title: Coding Techniques to Mitigate Stragglers in Large-Scale Distributed Computation
Abstract: Large-scale computation distributed across servers is often slowed down by stragglers—servers that delay or fail to complete tasks on time. In this talk, I will present two recent advances in coding schemes that mitigate the impact of stragglers in distributed computing, focusing on matrix multiplication and gradient-based optimization. First, I will introduce a coding scheme for matrix multiplication that leverages the execution order of subtasks to improve efficiency and recoverability. By probabilistically recovering incomplete tasks through coded ones, our method reduces encoding and decoding complexity while maintaining high recoverability. Next, I will present ignore-straggler gradient coding (IS-GC), which tolerates an arbitrary number of stragglers in distributed machine learning. Using a graph-based decoding model, IS-GC maximizes gradient recovery and reduces training time.
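As a toy illustration of the coded-computation idea underlying such schemes, the sketch below uses the classic one-parity-block construction: any two of three worker results recover the full product. This is background only, not the order-aware scheme or IS-GC from the talk, and all names are made up for the example.

```python
import numpy as np

# Minimal sketch of coded matrix-vector multiplication for straggler tolerance:
# split A row-wise, add one parity block, and recover A @ x from any 2 of the
# 3 worker results. (Illustrative only; the talk's schemes are more refined.)

def encode(A):
    A1, A2 = np.vsplit(A, 2)          # two systematic blocks
    return [A1, A2, A1 + A2]          # third block is the parity block

def worker(block, x):
    return block @ x                  # each worker computes one coded subtask

def decode(results):
    # results: dict worker_id -> partial product; any two entries suffice
    if 0 in results and 1 in results:
        return np.concatenate([results[0], results[1]])
    if 0 in results and 2 in results:                 # A2 x = (A1+A2)x - A1 x
        return np.concatenate([results[0], results[2] - results[0]])
    if 1 in results and 2 in results:                 # A1 x = (A1+A2)x - A2 x
        return np.concatenate([results[2] - results[1], results[1]])
    raise ValueError("need results from at least two workers")

rng = np.random.default_rng(0)
A, x = rng.standard_normal((4, 3)), rng.standard_normal(3)
blocks = encode(A)
results = {i: worker(blocks[i], x) for i in (0, 2)}   # worker 1 is a straggler
assert np.allclose(decode(results), A @ x)
```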
Title: Estimating Shapley Values via Leverage Score Sampling
Abstract: Explaining machine learning predictions is crucial to deploying AI in high-stakes domains. Inspired by game theory, Shapley values are a widely used method for attributing model predictions to input features. Due to the exponential time (in the number of features) required to compute Shapley values exactly, popular algorithms like Kernel SHAP are used to approximate Shapley values. While Kernel SHAP is agnostic to the machine learning model and is remarkably effective in practice, there are no known guarantees on its performance. We propose Leverage SHAP: a theoretically motivated modification to Kernel SHAP that exploits leverage score sampling. In experiments, we find that Leverage SHAP can even outperform the highly optimized official implementation of Kernel SHAP. Further, we exploit the elegant connection between Shapley values and linear regression to show that Leverage SHAP is provably accurate: With high probability, Shapley values can be effectively approximated by Leverage SHAP in almost linear time.
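For background (standard definitions, not results specific to Leverage SHAP), the Shapley value of feature $i$ and the weighted least-squares characterization that Kernel SHAP builds on are:

```latex
% Shapley value of feature i for a set function v over n features N:
\phi_i \;=\; \sum_{S \subseteq N \setminus \{i\}}
      \frac{|S|!\,(n-|S|-1)!}{n!}\,\bigl( v(S \cup \{i\}) - v(S) \bigr)
% Kernel SHAP recovers the same values as the solution of a weighted
% least-squares (linear regression) problem with the Shapley kernel weights
w(S) \;=\; \frac{n-1}{\binom{n}{|S|}\, |S|\,(n-|S|)}, \qquad 0 < |S| < n,
% which is the regression connection that leverage score sampling exploits.
```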
Title: Self-Supervised Reinforcement Learning: Algorithms and Emergent Properties
Abstract: In this talk, I will discuss recent work on self-supervised reinforcement learning, focusing on how we can learn complex behaviors without the need for hand-crafted rewards or demonstrations.
I will introduce contrastive RL, a recent line of work that can extract goal-reaching skills from unlabeled interactions. This method will serve to highlight how there is much more to self-supervised RL than simply adding an LLM or VLM to an RL algorithm; rather, self-supervised RL can be seen as a form of generative AI itself. I will also share some recent work on blazing-fast simulators and new benchmarks, which have accelerated research in my group. Finally, I'll discuss emergent properties in self-supervised RL: preliminary evidence that we have found, and hints for where to go searching for more.
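As a rough sketch of the contrastive idea, the snippet below assumes an InfoNCE-style objective that pairs state-action embeddings with embeddings of goals reached later in the same trajectory; the actual architectures and losses in this line of work may differ, and all shapes and names here are illustrative.

```python
import numpy as np

# Minimal sketch of an InfoNCE-style contrastive objective for goal-reaching:
# score each (state, action) embedding against every goal embedding in the
# batch; the positive for row i is the goal actually reached later in the
# same trajectory. (Hypothetical encoders/shapes; real methods may differ.)

def info_nce_loss(sa_embed, goal_embed):
    # sa_embed, goal_embed: (batch, dim); row i of goal_embed is the positive
    # future goal for row i of sa_embed.
    logits = sa_embed @ goal_embed.T                  # similarity matrix
    logits = logits - logits.max(axis=1, keepdims=True)   # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_probs))               # classify the true goal

rng = np.random.default_rng(0)
sa = rng.standard_normal((8, 16))
goals = sa + 0.1 * rng.standard_normal((8, 16))       # toy "reached" goals
print(info_nce_loss(sa, goals))
```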
Title: Data Privacy Threats on Advanced Machine Learning Models
Abstract: As machine learning becomes increasingly embedded in society, concerns regarding data privacy and security have grown. Models often rely on sensitive information to achieve high performance, making them attractive targets for adversaries seeking to extract private data. Users are also concerned about privacy leakage, especially when their data might be used without consent. This seminar focuses on data privacy threats to two prominent classes of models: semi-supervised learning models and graph neural networks (GNNs).
Semi-supervised learning, which leverages limited labeled data and large volumes of unlabeled data, has shown promising results but also raises privacy concerns about unauthorized data use. To address this issue, we propose a novel membership inference method that helps users determine if their data has been used in training. By leveraging metrics like inter-consistency and intra-entropy tailored for advanced learning algorithms, our method is highly accurate in identifying training data, addressing a critical privacy risk in semi-supervised learning.
Graph neural networks are powerful tools for analyzing graph-structured data, such as social networks, but they are also vulnerable to link inference attacks, which exploit the relational nature of GNNs to infer sensitive connections between nodes. To counter this, we introduce the Graph Link Disguise (GRID) solution, an optimization-based defense mechanism that adds crafted noise to node prediction vectors. This approach effectively obscures private link information while maintaining model performance, achieving a superior balance between privacy and utility compared to existing methods. Our studies highlight significant privacy challenges in advanced machine learning models and demonstrate novel methods for mitigating data inference threats.
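As a generic illustration of why a model's prediction behavior can leak membership, the sketch below uses the simple predictive-entropy baseline (models are often more confident on examples they were trained on). This is not the inter-consistency and intra-entropy metrics proposed in the talk; all values and names are made up.

```python
import numpy as np

# Toy entropy-based membership signal: lower predictive entropy is often a
# hint that an example was seen during training. Generic baseline only; the
# talk's metrics for semi-supervised models are more specialized.

def prediction_entropy(probs):
    probs = np.clip(probs, 1e-12, 1.0)
    return -np.sum(probs * np.log(probs), axis=-1)

def membership_score(probs, threshold):
    # Flag an example as "likely a training member" if its entropy is low.
    return prediction_entropy(probs) < threshold

member     = np.array([[0.97, 0.02, 0.01]])   # confident prediction
non_member = np.array([[0.40, 0.35, 0.25]])   # uncertain prediction
print(membership_score(member, threshold=0.5),
      membership_score(non_member, threshold=0.5))
```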
Title: Balancing Quality and Efficiency in Future AI Systems
Abstract: Quality and efficiency are both essential in AI systems as data sources become more diverse and model sizes grow. In this talk, I will present techniques to address challenges in data and model quality as well as their efficiency, which are key for building high-performing and sustainable AI systems. I will first introduce a toolkit for enhancing the quality of datasets, which can be used in a broad range of learning tasks including the training or fine-tuning of large language models (LLMs), laying the groundwork for model training with good data. Then, considering the specific challenge where data is distributed unevenly across sources with varying sizes, quality, and availability, such as in the case of federated learning, I will introduce the FedAU algorithm. This algorithm dynamically adjusts aggregation weights in the model training process based on the availability of data sources, to prevent model bias and improve training convergence. Afterwards, I will introduce techniques to make both training and inference more efficient, focusing on a framework that optimizes model selection from a zoo of LLMs to minimize energy usage while maintaining model performance guarantees. Together, these approaches form a blueprint for future AI systems that can learn effectively from vast amounts of data at diverse sources and deliver high-quality models while enhancing resource efficiency in real-world applications.
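As a minimal sketch of availability-aware aggregation in the spirit of the federated setting described above: weight each participating client's update by the inverse of its observed participation frequency so that rarely available clients are not underrepresented. This is illustrative only; FedAU's actual online weight adaptation differs, and the function names here are hypothetical.

```python
import numpy as np

# Minimal sketch of availability-aware federated aggregation: clients that
# participate rarely get larger aggregation weights (inverse of empirical
# participation frequency), reducing bias toward always-available clients.
# Illustrative only; not the FedAU algorithm itself.

def aggregate(global_model, updates, participation_counts, round_idx):
    # updates: dict client_id -> model update (np.ndarray) for this round
    weights = {}
    for cid in updates:
        freq = participation_counts[cid] / (round_idx + 1)   # empirical availability
        weights[cid] = 1.0 / max(freq, 1e-3)
    total = sum(weights.values())
    step = sum(w / total * updates[cid] for cid, w in weights.items())
    return global_model + step

model = np.zeros(4)
updates = {0: np.ones(4), 2: -0.5 * np.ones(4)}     # clients 0 and 2 showed up
counts = {0: 9, 1: 5, 2: 1}                          # client 2 rarely participates
print(aggregate(model, updates, counts, round_idx=9))
```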
Title: Space Jammed: Cryptography against Space-Bounded Adversaries
Abstract: Traditional cryptographic frameworks often assume adversaries are limited by computational power. In contrast, this talk explores cryptographic constructions against space-bounded adversaries, where constraints are placed on the adversaries' storage capacity. This paradigm enables unconditionally secure protocols and previously unattainable security notions.
We begin by revisiting Maurer’s Bounded Storage Model (BSM, CRYPTO '92), where adversaries abide by certain memory bounds throughout their attack. Leveraging Raz’s lower bound on parity learning (FOCS '16), we show simple constructions of a key exchange protocol, a commitment scheme, and an Oblivious Transfer (OT) protocol. All of these constructions achieve information-theoretic security, making them resilient to adversaries with unbounded computational power. Furthermore, our commitment and OT schemes are round-optimal, requiring only 1 and 2 communication rounds, respectively, a significant improvement over the prior best result (Ding et al., TCC '04) of 5 rounds for each.
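As a toy illustration of why bounded storage helps, the sketch below shows the classic sample-and-intersect key-agreement idea in Maurer's model: both parties store the bits of a long public random stream at a few random positions and keep the bits at common positions, which a space-bounded eavesdropper cannot fully retain. This is background intuition only, not the Raz-based constructions from the talk, and it omits all security analysis.

```python
import secrets

# Toy sample-and-intersect key agreement in the bounded storage model:
# a long public random string is broadcast once; each party stores the bits
# at a small random set of positions; after the broadcast, positions are
# compared publicly and the shared key is the bits at common positions.
# (Illustration of Maurer's model only; the talk's constructions differ.)

STREAM_LEN = 100_000
STORE = 5_000

stream = [secrets.randbelow(2) for _ in range(STREAM_LEN)]   # public randomness

def sample_positions():
    return {secrets.randbelow(STREAM_LEN) for _ in range(STORE)}

alice_pos, bob_pos = sample_positions(), sample_positions()
common = sorted(alice_pos & bob_pos)          # positions exchanged publicly later
alice_key = [stream[i] for i in common]       # ...but the stored bits are not
bob_key = [stream[i] for i in common]
assert alice_key == bob_key
print(f"shared key of {len(common)} bits")
```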
The second part of the talk focuses on combining computational assumptions with the space constraint to derive powerful new primitives, ones that are unachievable with computational assumptions alone. Specifically, we show how to construct encryption schemes that guarantee privacy even if the adversary is able to retrieve the decryption keys at a later time, and signature schemes where the adversary cannot produce a valid signature on any message, even one it has seen signed before. We present these constructions in the context of Incompressible Cryptography, a novel framework where the adversary can use an arbitrary amount of memory during protocol execution but is bounded by its long-term storage.