Publications

Here you can find my research publications across various venues in machine learning, AI, and data science. You can also find my articles on my Google Scholar profile.

Conference Papers

TETRIS: Optimal Draft Token Selection for Batch Speculative Decoding

ACL 2025 Main

TETRIS Algorithm Demo

Data-Centric AI in the Age of Large Language Models

EMNLP 2024 Findings

DETAIL: Task DEmonsTration Attribution for Interpretable In-context Learning

NeurIPS 2024

DETAIL Algorithm Demo

Probably Approximate Shapley Fairness with Applications in Machine Learning

AAAI 2023 (Oral)

Preprints

Uncover Scaling Laws for Large Language Models via Inverse Problems

Under Review at ARR

Uncover Scaling Law Paper Demo

MEM1: Learning to Synergize Memory and Reasoning for Efficient Long-Horizon Agents

arXiv

MEM1 visualization

Data Value Estimation on Private Gradients

arXiv

For gradient-based machine learning (ML) methods commonly adopted in practice, such as stochastic gradient descent, the de facto differential privacy (DP) technique is to perturb the gradients with random Gaussian noise. Data valuation attributes ML performance to the training data and is widely used in privacy-aware applications that require enforcing DP, such as data pricing, collaborative ML, and federated learning (FL). Can existing data valuation methods still be used when DP is enforced via gradient perturbation? We show that the answer is no under the default approach of injecting i.i.d. random noise into the gradients: the estimation uncertainty of the data value estimate paradoxically scales linearly with the estimation budget, producing estimates that are almost random guesses. To address this issue, we propose instead injecting carefully correlated noise, which provably removes the linear scaling of estimation uncertainty with respect to the budget. We also empirically demonstrate that our method gives better data value estimates on various ML tasks and is applicable to use cases including dataset valuation and FL.
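
As a rough illustration of the default gradient perturbation that the abstract refers to, the sketch below clips a gradient to a fixed L2 norm and adds i.i.d. Gaussian noise. It is not the paper's method and does not implement the correlated-noise scheme; the function name `privatize_gradient` and the parameters `clip_norm` and `noise_multiplier` are assumptions chosen for this example.

```python
import numpy as np


def privatize_gradient(grad, clip_norm, noise_multiplier, rng):
    """Clip `grad` to L2 norm `clip_norm`, then add i.i.d. Gaussian noise.

    Hypothetical helper for illustration only: the noise standard deviation
    is noise_multiplier * clip_norm, applied independently per coordinate.
    """
    norm = np.linalg.norm(grad)
    # Scale down only if the gradient exceeds the clipping threshold.
    scale = min(1.0, clip_norm / max(norm, 1e-12))
    clipped = grad * scale
    noise = rng.normal(0.0, noise_multiplier * clip_norm, size=grad.shape)
    return clipped + noise


rng = np.random.default_rng(0)
grad = np.array([3.0, 4.0])  # L2 norm is 5, so it is clipped to norm 1
noisy = privatize_gradient(grad, clip_norm=1.0, noise_multiplier=0.5, rng=rng)
```

Each call draws fresh, independent noise, which is the i.i.d. setting the abstract identifies as problematic for data value estimation.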