👋 Nice to meet you!

My name is Zijian. I am a second-year PhD candidate at NUS, advised by Bryan Kian Hsiang Low. I am also currently a research engineer at the Singapore-MIT Alliance for Research and Technology Centre (SMART), advised by Daniela Rus of MIT. Before that, I completed my undergraduate studies at NUS, majoring in Computer Science and Mathematics, and spent a year at TikTok (Singapore) as an ML engineering intern on the advertisement moderation team.

🔬 Research Interests

My research journey began with a game-theoretic perspective on machine learning. As data increasingly becomes the fuel that powers large-scale ML models, it is imperative to value, curate, and attribute data effectively so that modern ML systems become more reliable, fair, and efficient. My initial focus was on estimating the value of data in ML model training, which raises two key questions: (1) how can we value data "fairly" so that data providers are rewarded equitably, and (2) how can we estimate the value of data in privacy-preserving ML settings? With the advent of Large Language Models (LLMs), my interests have gradually shifted toward the "data" aspects of LLMs. Rather than focusing solely on pre-training data, I now concentrate on post-training topics, including reinforcement fine-tuning, prompt optimization, and inference speedups.
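To make "fair valuation" concrete: the Shapley value from cooperative game theory is the canonical notion of equitable credit assignment, and one common way to approximate it is Monte Carlo sampling over permutations. The sketch below is purely illustrative (the utility function and toy data are hypothetical stand-ins, not the exact methods from my papers): each datum is credited with its average marginal contribution to a utility such as validation accuracy.

```python
import random

def monte_carlo_shapley(n, utility, num_permutations=200):
    """Estimate each datum's Shapley value as its average marginal contribution
    to the utility over random permutations of the dataset."""
    values = [0.0] * n
    for _ in range(num_permutations):
        perm = random.sample(range(n), n)      # one random ordering of all data points
        subset, prev = [], utility([])
        for i in perm:
            subset.append(i)
            curr = utility(subset)             # utility of the growing coalition
            values[i] += (curr - prev) / num_permutations
            prev = curr
    return values

# Toy usage with a hypothetical utility: a point is valuable if it pulls the subset mean
# toward a target value (a stand-in for "train a model on the subset, score it on held-out data").
data = [1.0, 2.0, 10.0, 2.5]
utility = lambda idx: 0.0 if not idx else -abs(sum(data[i] for i in idx) / len(idx) - 2.0)
print(monte_carlo_shapley(len(data), utility))
```

In practice the utility would retrain a model on each subset, so the permutation sampling is what keeps the number of retrainings manageable.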

🎯 Current Research Focus Areas

Reinforcement Fine-tuning

Past research has demonstrated the effectiveness of reinforcement learning (RL) in improving LLM performance, from aligning models with human preferences to strengthening their reasoning capabilities. A critical ingredient of effective RL is training the policy on trajectories that achieve the desired outcomes. An interesting open problem is how to generate high-quality trajectories that make learning more efficient or enable agents to learn harder tasks.
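As a minimal illustration of outcome-filtered trajectory generation, here is a best-of-n (rejection-sampling-style) data collection sketch. The `policy_sample` and `reward` callables are hypothetical stand-ins for an LLM policy and an outcome verifier; this is a generic sketch of the idea, not the specific method in my work.

```python
import random

def collect_filtered_trajectories(policy_sample, reward, prompts, n_rollouts=8, threshold=1.0):
    """Best-of-n style data collection: roll out several candidate trajectories per prompt,
    keep only those whose outcome reward clears the threshold, and return them as
    (prompt, trajectory) pairs for a later fine-tuning step."""
    kept = []
    for prompt in prompts:
        candidates = [policy_sample(prompt) for _ in range(n_rollouts)]
        kept.extend((prompt, traj) for traj in candidates if reward(prompt, traj) >= threshold)
    return kept

# Toy usage: the "policy" guesses a random digit and the "reward" checks it against the
# prompt's true sum (stand-ins for an LLM and an outcome verifier).
policy_sample = lambda prompt: str(random.randint(0, 9))
reward = lambda prompt, traj: 1.0 if traj == str(eval(prompt)) else 0.0
print(collect_filtered_trajectories(policy_sample, reward, ["1+2", "3+4"]))
```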

Prompt Optimization

A recently emerged paradigm for LLMs is to steer model behavior or inject new knowledge through prompts via in-context learning (ICL). In ICL, task demonstrations are supplied in the prompt so that the LLM can learn at inference time. This paradigm is analogous to classic ML training, with the demonstrations playing the role of training samples. A natural research question is therefore how to evaluate and interpret these demonstrations in the inference-time setting.
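One simple way to evaluate demonstrations at inference time is a leave-one-out analysis: measure how much held-out performance drops when a demonstration is removed from the prompt. The sketch below assumes a hypothetical `eval_prompt_accuracy` callable standing in for prompting an LLM on a validation set; it is an illustration of the idea, not a specific method from my work.

```python
def leave_one_out_values(demos, eval_prompt_accuracy):
    """Score each in-context demonstration by the drop in held-out accuracy when it is
    removed from the prompt (an inference-time analogue of leave-one-out data valuation)."""
    full = eval_prompt_accuracy(demos)
    return [full - eval_prompt_accuracy(demos[:i] + demos[i + 1:]) for i in range(len(demos))]

# Toy usage with a hypothetical evaluator: accuracy is the fraction of label classes covered
# by the demonstrations (a stand-in for calling an LLM on a validation set with the prompt).
demos = [("great movie", "positive"), ("terrible plot", "negative"), ("loved it", "positive")]
eval_prompt_accuracy = lambda ds: len({label for _, label in ds}) / 2
print(leave_one_out_values(demos, eval_prompt_accuracy))
```

In this toy example only the lone "negative" demonstration gets a positive score, since removing either "positive" demonstration leaves accuracy unchanged.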

Speculative Decoding

Speculative decoding speeds up next-token generation by employing a small draft model to quickly speculate future tokens, which the larger target model then verifies in parallel. We can view the draft model as a data generator and the target model as a data consumer. Much like data selection in ML training, we can carefully select the draft tokens most likely to be accepted, or have the drafter generate many candidate tokens and forward only the most probable ones for verification.
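A stripped-down sketch of one speculative decoding round is below. It uses hypothetical `draft_next` and `target_accepts` callables in place of real models, and simplifies acceptance to a deterministic check; real implementations verify all proposed positions with a single parallel forward pass of the target and accept probabilistically, emitting a correction token on the first rejection.

```python
import random

def speculative_decode_step(draft_next, target_accepts, context, k=4):
    """One simplified speculative decoding round: the draft proposes k tokens, the target
    checks them left to right, and the longest accepted prefix is kept."""
    # Draft phase: cheaply speculate k future tokens.
    proposal, ctx = [], list(context)
    for _ in range(k):
        tok = draft_next(ctx)
        proposal.append(tok)
        ctx.append(tok)
    # Verification phase: keep the longest prefix the target agrees with.
    accepted, ctx = [], list(context)
    for tok in proposal:
        if not target_accepts(ctx, tok):
            break
        accepted.append(tok)
        ctx.append(tok)
    return accepted

# Toy usage: the "draft" counts upward but makes a mistake 30% of the time, while the
# "target" only accepts exact increments (stand-ins for a small and a large LLM).
draft_next = lambda ctx: ctx[-1] + (1 if random.random() > 0.3 else 2)
target_accepts = lambda ctx, tok: tok == ctx[-1] + 1
print(speculative_decode_step(draft_next, target_accepts, [0]))
```

The speedup comes from the fact that a whole accepted prefix costs the target only one verification pass instead of one pass per token.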

🌟 Let’s Connect!

I'm always excited to discuss research ideas, collaborate on projects, or simply chat about the fascinating world of AI and machine learning. Feel free to reach out via email, LinkedIn, or X (Twitter), and let's chat!