Research
How people understand the systems they use — and how those systems should be designed with that in mind.
2024–2026
AI Coaching & Trust Calibration in Negotiations
Designed and ran a 267-person experiment on Trucey, an AI negotiation coach. Conversational AI reduced fear but degraded human judgment — imposing cognitive load under stress and undermining integrative thinking. Cognitive profiles predicted when AI guidance helped vs. harmed.
2025–2026
Mutual Theory of Mind in AI-Assisted Education
Investigated trust-reliance dynamics in AI-assisted learning contexts. Users pragmatically relied on AI despite accuracy concerns, revealing a disconnect between stated trust and actual reliance behavior. Also examined how students calibrate (or fail to calibrate) their trust in AI vs. human responses.
2024–2025
RL Agent Misalignment
Experiments on model organisms of misalignment in Minecraft RL environments. Empirically documented alignment failures including mesa-optimizer objective resistance, reward hacking via underground tunneling, and instrumental convergence failures.
2025
Multimodal AI Safety Evaluation
Systematic review of 176 multimodal AI systems using an LLM-assisted research pipeline (Gemini Pro; inter-rater reliability κ = 0.717). Found that 93% strip social context via modality-to-text conversion and 45% lack ethical discussion, with evaluation over-relying on static benchmarks rather than human-centered assessment.