Responsible AI Integration in Survey Research
Responsible AI Integration in Survey Research was written by David M. Rothschild, Jenny Marlar, Ashley Amaya, Soubhik Barari, Trent Buskirk, Curtiss Cobb, Jen Gennai, Sunshine Hillygus, Ramya Korlakai Vinayak, Masha Krupenkin, Sunghee Lee, Darby Steiger, and Brock Webb in 2026. It was published online as an AAPOR report.
My notes focus on section 3.1.2: AI as a Respondent.
The authors identify three applications of AI as a respondent:
- pre-field testing
- post-field imputation
- synthetic data
Pre-field testing of a survey instrument with AI respondents can be an effective way to catch problems before fielding, e.g. broken skip patterns.
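To make "broken skip patterns" concrete, here is a minimal sketch of a pre-field test harness. It is my own illustration, not a method from the report. The instrument, the question IDs, and the functions `run_respondent` and `pretest` are all hypothetical, and the `choose` callback is only a stand-in for an AI respondent (here it picks answers at random).

```python
import random

# Hypothetical instrument: question ID -> {answer option: next question ID}.
# "END" marks completion. Two defects are planted on purpose.
INSTRUMENT = {
    "Q1": {"yes": "Q2", "no": "Q3"},
    "Q2": {"daily": "Q4", "weekly": "Q4"},
    "Q3": {"agree": "Q5", "disagree": "END"},  # Q5 is never defined
    "Q4": {"done": "END"},
    "Q6": {"done": "END"},                     # no route leads to Q6
}


def run_respondent(instrument, choose, start="Q1", end="END"):
    """Walk one simulated respondent through the instrument.

    `choose(question_id, options)` stands in for the AI respondent and must
    return one of `options`. Returns (path, problem); problem is None when
    the respondent reaches the end.
    """
    path, current = [], start
    while current != end:
        if current not in instrument:
            return path, f"undefined question {current}"
        if current in path:
            return path, f"routing loop at {current}"
        path.append(current)
        answer = choose(current, sorted(instrument[current]))
        current = instrument[current][answer]
    return path, None


def pretest(instrument, choose, n_respondents=200):
    """Run many simulated respondents; collect problems and unvisited questions."""
    problems, visited = set(), set()
    for _ in range(n_respondents):
        path, problem = run_respondent(instrument, choose)
        visited.update(path)
        if problem:
            problems.add(problem)
    return problems, sorted(set(instrument) - visited)


rng = random.Random(0)  # seeded so the run is repeatable
problems, never_seen = pretest(INSTRUMENT, lambda q, options: rng.choice(options))
print("problems:", problems)
print("never reached:", never_seen)
```

In this toy instrument, a respondent who answers "no" and then "agree" is routed to the undefined Q5, and Q6 is never shown to anyone. Swapping the random `choose` for a real model call would be the step that turns this into AI-as-respondent testing; how to prompt that model is outside what these notes cover.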
For post-field imputation, there are already several approaches to missingness, some of which are model-based, e.g. multilevel regression and poststratification (MRP). While LLMs are generally considered less transparent, an LLM-imputed value is fundamentally just another model-based estimate. The authors anticipate greater risks with this approach when filling in sparse demographic clusters, or when inferring change over time or across items.
Collecting synthetic responses (also called 'silicon samples') as a substitute for surveying people carries the greatest risk of the three applications. These responses are generally too homogeneous, especially for "marginalized groups, culturally specific concepts, and emotionally charged topics" because LLMs are tuned to avoid controversial positions. Synthetic responses seem to reproduce real marginal distributions, but it remains to be seen whether they reproduce multivariate structure, e.g. correlations between items.
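A small worked example of why matching marginals is not enough. This is my own sketch with made-up numbers, not data or code from the report. The two samples `real` and `synthetic` and the helpers `marginals` and `pearson` are hypothetical.

```python
from math import sqrt

# Made-up (item_a, item_b) answer pairs, coded 0/1. Not data from the report.
real = [(1, 1)] * 40 + [(0, 0)] * 40 + [(1, 0)] * 10 + [(0, 1)] * 10
synthetic = [(1, 1)] * 25 + [(0, 0)] * 25 + [(1, 0)] * 25 + [(0, 1)] * 25


def marginals(pairs):
    """Share answering 1 on each item, considered separately."""
    n = len(pairs)
    return sum(a for a, _ in pairs) / n, sum(b for _, b in pairs) / n


def pearson(pairs):
    """Pearson correlation between the two items."""
    mean_a, mean_b = marginals(pairs)  # for 0/1 items the share is the mean
    cov = sum((a - mean_a) * (b - mean_b) for a, b in pairs)
    var_a = sum((a - mean_a) ** 2 for a, _ in pairs)
    var_b = sum((b - mean_b) ** 2 for _, b in pairs)
    return cov / sqrt(var_a * var_b)


for name, sample in [("real", real), ("synthetic", synthetic)]:
    print(name, "marginals:", marginals(sample), "correlation:", pearson(sample))
```

Both samples show 50% on each item, so a check on marginals alone would call them equivalent. The item-to-item correlation is 0.6 in the "real" sample and 0 in the "synthetic" one, which is the kind of multivariate gap the authors flag as still untested.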
