In my dissertation, I develop statistical methods for analyzing rank data. Substantively, my project examines the consequences of an emerging electoral reform — switching from first-past-the-post to ranked-choice voting — for voters’ preferences and minority representation in the U.S. My dissertation is supported by the Electoral Reform Research Group (New America) and is composed of three articles.

In “Does Ranked-Choice Voting Reduce Racially Polarized Voting?” (with Theo Landsman), I examine whether switching from first-past-the-post (FPTP) to ranked-choice voting (RCV) reduces the degree of racially polarized voting. Despite a popular claim that RCV induces moderation in ethnic party competition, the literature has not yet offered theoretical evidence that RCV brings more moderation than FPTP does. To address this gap, I offer a multidimensional neo-Downsian theory of ethnic party competition and algorithmic evidence that RCV does not necessarily yield more moderation than FPTP. To test this hypothesis, I collect data from RCV and FPTP mayoral elections in the Bay Area, California, from 1994 to 2020. Using rank clustering methods and ecological inference, I show that switching from FPTP to RCV neither increases nor decreases the degree of racially polarized voting. This result is robust to the choice of metric used to measure the distance between each pair of ranked preferences.
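
To make the idea of a distance between two ranked preferences concrete, here is a minimal Python sketch of two widely used rank metrics, the Kendall tau distance and the Spearman footrule. The summary above does not say which metrics the paper uses, so these two are chosen as assumptions for illustration, and the function names and example ballots are hypothetical.

```python
from itertools import combinations


def _positions(ranking):
    """Map each candidate to its position (0 = first choice)."""
    return {candidate: i for i, candidate in enumerate(ranking)}


def kendall_tau_distance(rank_a, rank_b):
    """Count candidate pairs that the two ballots order in opposite ways."""
    pos_a, pos_b = _positions(rank_a), _positions(rank_b)
    if pos_a.keys() != pos_b.keys():
        raise ValueError("both ballots must rank the same candidates")
    return sum(
        1
        for x, y in combinations(rank_a, 2)
        if (pos_a[x] - pos_a[y]) * (pos_b[x] - pos_b[y]) < 0
    )


def spearman_footrule(rank_a, rank_b):
    """Sum, over candidates, the absolute difference in rank position."""
    pos_a, pos_b = _positions(rank_a), _positions(rank_b)
    if pos_a.keys() != pos_b.keys():
        raise ValueError("both ballots must rank the same candidates")
    return sum(abs(pos_a[c] - pos_b[c]) for c in pos_a)


# Hypothetical ballots over three candidates.
ballot_1 = ["A", "B", "C"]
ballot_2 = ["A", "C", "B"]   # swaps the last two choices
ballot_3 = ["C", "B", "A"]   # full reversal of ballot_1
```

Both functions assume complete rankings over the same candidates; partially ranked ballots would need an extended metric, which this sketch does not attempt.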

In “Statistical Methods for Partially Ranked Ballot Data,” I develop a new statistical model for analyzing data from ranked-choice voting (RCV) elections. Many rank elicitation processes, such as RCV elections, generate partially ranked data. Existing methods, however, do not properly account for the legal and behavioral mechanisms that lead voters to cast partially ranked ballots, so researchers who apply existing tools to ranked ballot data must rely on unreasonable assumptions. To remedy this problem, I introduce a novel framework for analyzing and modeling partially ranked data. The core idea is to decompose observed partial rankings into three forms: (1) structural partial rankings imposed by election law (voters are allowed to rank only their top K candidates); (2) incomplete rankings caused by random errors (voters fail to rank lower-choice candidates); and (3) strategic ballot concentration (voters intentionally choose only one or two candidates). Observed partial rankings are then modeled as a mixture of these three components. The last component has been especially important for minority voters and is known as “plumping” in the literature on alternative voting systems and minority representation (e.g., cumulative voting and limited voting). I illustrate the proposed method by analyzing individual-level ranked ballots from more than 100 ranked-choice voting elections in the U.S.
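
The three-part decomposition can be pictured as a toy data-generating process. The Python sketch below is only an illustration of that idea, not the model developed in the paper: the function names, the mixture weights, the uniform stopping rule for random errors, and the example preference order are all assumptions made for this example.

```python
import random


def observed_ballot(full_ranking, max_ranks, component, rng):
    """Turn a complete preference order into the ballot a voter might cast."""
    # (1) Structural: election law caps every ballot at max_ranks candidates.
    ballot = list(full_ranking[:max_ranks])
    if component == "structural":
        return ballot
    if component == "random_error":
        # (2) The voter stops at a random point (uniform here, by assumption),
        # so lower-choice candidates go unranked.
        return ballot[: rng.randint(1, len(ballot))]
    if component == "plumping":
        # (3) The voter deliberately lists only one or two candidates.
        return ballot[: rng.choice([1, 2])]
    raise ValueError(f"unknown component: {component}")


def simulate_ballots(full_ranking, max_ranks, weights, n, seed=0):
    """Draw n ballots from a mixture of the three components."""
    rng = random.Random(seed)
    names = list(weights)
    draws = rng.choices(names, weights=[weights[k] for k in names], k=n)
    return [observed_ballot(full_ranking, max_ranks, c, rng) for c in draws]


# Hypothetical example: five candidates, ballots capped at three ranks.
preferences = ["A", "B", "C", "D", "E"]
mix = {"structural": 0.5, "random_error": 0.3, "plumping": 0.2}
ballots = simulate_ballots(preferences, max_ranks=3, weights=mix, n=200)
```

In this toy setup a ballot listing only "A" could have come from either random error or plumping, which is why the observed data are naturally described as a mixture and not sorted into categories by inspection.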

In “Causal Inference with Rankings as Generalized Discrete Outcomes,” I propose a potential outcomes-based framework for identifying and estimating causal effects of treatments on ranked outcomes in randomized experiments. Rankings offer rich information about people’s preferences that can give rise to various types of discrete outcomes, including binary, multinomial, and pairwise choices, which are well studied in political science research. With this framework, researchers can make causal inferences about people’s ranked preferences over a wide array of items in the social and medical sciences. Given the high-dimensional nature of ranked outcomes, I introduce three different causal estimands and derive appropriate estimators (for both partial and point identification) and inference methods for them. I present several optimal experimental designs for each estimand and discuss potential issues of which applied researchers should be aware. I illustrate this method by analyzing ballot order effects (the effect of the order in which candidate names appear on a ballot on voters’ ranked outcomes) and their heterogeneity by race and ethnicity in upcoming RCV elections in the U.S.
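
To show how a single ranking gives rise to binary, multinomial, and pairwise outcomes, here is a short Python sketch. The helper names and the data are hypothetical, and the difference in means at the end is the textbook estimator for a randomized experiment, included as an assumption for illustration; it is not claimed to be one of the three estimands introduced in the paper.

```python
def top_choice(ranking):
    """Multinomial outcome: which item is ranked first."""
    return ranking[0]


def ranked_first(ranking, item):
    """Binary outcome: 1 if the item is ranked first, else 0."""
    return int(ranking[0] == item)


def prefers(ranking, a, b):
    """Pairwise outcome: 1 if a is ranked above b, else 0."""
    return int(ranking.index(a) < ranking.index(b))


def difference_in_means(outcomes, treated):
    """Mean outcome among treated units minus mean among control units."""
    t = [y for y, d in zip(outcomes, treated) if d]
    c = [y for y, d in zip(outcomes, treated) if not d]
    return sum(t) / len(t) - sum(c) / len(c)


# Made-up data: four voters rank candidates A, B, C. The treatment indicator
# marks voters who, hypothetically, saw candidate A listed first on the ballot.
rankings = [["A", "B", "C"], ["B", "A", "C"], ["A", "C", "B"], ["C", "B", "A"]]
saw_a_first = [1, 0, 1, 0]

a_over_b = [prefers(r, "A", "B") for r in rankings]
effect_on_a_over_b = difference_in_means(a_over_b, saw_a_first)
```

Each derived outcome keeps only part of the information in the ranking, which is one reason to define estimands on the ranking itself instead of on a single summary.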