Adding more categories is rarely good enough: How to translate data inclusivity best practice into analytical inclusivity
Representative samples are necessary to make accurate inferences about target populations of interest. When the populations are people, it is often crucial that samples be representative across a suite of demographic variables, notably race, ethnicity, sex, and gender. Current best practice is usually to offer many different categories for these variables that an individual can identify with (perhaps several simultaneously) in an effort to maximize inclusion of demographic diversity. However, there are often major obstacles to translating this “data inclusivity” into actual “analytical inclusivity” once data have been collected and are analyzed with various statistical techniques. For instance, analyses usually contain a single “ethnicity” variable, but does an individual identifying with three different ethnicities have only one of the three represented in the analysis? Or, if we have 20 different ethnicity categories, how can we say anything meaningful about the majority of categories that invariably have sparse representation, when most statistical tools require sufficient sample sizes to estimate effects? Is gender best captured as a categorical response, a continuous one, or something else? If we collapse categories with sparse responses (the usual recommendation), are we not destroying the attempted inclusivity of the data collection process? In this talk, I will discuss these questions (and more) in detail and offer modern solutions (when they exist) and ideas for best practice for translating data inclusivity into analytical inclusivity.
Date & Time
Tuesday, October 18, 2022, 1:00-2:30 pm
Hybrid: Zoom and Room 2C, Neville Scarfe Education Building
Dr. Edward Kroc
This Zoom event will be hosted by the PSCTC, ECPS, Faculty of Education, UBC
Registration closes October 17, 2022