Avoiding Bias When Inferring Race Using Name-based Approaches


Manual validation

The data is presented as a series of distributions of names across race (Table 1). In name-based inference methods, it is not uncommon to use a threshold to create a categorical distinction: e.g., using a 90% threshold, one would assume that all instances of Juan as a given name should be categorized as Hispanic and all instances of Washington as a family name should be categorized as Black. In such a situation, any name not reaching this threshold would be excluded (e.g., those with the family name “Lee” would be removed from the analysis). This approach, however, assumes that the distinctiveness of names does not differ significantly across races.
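The thresholding rule described above can be sketched in a few lines. This is a minimal illustration, not the authors' code: the function name `classify_by_threshold`, the dictionary layout, and every number in `NAME_DISTRIBUTIONS` are assumptions made for the example and are not taken from Table 1.

```python
# Hypothetical name-to-race distributions. The shares are illustrative
# placeholders only; they are NOT the values reported in Table 1.
NAME_DISTRIBUTIONS = {
    "juan": {"Asian": 0.01, "Black": 0.01, "Hispanic": 0.94, "White": 0.04},
    "lee": {"Asian": 0.43, "Black": 0.17, "Hispanic": 0.02, "White": 0.38},
}


def classify_by_threshold(name, distributions, threshold=0.90):
    """Return the single category whose share meets the threshold.

    Returns None when the name is unknown or when no category reaches
    the threshold, which corresponds to excluding the name from analysis.
    """
    dist = distributions.get(name.lower())
    if dist is None:
        return None
    # Pick the most likely category and check it against the cutoff.
    category, share = max(dist.items(), key=lambda item: item[1])
    return category if share >= threshold else None


print(classify_by_threshold("Juan", NAME_DISTRIBUTIONS))  # meets the cutoff
print(classify_by_threshold("Lee", NAME_DISTRIBUTIONS))   # excluded (None)
```

With these placeholder shares, a name like "Lee" is dropped entirely because no single category reaches 90%, which is the exclusion effect the paragraph describes.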


Table 1. Sample of family names (U.S. Census) and given names (mortgage data).

https://doi.org/10.1371/journal.pone.0264270.t001

To test this, we began our analysis by manually validating name-based inference at three threshold ranges: 70–79%, 80–89%, and 90–100%. We sampled 300 authors from the Web of Science (WoS) database, with 25 authors randomly sampled for every combination of racial category and inference threshold. Two coders manually queried a search engine for the name and affiliation of each author and attempted to infer a perceived racial category through visual inspection of their professional photos and information listed on their websites and CVs (e.g., affiliation with racialized organizations such as Omega Psi Phi Fraternity, Inc., SACNAS, etc.).
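The sampling design above (25 authors per combination of category and threshold range) can be sketched as a stratified draw. This is a hedged illustration under stated assumptions: the record fields (`category`, `probability`), the helper names, and the fixed seed are hypothetical, and the four categories listed are those named in the validation results. Four categories times three ranges times 25 authors gives the 300 sampled authors.

```python
import random

# Categories assumed from the validation results; labels are illustrative.
CATEGORIES = ["Asian", "Black", "Hispanic", "White"]


def band_label(probability):
    """Map an inferred probability to one of the three threshold ranges."""
    if probability >= 0.90:
        return "90-100%"
    if probability >= 0.80:
        return "80-89%"
    if probability >= 0.70:
        return "70-79%"
    return None  # below every validated range


def stratified_sample(authors, per_cell=25, seed=0):
    """Draw up to `per_cell` authors from each (category, range) cell."""
    rng = random.Random(seed)
    cells = {}
    for author in authors:
        band = band_label(author["probability"])
        if band is not None:
            cells.setdefault((author["category"], band), []).append(author)
    sample = []
    for key in sorted(cells):
        members = cells[key]
        # Sample without replacement; small cells contribute all members.
        sample.extend(rng.sample(members, min(per_cell, len(members))))
    return sample
```

A usage example would pass a list of dictionaries such as `{"name": "...", "category": "Asian", "probability": 0.93}`; if every cell holds at least 25 candidates, the result has 300 records.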

Fig 1 shows the number of valid and invalid inferences, as well as those for whom a category could not be manually identified, and those for whom no information was found. Name-based inference of Asian authors was found to be highly valid at every considered threshold. The inference of Black authors, in contrast, produced many invalid or uncertain classifications at the 70–80% threshold, but had higher validity at the 90% threshold. Similarly, inference of Hispanic authors was accurate only above the 80% threshold. Inference of White authors was highly valid at all thresholds but improved above 90%. This suggests that a simple threshold-based approach does not perform equally well across all racial categories. We therefore consider an alternative weighting-based scheme that does not provide an exclusive categorization but uses the full information of the distribution.
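One plausible reading of a weighting-based scheme is fractional counting, where each author contributes their whole name distribution rather than a single label. The sketch below illustrates that idea only; it is not confirmed as the authors' exact procedure, and the function name `fractional_counts` and all shares in `NAME_DISTRIBUTIONS` are hypothetical placeholders.

```python
from collections import defaultdict

# Hypothetical distributions; the shares are illustrative, not from Table 1.
NAME_DISTRIBUTIONS = {
    "juan": {"Asian": 0.01, "Black": 0.01, "Hispanic": 0.94, "White": 0.04},
    "lee": {"Asian": 0.43, "Black": 0.17, "Hispanic": 0.02, "White": 0.38},
}


def fractional_counts(names, distributions):
    """Sum each name's distribution so every author adds weight 1 in total.

    No name is excluded for failing a threshold; names missing from the
    reference data are skipped because no distribution is available.
    """
    totals = defaultdict(float)
    for name in names:
        dist = distributions.get(name.lower())
        if dist is None:
            continue
        for category, share in dist.items():
            totals[category] += share
    return dict(totals)


totals = fractional_counts(["Juan", "Lee"], NAME_DISTRIBUTIONS)
for category in sorted(totals):
    print(f"{category}: {totals[category]:.2f}")
```

Under this scheme an ambiguous name such as "Lee" still contributes to the totals, split across categories in proportion to its distribution, instead of being removed from the analysis.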
