Standardized Letters of Recommendation in Plastic Surgery: T… : Plastic and Reconstructive Surgery

The selection of trainees is essential to the success of the field of plastic surgery and the success of individual programs. Programs each have unique algorithms and methods in selecting their interviewees and ranking applicants. Although there are a variety of important selection factors, including United States Medical Licensing Examination scores, Alpha Omega Alpha designation, clerkship grades, and research endeavors, letters of recommendation are consistently found to be the most important resident selection factor.1–3 Both what the letters convey and who writes the letters are important to the quality of a letter of recommendation.1,3,4 With so much importance placed on the letter of recommendation, it is vital to understand both the strengths and shortcomings of relying on letters of recommendation for an accurate representation of applicants.

Although narrative letters of recommendation have long been the standard, standardized letters of recommendation have seen increasing popularity in the last few decades. These standardized letters use percentile scoring in various applicant attributes combined with short answer questions to describe an applicant and are used either alone or in combination with a narrative letter of recommendation. In 1995, emergency medicine first introduced the concept of standardized letters of recommendation in an effort to curtail adjective inflation and increase the relaying of accurate information about applicants. This was met with a positive response, showing increased interrater reliability between standardized letters of recommendation and decreased time to interpret the letter.5 In 2011, otolaryngology adopted the standardized letters of recommendation,6 followed by plastic surgery in 2012.7 Since 2012, both dermatology and orthopedic surgery have created their own versions of standardized letters. Although standardized letters of recommendation in these other specialties have seen some success, they are not without their own shortcomings; analysis of standardized letters in emergency medicine and otolaryngology reveals grade inflation and gender bias.8–12

Throughout the years, the use of standardized letters of recommendation within plastic surgery has undergone revisions, and they are largely used as an adjunct to the narrative letter of recommendation.7 Although the standardized letter has existed for nearly a decade, there has yet to be a critical analysis of its strengths, weaknesses, and intrinsic biases. Given recent literature that has unveiled the low rates of minority representation among interviewees and residents13,14 and persistent racial and gender disparities,15–17 it is important to understand how letters of recommendation may play a role in perpetuating these disparities. Through analysis of scoring patterns, this study evaluated the impact of race and gender on performance in standardized letters of recommendation and provides suggestions for programs to optimally interpret them and minimize bias when writing letters of recommendation.


Data Collection

Institutional review board exemption was obtained at the senior author’s (A.A.G.) institution to analyze standardized letters of recommendation that were submitted to the institution between 2015 and 2019. Available standardized letters were those of interviewed applicants to the integrated plastic surgery program between 2015 and 2018 (82 total applicants) and all applicants for the year 2019 (241 total applicants). The format of the standardized letters was changed before the 2019/2020 application year, with the adoption of percentile format, deletion of the Conscientiousness category, and addition of the Patient Care and Research/Training categories.7 Two authors (S.W. and J.B.) transferred the data from secure standardized letters of recommendation to an encrypted and anonymized data spreadsheet, while two authors (M.K.C. and M.R.) performed the data interpretation and analysis. The data collected included applicant demographics, letter writer demographics, and scores in each standardized letter of recommendation category. Gender of the letter writer was ascertained from online biographies or program websites. Data and demographic information were not complete for every applicant; the relevant sample size for a presented value is noted in each data table when data were not available for the entire sample. Statistical analyses were performed using SPSS 2020 software (IBM, Armonk, N.Y.).

Data Analysis

Because of differences in application availability and format of the standardized letters of recommendation, the 2015 to 2018 standardized letters were analyzed separately from the 2019 standardized letters. The 2015 to 2018 standardized letter characteristics have five scoring categories: top 5 percent, top 10 percent, top 25 percent, top 50 percent, and bottom 50 percent, which were numerically represented as 1, 2, 3, 4, and 5, respectively. Given their numerical spectrum in a percentile format and lack of independence, these categories were represented as a continuous variable for relevant analyses, such as t testing. The data from the 2019 standardized letters of recommendation were in percentile format and were analyzed as a continuous variable. Linear regression analysis was used to determine a correlation between length of contact between applicant and letter writer and standardized letters of recommendation scores. One-way analysis of variance was used to determine if nature of contact was correlated to standardized letters of recommendation scores; a post hoc test was unable to be performed as the data did not fit relevant assumptions (not continuous or normally distributed, unequal sample sizes). Multivariable regressions were used to analyze whether other demographic factors, including age, hometown, and school region, correlated to standardized letters of recommendation score. Unpaired two-tailed t tests were utilized to compare scores by gender and by race/ethnicity.
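The coding and comparison steps described above can be sketched in a few lines. This is a minimal illustration only, not the SPSS procedure used in the study: the function names (`encode`, `pooled_t_statistic`, `least_squares`) and all data values are hypothetical, and the equal-variance form of the unpaired t statistic is an assumption, because the text does not state which variant was used.

```python
from math import sqrt
from statistics import mean, variance

# Ordinal codes given in the text for the 2015 to 2018 form (1 = best category).
CATEGORY_CODES = {
    "top 5 percent": 1,
    "top 10 percent": 2,
    "top 25 percent": 3,
    "top 50 percent": 4,
    "bottom 50 percent": 5,
}


def encode(ratings):
    """Map category labels to the 1-5 numeric codes."""
    return [CATEGORY_CODES[r] for r in ratings]


def pooled_t_statistic(a, b):
    """Unpaired t statistic (equal-variance form) and its degrees of freedom."""
    na, nb = len(a), len(b)
    pooled_var = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    t = (mean(a) - mean(b)) / sqrt(pooled_var * (1 / na + 1 / nb))
    return t, na + nb - 2


def least_squares(x, y):
    """Slope and intercept of the ordinary least-squares line y = slope * x + intercept."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx


# Made-up ratings for two hypothetical groups of letters (not study data).
group_a = encode(["top 5 percent", "top 5 percent", "top 10 percent", "top 5 percent"])
group_b = encode(["top 10 percent", "top 25 percent", "top 10 percent", "top 5 percent"])
t_stat, dof = pooled_t_statistic(group_a, group_b)

# Made-up lengths of contact (weeks) and coded scores (not study data).
weeks = [4, 8, 52, 104]
scores = [3, 2, 2, 1]
slope, intercept = least_squares(weeks, scores)
```

On the 1-to-5 coding a lower number is a better rating, so a negative slope in this sketch corresponds to better ratings with longer contact. A p value would additionally require the t distribution, which a statistics package supplies.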

Interrater reliability was calculated for each of the standardized letters of recommendation categories among three randomly selected letters for one applicant as Fleiss kappa. For the 2019 standardized letters, the response percentiles were binned into five categories: 0 to 20 percent, 21 to 40 percent, 41 to 60 percent, 61 to 80 percent, and 81 to 100 percent. Strength of agreement was considered poor if less than 0.20, fair if 0.21 to 0.40, moderate if 0.41 to 0.60, and good if greater than 0.6.
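For readers unfamiliar with the statistic, the following sketch shows one way to bin percentiles into the five bands above, compute Fleiss kappa, and apply the stated strength-of-agreement cutoffs. It assumes that each applicant's three letters are treated as three interchangeable raters; the function names and the example percentiles are hypothetical and do not come from the study data.

```python
from collections import Counter


def bin_percentile(p):
    """Bin a 0-100 percentile into the five bands named in the text."""
    if p <= 20:
        return 0
    if p <= 40:
        return 1
    if p <= 60:
        return 2
    if p <= 80:
        return 3
    return 4


def fleiss_kappa(ratings, n_categories):
    """Fleiss kappa; ratings[i] lists the category each rater assigned to subject i."""
    n_subjects = len(ratings)
    n_raters = len(ratings[0])
    counts = [Counter(r) for r in ratings]
    # Share of all assignments that fall in each category.
    p_j = [sum(c[j] for c in counts) / (n_subjects * n_raters)
           for j in range(n_categories)]
    # Observed agreement for each subject.
    p_i = [(sum(c[j] ** 2 for j in range(n_categories)) - n_raters)
           / (n_raters * (n_raters - 1)) for c in counts]
    p_bar = sum(p_i) / n_subjects   # mean observed agreement
    p_e = sum(p ** 2 for p in p_j)  # agreement expected by chance
    return (p_bar - p_e) / (1 - p_e)


def strength(kappa):
    """Label agreement using the cutoffs stated in the text."""
    if kappa <= 0.20:
        return "poor"
    if kappa <= 0.40:
        return "fair"
    if kappa <= 0.60:
        return "moderate"
    return "good"


# Made-up percentiles: three letters for each of four hypothetical applicants.
letters = [[95, 90, 85], [90, 75, 95], [80, 85, 70], [60, 90, 85]]
binned = [[bin_percentile(p) for p in row] for row in letters]
kappa = fleiss_kappa(binned, n_categories=5)
label = strength(kappa)
```

In this made-up example most ratings fall in the top band, so chance agreement is already high and kappa comes out slightly below zero despite apparently similar scores. This illustrates how a compressed rating range can produce a low kappa.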


A total of 267 standardized letters of recommendation for 82 applicants from 2015 to 2018 were analyzed, which represents a total of 11.3 percent of U.S. plastic surgery applicants from those years.18–21 Each student had, on average, 3.25 standardized letters of recommendation. A total of 677 standardized letters of recommendation for 241 applicants from 2019 were analyzed, which represents nearly all applicants who ranked programs in plastic surgery from that year. Each student had, on average, 2.79 standardized letters of recommendation. The majority of letters were written by division/department heads or program directors (Table 1). Most often, letter writers knew the applicants through clinical contact (>50 percent), although some were acquainted through research contact (7 percent) or both research and clinical contact (~30 percent).

Table 1. Applicant and Letter Writer Demographics*

| Characteristic | 2015–2018 | 2019 |
|---|---|---|
| Total no. of applicants | 82 | 241 |
| Total no. of letters | 267 | 677 |
| Average no. of letters per applicant ± SD | 3.25 ± 0.90 | 2.79 ± 0.99 |
| Average length of contact, wk | 66.6 ± 71.5 (n = 265) | 63.90 ± 84.52 (n = 645) |
| Average applicant age ± SD, yr | 29.7 ± 2.32 | 27.47 ± 2.58 |
| Applicant race | (n = 70) | (n = 238) |
|  Asian | 27% | 21% |
|  Black | 0% | 4% |
|  Mixed | 6% | 6% |
|  Other | 1% | |
|  Hispanic White | 10% | 4% |
|  Non-Hispanic White | 56% | 65% |
| Applicant gender identity | | |
|  Female | 50% | 47% |
|  Male | 50% | 53% |
|  Transgender or nonbinary | 0% | 0% |
| Applicant hometown location | (n = 72) | (n = 240) |
|  Midwest | 8% | 18% |
|  Northeast | 14% | 20% |
|  South | 26% | 25% |
|  West | 28% | 20% |
|  International | 24% | 6% |
|  Multiple | 0% | 11% |
| Applicant medical school location | | |
|  Midwest | 17% | 21% |
|  Northeast | 2% | 28% |
|  South | 2% | 32% |
|  West | 33% | 12% |
|  International | 24% | 7% |
|  Multiple | 3% | 1% |
| Average Step 1 score ± SD | 249 ± 11.4 | 246 ± 13 |
| Average Step 2 score ± SD | 256 ± 10.0 (n = 58) | 253 ± 13 (n = 186) |
| Average no. of publications ± SD | 7.49 ± 12.09 (n = 79) | 5.9 ± 10.6 (n = 223) |
| Average no. of endeavors ± SD | 10.55 ± 11.2 | 19.9 ± 20.0 |
| Writer gender | (n = 267) | (n = 677) |
|  Female | 17% | 12% |
|  Male | 83% | 86% |
|  Multiple writers of different genders | 0% | 2% |
| Writer region | (n = 267) | (n = 677) |
|  Midwest | 15% | 25% |
|  Northeast | 20% | 30% |
|  South | 35% | 30% |
|  West | 30% | 13% |
|  International | 0% | 2% |
| Writer position | (n = 267) | (n = 677) |
|  Department chair/chief | 45% | 32% |
|  Program director | 27% | 27% |
|  Professor | 8% | 13% |
|  Associate professor | 10% | 11% |
|  Assistant professor | 6% | 9% |
|  Medical student director | 2% | <1% |
|  Private practice | 1% | 1% |
|  Other | 1% | 5% |
| Writer division | (n = 267) | (n = 677) |
|  Plastic surgery† | 85% | 90% |
|  General surgery | 9% | 6% |
|  Private practice | <1% | <1% |
|  Other‡ | 5% | 3% |
| Nature of contact | (n = 267) | (n = 677) |
|  Clinical | 67% | 56% |
|  Research | 7% | 7% |
|  Clinical and research | 24% | 34% |
|  Other§ | <1% | <1% |

*Demographic information for both applicants and letter writers is presented here. When information was not available for every applicant for a specific demographic variable, the relevant sample size is designated (n).

†Includes plastic surgery subspecialties such as aesthetic surgery, microsurgery, and craniofacial surgery.

‡Includes bariatric and minimally invasive surgery, burn service, cardiothoracic surgery, comprehensive breast health, Global Health Institute, hand surgery, master’s education in plastic surgery, neurosurgery, orthopedic surgery, otolaryngology, pediatric surgery, and surgical oncology.

§Includes advising relationships.

2015 to 2018 Integrated Applicants

Demographics for this group of applicants and letter writers are listed in Table 1. Every applicant attribute had an average score in the top 10 percent. The highest-scoring categories were Professionalism, Team Player, and Work Ethic, and the lowest scoring categories were Overall Rating, Conscientiousness, and Academic Performance (Table 2). According to linear regression analysis, a longer length of contact between the letter writer and applicant was correlated with higher scores in all characteristics (p < 0.05) except for the categories of Technical Ability (p = 0.10) and Team Player (Table 3). Clinical contact alone as compared to research contact correlated with significantly lower scores in the Overall (clinical: 1.61, research: 1.18, p = 0.002), Conscientiousness (clinical: 1.65, research: 1.38, p = 0.01), and Academic Performance categories (clinical: 1.68, research: 1.28, p < 0.001) (Table 3). Applicant age (p = 0.84), hometown (p = 0.89), and school region (p = 0.90) were not correlated to applicant scoring. Fleiss kappa analysis demonstrated poor agreement (κ < 0.2) among letter writers for every applicant attribute.

Table 2. Standardized Letters of Recommendation Rating Distributions

| Category (n, 2015–2018/2019) | 2015–2018 SLOR (Average Score ± SD)* | 2019 SLOR (Average Percentile Rank ± SD) |
|---|---|---|
| Total SLOR, no. | 267 | 679 |
| Overall rating (n = 265/672) | 1.50 ± 0.68 | 86.09 ± 11.49 |
| Work Ethic (n = 265/674) | 1.26 ± 0.50 | 90.44 ± 9.91 |
| Technical Ability (n = 258/674) | 1.44 ± 0.76 | 85.16 ± 10.58 |
| Conscientiousness (n = 255/NA) | 1.56 ± 0.68 | NA |
| Self-Initiative (n = 259/674) | 1.31 ± 0.62 | 90.22 ± 10.48 |
| Communication (n = 266/674) | 1.36 ± 0.56 | 88.14 ± 10.72 |
| Academic (n = 262/669) | 1.54 ± 0.74 | 86.90 ± 10.76 |
| Team Player (n = 266/669) | 1.24 ± 0.50 | 90.54 ± 10.25 |
| Professionalism (n = 227/675) | 1.12 ± 0.35 | 91.62 ± 9.88 |
| Patient Care (n = NA/647) | NA | 86.78 ± 12.00 |
| Research/Training (n = NA/665) | NA | 91.62 ± 9.88 |

SLOR, standardized letters of recommendation; NA, not applicable.

*For the 2015 to 2018 cohort, categories are treated as a continuous variable in which 1 is assigned to the top 5 percent category, 2 is assigned to the top 10 percent category, 3 is assigned to the top 25 percent category, 4 is assigned to the top 50 percent category, and 5 is assigned to the below 50 percent category. For the 2019 cohort, ratings are a continuous variable from the zero to 100th percentiles.

Table 3. Relationship between Length of Contact and Standardized Letters of Recommendation Scores*

| Category | 2015–2018 SLOR Regression Equation† | p | 2019 SLOR Regression Equation‡ | p |
|---|---|---|---|---|
| Overall rating | Y = −0.152X + 1.6 | 0.007 | y = 0.195x + 84.2 | <0.001 |
| Work Ethic | Y = −0.140X + 1.3 | 0.011 | y = 0.160x + 89.1 | <0.001 |
| Technical Ability | Y = −0.080X + 1.5 | 0.102 | y = 0.149x + 84.0 | <0.001 |
| Conscientiousness | Y = −0.164X + 1.7 | 0.004 | NA | NA |
| Self-Initiative | Y = −0.150X + 1.4 | 0.008 | y = 0.197x + 88.5 | <0.001 |
| Communication | Y = −0.166X + 1.4 | 0.003 | y = 0.099x + 87.2 | 0.006 |
| Academic | Y = −0.189X + 1.7 | 0.001 | y = 0.155x + 85.5 | <0.001 |
| Team Player | Y = −0.135X + 1.3 | 0.140 | y = 0.065x + 89.9 | 0.051 |
| Professionalism | Y = −0.197X + 1.2 | 0.001 | y = 0.136x + 90.6 | <0.001 |
| Patient Care | NA | NA | y = 0.092x + 86.8 | 0.011 |
| Research/Training | NA | NA | y = 0.225x + 84.7 | <0.001 |

SLOR, standardized letters of recommendation; NA, not applicable.

*This represents linear regressions performed to determine whether there is an association between length of contact and standardized letters of recommendation scores.

†“Y” is the SLOR score from 2015 to 2018 on a scale of 1 (top 5 percent category) to 5 (below 50 percent category). “X” is length of contact measured in weeks. Negative correlations indicate higher scoring with longer length of contact.

‡“y” is the SLOR score from 2019 on a percentile scale. “x” is the length of contact measured in weeks. Positive correlations indicate higher scoring with longer length of contact.

2019 Integrated Applicants

Demographics for the 2019 applicants and letter writers are listed in Table 1. No scoring category had an average score below the eighty-fifth percentile. The highest-scoring categories were Team Player, Work Ethic, and Self-Initiative, and the lowest-scoring categories were Technical Ability, Overall, and Research/Training (Table 2). A longer length of contact between the letter writer and applicant was significantly correlated with higher scores in all characteristics (p < 0.005) (Table 3). The nature of contact between letter writer and applicant affected standardized letters of recommendation scores as follows: clinical contact correlated to significantly lower scores than research contact in every category (p < 0.001) (Table 4). Applicant age (p = 0.31), hometown (p = 0.87), and school region (p = 0.16) were not correlated to applicant scoring. Fleiss kappa analysis demonstrated poor agreement (κ < 0.2) among letter writers for every applicant attribute.

Table 4. Relationship between Nature of Contact and Standardized Letters of Recommendation Scores

| Category | 2015–2018 Clinical | 2015–2018 Research | 2015–2018 Both | p* | 2019 Clinical | 2019 Research | 2019 Both | p* |
|---|---|---|---|---|---|---|---|---|
| Overall rating | 1.61 ± 0.72 | 1.18 ± 0.53 | 1.33 ± 0.54 | 0.002 | 83.61 ± 11.61 | 89.07 ± 13.06 | 89.48 ± 9.50 | <0.001 |
| Work Ethic | 1.32 ± 0.55 | 1.11 ± 3.23 | 1.16 ± 0.41 | 0.065 | 88.68 ± 10.10 | 93.33 ± 9.77 | 92.75 ± 8.87 | <0.001 |
| Technical Ability | 1.44 ± 0.68 | 1.13 ± 0.52 | 1.54 ± 1.01 | 0.235 | 83.33 ± 10.94 | 91.00 ± 7.59 | 87.37 ± 9.58 | <0.001 |
| Conscientiousness | 1.65 ± 0.71 | 1.38 ± 0.87 | 1.36 ± 0.52 | 0.011 | NA | NA | NA | <0.001 |
| Self-Initiative | 1.39 ± 0.68 | 1.22 ± 0.55 | 1.16 ± 0.44 | 0.057 | 87.78 ± 10.81 | 95.00 ± 7.31 | 93.26 ± 8.94 | <0.001 |
| Communication | 1.42 ± 0.60 | 1.17 ± 0.38 | 1.25 ± 0.47 | 0.084 | 86.24 ± 10.84 | 90.47 ± 13.62 | 90.73 ± 9.09 | <0.001 |
| Academic | 1.68 ± 0.80 | 1.28 ± 0.58 | 1.29 ± 0.52 | <0.001 | 84.87 ± 10.59 | 90.47 ± 10.68 | 89.53 ± 9.88 | <0.001 |
| Team Player | 1.30 ± 0.56 | 1.06 ± 0.24 | 1.14 ± 0.04 | 0.068 | 89.05 ± 10.50 | 92.33 ± 12.69 | 92.72 ± 8.87 | <0.001 |
| Professionalism | 1.14 ± 0.38 | 1.06 ± 0.25 | 1.10 ± 0.30 | 0.596 | 90.24 ± 9.81 | 94.89 ± 8.15 | 93.27 ± 9.73 | <0.001 |
| Patient Care | NA | NA | NA | NA | 86.13 ± 9.47 | 86.54 ± 14.41 | 90.13 ± 9.67 | <0.001 |
| Research/Training | NA | NA | NA | NA | 83.45 ± 11.94 | 92.67 ± 10.31 | 91.03 ± 9.57 | <0.001 |

SLOR, standardized letters of recommendation; NA, not applicable.

*The p value represents an unpaired two-tailed t test comparing clinical and research contact.

p < 0.05 was considered statistically significant.

Analysis by Gender

Fifty percent of the 2015 to 2018 applicant cohort and 47 percent of the 2019 applicant cohort were female applicants. In the 2015 to 2018 cohort, female applicants were rated lower on average than male applicants in seven of nine categories, and this difference was significant in Academic Performance (male applicants, 1.45; female applicants, 1.63; p = 0.05), despite no significant difference in United States Medical Licensing Examination Step 1 or Step 2 score by gender. Furthermore, male letter writers scored male applicants significantly higher in the Overall, Conscientiousness, Self-Initiative, and Academic Performance categories, whereas female letter writers scored female applicants higher in the Communication and Overall categories (Fig. 1). As more than 80 percent of letter writers are men, this cumulatively advantaged male applicants. In the 2019 cohort, female gender was not significantly correlated with lower scoring in any categories.

Fig. 1. The 2015 to 2018 standardized letters of recommendation (SLOR) scores by gender of applicant and writer. Blue bars represent male (M) applicants, and green bars represent female (F) applicants. An asterisk over M-M (male-to-male) writers indicates a statistically significant difference (p < 0.05) via paired t testing between male writer rating of female and male applicants. An asterisk over F-F (female-to-female) writers indicates a statistically significant difference (p < 0.05) using paired t testing between female writer rating of female and male applicants.

Analysis by Race/Ethnicity

Fifty-six percent of applicants in the 2015 to 2018 cohort and 65 percent of applicants in the 2019 cohort identified as non-Hispanic White students. Only 10 percent of the 2015 to 2018 cohort and 8 percent of the 2019 cohort identified as Black or Hispanic. Minority students were considered to be Black, Hispanic, Native American/Alaska Native, Asian, or mixed-race students. Asian students were specifically included because of racial biases that exist in evaluation, although they are not underrepresented in medicine. In the 2015 to 2018 cohort, minority students scored lower than non-Hispanic White students in eight out of nine categories, although this difference was not significant. In the 2019 cohort, applicants of a minority race received lower scores on average in nine out of 10 categories, a difference that was significant in the Team Player category (non-Hispanic White, 91.5; minority, 88.2; p = 0.015) (Fig. 2).

Fig. 2. The 2019 standardized letters of recommendation (SLOR) scores by applicant race/ethnicity. Hispanic, Black/African American, Asian, and multiracial applicants are included in the group labeled “Minority.” An asterisk represents a statistically significant (p < 0.05) difference between scoring of the groups.


Given the ongoing challenges that plastic surgery faces in diversity in both trainee and leadership representation,13–15,22 it is crucial to analyze how plastic surgeons are judged and recruited along various steps in the professional pathway. This study presents the first critical analysis of the use of standardized letters of recommendation in plastic surgery resident selection and the potential role of gender and racial bias. In reflecting on the results of this study, the investigators hope to provide insight into how letter writers can improve upon their use of the standardized letters of recommendation in describing an applicant, and how letter readers can optimize their interpretation of the standardized letters of recommendation.

Grade Inflation

In a percentile distribution, the population should be equally distributed in each single percentile from zero to 100. Mathematically, it does not make sense for the applicants to be scoring, on average, in the eighty-fifth percentile or above in most categories; yet this is exactly what is happening with score inflation in the plastic surgery standardized letters of recommendation. This phenomenon of score inflation is not isolated to the field of plastic surgery; it is described in both emergency medicine and otolaryngology, in which nearly all standardized letters of recommendation responses are in the top two deciles.9,10 The narrative letters of recommendation, which are usually used as an adjunct to the standardized letters of recommendation, fall prey to a similar phenomenon, titled “superlative inflation.”23,24 Naturally, this makes it very difficult to differentiate applicants given the lack of granularity in scoring in the top two deciles. Schools and letter writers want their students to match and are thus incentivized to score them highly. Letter writers are often aware of the way other students are scored and, in keeping up with the inflation of other applicant scores, may have to score their applicant on the same inflated scale to avoid disadvantaging them. It becomes a cycle that is difficult to break; doing so would require a mutual nationwide agreement of honesty and transparency, which seems like an unrealistic goal at this point.
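The scale of this inflation can be illustrated with simple arithmetic. The sketch below compares the 2019 Overall rating reported in Table 2 (mean 86.09 across 672 letters) with what a uniform percentile distribution would produce. It rests on a loud assumption that does not hold in practice, namely that the letters are independent draws from the same reference population that defines the percentiles, so the result indicates magnitude only and is not a formal test.

```python
from math import sqrt

# Mean and SD of a continuous uniform distribution on [0, 100],
# which is what true percentile ranks of a reference population follow.
uniform_mean = (0 + 100) / 2
uniform_sd = (100 - 0) / sqrt(12)

# Reported 2019 Overall rating from Table 2.
observed_mean = 86.09
n_letters = 672

# Standard error of a mean of n independent uniform draws (an assumption).
standard_error = uniform_sd / sqrt(n_letters)

# Distance of the observed mean from 50, in standard errors.
z = (observed_mean - uniform_mean) / standard_error
```

Under these assumptions the observed mean lies more than 30 standard errors above the fiftieth percentile. Selection of strong applicants into the specialty could explain some upward shift, but hardly one of this size in nearly every category.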

Currently, letter readers are left to rely on familiarity with the letter writer, mutual trust, and deciphering hidden phrases in order to take a letter of recommendation at face value.25 Even with this, letter readers often exhibit low interreader reliability,26 with interpretations of different phrases differing based on the faculty reader. This study aimed to aid in interpretation of standardized letters of recommendation as it shows that letter writers have lower scoring and perhaps more honest scoring in the following categories: Overall rating, Academic, Technical Ability, and Research/Training. These categories are worth attention, as it appears that there is some variability in scoring and, thus, some information that may be gleaned. It may be prudent to not place weight on those categories that all applicants score highly on (i.e., Work Ethic, Self-Initiative, Professionalism) and use these categories more to scan for “red flags”; if an applicant scores low in a classically inflated category, it may be concerning. Both applicants and programs may find it interesting that both reduced length of contact and isolated clinical contact (as may happen with letters from away rotations) are characteristics that correlate with lower standardized letters of recommendation scoring.
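The “red flag” reading strategy described above could be sketched as follows. The category means and standard deviations are the 2019 values from Table 2; the two-standard-deviation cutoff, the function name `red_flags`, and the example letter are hypothetical choices made for illustration, not thresholds proposed or validated by this study.

```python
# 2019 mean and SD for the classically inflated categories (Table 2).
INFLATED = {
    "Work Ethic": (90.44, 9.91),
    "Self-Initiative": (90.22, 10.48),
    "Professionalism": (91.62, 9.88),
}


def red_flags(letter, n_sd=2.0):
    """Return inflated categories scored more than n_sd SDs below the cohort mean.

    The n_sd cutoff is an arbitrary illustrative choice, not a validated threshold.
    """
    flags = []
    for category, (cohort_mean, cohort_sd) in INFLATED.items():
        score = letter.get(category)
        if score is not None and score < cohort_mean - n_sd * cohort_sd:
            flags.append(category)
    return flags


# Made-up letter: percentile scores for one hypothetical applicant.
example_letter = {"Work Ethic": 65, "Self-Initiative": 90, "Professionalism": 95}
flagged = red_flags(example_letter)
```

Here only Work Ethic is flagged, because 65 falls below the cutoff of roughly 70.6 for that category. A program adopting a rule of this kind would need to choose and justify its own cutoff.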

The Impact of Gender

In narrative letters of recommendation, there is a long history of female applicants being described with different verbiage than male applicants. Turrentine et al. noted that superlatives, the applicant’s name, and comments relating to ability, achievement, awards, leadership, and scholarship were all mentioned more often for male applicants to surgical residency by both male and female letter writers.27 On the other hand, physical descriptions and doubt raisers were more frequently mentioned in female applicants’ letters, and positive statements more frequently included qualifiers.27 These types of differences pervade fields and job levels, affecting medical school evaluations,28 fellowship applications,29 and academic jobs.30,31

Standardized letters of recommendation in emergency medicine and otolaryngology, although they still suffer from some gender bias, have demonstrated an overall decrease in gender bias as compared to narrative letters of recommendation.8,11,12 This study demonstrated gender bias in the 2015 to 2018 standardized letters of recommendation, with significantly lower scoring of female applicants in the Academic category and an overall preference of male letter writers for male applicants and of female letter writers for female applicants. These findings, however, do not persist in the 2019 standardized letters of recommendation responses, which at a preliminary level may indicate that the revision process leading to the new 2019 standardized letters of recommendation form succeeded in reducing areas for interpretation or bias.7 Further investigation will be required in coming years to determine whether gender bias persists in the standardized letters of recommendation and narrative letters of recommendation in plastic surgery. For now, the results of this study serve as a warning to letter writers of all genders to examine and mitigate their own gender-based expectations and biases.

The Impact of Race/Ethnicity

Underrepresented minorities face additional challenges compared to their colleagues along almost every step of the pipeline in medicine and surgery. Although, as of 2019, African Americans and Hispanics comprised at least 13.4 and 18.5 percent, respectively, of the U.S. population,32 they comprised only 7.1 percent and 6.2 percent of medical school matriculants.33 Despite an overall increase in representation of underrepresented minorities in medical school graduates, there has overall been a decline in Black representation of integrated plastic surgery residents.13 The proportion of Black applicants was significantly higher than the resident representation in each examined year from 2010 to 2016.13 Even after admission to a residency program, underrepresented minorities often face significant barriers to retention and success.34

Although gender bias has been investigated in many fields, few studies have examined the barriers faced by underrepresented minorities specifically relating to letters of recommendation. In radiology, it was demonstrated that, in narrative letters of recommendation, Black and Latinx applicants are described as less agentic than White and Asian applicants.35 In orthopedic surgery, underrepresented minorities are significantly less frequently described with standout words than their White colleagues in narrative letters of recommendation; these differences were not noted in standardized letters of recommendation.36 In this study, analysis was limited by the small sample size of Black and Latinx applicants. This study was able to demonstrate that, in the 2019 standardized letters of recommendation, students of color were scored lower in the Team Player category. Overall, the racial biases in both standardized letters of recommendation and narrative letters of recommendation warrant further study, and the results of this study are a reminder to all letter writers to be aware of and mitigate their own racial biases.

How to Decrease Bias in Letters of Recommendation

To work with diverse trainees, it is imperative that everyone unearth, examine, and address their implicit biases. The evidence indicates that health care professionals exhibit the same amount of implicit bias as the wider population, with the following characteristics at issue: race/ethnicity, gender, socioeconomic status, age, mental illness, disability, and social circumstances.37 The evidence does not indicate a clear path to bias reduction, but some promising strategies include introducing bias literacy, exposure to counterstereotypical exemplars, evaluative conditioning, encouraging mentorship and sponsorship, and identifying the self with the outgroup.38,39 In filling out letters of recommendation for applicants, the literature recommends focusing on the following for both female and racial minority applicants: (1) avoid physical descriptions of applicants; (2) emphasize accomplishments, not effort; (3) be mindful of raising doubt; (4) write a letter of equal length to those of other applicants; (5) mention research and publications; (6) mention leadership positions and qualities; and (7) review the adjectives used. When assigning percentiles to applicant qualities, faculty should review the objective metrics (e.g., for academic skills, reflect on examination scores; for research skills, reflect on research productivity) and reconcile those with their personal gestalt to provide some balance to any potential inner bias. To remedy score inflation, faculty must collectively be honest in scoring applicants. The fiftieth percentile, rather than being egregiously low as it is now, should stand for what it means: an applicant is average in that category. This will improve the ability of interviewers to actually rely on the word of their faculty colleagues and enhance the reputation of a letter writer for writing useful letters.


We acknowledge several limitations to this study. Gender in our cohort is described as binary because of the lack of self-identified nonbinary applicants in the described cohorts. We acknowledge that gender is not a binary variable. The analysis of race/ethnicity was limited by a lack of data regarding the race/ethnicity of letter writers to examine how race concordance could play a role. The study was underpowered in Black and Latinx applicants and had no Native American/Alaska Native applicants. Although gender and race/ethnicity were analyzed, we were unable to address other minority groups, including LGBTQ and first-generation, low-income groups. The 2015 to 2018 cohort is limited in size and could have selection bias, given that they represent those applicants chosen to be interviewed by a single institution. It is difficult to account for all confounders that could affect applicant scoring. Although this study analyzed and addressed standardized letters of recommendation, it did not collect data on the narrative letters of recommendation.


Letters of recommendation are considered the most important factor in resident selection, and yet, they are subject to score inflation, gender bias, and underrepresented minority bias. The onus falls on letter writers to recognize areas in which they may be introducing bias into their scoring and improve upon their letter writing practices.


1. Janis JE, Hatef DA. Resident selection protocols in plastic surgery: A national survey of plastic surgery program directors. Plast Reconstr Surg. 2008;122:1929–1939.

2. Rogers CR, Gutowski KA, Rio AM, et al. Integrated plastic surgery residency applicant survey: Characteristics of successful applicants and feedback about the interview process. Plast Reconstr Surg. 2009;123:1607–1617.

3. Liang F, Rudnicki PA, Prince NH, Lipsitz S, May JW Jr, Guo L. An evaluation of plastic surgery resident selection factors. J Surg Educ. 2015;72:8–15.

4. Nagarkar P, Pulikkottil B, Patel A, Rohrich RJ. So you want to become a plastic surgeon? What you need to do and know to get into a plastic surgery residency. Plast Reconstr Surg. 2013;131:419–422.

5. Love JN, Smith J, Weizberg M, et al. SLOR Task Force. Council of Emergency Medicine Residency Directors’ standardized letter of recommendation: The program director’s perspective. Acad Emerg Med. 2014;21:680–687.

6. Kimple AJ, McClurg SW, Del Signore AG, Tomoum MO, Lin FC, Senior BA. Standardized letters of recommendation and successful match into otolaryngology. Laryngoscope 2016;126:1071–1076.

7. Reghunathan M, Mehta I, Gosman AA. Improving the standardized letter of recommendation in the plastic surgery resident selection process. J Surg Educ. 2021;78:801–812.

8. Li S, Fant AL, McCarthy DM, Miller D, Craig J, Kontrick A. Gender differences in language of standardized letter of evaluation narratives for emergency medicine residency applicants. AEM Educ Train. 2017;1:334–339.

9. Grall KH, Hiller KM, Stoneking LR. Analysis of the evaluative components on the standardized letter of recommendation (SLOR) in emergency medicine. West J Emerg Med. 2014;15:419–423.

10. Kominsky AH, Bryson PC, Benninger MS, Tierney WS. Variability of ratings in the otolaryngology standardized letter of recommendation. Otolaryngol Head Neck Surg. 2016;154:287–293.

11. Friedman R, Fang CH, Hasbun J, et al. Use of standardized letters of recommendation for otolaryngology head and neck surgery residency and the impact of gender. Laryngoscope 2017;127:2738–2745.

12. Messner AH, Shimahara E. Letters of recommendation to an otolaryngology/head and neck surgery residency program: Their function and the role of gender. Laryngoscope 2008;118:1335–1344.

13. Parmeshwar N, Stuart ER, Reid CM, Oviedo P, Gosman AA. Diversity in plastic surgery: Trends in minority representation among applicants and residents. Plast Reconstr Surg. 2019;143:940–949.

14. Silvestre J, Serletti JM, Chang B. Racial and ethnic diversity of U.S. plastic surgery trainees. J Surg Educ. 2017;74:117–123.

15. Smith BT, Egro FM, Murphy CP, Stavros AG, Nguyen VT. An evaluation of race disparities in academic plastic surgery. Plast Reconstr Surg. 2020;145:268–277.

16. Phillips NA, Tannan SC, Kalliainen LK. Understanding and overcoming implicit gender bias in plastic surgery. Plast Reconstr Surg. 2016;138:1111–1116.

17. Bucknor A, Kamali P, Phillips N, et al. Gender inequality for women in plastic surgery: A systematic scoping review. Plast Reconstr Surg. 2018;141:1561–1577.

18. National Resident Matching Program. Results and Data: 2015 Main Residency Match. Available at: Accessed December 24, 2020.

19. National Resident Matching Program. Results and Data: 2016 Main Residency Match. Available at: Accessed December 24, 2020.

20. National Resident Matching Program. Results and Data: 2015 Main Residency Match. Available at: Accessed December 24, 2020.

21. National Resident Matching Program. Results and Data: 2015 Main Residency Match. Available at: Accessed December 24, 2020.

22. Chen W, Baron M, Bourne DA, Kim JS, Washington KM, De La Cruz C. A report on the representation of women in academic plastic surgery leadership. Plast Reconstr Surg. 2020;145:844–852.

23. Greenburg AG, Doyle J, McClure DK. Letters of recommendation for surgical residencies: What they say and what they mean. J Surg Res. 1994;56:192–198.

24. Fortune JB. The content and value of letters of recommendation in the resident candidate evaluative process. Curr Surg. 2003;59:79–83.

25. Saudek K, Saudek D, Treat R, Bartz P, Weigert R, Weisgerber M. Dear program director: Deciphering letters of recommendation. J Grad Med Educ. 2018;10:261–266.

26. Dirschl DR, Adams GL. Reliability in evaluating letters of recommendation. Acad Med. 2000;75:1029.

27. Turrentine FE, Dreisbach CN, St Ivany AR, Hanks JB, Schroen AT. Influence of gender on surgical residency applicants’ recommendation letters. J Am Coll Surg. 2019;228:356–365.e3.

28. Isaac C, Chertoff J, Lee B, Carnes M. Do students’ and authors’ genders affect evaluations? A linguistic analysis of medical student performance evaluations. Acad Med. 2011;86:59–66.

29. Hoffman AL, Grant WJ, McCormick MF, et al. Gendered differences in letters of recommendation for transplant surgery fellowship applicants. Academic Surgical Congress Abstracts Archive 2019;76:427–432. Available at: Accessed January 16, 2019.

30. Madera JM, Hebl MR, Martin RC. Gender and letters of recommendation for academia: Agentic and communal differences. J Appl Psychol. 2009;94:1591–1599.

31. Schmader T, Whitehead J, Wysocki VH. A linguistic comparison of letters of recommendation for male and female chemistry and biochemistry job applicants. Sex Roles 2007;57:509–514.

32. U.S. Census Bureau. QuickFacts: United States. Available at: Accessed December 24, 2020.

33. Association of American Medical Colleges. Table A-9: Matriculants to U.S. MD-Granting Medical Schools by Race, Selected Combinations of Race/Ethnicity and Sex, 2018-2019 through 2021-2022. Available at: Accessed December 24, 2020.

34. Osseo-Asare A, Balasuriya L, Huot SJ, et al. Minority resident physicians’ views on the role of race/ethnicity in their training experiences in the workplace. JAMA Netw Open 2018;1:e182723.

35. Grimm LJ, Redmond RA, Campbell JC, Rosette AS. Gender and racial bias in radiology residency letters of recommendation. J Am Coll Radiol. 2020;17:64–71.

36. Powers A, Gerull KM, Rothman R, Klein SA, Wright RW, Dy CJ. Race- and gender-based differences in descriptions of applicants in the letters of recommendation for orthopaedic surgery residency. JBJS Open Access 2020;5:e20.00023.

37. FitzGerald C, Hurst S. Implicit bias in healthcare professionals: A systematic review. BMC Med Ethics 2017;18:19.

38. FitzGerald C, Martin A, Berner D, Hurst S. Interventions designed to reduce implicit prejudices and implicit stereotypes in real world contexts: A systematic review. BMC Psychol. 2019;7:29.

39. DiBrito SR, Lopez CM, Jones C, Mathur A. Reducing implicit bias: Association of women surgeons #HeForShe task force best practice recommendations. J Am Coll Surg. 2019;228:303–309.
