Award Abstract # 1535435
Collaborative Research: Recruiting STEM Faculty: A Systematic Analysis of the Faculty Hiring Process at Research-Intensive Universities

NSF Org: DRL
Division of Research on Learning in Formal and Informal Settings (DRL)
Recipient: REGENTS OF THE UNIVERSITY OF CALIFORNIA, THE
Initial Amendment Date: July 30, 2015
Latest Amendment Date: July 30, 2020
Award Number: 1535435
Award Instrument: Standard Grant
Program Manager: Jolene Jesse
jjesse@nsf.gov
 (703)292-7303
EDU
 Directorate for STEM Education
Start Date: September 1, 2015
End Date: August 31, 2021 (Estimated)
Total Intended Award Amount: $491,830.00
Total Awarded Amount to Date: $491,830.00
Funds Obligated to Date: FY 2015 = $491,830.00
History of Investigator:
  • Catherine Albiston (Principal Investigator)
    calbiston@law.berkeley.edu
  • Victoria C Plaut (Co-Principal Investigator)
Recipient Sponsored Research Office: University of California-Berkeley
1608 4TH ST STE 201
BERKELEY
CA  US  94710-1749
(510)643-3891
Sponsor Congressional District: 12
Primary Place of Performance: University of California-Berkeley
2240 Piedmont Avenue # 7200
Berkeley
CA  US  94720-7200
Primary Place of Performance Congressional District: 12
Unique Entity Identifier (UEI): GS3YEVSS12N6
Parent UEI:
NSF Program(s): RES ON GENDER IN SCI & ENGINE
Primary Program Source: 04001516DB NSF Education & Human Resource
Program Reference Code(s):
Program Element Code(s): 154400
Award Agency Code: 4900
Fund Agency Code: 4900
Assistance Listing Number(s): 47.076

ABSTRACT

This proposal was submitted in response to EHR Core Research (ECR) program announcement NSF 15-509. As part of ECR, this project is funded by the Research on Gender in Science and Engineering (GSE) program. GSE seeks to understand and address gender-based differences in science, technology, engineering and mathematics (STEM) education and workforce participation through education and implementation research that will lead to a larger and more diverse domestic STEM workforce. This study will address two core research questions in the area of STEM Workforce Development: Are there gender and racial/ethnic disparities in STEM faculty hiring? If so, what conditions, processes and social contexts generate/mitigate these disparities? The researchers will conduct a systematic theory-driven evaluation of the faculty hiring process by compiling an unprecedented dataset on faculty hiring across ten research-intensive universities. The data will be used to test hypotheses about how gender and race/ethnicity influence applications for faculty positions, evaluation of applicants, and outcomes at multiple stages in the faculty hiring process. By identifying the steps in the hiring process that are most susceptible to bias and the characteristics of the hiring process that amplify/mitigate disparities, this study will identify the most important targets for policy interventions aimed at increasing equity and diversity in faculty hiring.

The study will be framed by expectation states theory, which explains how status beliefs (widely shared beliefs that people in one category of a status-based social distinction, such as gender or race, are more socially worthy and competent than those in another category) influence interpersonal interactions, evaluations of self and others, and individual and group behavior to generate and reinforce social inequality. The researchers will construct and use a unique dataset from an online administrative system that compiles information from all faculty recruitments at all University of California campuses. This rich data source includes detailed information on applicant pools, applicant credentials and achievements, hiring processes and committees, and candidates' progression from application through the short list, interview, and offer steps in the process. The richness of the administrative data will be enhanced using automated text analytic tools to generate a dataset that provides large sample sizes, detailed measurement of key variables, links to supplemental data sources that provide measures of influential factors (e.g., institutional prestige), and observable variation in factors which theory predicts affect bias in faculty hiring. Multilevel statistical models will be used to accurately identify and disentangle the influences operating at the faculty search-level versus those operating at the applicant-level to affect the faculty hiring process. Analyses will be disaggregated by STEM discipline, gender, detailed race/ethnicity and race-by-gender categories.
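
As an illustrative sketch only (the abstract does not give the project's model specification), a two-level random-intercept logistic model of the kind described above could be written as follows. All symbols are assumptions introduced here: y_{ij} indicates whether applicant i in faculty search j advances to a given stage (for example, the short list), x_{ij} collects applicant-level covariates, z_j collects search-level covariates, and u_j is a search-level random intercept.

```latex
\begin{aligned}
\operatorname{logit}\,\Pr(y_{ij} = 1) &= \beta_0 + \boldsymbol{\beta}^{\top}\mathbf{x}_{ij} + \boldsymbol{\gamma}^{\top}\mathbf{z}_{j} + u_j, \\
u_j &\sim \mathcal{N}\!\left(0, \sigma_u^{2}\right)
\end{aligned}
```

In this form, the coefficients in \boldsymbol{\beta} capture applicant-level influences, the coefficients in \boldsymbol{\gamma} capture search-level influences, and \sigma_u^{2} summarizes the remaining variation between searches.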

PROJECT OUTCOMES REPORT

Disclaimer

This Project Outcomes Report for the General Public is displayed verbatim as submitted by the Principal Investigator (PI) for this award. Any opinions, findings, and conclusions or recommendations expressed in this Report are those of the PI and do not necessarily reflect the views of the National Science Foundation; NSF has not approved or endorsed its content.

The Evaluating Equity in Faculty Recruitment (EEFR) project produced a multilevel database from the online faculty recruitment system used by a multi-campus public research university. The database currently includes data for all assistant professor faculty recruitments in STEM fields conducted between 2013 and 2020 across nine distinct campuses. It includes multiple types of recruitment- and applicant-level information that is compiled from both structured and unstructured (e.g., text-based data) administrative records. The analytic utility of these data is enhanced by linkages to data from multiple secondary sources.

Recruitment-level data: The main recruitment-level data file compiles the basic characteristics of each recruitment, including the year, campus, hiring department(s), rank of the recruitment, and the detailed fields targeted for the hire. These data link to files that include measures of the following information for each recruitment:

  • Job advertisement - the language of the job ad used for the faculty recruitment and where it was posted;
  • Department characteristics - the number, demographic composition and rank distribution of the faculty in the hiring department, along with the NRC rank of the department/program;
  • Committee characteristics - the number, demographic composition and rank distribution of the members of the recruitment committee and of the committee chair;
  • Diversity benchmarks - measures of the representation of women, underrepresented minorities and Asians, as well as of gender-by-race/ethnicity subpopulations created from NSF Survey of Earned Doctorates data that include temporary visa holders (internationals) as well as U.S. Citizens and Permanent Residents.

Application-level data: The main application-level file includes variables measuring the characteristics of each application submitted to a faculty recruitment, including the applicant's demographic characteristics (self-reported via an online survey administered by the recruitment system), current position and institutional affiliation, and application outcome. These data link to files that compile the following information for each application; individually identifiable information is deleted from these files to ensure anonymity:

  • CV data - the scholarly achievement of each applicant is coded from the "Education," "Publications" and "Grants" sections of their uploaded CV with an automated tool that uses text-recognition utilities to identify, extract and structure the textual data. The extracted publication entries are linked to open-source Digital Object Identifier (DOI) registries and proprietary (e.g., Scopus) bibliometric databases to obtain publication-specific measures of citation counts, journal impact factors, and other bibliometric information.
  • Letter of recommendation data - the text of each recommendation letter is parsed to remove person and place names, and then coded at the word level using concept-specific dictionaries, at the sentence level using publicly available tools (e.g., sentiment classifiers), and at the paragraph level using a classification tool created by the EEFR team to identify the main topic of each paragraph.
  • Referee data - the referee's gender is imputed from first name using open-source and proprietary utilities, bibliometric information is coded through linkage to Scopus data, and the rank of each referee's program/department is coded using NRC ranking data.
  • Diversity statement data - the text of the diversity statements submitted by each applicant (when requested/required as part of the application materials) is parsed at the document, paragraph and sentence levels for coding and analysis using, e.g., word-level dictionary approaches and document-level topic modeling techniques.
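
The CV-coding step described in the list above can be sketched roughly as follows. This is a minimal illustration, not the EEFR tool itself: the heading keywords, the function names `split_sections` and `extract_dois`, and the sample CV text are all hypothetical, and the sketch assumes the PDF has already been converted to plain text. It groups lines under recognized section headings and pulls DOI strings out of the publications section, which could then be looked up in a DOI registry.

```python
import re

# Hypothetical heading keywords; the actual tool's rules are not described in the report.
SECTION_HEADINGS = {
    "education": re.compile(r"^\s*education\b", re.IGNORECASE),
    "publications": re.compile(r"^\s*(selected\s+)?publications\b", re.IGNORECASE),
    "grants": re.compile(r"^\s*grants\b", re.IGNORECASE),
}

# A commonly used pattern for modern DOIs: "10.", a 4-9 digit prefix, "/", then a suffix.
DOI_PATTERN = re.compile(r"\b10\.\d{4,9}/[^\s\"<>]+", re.IGNORECASE)


def split_sections(cv_text):
    """Group non-empty CV lines under the most recent recognized heading.

    Lines before the first recognized heading are ignored. Headings that are
    not in SECTION_HEADINGS are not detected, so their lines stay attached to
    the previous recognized section (a known limitation of this sketch).
    """
    sections = {}
    current = None
    for line in cv_text.splitlines():
        matched = next(
            (name for name, pattern in SECTION_HEADINGS.items() if pattern.match(line)),
            None,
        )
        if matched is not None:
            current = matched
            sections.setdefault(current, [])
        elif current is not None and line.strip():
            sections[current].append(line.strip())
    return sections


def extract_dois(lines):
    """Return DOI strings found in the given lines, minus trailing punctuation."""
    dois = []
    for line in lines:
        for match in DOI_PATTERN.findall(line):
            dois.append(match.rstrip(".,;)"))
    return dois


# Made-up CV text, used only to exercise the sketch.
SAMPLE_CV = """Jane Doe
Education
PhD, Physics, Example University, 2012
Publications
Doe, J. (2014). A made-up paper. Journal of Examples. doi:10.1234/example.2014.001.
Doe, J. (2016). Another made-up paper. https://doi.org/10.5678/abc-123
Grants
Example Foundation, 2017
"""

sections = split_sections(SAMPLE_CV)
dois = extract_dois(sections["publications"])
print(dois)
```

Real CVs vary widely in layout, so a production tool would need format cues (font size, spacing) and a fallback for entries without DOIs; the sketch only shows the overall identify, extract, and structure flow.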

Multiple bespoke coding tools were produced in the process of developing the EEFR database. These tools are available for use in other studies:  

  • Automated CV-coding tool - The tool uses text and text-format recognition utilities to distinguish sections of CVs (uploaded as PDF files), identifies targeted sections (publications, education, and grants), and extracts text strings. The text extracted from the publications section is coded via interaction with online databases (Crossref, DOI.org) to obtain parsed information and linkage to bibliometric data (Scopus). Individually identifiable information is then deleted.
  • Validated Dictionaries - Dictionaries for concepts that are relevant to the analysis of hiring evaluation and selection processes were developed and validated using a multi-step process. We developed first-stage dictionaries from comprehensive reviews of relevant literatures and by applying text-analysis methods (e.g., word embedding models of keywords, part of speech tagging of trait adjectives) to the EEFR document corpuses. A crowd-sourced validation protocol was then applied to these comprehensive dictionaries to generate word-specific validation scores that may be used to produce analytic dictionaries meeting varying validity thresholds.
  • Python Dictionary Tool - A package of utilities that researchers can use for text preprocessing and dictionary analysis of job descriptions. The tool consists of a Jupyter notebook, specialized dictionaries, and associated files with instructions, which together function as a standalone download for researchers interested in analyzing job descriptions.
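
To illustrate how word-specific validation scores might be turned into analytic dictionaries at different validity thresholds and then applied to a job description, here is a minimal sketch. It is not the EEFR Python Dictionary Tool: the terms, the scores, the function names `build_analytic_dictionary` and `count_dictionary_terms`, and the sample job ad are all made up for illustration, and the tokenizer is deliberately simple.

```python
import re
from collections import Counter

# Hypothetical dictionary terms with made-up validation scores in [0, 1];
# the real EEFR dictionaries and their scores are not reproduced here.
SCORED_DICTIONARY = {
    "collaborative": 0.91,
    "supportive": 0.84,
    "competitive": 0.88,
    "driven": 0.52,
}

# Lowercase alphabetic tokens, allowing internal hyphens (e.g. "hard-working").
TOKEN_PATTERN = re.compile(r"[a-z]+(?:-[a-z]+)*")


def build_analytic_dictionary(scored_terms, threshold):
    """Keep only the terms whose validation score meets the threshold."""
    return {term for term, score in scored_terms.items() if score >= threshold}


def count_dictionary_terms(text, dictionary):
    """Count occurrences of dictionary terms in the lowercased text."""
    tokens = TOKEN_PATTERN.findall(text.lower())
    return Counter(token for token in tokens if token in dictionary)


# Made-up job ad text, used only to exercise the sketch.
job_ad = (
    "We seek a collaborative, driven scholar to join a supportive "
    "and collaborative department."
)

strict = build_analytic_dictionary(SCORED_DICTIONARY, threshold=0.8)
lenient = build_analytic_dictionary(SCORED_DICTIONARY, threshold=0.5)

strict_counts = count_dictionary_terms(job_ad, strict)
lenient_counts = count_dictionary_terms(job_ad, lenient)
print(dict(strict_counts))
print(dict(lenient_counts))
```

Raising the threshold trades coverage for validity: in this toy example the lower-scored term "driven" is counted only under the lenient dictionary.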

The EEFR database is currently supporting analyses of: (1) the recruitment characteristics correlated with the composition of the applicant pool and outcomes of the recruitment; (2) gender and/or racial/ethnic disparities in the applicants' characteristics, qualifications and labor market outcomes; (3) gender and/or racial/ethnic disparities in the association between applicant characteristics and hiring outcomes.


Last Modified: 01/03/2022
Modified by: Catherine Albiston

Please report errors in award information by writing to: awardsearch@nsf.gov.
