Professionalism, emotional wellbeing, and dropout intention in health professions students during the pandemic
Ethics
The study was conducted during the pandemic and involved collaboration between institutions in Peru and Spain. The study adhered to the regulatory framework for research involving human subjects in both countries. The ethical review was assigned to the Research Ethics Committee of La Rioja in Spain (Reference Nr. CEImLAR-PI-440) and the Research Ethics Committee of the National University of the Altiplano in Peru (Reference Nr. 03-CIEI-UNA-PUNO), which acted as independent ethics committees in their respective countries.
Informed consent was provided by each respondent prior to the start of data collection. Participation was voluntary and confidential. An email with a participant information file was sent to 6670 undergraduate students enrolled in medical and nursing programs at four Peruvian universities (two public and two private) during the 2020–2021 academic year. Students who agreed to participate in the study confirmed their personal agreement with the conditions described in the participant information. This information included: (i) eligibility criteria (undergraduate students currently living in Peru, and actively enrolled in a medical or nursing program at one of the participant universities); (ii) study protocol (completing a series of psychometric instruments for measuring professionalism abilities and conditions related to students’ emotional wellbeing, and sociodemographic variables); (iii) research purposes, data storage, confidentiality and data management (no personal data will be shared with third parties, and the collected data will be used in an anonymized form); (iv) use of the Internet Protocol (IP) address for geolocation and association with epidemiological indicators related to the COVID-19 pandemic; (v) the right to withdraw from the study at any point; and (vi) contact details of researchers coordinating the study in each institution.
For reasons of privacy and confidentiality, the information collected has been anonymised in the published dataset by removing all Personally Identifiable Information (PII). In addition, other sensitive information that could be used for re-identification by cross-referencing with open data sources, such as the names of the medical and nursing schools, IP addresses, and city of residence, has also been removed.
Study design
The dataset is the principal outcome of a cross-sectional online questionnaire-based study. The study was performed in Peru with the main purpose of collecting information associated with the academic performance, mental health, and emotional wellbeing of medical and nursing students attending non-face-to-face classes during the COVID-19 pandemic. The information collected included measures of three abilities described as specific components of professionalism (empathy, teamwork, and lifelong learning abilities), and self-perceived measures of symptoms related to anxiety disorder or general depression, loneliness, and subjective wellbeing. In addition, the questionnaire included a socio-demographic form, which collected information on students’ age, sex, and academic aspects (i.e. discipline, university, academic course, student’s perception of career choice during the pandemic, dropout intention, access to the Internet, and electronic devices for attending non-face-to-face classes). Finally, respondents were asked whether they had suffered an episode (with a clinical diagnosis) of anxiety and/or depression before the first COVID-19 outbreak in December 2019.
The online recruitment process began in August 2020 and ended on April 16th, 2021, when the last questionnaire was collected through the SurveyMonkey® Web platform. Duplicate questionnaires were avoided by restricting each respondent to a single access to the survey. At the end of this process, 2316 surveys had been collected. Of these, 609 were excluded during the data cleaning process, as shown in Fig. 1. A detailed summary of this process is given in the Technical Validation section. After data cleaning, the study sample comprised 1707 records. This sample size exceeded the initial estimate of 1449 records required for a 99% confidence level and a 3% margin of error.
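As an illustration of how such a sample size estimate can be obtained, the following sketch applies Cochran's formula with a finite population correction. The formula, the assumed proportion of 0.5, and the function name are assumptions made here for illustration; the text does not state which method or rounding convention produced the figure of 1449, so this sketch yields a value close to, but not necessarily identical to, the reported one.

```python
import math
from statistics import NormalDist


def required_sample_size(population, confidence=0.99, margin=0.03, proportion=0.5):
    """Cochran's sample size with finite population correction (assumed method)."""
    # Two-sided critical value of the standard normal distribution.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    # Sample size for an infinite population.
    n0 = z ** 2 * proportion * (1 - proportion) / margin ** 2
    # Correction for a finite population of known size.
    n = n0 / (1 + (n0 - 1) / population)
    return math.ceil(n)


invited = 6670      # students who received the invitation email
collected = 2316    # surveys collected
excluded = 609      # surveys removed during data cleaning
final_sample = collected - excluded

estimate = required_sample_size(invited)
print(final_sample, estimate, final_sample >= estimate)
```

The record counts are taken from the text; only the estimation method is assumed.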

Fig. 1 Flowchart of dataset creation.
Psychometric measures
Clinical empathy
The 20-item Jefferson Scale of Empathy (JSE) was used for measuring empathetic abilities in clinical settings, also known as clinical empathy11. This ability has been defined as a predominantly cognitive (rather than an affective or emotional) attribute that involves an understanding (rather than feeling) of experiences, concerns, and perspectives of the patient, combined with a capacity to communicate this understanding, and an intention to help6. Two versions of the JSE were used: the S-Version (JSE-S) for medical students12, and the HPS-Version (JSE-HPS) for students in all health professions disciplines other than medicine, such as nursing13. The S-Version and the HPS-Version have been developed to reflect students’ orientation or attitudes toward empathy in patient care14. The content of both versions is similar, with only minor modifications to make the items appropriate for the target groups. For example, the S-Version item reading “It is difficult for a physician to view things from patients’ perspective” is presented in the HPS-Version as “It is difficult for a health care provider to view things from patients’ perspective”. Items of the JSE are answered using a Likert scale from 1 (strongly disagree) to 7 (strongly agree). Ten items of the JSE are positively worded, and the other ten are negatively worded. The positively worded items were scored directly, whereas the negatively worded items were reverse-scored before calculating a global score. A higher global score indicated a greater development of the ability measured.
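The transformation of negatively worded items is commonly done by reverse scoring, in which a response is mirrored around the midpoint of the scale. The sketch below shows this under that assumption. The function names and the item numbers in the example are hypothetical, since the text does not list which items are negatively worded.

```python
def reverse_score(value, minimum, maximum):
    """Mirror a Likert response around the scale midpoint (e.g. 1 <-> 7)."""
    return minimum + maximum - value


def global_score(responses, negative_items, minimum, maximum):
    """Sum item responses, reverse-scoring the negatively worded items.

    responses: dict mapping item number -> raw response
    negative_items: set of item numbers that are negatively worded
    """
    total = 0
    for item, value in responses.items():
        if not minimum <= value <= maximum:
            raise ValueError(f"item {item}: response {value} is out of range")
        if item in negative_items:
            value = reverse_score(value, minimum, maximum)
        total += value
    return total


# Made-up three-item example on a 1-7 scale; item 2 is treated as negatively worded.
example = {1: 7, 2: 1, 3: 4}
print(global_score(example, negative_items={2}, minimum=1, maximum=7))  # 7 + 7 + 4 = 18
```

The same helper would apply to any instrument with mixed item wording by changing the scale bounds, for example 1 to 4 instead of 1 to 7.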
Teamwork abilities
The 15-item Jefferson Scale of Attitudes toward Physician-Nurse Collaboration (JSAPNC) was used for measuring inter-professional collaboration abilities between physicians and nurses, also called teamwork15. This ability has been defined as an attribute that nurses and physicians must have for working together cooperatively, sharing responsibilities for solving problems and making decisions to formulate and carry out plans for patient care6. The JSAPNC is answered using a Likert scale from 1 (strongly disagree) to 4 (strongly agree). Thirteen items of the JSAPNC are positively worded, and two are negatively worded. The positively worded items were scored directly, whereas the negatively worded items were reverse-scored before calculating a global score. A higher global score indicated a greater development of teamwork abilities.
Lifelong learning abilities
The 14-item Jefferson Scale of Physician Lifelong Learning (JeffSPLL) was used for measuring lifelong learning abilities16. This ability has been defined as a set of self-initiated activities (behavioural aspect) and information-seeking skills (capabilities) that are activated in individuals with a sustained motivation (predisposition) to learn and the ability to recognize their own learning needs (cognitive aspect)6. Two versions of the JeffSPLL were used: the MS-Version (JeffSPLL-MS) for medical students17, and the HPS-Version (JeffSPLL-HPS) for students in all health professions disciplines other than medicine, such as nursing18. The MS-Version and the HPS-Version have been developed to reflect students’ lifelong learning abilities in their discipline. The content of both versions is similar with only minor modifications to make the items appropriate for the target groups. For example, the item in the MS-Version reading “One of the important goals of medical school is to develop students’ lifelong learning skills” in the HPS-Version is presented as “One of the important goals of healthcare professions’ education is developing students’ lifelong learning skills”. The JeffSPLL is answered using a Likert scale from 1 (strongly disagree) to 4 (strongly agree). Since all items are positively worded, they were directly scored. A higher score indicated a greater orientation toward lifelong learning.
Loneliness
The 15-item Social and Emotional Loneliness Scale for Adults (SELSA-S) was used for measuring loneliness in three specific contexts: family, romantic relationships, and social environments19. Loneliness, as a condition measured by the SELSA-S, is defined as the perception that one lacks meaningful connections with others, indicating an absence of interpersonal skills that is reflected in unsatisfactory human connections20. The SELSA-S offers four possible measures of loneliness: one global measure, and one for each of the three specific contexts. The SELSA-S is answered using a Likert scale from 1 (strongly disagree) to 7 (strongly agree). Six items of the SELSA-S are positively worded, and nine are negatively worded. The positively worded items were scored directly, whereas the negatively worded items were reverse-scored before calculating a global score. A higher score indicated a greater perception of loneliness.
Satisfaction with life
The 5-item Satisfaction with Life Scale (SWLS) was used for measuring satisfaction with life, also called subjective wellbeing21. Subjective wellbeing refers to the emotional and cognitive self-perception of personal life22. The SWLS is answered using a Likert scale from 1 (strongly disagree) to 5 (strongly agree). Since all items are positively worded, they were directly scored. A higher score indicated a greater satisfaction with life.
Anxiety
The 2-item Generalized Anxiety Disorder Scale (GAD-2) was used for measuring anxiousness/nervousness and uncontrollable worry, symptoms related to generalized anxiety disorder23. Respondents indicated the persistence of two core symptoms associated with anxiety during the last two weeks using a frequency scale from 0 (not at all) to 3 (nearly every day). A higher score indicated more severe anxiety symptoms. A score of 3 or more in the GAD-2 has been described as an acceptable cut-off for identifying clinically significant anxiety symptoms in the general population.
Depression
The 2-item Patient Health Questionnaire (PHQ-2) was used for measuring cognitive and affective depressive symptoms associated with general depression24. Respondents indicated the persistence of two core symptoms associated with general depression during the last two weeks using a frequency scale from 0 (not at all) to 3 (nearly every day). A higher score indicated more severe general depression symptoms. A score of 3 or more in the PHQ-2 has been described as an acceptable cut-off for identifying clinically significant depression symptoms in the general population.
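Since the GAD-2 and the PHQ-2 share the same response format and cut-off, their scoring can be sketched with a single helper. The function name is hypothetical, and the cut-off of 3 is the one described in the text.

```python
def screen_two_item(item1, item2, cutoff=3):
    """Score a two-item screener (GAD-2 or PHQ-2) answered on a 0-3 frequency scale.

    Returns the total score (0-6) and whether it reaches the cut-off.
    """
    for value in (item1, item2):
        if value not in (0, 1, 2, 3):
            raise ValueError(f"response {value} is outside the 0-3 scale")
    total = item1 + item2
    return total, total >= cutoff


# Made-up responses: one respondent at the cut-off, one below it.
print(screen_two_item(1, 2))  # (3, True)
print(screen_two_item(0, 2))  # (2, False)
```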
Demographic and epidemiological variables
Sex
Sex refers to the biological sex assigned at birth and is provided in the dataset as a dichotomous variable (male, female).
Professional studies
Professional studies are provided in the dataset as a dichotomous variable (medicine, nursing).
University
Following a prior agreement with the participating institutions, university names were anonymized in the dataset. For more information, see the Data Records section. However, a variable indicating whether the participating institution was private or public is provided in the dataset.
Students’ academic achievement
This information is provided in the dataset with three variables: semester, course, and academic stage. Undergraduate medical studies in the medical schools participating in this study are spread over seven years (14 semesters), while nursing studies are spread over five years (10 semesters). Clinical training in the above-mentioned medical schools starts in the third year. In nursing schools, this training starts in the second year.
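The relationship between these three variables can be sketched as follows. This assumes that "course" denotes the year of study, with two semesters per year as implied by the program lengths, and that the academic stage separates the years before and after clinical training begins. The function name and the stage labels "preclinical" and "clinical" are hypothetical and may not match the labels used in the dataset.

```python
# Year in which clinical training starts and program length, as described in the text.
CLINICAL_START_YEAR = {"medicine": 3, "nursing": 2}
TOTAL_SEMESTERS = {"medicine": 14, "nursing": 10}


def academic_position(program, semester):
    """Derive the year of study and academic stage from program and semester."""
    if not 1 <= semester <= TOTAL_SEMESTERS[program]:
        raise ValueError(f"semester {semester} is not valid for {program}")
    course = (semester + 1) // 2  # semesters 1-2 -> year 1, 3-4 -> year 2, ...
    stage = "clinical" if course >= CLINICAL_START_YEAR[program] else "preclinical"
    return course, stage


print(academic_position("medicine", 4))  # (2, 'preclinical')
print(academic_position("nursing", 3))   # (2, 'clinical')
```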
Working sector preference
This information is provided in the dataset as a dichotomous variable (private, public).
Specialty interest
Students provided this information by answering a multiple-choice question with three possible options: “primary care”, which included specialties for addressing a large majority of personal health care needs, developing a sustained partnership with patients, and practicing in the context of family and community (e.g. family medicine, community nursing, paediatrics); “specialty care”, which included specialties for addressing patients with specific clinical needs and complex medical conditions requiring specialized treatment and management that is performed in hospitals; and “other”, grouping specialties different from the two previously described and not requiring direct contact with patients (e.g. clinical laboratory, anatomical pathology).
Students’ career choice motivation, perception, and dropout intention
Students indicated whether their career choice was a “personal decision” (i.e. vocation) or was made due to external factors, such as the influence of their relatives, economic circumstances, or others.
Students indicated whether their perception of their career choice changed during the pandemic by responding to a multiple-choice question with three possible options: “it is better”; “it is the same”; or “it is worse”.
Students indicated whether they had considered dropping out of their studies during the pandemic, and how often this thought came to mind, by answering a frequency scale composed of five possible options: “never”; “rarely”; “sometimes”; “very often”; and “always”.
Internet connection and digital device used for attending on-line classes
Students indicated how they connected to the Internet in a multiple-choice question with three options: “I have a wireless access point at home (Wi-Fi router)”; “smartphone”; or “I connect through an external wireless network”.
Students indicated the digital device they mainly used for attending their on-line classes from home in a multiple-choice question with four possible options: “I use my personal computer, laptop, or similar”; “I shared a personal computer, laptop, or similar”; “I use a tablet or similar”; or “I use a smartphone”.
Previous diagnosis of anxiety or depression
Students reported whether they had suffered episodes of anxiety or depression with a clinical diagnosis before the first COVID-19 outbreak was declared in December 2019. This information is provided in the dataset in two separate dichotomous variables.
Place of residence
Information related to the students’ place of residence was obtained using geolocation based on respondents’ IP addresses. This information is provided in the dataset in one variable indicating the administrative region (25 options) where each student was living.
Date, epidemiological week, and epidemiological year
Date of response is provided in one variable. Based on this information, another two variables were created and are provided in the dataset following the standard definition. These variables correspond to the epidemiological year and week, respectively.
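The text does not name the definition used for the epidemiological week. The sketch below assumes the widely used Sunday-to-Saturday convention, in which week 1 of a year is the first week with at least four days in that year. The function name is hypothetical, and the convention itself is an assumption rather than a confirmed detail of the dataset.

```python
from datetime import date, timedelta


def epi_week(d):
    """Return (epidemiological year, week) under a Sunday-to-Saturday convention."""
    # Sunday that opens the week containing d (date.weekday(): Monday=0 ... Sunday=6).
    week_start = d - timedelta(days=(d.weekday() + 1) % 7)
    # The week belongs to the year that holds at least four of its days,
    # which is the year of its Wednesday.
    year = (week_start + timedelta(days=3)).year
    # Week 1 is the Sunday-to-Saturday week containing January 4th.
    jan4 = date(year, 1, 4)
    first_week_start = jan4 - timedelta(days=(jan4.weekday() + 1) % 7)
    week = (week_start - first_week_start).days // 7 + 1
    return year, week


print(epi_week(date(2021, 4, 16)))  # (2021, 15)
print(epi_week(date(2021, 1, 1)))   # (2020, 53)
```

Under this convention, the last collection date mentioned earlier, April 16th, 2021, falls in epidemiological week 15 of 2021, while January 1st, 2021 still belongs to week 53 of 2020.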
Positivity rate in student’s location
Based on respondent’s IP address, date of response, and information available in the dataset of the Peruvian National Institute of Health, a new variable was created indicating the ratio of positive PCR results to the total number of tests taken in each epidemiological week in the student’s region of residence.
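A minimal sketch of this kind of aggregation is shown below. The record layout, the function name, and the figures are hypothetical and do not reflect the actual structure of the National Institute of Health dataset.

```python
from collections import defaultdict


def positivity_by_region_week(records):
    """Compute positive tests / total tests per (region, epi year, epi week).

    records: iterable of (region, epi_year, epi_week, positive_tests, total_tests)
    """
    positives = defaultdict(int)
    totals = defaultdict(int)
    for region, year, week, positive, total in records:
        key = (region, year, week)
        positives[key] += positive
        totals[key] += total
    # Skip region-weeks with no tests to avoid division by zero.
    return {key: positives[key] / totals[key] for key in totals if totals[key] > 0}


# Made-up figures for illustration only.
sample = [
    ("Puno", 2021, 15, 20, 100),
    ("Puno", 2021, 15, 30, 100),
    ("Lima", 2021, 15, 10, 40),
]
rates = positivity_by_region_week(sample)
print(rates[("Puno", 2021, 15)])  # 0.25
```

Each respondent would then be matched to the rate for their region and the epidemiological week of their response.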
Average of beds occupied with COVID-19 patients at hospital in student’s location
Based on respondent’s IP address, date of response, and information available in the dataset of the Peruvian National Health Superintendency, a new variable was created indicating the ratio of hospital beds occupied by COVID-19 patients to the total number of hospital beds available in each epidemiological week in the student’s region of residence.
