September 8, 2022


New dataset offers unique insights into peer review

Researchers can now apply to study the anonymized manuscript metadata of 2,000+ Elsevier journals

Linda Willems

The mission of Elsevier’s International Center for the Study of Research (ICSR) is an ambitious one — to advance research evaluation in all fields. The ICSR Lab is just one of the ways it has supported that aim over the past two years. This cloud-based computational platform enables researchers to analyze large structured datasets, including those that power Elsevier solutions such as Scopus and PlumX.

This week, the ICSR Lab added a dataset that is breaking new ground in the scholarly communication world: three years’ worth of Elsevier journal manuscript and peer review data containing rich and connected information on more than 5 million manuscript authors, reviewers, and editors.

Interested researchers will also have access to editorial decisions and timelines, including submission and decision dates, along with metadata for authors, editors and reviewers. All the data are housed in a special part of the virtual lab, called the Peer Review Workbench.

The initiative is the brainchild of Elsevier’s Reviewer Experience Lead, Dr Bahar Mehmani, who developed the project with one very clear goal in mind: to support systematic research into peer review.

She explains:

Unlike other research topics, studies on peer review have always been somewhat fragmented and are often based on limited evidence. This is mainly because researchers struggle to get access to manuscript and peer review data at scale.

At Elsevier, we periodically receive peer review study proposals, for example, from researchers who want to learn what kind of peer review model works best in their field or who want to look at peer review through a gender lens. We try to support them where we can, but building these ad hoc requests into existing workflows isn’t always easy.

Bahar Mehmani

Elsevier’s Reviewer Experience Lead

New in The Scholarly Kitchen

The Peer Review Workbench: An Interview with Bahar Mehmani

Driven by a desire to help these researchers, and her own positive experiences as a member of the peer review working group PEERE, Bahar approached the ICSR Lab team with a proposal to supplement their existing datasets with Elsevier manuscript and peer review data. The ICSR Lab team jumped at the chance.

Dr Andrew Plume, VP of Research Evaluation at Elsevier and President of the ICSR, notes:

Journal-mediated peer review of manuscripts is often held up to be a gold standard for expert review, and yet it is a process about which we lack comprehensive studies. By opening these data to research scrutiny in a safe and anonymized way, we anticipate the creation of an evidence base to support improvements in peer review.

In addition, we hope that it will inform policy in other areas where expert review is heavily relied on, such as formal research assessment practices.

Andrew Plume

VP of Research Evaluation at Elsevier and President of the ICSR

What data are available?

For this initial release, the Peer Review Workbench will contain three years of manuscript metadata from Elsevier’s online manuscript submission systems, including information on editorial decisions, manuscript authors and peer reviewers.

“Owing to the sheer volume of annual manuscript submissions to Elsevier journals,” Bahar says, “we have opted to start with three years’ worth of data and aim to grow that corpus each year to increase the utility of the data for trend analysis. So next year, there will be four years’ worth of data available, and so on.

“But even with just three years of data,” she adds, “we are talking about information on at least 5 million individuals, which we have anonymized, aggregated and enriched.” In addition, each group’s access is restricted to the variables and subset of the data that is required to approach their own research question.

According to Bahar, these steps are partly designed to ensure that no individual can be identified by anyone studying the data:

Researchers will only have access to a set of ID numbers that are unique to this dataset. And we will remove ‘outlier’ cases: for example, authors or peer reviewers who might be identifiable via their country or institution. In addition, we have checks and balances in place, along with clear proposal guidelines, that ensure submitted studies respect ethical considerations and statistical secrecy rules.

Bahar Mehmani

Elsevier’s Reviewer Experience Lead
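The two safeguards Bahar describes, dataset-specific ID numbers and the removal of identifiable outliers, can be illustrated with a minimal Python sketch. It is not Elsevier's actual pipeline: the field names (person_id, country, institution), the random-token scheme and the min_people threshold are all assumptions made for illustration.

```python
import secrets
from collections import defaultdict


def drop_rare_groups(records, keys=("country", "institution"), min_people=3):
    """Remove records from groups with too few distinct people.

    A person who is one of very few from a given country/institution
    could be re-identified, so the whole small group is dropped.
    The threshold of 3 is a hypothetical value, not a stated rule.
    """
    people = defaultdict(set)
    for rec in records:
        group = tuple(rec[k] for k in keys)
        people[group].add(rec["person_id"])
    return [
        rec for rec in records
        if len(people[tuple(rec[k] for k in keys)]) >= min_people
    ]


def pseudonymize(records, id_field="person_id"):
    """Swap real identifiers for random IDs that exist only in this dataset.

    The same person always maps to the same random ID, so records stay
    linkable to each other without exposing the original identifier.
    """
    mapping = {}
    out = []
    for rec in records:
        real_id = rec[id_field]
        if real_id not in mapping:
            mapping[real_id] = secrets.token_hex(8)
        out.append({**rec, id_field: mapping[real_id]})
    return out
```

In this sketch the rare groups are dropped first, while the original identifiers are still available for counting distinct people, and only then are the identifiers replaced.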

The ICSR Lab team supporting the Peer Review Workbench also has a variety of data enrichment options at their disposal, which will be tailored to meet the scope of each proposal. These include:

Aggregated data from Scopus author profiles: for example, the publication history of a specific cohort of publishing authors.
Inferred gender. Bahar explains: “These gender assignment algorithms can provide insightful patterns and played an important role in the compilation of Elsevier’s gender reports. Nevertheless, they also have limitations — they are less efficient in inferring the gender of non-European names, and they don’t reflect the full gender spectrum.”
PlumX Metrics, which offer insight into the reach and impact of online publications.
SciVal Topics, which provides a fine-grained view of research topics that transcends journal boundaries.
The UN Sustainable Development Goal classifications used to tag sustainability-related journal articles.

Bahar adds:

It is such a rich and multi-faceted dataset. For example, researchers can see how many authors a manuscript has, how many prior publications each of those co-authors has, and how many of these co-authors have themselves previously reviewed for Elsevier journals. We can also give some indication of reviewing history across the Elsevier journal ecosystem; for each peer reviewer in the dataset, we can look at the number of reviews submitted or the number of review invites accepted. The connections just go on and on.

Researchers are also welcome to augment this dataset with their own files — for example, survey responses or project funding. And we always welcome suggestions for new enrichment options — our data scientists love a challenge!

Bahar Mehmani

Elsevier’s Reviewer Experience Lead
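As a rough illustration of the connected variables Bahar mentions, the following Python sketch derives per-manuscript figures from three simple inputs. The function name, the input shapes and the output keys are hypothetical and do not reflect the actual Peer Review Workbench schema.

```python
from collections import defaultdict


def manuscript_features(authorships, prior_pubs, past_reviewers):
    """Summarize each manuscript's author team.

    authorships:    iterable of (manuscript_id, author_id) pairs
    prior_pubs:     dict mapping author_id -> number of prior publications
    past_reviewers: set of author_ids who have reviewed before
    """
    authors_by_ms = defaultdict(set)
    for ms_id, author_id in authorships:
        authors_by_ms[ms_id].add(author_id)

    features = {}
    for ms_id, authors in authors_by_ms.items():
        features[ms_id] = {
            # How many distinct authors the manuscript has
            "n_authors": len(authors),
            # Prior publication count per co-author (0 if unknown)
            "prior_pubs": {a: prior_pubs.get(a, 0) for a in sorted(authors)},
            # How many co-authors have themselves reviewed before
            "n_authors_who_reviewed": sum(
                1 for a in authors if a in past_reviewers
            ),
        }
    return features
```

The IDs passed in would be the anonymized, dataset-specific ones, so the linkage works without revealing who any author or reviewer is.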

What kind of analyses can researchers run?

According to Bahar, there is no shortage of potential research questions that the Peer Review Workbench dataset can help to address. And with its two years of pre-COVID data that can be used to define baselines, it’s ideal for running studies on journal peer review resilience during the pandemic. In fact, she used a similar dataset for a collaboration with Prof Flaminio Squazzoni of the University of Milan and Prof Francisco Grimaldo of the University of Valencia looking at the impact of the COVID lockdown measures on women academics across the globe.

She says there are some proposals she would personally love to see:

I think studies on the reviewer performance of different cohorts would be really interesting, and those cohorts might be country, career stage, discipline, or even peer review model. They would help us understand how different characteristics of peer reviewers impact the quality and outcomes of peer review. And that would help journals understand how best to engage and support different groups.

Bahar Mehmani

Elsevier’s Reviewer Experience Lead

Bahar would also like to see researchers submit proposals to repeat existing studies, because “reproducibility is so important.” And she expects to see requests to follow up on earlier studies:

Researchers will be able to see whether any recommendations arising from their initial study have been acted upon by the relevant journals — and if they have, whether they made any difference.

How do researchers apply?

Proposal submissions are welcome at any time via the ICSR Lab website. Each one will be checked by the ICSR team for completeness and addressability before being shared with an independent board of four academics for peer review. The board members are:

Prof Ana Marušić, Chair of the Department of Research in Biomedicine and Health and Center for Evidence-based Medicine, University of Split School of Medicine, Croatia; co-Editor-in-Chief of the Journal of Global Health.
Prof Francisco Grimaldo, Vice Dean and Associate Professor in the School of Engineering, University of Valencia; Vice Chair of PEERE.
Prof Francesca Dominici, Professor of Biostatistics, Population and Data Science, Harvard TH Chan School of Public Health.
Dr Mario Malički, META researcher at Stanford; Editor-in-Chief of Research Integrity and Peer Review, Springer Nature.

Researchers whose proposals are accepted will be given access to the relevant data.

Although research will be conducted within the Peer Review Workbench environment, researchers are welcome to export their aggregate data and relevant code.

They will also be encouraged to register their study’s hypothesis and methodology and link each stage of their project back to the initial proposal. The preprint server SSRN has opened a new channel for this purpose. Publications can take a variety of formats, including conference presentations and papers, working papers, preprints and peer-reviewed journal articles. A full list of data access requirements is available on the ICSR Lab website.


Pioneering a new approach to data sharing

According to Bahar, this is the first time a publisher has taken the step of sharing manuscript and peer review data for research purposes. “When the PEERE working group was active from 2015 to 2018, Elsevier, along with publishers Wiley, Springer Nature and the Royal Society, amalgamated a selection of their peer review data for use within the working group,” Bahar explains. “But no publisher has made this kind of data so openly accessible before.”

While Bahar believes that this new dataset will contribute to our understanding of peer review, she’s well aware that it’s only part of a bigger picture:

Academics publish with multiple publishers — it’s the same with reviewing — so any study on our data will still have its limitations. It would be great if other publishers could find ways to share their manuscript and peer review data. And we’d love to reach a stage where we can join forces to create one single dataset.

Bahar Mehmani

Elsevier’s Reviewer Experience Lead

A global, cross-business collaboration

Colleagues from across Elsevier contributed to preparing this dataset. They include:

Legal and data privacy: Alex Chan, Dr IJsbrand Jan Aalbersberg and Jessica Alexander
ICSR: Kristy James, Silvia Dobre and Alick Bird
Submission systems: Ramsundhar Baskaravelu worked with Natarajan Kaliyamoorthy and Lauren Oppenheim of Aries Systems
Data Science Research Content team: Ramadurai Petchiappan, Dr Efthymios Tsakonas, Pascal Coupet, and Dr Georgios Tsatsaronis
Global Academic Research: Noelle Gracy and Pragya Singh

Contributors

After starting her working life as a newspaper journalist (covering everything from amateur dramatics to murder trials), Linda Willems held a variety of communications roles before joining Elsevier. During her six years with the company, she focused on researcher communications and edited several of Elsevier’s researcher-focused publications. She's now a freelance writer and owner of Blue Lime Communications.

Linda Willems

