Almost 11 years ago, Stefan Wuchty, Benjamin Jones, and Brian Uzzi (all of Northwestern University) published an article in Science on “The Increasing Dominance of Teams in Production of Knowledge.” They analyzed nearly 20 million papers published over 5 decades, along with 2.1 million patents, and found that across all fields the number of authors per paper (or patent) steadily increased, that teams were coming to dominate individual efforts, and that teams produced more highly cited research.
In a Science review paper published a few weeks ago, Santo Fortunato and colleagues offered an overview of the “Science of Science.” One of their key messages was that “Research is shifting to teams, so engaging in collaboration is beneficial.”
I thought it would be worth exploring this concept further using NIH grants. For this post, data were acquired using an NIH portfolio analysis tool called iSearch. This platform provides easy access to carefully curated, extensively linked datasets of global grants, patents, publications, clinical trials, and approved drugs.
One way of measuring team size is to count the number of co-authors on published papers. Figure 1 shows box-and-whisker plots of author counts for 1,799,830 NIH-supported papers published between 1995 and 2017. The black diamonds represent the means. We can see from these data that the author counts on publications resulting from NIH support have steadily increased over time (mean from 4.2 to 7.4, median from 4 to 6).
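For readers who would like to build a similar view from their own publication data, here is a minimal sketch in Python; the file name (papers.csv) and column names (year, n_authors) are illustrative assumptions, not the actual iSearch export format.

```python
# Minimal sketch: box-and-whisker plots of authors per paper by publication year.
# Assumes a table with one row per paper; file and column names are hypothetical.
import pandas as pd
import matplotlib.pyplot as plt

papers = pd.read_csv("papers.csv")  # hypothetical columns: year, n_authors

fig, ax = plt.subplots(figsize=(12, 5))
# showmeans=True draws a marker for the mean, analogous to the black diamonds.
papers.boxplot(column="n_authors", by="year", ax=ax, showmeans=True)
ax.set_title("")
ax.set_xlabel("Publication year")
ax.set_ylabel("Authors per paper")
fig.suptitle("")  # suppress the automatic "grouped by" title pandas adds
plt.show()
```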
Figure 2 shows corresponding data for 765,851 papers that were supported only by research (R) grants. In other words, none cited receiving support from program project (P), cooperative agreement (U), career development (K), training (T), or fellowship (F) awards. We see a similar pattern in which author counts have increased over time (mean from 4.0 to 6.2, median from 4 to 5). Also of note is a drift of the mean away from the median, reflecting an increasingly skewed distribution driven by a subset of papers with large numbers of authors.
Next, let’s look at corresponding data for papers that received support from at least one P grant (N=498,790) or at least one U grant (N=216,600), shown in Figures 3 and 4 respectively. As we can see, the patterns are similar to those seen for R awards.
Figure 5 focuses on 277,330 R-, P-, or U-supported papers published between 2015 and 2017 and shows author counts for papers supported by R grants only (49%), P grants only (11%), U grants only (8%), R and P grants (16%), R and U grants (7%), and P and U grants (9%). The patterns are not surprising: author counts are higher for papers supported by P and U grants, likely because these are large multi-component activities that inherently involve many researchers. But even for R-grant papers, the clear majority involve multiple authors.
Finally, in Figure 6 we show a scatter plot (with a generalized additive model smoother) of relative citation ratio (RCR) according to author count for NIH-supported papers published in 2010. As a reminder, RCR is a metric that uses citation rates to measure influence at the article level. Consistent with previous literature, an increased author count is associated with higher citation influence; in other words, the more authors on a paper, the more likely it is to be influential in its field.
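Figure 6 can be approximated in a few lines of code as well. The sketch below uses a LOWESS smoother as a simple stand-in for the generalized additive model in the actual figure; again, the file and column names (papers_2010.csv, n_authors, rcr) are assumptions for illustration.

```python
# Minimal sketch: scatter of article-level RCR against author count, with a
# LOWESS smoother standing in for the GAM used in Figure 6. Names are hypothetical.
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from statsmodels.nonparametric.smoothers_lowess import lowess

papers = pd.read_csv("papers_2010.csv")  # hypothetical columns: n_authors, rcr

x = np.log10(papers["n_authors"])
y = np.log10(papers["rcr"].clip(lower=0.01))  # floor to avoid log(0)
smoothed = lowess(y, x, frac=0.3)  # returns (x, y) pairs sorted by x

plt.scatter(x, y, s=4, alpha=0.2)
plt.plot(smoothed[:, 0], smoothed[:, 1], color="red")
plt.xlabel("log10(author count)")
plt.ylabel("log10(RCR)")
plt.show()
```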
Summarizing these findings:
- Consistent with prior literature, we see that NIH-funded extramural research, including research funded by R grants, produces mostly multi-author papers, with increasing numbers of authors per paper over time. These findings are consistent with the growing importance of team science.
- Mechanisms designed to promote larger-scale team science (mainly P and U grants) generate papers with greater numbers of authors.
- Greater numbers of authors are associated with greater citation influence.
It is important to understand that, even in this competitive funding environment, research is shifting to teams. And when we look more closely at the impact of the shift, we see that collaboration is proving to move science forward in important ways. How big should teams be? Some recent literature suggests that small teams are more likely than large teams to produce disruptive papers. A few years ago, my colleagues published a paper on the NIH-funded research workforce; they found that the average team size was 6. Is this optimal? We don’t know.
There is much more for us to look at in terms of the role of team science in NIH supported research. In the meantime, it’s great to see more confirmation that scientific collaboration is truly beneficial to moving science forward.
32 Comments

It’s good to see the data fit the “hypothesis”: science has gotten more complicated, with bigger data sets, more experimental testing, etc. The question is why the $250K modular R01 grant is still the NIH’s primary mechanism for supporting this research, unchanged over the past 15 years. This award supports the PI and 2 individuals to execute 2-3 scientific aims. That is insufficient to support the big-data collaborative research for which the NIH R01 is supposedly the bedrock. Why NIH/NHLBI has not recognized this reality and fixed it is baffling.
I endorse this view entirely. While NIH has introduced the multi-PI mechanism, the reality is that there is no funding mechanism suitable for multi-PI research. The R01 is unsuitable due to the incessant push to modular budgets. Dr. Fisher is correct that $250k per year is insufficient for even a single PI to do a considerable amount of work, especially when working in patients or expensive animal models.
Is the data regarding impact (RCR) being skewed by the genome sequencing projects carried out by ENCODE and similar consortia? What does the impact look like if these relatively few, but highly cited, papers are removed?
In my experience, papers with more authors present results that are much more difficult to duplicate. I’d like to see a graph correlating the number of authors, over time, for papers that have either been withdrawn or otherwise contain fraudulent data. Science is more complex, but with multiple authors no one takes full responsibility, and few really understand all the subtleties of the data generated by complex machines.
This is so true, and we have seen it firsthand many times. Apart from that, papers with more authors also have more “decorative” authorships.
You are right. If PIs are at different institutions, we do not know whether their students, postdocs, or technicians applied the same rigor and reproducibility. We cannot have the best of both worlds.
This raises the question, in the era of team science, what is a more equitable way to recognize authors for their contributions since there is only one first or last author position possible?
There is an alternative explanation. The NIH has not provided inflationary increases to individual grants for 20 years, reducing their purchasing power by a substantial amount. If no one lab has enough funding to do competitive science, then labs must collaborate to do so. Thus, underfunding might drive collaboration, and the analysis presented may simply be detecting the consequence of underfunding. It would thus be useful to normalize publications to the number of dollars available per team.
Indeed, the data shown in the plot of citations as a function of the number of authors is inappropriately normalized: it does not account for cost-effectiveness. A naïve measure of cost would be “number of authors” and the relevant quantity would then be “citations [RCR] per author.” To appearances, this would more-or-less flatten out the curve; indeed, it would trend downward with team size.
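A back-of-the-envelope version of that normalization, assuming a table with hypothetical columns n_authors and rcr:

```python
# Hypothetical sketch: citation influence per author rather than per paper.
import pandas as pd

papers = pd.read_csv("papers_2010.csv")  # assumed columns: n_authors, rcr
papers["rcr_per_author"] = papers["rcr"] / papers["n_authors"]

# Compare the raw and per-author quantities across team sizes.
print(papers.groupby("n_authors")[["rcr", "rcr_per_author"]].mean())
```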
Thus, the plot lends itself to a misleading interpretation; it does not reflect “productivity” in any normal sense of the term. In particular, setting policy on resource allocation based on the conclusions of this article would be a misuse of the data presented here. All it reveals is that other things being equal, those better resourced will be more cited.
“Quantity and quality of science [or perhaps, health benefit] per dollar” is probably closer to a measure of productivity that policy decisions should be aimed at optimizing. The high rate of irreproducibility of published science – even if longstanding and not merely a recent development – suggests that it’s a measure that NIH might prefer not to look into too deeply.
In the areas of epidemiology and public health, team efforts started way back in the days of the early NIH-funded multi-center clinical trials, such as MRFIT (Multiple Risk Factor Intervention Trial) and the HDFP (Hypertension Detection and Follow-up Program). It was (and continues to be) exhilarating to work with scientists from many disciplines, with many different perspectives, and with common goals. Studies of ‘omics’ have expanded the team approaches. Science has moved from small, one-institution studies, to large multi-center ones, to consortia of studies, and now to consortia of consortia. It is a privilege to be part of this vast collaborative enterprise.
More cynically, since everyone likes to cite their own work, a publication with 20 authors has tenfold more people who want to cite it than a publication with 2 authors.
So it would be interesting to adjust the relative citation metric to remove self-citation and see what happens.
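One rough way to do that, assuming a citation-level table with author lists available for both the citing and the cited paper (all file and column names here are hypothetical):

```python
# Sketch: flag citations where the citing and cited papers share any author.
import pandas as pd

cites = pd.read_csv("citations.csv")  # assumed columns: citing_authors, cited_authors
                                      # (semicolon-separated author names)

def shares_author(row):
    citing = set(row["citing_authors"].split(";"))
    cited = set(row["cited_authors"].split(";"))
    return bool(citing & cited)  # any overlap is treated as a self-citation

cites["self_cite"] = cites.apply(shares_author, axis=1)
external = cites[~cites["self_cite"]]  # recompute citation metrics on this subset
print(f"{cites['self_cite'].mean():.1%} of citations share at least one author")
```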
Good point!
The number of times a paper gets cited is akin to a PCR reaction.
With low template input (the number of authors), the reaction (the amplification effect in citations) does not take off as fast.
Other factors are obviously at play, but this could probably be measured.
One could pick pairs of competing papers reporting the same discovery, published at the same time, and then see which ones are more cited and examine the correlation with the number of authors.
I think that the number of authors per publication is increasing and collaborations are increasing, but that has little to do with the team concept. The multiple authors of a research paper are often at different institutions or in different departments, and are rarely members of a single team. Any conclusions favoring the creation of larger teams would be a misinterpretation of the data.
I agree with this assessment. NIH appears to have misinterpreted, or not considered, other interpretations of these data!
I agree, in general, that there is very little teamwork, especially from middle authors. That said, being at different institutions does not preclude great collaborations. In fact, my best collaborators have always been >3000 miles away.
I am surprised at the weak correlation of RCR with the number of authors per paper (Fig. 6; I appreciate it is log-log). If self-citations were removed, would the correlation disappear or even turn negative? I am excited about ‘team science’ and understand that it is ‘the future,’ but as a scientist I worry about accepting the null hypothesis…
Correlation offers no separation of cause and effect.
Please don’t let the NIH fall into that trap.
There is an incredible number of low-citation papers even with 10 or more authors… that is a large cost for something so inconsequential… or is this an inappropriate way to measure the impact of the research?
Mike – NCI/NIH needs to do more to give stronger recognition to team science in the context of MPI. Right now, awards are attributed to the contact PI, which can disadvantage the MPI(s).
Clearly, team science is beneficial, whichever way one looks at it. NIH should consider rethinking conflict-of-interest (COI) policies that are in fact a deterrent to such activities. Very often, investigative teams are unable to have their application reviewed in the most appropriate peer review study section because of conflicts that are likely to arise. Most often, junior/new/early-stage principal investigators are adversely impacted when they aspire to collaborate with an established investigator who happens to be on the study section best positioned to review the application.
I wonder if – in the spirit of rigor, reproducibility, and robustness – the data for these R graphs could be made available? No doubt there are more authors per paper than ever before, but looking at the graph, the payoff seems to peter out at the inflection point of the Loess fit. Beyond 20 authors, the fit is dominated by very few, but high leverage points. I would not be surprised if a traditional linear regression would show only a marginal association between team size and scientific impact.
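For what it’s worth, a plain ordinary-least-squares fit on the log-log scale would be a two-minute check if the data were available; the sketch below assumes hypothetical file and column names (papers_2010.csv, n_authors, rcr):

```python
# Sketch: OLS on the log-log scale as a check on the smoothed fit.
import numpy as np
import pandas as pd
import statsmodels.api as sm

papers = pd.read_csv("papers_2010.csv")  # assumed columns: n_authors, rcr
papers = papers[papers["rcr"] > 0]       # drop zero-RCR papers before taking logs

X = sm.add_constant(np.log10(papers["n_authors"]))
fit = sm.OLS(np.log10(papers["rcr"]), X).fit()
print(fit.summary())  # the slope and its confidence interval are what matter here
```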
This may reflect the incentive structure as much as productivity. There is no cost to adding extra authors, at least beyond 3, only benefit. So it is natural to expect author counts to go up even if research is not getting more collaborative.
I also have doubts about the impact, simply because it is not normalized to the number of authors. The real question should be, “Is an individual more productive when participating in a team, as compared to working on his or her own or in a small group?” To measure this marginal impact of adding an author, you could calculate the number of citations to papers with n authors minus the number of citations to papers with n-1 authors, as n increases from 2 upward.
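Concretely, the marginal-impact calculation could look something like this (file and column names hypothetical):

```python
# Sketch: marginal citation impact of adding one more author.
import pandas as pd

papers = pd.read_csv("papers_2010.csv")  # assumed columns: n_authors, rcr
mean_rcr = papers.groupby("n_authors")["rcr"].mean().sort_index()
marginal = mean_rcr.diff()  # value at n is mean RCR(n) - mean RCR(n-1)
print(marginal.head(20))
```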
I am not sure how well this represents collaboration. Scientists (and their institutions) have become more aggressive in seeking authorships on papers as tenure, grants, and other recognitions have become more stringent. In my younger days, scientists freely exchanged ideas and were generous in not claiming authorship unless they really participated in the study. Yet another factor is the “citation” frenzy; it would be interesting to check whether the increased number of authors leads to increased citations (at least by the authors!).
Unless I’ve analyzed this incorrectly, the net effect is that the impact to science per author is decreasing, because the impact of multi-author papers does not increase in proportion to the number of authors. If one factors in the self-citation argument above, it would seem that the net effect of team science is science with a lower net impact per researcher. That seems a less efficient use of money. What is missing is the RCR per dollar spent, which I would like to see, because efficiency at that level might justify the apparent losses per researcher.
It seems all too obvious and elementary that team collaboration would produce more publications, and NIH should promote multi-center collaborations. Some NIH grant reviewers do not like “long” distances between collaborating research labs and often criticize multi-center collaborations; this should also be addressed.
The article makes some great points and thank you for referencing our research (Wuchty, Jones, and Uzzi 2007). May I point out a typo? The first author on the paper is “Stefan Wuchty,” not “Stefan Duchy.” If it is possible to make the correction, please do.
In genetic epidemiology many manuscripts are produced by consortia involving multiple studies. Typically each study has 1-3 authors listed on a manuscript. It is not unusual for such a manuscript to have more than 100 named authors (plus more listed elsewhere in the manuscript under membership of the individual studies in the consortium but still showing up in a PubMed search).
And what is wrong with that? (“…in genetic epidemiology it is not unusual to have more than 100 named authors”). It means that all these people participated in the various studies making up the consortium. Data doesn’t generate itself – many people are involved, and they should get credit, particularly junior people who are in a field where the chips of the game are research productivity, as demonstrated in important publications.
I wasn’t trying to imply there was anything wrong with that. But it is part of the explanation for why the average number of authors has increased, even if the sizes of the teams on the individual studies in the consortium have not increased.
It’s unclear where to draw the line of “contributing enough to be listed as an author.” Should the lab technicians that processed all those samples also be listed? Oftentimes, they contributed as much or more to the writing of the manuscript.
NCATS emphasizes the importance of teaching scholars team science and promoting team science activities in its funding of clinical and translational science centers. Unfortunately, my experience on review committees is that reviewers punish the use of multiple PIs, and team science in general, even when well justified, on R01s. If NIH really wanted to support the use of team science, it should note that on the reviewer templates and in reviewer training. Otherwise, we are training our junior researchers in team science and setting them up for failure when it comes time for their grant applications to be reviewed.
I am wondering if the lack of appreciation of team science is more common for faculty in my discipline, psychology, and with faculty members who work in departments outside of academic medical centers.
I absolutely agree with the previous comments. The reviewers are not equipped to handle team science. A team brings collective expertise and does not depend on the PI alone, yet the reviewers often critique the PI for lack of expertise in certain areas. They don’t look at the ability to lead a team to success, etc. Also, a new team is a new team; there is no sense in comparing it with an established team. It is the ideas and the expertise that they bring to the table that count.
Team science is particularly lacking in old teams. They have a team, but often below-par science. They have run out of ideas but don’t want to give up the money. They lobby heavily. In my opinion, teams should be disbanded after one term unless their productivity has been extraordinary and they have new ideas to compete with new applicants.
This was obvious in SCORE applications, many PPGs, and population science. I had the privilege of reviewing large grants, and many of them would not even have been able to compete with an R01 of two million dollars; yet these grants were funded for 15-20 million dollars.
While NIH has been critical of reviewers and investigators who seek improper channels, NIH should look into its own SRAs and big-grant funding mechanisms. Often, statisticians and population scientists who have little or no insight into the science or technology railroad the review process. Experts don’t want to participate in the review process: too much work, too little compensation, and frustration with the review system account for their lack of interest. They also fear that if they turn down any grant, the affected scientists will turn down their grant in the next cycle. Our review panels and administrators consist of washed-out scientists and junior scientists who are unfamiliar with areas outside their own narrow investigations, who have no sense of current science but do favors for those who are supportive of them.