Tag: Funding Outcomes

Productivity Metrics and Peer Review Scores, Continued

In a previous post, I described some initial results from an analysis of the relationships between a range of productivity metrics and peer review scores. The analysis revealed that these productivity metrics do correlate to some extent with peer review scores but that substantial variation occurs across the population of grants.

Here, I explore these relationships in more detail. To facilitate this analysis, I separated the awards into new (Type 1) and competing renewal (Type 2) grants. Some parameters for these two classes are shown in Table 1.

Table 1. Selected parameters for the population of Type 1 (new) and Type 2 (competing renewal) grants funded in Fiscal Year 2006: average numbers of publications, citations and highly cited publications (defined as those in the top 10% of time-corrected citations for all research publications).

For context, the Fiscal Year 2006 success rate was 26%, and the midpoint on the funding curve was near the 20th percentile.

To better visualize trends in the productivity metrics data in light of the large amounts of variability, I calculated running averages over sets of 100 grants separately for the Type 1 and Type 2 groups of grants, shown in Figures 1-3 below.
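As a rough illustration of the running-average calculation described above, here is a minimal Python sketch. It assumes a sliding window that advances one grant at a time after sorting by percentile score; the post does not say whether the sets of 100 overlapped, so that choice, along with the function and variable names, is an assumption.

```python
from statistics import mean


def running_averages(percentiles, metric, window=100):
    """Return (mean percentile, mean metric) for each window of grants.

    Grants are sorted by percentile score, then a window of `window`
    grants slides along one grant at a time. The sliding (overlapping)
    window is an assumption; the post does not specify it.
    """
    if len(percentiles) != len(metric):
        raise ValueError("percentiles and metric must have the same length")
    pairs = sorted(zip(percentiles, metric))
    points = []
    for start in range(len(pairs) - window + 1):
        chunk = pairs[start:start + window]
        points.append((mean(p for p, _ in chunk), mean(m for _, m in chunk)))
    return points


# Toy data only, with a window of 2 so the arithmetic is easy to follow.
toy = running_averages([3, 1, 2, 4], [30, 10, 20, 40], window=2)
print(toy)
```

Running this separately on the Type 1 and Type 2 grants would give the two curves plotted in each figure.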

Figure 1. Running averages for the number of publications over sets of 100 grants funded in Fiscal Year 2006 for Type 1 (new, solid line) and Type 2 (competing renewal, dotted line) grants as a function of the average percentile for that set of 100 grants.

Figure 2. Running averages for the number of citations over sets of 100 grants funded in Fiscal Year 2006 for Type 1 (new, solid line) and Type 2 (competing renewal, dotted line) grants as a function of the average percentile for that set of 100 grants.

Figure 3. Running averages for the number of highly cited publications over sets of 100 grants funded in Fiscal Year 2006 for Type 1 (new, solid line) and Type 2 (competing renewal, dotted line) grants as a function of the average percentile for that set of 100 grants.

These graphs show somewhat different behavior for Type 1 and Type 2 grants. For Type 1 grants, the curves are relatively flat: each metric decreases slightly from the lowest (best) percentile scores, reaches a minimum near the 12th percentile and then increases somewhat. For Type 2 grants, the curves are steeper and somewhat more monotonic.

Note that the curves for the number of highly cited publications for Type 1 and Type 2 grants are nearly superimposable above the 7th percentile. If this metric truly reflects high scientific impact, then the observations that new grants are comparable to competing renewals and that the level of highly cited publications extends through the full range of percentile scores reinforce the need to continue to support new ideas and new investigators.

While these graphs shed light on some of the underlying trends in the productivity metrics and the large amount of variability that is observed, one should be appropriately cautious in interpreting these data given the imperfections in the metrics; the fact that the data reflect only a single year; and the many legitimate sources of variability, such as differences between fields and publishing styles.

Productivity Metrics and Peer Review Scores

A key question regarding the NIH peer review system is how well peer review scores predict subsequent scientific output. Answering this question is a challenge, of course, since meaningful scientific output is difficult to measure and evolves over time, in some cases a long time. However, by linking application peer review scores to publications citing support from the funded grants, it is possible to perform some relevant analyses.

The analysis I discuss below reveals that peer review scores do predict trends in productivity in a manner that is statistically different from random ordering. That said, there is a substantial level of variation in productivity metrics among grants with similar peer review scores and, indeed, across the full distribution of funded grants.

I analyzed 789 R01 grants that NIGMS competitively funded during Fiscal Year 2006. This pool represents all funded R01 applications that received both a priority score and a percentile score during peer review. There were 357 new (Type 1) grants and 432 competing renewal (Type 2) grants, with a median direct cost of $195,000. The percentile scores for these applications ranged from 0.1 through 43.4, with 93% of the applications having scores lower than 20. Figure 1 shows the percentile score distribution.

Figure 1. Cumulative number of NIGMS R01 grants in Fiscal Year 2006 as a function of percentile score.

These grants were linked (primarily by citation in publications) to a total of 6,554 publications that appeared between October 2006 and September 2010 (Fiscal Years 2007-2010). Those publications had been cited 79,295 times as of April 2011. The median number of publications per grant was 7, with an interquartile range of 4-11. The median number of citations per grant was 73, with an interquartile range of 26-156.

The numbers of publications and citations represent the simplest available metrics of productivity. More refined metrics include the number of research (as opposed to review) publications, the number of citations that are not self-citations, the number of citations corrected for typical time dependence (since more recent publications have not had as much time to be cited as older publications), and the number of highly cited publications (which I defined as the top 10% of all publications in a given set). Of course, the metrics are not independent of one another. Table 1 shows these metrics and the correlation coefficients between them.

Table 1. Correlation coefficients between nine metrics of productivity.

How do these metrics relate to percentile scores? Figures 2-4 show three distributions.

Figure 2. Distribution of the number of publications as a function of percentile score. The inset shows a histogram of the number of grants as a function of the number of publications.

Figure 3. Distribution of the number of citations as a function of percentile score. The inset shows a histogram of the number of grants as a function of the number of citations.

Figure 4. Distribution of the number of highly cited publications as a function of percentile score. Highly cited publications are defined as those in the top 10% of all research publications in terms of the total number of citations corrected for the observed average time dependence of citations.

As could be anticipated, there is substantial scatter across each distribution. However, as could also be anticipated, each of these metrics has a negative correlation coefficient with the percentile score, with higher productivity metrics corresponding to lower percentile scores, as shown in Table 2.

Table 2. Correlation coefficients between the grant percentile score and nine metrics of productivity.
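For readers who want to reproduce this kind of number on their own data, the sketch below computes a Pearson correlation coefficient between percentile scores and a productivity metric. The post does not say which type of correlation coefficient was used, so Pearson is an assumption, and the data shown are toy values.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / sqrt(sxx * syy)


# Toy data: better (lower) percentile scores paired with more publications
# give a negative coefficient, the sign reported in Table 2.
percentile_scores = [2.0, 5.0, 9.0, 14.0, 18.0]
publications = [12, 9, 10, 6, 5]
print(pearson(percentile_scores, publications))
```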

Do these distributions reflect statistically significant relationships? This can be addressed through the use of a Lorenz curve, which plots the cumulative fraction of a given metric as a function of the cumulative fraction of grants, ordered by their percentile scores. Figure 5 shows the Lorenz curve for citations.

Figure 5. Cumulative fraction of citations as a function of the cumulative fraction of grants, ordered by percentile score. The shaded area is related to the excess fraction of citations associated with more highly rated grants.

The tendency of the Lorenz curve to reflect a non-uniform distribution can be measured by the Gini coefficient. This corresponds to twice the shaded area in Figure 5. For citations, the Gini coefficient has a value of 0.096. Based on simulations, this coefficient is 3.5 standard deviations above that for a random distribution of citations as a function of percentile score. Thus, the relationship between citations and the percentile score for the distribution is highly statistically significant, even if the grant-to-grant variation within a narrow range of percentile scores is quite substantial. Table 3 shows the Gini coefficients for all of the productivity metrics.

Table 3. Gini coefficients for nine metrics of productivity. The number of standard deviations above the mean, as determined by simulations, is shown in parentheses below each coefficient.

Of these metrics, overall citations show the most statistically significant Gini coefficient, whereas highly cited publications show one of the least significant Gini coefficients. As shown in Figure 4, the distribution of highly cited publications is relatively even across the entire percentile score range.
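The Lorenz curve, Gini coefficient and simulation steps described above can be sketched in Python as follows. This is an illustration under stated assumptions rather than the code used for the analysis: the area is computed with the trapezoid rule, and the "simulations" are assumed to be random shuffles of the metric across grants. All function names are hypothetical.

```python
import random
from statistics import mean, pstdev


def lorenz_points(percentiles, metric):
    """Cumulative fraction of the metric vs. cumulative fraction of grants,
    with grants ordered from best (lowest) to worst percentile score."""
    order = sorted(range(len(metric)), key=lambda i: percentiles[i])
    total = sum(metric)
    n = len(metric)
    xs, ys = [0.0], [0.0]
    running = 0
    for rank, i in enumerate(order, start=1):
        running += metric[i]
        xs.append(rank / n)
        ys.append(running / total)
    return xs, ys


def gini(percentiles, metric):
    """Twice the signed area between the Lorenz curve and the diagonal."""
    xs, ys = lorenz_points(percentiles, metric)
    area_under = sum(
        (xs[k] - xs[k - 1]) * (ys[k] + ys[k - 1]) / 2 for k in range(1, len(xs))
    )
    return 2 * (area_under - 0.5)


def z_score_by_shuffling(percentiles, metric, trials=1000, seed=0):
    """Standard deviations by which the observed coefficient exceeds the mean
    for randomly shuffled metrics (shuffling is an assumed simulation scheme)."""
    rng = random.Random(seed)
    observed = gini(percentiles, metric)
    shuffled = list(metric)
    simulated = []
    for _ in range(trials):
        rng.shuffle(shuffled)
        simulated.append(gini(percentiles, shuffled))
    return (observed - mean(simulated)) / pstdev(simulated)


# Toy data: the best-scoring grant has the most citations.
print(gini([1, 2, 3, 4], [4, 3, 2, 1]))
```

With this sign convention, a positive coefficient means that better-scoring grants account for more than their proportional share of the metric, and a value near zero means the metric is spread evenly across the percentile range.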

Fiscal Year 2010 R01 Funding Outcomes and Estimates for Fiscal Year 2011

Fiscal Year 2010 ended on September 30, 2010. We have now analyzed the overall results for R01 grants, shown in Figures 1-3.

Figure 1. Competing R01 applications reviewed (open rectangles) and funded (solid bars) in Fiscal Year 2010.

Figure 2. NIGMS competing R01 funding curves for Fiscal Years 2006-2010. The thicker curve (black) corresponds to grants made in Fiscal Year 2010. The success rate for R01 applications was 27%, and the midpoint of the funding curve was at approximately the 21st percentile. These parameters are comparable to those for Fiscal Year 2009, excluding awards made with funds from the American Recovery and Reinvestment Act.

The total NIGMS expenditures (including both direct and indirect costs) for R01 grants are shown in Figure 3 for Fiscal Year 1996 through Fiscal Year 2010.

Figure 3. Overall NIGMS expenditures on R01 grants (competing and noncompeting, including supplements) in Fiscal Years 1995-2010. The dotted line shows the impact of awards (including supplements) made with Recovery Act funds. Results are in actual dollars with no correction for inflation.

What do we anticipate for the current fiscal year (Fiscal Year 2011)? At this point, no appropriation bill has passed and we are operating under a continuing resolution through March 4, 2011, that funds NIH at Fiscal Year 2010 levels. Because we do not know the final appropriation level, we are not able at this time to estimate reliably the number of competing grants that we will be able to support. We can, however, estimate the number of research project grant applications in the success rate base (correcting for applications that are reviewed twice in the same fiscal year). We predict that this number will be approximately 3,875, an increase of 17% over Fiscal Year 2010.
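The arithmetic behind these figures can be made explicit with a short Python sketch. The Fiscal Year 2010 base is back-calculated here from the two numbers given above (3,875 and 17%); it is not a figure stated in the post. The award count is purely hypothetical, chosen only to show how a success rate would follow once an appropriation is known.

```python
def implied_prior_base(current_base, fractional_increase):
    """Back out last year's base from this year's base and the percent increase."""
    return current_base / (1 + fractional_increase)


def success_rate(awards, base):
    """Success rate is awards divided by applications in the success rate base."""
    return awards / base


fy2011_base = 3875                      # predicted base, from the post
fy2010_base = implied_prior_base(fy2011_base, 0.17)
print(round(fy2010_base))               # about 3,312 (derived, not from the post)

hypothetical_awards = 900               # HYPOTHETICAL value for illustration only
print(f"{success_rate(hypothetical_awards, fy2011_base):.1%}")
```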

UPDATE: The original post accidentally included a histogram from a previous year. The post now includes the correct Fiscal Year 2010 figure.

Another Look at Measuring the Scientific Output and Impact of NIGMS Grants

In a recent post, I described initial steps toward analyzing the research output of NIGMS R01 and P01 grants. The post stimulated considerable discussion in the scientific community and, most recently, a Nature news article.

In my earlier post, I noted two major observations. First, the output (measured by the number of publications from 2007 through mid-2010 that could be linked to all NIH Fiscal Year 2006 grants from a given investigator) did not increase linearly with increased total annual direct cost support, but rather appeared to reach a plateau. Second, there were considerable ranges in output at all levels of funding.

These observations are even more apparent in the new plot below, which removes the binning in displaying the points corresponding to individual investigators.

A plot of the number of grant-linked publications from 2007 to mid-2010 for 2,938 investigators who held at least one NIGMS R01 or P01 grant in Fiscal Year 2006 as a function of the total annual direct cost for those grants. For this data set, the overall correlation coefficient between the number of publications and the total annual direct cost is 0.14.

Measuring the Scientific Output and Impact of NIGMS Grants

A frequent topic of discussion at our Advisory Council meetings—and across NIH—is how to measure scientific output in ways that effectively capture scientific impact. We have been working on such issues with staff of the Division of Information Services in the NIH Office of Extramural Research. As a result of their efforts, as well as those of several individual institutes, we now have tools that link publications to the grants that funded them.

Using these tools, we have compiled three types of data on the pool of investigators who held at least one NIGMS grant in Fiscal Year 2006. We determined each investigator’s total NIH R01 or P01 funding for that year. We also calculated the total number of publications linked to these grants from 2007 to mid-2010 and the average impact factor for the journals in which these papers appeared. We used impact factors in place of citations because the time dependence of citations makes them significantly more complicated to use.

I presented some of the results of our analysis of these data at last week’s Advisory Council meeting. Here are the distributions for the three parameters for the 2,938 investigators in the sample set:

Histograms showing the distributions of total annual direct costs, number of publications linked to those grants from 2007 to mid-2010 and average impact factor for the publication journals for 2,938 investigators who held at least one NIGMS R01 or P01 grant in Fiscal Year 2006.

For this population, the median annual total direct cost was $220,000, the median number of grant-linked publications was six and the median journal average impact factor was 5.5.

A plot of the median number of grant-linked publications and median journal average impact factors versus grant total annual direct costs is shown below.

A plot of the median number of grant-linked publications from 2007 to mid-2010 (red circles) and median average impact factor for journals in which these papers were published (blue squares) for 2,938 investigators who held at least one NIGMS R01 or P01 grant in Fiscal Year 2006. The shaded bars show the interquartile ranges for the number of grant-linked publications (longer red bars) and journal average impact factors (shorter blue bars). The medians are for bins, with the number of investigators in each bin shown below the bars.

This plot reveals several important points. The ranges in the number of publications and average impact factors within each total annual direct cost bin are quite large. This partly reflects variations in investigator productivity as measured by these parameters, but it also reflects variations in publication patterns among fields and other factors.

Nonetheless, clear trends are evident in the medians for the binned groups, with both parameters increasing with total annual direct costs until they peak at around $700,000. These observations provide support for our previously developed policy on the support of research in well-funded laboratories. This policy helps us use Institute resources as effectively as possible in supporting the overall biomedical research enterprise.

This is a preliminary analysis, and the results should be viewed with some skepticism given the metrics used, the challenges of capturing publications associated with particular grants, the lack of inclusion of funding from non-NIH sources and other considerations. Even with these caveats, the analysis does provide some insight into the NIGMS grant portfolio and indicates some of the questions that can be addressed with the new tools that NIH is developing.
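For readers who want to try a similar binned summary on their own data, here is a minimal Python sketch of the median and interquartile range calculation behind the plot above. The bin width, the "inclusive" quartile method and the function name are assumptions; the post does not state how the bins or quartiles were defined.

```python
from statistics import median, quantiles


def binned_summary(costs, values, bin_width=100_000):
    """Group investigators by total annual direct cost and summarize a metric.

    Returns {bin lower edge: (count, median, first quartile, third quartile)}.
    The bin width and quartile method are assumptions for illustration.
    """
    bins = {}
    for cost, value in zip(costs, values):
        bins.setdefault(int(cost // bin_width), []).append(value)
    summary = {}
    for index in sorted(bins):
        members = bins[index]
        if len(members) >= 2:
            q1, _, q3 = quantiles(members, n=4, method="inclusive")
        else:
            # A single investigator has no spread to report.
            q1 = q3 = members[0]
        summary[index * bin_width] = (len(members), median(members), q1, q3)
    return summary


# Toy data only: direct costs in dollars and publication counts.
print(binned_summary([150_000, 160_000, 170_000, 250_000], [2, 4, 6, 9]))
```

Reporting the count alongside each median matters because sparsely populated bins, typically those at the highest funding levels, give much noisier medians.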