In July, IOI hosted its second State of Open Infrastructure Community Conversation, this time exploring the state of open infrastructure grant funding.

To set the stage, IOI’s senior researcher Gail Steinhart provided an overview of the methods that were used to gather over $415M USD in grant funding data for open infrastructures (OIs) and broke down some of the key findings from the analysis. To dive further into the topic of funding data, IOI Executive Director Kaitlin Thaney facilitated a panel conversation that featured Steinhart, collaborators Cameron Neylon and Karl Huang from the Curtin Open Knowledge Initiative (COKI), and John Mohr, CIO of Information Technology for the MacArthur Foundation and co-founder of the Philanthropy Data Commons. With their extensive experience in grant funding from diverse perspectives of the scholarly ecosystem, the panel shed light on the trends, impact, and limitations of grant funding for OIs. 

Below are highlights from that conversation (tune in to the full recording here): 

A closer look at the relationship of direct to indirect funding 

Steinhart kicked off the session with a closer look at the findings from our grants data analysis, built on a total of $415,845,753 USD in funding data from 23 funders to 36 OIs, via 514 awards made between 2000 and 2024. (Access our free dashboards and play around with the data yourself. You can also download the entire dataset from Zenodo.)

Across the grants, the team mapped the 36 open infrastructures represented in this dataset. Awards were categorized to reflect whether they provide:

  • direct support to an OI; 
  • indirect support, meaning the OI is referenced in the award title or abstract, but the funding does not directly support the OI, though it may provide some indication of an OI’s broader impact; 
  • adoption support, which is funding that supports the implementation of an instance of an OI at a local or community scale; and 
  • grants we were unable to classify (unknown).

While a significant amount (42%) of funding goes to direct support, the majority of the funding (52%) goes to indirect support (see the figure below).

Sum of all awards by super category.
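
For readers who download the dataset and want to reproduce this kind of breakdown, the following is a minimal sketch of how funding shares by category can be computed. The award records, amounts, and function names here are hypothetical and purely illustrative; they are not taken from the IOI dataset, and the real dataset's field names may differ. The second helper applies the rounded percentages reported above to the reported total, which gives roughly $175M for direct support and $216M for indirect support (approximate, because the published percentages are rounded).

```python
from collections import defaultdict

# Hypothetical award records as (category, amount in USD).
# These values are invented for illustration, not drawn from the IOI dataset.
SAMPLE_AWARDS = [
    ("direct", 300),
    ("direct", 100),
    ("indirect", 450),
    ("adoption", 100),
    ("unknown", 50),
]

# Total reported in the post for the full dataset.
REPORTED_TOTAL_USD = 415_845_753


def share_by_category(awards):
    """Return each category's percentage share of the total awarded amount."""
    totals = defaultdict(int)
    for category, amount in awards:
        totals[category] += amount
    grand_total = sum(totals.values())
    if grand_total == 0:
        return {}
    return {cat: 100 * amt / grand_total for cat, amt in totals.items()}


def approx_amount(total_usd, percent):
    """Estimate a dollar amount from a total and a (rounded) percentage."""
    return total_usd * percent / 100


if __name__ == "__main__":
    for category, share in share_by_category(SAMPLE_AWARDS).items():
        print(f"{category}: {share:.1f}%")
    print(f"direct   ~ ${approx_amount(REPORTED_TOTAL_USD, 42) / 1e6:.0f}M")
    print(f"indirect ~ ${approx_amount(REPORTED_TOTAL_USD, 52) / 1e6:.0f}M")
```

Summing amounts per category before dividing by the grand total keeps the calculation independent of how many awards each category contains, which matters when a few large awards dominate a category.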

Steinhart further explained this breakdown of “direct” and “indirect”, noting examples where an OI is named in a grant title or description, but funding goes to a user of an OI platform or service, such as a researcher publishing on the arXiv or sharing their data through Dryad or GenBank. This is important to note as we look at the reliance on OIs across the research enterprise as a signal of their use, and examine how that may or may not lead to additional sustaining funding for these infrastructures to maintain and make available these critical services. 

Challenges in accessing grant funding data

The group discussed the many challenges in gathering and analysing grant funding data. They found that the sources of grant funding data either hold incomplete data about the awards or store the data in varied formats, which makes it difficult to process. In addition, only a limited set of funders make their funding data publicly accessible.

“In the process of writing the chapter, we had to analyse a lot of data, which was very complex. One of the things I say is that good data doesn’t come easy,” said Karl Huang. After explaining how they retrieved datasets through website scraping and supplemented them with data from OpenAIRE, Huang acknowledged that the process was challenging and required each dataset to be treated differently, but that “we got a good idea of what was possible, which was very informative and powerful; we also showed what is needed to make data access easier for the broader community.” 

Regarding what community members can do to improve the situation, Cameron Neylon pointed out that funders in particular can register in the Crossref Funder Registry. “This is critical because it facilitates the tagging of grants with consistent metadata schema with identifiers. We are interested in building stronger systems for storing and analysing funder data, and currently, there is a problem in the open metadata ecosystem. For us to solve this challenge, we also need to embrace collaboration,” remarked Neylon.

What next for the grant funding analysis?

While we encountered many complexities when analysing the grant funding data, invaluable data and observations were derived from the process. Given the critical role of grant funding in sustaining OIs, the goal is to continue to analyse funding data with expanded datasets for the 2025 State of Open Infrastructure report. 

Responding to a question from Ella Belfer about funding strategies for ongoing operations and maintenance of open infrastructure, Mohr suggested that funders should promote or even require the use of open infrastructure, and emphasized the need to fund both development and operations.

According to Steinhart, the plans for the next edition of the State of Open Infrastructure report include doubling the number of infrastructures and funders in the dataset, and exploring the use of natural language processing to enhance the data analysis. Steinhart added, “This year, we looked at grant funding, but it is important to note that this is not the only funding source for open infrastructures. A bigger question would be what is the place of philanthropy in support for open infrastructure.” 

“One of the things that we hear about is that the funding for open infrastructure is insufficient. However, based on the data in the report, we can see that there is funding. Even if the dataset is small, $415 million is a lot. Perhaps what needs to be solved is how we make funding opportunities easier to find,” said Mohr. Mohr then introduced the Philanthropy Data Commons, an initiative by the MacArthur Foundation and other actors in the ecosystem which aims to solve the issue of funding opportunity discovery by “connecting systems and data to make data sharing easier.”

If you missed the State of Open Infrastructure: Grant Funding conversation or would like to rewatch, you'll find a recording of the session on our YouTube channel. You can also access the full write-up of this research and the data via our dashboards.

Our next community conversation, on Sept. 19, 2024, will focus on regional policy development. To keep up with our schedule of community calls and other future events from the State of Open Infrastructure and other IOI projects, please subscribe to our newsletter.

Posted by Kaitlin Thaney, Lauren Collister and Jerry Sellanga