This text is edited from the keynote address delivered by Katherine Skinner, IOI’s Director of Programs, at the HathiTrust 2024 Member Meeting. You can also view a recording of the keynote.
It’s such an old story.
It starts with vanity and pride.
In the story, an Emperor dreams of being admired and adored by his subjects for wearing the very finest and most elaborate of robes. He hires a team of tailors who promise him that the robes they will spin will be both exceptionally beautiful…and invisible to all who lack wisdom or who are unfit to serve the Emperor.
The emperor believes he will be able to separate the grain from the chaff based on what his subjects see. He pays the tailors handsomely to begin. And the charlatans, with invisibility as their ruse, pretend to create the majestic robes.
It ends with embarrassment and loss.
First, the emperor realizes that he cannot see the robes, and in shame, he hides this from all around him. His advisors, likewise, are each horrified and scared to realize they cannot see the cloth. Like the emperor, they feign admiration; they dare not confide in each other about what they do not see.
When the robes are finished and the tailors are paid, the Emperor dons his new clothes and takes a walk around his kingdom to show off.
And as he enters the cheering crowds, they, like the Emperor and his advisors, are so busy hiding their own perceived deficits that they fail to point out the obvious.
It takes a child’s innocence to expose the plain truth - that the Emperor has no clothes.
For the ruling class, who had created and populated the spectacle of this parade, what a hard, long walk home it must have been.
They had been unwilling to “see” through the illusion that the tailors artfully wove; they instead had complimented the cloth, doubted themselves, and gone into their own silent-silos rather than sharing any part of what they were observing with others.
In so doing, they redoubled the power of the illusion, spreading it across almost the whole of the kingdom.
All too often, we humans maintain the illusions around us, not because any of us fully believe in those illusions, but because we think most other people do…and because it’s easier, frankly, to go with power and trend than it is to question things, especially in complex, multi-dimensional systems.
And the result? Well, we can miss what otherwise (especially in hindsight) seems obvious.
I think we have illusions about open infrastructure that we need to question and dispel. These haven’t been maliciously created, but they are powerful frames that distort our vision and that keep us separated, siloed, and unwilling to act collectively.
So, what are some of the illusions that I think we are living too comfortably with?
- There’s not enough money to fund open infrastructure
- If one is good, five is better
- Open infrastructures are riskier than proprietary solutions
- Open means (or should mean) it’s free or nearly free
- Nonprofits are values-aligned; commercial entities are not
- Survival equals success
- Innovation is the best way to stay relevant/keep up with technology
- If it has an advisory group, it is “community governed” and “community-led”
- Only the three largest cloud services can provide scaffolding for our infrastructures
Someday, I’ll name and talk about more of these illusions. For the sake of time today, I’m focusing in on one, though even a couple of days ago I was still trying to make this 20-minute slot cover 2-3.
I’m neither the first nor the only one questioning these “illusions”.
Still, questioning them is hard.
Really questioning, let alone addressing, any of these illusions could require big changes: changes in our partnerships, investments, procurement processes, and services. They might also cause big consternation in our parent institutions or in our service base.
So let’s look at this first illusion.
Illusion 1: There’s not enough money to fund open infrastructure
Before I try to question this illusion, let me reassure the open infrastructure leaders in the room — I know full well that most of my friends and colleagues who are directing or working for open communities, open tools, open services, and others occupying that notably loose category of “open infrastructures,” especially in today’s political and fiscal climate, barely have enough money to stay afloat.
Right now, even most of the core infrastructures that the research community has come to rely on over the last two decades are financially insecure and in a perpetual state of firefighting. They have little-to-no runway, let alone operational reserves. As a result, they are chasing money constantly - scraping together what they can from any available source and hoping that the winds will change for the better eventually.
And those who know just how dire it is ALSO know that calling out examples is risky business.
Some of the risk comes from how we currently get our information - through back rooms and back channels. Information is privileged, and even knowing and seeing problems in our open infrastructures’ governance, financial models, and organizational models doesn’t mean that we have power to address them or the authority to speak out about them.
Some of the risk also comes from the harm that calling out examples could do. Spotlighting trouble spots in infrastructure — if it’s treated as infrastructure — should enable us to collectively act to FIX those problems. If you know that your collective water infrastructure — your pipes — is getting contaminated by the lead IN those pipes, you are incentivized to fix that problem collectively. The city or borough takes care of the problem once it comes to light.
Right now, we don’t have a city or borough to lean on, though. We have open tools that we might call open infrastructure and lean on like open infrastructure, but that we fund and build as separate service pods.
Unlike calling out troubled infrastructure, calling out a troubled “open” tool risks causing ripples of doubt in the client base of the open tool or open service that is spotlit. And if there are other options for that open tool or open service?…well, many of those clients will hedge their bets and move their institutional investment somewhere else that seems safer.
Our current system incentivizes our siloed open tools and services to jockey and pretend and spend inordinate amounts of time and energy trying to attract funding to stay afloat.
The library community has spiralled around this problem for decades now, wanting to “go open” and wanting to shift from “tools” to “infrastructure,” but not knowing how to actually do it.
That brings us here, to what we have.
When we look at the sheer volume of open tools, standards, and services we are building year after year, we know there could never be enough money, time, and attention to give to all of these things.
This slide includes a vast array of open tools, standards, protocols, and services that we use to create, curate, disseminate, and archive/preserve knowledge collections. The images are screenshots of some perhaps-familiar resources that have tried to “map the open landscape”, including 101+ Innovative Tools (note that now the FORCE11 list has more than 600), Mike Roy and David Lewis’s “Mapping the Scholarly Communication Landscape” project and insightful reports, several registries and information hubs about tools including Community Owned digital Preservation Tool Registry (COPTR), SComCat, Infra Finder, and Ithaka S&R’s recent Common Scholarly Communication Infrastructure Landscape Review.
The slide gives me heart palpitations. But it’s not just because we’ve created all of these things. It’s that we’re trying to sustain most of them, not in collective ways, but as organizational — and often also technical — silos. There is not adequate money, time, and attention to sustain everything.
We are libraries. We KNOW that problem. Every collection we’ve created over the years has required a steady selection process.
We need to stop treating our open tools and assets as individual items; we need to use selection principles to bring them into cohesion.
So — what if it’s NOT that there’s not enough money, but instead, it’s that the costs of competition (and duplication, siloization, and isolation) are too high?
That is a problem we can get our arms around collectively.
We need more encouragement of integration, organizational consolidation, and interoperability, not more funding directed toward solo enterprises.
And we already have great examples of how to do this. Each builds on open environments that are arching towards “infrastructure.”
- ArchivesSpace is one. It fused together two communities and two technologies, and it helped them move into a fiscal host environment that was trying to build up its technical capabilities and readiness. Was it easy or pretty? No. Was it necessary in order to create critical mass behind one infrastructure and to take advantage of some common service needs, both organizationally and technically? Yes.
- Meru is another. It bridges several key open infrastructures together without absorbing them into one monolith - it encourages interoperability between Janeway, OJS, and DSpace and adds service layer options and new hosting capabilities on top of them.
- COAR is a third. It provides recommendations that aim to bring repository infrastructures into better alignment.
- I’ll also point to work underway that I know many of you are following, Blue Core, and to work I know everyone here is engaged in: HathiTrust itself.
Without bogging down in the metaphor, I think of us as being in a nascent cityscape. Some things have been built, and some codes and standards have been partly established through trial and error and friendly co-opetition. Now, we need to fund some city planning work, not in dribs and drabs, but in a full-scale way that lets us shift from a whole lot of starving, siloed infrastructures leaning on too few resources to a smaller number of core infrastructures, all operating within a deliberate system that is orchestrated enough to incentivize standards and intentional integration but loose enough to make room for continued innovations within guidelines that make it a safer, easier landscape for customers and users to navigate.
A big part of my work these days as Director of Programs at IOI is working with our team to analyse and try to understand the funding landscape for open infrastructures in our field of knowledge collection, curation, dissemination, and preservation. That funding landscape crosses over sector and geographic boundaries; it includes all of the commercial, nonprofit, government, and academic players that are funding some portion of our “backbone” infrastructures like repository systems, cloud environments, preservation processes, publishing platforms, and standards.
I’d like to bring two of the things we’ve been studying into the conversation you all are having today about co-investment. Both tie back to collective action, which is a conscious strength demonstrated by HathiTrust, one that I think we need to cultivate and spread.
The first of these things is what we can see about the way money flows through our field.
The second is what we know about what motivates our investments.
Let’s start with the money flows.
These next few slides feature research conducted by our IOI team in 2023-2024.
Infra Finder is a discovery tool and resource that we launched early this year to help decision makers, including funders and libraries, navigate their open options and make sound investment decisions. Our first release of Infra Finder contains detailed information about 57 open infrastructures; hopefully many of you have seen and used it. We are working now towards our 2025 update which will grow that number to around 150 infrastructures.
The second big 2024 release is IOI’s first State of Open Infrastructure report. Our team produced this report by gathering, analysing, and contextualizing trends in open infrastructure funding, governance, and adoption. We scoped the report to build on the information we have collected and shared about the 57 open infrastructures from Infra Finder; and as with Infra Finder, our 2025 State of Open Infrastructure release will include information based on roughly 150 infrastructures.
What are we learning about money flows?
Well, for one thing, the amount that is flowing is significant.
One section of our State of Open Infrastructure report focuses directly on grant award data.
In collaboration with Cameron Neylon and Karl Huang (Curtin University’s COKI team) and David Riordan (Enigma Technologies), we harvested available award data about the 57 infrastructures from funder websites and from OpenAIRE.
From the data we collected through Infra Finder, 45.6% of the 57 open infrastructures report “Contributions” as their primary funding source.
Today, I want to zoom in with you on the “Contribution” category - the 45.6% that you see in dark blue on the right hand side of the graph. “Contributions” include money that the organization received from donors and grantmakers.
I want us to zoom in further to see what we can discern about a significant part of “Contributions” — grant funding.
The funding represented in the table and graph below makes up a portion of that “Contribution” category. Remember, we are only looking at the 57 open infrastructures in our scope. For those infrastructures, we have tracked money flowing from 23 grantmakers, primarily foundations and government agencies.
It has come through 516 awards to 36 open infrastructures since 2000.
The total is $416M.
Grant data is notoriously NOT standardized or easily available, and we were searching for needles in haystacks. We can safely assume this is a subset of what’s out there, even for this limited set of open infrastructures.
This begins to improve our view, elevating us from our usual ground-level to a slightly higher perspective.
516 awards is a lot.
First, it’s a lot of projects for 36 operational infrastructures to handle. While some of the funding inevitably helped to found and ground some of these open infrastructures, most of those 516 awards did not. Instead, the awards fund work that builds on those infrastructures.
Let’s take an example. A leading open infrastructure, a preprint provider, receives a multi-million dollar award. So far, so good. That’s a lot of money.
But — if that award is geared towards all new work on testing and implementing new features or new tooling or new forms of interoperability or standards support, that assumes that something else is paying for the operational costs associated with running the preprint services that already exist. There is a gap.
Or — if that award is not for new features or tooling, but instead funds research conducted ON the preprint infrastructure, that also assumes that something else is paying for the operational costs associated with running the services, and that something else is paying for any increased stress that the research might place on the system. Again, there’s a gap.
I think that gap may be costing us and our open infrastructures more than we realize.
Our grants to existing infrastructures often require but do not fund the invisible labour within the infrastructure’s staff and team that supports the research. The funder isn’t at fault, and neither is the infrastructure — but both are ignoring that gap in ways that endanger the open tool both are trying to serve.
This view helps us see the vulnerabilities open infrastructures are experiencing, even when they seem to be landing major awards that we assume subsidize the cost of the infrastructure for those who are using it. Look at what these grant funds are providing — and also at what they are not providing.
Open infrastructures report that their biggest support needs are for operations, followed by community. Those two things together amount to 66% of the “support needs” reported. This is where infrastructures report that they are feeling the pressure.
But the grant funds they both APPLY FOR and RECEIVE primarily cover research and development.
It’s time to tell a different story.
We are funding for short-term success, not for maintenance, growth, or interoperability. And that, not the lack of funding, is making it difficult for us to shift from a bunch of tools, each incentivized to innovate and build separately, to a collectively funded open infrastructure environment that we all agree to build upon.
So how do we begin to direct our dollars (and time and effort) in a different way?
As a couple of HathiTrust staff and members wrote in their “Through the Looking Glass” series about your efforts to onboard more players into your shared print libraries work,
“While valuable, networks alone are not enough, particularly when we self-select with peers. We need to foster diversity of actors and expertise. With its focus on multi-entity participation, shared goals, co-evolution, networks of shifting relationships, and links of data, service and resource flows, an ecosystem approach would enable libraries to build collective action.”
Most of us have decided we should be pooling what we have, focusing on our areas of strength and clear signals of user demand.
Most of us are also grappling with how to design our environment to operate more like an ecosystem.
We even have models for this, including HathiTrust itself.
It is funded by those who use it and those who care about the collection(s) it represents. It’s not heavily dependent on grants, even for R&D. It is governed by its stakeholders. And it makes room for diverse voices through a range of engagement pathways and mechanisms.
It is becoming “infrastructure” in ways that are both human and technical.
That is powerful. It provides a level of stability that most “open infrastructures” and open programs simply do not have.
So — what if HathiTrust wasn’t the exception, but became more the rule?
What if instead of funding everything a little bit, we funded a common core well?
What could we use to select the elements of a common core? How do we decide which of the many competing groups is worth investing in?
This has been one of the hardest elements to grapple with. Who gets to make that call? How do we make sure that it’s really a collective decision, and also that it’s a wise decision that steers in mission-aligned directions for the stakeholders involved?
I don’t have the answer, but I’m going to point to some of the very important work that is going on that might help us to find the answers (they will be multiple, not singular!).
As I bring this talk to a close, I want to reflect back to two assertions I’ve made:
- Our current system incentivizes our siloed open tools and services to jockey and pretend and spend inordinate amounts of time and energy trying to attract funding to stay afloat.
- We currently get too much of our information through back rooms and back channels. Information is privileged, and even knowing and seeing problems in our open infrastructures’ governance, financial models, and organizational models doesn’t mean that we have power to address them or the authority to speak out about them.
There has been a steady swell of activity in the last few years to try to give us a better way to gain information about all of the “open” tools and elements with whom we consider working or in whom we consider investing.
Increasing adoption of and investment in open infrastructure depends in large part on making it feel safer.
We want to know that what we’re investing in isn’t vaporware or smoke and mirrors.
And one of the ways we’re increasingly working to do this is through invoking our values and principles.
From HumetricsHSS, which involves groups in defining values for themselves (and congrats to Nicky, Jason, and the team for their new values retreats, funded by the Mellon Foundation!), to the Library Partnership Rating, POSI, FOREST, and the Publication Facts — we are trying, across the field or developing ecosystem, to begin making open infrastructure options more visible and easier to understand and differentiate between.
Unlike the magic cloth that the Emperor bought to separate the grain from the chaff and identify who was wise and deserved to serve him, we can build transparent — not invisible — mechanisms that we, collectively, define and use to evaluate, study, and better understand the set of open tools and services with whom we work.
Clear criteria, representing spectra of practice (not binaries), can help open infrastructures show their strengths, recognize what they want to change and how they want to grow, and incentivize all players to aspire in common directions.
In turn, those criteria can help to guide our co-investment and collaboration.
We're continuing to monitor the trends and movements in the open infrastructure space and to share key observations and opportunities to further investment in and adoption of open infrastructure — subscribe to our newsletter to stay ahead of the conversation.