Open infrastructure governance: Current structures, nomenclature, composition, and trends

Introduction

Many open infrastructures that support research and scholarship deploy some form of “community governance,” a relatively loose term that is used to describe a wide-ranging spectrum of practices (Dana et al., 2021; Hart et al., 2022; Moore, 2021). At its plainest level, governance simply means making and enforcing decisions, and within that framework, “community governance” often signifies that a community’s members are in some way deliberately involved in decision-making processes. Usually, but not always, the work of community governance is unpaid and is provided by volunteers, not by staff members or those who are positioned to gain direct financial benefit from a programme or service.

The concept of community governance is championed throughout the open space, and many open infrastructures explicitly claim to be “community governed” or “community led”.[1] This community involvement is often invoked as a point of differentiation and as a marker of trust, both within and beyond the scholarly communication ecosystem. It implies inclusivity and voice, but these terms are imprecise at best. What are the characteristics of community governance, and what are the models for engaging community in decision-making or advisory capacities in today’s open infrastructures that support research and scholarship?

Below, we take a close look at the range of community processes employed in 54 open infrastructures to begin to answer a few key questions:

  1. What types of governance models are deployed in/for these open infrastructures?
  2. How are open infrastructure governance bodies named? Do these names align with specific definitions/meanings? 
  3. Who participates in open infrastructure community governance, and what can we know about the group of individuals or institutions that perform these roles? 
  4. Is there overlap in governance participation (e.g., where one participant serves on multiple governance bodies)?

A total of 82 open infrastructures that support research and scholarship were considered for this analysis, all of which were initially invited to be featured in our Infra Finder tool.[2] For a full list of these open infrastructure service providers, please see the open dataset that we have published in Zenodo (Skinner, 2024). 

Methods

Between January and April 2024, we conducted web-based research to identify and record any publicly available evidence of community governance bodies and roles for 82 open infrastructures that support research and scholarship. These services were initially identified for and invited to participate in our Infra Finder tool in 2023 (Collister et al., 2024). 

Our first step was to investigate which of these 82 infrastructures had community governance documentation publicly available on the web. For each governance body we identified, we recorded the names and current affiliations of all listed members; we also captured the body’s name and any specified roles (e.g., officers). In addition, we captured information about each infrastructure’s operational structure, and we recorded whether each governance body focused exclusively on a specific open infrastructure or more broadly (e.g., on a group of services or on the service provider’s host organization).

Our analysis primarily focuses on the 54 open infrastructures for which we found such public documentation on the open web. Once the dataset was created and documented (including Wayback Machine captures of all evidence), we asked a range of questions, starting with general details, such as what the “community governance” bodies identified for these 54 infrastructures actually govern or advise, what they are called, and what size they are. We then analysed data about the unique individuals and institutions serving in these roles, and sought to understand which of those institutions and individuals are represented on more than one community governance body.
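As a rough illustration of the overlap analysis described above, the following Python sketch finds the individuals and institutions that appear on more than one governance body. The record layout (person, institution, body) and the sample rows are hypothetical; they are not taken from the published dataset, whose actual column names may differ.

```python
from collections import defaultdict

# Hypothetical seat records: (person, institution, governance body).
# These rows are invented for illustration only.
seats = [
    ("Person A", "University X", "Infra 1 Steering Committee"),
    ("Person A", "University X", "Infra 2 Advisory Board"),
    ("Person B", "University X", "Infra 1 Steering Committee"),
    ("Person C", "Library Y", "Infra 3 Board"),
]


def bodies_per_key(records, key_index):
    """Map each person (key_index=0) or institution (key_index=1)
    to the set of governance bodies on which they hold a seat."""
    result = defaultdict(set)
    for record in records:
        result[record[key_index]].add(record[2])
    return result


def multi_body(records, key_index):
    """Keep only the keys that are represented on more than one body."""
    grouped = bodies_per_key(records, key_index)
    return {key: bodies for key, bodies in grouped.items() if len(bodies) > 1}


people_overlap = multi_body(seats, 0)
institution_overlap = multi_body(seats, 1)

print("People on multiple bodies:", sorted(people_overlap))
print("Institutions on multiple bodies:", sorted(institution_overlap))
```

Using sets of body names means that an institution holding several seats on the same body is counted once for that body, which is one possible reading of "represented on more than one community governance body."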

Findings

The open infrastructures in this study include a mix of models, both with and without publicly available community governance frameworks (see Figure 1). Of the 82 infrastructures we researched, 28 appeared not to have a documented community governance group.[3] More than half (42) have at least one documented, infrastructure-specific community governance body (seven have more than one).[4] Another five have at least one community governance body connected to what we have termed a “service group,” which places several infrastructures under a single umbrella of governance (e.g., PKP’s Advisory Committee, which works with Open Journal Systems, Open Monograph Press, and Open Preprint Systems). The final seven reference only their host institution’s community governance body or bodies (e.g., ContentDM is governed by OCLC’s Board; Archipelago Commons is governed by Metro’s Board).

Doughnut chart of the numbers and percentages of open infrastructures with community governance at various levels. The majority of open infrastructures have community governance at the open infrastructure level.
Figure 1. Community governance (CG) presence in open infrastructures.
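The breakdown above can be checked with a few lines of arithmetic. This is a minimal sketch using only the counts reported in the text; the category labels are shorthand chosen for this example, and the computed percentages may be rounded differently from those shown in Figure 1.

```python
# Counts reported in the Findings text (82 infrastructures in total).
total = 82
breakdown = {
    "infrastructure-specific body": 42,
    "service-group body": 5,
    "host-institution body only": 7,
    "no documented body": 28,
}

# The four categories should account for every infrastructure studied.
assert sum(breakdown.values()) == total

# Infrastructures with some documented community governance.
with_governance = total - breakdown["no documented body"]

# Share of each category, as a percentage of all 82 infrastructures.
shares = {label: round(100 * count / total, 1) for label, count in breakdown.items()}

for label, count in breakdown.items():
    print(f"{label}: {count} ({shares[label]}%)")
print(f"with documented community governance: {with_governance}")
```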

All 19 freestanding/independent programmes have their own infrastructure-specific community governance bodies. By contrast, only 23 of the 35 hosted infrastructures showed evidence of a community governance group dedicated specifically to that infrastructure; the remaining 12 rely on groups focused on a service group (like PKP) or on the host (like Metro).

Who legally/fiscally owns or bears responsibility for these infrastructures?

Open infrastructures exist in a broad array of organizational forms (e.g., university-hosted, incorporated, fiscally hosted, multi-institutional, and informal) and sectors (academic, government, commercial, nonprofit), each of which carries its own set of rules and conditions that may or may not work with particular governance frameworks.[5]

Of the 54 open infrastructures with documented community governance structures, 35 appear to be owned/operated by a host institution that provides the legal and fiscal framework under which they officially operate, while 19 appear to be freestanding or independent.[6]

The 35 hosted infrastructures represent an array of forms and relationship types. Some of the host institutions are universities (e.g., University of Bologna, Villanova, Simon Fraser, Harvard, Cornell); others are non-profit or for-profit companies that host multiple units or services (e.g., OAPEN Foundation, Coko Foundation), and still others serve as non-profit fiscal hosts that specialize in providing operational support services — including not-for-profit fiscal/legal identity — to programmes (e.g., Code for Science & Society, OpenAIRE, NumFocus). 

Some of the 19 freestanding/independent entities are nonprofits, public companies, or stichtings (Dutch foundations) depending on their national contexts (e.g., Islandora Foundation, COUNTER, Vivli, PeerCommunityIn, OA Switchboard). Others are unincorporated and represent formal or informal community efforts, or partnerships between other institutions (e.g., Oxford Common File Layout).

What are the infrastructure community governance bodies called? 

The 54 open infrastructures we studied used 26 different terms to describe their 64 governance bodies (as previously noted, seven infrastructures have more than one governance body, hence the disparity in the numbers above). The names of these groups included many variations on common themes, including “Steering Committee,” “Steering Council,” and “Steering Group,” as well as “Board of Trustees,” “Board of Directors,” “Governing Board,” “Supervisory Board,” “Founding Board,” “Executive Board,” “Advisory Board,” and just plain “Board.” Thirty-three open infrastructures had at least one officer position; 21 had none.

Looking across the selected 54 open infrastructures, their community governance approaches run the gamut from advisory bodies that provide input and guidance (e.g., arXiv, OpenCitations) to highly formalized groups that bear significant decision-making power and fiduciary and legal responsibility for a service (e.g., Crossref, DOI Foundation).

Based on the terminology, we can infer how many of these community governance groups likely have an official governing (decision-making) function vs. those that are likely to be advisory-only bodies. As seen in Figure 2, of the 64 governance groups, more than half (40) have names that seem to indicate governing function.[7] An additional 18 are explicitly termed “advisory,”[8] and six (four overlapping with “advisory”) reference a topical specialty.[9] A final four governance groups have names that seem too vague to warrant speculative classification as official governance bodies.[10]

Doughnut chart showing the ratio of infrastructures whose community groups have a governing function vs. an advisory function; 66.7% of infrastructures have at least one group with a governing function.
Figure 2. Community group function by infrastructure (ratio of full governance vs. advisory capacity)
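The name-based inference described above can be approximated with simple keyword matching. The sketch below is illustrative only: the group names and counts come from notes 7 to 10, but the keyword lists and the order in which they are checked are assumptions made for this example, not the coding scheme actually used in the study.

```python
from collections import Counter

# Group names and counts as listed in notes 7-10 of this chapter.
group_names = {
    "Board": 8, "Board of Directors": 14, "Board of Trustees": 1,
    "Executive Board": 5, "Executive Committee": 2, "Executive Council": 1,
    "Governing Board": 1, "Leadership Group": 2, "Steering Group": 1,
    "Steering Committee": 3, "Supervisory Board": 2,
    "Advisory Board": 8, "Advisory Committee": 4, "Advisory Group": 1,
    "Editorial Advisory Council": 1, "Institutions Advisory Council": 1,
    "Scientific Advisory Board": 1, "Scientific Advisory Committee": 1,
    "Scientific Advisory Council": 1,
    "Editorial Group": 1, "Scientific Committee": 1,
    "Funder Committee": 1, "Operations Team": 1,
    "Participating Organization Council": 1, "Project Management Committee": 1,
}

# Assumed keyword lists (hypothetical, for illustration).
TOPICAL_TERMS = ("editorial", "scientific")
GOVERNING_TERMS = ("board", "executive", "steering", "leadership",
                   "governing", "supervisory")


def classify(name):
    """Infer a likely function from a group name.
    'Advisory' is checked first so that 'Advisory Board' is not read as governing."""
    lowered = name.lower()
    if "advisory" in lowered:
        return "advisory"
    if any(term in lowered for term in TOPICAL_TERMS):
        return "topical"
    if any(term in lowered for term in GOVERNING_TERMS):
        return "governing"
    return "unclear"


tally = Counter()
for name, count in group_names.items():
    tally[classify(name)] += count

total_groups = sum(tally.values())
governing_share = round(100 * tally["governing"] / total_groups, 1)
print(dict(tally), total_groups, governing_share)
```

The order of the checks matters: testing for "advisory" before "board" is what keeps names such as "Scientific Advisory Board" out of the governing category.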

Looking at this data from the starting point of the open infrastructure, rather than of the governance group, and again inferring from the group names, of the 54 open infrastructures, 36 (67%) have at least one group with a name that indicates governing function. Based on the names used, it appears the other 18 infrastructures (33%) have community groups with advisory responsibilities rather than full governance (decision-making) roles. Coupled with the often-confusing nomenclature, this raises questions about whether community members fully understand what role(s) their community groups do and do not play in decision making. 

For example, an infrastructure may have community governance structures, including member representation (for users or contributors), and its community members may believe that this “community governance” structure gives the community an active role in decision-making processes. When that infrastructure makes a major decision (e.g., moving from an independent hosting arrangement to acquisition by a major conglomerate), they may be caught off guard to discover that the group has only an advisory role. The existence of community governance, in other words, may signal levels of power and involvement that do not actually exist; advisory groups may play important roles, but they do not bear fiscal and legal responsibility for the infrastructure, nor do they have the legal standing to contest major changes that happen without their involvement.

What is the distribution of individuals across these 54 infrastructures’ community governance bodies?

These 54 open infrastructures’ community governance bodies include 567 total seats (averaging roughly 11 seats per infrastructure). In 2024, 496 individuals serve in these roles: 448 (90%) hold one seat each, while 48 (10%) hold more than one seat or serve in more than one governance group. Of the 48 individuals who serve on multiple groups, and who together occupy 119 seats, 30 serve on two, 14 serve on three, three serve on four, and one serves on five infrastructure governance bodies (Table 1).

This suggests that seats are distributed relatively widely among individuals across these 54 infrastructures. There is nonetheless some board interlock, or concentration of individuals: 48 individuals occupy 119 of the 567 total seats, and the concentration is most pronounced among the 18 people who serve on three to five infrastructure governance bodies.
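The seat totals above follow directly from the per-person distribution. The following sketch reproduces them from the counts given in the text; it assumes that each seat held by a multi-seat individual is on a different governance body, which the text implies but does not state outright.

```python
# Number of individuals by how many governance seats they hold (from the text).
individuals_by_seats = {1: 448, 2: 30, 3: 14, 4: 3, 5: 1}

total_individuals = sum(individuals_by_seats.values())
total_seats = sum(seats * people for seats, people in individuals_by_seats.items())

# Individuals holding more than one seat, and the seats they occupy together.
multi_people = sum(people for seats, people in individuals_by_seats.items() if seats > 1)
multi_seats = sum(seats * people for seats, people in individuals_by_seats.items() if seats > 1)

# Individuals serving on three or more bodies.
three_plus = sum(people for seats, people in individuals_by_seats.items() if seats >= 3)

# Share of all seats occupied by multi-seat individuals, as a percentage.
multi_seat_share = round(100 * multi_seats / total_seats, 1)

print(total_individuals, total_seats, multi_people, multi_seats, three_plus, multi_seat_share)
```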

What is the distribution of institutions across these 54 infrastructures’ community governance bodies?

The institutional distribution in community governance seats is not as wide as the individual distribution, though it still shows strong diversity overall. The 567 total seats are held by 383 institutions.[11] Of those, 91 institutions (24%) hold more than one of these 567 total seats, while 292 (76%) hold only one (Table 1).

Of the 91 institutions that were represented in multiple governance groups, 56 institutions held seats in two, 17 institutions held seats in three, six institutions held seats in four, four institutions held seats in five, and eight institutions held seats in six or more open infrastructure governance groups. 

                      Individuals   Institutions
One seat                      448            292
Two seats                      30             56
Three seats                    14             17
Four seats                      3              6
Five seats                      1              4
Six or more seats               0              8
Table 1. Number of community governance seats held by individuals and institutions across all 54 infrastructures

Holding 69 (12%) of the 567 total seats, the eight institutions[12] that hold seats in six or more open infrastructure governance groups strongly influence the open infrastructure ecosystem. They include core contributors, founders, and longstanding supporters of a range of tools, including OJS, arXiv, and other cornerstone technologies. Four of these eight institutions are Canadian, and some of the density of their representation can be attributed to four infrastructures they have helped to found and support over several decades: OJS, OMP, OPS, and Érudit. The remaining four institutions include one US-based not-for-profit corporation (Crossref/Publishers International Linking Association, Inc.), two universities (Harvard, Cornell), and the California Digital Library.

Conclusions

There is no “right” governance framework or model that can or should be applied across these open infrastructures. We agree wholeheartedly with the COPIM team, who stated in 2022, “good governance is situated, i.e., it is highly specific to the resource and community in question” (Hart et al., 2022). However, after looking at the 82 infrastructures in this study, we would add that good governance is also well documented in language that the community and its extended ecosystem can readily interpret and understand.

Based on this set of 82 open infrastructures, community governance structures and terminology are prevalent, with well over half (54, or 66%) adopting and documenting such structures in some public format. That’s good news for those who seek more inclusion of users and contributors (fiscal, technical, and administrative) within open infrastructure decision-making structures. The slightly more complicated news is that these structures use fuzzy terminology that makes it difficult to tell what the community groups actually are empowered to do.

In the 54 open infrastructures that had some type of community governance structure documented, we found 26 different names and a broad mix of solo- and multi-group structures at play. Inferring from these governance group names, 36 (67%) of the 54 open infrastructures with such groups do seem to grant them some level of active decision-making power and steering or leadership function. Others seem to be limited to advisory and topical roles, though, with 18 (33%) explicitly using only “Advisory” or a topical specialty (“Editorial” or “Scientific”) in the group name(s). 

The wide-ranging nomenclature in a small field hints that there are likely a lot of unique snowflake approaches to creating and naming these community roles. If that is because the governance model is carefully situated and specific to the community’s needs, that might be read as a positive feature of these infrastructures. Where this instead becomes potentially harmful or dangerous is when a community develops a false sense of security, believing that its community groups possess an official decision-making authority that they do not have. For example, we could recount myriad open infrastructures that had visible, even vibrant, community governance groups that were sold or acquired without the consultation or involvement that many community members thought was guaranteed by the presence of community governance groups and processes. The wide range of practices (from advisory and topical to actual decision making) and the unclear language used across infrastructures can lead to confusion and misalignment. Community members' perceptions of their power and voice need to be checked and understood, ideally against official incorporation and bylaws documentation.  

While the group nomenclature might be wobbly, our research did also surface solid names and employment affiliations of the 496 individuals occupying 567 community group seats for these 54 infrastructures in 2024. That data gives us several lenses to explore, both now and in the future. First, looking at the individuals who are serving on community groups in 2024 shows that, rather than having a handful of individuals holding lots of governance seats and power across infrastructures, the field has a lot of diversity in these service roles. Similarly, 383 institutions are represented in those 567 community group seats, again demonstrating strong overall diversity. We do mark small pockets of concentration, including those human and institutional outliers who serve on multiple community governance groups.

The data does show that a small group of individuals (48) and (especially) institutions (91) are represented in multiple governance groups, but it also shows a relatively wide distribution of individuals and institutions serving overall. In other words, instead of seeing the same set of people and institutions represented in, and representing, these infrastructures, we have found that, with a few exceptions, a relatively wide set of stakeholders is investing time (including institutional staff time) to support open infrastructures in this field.

It’s important to note that this distribution of people and organizations only tells one part of the story. We can assume from the current numbers that we have a lot of different voices engaging in governance. But what does that actually mean? A few possibilities could include:

  • We have too many independent entities (people and institutions) designing in silos and not enough interconnection between open infrastructures.
  • We have healthy diversity and low concentration of power in community leadership.
  • We have an open playing field with room for many voices, perspectives, and approaches.
  • We have a small number of people and institutions wearing multiple open infrastructure hats who are encouraging interdependence, exchange, and collaborative alignment between infrastructures.
  • We have a small number of power brokers who are playing an outsized role in shaping open infrastructure as both institutions and individuals.

In other words, without further research, the claims we can make based on this first year of data are still limited. Over time, we hope to be able to read and understand the health of the open infrastructure ecosystem through these types of investigations. In future years, we would like to begin comparing the community group composition across time to see if the current distribution holds or if more consolidation or diversity of roles becomes visible in multi-year analysis.  We will also be able to see how much change occurs over time in the composition of specific boards. Pairing such data with additional information about funding, adoption, and use of these infrastructures should reveal much that we do not yet know about how open infrastructures are developing and maturing, not just individually, but as an interdependent system serving research and scholarship.

Data availability

The data used for the analyses presented here (Skinner, 2024) may be downloaded from https://doi.org/10.5281/zenodo.10934091

References

Bilder, G., Lin, J., & Neylon, C. (2020). The Principles of Open Scholarly Infrastructure (v1.1, 2023). https://doi.org/10.24343/C34W2H 

COAR/SPARC. (n.d.). Good Practice Principles for Scholarly Communication Services. https://sparcopen.org/wp-content/uploads/2019/01/Sparc-Good-Practice-Principles-v4.pdf 

Collister, L., Tsang, E., & Wu, C. (2024). Infra Finder: A new tool to enhance transparency, discoverability, and trust in open infrastructure. International Digital Curation Conference (IDCC), Edinburgh, Scotland. [Preprint] https://doi.org/10.5281/zenodo.10913249 

Dana, C., Hornbein, D., Russell, V., & Schneider, N. (2021). Community Rules: Simple Templates for Great Communities. https://communityrule.info/assets/book/gov-booklet-MASTER.pdf 

Hart, Adema, & COPIM. (2022). Introduction, Towards Better Practices for the Community Governance of Open Infrastructures. https://doi.org/10.21428/785a6451.0304a2a8

Lippincott, S., & Skinner, K. (2022). FOREST Framework for Values-Driven Scholarly Communication. https://doi.org/10.5281/zenodo.6557301

Moore, S. (2021). Exploring Models for Community Governance. COPIM. https://doi.org/10.21428/785a6451.0304a2a8 

Skinner, K. (2024). Data for: Open Infrastructure Governance: Current structures, nomenclature, composition, and service trends, 2024 State of Open Infrastructure Report. Zenodo. https://doi.org/10.5281/zenodo.10934091 

Notes

  1. For example, some description of “good” or “community” and/or “stakeholder” governance runs through many of the values and principles models that are gaining traction in the field of scholarly communication, including the Principles of Open Scholarly Infrastructure (POSI, Bilder et al., 2020), the COAR/SPARC Good Practice Principles (COAR/SPARC, n.d.), and the FOREST Framework (Lippincott and Skinner, 2022).
  2. https://infrafinder.investinopen.org
  3. This is a point of differentiation, not judgment. These 28 groups may have different reasons for not having a governance body, including because the service is still emerging/forming, because the service is distributed by design and does not desire centralized governance processes, or because its business form makes community governance hard to implement.
  4. arXiv, DOAB, DOAJ, Dryad, Europe PMC, Islandora, and OAPEN Library each have two or more governance groups.
  5. For example, many infrastructures that support research and scholarship are based in colleges and universities, and they ultimately answer to and are controlled by their home institution’s Board of Governors or Trustees; in these environments, community governance usually is constrained to advisory capacity.
  6. Our sources were publicly available documentation, including self-reporting. While we have made our best attempt to categorize these accurately, hosting relationships are notoriously hard to establish without direct contact with principals.
  7. Board (8), Board of Directors (14), Board of Trustees (1), Executive Board (5), Executive Committee (2), Executive Council (1), Governing Board (1), Leadership Group (2), Steering Group (1), Steering Committee (3), and Supervisory Board (2)
  8. Advisory Board (8), Advisory Committee (4), Advisory Group (1), Editorial Advisory Council (1), Institutions Advisory Council (1), Scientific Advisory Board (1), Scientific Advisory Committee (1), Scientific Advisory Council (1)
  9. Editorial Group (1), Editorial Advisory Council (1), Scientific Advisory Board (1), Scientific Advisory Committee (1), Scientific Advisory Council (1), and Scientific Committee (1)
  10. Funder Committee (1), Operations Team (1), Participating Organization Council (1), and Project Management Committee (1)
  11. Ten of these “institutions” were individuals that identified as consultants, artists, independent, or for whom we found no formal institutional affiliation.
  12. Crossref, University of Alberta, Cornell, Harvard, Université de Montréal, Simon Fraser, Public Knowledge Project, and California Digital Library.