The following is an extract from the Digital Science White Paper, A New ‘Research Data Mechanics’, written by Simon Porter, VP Research Relations and Knowledge Architecture at Digital Science. This paper highlights six recent advances in research information infrastructure and seeks to recast how we think about metadata – not as a series of static records, but as objects that move between systems and organisations.
Simon’s paper is available as a free download here.
To borrow from Newton, our task is to define the new “laws of motion” for research information. It is time to conceive of a new Research Data Mechanics that brings to the fore the ways in which information travels through systems and, in the process, to create a template for a more efficient research cycle. In classical mechanics, a particle’s motion is determined by the forces acting on it. In Research Data Mechanics, our particles are items of data, and the underlying laws of motion are university, government, publisher, and funder policies and practices.
This paper illustrates the ideas of Research Data Mechanics by examining six recent advances in research infrastructure:
1. The increasing availability of publication information to research institutions.
2. The transformative effect of ORCID.
3. The disentanglement of system silos from research workflows.
4. The connection of collaborative environments into the research ecosystem.
5. The expanding network of research particles to cover research grants.
6. The rise of organizational context: an increasing shift from internal to externally linked identifiers.
Advance 1: The increasing availability of publication information to research institutions
Adopting a Research Data Mechanics view of the research landscape, we see an expanding network of systems that are connected with minimal latency. These connections change the nature of what can be easily valued by the research system, and empower universities to do more with the information they have.
Research institutions perhaps have the most to gain from improved flows of research information. These organizations occupy a sometimes uneasy position at the juncture between researchers, publishers, funders and governments, and are rarely in a position to completely mediate all of the processes that a researcher engages with. Consequently, research institutions have faced significant challenges in discovering and collecting information on research publications, often only after articles have appeared in journals, and in catching up on the execution of research grants only after funders have awarded them.
With publications management in particular, the sheer volume of material created by a research institution each year makes manual recording across a broad cohort of researchers impractical, or at least exorbitantly expensive. In response to this challenge, systems such as Symplectic Elements have shifted institutional workflows away from asking researchers (or administrators) to rekey their publications into university record-keeping systems, and towards processes that leverage global publication data sources to identify publications that likely belong to each researcher. From the perspective of the researcher, the work being asked of them by the institution has shifted from, ‘Can you please tell us everything about your publications (again)?’ to, ‘We think this is yours, could you please confirm?’.
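In data terms, this confirmation workflow amounts to matching harvested records against a researcher profile and asking only for a yes or no. The Python sketch below is purely illustrative and is not how Symplectic Elements or any particular system is implemented; the record structures and matching signals (ORCID iDs and name variants) are assumptions made for the sake of the example.

```python
from dataclasses import dataclass, field

@dataclass
class Researcher:
    name: str                                              # display name, e.g. "J. Example"
    orcid: str | None = None                               # ORCID iD, if the researcher has linked one
    name_variants: set[str] = field(default_factory=set)   # names the researcher publishes under

@dataclass
class CandidatePublication:
    doi: str
    title: str
    author_names: list[str]
    author_orcids: list[str]

def suggest_claims(researcher: Researcher,
                   candidates: list[CandidatePublication]) -> list[tuple[CandidatePublication, str]]:
    """Return harvested publications that likely belong to the researcher,
    each paired with the reason for the suggestion, so the researcher can
    simply confirm or reject rather than rekey the record."""
    suggestions = []
    for pub in candidates:
        if researcher.orcid and researcher.orcid in pub.author_orcids:
            suggestions.append((pub, "ORCID match"))            # strong signal
        elif researcher.name_variants & set(pub.author_names):
            suggestions.append((pub, "name-variant match"))     # weaker signal, needs confirmation
    return suggestions
```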
This shift illustrates the first example of cyclical movement of metadata about a publication across organizational boundaries. Publication metadata that has made its way from a publisher’s submission and review systems into publication, and then onward to discovery and citation outside the research institution, is effectively glued back into the research institution.
Having been reattached to an institutional context through a research information management system, the publication metadata gains additional context. It is linked to researchers within departments, to institutional grant records and potentially to a number of classification systems. Armed with this extra information, research institutions are not only able to construct richer internal representations of who they are and what they do, but also share these representations with the broader community through research profiles. At their best, these profiles increase the ability of students to find supervisors, companies to find research partners and researchers to find collaborators. By increasing the likelihood of research talent being discovered, these systems raise the latent serendipity of an institution’s research.
Integration with an external world of information offers advantages that go well beyond reducing administrative overhead. In moving from ‘data entry’ to ‘data glue’, a second transformation occurs. Unburdened by the limitations of manual data entry, the information reclaimed from publication data sources is richer, and crucially contains publication identifiers such as DOIs. By leveraging these identifiers, institutional research information systems move from being inwardly focused to being information nets capable of linking through to updated citation counts and altmetrics. A live connection to the continuing journey of a publication has been established, allowing an institution to regularly construct up-to-date bibliometric views of research performance and attention at multiple organizational levels. For citation sources, analysis previously limited to the whole organization can now easily be broken down by department. For altmetrics sources, something even more transformative occurs: an ability for institutions to react to metrics in ‘real time’, actively contributing to the process of attracting broader engagement with research.
Altmetric for Institutions, for instance, allows institutions to integrate their research information systems with live measures of attention including news articles, policy documents, Wikipedia citations and social media attention. What made this possible was the now standard capability of research information systems to comprehensively store, for each publication, a set of DOIs, PubMed IDs and other external identifiers. Using these data, Altmetric for Institutions is able to provide ‘live’ departmental and researcher-level measures of attention. These data can also be used by savvy institutions to create evidence of pathways to impact for a vast array of different types of research projects. As understanding around the measurement of research impact evolves, it is clear that systematic approaches to capturing and quantifying impact pathways are a key component of a research system portfolio. Through the simple act of connecting existing publication information to Altmetric for Institutions, a large comprehensive institution can expect to see upwards of 100,000 new data points about its publications, with new information added daily.
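To make the identifier-driven linkage concrete, the sketch below shows how a stored DOI might be resolved into a live attention summary. It uses Altmetric’s free public API rather than the Altmetric for Institutions product, and the exact endpoint, response fields and placeholder DOI should be treated as assumptions to check against the current API documentation.

```python
import json
import urllib.request
from urllib.error import HTTPError

ALTMETRIC_API = "https://api.altmetric.com/v1/doi/{doi}"   # assumed public endpoint

def attention_for_doi(doi: str) -> dict | None:
    """Fetch the attention summary recorded for a DOI.

    Returns the parsed JSON document, or None if the API reports that no
    attention has been tracked for this DOI (HTTP 404)."""
    url = ALTMETRIC_API.format(doi=doi)
    try:
        with urllib.request.urlopen(url) as response:
            return json.load(response)
    except HTTPError as err:
        if err.code == 404:        # no attention data recorded yet
            return None
        raise

if __name__ == "__main__":
    # Placeholder DOI for illustration; substitute one drawn from the
    # institution's research information system.
    summary = attention_for_doi("10.1000/example-doi")
    if summary is not None:
        print(summary.get("score"))
```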
With the ability to measure new forms of publication activity, it now becomes possible to value other modes of research communication, rather than just peer-reviewed journal articles. For example, consider this commentary piece with a high Altmetric score: “Policy: Twenty tips for interpreting scientific claims.”
Altmetric has captured how this publication created strong public and policymaker engagement, even receiving a companion piece in The Guardian, “Top 20 things scientists need to know about policy-making.”
Yet under traditional assessment paradigms, this article would be unlikely to be recognized, as it is a commentary piece rather than a peer-reviewed journal article. Armed with new information and efficient ways of collecting data, institutions are now empowered to value new modes of research engagement.
Altmetrics also introduce another dimension to bibliometric analysis – namely, the ability to intervene. Historically, scholarly attention to articles (or citations) accumulates with such a lag that it is difficult to significantly change the attention garnered by an article. The principal method of effecting change has been presenting at seminars and conferences to “sell” an article to an academic audience. With Altmetric for Institutions, institutions are empowered to immediately discover discussions about their research and take action, either to engage new communities or to promote the research more heavily. Drawing on interesting discussions identified through the tracking of altmetrics is one way of building further relationships, for example with alumni. Typically, these discussions take place not in scholarly forums but in equally valuable conversations with local communities, industry, professional or alumni groups.
The use of altmetric data in this way creates another pressure on the collection of publications information: the need for an institution to identify a published article as soon as it appears. The cycle of data collection changes from annual to continuous. An additional requirement is that, for any given publication record, there is as little latency as possible as it moves from research information systems to systems that track altmetrics. Data is not simply glued together in situ; it is enhanced in its motion through systems, and the speed at which it moves matters.
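One way to picture the shift from an annual to a continuous collection cycle is a short polling loop that forwards newly recorded DOIs downstream as soon as they appear. The sketch below is hypothetical: the two integration functions stand in for whatever interfaces an institution’s research information system and attention-tracking service actually expose.

```python
import time

def fetch_new_publications(since: float) -> list[dict]:
    """Hypothetical integration point: return publication records (with DOIs)
    added to the institution's research information system since `since`."""
    return []   # placeholder; a real implementation would query the system's API

def push_to_attention_tracking(dois: list[str]) -> None:
    """Hypothetical integration point: register DOIs with the service that
    tracks attention, so monitoring starts with minimal latency."""
    pass        # placeholder

def continuous_sync(poll_interval_seconds: int = 3600) -> None:
    """Replace an annual reporting cycle with a continuous one: poll frequently
    and forward new DOIs immediately, so attention can be acted on while fresh."""
    last_run = time.time()
    while True:
        cutoff = last_run
        last_run = time.time()
        new_records = fetch_new_publications(since=cutoff)
        dois = [record["doi"] for record in new_records if record.get("doi")]
        if dois:
            push_to_attention_tracking(dois)
        time.sleep(poll_interval_seconds)
```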
To read about the other five research advances Simon has identified, you can download the white paper from the Digital Science website or the figshare link below.