Achieving capable and efficient data re-use within and across the many fields of research in the SSH-STEM spectrum, including the digital humanities, is an important objective (Martin-Rodilla and Gonzalez-Perez, 2019; Poole and Garwood, 2020). The potential commonly attributed to data re-use is significant and wide-ranging: data re-use can facilitate the verification of research findings, support the exchange of knowledge, improve the efficiency of research planning and execution, and yield additional results if paired with new methods or interpretative frameworks (Borgman, 2012; Whyte and Pryor, 2011). The literature strongly indicates that sufficient paradata (Couper, 2000; Sköld et al., 2022), i.e., process data describing how data was created and curated, cf. metadata (Mayernik, 2020), pertaining to the dataset being re-used is a key element in making data re-use practically possible and scholarly relevant (Faniel et al., 2013; Yoon, 2017). Research has, however, only just begun to explore where paradata can be found and what strategies could be employed to identify and harness paradata in support of secondary data use (e.g., Börjesson et al., 2022; Huvila et al., 2021).
The present paper seeks to address this knowledge gap by elucidating what paradata outputs emerge during research, from design to enactment, reporting, and concluding data management work. Paradata outputs are identified on the basis of an interview study of researchers and professionals (n=33) working with archaeological data in different capacities. The resulting paradata-output typology is used to drive a discussion of possible strategies for capturing paradata outputs for the purpose of facilitating data re-use. The archaeological case study is valuable for understanding paradata outputs in a DH context because data work in the two domains is similar in many respects. In both DH and archaeology, data collection and processing are innovative, quickly evolving, and involve both interpretation and technological application, e.g., in GIS and 3D visualization applications (Choumert-Nkolo et al., 2019; Niccolucci, 2012). The analysis of paradata outputs is guided by a genre framework (Andersen, 2008) emphasizing the close relationship between research paradata and the scholarly activity systems of which they are a part.
The findings show that paradata can be found not only in more traditional forms of documentation, but also in more informal and ephemeral contexts. At the macroscopic level, these contexts can be divided into strategic and operational groups. The strategic grouping includes examples of purposeful descriptions of data, such as data management plans, methodologies, and data dictionaries. Paradata outputs in this group were generally communicative in nature and directed toward specific audiences. The operational grouping of paradata outputs relates to the more immediate aspects of ongoing research work and includes data management actions, various types of dialogue (in person, via email, or in social media), visuals, and digital signatures. Textual paradata outputs under the operational heading were often mediating in nature and primarily intended to support tasks.
The broad range of paradata outputs identified in the analysis indicates that the strategies used to capture and use paradata outputs to facilitate data re-use in DH must be similarly broad in scope. The paper suggests that high-relevance focal points to consider in the development of such strategies are procedurality, the scholarly context, user perspectives, and paradata literacies. While the first of these pertains to how paradata outputs can be harnessed (e.g., by manual or automatic means), the remaining three stress the importance of having actionable insights into how researchers create paradata and what paradata data re-users require in order to be successful, as well as the need to support data re-users in gaining competencies in finding, identifying, and employing useful paradata.
Andersen, J. (2008). The concept of genre in information studies. Annual Review of Information Science and Technology, 42(1), 339–367.
Birnholtz, J. P. and Bietz, M. J. (2003). Data at work: supporting sharing in science and engineering. Proceedings of the 2003 International ACM SIGGROUP Conference on Supporting Group Work, 339–348.
Borgman, C. L. (2012). The conundrum of sharing research data. Journal of the American Society for Information Science and Technology, 63(6), 1059–1078.
Börjesson, L., Sköld, O., Friberg, Z., Löwenborg, D., Pálsson, G., and Huvila, I. (2022). Re-purposing excavation database content as paradata: an explorative analysis of paradata identification challenges and opportunities. KULA: Knowledge Creation, Dissemination, and Preservation Studies, 6(3), 1–18.
Choumert-Nkolo, J., Cust, H., and Taylor, C. (2019). Using paradata to collect better survey data: evidence from a household survey in Tanzania. Review of Development Economics, 23(2), 598–618.
Couper, M. P. (2000). Usability evaluation of computer-assisted survey instruments. Social Science Computer Review, 18(4), 384–396.
Faniel, I., Kansa, E., Whitcher Kansa, S., Barrera-Gomez, J., and Yakel, E. (2013). The challenges of digging data: a study of context in archaeological data reuse. Proceedings of the 13th ACM/IEEE-CS Joint Conference on Digital Libraries, 295–304.
Haslhofer, B., Isaac, A., and Simon, R. (2018). Knowledge graphs in the libraries and digital humanities domain.
Huvila, I., Sköld, O., and Börjesson, L. (2021). Documenting information making in archaeological field reports. Journal of Documentation, 77(5), 1107–1127.
Martin-Rodilla, P. and Gonzalez-Perez, C. (2019). Metainformation scenarios in digital humanities: characterization and conceptual modelling strategies. Information Systems, 84, 29–48.
Mayernik, M. S. (2020). Metadata. Knowledge Organization, 47(8), 696–713.
Niccolucci, F. (2012). Setting standards for 3D visualization of cultural heritage in Europe and beyond. In Paradata and Transparency in Virtual Heritage, eds. A. Bentkowska-Kafel, H. Denard, and D. Baker, pp. 23–36. Farnham: Ashgate.
Poole, A. H. and Garwood, D. A. (2020). Digging into data management in public-funded, international research in digital humanities. Journal of the Association for Information Science and Technology, 71(1), 84–97.
Sköld, O., Börjesson, L., and Huvila, I. (2022). Interrogating paradata. Information Research, 27.
Whyte, A. and Pryor, G. (2011). Open science in practice: researcher perspectives and participation. International Journal of Digital Curation, 6(1), 199–213.
Yoon, A. (2017). Data reusers’ trust development. Journal of the Association for Information Science and Technology, 68(4), 946–956.