Aleksandra Pawliczek: Building up a Research Infrastructure on the First World War across Borders
Abstract:
The Collaborative EuropeaN Digital Archive Research Infrastructure (CENDARI) was funded by the European Commission to create a humanities research infrastructure that integrates access to archives, connects knowledge, and supports the research process for two domains: First World War studies and medieval history. Conceptualized as more than a static portal, it will not only give access to information on archival material relevant for these two domains, but also allow for network activities of historians, archivists and librarians which will preserve the system and create a new digital research space to support and enhance specific historical methods of research. This paper gives an introduction to the work of the CENDARI project, focusing on the First World War domain and relating it to medieval studies, as well as addressing the interdependencies between cultural heritage institutions and their clientele.
Introduction
How to search for historical material for one's own research question? How to find it, how to evaluate and visualize research results in the digital space? How to extract, to store and to use the existing information and how to read it in a multilingual context?
The increasing digitisation of source materials is changing the way historians can do history. An enormous amount of archival material on the First World War can be found in national, state, municipal or private archives, in libraries and museums worldwide, and many of these cultural institutions are in the process of digitising much of their holdings. The availability of finding aids, catalogues, source information on documents, as well as photographs and films makes it easier to locate material relevant to many research topics for which the historical records have been lost, fragmented, dispersed or relocated. Digital access to this information and computerized tools to search it accelerate the investigation and allow for developing new methods of historical analysis, such as the virtual reconstruction of split but coherent provenances.
The CENDARI infrastructure aims at providing a digital research space to support and enhance specific historical research methods. Access to cultural heritage resources and their relevance for specific studies can be demonstrated through model approaches to particular First World War topics, such as Virtual Reconstruction of Lost Material, Submarine Warfare, or Memory and Commemoration, and thus applied to any historical analysis of archival material.
What is CENDARI?
The Collaborative EuropeaN Digital Archive Research Infrastructure (CENDARI) was funded by the European Commission to create a humanities research infrastructure that integrates access to archives, connects knowledge, and supports the research process for two pilot domains: First World War studies (WWI) and medieval history (MM). CENDARI takes an innovative approach to digital historical research as well as to data integration and curation. CENDARI collates archival data and serves as a research environment where users can conduct their research, from finding and organizing sources to analysing and sharing data. The project team members represent diverse disciplinary perspectives while creating converging paths toward a research infrastructure that will advance historical inquiry in the digital age.
From the perspective of a historian, certain established research methods cannot be replaced or abandoned in a digital research infrastructure, above all the verification of sources. At the same time, the focus on transnational research questions is becoming more apparent within the scholarly community of historians, especially concerning events of transnational character like the First World War. The transnational aspect applies both to the research topics and to historical material from different countries and institutions. In order to further transnational and comparative research and overcome entrenched historiographical and digital asymmetries, the project includes eastern and southern European repositories ('hidden archives' to many historians) along with the more visible western European institutions.
From a computer science perspective, the relevant data is heterogeneous in terms of languages, formats, level of granularity, completeness, encoding standards, etc. Therefore CENDARI has implemented an extensive approach to data integration and curation based on the concept of 'data space'. This will produce a flexible and interactive digital ecosystem, underpinned by various ontologies, that enables collaborative research using a variety of digital tools. Cooperation with the European digital humanities infrastructure Digital Research Infrastructure for the Arts and Humanities (DARIAH) will ensure the ecosystem's sustainability.
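The heterogeneity described above can be illustrated with a minimal sketch. It is not CENDARI's actual data model: the `HoldingRecord` class, the two mapper functions and all field names are assumptions made for illustration (the first mapper uses keys loosely modelled on EAD element names, the second imitates a hand-made spreadsheet export). The idea shown is that differently shaped descriptions are mapped onto a small common view, while every unmapped original field is kept rather than discarded.

```python
from dataclasses import dataclass, field


@dataclass
class HoldingRecord:
    """Minimal common view of an archival description (hypothetical schema)."""
    institution: str
    title: str
    language: str          # language code; "und" means undetermined
    source_format: str
    extra: dict = field(default_factory=dict)  # original fields left unmapped


def from_ead_like(raw: dict) -> HoldingRecord:
    """Map a record whose keys are loosely modelled on EAD element names."""
    known = {"repository", "unittitle", "langcode"}
    return HoldingRecord(
        institution=raw["repository"].strip(),
        title=raw["unittitle"].strip(),
        language=raw.get("langcode", "und"),
        source_format="ead-like",
        extra={k: v for k, v in raw.items() if k not in known},
    )


def from_spreadsheet_row(raw: dict) -> HoldingRecord:
    """Map a row from a hypothetical spreadsheet export."""
    known = {"Archive", "Collection", "Lang"}
    return HoldingRecord(
        institution=raw["Archive"].strip(),
        title=raw["Collection"].strip(),
        language=raw.get("Lang", "und"),
        source_format="spreadsheet",
        extra={k: v for k, v in raw.items() if k not in known},
    )


# Fictitious sample data, for illustration only.
records = [
    from_ead_like({
        "repository": " Example State Archive ",
        "unittitle": "War Ministry files",
        "langcode": "de",
        "unitdate": "1914-1918",
    }),
    from_spreadsheet_row({
        "Archive": "Example City Library",
        "Collection": "Field post letters",
    }),
]

for record in records:
    print(record.institution, "|", record.title, "|", record.language)
```

Keeping the unmapped fields in `extra` reflects the incomplete and uneven descriptions mentioned above: nothing is lost at import time, and further fields can be interpreted later as needed.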
CENDARI offers historians a dynamic user interface to research their topics of interest. While its enquiry environment is focused on the initial, exploratory phases of research, CENDARI will go beyond "search and retrieval"; historians will be able to analyse data with the help of refined data mining and visualization tools. They will also be able to upload their research to a personal research space where they can organise and exchange data with other researchers using annotations, tags or semantic links. This enquiry environment has been developed based on interactive participatory design sessions, domain-specific "use cases", and two domain-specific "prototype projects", intended to integrate the user's perspective while the virtual research environment is built.
Image 1: Paradigmatic visualisation of the CENDARI virtual research environment (Jean-Daniel Fekete, CENDARI, INRIA Paris)
The starting point for research on archival holdings for the First World War studies and the medieval history is the CENDARI Archive Directory. It integrates data and metadata from archives, libraries, museums and research institutions across Europe and the world. The CENDARI Archive Directory is more than a mere list of archive and library addresses and their relevant holdings. It covers, in a representative manner, different types of institutions with archival holdings in all European and many non-European areas important for historical research of the two pilot areas. The Archive Directory is the backbone of the CENDARI research infrastructure on the content level. Further information on holdings, collections and items will be built upon the information contained in the Archive Directory.
Another core activity of CENDARI is the creation of Archival Research Guides, intended to facilitate the process of researching relevant historical material to a given research topic. These Guides are organised thematically along paradigmatic research topics and leading research questions within both of our pilot domains. They will guide and enable research methods in the CENDARI virtual research environment, also taking into account the traditional methods of archival research and their methodical modification in the digital sphere.
Synergies
CENDARI is a highly collaborative project spanning multiple areas of expertise and knowledge, and crossing the linguistic and cultural boundaries of Europe. It brings together fourteen partners from eight countries and comprises computing scientists, information scientists, leading historians, archivists, and librarians working within a programme of technical research informed by cutting-edge reflection on the impact of the digital age on scholarly practice. Although CENDARI targets historians as its primary users, it encourages synergy between historians, archivists, and librarians. Often, digital resources are based upon the conventions of specific disciplines, like archival or library science, each with its own scholarly standards and methods. Within CENDARI's infrastructure, both content providers (ie cultural heritage institutions) and researchers will enrich their expertise and learn from each other's working methods.
While it is easily comprehensible that historians profit from a digital display of source information that allows for ubiquitous, time-saving access to finding aids and even single digitised documents, cultural heritage institutions are similarly interested in the needs of one of the most important groups of their visitors. At least four advantages of cooperating with digital projects like CENDARI can be mentioned at this point: visibility, contacts to the researchers, enhancement of existing information (crowdsourcing), and technical know-how accumulated within the CENDARI project.
Visibility: Portals and infrastructures like CENDARI give institutions the possibility to draw the attention of interested visitors to relevant holdings and collections. This applies especially to institutions which are underused but hold records of great relevance and importance for historical research on WWI and MM and can thus enrich current research and influence the public policies of memory and memory institutions.
Contacts and Requests: Knowing the requests of one of the most important groups of visitors, their expectations and also their current research methods helps to develop workflows and strategies for tasks and priorities within the cultural heritage institutions and thus places the institutions at the centre of cultural and social developments.
Enhancement: The information available on historical sources (in inventories, catalogues and other finding aids) is only as good as the data provided in the process of analysis and indexing within the cultural heritage institutions. Furthermore, the metadata format often determines the usability and exchangeability of information on, for instance, allied material. The process of analysis is costly in terms of the personnel required; besides, the expertise on the intrinsic evidence of the sources often remains within the community of historians rather than with archivists and librarians. Therefore, the inclusion of this reliable community enriches and supplements the work of archivists and librarians and allows for an exchange of information which can then be included in an institution's own enhanced publication and presentation.
Technical Know-how: Through cooperation, cultural heritage institutions can obtain advice, consultancy, tools or access to audiences for little investment. There is no need for them to start looking for (technical) solutions in institutional seclusion, as many answers to key aspects of the digital sphere already exist in an open and collaborative environment. The experience of manifold projects and experts can be evaluated for one's own strategic work.[1]
CENDARI Archive Directory: Archives, Libraries, and Museums
Cultural heritage institutions possess a great amount of material, and hundreds of holdings and record groups relevant for WWI research can be easily traced within almost each and every one of them – their physical content stored and arranged in a paper trail. Each record group can be consulted via existing analogue inventories and other finding aids. Those finding aids are fully accessible within a given institution and often also fully accessible online. However, many institutions are still in the process of digitising their analogue catalogues and inventories and are far from completing this task. Some of them prefer to display at least some unstructured information in PDF or Word format, others give access to digital databases that are sometimes more, sometimes less well searchable.
For historical research aiming at transnational and comparative studies, all source material of the belligerent countries of the First World War plays a key role, with regard to their military, diplomatic and political actions and thus to the records of the central administration and military leadership. In addition, the requirements of "histoire totale" and those of local or regional history of the War must not be neglected. The focus of WWI studies has changed over time: aspects of colonial or gender history, of memory and commemoration, of the history of everyday life ("Alltagsgeschichte") on the home front have been added to the topics of political processes of decision making or military developments on the Western and Eastern fronts. And this process has not ended yet. Accordingly, the relevant information on WWI topics is vast and almost endless, given the fact that private material, usually stored in family attics, has also become of interest for historians and the public and has become accessible via web portals such as "Europeana 1914-1918".
Historians need to know the context in which their sources were created, preserved and distributed. They require information on each source's history – its use, storage and authenticity. Usually, the cultural heritage institutions possess the legal authority to account for the credibility and reliability of the source material they are responsible for. Thus, their catalogues and finding aids contain the necessary information on the subject of single holdings and record groups which explain how to use a given inventory and how to interpret its contents. This practice applies, however, almost wholly to archival institutions and is less common in libraries and museums.
At the same time, libraries and museums have proven themselves to be much more advanced than archival institutions in sharing and presenting their holdings to the broader public. Libraries and museums more often form clusters and exchange data on the material they are holding. This also applies to the digital presentation of their material. More recently, however, archives have started to apply similar methods, building collaborative networks to develop their digital presence. This ongoing process will sooner or later bring a differentiation, exchangeability and standardisation of archival information as well as a comprehensive coverage of information on what is actually stored in these institutions.
At the moment, however, any research on archival holdings is to some extent confined to existing available information, especially in the digital domain, which depends on the granularity and the extensiveness of the displayed data. An archive's digital presence often lacks comprehensive indication of what exact resources and in what amount are to be found where. Hence, bearing in mind that the process of digitisation and the increasing digital availability of information on historical (re)sources are still ongoing, it is necessary to adopt – as much as possible – the existing standards of describing (digital) archival data. Simultaneously, the digital information can be enhanced by including other existing analogue evidence on the historical material. Thus, a certain degree of interoperability and exchangeability – also sustainability – can be ensured.[2]
Consequently, the results of the initial assessment and evaluation process formed part of the process of developing the criteria for the shape and content of the Archive Directory on WWI, such as a balanced geographical range (all belligerent countries, front regions) and representative range of types of institutions, but also relevance of holdings and the status of digitisation, visibility, and accessibility.
Image 2: Example: CENDARI Archive Directory on Polish Institutions
The Archive Directory currently contains information on more than one thousand institutions relevant for research on both Medieval Manuscripts and the First World War, with more than 450 considered important for medieval research and more than 670 as part of the WWI network. All European countries are covered, with emphasis placed on those countries that meet the selection criteria mentioned above. Furthermore, institutions in many non-European countries were also included in the research. This institutional coverage is intended to build a matrix which will serve as a reference level from which more granular information on archival material relevant for each pilot domain is derived. This matrix will also function as a link facilitating the attachment of further material and information in a hierarchical, yet connected and thus intuitive manner. Consulting multiple institutions is especially important for WWI research, not only for locating archival holdings and collections, but also for connecting them to each other, as historical materials have often been looted, relocated, destroyed or fragmented during the many conflicts and upheavals of the 20th century. To this end, the CENDARI Archive Directory is an important tool for piecing together the information of each individual institution so researchers can virtually reconstruct the often intricate history of important archival materials and identify their current status and location, strongly intertwined with the history of countries and nations.
Image 3: CENDARI Archive Directory: coverage of European institutions for WWI and MM
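The idea of piecing together dispersed material across institutions can be sketched in a few lines. The example below is an illustration under stated assumptions, not the Archive Directory's implementation: the `Holding` structure, the function `reconstruct_by_provenance` and all sample entries are hypothetical, and matching provenances by a normalised name is a deliberate simplification of what archivists do with authority records.

```python
from collections import defaultdict
from typing import NamedTuple


class Holding(NamedTuple):
    """One described holding in one institution (hypothetical structure)."""
    institution: str
    country: str
    provenance: str  # the body that originally created the records
    title: str


def reconstruct_by_provenance(holdings):
    """Group holdings by record creator and keep only provenances
    whose parts are now split across more than one institution."""
    groups = defaultdict(list)
    for holding in holdings:
        # Naive normalisation: ignore case and surrounding whitespace.
        key = holding.provenance.casefold().strip()
        groups[key].append(holding)
    return {
        key: parts
        for key, parts in groups.items()
        if len({part.institution for part in parts}) > 1
    }


# Fictitious sample data, for illustration only.
holdings = [
    Holding("Archive A", "PL", "Example Governorate Office", "Correspondence"),
    Holding("Archive B", "DE", "example governorate office ", "Personnel files"),
    Holding("Archive C", "AT", "Example Army Command", "Orders"),
]

dispersed = reconstruct_by_provenance(holdings)
for provenance, parts in dispersed.items():
    print(provenance, "->", sorted(part.institution for part in parts))
```

In this sketch only the first two entries are reported, because they share a record creator but sit in different institutions. Real provenance matching would need multilingual name variants and historical changes of administrative responsibility, which simple string normalisation cannot capture.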
What is visible, what is hidden?
The CENDARI research team places special emphasis on archival institutions in East and South East Europe. Those institutions and their holdings often remain neglected because of language: information about relevant material is only as perceptible as the language (often several languages) is readable to the interested scholar.
In this context, CENDARI gives special attention to the so-called "hidden" archives - smaller and less explored institutions. Thus, the definition of "hidden archives" has been seen as part of the process of selecting relevant information of lesser known institutions, holdings, and collections that are not being accessed widely by historians due to their perceived lack of visibility.
Hardly any public archival institution is actually "hidden" in the sense that its existence is not known to the scholarly community. Almost every institution presents itself in a clearly visible way on the web, adding information on its history, holdings and publications.[3] For researchers, most relevant institutions themselves are well-known. However, the rules of the "digital era" establish new conditions of visibility. A homepage displaying an address and office hours is no longer sufficient; more and more, the critical determinant of visibility is the quantity and quality of the information provided about the material stored in the institution.
This determines the perceptibility of the material and thus the research on a given topic in a new way, as implied in the principle "on the Internet - in the world". An institution's visibility therefore depends on the intensity of its digital presence and the accessibility it provides to all available information on its holdings. Even some national archives, virtual key players in the field of First World War Studies, can in this sense be considered "hidden", as some of them do not provide accessible, structured information on their collections and holdings. On the other hand, minimal information with little granularity (eg nothing but the title of a collection or a record group) does not allow for much evidence either. It merely indicates the existence of potentially relevant material and has to be investigated more thoroughly to offer insight into its content and thus to become "unhidden".
Moreover, the visibility of archival material relevant for WWI and MM research also depends on the aspect of multilingualism. Information provided in the language of the country concerned is only as visible as that language is widely spoken. While Latin predominates in most medieval manuscripts, although strongly supported by numerous vernacular idioms, the content of modern historical records is much more multilingual and requires polyglotism, especially when the material deals with multinational conflicts and series of events. Apart from that, English has become the "lingua franca" of the Internet, so that meta-information provided in English is also considered more "visible", while information in many other languages, especially Eastern and South Eastern European languages, remains less "visible" in a global context. A significant imbalance arises from the fact that historical material produced by the Western and the Eastern belligerent countries is often treated differently, due to the differing spread of the respective languages.
CENDARI Archival Research Guides
For this reason, the CENDARI Archival Research Guides for the First World War focus mainly, though not exclusively, on topics concerning war-related events in the Eastern part of the continent, allowing for comparison with the more extensive research and literature on the Western events and policies. This way, the Guides will not only enhance the historical research on WWI in Eastern Europe and the Eastern front, but at the same time give more visibility to institutions with significant archival holdings in East and South East European countries. The Guides will allow for a virtual composition of, and access to, comparable holdings across national and institutional borders, providing information on dispersed and relocated material belonging to the same historical context. They are intended to cover big thematic areas of historical research and they will also serve as "showcases" for user-generated content which may also be created in a guide format.
The CENDARI research team elaborated four components which will roughly cover the content of all CENDARI Research Guides. Although the Guides will differ to a certain extent concerning their focus and methodological emphasis, there are nevertheless several components considered to be essential for both research domains, in regard to their function as guides to topics and materials, whilst also meeting the requirement for historiographical and editorial context.
Image 4: Concept of the CENDARI Archival Research Guides (David Stuart, CENDARI, King’s College London)
The narrative text, forming a link to the current historiography, will focus on topics, concepts, events and developments, and include recent historiographical currents. The chosen topics do not claim objectivity, but according to certain methodological approaches they are perceived as paradigmatic and exemplary collections of information to be enhanced and complemented by researchers through comments, annotations, links to different resources etc. The content of the Research Guides does not claim to be exhaustive and can, and should, be further enriched by users. The Guides provide a starting point within the CENDARI research infrastructure for research on a given topic and a framework in which to work. The Guides will comprise collection descriptions of different granularity in more than one language across several institutions. They will strike a balance between collections with in-depth descriptions and digitised finding aids and collection descriptions from "hidden archives". They will comprise visual material as well as secondary resources, both analogue and digital.
The CENDARI Archival Research Guides direct users to varied content on their topic as well as to the application of virtual tools and systems, thus enhancing the traditional methods of historical research. Furthermore, they will make visible the difference between research on site (ie within an institution) and digital research, stressing the fragmented and incomplete access to information, given that not all information is yet available in a digital format.
Hence, the CENDARI Archival Research Guides are designed as methodological, paradigmatic guides for a virtual research environment. Some of the Guides will have an emphasis on "guides" rather than on "research", ie how to find closely related but physically dispersed material all over Europe and the world.
Some will be designed as access points to relevant contemporary research questions of the two pilot domains, with an emphasis on "research" rather than on "guides".
And finally, they will act as explanatory transfers from the analogue into the digital sphere, with an emphasis on "archival" rather than "guide" or "research" and they will relate to the changes in presentation and representation of archival holdings within the CENDARI Archive Directory.
However, while respecting these three methodological aspects, the CENDARI Archival Research Guides are intended to cover all three of them in a "hybrid" way, differing in the strategic focus and prioritising some aspects more than others. Nevertheless, they will all guide the path for research questions to be answered and worked on, reflecting the methods of the archives, libraries and museums storing the historical material.
Example: Parallel Records and Supplementary Material on Poland in the First World War
The concept of the CENDARI Archival Research Guides can be illustrated by the following example, dealing with the history of Poland during the First World War and after. The narrative parts of this Guide will introduce the historical context and define the subject to be examined, ie address the fact that there was, strictly speaking, no independent Polish state in existence before 1918, a fact that can be visualized by displaying the moving Polish borders in the 20th century with the help of digitally available maps and illustrations.[4] Moreover, due to these moving borders, documents addressing the history of Polish people in this period of time are now to be found in different countries which formed, on the one hand, the administration of the occupying central powers and Russia, and which, on the other hand, were the politically and organizationally responsible states before 1914, after 1919 and particularly after 1939 and 1945, and even 1989.
Image 5: CENDARI Archival Research Guide – context and content (Aleksandra Pawliczek, CENDARI, FU Berlin)
Thus, the archival material on Polish history and administrative structures has been relocated, restituted, looted and destroyed during the last century. Almost all records of Polish political institutions and military organisations, and many records concerning the central administration of Polish territories were burnt in 1944. Only fragments could be saved.[5] However, because of the fragmentation due to the changing political and administrative responsibilities, and also because of the repeated relocation of documents and records, a significant amount of material can be found in Austrian, German, Russian, Ukrainian, Lithuanian and Belarusian institutions. Beyond that, as the "Polish Question" was raised vehemently during the First World War (by the Poles themselves, but also as part of different policies of the belligerent parties), further relevant resources can be traced in France, Great Britain or the US. These countries also proved important for exiled Poles and Polish organizations after 1939, when parts of historical material were exiled with them.
In order to virtually reconstruct and merge the relevant information dispersed all over the world, it is thus appropriate to apply some of the archival work methods and to understand the process of building the archival paper/media trail within archives, libraries and museums. In this context, the aspects of origination and of use of parallel records created by the "other side" of any political or administrative action, and of supplementary and surrogate material (like personal papers, newsreels, newspapers, leaflets and pamphlets, produced in prisoner of war camps and elsewhere) become essential. They build a corresponding and complementary matrix of records and collections which can be evaluated for research purposes with the aid of historical methodologies. This way, these historical methods profit from understanding the concept and strategy of archival and librarian work and records' organisation and preservation.
The CENDARI Archival Research Guide on lost and relocated Polish documents will thus address questions like: how do archives and libraries work, how do they classify and organise information, and which pieces of information? Which aspects of their work change when this information becomes available online and which do not change at all? How can one locate documents and objects using principles of provenance, of legal and administrative responsibility, and the archival history itself? The physicality of archival objects still determines their storage place. Single documents on the First World War can be consulted in numerous institutions, but they mostly cannot be consulted online. Considering the amount of the relevant material, this may never change. The digital access to (re)sources does not mean that every piece of information can be obtained digitally. The CENDARI Archival Research Guide will address this fact, emphasizing the advantages of digital research methods but also accentuating their parameters and limits.
Therefore, all three aspects mentioned above, the archival, the research and the guide aspect, will be addressed in this CENDARI Research Guide, focusing on archival methodology and considering both the current historiography and the task of transferring analogue historical knowledge into a digital environment.
CENDARI and the CENDARI Research Guides thus aim at linking different knowledge communities to each other, translating their needs and facilitating their respective strategies in obtaining and mastering information, and adjusting their individual techniques and practices of expertise to each other.
About the Author
Aleksandra Pawliczek
Dr Aleksandra Pawliczek is Senior Researcher in Digital Humanities at the Friedrich Meinecke Institute for Modern History at Freie Universität Berlin. She is a trained archivist and historian and has published on Digital Humanities, History of Science and Jewish Studies. She has worked for the Secret State Archives Prussian Cultural Heritage and the Federal Commissioner for the Stasi Records of the former German Democratic Republic. She is currently coordinator and researcher for the CENDARI Archival Infrastructure.