Introduction to FAIR data principles for art researchers
Akseli Gallen-Kallela, Winter Landscape (Talvimaisema), 1900, oil on canvas, 27.5 × 38, Kirpilä Art Collection.
What is FAIR data?
FAIR data principles help make research data easy to find, access, use, and share with others. Rather than rigid technical requirements, these principles offer strategic guidance for data creation and management that fosters broad usability and collaboration across organizations and applications.
The acronym FAIR stands for:
Findable: Data should be easy to discover by humans and computer systems.
Accessible: Once found, users should be able to easily access that data and know what authentication and authorization are required for use.
Interoperable: Data should be able to integrate with other datasets, applications, or workflows.
Reusable: Data should be well-documented and include clear usage licenses so that others can use and reuse it.
For art historians, adopting FAIR practices when creating art-related data means creating a more transparent, collaborative, and efficient research environment that ultimately advances knowledge in the field. This article introduces the benefits of the practice, discusses each component of FAIR data principles, and suggests a few methods to work toward them.
The benefits of FAIR data principles in art research
The application of FAIR principles to art research offers substantial benefits for historians, institutions, and the field as a whole.
Enhanced connection of dispersed collections
FAIR data practices enable researchers to more easily discover and connect information across institutional boundaries through consistent metadata and persistent identifiers. This connectivity reveals previously obscured relationships between works, techniques, and historical contexts that remain hidden when data is siloed within individual institutions or projects.
Scientific validation in technical art history
For technical art studies that rely on scientific analyses, FAIR principles strengthen the foundation of the discipline. When analytical data follows FAIR guidelines, findings can be more rigorously verified, analytical methods can be reproduced, and research can build systematically upon previous work rather than duplicating efforts.
Interdisciplinary collaboration
FAIR data practices bridge the diverse specializations that contribute to comprehensive art research. By creating common frameworks for data exchange, these principles help integrate findings from conservation, art history, materials science, and other relevant disciplines into more holistic understandings of cultural objects.
Preservation of scholarly knowledge
Beyond the immediate research benefits, FAIR data principles serve a critical preservation function. Well-documented, interoperable data ensures that today's research findings remain accessible and usable for future generations, creating a more durable scholarly record.
How to create findable, accessible, interoperable, and reusable data
How researchers create data that is findable, accessible, interoperable, and reusable is entirely up to them, but there are some standard tools and approaches available.
Making research findable
Data that is findable allows art researchers to discover relevant materials across institutions, connect dispersed works, validate analyses, and build on previous scholarship, saving time and improving research quality. Rich metadata, persistent identifiers, standard vocabularies, and relevant repositories are common means of creating findable data.
Create rich metadata: Develop comprehensive descriptions in metadata that facilitate discovery, using fields relevant to art research such as creator, period, medium, subject matter, and research methodology.
Use persistent identifiers: Assign unique and stable identifiers (such as DOIs, ARKs, or ORCID iDs) to datasets, artworks, and researchers to ensure reliable citation and location over time.
Implement standardized vocabularies: Use established art historical thesauri like the Getty Vocabularies when describing objects, techniques, and other entities to enhance consistency and discoverability.
Register with relevant repositories: Deposit data in disciplinary repositories where colleagues naturally look for information, such as the Digital Art History Data Service or institutional repositories with strong arts and humanities collections. The Registry of Research Data Repositories can help locate a suitable repository.
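As a rough illustration of the first two practices above, the sketch below checks that a metadata record carries a minimal set of descriptive fields and that a researcher's ORCID iD is well formed. The field names, the sample record, and the DOI are hypothetical placeholders rather than a prescribed schema. The checksum follows the ISO 7064 MOD 11-2 scheme that ORCID documents for its identifiers.

```python
# Hypothetical minimal field set for an art-research dataset; adapt to your project.
REQUIRED_FIELDS = ("identifier", "title", "creator", "period", "medium", "methodology")


def missing_fields(record: dict) -> list[str]:
    """Return the required fields that are absent or empty in a metadata record."""
    return [name for name in REQUIRED_FIELDS if not record.get(name)]


def orcid_check_character(base_digits: str) -> str:
    """Compute the final ORCID character (ISO 7064 MOD 11-2) from the first 15 digits."""
    total = 0
    for digit in base_digits:
        total = (total + int(digit)) * 2
    result = (12 - total % 11) % 11
    return "X" if result == 10 else str(result)


def is_valid_orcid(orcid: str) -> bool:
    """Check the length, the digits, and the checksum of an ORCID iD."""
    compact = orcid.replace("-", "")
    base, check = compact[:15], compact[15:]
    if len(compact) != 16 or not (base.isascii() and base.isdigit()):
        return False
    return orcid_check_character(base) == check.upper()


record = {
    "identifier": "doi:10.5555/example-dataset",  # placeholder DOI, not a real dataset
    "title": "Example technical study",           # hypothetical title
    "creator": "0000-0002-1825-0097",             # ORCID's published sample iD
    "period": "1900",
    "medium": "oil on canvas",
    "methodology": "",                            # left empty to show the check
}

print(missing_fields(record))
print(is_valid_orcid(record["creator"]))
```

A check like this can run before deposit, so that incomplete records or mistyped identifiers are caught while they are still easy to correct.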
The Rijksmuseum makes its data findable through persistent identifiers, rich metadata, and standardized vocabularies. The Collection IT department manages an integrated architecture that combines digital protocols, data models, stable URLs, and a metadata repository aligned with FAIR principles, supporting multiple formats including CIDOC-CRM. For vocabulary standardization, the museum employs a range of established thesauri and international standards. Together, these practices exemplify findable data in cultural heritage. They underpin an online version of the museum's collection that allows users to search for artworks and build their own galleries and exhibitions of works.
Making research data accessible
Accessible data lets art researchers examine collections remotely with clear usage permissions, removing barriers to scholarship regardless of location or institutional affiliation. Accessibility doesn’t mean all data must be open access; rather, clear conditions for access must be established. Defining access conditions, using standardized protocols, and keeping metadata available all help researchers make their data accessible.
Define access conditions: Specify who can access the data, under what circumstances, and for what purposes. Defining access is particularly important for sensitive personal information, which often appears in provenance records or unpublished research.
Use standardized protocols: Implement common protocols for data retrieval, such as HTTP for web-based access (e.g., using a browser), rather than proprietary systems.
Ensure metadata remains accessible: Even when the full data has access restrictions due to copyright or privacy concerns, make metadata openly available so others can discover what exists.
Consider tiered access: Implement different access levels for various components when appropriate, such as providing basic documentation openly while requiring registration for an account to access high-resolution images or detailed analytical data.
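The tiered-access idea above can be sketched in a few lines. This is a minimal illustration under stated assumptions: the tier names, the component names, and the access notes are all hypothetical, and a real system would sit behind proper authentication. The point it shows is that every component stays discoverable, with its access condition stated, even when its content is restricted.

```python
from dataclasses import dataclass

# Access tiers in increasing order of privilege (hypothetical names).
TIERS = ("public", "registered", "staff")


@dataclass
class DatasetComponent:
    name: str
    minimum_tier: str      # lowest tier allowed to retrieve the content
    access_note: str = ""  # human-readable condition, shown to everyone


def can_access(user_tier: str, component: DatasetComponent) -> bool:
    """A user may retrieve a component if their tier is at least the minimum."""
    return TIERS.index(user_tier) >= TIERS.index(component.minimum_tier)


def describe(user_tier: str, components: list[DatasetComponent]) -> list[dict]:
    """List every component for discovery and flag whether its content is retrievable."""
    return [
        {
            "name": c.name,
            "access_note": c.access_note,
            "retrievable": can_access(user_tier, c),
        }
        for c in components
    ]


components = [
    DatasetComponent("descriptive metadata", "public", "Open to all"),
    DatasetComponent("high-resolution images", "registered", "Free account required"),
    DatasetComponent("provenance correspondence", "staff", "Contains personal data"),
]

for entry in describe("registered", components):
    print(entry)
```

Because `describe` never hides a component, a visitor without an account still learns that the restricted material exists and what is needed to request it.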
The Smithsonian Institution excels at making its data accessible using these approaches. The Smithsonian Open Access initiative released over 4.5 million digital collection items into the public domain with a Creative Commons Zero designation (CC0), allowing unrestricted use of its vast digital assets. The organization provides multiple access points through its robust API ecosystem, including the Smithsonian API and Collection Search API, which implement standardized RESTful protocols with clear documentation. The Smithsonian's metadata strategy is particularly strong, maintaining detailed descriptive information even for restricted content, enabling discovery across its 19 museums and research centers. Its carefully designed tiered access model accommodates various usage scenarios, from public domain materials to those with specific cultural sensitivities or copyright restrictions.
Albert Bierstadt, Among the Sierra Nevada, California, 1868, oil on canvas, overall: 72 x 120 1⁄8 in. (183 x 305 cm) frame: 96 1⁄4 x 144 3⁄8 x 7 1⁄4 in. (244.5 x 366.7 x 18.4 cm), Smithsonian American Art Museum, Bequest of Helen Huntington Hull, granddaughter of William Brown Dinsmore, who acquired the painting in 1873 for “The Locusts,” the family estate in Dutchess County, New York, 1977.107.1.
Making research data interoperable
Interoperable data enables art researchers to combine information from different collections and sources, integrating it into their own data and revealing new connections that would remain hidden in isolated systems.
Interoperability can be created through the following strategies:
Using standardized formats: Adopt widely used file formats appropriate to different types of art research, e.g., TIFF for images, CSV for tabular data, and XML or JSON for structured documentation.
Implementing common data models: Use established models such as CIDOC CRM for cultural heritage information rather than developing entirely new frameworks.
Employing linked data principles: Connect research data to related resources using standardized identifiers, enabling networked connections across datasets and disciplines.
Documenting relationships: Clearly articulate connections between different data components, such as an artwork, conservation records, and related analytical results.
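To make the strategies above more concrete, the following sketch reads analytical results from CSV, links each row to an artwork through a shared identifier, and writes the combined records as JSON. All identifiers, titles, and findings are invented for illustration, and the key names are hypothetical. A real project would more likely map these fields to an established model such as CIDOC CRM than use ad hoc keys.

```python
import csv
import io
import json

# Hypothetical tabular export of analytical results; "artwork_id" is the shared key.
ANALYSES_CSV = """artwork_id,technique,finding
obj-001,X-ray fluorescence,example finding A
obj-001,infrared reflectography,example finding B
obj-002,X-ray fluorescence,example finding C
"""

# Hypothetical artwork records keyed by the same identifiers.
ARTWORKS = {
    "obj-001": {"title": "Example painting A", "medium": "oil on canvas"},
    "obj-002": {"title": "Example painting B", "medium": "oil on panel"},
}


def link_analyses(artworks: dict, analyses_csv: str) -> list[dict]:
    """Attach each analysis row to its artwork, making the relationship explicit."""
    linked = {
        artwork_id: {"id": artwork_id, **fields, "analyses": []}
        for artwork_id, fields in artworks.items()
    }
    for row in csv.DictReader(io.StringIO(analyses_csv)):
        artwork_id = row.pop("artwork_id")
        if artwork_id not in linked:
            raise KeyError(f"analysis refers to unknown artwork {artwork_id!r}")
        linked[artwork_id]["analyses"].append(row)
    return list(linked.values())


records = link_analyses(ARTWORKS, ANALYSES_CSV)
print(json.dumps(records, indent=2, ensure_ascii=False))
```

Raising an error on an unknown identifier is a deliberate choice here: a dangling reference is easier to fix at the point of export than after the data has been shared.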
The Yale Center for British Art's (YCBA) implementation of metadata and standard vocabularies makes it a practical example of an institution that prioritizes interoperability in its collection. It uses machine-readable formats, has adopted the IIIF Consortium standards, and organizes its data with the CIDOC CRM. This approach ensures that its collection data can be easily integrated into various systems, enhancing research and creative endeavors in the digital environment. By structuring its collection data with standard vocabularies and publishing this information through an API, it has significantly enhanced the interoperability of its research resources.
William Morris, News from Nowhere, 1892, Rare Books and Manuscripts (PR5079 .N48 1892), Yale Center for British Art.
Making research data reusable
Reusable data, accompanied by clear documentation and licenses, can enable researchers to build directly on others' work, preventing duplication of effort and accelerating the development of new insights.
To enhance reusability:
Provide comprehensive documentation: Include detailed information about research methods, analytical parameters, and data processing steps to enable meaningful reuse. Also include which model, formats, and vocabularies have been used.
Assign clear licenses: Specify how others can reuse the data through standardized licenses like Creative Commons, balancing openness with appropriate protection of intellectual property.
Include origin information: Document the origin and processing history of data to establish its credibility and enable proper attribution.
Document limitations: Clearly state any known limitations, biases, or uncertainties in the data to guide appropriate reuse.
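One way to act on the points above is to keep documentation in a small machine-readable record that travels with the dataset. The sketch below is an assumption-laden example, not a standard: the function name, the record keys, and the sample dataset are hypothetical. The license keys are SPDX identifiers for three Creative Commons tools, and the table summarizes their main reuse terms.

```python
import json

# Main reuse terms of three Creative Commons tools, keyed by SPDX identifier.
LICENSE_TERMS = {
    "CC0-1.0": {
        "attribution_required": False, "commercial_use": True, "share_adaptations": True,
    },
    "CC-BY-4.0": {
        "attribution_required": True, "commercial_use": True, "share_adaptations": True,
    },
    "CC-BY-NC-ND-4.0": {
        "attribution_required": True, "commercial_use": False, "share_adaptations": False,
    },
}


def build_documentation(title, license_id, origin, processing_steps, limitations):
    """Assemble a documentation record covering license, origin, and limitations."""
    if license_id not in LICENSE_TERMS:
        raise ValueError(f"unknown license identifier: {license_id}")
    return {
        "title": title,
        "license": license_id,
        "reuse_terms": LICENSE_TERMS[license_id],
        "origin": origin,
        "processing_steps": list(processing_steps),
        "known_limitations": list(limitations),
    }


doc = build_documentation(
    title="Example pigment survey",  # hypothetical dataset
    license_id="CC-BY-4.0",
    origin="Measurements collected by the project team (placeholder description)",
    processing_steps=["raw spectra exported", "baseline corrected"],
    limitations=["only a small sample of works was measured"],
)

print(json.dumps(doc, indent=2))
```

Saving this record next to the data files gives later users the license, the origin, and the known limitations in one place.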
The National Gallery of Art, Washington D.C., exemplifies a dedication to making its data reusable. It follows a clear licensing framework that balances openness with appropriate protections. Its structured data is released under a CC0 dedication, making it completely open for any use; narrative text content (like descriptions and notes) is available under Creative Commons Attribution 4.0, which allows copying and adaptation with attribution. Images are covered by Creative Commons Attribution-NonCommercial-NoDerivs 4.0, which means they can be shared for non-commercial purposes with attribution but not adapted. Relying on external standards helps users know what is allowed when reusing the immense amount of data available through the National Gallery of Art.
You can explore documentation from additional institutions, such as the Metropolitan Museum of Art, the Art Institute of Chicago, the Victoria & Albert Museum, the Smithsonian, and the Minneapolis Institute of Art.
Challenges and practical considerations to begin using FAIR data principles
While the benefits of FAIR data practices are substantial, implementing these principles can present challenges that require thoughtful consideration.
Resource constraints
Making data FAIR requires time, expertise, and often financial investment. Researchers and institutions with limited resources may find it challenging to implement comprehensive FAIR data practices. However, even incremental improvements yield benefits, and numerous free or low-cost tools can support FAIR data management.
Intellectual property complexities
Art research frequently involves complex intellectual property considerations related to artwork, publications, and research findings. While FAIR principles emphasize accessibility, they also acknowledge the importance of clear usage licenses and conditions, requiring careful balancing of openness with appropriate rights management.
Technical expertise gaps
The technical aspects of implementing FAIR principles may be unfamiliar to many art researchers. Addressing this gap requires either developing these skills within research teams or establishing partnerships with data specialists who can provide necessary technical support.
Institutional and disciplinary traditions
The shift toward FAIR practices may challenge traditional approaches to scholarship that emphasize individual attribution and institutional ownership of data. Successfully implementing FAIR principles often requires cultural change within research communities and institutions.
Starting your FAIR data journey
For art researchers interested in applying FAIR principles, several practical steps provide a starting point:
Assess current practices: Evaluate existing data management approaches to identify strengths and areas for improvement.
Start with metadata: Begin by creating more comprehensive, standardized metadata for research materials, even if the underlying data cannot yet be fully FAIR-compliant.
Explore existing resources: Investigate data models, vocabularies, and repositories already established in the field, such as the Getty Research Institute's vocabularies or the Europeana Data Model.
Connect with colleagues: Engage with other researchers and institutions implementing FAIR principles to learn from their experiences and potentially collaborate on shared approaches.
Seek training opportunities: Take advantage of workshops, online courses, and other resources related to data management in cultural heritage fields.
Implement incrementally: Adopt FAIR practices gradually, focusing first on new projects while developing plans to address legacy data over time.
Supporting future researchers and expanding the field
FAIR data principles offer art researchers a valuable framework for improving how research data is managed, shared, and preserved. By making research data more findable, accessible, interoperable, and reusable, these principles enhance collaboration, efficiency, and impact. While implementing FAIR principles requires investment and adaptation, the benefits to individual researchers, institutions, and the broader scholarly community make this effort worthwhile.