Almost frictionless: open access museum collections in practice in 2026

Jun 1

The past decade has changed what it means to have access to a museum collection. Major institutions, including the Metropolitan Museum of Art, the Art Institute of Chicago, and the Rijksmuseum, have released vast holdings under Creative Commons Zero (CC0) licenses. In doing so, they've made high-resolution images and detailed metadata freely available to anyone with an internet connection, for any purpose, without fee or permission. For art historians, this represents a genuine shift: works that once required sustained physical proximity to a collection, or the resources to license reproductions, can now be retrieved, compared, and analyzed at a scale that was not previously possible.

The details included here reflect first-hand experience of working with open-access collections to build a dataset of images of paintings made by artists who showed at one of the eight independent exhibitions in Paris from 1874 to 1886, often referred to as the Impressionist exhibitions. The goal is to provide an overview of five open-access APIs from museums with strong Impressionist holdings and to name the recurring challenges that arise when working with open-access collections programmatically.

Pierre-Auguste Renoir, *Pêches*, 1881. Oil on canvas, 53.3 × 64.8 cm. Image courtesy of The Metropolitan Museum of Art, New York (in the Public Domain). First exhibited at the *7me exposition des artistes indépendants*, March 1882, no. 159 (as "Les pêches").

Common challenges when working with open-access collections

The collections described in this article are genuinely accessible, and the technical challenges researchers encounter are consistently solvable. Knowing what to look for makes the difference between an afternoon lost to diagnosis and an afternoon spent doing research.

The most common challenges fall into four patterns.

Implicit requirements are dependencies that the documentation doesn't mention, which produce errors or empty results until identified — for example, a required header or a field that must be explicitly requested.
Documentation drift occurs when an API has evolved, but its documentation hasn't kept pace, so code written from the docs behaves differently than expected.
Parameter inconsistency appears when the same concept is named differently across endpoints or API versions, meaning only one formulation works even when several seem plausible.
Unconventional architecture describes cases where an institution's programmatic access follows a model outside standard conventions, so the usual first attempts don't apply, and a different approach is needed from the start.

None of these patterns is unique to cultural heritage data, and none requires specialized expertise to navigate. They simply need to be named before a researcher arrives.

The Metropolitan Museum of Art: no authentication, clean JSON, and reliable images

The Met's collection API is openly documented, requires no authentication, and has been publicly available since 2018. A researcher can query it without registering for a key, without agreeing to terms beyond the CC0 license that governs the images themselves, and without any special configuration. Requests return clean JSON. Images resolve reliably. The open access policy is reflected accurately in the data: works flagged as open access are, in practice, open.

One distinction is worth understanding before starting. Access (retrieving a work once you know what you're looking for) and discovery (finding it in the first place) are built on different infrastructures and can behave differently. The Met's API handles both well, with one nuance on the discovery side: artist-name search does not always return results organized by the queried name. Researchers who arrive with a list of object IDs, or who retrieve full department listings and filter locally, will find the API entirely reliable. The Met's online collection is a practical way to identify object IDs before querying the API directly.
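The access-then-filter approach described above can be sketched in a few lines of Python. This is a minimal sketch, not a complete pipeline: it assumes the base URL and the field names (isPublicDomain, primaryImage, artistDisplayName) given in the Met's public API documentation, and the helper names are hypothetical.

```python
import json
import urllib.request
from typing import Optional

# Base URL as given in the Met's public API documentation (verify before use).
MET_BASE = "https://collectionapi.metmuseum.org/public/collection/v1"


def object_url(object_id: int) -> str:
    """Build the URL for a single object record from a known object ID."""
    return f"{MET_BASE}/objects/{object_id}"


def fetch_json(url: str) -> dict:
    """GET a URL and decode its JSON body. This performs a network request."""
    with urllib.request.urlopen(url, timeout=30) as response:
        return json.load(response)


def open_access_image(record: dict) -> Optional[str]:
    """Return the primary image URL only when the work is flagged open access."""
    if record.get("isPublicDomain") and record.get("primaryImage"):
        return record["primaryImage"]
    return None


def matches_artist(record: dict, surname: str) -> bool:
    """Filter locally by artist name instead of relying on artist-name search."""
    return surname.lower() in record.get("artistDisplayName", "").lower()
```

With a list of object IDs in hand, a typical loop would call fetch_json(object_url(object_id)) for each ID, keep the records for which matches_artist is true, and download only the URLs that open_access_image returns.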

For most purposes, the Met's API is as close to frictionless as publicly available museum data gets. It is a reasonable model against which to measure the others.

The Art Institute of Chicago: rich metadata and complex queries

The Art Institute of Chicago API is well-designed and generously documented. It supports complex queries, returns rich metadata, and covers a collection with significant holdings in Impressionism, Post-Impressionism, and American modernism. Once configured correctly, it is among the most capable museum APIs available.

Two implicit requirements are worth knowing before building an image retrieval pipeline. The AIC hosts images through a separate IIIF server, and requests to that server require a Referer header pointing to www.artic.edu. Without it, the server returns a 403 error. Additionally, the image_id field necessary to construct the image URL is not returned by default and must be explicitly requested using the fields parameter. Both are straightforward fixes, though the documentation would benefit from a dedicated section on image retrieval that names them directly.
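Both requirements can be handled when the requests are built. The sketch below assumes the search endpoint and IIIF URL pattern from the AIC's public documentation; the exact Referer value, the default width, and the function names are assumptions made for illustration.

```python
import urllib.parse
import urllib.request

# Endpoints as given in the AIC's public API documentation (verify before use).
AIC_SEARCH = "https://api.artic.edu/api/v1/artworks/search"
AIC_IIIF = "https://www.artic.edu/iiif/2"


def search_url(query: str, limit: int = 10) -> str:
    """Build a search URL that explicitly requests image_id via `fields`."""
    params = {
        "q": query,
        "limit": limit,
        # image_id is not returned by default, so it is named here.
        "fields": "id,title,artist_display,image_id",
    }
    return f"{AIC_SEARCH}?{urllib.parse.urlencode(params)}"


def image_request(image_id: str, width: int = 843) -> urllib.request.Request:
    """Build an IIIF image request carrying the Referer header."""
    url = f"{AIC_IIIF}/{image_id}/full/{width},/0/default.jpg"
    # Assumed header value: the article says only that the Referer must
    # point to www.artic.edu.
    return urllib.request.Request(url, headers={"Referer": "https://www.artic.edu/"})
```

Passing the result of image_request to urllib.request.urlopen performs the download. Omitting the header is the case that produces the 403 described above.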

The Cleveland Museum of Art: queryable, well-organized, and genuinely committed to open access

The Cleveland Museum of Art has been among the most committed institutions in the open access movement, and its API reflects that commitment. The collection is queryable, well-organized, and returns useful metadata across more than 61,000 records, with no key or token required for access.

One small documentation discrepancy is worth knowing. The license field for open access works returns "CC0" in the API, while the documentation describes it as returning "Open Access." Code written directly from the documentation to filter for open access works will silently return nothing until this is corrected — a single string substitution once identified.
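One way to catch this kind of drift early is to tally the values a field actually contains before filtering on it. The sketch below is illustrative only: the field name share_license_status is an assumption and should be checked against a live record, and the function names are hypothetical.

```python
from collections import Counter

# Assumed field name; confirm it against a record returned by the API.
LICENSE_FIELD = "share_license_status"
# Value the API returns for open access works, per the discrepancy above.
OPEN_ACCESS_VALUE = "CC0"


def license_values(records: list) -> Counter:
    """Count the distinct license strings present in a batch of records."""
    return Counter(record.get(LICENSE_FIELD) for record in records)


def open_access_only(records: list) -> list:
    """Keep only the records whose license field equals the observed value."""
    return [r for r in records if r.get(LICENSE_FIELD) == OPEN_ACCESS_VALUE]
```

Running license_values on the first page of results shows at a glance whether the string in the filter matches what the API sends, which turns a silent empty result into a visible mismatch.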

The Rijksmuseum: authenticated API and keyless entry point

The Rijksmuseum offers both a full authenticated API and a keyless endpoint, making it accessible to researchers at any level of commitment. The authenticated API is capable and well-behaved; researchers planning extended work with the collection will find registration worthwhile.

The keyless endpoint is a useful starting point, with one parameter inconsistency to know in advance. When filtering by artist, only the parameter "creator" returns results; "maker," "artist," and "principalMaker" all return 400 errors. A researcher querying the keyless endpoint for works by a specific artist needs to know this before beginning.
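Wrapping the working parameter name in a small helper keeps it in one place. In this sketch the base URL is an assumption (the article does not give the keyless endpoint's address) and the function name is hypothetical; only the parameter name "creator" comes from the behavior described above.

```python
import urllib.parse

# Assumed address of the keyless search endpoint; confirm it against the
# Rijksmuseum's current documentation before use.
RIJKS_SEARCH = "https://data.rijksmuseum.nl/search/collection"


def creator_search_url(name: str) -> str:
    """Build an artist query using `creator`, the one name that returns results."""
    return f"{RIJKS_SEARCH}?{urllib.parse.urlencode({'creator': name})}"
```

If the endpoint's parameter names change, the helper is the only place that needs updating.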

The Getty: excellent image quality, fully accessible, SPARQL, and Linked Art

The Getty's collection is publicly accessible without any API key, and its image quality is among the best available from any open-access institution. The IIIF endpoint supports arbitrary resolution, and the Linked Art 1.0 records are richly detailed. The collection is genuinely accessible once its architecture is understood.

That architecture differs from standard museum API conventions and is worth understanding before starting. Discovery works through a public SPARQL endpoint at https://data.getty.edu/museum/collection/sparql. Researchers can POST a query with the appropriate headers, with no registration required, and receive object URIs in return. Metadata and images are then retrieved by dereferencing each URI with an Accept: application/ld+json header, which returns a full Linked Art 1.0 record containing image URLs, material classifications, attribution, dates, and rights information. A few practical notes: the Getty returns artist names with parenthetical biographical information that requires cleaning before use; the made_of field should be checked to filter to oil paintings specifically; and only the Linked Art 1.0 endpoints, stable since mid-2024, are currently reliable. Older endpoints documented in many online posts no longer function.
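The two-step pattern (discover through SPARQL, then dereference) can be sketched as follows. The SPARQL headers follow the standard SPARQL 1.1 protocol and are an assumption about what this endpoint accepts. The example artist string is an assumption about the form the parenthetical information takes, and all function names are hypothetical.

```python
import re
import urllib.parse
import urllib.request

GETTY_SPARQL = "https://data.getty.edu/museum/collection/sparql"


def sparql_request(query: str) -> urllib.request.Request:
    """Build a POST request carrying a form-encoded SPARQL query."""
    body = urllib.parse.urlencode({"query": query}).encode("utf-8")
    return urllib.request.Request(
        GETTY_SPARQL,
        data=body,
        headers={
            # Standard SPARQL 1.1 protocol headers (assumed to be accepted).
            "Content-Type": "application/x-www-form-urlencoded",
            "Accept": "application/sparql-results+json",
        },
        method="POST",
    )


def record_request(object_uri: str) -> urllib.request.Request:
    """Build the request that dereferences an object URI as Linked Art JSON-LD."""
    return urllib.request.Request(object_uri, headers={"Accept": "application/ld+json"})


def clean_artist_name(raw: str) -> str:
    """Strip a trailing parenthetical, e.g. 'Name (French, 1840 - 1926)'."""
    return re.sub(r"\s*\([^)]*\)\s*$", "", raw).strip()
```

Each request object can be passed to urllib.request.urlopen. The SPARQL response lists object URIs, and each URI is then fed to record_request to retrieve the full record.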

WikiArt: different paths to the same collections

WikiArt occupies a distinctive and widely used place in the digital art history landscape. As an aggregator rather than an institutional collection, it has assembled an enormous quantity of reproductions, spanning centuries, movements, and geographies, that researchers and educators have come to rely on for browsing, teaching, and preliminary research. For many purposes it is genuinely useful: its coverage is broad, its interface is accessible, and it has introduced generations of students to works they might not otherwise have encountered.

Its limitations become significant, however, when the task shifts from browsing to building. WikiArt's images are not uniformly in the public domain, and the platform does not operate under a consistent open license: rights status varies by work and is not always clearly indicated. Automated requests are actively blocked, making programmatic access unavailable regardless of intended use. For researchers assembling image datasets, these constraints are determinative: WikiArt cannot serve as a reliable source, and works that appear there can almost always be retrieved more dependably, with clearer rights status, through the originating institutional collections directly. The practical takeaway is not that WikiArt should be avoided, but that its role in a research workflow should be understood clearly: it is a discovery and reference tool, not a data source.

Conclusion: What the open-access landscape looks like in 2026, and where it's going

The collections described in this article are, taken together, genuinely accessible. A researcher who wants to work programmatically with Impressionist holdings across multiple institutions can do so. The images are there, the metadata is there, and the rights status on the overwhelming majority of pre-twentieth-century works is unambiguous. None of the technical challenges described here is demanding: a researcher with basic familiarity with HTTP requests and JSON, and with SPARQL for the Getty, can navigate all of it.

The next chapter of open access in art history is less about rights and more about legibility. Museums that make the practical requirements of programmatic access as clear as their license terms will carry forward the project of making collections technologically accessible. The institutions that have done the hard work of opening their collections are well placed to lead that effort, and the researchers who benefit from it have reason to support them in doing so.

Kiersten Thamm

Dr. Thamm bridges art history and technology, researching their mutual influence and supporting historians using computational technology for new forms of knowledge production.