Experiment to plug Te Papa collections into ChatGPT

It would be hard to miss that Artificial Intelligence (AI) has seen an exponential rise in programming, availability, use, and debate in the last few years. Here at Te Papa, we’ve been looking at possible use cases for the Digital Museum, and investigating safe ways to connect the collections to visitors. In this blog, Collections Data Manager Gareth Watkins describes his experiments with ways Generative AI can tell richer stories and enable deeper connections to our collection database.

In my role as Collections Data Manager, I focus on ensuring our collections data is accessible, accurate, and meets international data standards. Given that the museum has well over one million collection items, and we’ve been collecting for over 150 years, this may seem a rather daunting task.

Thankfully, maintaining collections data is a shared responsibility. Much of my role is about training teams to use EMu (our collections management system) effectively, creating documentation, and running collections data audits, all with the aim of encouraging people to input authoritative collection information into the appropriate fields in EMu.

I’ve often found that a valuable way of identifying possible data issues and opportunities is to imagine myself as a visitor searching Te Papa’s collections. For example, if I know through my in-house knowledge that an item exists in our collection, how easy (or hard) is it to find by searching Collections Online?

This year, with the rapid developments in Generative Artificial Intelligence (Gen AI), I’ve been able to take this a step further.

In 2023, we established an internal AI Guidance Group “to create a safe, transparent environment that allows Te Papa to explore the possibilities and responsibilities of Gen AI.” The group has developed guidelines covering bias, privacy, and data sovereignty issues. For my experiment, I am only using publicly available data.

My Gen AI experiment

I was interested to know, as a member of the general public, how easy it would be to link a Gen AI product with Te Papa’s already published collections data. In this case I wanted to create a customised version of ChatGPT.

The result, Deep Dive the Museum, is I think really exciting. As shown in the screenshot below, when I typed “the moon”, ChatGPT didn’t output a list of search results. Instead, it summarised the content sourced from Collections Online, gave context, and allowed me to ask questions about what’s in the collections:

Creating Deep Dive the Museum

Since 2018, Te Papa has made a subset of its collections data available through a public API (Application Programming Interface). In fact, it’s this API which feeds Collections Online. The API lets a remote computer query Te Papa’s collections data, and then do whatever it wants with that information – an ideal starting point for connecting with platforms like ChatGPT.
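To give a sense of what “querying” means here, the following is a minimal Python sketch that builds a search request without sending it. The endpoint path, the q parameter, and the x-api-key header are assumptions made for illustration; the Te Papa API documentation is the place to confirm the actual details.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Assumed endpoint for illustration only; confirm against the API documentation.
SEARCH_URL = "https://data.tepapa.govt.nz/collection/search"


def build_search_request(query: str, api_key: str) -> Request:
    """Build (but do not send) a GET request for a collections search."""
    params = urlencode({"q": query})  # "q" is an assumed parameter name
    return Request(
        f"{SEARCH_URL}?{params}",
        # "x-api-key" is an assumed header name for passing the key
        headers={"x-api-key": api_key, "Accept": "application/json"},
    )


request = build_search_request("the moon", api_key="YOUR-API-KEY")
print(request.full_url)
```

Passing the request to urllib.request.urlopen would send it. The sketch stops short of that, so it runs without a network connection or a real key.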

The creation of a public API at Te Papa was championed by Adrian Kingston, Head of Digital Channels. For years, Adrian has focussed on moving access to our collections beyond the physical walls of the museum into an open, digital space.

When the API was launched, Te Papa licensed all of the collections metadata (the information about an item, but not images of the item) under a Creative Commons CC-BY 4.0 license. This means that anyone can freely make use of the collections metadata published through the Te Papa API – even commercially.

The Te Papa API structures the output of the information to help the remote computer interpret it correctly. For example, there are specific fields in the API for Title, Description, and Production Date. Structured data is incredibly useful in many situations. However, I found ChatGPT preferred these pieces of data to be joined together and presented as a single text block.

To do this I needed to create a simple, intermediary API which for this blog I’ll call “My API”. My API receives the data from Te Papa and transforms it into a single text block. It then sends this text to ChatGPT. The image below shows what the data originally looked like in the Te Papa API (visualised via Collections Online), and what it was transformed into by My API.

Originally the data is structured into separate fields:

My API takes the data and transforms it into a single text block and sends it to ChatGPT:
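The transformation step can be sketched in a few lines of Python. This is not the actual code behind My API. The field names and the sample record are made up for illustration, and the real API response may be shaped differently.

```python
# Hypothetical field names paired with the labels used in the output text.
FIELDS = [
    ("title", "Title"),
    ("description", "Description"),
    ("productionDate", "Production date"),
]


def record_to_text(record: dict) -> str:
    """Join the selected fields of one record into a single text block."""
    parts = []
    for key, label in FIELDS:
        # Skip missing or empty fields; drop a trailing full stop so the
        # join below does not produce a double one.
        value = str(record.get(key) or "").strip().rstrip(".")
        if value:
            parts.append(f"{label}: {value}")
    return ". ".join(parts) + "." if parts else ""


# Made-up sample record, not real collections data.
sample = {
    "title": "Moon globe",
    "description": "A globe showing the surface of the moon.",
    "productionDate": "1969",
}
print(record_to_text(sample))
```

Keeping the labels in the text gives the language model some of the context that the separate fields originally carried.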

I found that another benefit of creating My API was the ability to prioritise the content supplied to ChatGPT. For example, I prioritised “topic” records (e.g. Saving the kākāpō) as they often go into more detail about a subject. These were then followed by information about specific collection items.
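One simple way to express this kind of prioritisation is a stable sort that moves topic records to the front. The sketch below assumes each record carries a hypothetical "type" field with the value "Topic" for topic records. That field name and the sample records are illustrative only.

```python
def prioritise(records: list[dict]) -> list[dict]:
    """Return topic records first, then everything else."""
    # sorted() is stable, so records keep their original relative order
    # within each group.
    return sorted(records, key=lambda r: 0 if r.get("type") == "Topic" else 1)


# Made-up sample records.
results = [
    {"type": "Object", "title": "Kākāpō specimen"},
    {"type": "Topic", "title": "Saving the kākāpō"},
    {"type": "Object", "title": "Kākāpō feather"},
]
ordered = prioritise(results)
print([r["title"] for r in ordered])
```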

More useful discoveries

Another useful discovery was to compare the output of My API to what ChatGPT summarised. I found that I had to be very explicit in my instructions to ChatGPT, specifying that all results from Te Papa should be treated as relevant.

I learnt this when I did a search on “mollusca”. While the Te Papa API sent back hundreds of items, the text describing those items only contained the term “molluscs” – meaning ChatGPT didn’t make the connection and said there were no results. I added the following line to my instructions to ChatGPT, and instantly the API results were interpreted correctly: “Even if the original search term is not stated in the text, all of the results from the API are relevant.”

For me, another key element is to have ChatGPT show a link back to the original source page(s) on Collections Online. This allows the user to check the accuracy of the data and make sure that ChatGPT is not “hallucinating” by generating information that is incorrect or totally fabricated (this is a known issue for many Gen AI products currently available).
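As a rough illustration, an intermediary API could attach the source link to each text block before passing it on, so the model has a URL to show. The URL pattern and the record type and ID values below are assumptions, not a description of how Deep Dive the Museum does it.

```python
# Assumed URL pattern for Collections Online pages; verify before relying on it.
COLLECTIONS_ONLINE = "https://collections.tepapa.govt.nz"


def with_source_link(text: str, record_type: str, record_id: int) -> str:
    """Append a link back to the source page to a record's text block."""
    url = f"{COLLECTIONS_ONLINE}/{record_type}/{record_id}"
    return f"{text}\nSource: {url}"


# Hypothetical record type and ID, for illustration only.
block = with_source_link("Title: Moon globe.", "object", 12345)
print(block)
```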

The experimental Deep Dive the Museum GPT is now freely available to users of ChatGPT (including free accounts). It is, as I’ve mentioned, experimental and is not designed to live a long life. However, it does demonstrate how Gen AI products can be used in a positive way to interact with heritage collections. Needless to say, it all starts with good collections data.

I believe the future of how Gen AI intersects with the heritage sector is very exciting. Currently at Te Papa, teams are exploring the use of Gen AI for collection access and experience, individual productivity, and as part of our programme evaluation. We’re utilising different platforms and AI models to understand, engage with, and open up our collections… so watch this space!

9 Comments

  1. Kia ora Gareth, are any of the AI Reference Group developments available to view? I think other organisations would be really interested in how Te Papa is navigating this space (especially with visual AI-generated content). I see that Philip asked the same further up and some time in the past! I wonder if you could connect me with any information. Ngā mihi.

  2. This is exciting stuff Gareth – thank you for publicising what you’ve been doing. I’m keen to try in my own catalogue. You mention the work the internal AI guidance group has done, are the guidelines available to others?

    1. Author

      Thanks Philip, I’ve forwarded your details onto Craig Le Quesne, lead of the AI Guidance Group

  3. That’s really cool! I imagine this sort of interface for collections will become very popular in the future. Thanks for letting us see “under the hood”.

    1. Author

      Thanks David for the comment. I made a couple of discoveries after writing the blog. Firstly, I hadn’t previously considered using Gen AI to also interpret the user’s original search query. So I’ve tweaked the instructions for ChatGPT to take the original query, reduce it down to keywords, and then add related keywords before querying Te Papa’s collections data. So, if the original user search was “Rita Angus”, ChatGPT would actually submit “Rita Angus New Zealand artist paintings art” to Te Papa’s API. This then gives Te Papa’s API more to work with. Secondly, I discovered that if I add in a list of example prompts for the user to select from, e.g. “Please respond in Spanish”, ChatGPT will conduct the whole conversation in Spanish, including translating any text retrieved from the Te Papa API.

  4. Brilliant piece explaining many things that have felt mysterious and kind of scary to me. Thank you!

    1. Author

      Thanks Emma-Jean
