Gen AI collection descriptions – is it a pass?

In their first Gen AI blog, Gareth and Katie left you hanging as to how the descriptions created using Generative Artificial Intelligence turned out. Before they show an example and talk about how they assessed the content, they want to share a couple of discoveries they made about supplying images to the AI model.

Initially, we provided the AI model (OpenAI’s GPT-4o mini) with web links to images online. However, this proved unreliable, as the AI model sometimes timed out when it couldn’t retrieve the image in time. We found a more stable and effective approach was to store the images locally and package them with our text query. This method also allows us to manipulate the images before submission if needed. For example, we can crop them, convert them to black and white, or adjust the contrast to improve readability for the AI model.
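The local-image approach described above can be sketched roughly as follows, assuming the OpenAI Python client's chat completions API. The helper names (`image_to_data_url`, `build_messages`) and the file name are illustrative, not Te Papa's actual code:

```python
import base64


def image_to_data_url(image_bytes: bytes, mime: str = "image/jpeg") -> str:
    """Package raw image bytes as an inline data URL, so the request
    does not depend on the model fetching a web link in time."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return f"data:{mime};base64,{encoded}"


def build_messages(image_bytes: bytes, prompt: str) -> list:
    """Bundle the text query and the locally stored image into one request body.
    Because the image is held locally, it could first be cropped, converted to
    greyscale, or contrast-adjusted (e.g. with Pillow) before encoding."""
    return [{
        "role": "user",
        "content": [
            {"type": "text", "text": prompt},
            {"type": "image_url",
             "image_url": {"url": image_to_data_url(image_bytes)}},
        ],
    }]


# The request itself would then look something like this (requires an API key):
# from openai import OpenAI
# client = OpenAI()
# response = client.chat.completions.create(
#     model="gpt-4o-mini",
#     messages=build_messages(open("badge_front.jpg", "rb").read(),
#                             "Describe the object pictured."),
# )
```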

Additionally, an unexpected consequence emerged when we asked the AI model to utilise a high level of image interpretation. While this approach provided richer descriptions, it also began recognising inscriptions on the objects. Unfortunately, the AI-derived inscriptions were not always accurate. For example, the AI model concluded that the partially visible inscription below was “Scientific and Clear,” rather than the correct “Scientific and Chem.”

A small box with a paper parcel in it that has been ripped open. Inside the paper is some bright sky-blue powder. Some of the powder has spilt into the box.
Copper Sulphate; maker unknown; Purchased 1997 with New Zealand Lottery Grants Board funds. Te Papa (GH005877)

To address this, we started providing the AI model with human-verified inscription information from our database. We then included the following instruction in the User prompt: “The object pictured contains a verified inscription, which should always be incorporated into your response.”
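Folding the human-verified inscription into the user prompt might look like the sketch below. The instruction sentence is the one quoted above; the function name and prompt layout are hypothetical:

```python
def build_user_prompt(base_request: str, verified_inscription=None) -> str:
    """Append the database's human-verified inscription, plus the
    instruction telling the model it must be incorporated."""
    if not verified_inscription:
        return base_request
    return (
        f"{base_request}\n"
        "The object pictured contains a verified inscription, "
        "which should always be incorporated into your response.\n"
        f"Verified inscription: {verified_inscription}"
    )
```

For the copper sulphate example, `build_user_prompt("Describe the object pictured.", "Scientific and Chem.")` would carry the correct inscription into the request instead of leaving the model to misread the partially visible text.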

The AI response

Initially, we ran 100 objects through this Gen AI description process. In total, it cost $1.25 NZD and took 15 minutes to complete. So each description cost just over one cent and took around nine seconds to generate.
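The per-description figures follow directly from the totals:

```python
total_cost_nzd = 1.25   # total spend for the batch
total_time_s = 15 * 60  # 15 minutes, in seconds
n_objects = 100

cost_per_description = total_cost_nzd / n_objects  # 0.0125 NZD, i.e. 1.25 cents
time_per_description = total_time_s / n_objects    # 9 seconds
```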

For this blog we’ll look at one example badge from the Silversmith’s collection.

A split image with the left side showing a golden coloured badge with an insignia pressed into it, and the right side showing the back view of the medal which is orange and has the words New Zealand Badges made from original dies Limited Edition Artillery Circa 1901.
Badge, made by Cimino Jewellers; manufacturer(s); about 1901. Purchased 1997 with New Zealand Lottery Grants Board funds. Te Papa (GH005881/63)

Firstly, as a comparison, here is a Gen AI description created by just supplying the front image of the badge and no other contextual information. The AI model responded:

“The object in the image is a rectangular, metallic badge or medallion with a polished surface that reflects light. The overall shape is slightly larger than a standard playing card and has rounded corners, giving it a smooth appearance. At the center of this piece is an emblem featuring a stylized cannon wheel, which symbolizes artillery. This wheel sits prominently above two crossed banners that curve around its base. Above the wheel rests an ornate crown, indicating royal association or honor.”

We then added a second image and contextual information from our database (title, production details, inscriptions, etc.). We also told it to spell in New Zealand English. The AI model’s response became far more focussed:

“The object is a badge, part of a silversmith’s tool set, crafted by Cimino Jewellers in New Zealand around 1901. It features a rectangular shape with rounded corners and has both metal and paper components. The front side showcases an embossed design that includes the emblem of artillery—a wheel encircled by decorative elements—topped with a crown. Below this central motif is the inscription “Royal NZ Artillery,” elegantly displayed on a ribbon-like banner. On its reverse side, there’s an orange paper backing which bears printed text stating “NEW ZEALAND BADGES,” along with additional information about being made from original dies and indicating it as part of a limited-edition series related to artillery dated circa 1901. The edges are slightly worn but maintain their overall structure without significant damage visible.”

While the AI model produced a far more focussed response, was it accurate, and was it useful?

The Assessment

For the 100 descriptions, we conducted an assessment process, focusing on:

  • Accuracy of the description and inscriptions
  • AI padding, or what ChatGPT refers to as “conversational fluff”
  • AI speculation and bias

Each of these factors impacts trust and transparency – two principles at the core of everything we do at Te Papa.

We found that 100% of the descriptions required some level of human intervention. In 80% of cases, this involved minor tweaks, such as removing unnecessary words or sentences.

On average it took us around two minutes to assess and correct each description.

For example, the AI generated description for a well-worn pattern book speculated that “This item carries an aura of history through its physical condition – a testament to its former use by individuals such as Isaac Betteridge and Mr Frank Betteridge – and invites curiosity about what knowledge it might contain within its fragile pages.”

A very old hardback book with tattered pages and cover is sitting on a white surface at a slight angle.
Pattern Book. Purchased 1997 with New Zealand Lottery Grants Board funds. Te Papa (GH006614)

We kept the references to Isaac and Frank Betteridge but took out the suggestion of an ‘aura of history’. The resulting description is a bit less poetic, but more concise.

The remaining 20% required corrections to false information or verification of details from the database. For example, in some instances, we had to review the high-resolution image of the object to confirm an inscription.

Katie observed that the amount of conversational fluff varied depending on the type of collection object and the amount of contextual information provided. Visually interesting objects tended to have fewer issues. However, as Gareth remarked during a later debrief, while humans might perceive some text as “fluff,” it may contain valuable keywords that computers and search engines use when indexing websites.

The assessments also revealed inconsistencies in our collections data, which we have since been able to correct.

At this stage of the pilot project, we found that all descriptions required some level of human intervention before they could be considered ready for publication. But as Katie reflected, “It’s easier to cut and delete a sentence than to write an entire description myself. It’s certainly not ‘set and forget,’ but it’s fun, easy, and definitely worthwhile.”

So, the big question for us: are there ways we can reduce the human time spent checking and correcting the descriptions?

We think there are. We thought about grouping collection items together by object type (e.g. all of the chisels), so that we could be even more explicit in the prompts to help the AI model. And then we thought of adding another exciting step in the process … but we’ll cover that in the next blog.
