
Semantic Librarian

A free AI-powered search engine for Australia's public archives.

One search across 11 Australian archive APIs simultaneously — Trove, PROV, Museums Victoria, NMA, and more. Semantic search finds records that keyword search misses, across hundreds of institutions and hundreds of millions of records. Free, non-commercial, attribution-first.

R&D: Research in progress, not available yet

How Semantic Librarian works

From question to discovery in one search.

What you can search

11 APIs. Hundreds of institutions. Hundreds of millions of records.

Each API is a gateway to dozens — sometimes thousands — of contributing organisations, datasets, and collections.

National Institutions

  • National Library of Australia
  • National Archives of Australia via Trove
  • National Museum of Australia
  • Australian War Memorial via Trove
  • National Gallery of Australia via Trove
  • National Film & Sound Archive via Trove
  • AIATSIS via Trove
  • CSIRO via Trove + ALA

State Libraries

  • State Library of NSW via Trove
  • State Library Victoria via Trove
  • State Library of Queensland via Trove
  • State Library of SA via Trove
  • State Library of WA via Trove
  • Libraries Tasmania via Trove

Museums

  • Melbourne Museum
  • Scienceworks
  • Immigration Museum
  • Royal Exhibition Building
  • Australian Museum via Trove + ALA
  • Powerhouse / MAAS via Trove
  • Queensland Museum via Trove + ALA

Government & Heritage

  • Public Record Office Victoria
  • Heritage Council of Victoria
  • Geoscience Australia
  • Dept of PM&C
  • State Records NSW via Trove
  • Parliament of Australia via Trove

Science & Biodiversity

  • Atlas of Living Australia
  • BirdLife Australia via ALA
  • iNaturalist Australia via ALA
  • Australasian Virtual Herbarium via ALA

Digital Humanities

  • ACMI
  • TLCMap / GHAP
  • AusStage via TLCMap
  • Colonial Frontier Massacres via TLCMap

Types of records

  • Newspaper articles (1803–present)
  • Historical photographs & maps
  • Museum objects & artefacts
  • Government archives
  • Heritage places & shipwrecks
  • Aerial photography (1928–1996)
  • Species & biodiversity records
  • Historical placenames
  • PM speeches & transcripts
  • Film, TV & videogames
  • Court & immigration records
  • Indigenous cultural materials

Plus hundreds more

  • 1,500+ Trove contributing organisations
  • 9,757 ALA biodiversity datasets
  • Hundreds of TLCMap community layers
  • 37+ university repositories

What Semantic Librarian includes

  • Semantic search, not keyword matching

    Search by meaning, not exact words. 'Photos of Melbourne in the 1950s' finds records described as 'aerial photograph, City of Melbourne, 1952', even though neither 'photos' nor '1950s' appears in the catalogue entry. Historical language, inconsistent cataloguing, and spelling variations are no longer barriers.

  • One search, eleven archives

    A single query searches Trove, PROV, Museums Victoria, NMA, VHD, ALA, ACMI, Geoscience Australia, GHAP, PM Transcripts, and IIIF collections simultaneously — from Trove's 1,500+ partners to ALA's 9,757 datasets. No more opening 11 tabs or learning 11 different search interfaces.

  • Attribution built into the data model

    Every result carries its source, licence, and attribution text. CC-BY compliance is structural — the _attribution array in search responses ensures downstream consumers can display proper credit automatically.

  • A card catalogue, not a library

    Semantic Librarian indexes metadata only — titles, dates, descriptions, coordinates, subjects, thumbnails. It never stores full content. Every result links back to the original source for the actual document, image, or record.
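
To make "search by meaning" concrete, here is a minimal Python sketch of ranking catalogue descriptions by cosine similarity between vectors. It is an illustration only, not Semantic Librarian's implementation: the hand-written `LEXICON` and `CONCEPTS` are hypothetical stand-ins for a trained embedding model, and the names `embed`, `cosine`, and `search` are assumptions made for this example.

```python
import math

# Hypothetical toy "embedding": each known word maps to weights over a few
# hand-picked concepts. A real system would use a trained embedding model.
CONCEPTS = ("image", "melbourne", "1950s", "newspaper")
LEXICON = {
    "photos": {"image": 1.0},
    "photograph": {"image": 1.0},
    "aerial": {"image": 0.5},
    "melbourne": {"melbourne": 1.0},
    "1950s": {"1950s": 1.0},
    "1952": {"1950s": 1.0},
    "article": {"newspaper": 1.0},
    "gazette": {"newspaper": 1.0},
}


def embed(text):
    """Turn text into a concept vector by summing the weights of known words."""
    vec = [0.0] * len(CONCEPTS)
    for word in text.lower().replace(",", " ").split():
        for concept, weight in LEXICON.get(word, {}).items():
            vec[CONCEPTS.index(concept)] += weight
    return vec


def cosine(a, b):
    """Cosine similarity of two vectors; 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def search(query, records):
    """Return records ordered from most to least similar to the query."""
    q = embed(query)
    return sorted(records, key=lambda r: cosine(q, embed(r)), reverse=True)


QUERY = "Photos of Melbourne in the 1950s"
RECORDS = [
    "Government gazette article, Sydney, 1921",
    "aerial photograph, City of Melbourne, 1952",
]
ranked = search(QUERY, RECORDS)
```

In this toy setup the aerial photograph ranks first because 'photos' and 'photograph' share the same concept, as do '1950s' and '1952', which is the kind of match a plain keyword search would not make.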

Transparency

Metadata Only

Indexes titles, dates, and descriptions. Never stores full content from archives.
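
One way to picture the metadata-only rule is as an allow-list applied before anything reaches the index. The Python sketch below assumes hypothetical field names (`title`, `date`, `source_url`, `full_text`, and so on); the real schema is not described on this page.

```python
# Hypothetical allow-list of metadata fields. The names are assumptions for
# illustration; "source_url" stands for the link back to the original record.
METADATA_FIELDS = (
    "title",
    "date",
    "description",
    "coordinates",
    "subjects",
    "thumbnail",
    "source_url",
)


def to_index_entry(raw_record):
    """Keep only allow-listed metadata; full content never enters the index."""
    return {key: raw_record[key] for key in METADATA_FIELDS if key in raw_record}


# Made-up record, as it might arrive from an archive API.
raw = {
    "title": "Example newspaper article",
    "date": "1925-03-14",
    "description": "Report on a local council meeting.",
    "full_text": "The complete article text would be here...",
    "source_url": "https://example.org/record/123",
}
entry = to_index_entry(raw)
```

Because the filter is an allow-list rather than a block-list, any new field an archive adds later is left out unless it is explicitly approved.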

Attribution-First

Every result carries its source, licence, and attribution text. CC-BY compliance is structural.
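
As a rough illustration of what "structural" attribution can mean, the Python sketch below builds a response whose `_attribution` array is derived from the results themselves, so credit cannot be left out by accident. Only the `_attribution` name comes from this page; the per-result keys (`source`, `licence`, `attribution`), the helper names, and the sample data are assumptions.

```python
def build_response(results):
    """Attach a de-duplicated _attribution array derived from the results."""
    seen = set()
    attribution = []
    for result in results:
        key = (result["source"], result["licence"], result["attribution"])
        if key not in seen:
            seen.add(key)
            attribution.append(
                {
                    "source": result["source"],
                    "licence": result["licence"],
                    "text": result["attribution"],
                }
            )
    return {"results": results, "_attribution": attribution}


def credit_lines(response):
    """Format one display-ready credit line per attributed source."""
    return [f'{a["text"]} ({a["licence"]})' for a in response["_attribution"]]


# Made-up results from two placeholder sources.
sample = [
    {"title": "Record 1", "source": "Example Archive A",
     "licence": "CC-BY 4.0", "attribution": "Courtesy of Example Archive A"},
    {"title": "Record 2", "source": "Example Archive A",
     "licence": "CC-BY 4.0", "attribution": "Courtesy of Example Archive A"},
    {"title": "Record 3", "source": "Example Museum B",
     "licence": "CC-BY 4.0", "attribution": "Courtesy of Example Museum B"},
]
response = build_response(sample)
credits = credit_lines(response)
```

A downstream consumer only needs to render `credit_lines(response)` next to the results to show proper credit for every source that contributed.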

Non-Commercial

Free research tool. No paid tiers, no premium features, no revenue model.

Good Neighbour

Polite rate limiting, descriptive headers, and respect for institutional policies.
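
A minimal Python sketch of what polite rate limiting and descriptive headers could look like, assuming a fixed minimum gap between requests to the same host. The class name, the interval, and the User-Agent wording are hypothetical; each institution's actual limits and policies take precedence.

```python
import time


class PoliteLimiter:
    """Enforce a minimum gap between consecutive requests to the same host."""

    def __init__(self, min_interval, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = min_interval
        self._clock = clock  # injectable so the limiter can be tested
        self._sleep = sleep
        self._last = {}  # host -> time the last request was allowed

    def wait(self, host):
        """Sleep if needed before the next request; return seconds waited."""
        now = self._clock()
        last = self._last.get(host)
        delay = 0.0
        if last is not None:
            delay = max(0.0, self.min_interval - (now - last))
        if delay > 0:
            self._sleep(delay)
        self._last[host] = now + delay
        return delay


def polite_headers(tool_name, contact):
    """Build a User-Agent that tells the archive who is calling and why."""
    return {
        "User-Agent": f"{tool_name} (non-commercial research; contact: {contact})"
    }


headers = polite_headers("SemanticLibrarian", "contact@example.org")
```

Calling `limiter.wait(host)` before each request keeps traffic to any one archive spaced out, while requests to different hosts are not held up by each other.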

Get notified when Semantic Librarian is ready

Leave your details and I'll let you know when it's available.

No spam, ever. Privacy Policy

"I kept falling down rabbit holes on Trove — reading old newspaper articles from the 1920s, browsing PROV government archives from 150 years ago, exploring heritage buildings on VHD. But finding things was painful. Each archive had its own search, its own quirks, its own pagination. I'd spend an hour searching one site, then start again on the next. The collections are extraordinary. The search experience is not."

Nathan

Who It's For

🎓

Academic Researcher

"How do I find primary sources across multiple archives for my thesis?"

Researchers who need to search comprehensively across Australian archives for a specific topic, period, or region.

You need: Cross-archive discovery that finds records keyword search misses.

🔍

History Enthusiast

"What happened in my local area 100 years ago?"

History buffs who explore out of curiosity — browsing old newspaper articles, looking up heritage places, discovering photographs of their town.

You need: A single search that finds the interesting stuff hidden across multiple platforms.

🌳

Genealogist

"How do I find records of my ancestors across different archives?"

Family history researchers tracing lineage through immigration records, court documents, newspaper mentions, and government archives.

You need: Cross-referencing records across archives that would take hours to search individually.

📚

Educator or Student

"Where can I find primary sources for my Australian history assignment?"

Teachers and students looking for original documents, photographs, and artefacts to support learning and assignments.

You need: Easy access to verified, attributable primary sources without navigating 11 separate platforms.

Questions about Semantic Librarian

What is Semantic Librarian?
Semantic Librarian is a free, non-commercial research tool that gives AI a unified metadata library across 11 Australian historical archive platforms. It uses semantic search to help you find historical images, newspaper articles, documents, maps, and records — without manually searching each site one by one.
Why is it called Semantic Librarian?
The name describes exactly what it does — a librarian that understands meaning, not just keywords. 'Semantic' because it uses AI to search by meaning rather than exact word matching. 'Librarian' because it knows every collection, every catalogue, and can point you to things you didn't know existed. It doesn't create or store the works — it helps you discover what's out there, then sends you to the original source.
Which archives does it search?
Semantic Librarian connects to 11 archive APIs, but each API is a gateway to many more institutions. Trove alone aggregates content from 1,500+ contributing organisations including every state library, the National Archives, the Australian War Memorial, and dozens of universities. The Atlas of Living Australia draws from 9,757 datasets across hundreds of museums, herbaria, and citizen science platforms. In total, Semantic Librarian gives you unified search across hundreds of institutions and hundreds of millions of records.
Is Semantic Librarian available to use?
Not yet as a public tool. Semantic Librarian is in the workshop — I'm using it personally and validating that the search results are genuinely useful across all 11 sources. There's no public release planned until the tool consistently helps people find things they couldn't find before.
Does Semantic Librarian store the actual content from archives?
No. Semantic Librarian indexes metadata only — titles, dates, descriptions, coordinates, subjects, and thumbnails. It never stores full article text, high-resolution images, or complete documents. Every result links back to the original source for the actual content.
What types of records can Semantic Librarian search?
Newspaper articles from 1803 to the present, historical photographs and maps, museum objects and artefacts, government archives and correspondence, heritage places and shipwrecks, historical aerial photography from 1928 to 1996, species and biodiversity records, historical placenames with coordinates, Prime Ministerial speeches and transcripts, films, TV programmes, videogames, court and immigration records, and Aboriginal and Torres Strait Islander cultural materials.
How many records are searchable?
The 11 APIs provide access to hundreds of millions of records. Trove alone aggregates approximately 308 million records including 254.6 million newspaper and gazette articles. The Atlas of Living Australia holds 101.5 million species occurrence records. Geoscience Australia has 1.2 million historical aerial photographs. Museums Victoria holds 17.2 million items. The actual number searchable through Semantic Librarian's semantic index is smaller — currently around 2 million indexed records with metadata — but every result links back to the full record at the original source.
Why share this if it's not available?
The problem of siloed Australian archives is real and widely felt. Sharing the approach — semantic search across unified metadata, with proper attribution — might be useful to others thinking about similar challenges. And the principles behind it (metadata-only, attribution-first, non-commercial) are worth discussing openly.