Semantic Librarian
A free AI-powered search engine for Australia's public archives.
One search across 11 Australian archive APIs simultaneously — Trove, PROV, Museums Victoria, NMA, and more. Semantic search finds records that keyword search misses, across hundreds of institutions and hundreds of millions of records. Free, non-commercial, attribution-first.
How Semantic Librarian works
From question to discovery in one search.
Ask a question in plain English. Semantic Librarian searches 11 Australian archive APIs simultaneously, including Trove, Museums Victoria, NMA, PROV, and Geoscience Australia. Results are ranked by semantic relevance, and each one links back to the original source with full attribution.
Ask
Plain English, not keywords
Search by meaning — historical language, inconsistent cataloguing, and spelling variations are no longer barriers
Search
11 archives today, expanding to 12+ in 2026
One query, every archive — no more opening 11 tabs
Discover
Ranked results with attribution
Every result links to the original source
One question. Eleven archives. Hundreds of millions of records.
What you can search
11 APIs. Hundreds of institutions. Hundreds of millions of records.
Each API is a gateway to dozens — sometimes thousands — of contributing organisations, datasets, and collections.
National Institutions
- National Library of Australia
- National Archives of Australia via Trove
- National Museum of Australia
- Australian War Memorial via Trove
- National Gallery of Australia via Trove
- National Film & Sound Archive via Trove
- AIATSIS via Trove
- CSIRO via Trove + ALA
State Libraries
- State Library of NSW via Trove
- State Library Victoria via Trove
- State Library of Queensland via Trove
- State Library of SA via Trove
- State Library of WA via Trove
- Libraries Tasmania via Trove
Museums
- Melbourne Museum
- Scienceworks
- Immigration Museum
- Royal Exhibition Building
- Australian Museum via Trove + ALA
- Powerhouse / MAAS via Trove
- Queensland Museum via Trove + ALA
Government & Heritage
- Public Record Office Victoria
- Heritage Council of Victoria
- Geoscience Australia
- Dept of PM&C
- State Records NSW via Trove
- Parliament of Australia via Trove
Science & Biodiversity
- Atlas of Living Australia
- BirdLife Australia via ALA
- iNaturalist Australia via ALA
- Australasian Virtual Herbarium via ALA
Digital Humanities
- ACMI
- TLCMap / GHAP
- AusStage via TLCMap
- Colonial Frontier Massacres via TLCMap
Types of records
Plus hundreds more
What Semantic Librarian includes
Semantic search, not keyword matching
Search by meaning, not exact words. 'Photos of Melbourne in the 1950s' finds records described as 'aerial photograph, City of Melbourne, 1952', even though neither 'photos' nor '1950s' appears in the description. Historical language, inconsistent cataloguing, and spelling variations are no longer barriers.
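As a rough illustration of ranking by meaning, the sketch below scores records by cosine similarity between vectors. It assumes queries and records have already been mapped to embedding vectors; the three-dimensional vectors are invented for illustration (a real embedding model produces far more dimensions), and `cosine` and `rank` are hypothetical helper names, not Semantic Librarian's own code.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def rank(query_vec, records):
    """Return record titles ordered from most to least similar to the query."""
    scored = [(cosine(query_vec, vec), title) for title, vec in records.items()]
    return [title for _, title in sorted(scored, reverse=True)]

# Toy vectors, invented for illustration. The axes stand loosely for
# (photography, Melbourne, mid-20th century).
RECORDS = {
    "aerial photograph, City of Melbourne, 1952": [0.9, 0.8, 0.9],
    "Sydney Harbour Bridge construction report, 1930": [0.3, 0.0, 0.1],
    "Melbourne tram timetable, 1987": [0.0, 0.9, 0.1],
}
QUERY = [0.8, 0.9, 0.8]  # stands in for "Photos of Melbourne in the 1950s"

ranking = rank(QUERY, RECORDS)
print(ranking[0])
```

The aerial photograph ranks first because its vector points in nearly the same direction as the query, regardless of which exact words the catalogue description uses.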
One search, eleven archives
A single query searches Trove, PROV, Museums Victoria, NMA, VHD, ALA, ACMI, Geoscience Australia, GHAP, PM Transcripts, and IIIF collections simultaneously — from Trove's 1,500+ partners to ALA's 9,757 datasets. No more opening 11 tabs or learning 11 different search interfaces.
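One way to picture a single query fanning out to several archives at once is the sketch below. It is not the project's implementation: `fake_adapter` is a hypothetical stand-in that sleeps instead of calling a real API, and the source names and delays are illustrative only.

```python
import asyncio

async def fake_adapter(source, query, delay):
    """Stand-in for one archive API call; a real adapter would make an HTTP request."""
    await asyncio.sleep(delay)
    return [{"source": source, "title": f"{query} ({source})"}]

async def search_all(query, sources):
    """Query every source concurrently and merge whatever comes back."""
    tasks = [fake_adapter(name, query, delay) for name, delay in sources.items()]
    # return_exceptions=True keeps one failing archive from sinking the whole search.
    batches = await asyncio.gather(*tasks, return_exceptions=True)
    merged = []
    for batch in batches:
        if not isinstance(batch, BaseException):
            merged.extend(batch)
    return merged

# Hypothetical per-source latencies in seconds, for illustration only.
SOURCES = {"Trove": 0.03, "PROV": 0.01, "Museums Victoria": 0.02}
results = asyncio.run(search_all("gold rush", SOURCES))
```

Because the calls run concurrently, the total wait is roughly that of the slowest source rather than the sum of all of them.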
Attribution built into the data model
Every result carries its source, licence, and attribution text. CC-BY compliance is structural — the _attribution array in search responses ensures downstream consumers can display proper credit automatically.
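A downstream consumer might turn the `_attribution` array into display-ready credit lines along these lines. This is a sketch under assumptions: only the `_attribution` name comes from the description above, while the key names inside each entry (`source`, `licence`, `text`), the sample response, and the `credit_lines` helper are hypothetical.

```python
def credit_lines(response):
    """Build one display-ready credit line per entry in the _attribution array."""
    lines = []
    for entry in response.get("_attribution", []):
        lines.append(f'{entry["text"]} ({entry["source"]}, {entry["licence"]})')
    return lines

# Hypothetical response shape; the key names inside each entry are assumptions.
SAMPLE_RESPONSE = {
    "results": [{"title": "Example record"}],
    "_attribution": [
        {
            "source": "Example Institution",
            "licence": "CC BY 4.0",
            "text": "Provided by Example Institution",
        },
    ],
}

credits = credit_lines(SAMPLE_RESPONSE)
print(credits[0])
```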
A card catalogue, not a library
Semantic Librarian indexes metadata only — titles, dates, descriptions, coordinates, subjects, and thumbnail URLs (hot-linked, never re-hosted). It never stores image binaries, full-text, or audio. Every result links back to the original source for the actual document, image, or record.
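The metadata-only rule can be pictured as an allowlist applied before anything reaches the index. The sketch below is illustrative, not the project's code: the field names in `ALLOWED_FIELDS`, the `to_index_record` helper, and the sample record are assumptions based on the field types listed above.

```python
# Assumed field names, modelled on the metadata types listed above.
ALLOWED_FIELDS = {
    "title", "date", "description", "coordinates",
    "subjects", "thumbnail_url", "source_url",
}

def to_index_record(raw):
    """Keep only allowlisted metadata fields; everything else is dropped."""
    return {key: value for key, value in raw.items() if key in ALLOWED_FIELDS}

# Hypothetical upstream record containing content that must not be stored.
raw_record = {
    "title": "Example newspaper article",
    "date": "1925-03-14",
    "source_url": "https://example.org/record/1",
    "articleText": "Full text that should never be indexed",
    "image_bytes": b"\x89PNG",
}

indexed = to_index_record(raw_record)
```

An allowlist fails safe: a field the upstream API adds later is excluded by default until someone decides it is metadata.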
Transparency
Metadata Only — Never Re-Hosting
Indexes titles, dates, and descriptions. Thumbnail URLs are hot-linked back to the source institution; no images, audio, or full-text are ever re-hosted.
Attribution-First
Every result carries its source, licence, and attribution text. CC-BY compliance is structural.
Non-Commercial — forever
Free research tool. No subscriptions, no paid tiers, no for-sale offerings — ever. If running costs ever justify it, the only commercial-adjacent move that's permitted is registering as a not-for-profit purely to receive donations toward infrastructure. The user-facing service stays identical either way.
Good Neighbour
Polite rate limiting, descriptive headers, and respect for institutional policies.
Takedown-Respecting
Honours per-record removal requests within 30 days, in line with source institutions' content policies. Cultural-sensitivity protocols apply for Indigenous content (per AIATSIS guidance and individual community wishes).
How we work with source institutions
Each source institution's API Terms of Use or open-data licence governs how we use their data. The commitments below apply across the board.
- Apply formally where required. Trove (NLA), AWM (in flight 2026-04), and WA Museum Sandbox (in flight 2026-04) require API keys via formal application. The project commits to whatever rate limits, attribution, content scope, and removal SLAs each institution requires.
- Index metadata only. No articleText from Trove (per Trove §4.4(e), enforced after the February 2025 Sherratt incident). No image binaries from AWM (per AWM §Copyright). No oral-history audio from anywhere. Thumbnails are hot-linked URLs, never re-hosted.
- Honour takedowns within 30 days. Per AWM §Removal of content. The same policy applies as a courtesy across all sources.
- Respect rate limits. Per-source sliding-window rate limiter, daily and weekly budgets, a 1,100 ms politeness floor, and exponential backoff on HTTP 429 responses. A pre-commit linter blocks Trove-incompatible content classes from ever entering the index.
- Attribute every record. Source name, licence, and a link to the original record on every search result. Many results carry the "Provided by [Institution]" logo where the institution provides one.
- Cultural-sensitivity protocols. Indigenous content carries record-level advisories where applicable, in line with AIATSIS guidance and individual community wishes.
- Non-commercial only, forever. Required by some source ToUs (notably AWM §Commercial Use); aligned with the project's own permanent commitment regardless. The only future commercial-adjacent move that's permitted is registering as a not-for-profit purely to receive donations toward infrastructure costs.
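The rate-limit commitment in the list above can be sketched as follows. This is a minimal illustration, not the project's limiter: the `PoliteLimiter` class, the `backoff_delay` helper, the window size, and the 60-second cap are assumptions, and the daily and weekly budgets are not modelled. Timestamps are passed in explicitly so the logic can be checked without sleeping.

```python
from collections import deque

class PoliteLimiter:
    """Sliding-window limiter with a minimum gap between requests."""

    def __init__(self, max_requests, window_s, floor_s=1.1):
        self.max_requests = max_requests
        self.window_s = window_s
        self.floor_s = floor_s   # 1,100 ms politeness floor
        self.sent = deque()      # timestamps of requests still inside the window

    def wait_time(self, now):
        """Seconds to wait before a request may be sent at time `now`."""
        # Drop timestamps that have slid out of the window.
        while self.sent and now - self.sent[0] >= self.window_s:
            self.sent.popleft()
        wait = 0.0
        if self.sent:
            # Enforce the minimum gap since the most recent request.
            wait = max(wait, self.sent[-1] + self.floor_s - now)
        if len(self.sent) >= self.max_requests:
            # Window is full: wait until the oldest request expires.
            wait = max(wait, self.sent[0] + self.window_s - now)
        return wait

    def record(self, now):
        """Note that a request was sent at time `now`."""
        self.sent.append(now)

def backoff_delay(attempt, base_s=1.1, cap_s=60.0):
    """Delay after the Nth consecutive HTTP 429 (attempt starts at 0), capped."""
    return min(cap_s, base_s * 2 ** attempt)

# Hypothetical budget: at most 2 requests in any 10-second window.
limiter = PoliteLimiter(max_requests=2, window_s=10.0)
```

A caller would check `wait_time` before each request, sleep for that long, then call `record`; after a 429 it would sleep for `backoff_delay(attempt)` and retry with `attempt` incremented.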
If you're an institutional partner with a question about how Semantic Librarian uses your data, contact nathan@littlebearapps.com.
Get notified when Semantic Librarian is ready
Leave your details and I'll let you know when it's available.
"I kept falling down rabbit holes on Trove — reading old newspaper articles from the 1920s, browsing PROV government archives from 150 years ago, exploring heritage buildings on VHD. But finding things was painful. Each archive had its own search, its own quirks, its own pagination. I'd spend an hour searching one site, then start again on the next. The collections are extraordinary. The search experience is not."
Who It's For
Academic Researcher
"How do I find primary sources across multiple archives for my thesis?"
Researchers who need to search comprehensively across Australian archives for a specific topic, period, or region.
You need: Cross-archive discovery that finds records keyword search misses.
History Enthusiast
"What happened in my local area 100 years ago?"
History buffs who explore out of curiosity — browsing old newspaper articles, looking up heritage places, discovering photographs of their town.
You need: A single search that finds the interesting stuff hidden across multiple platforms.
Genealogist
"How do I find records of my ancestors across different archives?"
Family history researchers tracing lineage through immigration records, court documents, newspaper mentions, and government archives.
You need: Cross-referencing records across archives that would take hours to search individually.
Educator or Student
"Where can I find primary sources for my Australian history assignment?"
Teachers and students looking for original documents, photographs, and artefacts to support learning and assignments.
You need: Easy access to verified, attributable primary sources without navigating 11 separate platforms.
Questions about Semantic Librarian
What is Semantic Librarian?
Why is it called Semantic Librarian?
Which archives does it search?
Is Semantic Librarian available to use?
Does Semantic Librarian store the actual content from archives?
What types of records can Semantic Librarian search?
How many records are searchable?
Why share this if it's not available?