Prodigy - ArchivesSpace Integration and Automation
Data flow: ArchivesSpace to Prodigy
ArchivesSpace can export collection descriptions, finding aids, item titles, subject terms, and container metadata into Prodigy for structured text annotation. AI teams can label archival descriptions for entities such as people, organizations, places, dates, and collection themes to train NLP models that improve search, auto-suggest tags, and semantic discovery across archival records.
Business value: Faster retrieval of archival content, better metadata consistency, and improved user search experience for researchers and internal staff.
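A minimal sketch of the export step described above, assuming the records have already been fetched from ArchivesSpace as JSON dictionaries. Prodigy reads JSONL input in which each line is a task with a "text" field and an optional "meta" dictionary. The record shape used here (a flat "scope_note" field next to "title" and "uri") is a simplified, hypothetical one, and the helper names are illustrative only.

```python
import json

def record_to_task(record):
    """Turn one simplified record dict into a Prodigy-style task."""
    # Hypothetical fields: real ArchivesSpace records nest notes more deeply.
    parts = [record.get("title", ""), record.get("scope_note", "")]
    text = " ".join(p.strip() for p in parts if p and p.strip())
    # Keep the record URI in "meta" so annotations can be traced back later.
    return {"text": text, "meta": {"uri": record.get("uri")}}

def to_jsonl(records):
    """Serialize tasks as one JSON object per line, skipping empty texts."""
    tasks = (record_to_task(r) for r in records)
    return "\n".join(json.dumps(t, ensure_ascii=False) for t in tasks if t["text"])

sample_records = [
    {"uri": "/repositories/2/resources/1",
     "title": "Jane Doe Papers",
     "scope_note": "Correspondence, 1901-1923."},
    {"uri": "/repositories/2/resources/2", "title": "", "scope_note": ""},
]

jsonl = to_jsonl(sample_records)
print(jsonl)
```

Carrying the URI in "meta" is a design choice rather than a requirement: it lets later steps join annotations back to the originating record without re-matching on text.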
Data flow: ArchivesSpace to Prodigy to ArchivesSpace
Archival description records from ArchivesSpace can be sent to Prodigy for labeling by archivists and subject matter experts. The resulting training data can be used to build models that classify records by record type, subject area, sensitivity level, or access restrictions. Predicted labels can then be pushed back into ArchivesSpace as controlled metadata fields or review queues.
Business value: Reduces manual cataloging effort, improves consistency in archival classification, and speeds up processing of backlogs.
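The choice between writing a predicted label back as metadata and sending it to a review queue can be sketched as a confidence triage. This is an illustration under stated assumptions, not a prescribed workflow: the prediction dictionaries, the "score" field, and the 0.9 threshold are all hypothetical, and the actual write-back to ArchivesSpace is left out.

```python
def triage_predictions(predictions, threshold=0.9):
    """Split predictions into auto-apply and human-review lists by score."""
    auto_apply, review_queue = [], []
    for pred in predictions:
        # High-confidence predictions are candidates for direct write-back;
        # everything else waits for an archivist.
        target = auto_apply if pred["score"] >= threshold else review_queue
        target.append(pred)
    return auto_apply, review_queue

sample_predictions = [
    {"uri": "/repositories/2/resources/1", "label": "correspondence", "score": 0.97},
    {"uri": "/repositories/2/resources/2", "label": "restricted", "score": 0.62},
    {"uri": "/repositories/2/resources/3", "label": "photographs", "score": 0.90},
]

auto_apply, review_queue = triage_predictions(sample_predictions)
print(len(auto_apply), "to apply,", len(review_queue), "to review")
```

For labels such as sensitivity level or access restrictions, an institution may prefer to route every prediction to review regardless of score, which corresponds to setting the threshold above 1.0.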
Data flow: ArchivesSpace to Prodigy
Digitized finding aids, correspondence descriptions, and collection notes from ArchivesSpace can be annotated in Prodigy to train named entity recognition models. These models can identify people, institutions, locations, and dates across large archival text corpora, supporting richer indexing and cross-collection linking.
Business value: Enables deeper discovery across collections and reduces the time archivists spend manually identifying recurring entities.
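To illustrate how entity annotations could support cross-collection linking, the sketch below groups accepted spans by label and surface text and lists the records each entity appears in. It assumes Prodigy-style span annotations (character offsets in "start" and "end", a "label", and an "answer" on the example) and assumes the record URI was stored in "meta" at export time; the sample data is invented.

```python
from collections import defaultdict

def build_entity_index(annotations):
    """Map (label, normalized surface text) to the set of record URIs."""
    index = defaultdict(set)
    for eg in annotations:
        if eg.get("answer") != "accept":
            continue  # ignore rejected or skipped examples
        uri = eg.get("meta", {}).get("uri")
        for span in eg.get("spans", []):
            surface = eg["text"][span["start"]:span["end"]]
            index[(span["label"], surface.lower())].add(uri)
    return index

annotations = [
    {"text": "Letters from Jane Doe to the Boston Athenaeum.",
     "spans": [{"start": 13, "end": 21, "label": "PERSON"},
               {"start": 29, "end": 45, "label": "ORG"}],
     "answer": "accept",
     "meta": {"uri": "/repositories/2/resources/1"}},
    {"text": "Diary of Jane Doe, 1910.",
     "spans": [{"start": 9, "end": 17, "label": "PERSON"}],
     "answer": "accept",
     "meta": {"uri": "/repositories/2/resources/2"}},
    {"text": "Jane Doe Street map",
     "spans": [{"start": 0, "end": 8, "label": "PERSON"}],
     "answer": "reject",
     "meta": {"uri": "/repositories/2/resources/3"}},
]

index = build_entity_index(annotations)
# Entities that occur in more than one record are linking candidates.
linked = {key: sorted(uris) for key, uris in index.items() if len(uris) > 1}
print(linked)
```

Lowercasing is a deliberately crude normalization; matching name variants would need authority records or a dedicated entity-linking step.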
Data flow: ArchivesSpace to Prodigy
When ArchivesSpace manages references to digitized photographs, maps, manuscripts, or scanned documents, image assets and associated metadata can be routed into Prodigy for annotation. Teams can label image content, document condition issues, or page-level categories to train computer vision models for quality review, duplicate detection, or content-based retrieval.
Business value: Improves accuracy of digitization workflows, supports scalable quality control, and enhances visual search capabilities for digital archives.
Data flow: ArchivesSpace to Prodigy to ArchivesSpace
Archival descriptions and notes can be annotated in Prodigy to identify sensitive content such as personal data, legal restrictions, donor conditions, or privacy concerns. Models trained on this data can help flag records in ArchivesSpace for review before public release or digitization.
Business value: Lowers compliance risk, supports more reliable public access decisions, and helps prioritize records needing human review.
Data flow: ArchivesSpace to Prodigy to ArchivesSpace
ArchivesSpace subject headings, collection titles, and scope notes can be sampled into Prodigy for annotation against preferred terms, synonyms, and topical categories. The output can be used to train models that recommend standardized subject terms or map inconsistent legacy terminology to a controlled vocabulary.
Business value: Improves metadata quality, increases consistency across collections, and reduces cleanup work during archival processing.
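Before any model is trained, candidate mappings from legacy terms to a controlled vocabulary can be proposed with plain string similarity and presented to annotators as multiple-choice tasks. The sketch below uses difflib from the Python standard library as a stand-in baseline, not the trained recommender the text describes. The vocabulary and helper names are hypothetical; the task layout follows Prodigy's choice format, where "options" is a list of entries with "id" and "text".

```python
import difflib

# Hypothetical controlled vocabulary for illustration.
CONTROLLED_TERMS = ["Correspondence", "Photographs", "Maps", "Diaries"]

def suggest_terms(legacy_term, vocabulary, n=3, cutoff=0.6):
    """Return up to n preferred terms that closely match a legacy term."""
    lowered = {term.lower(): term for term in vocabulary}
    matches = difflib.get_close_matches(
        legacy_term.lower(), list(lowered), n=n, cutoff=cutoff
    )
    return [lowered[m] for m in matches]

def to_choice_task(legacy_term, vocabulary):
    """Build a multiple-choice annotation task from the suggestions."""
    options = [{"id": t, "text": t} for t in suggest_terms(legacy_term, vocabulary)]
    return {"text": legacy_term, "options": options}

task = to_choice_task("correspondance", CONTROLLED_TERMS)
print(task)
```

String similarity only catches spelling variants; synonyms such as "letters" for "Correspondence" would not match, which is the gap the annotated data and a trained model are meant to close.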
Data flow: ArchivesSpace to Prodigy to ArchivesSpace
Incoming research request descriptions or reference notes linked to archival holdings can be annotated in Prodigy to train models that classify request topics, collection relevance, or urgency. The model can then help route requests to the right archivist or suggest likely collections in ArchivesSpace.
Business value: Shortens response times, improves service desk routing, and helps staff match researchers to relevant materials more efficiently.
Data flow: Bi-directional between ArchivesSpace and Prodigy
As archivists review metadata suggestions, access flags, or entity extractions generated from ArchivesSpace content, their corrections can be fed back into Prodigy as new labeled examples. This creates a continuous improvement loop in which the annotation set grows over time and models can be retrained on real archival feedback.
Business value: Supports sustainable AI adoption, keeps models aligned with institutional standards, and reduces rework across metadata and discovery workflows.
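The feedback loop can be sketched as a conversion from archivist review decisions into labeled examples. The review record shape ("predicted_label", "archivist_label") is hypothetical; the output follows the binary text classification layout Prodigy uses, with a "label" and an "answer" of "accept" or "reject" on each example.

```python
def review_to_examples(review):
    """Convert one archivist review decision into labeled examples."""
    base = {"text": review["text"], "meta": {"uri": review["uri"]}}
    predicted = review["predicted_label"]
    final = review["archivist_label"]
    if predicted == final:
        # The archivist confirmed the suggestion.
        return [{**base, "label": predicted, "answer": "accept"}]
    # The archivist corrected it: keep the negative and the positive signal.
    return [
        {**base, "label": predicted, "answer": "reject"},
        {**base, "label": final, "answer": "accept"},
    ]

reviews = [
    {"uri": "/repositories/2/resources/1", "text": "Family letters, 1901-1923.",
     "predicted_label": "PUBLIC", "archivist_label": "PUBLIC"},
    {"uri": "/repositories/2/resources/2", "text": "Medical records of named patients.",
     "predicted_label": "PUBLIC", "archivist_label": "RESTRICTED"},
]

examples = [eg for review in reviews for eg in review_to_examples(review)]
for eg in examples:
    print(eg["label"], eg["answer"], eg["meta"]["uri"])
```

Written out as JSONL, examples like these could be added to an existing dataset, for instance with Prodigy's db-in command, and included in the next training run.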