Rosetta Overview
Transcript of Rosetta Overview
1
Rosetta Overview
3
What is Rosetta?
Rosetta is a complete digital asset management and preservation
solution that addresses the ever-growing need to collect, archive
and preserve the digitally-born and digitized materials stored at
academic institutions, research organizations, and government
institutions, ensuring data integrity and access over time.
4
Agenda
1 The Need
2 The Challenges
3 Rosetta Solutions
4 Data Model
5 Who’s Using Rosetta and How
The Need
6
Need for Digital Preservation
Today’s world is digital. If a file can’t be opened, the likely reasons are:
1. Corrupted media
2. Missing rendering application
3. Unidentified file format
7
Need for Digital Preservation
All kinds of institutions must preserve and provide long-term access to information
Legal Documents
Website Archives
Medical Records
Research Data
Cultural Heritage
Audiovisual
Digitized Collections
Museums
The Challenges
9
Challenges
Active preservation principles:
1) Ensuring bit integrity
2) Ensuring content health
• Format viability
• Complete metadata
• Provenance
3) OAIS-compliant system
10
Challenge 1: Bit Integrity
• Fixity checks determine whether data has changed or become corrupted
• Basic feature found in asset management as well as preservation solutions
• Does not guarantee data access, only that the data has not changed
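As a rough illustration of the fixity check described on this slide, the sketch below records a digest for a file and later recomputes it to see whether the bytes have changed. The function names `checksum` and `fixity_ok` are hypothetical and are not part of Rosetta, and the choice of SHA-256 is an assumption.

```python
import hashlib


def checksum(path, algorithm="sha256", chunk_size=65536):
    """Return the hex digest of the file at `path`, reading it in chunks."""
    digest = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        # Read fixed-size chunks so large files do not have to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def fixity_ok(path, recorded_digest, algorithm="sha256"):
    """Return True if the file still matches the digest recorded earlier."""
    return checksum(path, algorithm) == recorded_digest
```

A matching digest only shows that the bits are unchanged; it says nothing about whether the format can still be identified or rendered.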
11
Challenge 2: Content Health
• Formats evolve rapidly and become obsolete
• File access requirements:
  • Positive ID of format, e.g. PDF
  • Software application, e.g. Acrobat Reader
• Complete metadata:
  • Technical metadata (e.g. size, resolution, compression)
  • Descriptive metadata (e.g. author, title, publisher)
  • Provenance metadata
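To make "positive ID of format" concrete, here is a minimal sketch that identifies a file by its leading bytes rather than by its extension. The `SIGNATURES` table and the `identify` function are hypothetical names used for illustration; a format registry holds many more formats and richer patterns than this short list.

```python
from typing import Optional

# A few well-known leading byte signatures ("magic numbers").
# Illustrative only: this is not a format registry.
SIGNATURES = [
    (b"%PDF-", "PDF"),
    (b"\x89PNG\r\n\x1a\n", "PNG"),
    (b"\xff\xd8\xff", "JPEG"),
    (b"II*\x00", "TIFF"),  # little-endian TIFF
    (b"MM\x00*", "TIFF"),  # big-endian TIFF
]


def identify(data: bytes) -> Optional[str]:
    """Return a format name if the data starts with a known signature."""
    for magic, name in SIGNATURES:
        if data.startswith(magic):
            return name
    return None  # unidentified format


print(identify(b"%PDF-1.4 example"))
```

A file that returns `None` here corresponds to the "unidentified file format" case: the bits may be intact, but no rendering application can be chosen.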
12
Challenge 3: OAIS Compliant System – The Model
Rosetta Solutions
14
OAIS Compliant System – Rosetta
15
Rosetta Solutions - Key Features
Scalable
Open & Integrative
Ready-to-use Configuration
Community-Driven Knowledge Base
Active Preservation
Flexible Delivery
16
Rosetta Solutions – Community Knowledge Base
• Library of formats with metadata and extraction tools
• Based on the PRONOM global library
• Formats associated with applications and risks
• Supports integration with a global library
• Format library updated automatically with each software version
17
Rosetta Solutions - Active Preservation
• Manages the preservation planning process from risk to action
• Allows evaluation and comparison of alternatives
• Based on best practices and recommended workflows
• Community knowledge sharing
Diagram labels: Identify, Evaluate, Execute; Permanent Storage, Operational Storage, Migration Action, …
18
Rosetta Solutions - Scalable
• Proven scalable architecture capable of ingesting and processing millions of files per day
• Scale out and dedicate servers to particular roles
• Flexible configuration to allow for growth
• Failures handled gracefully to minimize manual intervention
19
Rosetta Ingest Module – Manual Deposits
20
Rosetta Solutions - Open & Integrative
Rosetta
Submission Apps
ILS/CMS Systems
Search Engines
Plug-ins (validation, migration, enrichment, etc.)
Storage Abstraction
21
Rosetta Solutions – Submission Applications
• Deposit workflows out of the box
  • Automated (FTP, NFS, etc.)
  • Manual
• SDK (software development kit) with APIs allows building submission tools that interact with the Rosetta deposit module
Automatic Submission App
Publisher (e.g. newspaper)
Rosetta
22
Rosetta Solutions - ILS/CMS Systems
• Synchronization with ILS/CMS systems
• Interface uses integration standards such as SRU and OAI
Other ILS
23
Rosetta Storage Abstraction
Diagram: Rosetta, Storage Abstraction Layer, then one plugin each for NFS, NetApp and IBM storage
The Rosetta SDK allows plugins to be created to interact with any storage system
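The idea of a storage abstraction layer with one plugin per back end can be sketched as below. This is an assumption-laden illustration, not the Rosetta SDK: the classes `StoragePlugin`, `InMemoryStorage` and `StorageLayer`, and their method names, are all hypothetical.

```python
from abc import ABC, abstractmethod
from typing import Dict


class StoragePlugin(ABC):
    """Hypothetical plugin contract; not the Rosetta SDK interface."""

    @abstractmethod
    def store(self, key: str, data: bytes) -> None:
        ...

    @abstractmethod
    def retrieve(self, key: str) -> bytes:
        ...


class InMemoryStorage(StoragePlugin):
    """Stand-in back end that keeps content in a dictionary."""

    def __init__(self) -> None:
        self._blobs: Dict[str, bytes] = {}

    def store(self, key: str, data: bytes) -> None:
        self._blobs[key] = data

    def retrieve(self, key: str) -> bytes:
        return self._blobs[key]


class StorageLayer:
    """Routes each call to the plugin registered under a given name."""

    def __init__(self) -> None:
        self._plugins: Dict[str, StoragePlugin] = {}

    def register(self, name: str, plugin: StoragePlugin) -> None:
        self._plugins[name] = plugin

    def store(self, name: str, key: str, data: bytes) -> None:
        self._plugins[name].store(key, data)

    def retrieve(self, name: str, key: str) -> bytes:
        return self._plugins[name].retrieve(key)
```

The calling code only talks to `StorageLayer`, so adding a new storage system means writing one more plugin class rather than changing the callers.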
24
Rosetta Solutions – Search Engines
• Publishing module allows information exchange with external systems
• Allows publishing different object groups in different formats
• Provides a set of APIs and an SDK for access
• OAI interface out of the box
…
Search engine agnostic
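Assuming the OAI interface mentioned on this slide follows OAI-PMH, a harvester would issue plain HTTP GET requests with a `verb` parameter. The sketch below only builds such a request URL; the base URL is a placeholder, and `oai_request_url` is a hypothetical helper, not a Rosetta API.

```python
from urllib.parse import urlencode


def oai_request_url(base_url, verb, **arguments):
    """Build an OAI-PMH style request URL from a verb and its arguments."""
    params = {"verb": verb}
    params.update(arguments)
    return base_url + "?" + urlencode(params)


# Placeholder endpoint; a real deployment would publish its own base URL.
url = oai_request_url(
    "https://repository.example.org/oai",
    "ListRecords",
    metadataPrefix="oai_dc",
)
print(url)
```

Because the request is just a URL, any harvester or search engine that speaks the protocol can consume the published records, which fits the "search engine agnostic" point.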
Data Model
26
PREMIS
• Preservation Metadata: Implementation Strategies
• International working group concerned with developing metadata for use in digital preservation
• Metadata for intellectual entities, events, agents and rights
• Data model consists of several entities:
  • Intellectual entity
  • Representation
  • File
  • Bit-stream
27
METS
• Ex Libris has a METS profile that will be published and open.
• Each Intellectual Entity is one METS document
• Each representation is a file group
• The structure map is at the representation level
• Metadata is stored for all levels: descriptive as DMD and preservation as AMD.
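The mapping on this slide (one METS document per Intellectual Entity, one file group and one structure map per representation) can be sketched with the standard METS element names. This is a bare skeleton under stated assumptions: the IDs are made up, the metadata sections are left empty, and the result has not been checked against the METS schema or the Ex Libris profile.

```python
import xml.etree.ElementTree as ET

METS_NS = "http://www.loc.gov/METS/"
ET.register_namespace("mets", METS_NS)


def q(tag):
    """Return the namespace-qualified name for a METS element."""
    return "{%s}%s" % (METS_NS, tag)


def build_mets(representations):
    """Build one METS tree; `representations` maps a name to file IDs."""
    root = ET.Element(q("mets"))
    ET.SubElement(root, q("dmdSec"), ID="ie-dmd")  # descriptive metadata (DMD)
    ET.SubElement(root, q("amdSec"), ID="ie-amd")  # preservation metadata (AMD)
    file_sec = ET.SubElement(root, q("fileSec"))
    for name, file_ids in representations.items():
        # Each representation becomes one file group.
        group = ET.SubElement(file_sec, q("fileGrp"), ID=name)
        for file_id in file_ids:
            ET.SubElement(group, q("file"), ID=file_id)
    for name, file_ids in representations.items():
        # One structure map per representation, pointing at its files.
        struct_map = ET.SubElement(root, q("structMap"), ID=name + "-map")
        div = ET.SubElement(struct_map, q("div"))
        for file_id in file_ids:
            ET.SubElement(div, q("fptr"), FILEID=file_id)
    return root


# Hypothetical entity with two representations.
mets = build_mets({"master": ["f1", "f2"], "access": ["f3"]})
print(ET.tostring(mets, encoding="unicode"))
```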
28
Data Model
Intellectual Entity: a coherent set of content that is reasonably described as a unit, for example, a particular book, map, photograph, or database
  1 : N
Representation: the set of files, including structural metadata, needed for a complete and reasonable rendition of an Intellectual Entity
  1 : N
File: a named and ordered sequence of bytes that is known by an operating system
  1 : N
Bit-Stream: data within a file that has meaningful common properties for preservation purposes
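The one-to-many chain on this slide can be written down as nested records. The sketch below is only an illustration of the four entities; the class and field names, and the example file names, are hypothetical and not taken from Rosetta.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class BitStream:
    description: str


@dataclass
class File:
    name: str
    bit_streams: List[BitStream] = field(default_factory=list)  # 1 : N


@dataclass
class Representation:
    label: str
    files: List[File] = field(default_factory=list)  # 1 : N


@dataclass
class IntellectualEntity:
    title: str
    representations: List[Representation] = field(default_factory=list)  # 1 : N

    def file_count(self) -> int:
        """Total number of files across all representations."""
        return sum(len(rep.files) for rep in self.representations)


# Hypothetical book with a master and an access copy.
book = IntellectualEntity(
    title="Example book",
    representations=[
        Representation("Master", [File("page1.tif"), File("page2.tif")]),
        Representation("Access Copy", [File("book.pdf")]),
    ],
)
print(book.file_count())
```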
29
Data Model Example - Book
Intellectual Entity
Representations: Master, Modified Master, Access Copy
Files shown in the diagram: three TIFF, three JP2, three JPG, one PDF
30
Data Model Example - Image
Intellectual Entity
Representations: Master, Modified Master, Access Copy
Files shown in the diagram: TIFF, JP2, JPG
Who’s Using Rosetta and How
32
Support for Digitization Projects
Bavarian State Library (BSB): current mass digitization projects
• Public-private partnership with Google
  • More than 1 million books (in less than 10 years), more than 300 million pages
• Books printed in the 16th century
  • 37,000 titles; 7,500,000 pages
33
Preserving and Managing Local Dissertations
Offers an additional, alternative platform for unpublished materials, for example ETH Bibliothek’s e-collection
34
Special Collections
Ex Libris Ltd., 2010 - Internal and Confidential
35
Dedicated Web Sites for Special Collections using Primo
36
Flexible Delivery Mechanism
37
Preserving Cultural Heritage Collections
National Library of New Zealand’s Royal Ballet Photos
38
Digitally-Born Collections (Websites)
Ensure the library stays relevant in the digital era
National Library of New Zealand Web Site Harvest
39
Selected Rosetta Customers
University
Background: Zurich, Switzerland; leading technological institution; DataCite partners
Collections in Rosetta: research data; special collections; dissertations
National Library
Background: Wellington, New Zealand; development partner; mandate for digital preservation
Key Areas of Collaboration: nation’s cultural heritage; private collections; websites
40
Selected Rosetta Customers
University
Background: Binghamton, NY, USA; part of the SUNY system; FTE: ~14K students; staff: 1.5 FTE (not dedicated)
Collections in Rosetta: special collections (Edwin A. Link collection); born-digital newsletters; university photographs
State Library
Background: Munich, Germany; service provider for Bavaria; part of the Google Books project
Key Areas of Collaboration: scanned manuscripts and rare books; legal deposit documents; websites
41
Selected Rosetta Customers
Service Providers
Background: Leuven, Belgium; LIBIS service provider; replacing DigiTool; integrating with Aleph and Primo
Collections in Rosetta: special collections; faculty papers; e-mails; video collections
National Archives
Background: Wellington, New Zealand; merged with the National Library; integrating Archway
Collections in Rosetta: legal documents; archival collections; government papers
42
China Rosetta Test Server: rosetta.cceu.org.cn
http://rosetta.cceu.org.cn:1801/deposit
43
Thank You!