DCC Workshop on Digital Repositories: Preservation Activity

Notification of events related to digital curation

Postby Joy Davidson » Fri Jul 22, 2005 10:29 am

Here is a list of practical preservation activities that were presented at the workshop. Please use the forum to post any questions or comments that you have on the practical approaches listed below, or to start new threads on other approaches.

-Dark Archive in the Sunshine State (DAITSS)
-Lots of Copies Keeps Stuff Safe (LOCKSS)
-DSpace Preservation Strategy
-A Shared Preservation Model for Institutional Repositories: SHERPA Digital Preservation
Joy Davidson
DCC Training Coordinator and ERPANET British Editor
Joy Davidson
 
Posts: 72
Joined: Fri Oct 22, 2004 2:44 pm
Location: University of Glasgow

Article: US Repository activity, September '05 issue of D-Lib

Postby Joy Davidson » Wed Oct 05, 2005 3:11 pm

For an update on current repository activity in the US, see this recent article by Clifford Lynch and Joan Lippincott in the September 2005 issue of D-Lib Magazine.

'Institutional Repository Deployment in the United States as of Early 2005'
http://www.dlib.org/dlib/september05/lynch/09lynch.html
Joy Davidson
DCC Training Coordinator and ERPANET British Editor

Comparison of EPrints, DSpace and Fedora

Postby Joy Davidson » Mon Nov 28, 2005 2:16 pm

I am doing a comparison of functionality and adopter experiences for the two most widely used institutional archive-creating software packages for repositories: EPrints and DSpace, and also Fedora (a minor player globally but possibly important in Australia). I am seeking your help in collecting information. Information about other packages would also be welcome.

(1) If you have used or compared any of this software, could you please take the time to let me know what you consider the respective advantages/disadvantages of each to be, and for what purposes? I am also interested in features that you think are equivalent or readily achieved in each.

(2) The two major software packages explain their orientation as follows: EPrints puts a particular emphasis on OA content (preprints and postprints of institutional research output, plus theses), DSpace on digital curation in general. Fedora describes itself as repository storage layer software requiring custom front-ends for any purpose. If you have any specific comments on these overall orientations and whether they are appropriate, they would be very helpful too.

(3) While all these packages are free and open source, I would also be interested in any cost estimates in implementing the one you chose, how many hours or dollars you spent on setup, how much maintenance you have to expend, and how reliable the software is (crashes, downtime, etc). Would you recommend it to someone else?

I will post a summary of the results (and maybe an interim report) on AmSci OA Forum, and may get back to you if I need a bit more detail. Thank you in anticipation of a prompt response and a flood of emails. Email me directly at Arthur.Sale@utas.edu.au if you want.

Arthur Sale
Professor of Computing (Research)
University of Tasmania
http://leven.comp.utas.edu.au/AuseAccess/
ahjs@ozemail.com.au
Joy Davidson
DCC Training Coordinator and ERPANET British Editor

Postby Stefan Strathmann » Mon Nov 28, 2005 3:07 pm

An extensive comparison of the functionality of many institutional archive-creating software packages for repositories can be found in the nestor expert report "Vergleich bestehender Archivierungssysteme" (Comparison of Existing Archiving Systems): http://www.langzeitarchivierung.de/down ... mat_03.pdf

Unfortunately, the report is in German.
Stefan Strathmann
 
Posts: 4
Joined: Thu Mar 10, 2005 2:58 pm
Location: Goettingen State and University Library, Germany (SUB)

Postby Maureen Pennock » Mon Nov 28, 2005 3:45 pm

The JISC Digital Repositories programme may also be of interest to you, in particular the IRRA project - Institutional Repositories and Research Assessment - which is investigating and developing institutional repository infrastructure for EPrints and DSpace.
JISC Digital Repositories Programme
IRRA website
Maureen Pennock
----------------------------------------------------------
Web Archive Preservation Project Manager
The British Library
----------------------------------------------------------
Maureen Pennock
 
Posts: 49
Joined: Thu Oct 20, 2005 12:37 pm
Location: British Library

More information on repositories and projects.

Postby Maureen Pennock » Tue Nov 29, 2005 11:27 am

More information from the JISC Digital Repositories Programme:

The Institutional Archives Registry at the University of Southampton may be useful for you. The registry has two functions: "(1) to monitor overall growth in the number of eprint archives and (2) to maintain a list of GNU EPrints sites (the software Southampton University has designed to facilitate self-archiving)". The registry can be filtered by country, software (e.g. GNU eprints, DSpace, Bepress etc.) or content type. It is possible to register an archive. The registry uses OAI-PMH.

The National Library of Wales have implemented Fedora and the RepoMMan project at the University of Hull has also looked at it.

There's a Mellon-funded project at Johns Hopkins University in the U.S. that is looking to "conduct an architecture and technology evaluation of repository software and services such as e-learning, e-publishing, and digital preservation". A Technology Analysis of Repositories and Services.

The JISCmail repositories listserv may also be able to assist. Post to jisc-repositories@jiscmail.ac.uk
Maureen Pennock
----------------------------------------------------------
Web Archive Preservation Project Manager
The British Library
----------------------------------------------------------

Creating an Institutional Repository : LEADIRS Workbook

Postby Joy Davidson » Tue Nov 29, 2005 12:14 pm

This resource may be of use to participants of the DCC Digital Repositories workshop:

DSpace has published a document, Creating an Institutional Repository : LEADIRS Workbook. Intended for managers and repository developers, this extensive document covers all aspects of building institutional repositories, such as planning, choosing repository software platforms, the legal and regulatory environment, policy development and cost modelling. Each section has associated worksheets and key questions. There are many references to case studies, which are used to highlight approaches to the various issues in developing an institutional repository.

Barton, M. and Waters, M. (2005). Creating an Institutional Repository : LEADIRS Workbook. DSpace.

Retrieved September 15, 2005 from : http://www.dspace.org/implement/leadirs.pdf
Joy Davidson
DCC Training Coordinator and ERPANET British Editor

archives or access

Postby simonfj » Wed Nov 30, 2005 1:29 am

I was reading the discussion.

Thought this may be of interest as it's from the video end of town.
http://www.ercim.org/publication/ws-proceedings/DELOS6/upf.rtf

One quote from this stopped me.

"we need to differentiate between digital archives, which are concerned with the timeless storage of digital materials, and digital libraries, which is primarily concerned with timely issues of access"

Comments?
simonfj
simonfj
 
Posts: 65
Joined: Tue Nov 15, 2005 2:32 pm

Not so sure about this paper

Postby Maureen Pennock » Wed Nov 30, 2005 10:28 am

Hmm, I found several issues in that document that I don't necessarily agree with, the blanket statement on the difference between archives and libraries being just one of them. The authors do not say why they need to differentiate, which is somewhat telling - what will be the effect of not making this difference, at least insofar as the UPF is concerned?

Preservation and access are inextricably interlinked. Yes, archives are concerned with the timeless storage of materials, but they are also concerned with providing timely access to the materials they are storing! Digital libraries must be able to provide timely access, but they must also be able to provide access to materials through time. There are several ways in which this can be achieved, for example, by providing access to materials that others are storing. However, this does not mean that they are not also concerned with long term preservation.

I think there are better ways to approach the difference between libraries and archives.

BTW, does anyone know when this document was published? It doesn't have a date :(
Maureen Pennock
----------------------------------------------------------
Web Archive Preservation Project Manager
The British Library
----------------------------------------------------------

DCC Workshop on Digital Repositories: Preservation Activity

Postby Jane_Stevenson » Wed Nov 30, 2005 10:48 am

I find it very frustrating that there is this constant assumption that archives (i.e. archive repositories) are primarily for long-term storage and not for access. I am an archivist, now working on the preservation of digital materials, and my experience is that although there is a constant debate about conservation issues versus access issues, the idea is that archives are kept primarily to be used. Now that the word 'archive' is being used a great deal more in different domains, rather than just for traditional paper-based archives, I'm beginning to get the impression that people have long held this belief that archive repositories are not really about access and that we archivists just sit in our offices and catalogue archives in order to store them away on a dim and distant shelf.
 
Well, that's my comment!
 
cheers,
Jane.
 

Jane Stevenson
===========
Archives Hub
Manchester Computing
The University of Manchester
Oxford Road
Manchester M13 9PL

Jane_Stevenson
 
Posts: 10
Joined: Mon Aug 01, 2005 12:22 pm

Re: Not so sure about this paper

Postby simonfj » Thu Dec 01, 2005 6:32 pm

Thanks for this.

I guess the reasons for the need to differentiate come down to understanding the two perspectives. What does a library look like, what does an archive look like, and how do their processes differ?

Both David and you might agree that an archive looks like this.
http://pandora.nla.gov.au/index.html

But a curator might not understand David because his library looks like this. http://www.omn.org/index.htm

The difference in mindsets - Geeks vs Luddites - and skills, is the real digital divide.

Thankfully, this is the "curation" center. My dictionary defines curator as "a protector of students under 25 against fraud".

So, Re: the date of the paper. I have chastised David. He'll probably reply next week.

regards,

simonfj

Re: Not so sure about this paper

Postby Maureen Pennock » Fri Dec 02, 2005 10:49 am

simonfj wrote:Thanks for this.

I guess the reasons for the need to differentiate come down to understanding the two perspectives. What does a library look like, what does an archive look like, and how does their process differ?


I think you've hit the nail on the head there, Simon, as much of it is about process. But it's also about requirements. The two types of institutions have different requirements when it comes to preserving digital objects - or at least, the order of the requirements differs in priority - and many would also say that the digital objects themselves require different approaches. You may well need to take more stringent measures to ensure the authenticity of diverse types of digital objects in an archive than you would in a library that doesn't have the same authenticity requirements or the diverse range of object types that we can expect an archive to host.

Having said all that though, meeting these requirements clearly ties in with the processes that the institutions follow when managing their digital objects.

simonfj wrote:

Thankfully, this is the "curation" center. My dictionary defines curator as "a protector of students under 25 against fraud".


Lol. We need to get more students onside then!
Maureen Pennock
----------------------------------------------------------
Web Archive Preservation Project Manager
The British Library
----------------------------------------------------------

Requirements

Postby simonfj » Mon Dec 05, 2005 11:58 pm

Thanks Maureen,

I must tell you though, it's really not so much the differences that I'm interested in. The thing that makes the convergence - between Digital Asset Managers on one side and Digital Librarians on the other - fascinating to me, is finding out what their global communities have in common.
simonfj

Re: Not so sure about this paper

Postby dmaccarn » Tue Dec 06, 2005 4:20 pm

This paper was published in July of 1998.
There is a web site that has an updated (2001) version at http://info.wgbh.org/upf

It's our view that access and preservation are two different beasts.
To provide access, a system needs to be tuned for access. Metadata is in databases so it is easily retrieved. Assets are in a form that can be easily transferred or transcoded to fit the distribution environment.
Preservation is all about the preservation. The metadata should be stored with the asset, even within the same file. The digital file formats should be easily migrated and easily read if the underlying systems change. Storage systems need to be watched and maintained.
Disaster needs to be planned for. If you operate an access-only system and the database is destroyed, then how do you identify the assets? If a preservation system is destroyed, are there enough pieces to rebuild it?

Thanks to simonfj for including me in the discussion.
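
The disaster scenario described in this post can be illustrated with a small sketch. It is not the UPF design or anyone's production system: the sidecar naming scheme (`.meta.json`), the function names and the sample records are all assumptions made up for the example. The point it shows is only that when descriptive metadata travels with each asset, a lost access database can in principle be regenerated by scanning the store.

```python
import json
import tempfile
from pathlib import Path

SIDECAR_SUFFIX = ".meta.json"  # hypothetical naming convention for this sketch


def write_asset(folder: Path, name: str, payload: bytes, metadata: dict) -> None:
    """Store an asset and a JSON sidecar describing it in the same folder."""
    (folder / name).write_bytes(payload)
    (folder / (name + SIDECAR_SUFFIX)).write_text(json.dumps(metadata))


def rebuild_catalogue(folder: Path) -> dict:
    """Recreate an access catalogue purely from the sidecars found on disk."""
    catalogue = {}
    for sidecar in sorted(folder.glob("*" + SIDECAR_SUFFIX)):
        asset_name = sidecar.name[: -len(SIDECAR_SUFFIX)]
        catalogue[asset_name] = json.loads(sidecar.read_text())
    return catalogue


def demo() -> dict:
    """Simulate losing the database and rebuilding it from the store."""
    with tempfile.TemporaryDirectory() as tmp:
        store = Path(tmp)
        write_asset(store, "clip001.mov", b"video bytes",
                    {"title": "Interview", "year": 1998})
        write_asset(store, "clip002.mov", b"more video",
                    {"title": "B-roll", "year": 2001})
        # Assume the access database is gone; only the store survives.
        return rebuild_catalogue(store)


if __name__ == "__main__":
    print(demo())
```

An access-only system that keeps metadata solely in its database has no equivalent of `rebuild_catalogue`, which is the identification problem the post raises.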

dmaccarn
 
Posts: 1
Joined: Tue Dec 06, 2005 12:00 pm
Location: Boston, MA USA

Preservation and Access systems

Postby Maureen Pennock » Wed Dec 07, 2005 11:32 am

Hi David

Welcome to the forum :)

Thanks for your clarification on this issue. This is a very interesting topic, and one which archivists and librarians (and systems engineers) seem to disagree on regularly, and at varying levels of granularity! (wholly separate/linked/one more important than the other etc.) I was at a conference recently where this was certainly the case.

So, can I further take it from your post that your approach is based on the need for completely different systems for preservation and access? Is it considered unfeasible to have a preservation system that can make access copies available upon request, for example, via some sort of an airlock or generating access copies on demand? Perhaps the focus on either preservation or access prohibits this from working successfully?

I am also curious about the need to store metadata in the same file as the digital object in a preservation system. I've heard it argued elsewhere that this makes it more complicated to maintain the guaranteed integrity and authenticity of the object when the file format of the object needs to be migrated. It also means that to update the metadata in such a case you would have to access and update the file in which the object was stored, which again may pose a risk to the perceived integrity/authenticity of the object.
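
The integrity concern in the paragraph above can be sketched in a few lines. This is an illustration under stated assumptions, not a description of any real file format: the `embed` container (a JSON header line followed by the payload) and all names are hypothetical. It compares a file-level checksum when metadata is embedded against a checksum recorded over the object alone when metadata is kept separately.

```python
import hashlib
import json


def sha256(data: bytes) -> str:
    """Return the SHA-256 digest of some bytes as a hex string."""
    return hashlib.sha256(data).hexdigest()


def embed(payload: bytes, metadata: dict) -> bytes:
    """Hypothetical container: one JSON header line, then the payload."""
    header = json.dumps(metadata, sort_keys=True).encode("utf-8")
    return header + b"\n" + payload


def embedded_checksum_changes(payload: bytes) -> bool:
    """True if editing embedded metadata alters the file-level checksum."""
    before = sha256(embed(payload, {"title": "Draft"}))
    after = sha256(embed(payload, {"title": "Final"}))
    return before != after


def sidecar_checksum_holds(payload: bytes) -> bool:
    """True if the object's recorded checksum survives a metadata edit."""
    sidecar = {"title": "Draft", "sha256": sha256(payload)}
    sidecar["title"] = "Final"  # metadata update; the object is never opened
    return sha256(payload) == sidecar["sha256"]


if __name__ == "__main__":
    obj = b"original object bytes"
    print(embedded_checksum_changes(obj), sidecar_checksum_holds(obj))
```

The sketch does not settle the argument either way: an embedded format could record a checksum over the payload section only, and a separate metadata file can be lost or detached from its object, which is the risk the earlier post was guarding against.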

It's interesting that you raise the need for disaster planning, as this seems to be an issue that is often overlooked, so thanks for bringing it up. I couldn't agree more. How does this fit into your framework - backed up copies of the system stored offsite at more than one location maybe? Or more than that?

Thanks for your input,
Maureen.
Maureen Pennock
----------------------------------------------------------
Web Archive Preservation Project Manager
The British Library
----------------------------------------------------------
