Open Repositories 2010

Thursday, July 8, 2010

OR 2010: day three

getting practical

Historically, the OR conference starts with two days of general material, and then switches to user group meetings that attendees choose to go to based on their repository software of choice. As a result, today was DSpace, DSpace, DSpace.

I swear today's short post is not a cop-out. No, it is not. It is just that today's information really wasn't anything entirely new. The perhaps earth-shattering news was that DuraSpace has long-range plans to develop DSpace and Fedora so that DSpace can actually sit on top of Fedora, if you will, taking advantage of the flexibility one can have with Fedora, but still achieving the DSpace turnkey solution. This is several years off, but it's also the first acknowledgment that the two software packages will, inevitably, merge under the new organization. But it's a long way off, and nothing that I feel I need to worry about at the moment.

lunch

Smoked salmon, fried pork, and a giant chocolate mousse, topped with conversation about living in Boston (two of my table mates live in the Boston area) and New England in general. It was a good lunch.

user group session: DSpace 1.6

A nice show and tell of the statistics package in 1.6; it is so beautiful and informative compared to the University of Minho package our institution currently uses. The upgrade will feel positively luxurious. It even shows the different numbers for which specific bitstreams are downloaded.

And if that wasn't enough to make an institution want to upgrade their software, there was a presentation detailing an institution's upgrade from 1.3 to 1.6. When making that large a leap, they had to plan for several months to determine what customizations would continue to work, and indeed, which ones they might not even need any more.

user group session: DSpace repository manager session

To be honest, the only new information presented in this session was how the batch metadata editing feature works in DSpace 1.6. This is also a fabulous to-die-for feature that is definitely worth the update. It was followed by two rather elementary presentations on how DSpace works "under the hood" and how the development team is organized and how you can contribute; all of this would have been absolutely illuminating, transcending information to have had last year, when our institution was just starting to implement DSpace, but is now old hat.

With respect to the presenters, however, it is necessary to do this every year. There were definitely new implementers, or potential DSpace implementers, in the room who were very interested in the presentation. Perhaps in future years it should be titled "DSpace for newbies" or something like that.

Sitting in this very large auditorium surrounded by repository managers and developers reviewing the down-and-dirty of DSpace, I am reminded of how much DSpace implementation and upkeep is truly a collaborative effort that should be taken on not just by a repository manager (usually a librarian) or a developer (usually someone from the IT department) but by both parts of the whole. To truly have an effective DSpace instance that does what the particular institution's end-users need, the effort must be a collaborative one. A repository will not be effective or sustainable in the long term without this collaboration.

Wednesday, July 7, 2010

OR 2010: day two, part two

day two, session three : sustainability & business operations

Straight from the horse's mouth (the horse in this case being Thorny Staples from DuraSpace): proprietary vs. open source = cash vs. code. You have to either pay, or participate. Or, it seems from this collection of presentations, perhaps even both. Certainly, I think we're all wondering how or when repositories will become sustainable. I don't have much to say about this, other than that I'd be interested in seeing a chart comparing the level of investment in institutional repositories and data curation to the level of spending on e-resources. I know how expensive e-resources are. I don't think anyone has put numbers on the true cost to pay or participate.

day two, session four : open access policy

A very interesting presentation from the University Corporation for Atmospheric Research. The presenter had fascinating images for us to see. Who doesn't enjoy looking at satellite images of our world, particularly at such an international conference?

Also very interesting because when they explored mandating open access at UCAR, library head Mary Marlino called the two largest publishers of UCAR's research output (a professional organization and a union) and talked to them about open access and how UCAR was moving in that direction. She said it was a difficult conversation, but in the end she got results -- a better relationship with them, certainly, and a 6-month embargo period instead of 2 or 5 years. A fine example of how conversation can achieve results.

As an aside - I had a conversation earlier today about an apparent debate regarding whether this field is young or not. I was surprised that anyone would be foolish enough to say this is a mature field.

It won't be a mature field until open access isn't a scary, bad word that is often cause for argument; when researchers publish in open access repositories as a matter of course; when we have not just theoretical standards, but tested best practices in place for digital preservation. Until then, we're still just toddling around, at best.

OR 2010: day two, part one

I should reiterate, in case you're just joining me, that this blog basically presents my impressions from the Open Repositories conference. It is not intended to be comprehensive -- nor could it be, since many sessions run concurrently. Enjoy!

day two, session one : digital preservation and archiving

It would seem that a great deal of work is still needed in the area of digital preservation. As one presenter pointed out, repository managers have plenty to do already without worrying about preservation, which is another discipline unto itself. I do wonder how many of us have digital preservation plans in place.

I enjoyed learning about the KeepIt project, an 18-month training project that is nearing completion in the U.K. It was reminiscent of the ICPSR digital preservation workshop I attended last fall, except it went much further, engaging the members over the several months and working with different tools to aid in preservation activities. There seem to be an awful lot of possible preservation tools out there, but whether they work for your institution's needs is another question entirely. To that point, a presentation from the archives community described the challenges inherent in migrating traditional archiving habits to fit our digital world. While archivists are used to basing their archival descriptions on locations -- "Mr. Green's work is in Room B, Row 7, Box C, File 71" -- obviously, digital archives do not work this way. I thought before, and I think again, that it's a shame libraries and archives are not working more closely together, because in many ways we are asking the same questions. The idea of using OAI-ORE to link EAD records together was presented, and challenged by a couple of audience members who questioned whether this was a viable solution.

I thought it was an interesting point that while we think we are putting all of our resources in one place when we start an institutional repository, in fact we have resources in many, many places: course descriptions on the university website or catalog; syllabi in the course management system; student papers may or may not be in the repository; faculty papers may or may not be in the repository; other faculty work is most surely on a blog, wiki, or personal website someplace. So we're not keeping all of our records in one place, at all. But my question is: should we?

break - Back to the soda machine. And actually, it was nice in that I met someone who remembered my name because I was one of the few people with questions at a webinar she presented. That was too funny.

day two, session two : academic workflows

Generally speaking, this was a show-and-tell session for different tools and processes being used at the speakers' respective institutions. I think back to some of these from the last conference, which I was able to view online, and I have to think that it's really too bad more of these ideas aren't integrated directly into the software packages. I know why, of course -- one must really commit oneself as a developer to do such a thing, and most of us don't have the time or institutional support to do such things. It's really a shame. Sometimes the open source community feels just a bit like an incredibly intelligent, innovative octopus with unlimited tentacles and no brain. And just so no one is upset by my "no brain" statement -- by brain, I mean central leadership to pull all of the ideas in all of the different tentacles together.

And what's this talk about people who might want to publish in more than one repository? I've heard this idea more than once today. Ridiculous. Imagine how unwieldy that could get, and how completely unnecessary. If you must have a record, just point to the identifier of the object where it already lives in someone else's repository. Done.

lunch

Of course I have to write about lunch again. I had a very enjoyable luncheon with Deborah Kaplan from Tufts University. Lunch was much more efficiently served today -- it seemed yesterday we spent a lot of time waiting for each course, but today was whisk! whisk! whisk! very prompt. And tasty -- a small fish salad-type course, a main course of salmon, and dessert. And wine. Of course. Kudos to the Palacio de Congresos staff.

Tuesday, July 6, 2010

OR 2010: day one, part two

day one, session two : comparative analysis of repository software

Rather a misnomer; the first presentation was a very brief discussion of a comparative analysis facilitated by three researchers, but the other presentations basically presented other repository solutions that were developed in European countries, and so not as familiar to those of us from the States (such as PubMan, from Germany). Of course, the initial presentation was United States-centric, although I'm sure they did not mean to be; we just don't share information well across the pond and/or across languages. So the other presentations were interesting in that they showed a different side of things.

In any case, it's apparent that any comparative analysis definitely requires more analysis of the information than just bare facts. Simply stating that DSpace has the most active support community because it has the most listserv messages is perhaps not a positive discovery that indicates good support; in fact, it could be taken to mean that the software is more problematic to use and requires more questions as a result. Or perhaps its status as the most widely-implemented repository only naturally leads to more listserv messages from more implementers. In any case, initial results of the analysis are available on a blog at blogs.lib.purdue.edu/rep. Take a look – what's your opinion?

day one, session three : interoperability policy

I'm afraid the organizers stretched a bit when they organized these presentations under "interoperability policy"; only one seemed to meet that criterion. That one, however, was quite an interesting presentation. I believe I heard about it before: DL.org and the interoperability challenge, an organized effort to establish guidelines among six different working groups that each concentrate on an aspect of digital libraries: content, user, functionality, policy, quality, and architecture.

Our presenter pointed out that initially, just using the word "digital library" was an interoperability question in itself, as there are also digital repositories, institutional repositories, and so on that are all included in this umbrella. Considering this, you can only imagine the complexities involved in the rest of the work.

poster madness

"Describe your poster in one minute or less. There will be a giant online stopwatch that the audience will see. They are instructed to start clapping, whether you are finished or not, when five seconds are left. Go."

I have to say, it was very entertaining. The stopwatch is available at http://www.online-stopwatch.com (they used the Large Stopwatch, counting down).

poster reception

And finally, more food. And beer, and wine. The hors d'oeuvres were fascinating. Some were along the lines of something I might see on Top Chef. I have never before eaten some sort of whipped avocado cream topped with one teeny-tiny shrimp (or prawn, as you prefer) with a teeny-tiny spoon. And for those of us who might be a little weirded out by such things, there were also what appeared to be Lay's potato chips.

They certainly did feed us well.

p.s. the Spain vs. Germany World Cup semifinal is tomorrow night. At the same time as the conference dinner. Crisis! Crisis! But not to worry; they have arranged to show the World Cup while we eat.

OR 2010: day one, part one

Official caveat: The following notes can't be considered a complete summary of the day; I don't believe anyone, at any conference, could honestly say that he didn't space out here or there during a presentation or two. The notes below should be taken simply as impressions that remained with me during and after the day of the conference.

keynote

Sleepy, thirsty conference-goers (for there were no refreshments during registration) were treated to David De Roure's keynote presentation about a sort of social networking tool for researchers called myExperiment. While this particular project admittedly has absolutely no impact on my own work as a repository manager at a small New England university, I did think it was fascinating that this project focuses on sharing processes and workflows researchers use to make conclusions from the great masses of data resulting from their research, and not the research itself.

The speaker did point out that most scientists aren't willing to share, instead holding their processes dear to their hearts, but it does have 4,034 members at this time from all over the world, and shares, among other things, 1,165 workflows. From the scientific community, those sounded like good numbers to me, and I hope other scientists find it in their hearts to recognize that such information sharing can only benefit other individuals, as well as science as a whole.

individual programs

The programs split up at this point. After a generous half-hour break where beverages and pastries were served (pastries are so much better in Europe... not nearly as sticky or sweet) we chose which program we preferred to attend. I have to interject that I think I stood out as the one attendee who was desperate for a cola instead of un café. Fortunately I found a soda machine. But I digress.

day one, session one : citation & bibliography

Having apparently failed at persuading our best and brightest to publish exclusively in our academic institutional repositories, there seems to be a movement to instead design platforms that will automatically aggregate information that will showcase all of a faculty member's scholarly work, in one place, regardless of where it was published. The three speakers all had their own software developed to serve this purpose.

All addressed the problem of authority control for author names, and most are achieving solutions which, while not bulletproof, are certainly a great improvement over nothing at all. Particularly of interest to me, since I am American and just received numerous e-mails last week about its release, was BibApp 1.0, a solution in place at the University of Illinois at Urbana-Champaign (see it in practice). I believe it has great potential, and apparently, according to a question from the audience, Indiana University is also developing a similar type of solution.

While it is early in development, it could be very useful for universities and their departments to truly showcase their scholarly output in this way, whether or not it's available in an open access format. It's almost like collecting everyone's CV in a more accurate and public way.

Both the speaker and the audience member from IU did mention some faculty resistance to such a system, which amused me. Why the concern? Perhaps they are simply concerned that such a tool might become part of the tenure process, or make it easier for their department chairs to glance at a couple of web pages and say, "why aren't you doing more papers, like your colleague Mr. Green?" Who knows. I'm interested to know the true cause for their concerns.

lunch

Lunch was relatively more formal than any conference lunch I've enjoyed in the United States. We seated ourselves at rounds of ten and were served three courses: fish, beef, and a beautifully presented dessert. Bottled water and wine (wine!) were available on the table. I had the amazing luck to sit near a repository manager from the United States, so we were able to talk a little bit. The lunch was quite decent, although I'm sure it was the first time that most of my American colleagues had ever had a fish loaf for a first course. Such a shame the vegetarians missed out on that experience.

Friday, June 25, 2010

conference planning

Alice Platt, Digital Initiatives Librarian. I will be attending the Open Repositories 2010 conference next week in Madrid, Spain. The intention of this blog is to capture my in-the-moment impressions of conference sessions, and perhaps serve as an additional resource to anyone following the conference remotely.

From a professional standpoint, I expect this conference to provide exposure to the new developments in open access repositories, as well as show off some neat new tricks in DSpace 1.6. I imagine I will return home rather humbled; it is always a challenge to evaluate the cool exciting things you see at conferences and translate them to cool exciting things to implement at your home institution. I do see some rather elementary topics are covered, such as SWORD, which I've heard about for the past year, but don't yet completely understand. I'm glad they haven't forgotten those of us who haven't quite picked up on all the technical capabilities of DSpace.

Personally, I'm excited to go to Madrid and interact with so many developers and librarians. Should be good times.