or08: session1 (part 2)

again unedited/unchecked:

I’m tagging with or08:

http://blogsearch.google.com/blogsearch?hl=en&q=or08&ie=UTF-8&scoring=d
http://technorati.com/search/or08?authority=a4&language=en

On the margins of scholarship

Richard Davis, Uni London Computer Centre

flickr, good example of an online document repository

“flickr for eprints”
1 – the data i enter to be used by other applications
2 – rss feeds i can use elsewhere
3 – clickable keywords, leading to similar articles in other repositories.
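A minimal sketch of point 2, reusing a repository's RSS feed elsewhere. The feed below is made up for illustration (the titles, links and `repository.example.org` host are assumptions, not a real EPrints feed), and `read_items` is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# Hypothetical feed content, made up for illustration; a real repository feed will differ.
SAMPLE_RSS = """<?xml version="1.0"?>
<rss version="2.0">
  <channel>
    <title>Example repository: latest deposits</title>
    <item>
      <title>First example paper</title>
      <link>http://repository.example.org/1/</link>
      <category>crystallography</category>
    </item>
    <item>
      <title>Second example paper</title>
      <link>http://repository.example.org/2/</link>
      <category>tagging</category>
    </item>
  </channel>
</rss>"""


def read_items(rss_text):
    """Return a list of (title, link, keywords) tuples from an RSS 2.0 document."""
    root = ET.fromstring(rss_text)
    items = []
    for item in root.iter("item"):
        title = item.findtext("title", default="")
        link = item.findtext("link", default="")
        # <category> elements stand in for the clickable keywords of point 3.
        keywords = [c.text for c in item.findall("category") if c.text]
        items.append((title, link, keywords))
    return items


for title, link, keywords in read_items(SAMPLE_RSS):
    print(title, link, keywords)
```

The same keyword list could be used to build links to similar articles in other repositories, though how those searches are addressed would depend on each repository.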

linnean online

demo’d images in an eprints ir, allowed photos to be previewed on the same page, bookmarks, comments.

may scan in original comments.

sneep. social networking extensions for eprints.
jisc funded, tagging, bookmarking, open source, exploit eprint objects

about giving choice, let users find it, e.g. facebook app. if we make it, it will be useful for someone (in a way we can’t predict).

web2.0 is raising expectations of what websites should do.

or08: session1 part1

again, unedited and unchecked. draft notes:

ian mulvany -nature (who produce connotea) – speaker.
david kane – Waterford Institute of Technology

openid

semantic web, very useful to get data from, but hard to do and very few do it.

contributing: plain text easy, semantic web hard
data mining: semantic web easy, plain text hard.
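A rough sketch of the data mining half of that trade-off. JSON is used here only as a stand-in for structured, machine-readable data (it is not RDF or any actual semantic web format), and the record, field names and function names are all made up for illustration.

```python
import json
import re

# The same made-up record, once as free text and once as structured data.
plain = "Crystal structure of aspirin, by A. Author, published 2008."
structured = json.dumps({
    "title": "Crystal structure of aspirin",
    "creator": "A. Author",
    "year": 2008,
})


def year_from_plain(text):
    # Fragile: guesses that any four-digit number starting 19 or 20 is the year.
    match = re.search(r"\b(19|20)\d{2}\b", text)
    return int(match.group(0)) if match else None


def year_from_structured(text):
    # Robust: the field is named, so there is nothing to guess.
    return json.loads(text)["year"]


print(year_from_plain(plain), year_from_structured(structured))
```

The plain text version was easier to write, but the miner has to guess; the structured version took more effort to produce and is trivial to read back.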

talked about how nature are working on integrating social tools with connotea.

“we want to connect repositories with connotea”.

connotea could act as an interchange with repositories.

showed how Waterford Institute of Technology catalogue uses tags etc, example of how social tools can be integrated.

openid:
your signon can be a URI, like a URL or email address,
When you try to sign on with openid, you are redirected to your openid provider (eg yahoo). Your details
are not shared with the site you are trying to access, just keys.
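A toy model of that idea, to show why the site never needs your password. This is not the real OpenID message format: the key, the identity URI, the password and both function names are assumptions made up for the sketch.

```python
import hashlib
import hmac

# Toy model only: real OpenID messages and field names differ.
SHARED_KEY = b"association-key"  # hypothetical key agreed between site and provider
PROVIDER_PASSWORDS = {"http://alice.example.org/": "s3cret"}  # known only to the provider


def provider_sign_in(identity_uri, password):
    """Provider checks the password and returns a signed assertion, or None."""
    if PROVIDER_PASSWORDS.get(identity_uri) != password:
        return None
    signature = hmac.new(SHARED_KEY, identity_uri.encode(), hashlib.sha256).hexdigest()
    return {"identity": identity_uri, "signature": signature}


def site_accepts(assertion):
    """The site verifies the signature; it never sees the password."""
    if assertion is None:
        return False
    expected = hmac.new(
        SHARED_KEY, assertion["identity"].encode(), hashlib.sha256
    ).hexdigest()
    return hmac.compare_digest(expected, assertion["signature"])


assertion = provider_sign_in("http://alice.example.org/", "s3cret")
print(site_accepts(assertion))
```

Only the identity URI and a signature travel back to the site, which is the "just keys" part of the notes.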

security risk, one website (your open id provider) is the key to access to your details on many websites, i.e. phishing risk. something to be aware of.

oauth (?) allows one site to access things on another site (eg dopplr accessing your photos on flickr) without you having to give the first site your login details for the second.

connotea now supports open id. think it is the future.

or08: Open Repositories 2008

I’m at Open Repositories 2008.

Got in to southampton central at 8:15am, for a 9 start, thought i had loads of time but with the wait for the bus, the journey, and wandering around campus i only just got here for 9. so did just about everyone else, so no username and password allocated until coffee time :( [but now online, hence you seeing this!]

draft notes from the first session, unedited and unchecked.
peter murray-rust
repositories data

Believes data is the most important thing for scientists, as opposed to open access final full text.

“PDF destroys information”: Word contains data which pdf just loses. word files (and latex, xml) are useful for scientists as they can reuse the data, formulae, metadata etc contained within, which are lost in a pdf file.

academic theses are one of the most important things for institutions/researchers. electronic theses are going to be very powerful.

technical problems slowed the talk down.

showed pdb repository, protein data bank, going since the 70s. showed research he *put* in to the *repository* while working at glaxo.

message is that scientists are already putting stuff in repositories.

crystaleye, built/started by a postgrad. now has over 100,000 crystal structures. harvests from those that release their crystallography (acs, rsc, etc). links to paper via doi.
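A small sketch of the "links to paper via doi" step, assuming each harvested record carries a DOI string. The record layout and the DOI below are made-up placeholders, and `doi_to_url` is a hypothetical helper, not anything from crystaleye itself.

```python
from urllib.parse import quote


def doi_to_url(doi):
    """Build a link to the paper through the doi.org resolver."""
    # Keep "/" unescaped; it separates the DOI prefix from the suffix.
    return "https://doi.org/" + quote(doi, safe="/")


# Hypothetical record; the DOI below is a made-up placeholder, not a real paper.
record = {"structure_id": "example-1", "doi": "10.1234/example.5678"}
print(doi_to_url(record["doi"]))
```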

scientists will not put things in to repositories (presumably he means article-based IRs, as he has just been describing how scientists do!).

OSCAR text extraction. showed example of copying the text of a PDF doc in to OSCAR, it produced a table of formulae that were contained within the text.
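To give a feel for the kind of task involved, here is a very naive sketch that pulls formula-like tokens out of plain text. It is not how OSCAR works; it is a regular expression plus a small, deliberately incomplete element list, and all the names are made up for illustration.

```python
import re

# A small, deliberately incomplete set of element symbols (assumption for the sketch).
ELEMENTS = {"H", "C", "N", "O", "S", "P", "Na", "Cl", "K", "Ca", "Fe"}

# A candidate token is two or more "symbol plus optional count" parts, e.g. H2O.
TOKEN = re.compile(r"\b(?:[A-Z][a-z]?\d*){2,}\b")
PART = re.compile(r"([A-Z][a-z]?)(\d*)")


def find_formulae(text):
    """Return formula-like tokens whose parts are all known element symbols."""
    found = []
    for match in TOKEN.finditer(text):
        parts = PART.findall(match.group(0))
        if all(symbol in ELEMENTS for symbol, _count in parts):
            found.append(match.group(0))
    return found


sample = "We dissolved NaCl in H2O and added C6H12O6 to the mixture."
print(find_formulae(sample))
```

A pattern like this will also pick up ordinary capitalised words that happen to look like formulae (a shouted "NO", for example), which is part of why real chemical text mining needs more than a regular expression.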

Royal Society of Chemistry: PROSPECT, semantic markup of papers.
SPECTRa (cam/imperial), how can we capture data as part of the academic process

“do not try to invent electronic notebooks”, success rate approx zero. i.e. don’t try and make capturing data the integral part of their workflow.

No one knows when their paper gets published.

“get at the authoring process”, that is the key.