XML

Introducing the eyeLIMS project

Scientists usually share information with collaborators from all around the world. For that purpose, eyeOS (www.eyeos.org) provides an invaluable system to access and share documents, create and save data files or store crucial personal and professional information.

To see eyeOS widely used by scientists all around the world, we initiated the eyeLIMS project! eyeLIMS is a community-driven project that aims to provide a free, web-based, open source Laboratory Information Management System (LIMS) powered by eyeOS.


Medline XML to database parser?

I recently downloaded Medline in XML format. The goal is to load it into a relational database (like MySQL), index it, and then somehow save the world with the data. I'm pretty sure tons (relatively speaking) of people have done the same thing before (except maybe the save-the-world part), and I'd prefer not to reinvent the wheel if I don't have to.

Anyone know of a good XML->database parser for Medline? If not I guess I'll code one myself! Indexing tips / advice would also be appreciated (first time I'm playing with a 50+ GB database). I heard Lucene is the bomb for indexing such a large database...
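For anyone who does end up coding it themselves, here is a minimal sketch of the streaming approach, not a finished loader. It assumes the records are `MedlineCitation` elements with a `PMID` child and an `Article/ArticleTitle` child, and it uses Python's built-in `sqlite3` as a stand-in for MySQL. The table name, the two columns and the sample records are made up for illustration.

```python
import io
import sqlite3
import xml.etree.ElementTree as ET


def load_citations(xml_source, conn):
    """Stream MedlineCitation records from xml_source into a citations table."""
    # Hypothetical two-column schema; a real loader would keep far more fields.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS citations (pmid INTEGER PRIMARY KEY, title TEXT)"
    )
    count = 0
    # iterparse yields each element once its end tag has been read, so the
    # whole file never has to sit in memory at once.
    for _event, elem in ET.iterparse(xml_source, events=("end",)):
        if elem.tag != "MedlineCitation":
            continue
        pmid = elem.findtext("PMID")
        title = elem.findtext("Article/ArticleTitle")
        if pmid is not None:
            conn.execute(
                "INSERT OR REPLACE INTO citations (pmid, title) VALUES (?, ?)",
                (int(pmid), title),
            )
            count += 1
        # Drop the record's children now that it has been stored.
        elem.clear()
    conn.commit()
    return count


# Made-up sample data in the assumed layout.
SAMPLE = b"""<MedlineCitationSet>
  <MedlineCitation><PMID>1</PMID><Article><ArticleTitle>First made-up title</ArticleTitle></Article></MedlineCitation>
  <MedlineCitation><PMID>2</PMID><Article><ArticleTitle>Second made-up title</ArticleTitle></Article></MedlineCitation>
</MedlineCitationSet>"""

conn = sqlite3.connect(":memory:")
loaded = load_citations(io.BytesIO(SAMPLE), conn)
rows = conn.execute("SELECT pmid, title FROM citations ORDER BY pmid").fetchall()
print(loaded, rows)
```

With a 50+ GB dump the important part is probably the streaming loop rather than the schema: each record is written and then cleared before the next one is read.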


Structured Blogging for laboratory journals

The newest release of the Structured Blogging plugins might be of interest to anyone using WordPress or Movable Type for keeping a digital record of their work, as it includes a template for writing reviews of journal articles. This provides a form for filling in standard fields (article and journal titles, volume, issue and page numbers, that sort of thing) and produces nicely formatted output. It also auto-generates COinS links so that anyone reading the article review can get directly to the full text of the paper.
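To give an idea of what a COinS link looks like, here is a rough sketch that builds one from the same kind of fields. This is not the plugin's code. It only follows the general COinS convention of an empty span with class `Z3988` whose title holds an OpenURL ContextObject in key/value form. The function name and the sample values are made up.

```python
from html import escape
from urllib.parse import urlencode


def coins_span(atitle, jtitle, volume, issue, spage, year):
    """Return a COinS span describing one journal article."""
    fields = [
        ("ctx_ver", "Z39.88-2004"),
        ("rft_val_fmt", "info:ofi/fmt:kev:mtx:journal"),
        ("rft.genre", "article"),
        ("rft.atitle", atitle),
        ("rft.jtitle", jtitle),
        ("rft.volume", volume),
        ("rft.issue", issue),
        ("rft.spage", spage),
        ("rft.date", year),
    ]
    # URL-encode the key/value pairs, then HTML-escape the result so the
    # ampersands between pairs are safe inside the title attribute.
    context_object = urlencode(fields)
    return '<span class="Z3988" title="%s"></span>' % escape(context_object, quote=True)


# Made-up article details, purely for illustration.
span = coins_span("A made-up title", "Journal of Examples", "1", "2", "3", "2005")
print(span)
```

A browser extension or link resolver that understands COinS can read the title attribute and turn it into a link to the reader's own library copy of the paper.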

I've been wondering how difficult it would be to prescribe a standard format for entering the methods and results of scientific experiments into a weblog...


Content Creation and Text processing

Liam Quin from W3C has given a few useful tips relating to processing documents (e.g. error-prone re-typed or scanned text) into XML.

Many of these practices are important for the sort of text processing tasks that seem to come up in bioinformatics.

Article summary: use lots of small one-off scripts to make small changes, continually validate your output, briefly document your steps, automate steps with a meta-script or Makefile and keep input and output text separate (.. well duh!).
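As a toy illustration of "small steps, validate after each one", here is a sketch that is my own rather than anything from the article. The step functions and the sample text are made up, and the check only tests well-formedness with Python's standard library, which cannot validate against a DTD or schema.

```python
import xml.etree.ElementTree as ET


def check_well_formed(xml_text):
    """Return None if xml_text parses as XML, otherwise the parser's message."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError as err:
        return str(err)
    return None


def run_steps(text, steps):
    """Apply each small step in turn and check the output after every one."""
    for step in steps:
        text = step(text)
        problem = check_well_formed(text)
        if problem is not None:
            raise ValueError("%s produced bad XML: %s" % (step.__name__, problem))
    return text


# Hypothetical one-off steps, each making one small change.
def fix_typo(text):
    return text.replace("Teh", "The")


def rename_para(text):
    return text.replace("<para>", "<p>").replace("</para>", "</p>")


def broken_rename(text):
    # Deliberately buggy: renames the opening tag but forgets the closing one.
    return text.replace("<para>", "<p>")


source = "<doc><para>Teh cat</para></doc>"
cleaned = run_steps(source, [fix_typo, rename_para])
print(cleaned)
```

Passing `broken_rename` as a step raises a `ValueError` naming that step, which is the point of checking after every change instead of only at the end. The original `source` string is never modified, so input and output stay separate.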


BioDASH demo

I'm sitting in on the BioDASH demo. Eric Neumann is giving a broad-strokes introduction to the semantic web: resources need metadata, and basically if we can share and aggregate data, everything will be wonderful.

Now he's talking about semantic lenses, which I think are Haystack-specific, and now he's on to FOAF. The slide he's using is the same one that was used in the bio-ontologies; for links, see the comments here.


YeastHub: A Semantic Web Use Case for Integrating Data in the Life Sciences Domain

Yeast hub

This is the talk I was genuinely interested in seeing; it is supposedly all about the semantic web...

- Time is right for using the semantic web...

So the introduction is fairly general: the web is full of heterogeneous data and access methods, so we need metadata and a standard format to put that metadata in, which in this case is RDF.

The speaker is now talking about the proliferation of all the different BioXML standards, for example MAGE-ML, SBML, etc. The problem in pathway databases is particularly bad with many formats describing the same thing. So according to the speaker we should unify on RDF/XML. Then we get all the so-called "stuff for free" e.g. inference, integration etc.


Data Integration and Visualization System for Enabling Conceptual Biology

I'm sitting in on the ontologies and database track at ISMB today. The wireless isn't working in the main conference room and I'm a little fuzzy from drinking German beer last night, but thankfully I didn't find myself wearing my lanyard to the bar or taking my laptop to dinner. Not that I can say the same for others...

Data integration introduction

- How do we control people?
- How do we maintain consistency?
- Theme of data integration has been around for a while (early 90s)
- Talking about RDF and data integration
- Raving on basically...

The presenter is Finnish and his accent is a little funny, kind of like listening to the chef from the Muppets give a scientific talk. He's cool though...


PubMed RSS

Via hublog: "NCBI's PubMed is to offer RSS feeds for searches. Only took them 2½ years."

Update: PubMed RSS is now live. After doing a PubMed search, click on the drop-down menu for "send to" and RSS is now one of the options. The feature lets you specify a name for the feed and the number of items to include; see here for an example of the result. PubMed is using RSS version 2.0 and dumping the content of the HTML citation display directly into the description tag. For most users this won't matter, as the display in most aggregators, like Bloglines for example (see the bioinformatics folder), will be the same as the regular PubMed site. However, using RSS 1.0 with available modules would have enabled more metadata to be added to the feeds; see for example the semantically rich feeds produced by Connotea. Which is great if you're a Semantic Web fan, which we all are, right?

Interestingly, it looks like RSS export is going to be part of the Eutils web services (see the feed URL). Now if RSS 1.0 export were available from all the NCBI databases, then we would have instant RDF data.

I'll leave commenting on issues such as the correct identifiers to use in the PubMed RSS (LSIDs?) and whether or not the RSS 2.0 feeds are valid, use best practices (entity encoding the HTML in the description elements), etc. for later, as I'm busy with ISMB-related activities.
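For reference, "entity encoding the HTML in the description elements" means something like the sketch below. This is only an illustration with a made-up citation and a placeholder URL, not what the PubMed feed actually emits. It relies on Python's ElementTree escaping text when it serializes.

```python
import xml.etree.ElementTree as ET


def build_item(title, link, citation_html):
    """Build one RSS 2.0 item whose description carries an HTML citation."""
    item = ET.Element("item")
    ET.SubElement(item, "title").text = title
    ET.SubElement(item, "link").text = link
    # The HTML is stored as plain text; the serializer entity-encodes it.
    ET.SubElement(item, "description").text = citation_html
    return ET.tostring(item, encoding="unicode")


# Made-up citation markup and a placeholder URL.
citation_html = "<p><b>Example A</b> &amp; Example B. A made-up citation.</p>"
item_xml = build_item("A made-up citation", "http://example.org/citation/1", citation_html)

# A feed reader parsing the item gets the original HTML string back.
round_tripped = ET.fromstring(item_xml).findtext("description")
print(item_xml)
```

In the serialized item the angle brackets appear as `&lt;` and `&gt;`, and the `&amp;` that was already in the HTML becomes `&amp;amp;`, so the feed stays well-formed XML whatever the citation markup contains.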


biobar 1.3 (a customizable search toolbar for bioinformatics)

I'm finally ready with the latest release of biobar. See Post 1547 or Post 1537. Biobar is a toolbar project for Firefox/Mozilla/Netscape that lets a user easily search and retrieve data for a particular search term straight from many biological data collections. The latest release provides access to 38 biological databases. Based on user input from Nodalpoint and other users, this release has the following salient features:


hi

hi
I'm new to this blog. Hope we share information that benefits all.
Three cheers to bioinformatics!

