GeneNarrator: A Text-Mining System for Facilitating Functional Analysis of Genomic/Proteomic/Metabolomic Data

  1. Overview
  2. System requirements
  3. Download

1. Overview

GeneNarrator is an automatic text-mining system that facilitates functional analysis of genomic/proteomic/metabolomic data.  The user supplies a list of genes/proteins/metabolites (GPMs) as input.  The system retrieves, or samples from, the PubMed abstracts that contain at least one GPM in the list and clusters them into hierarchical topics.  Each GPM is then weighted according to how its associated abstracts are distributed among the topics, and GPMs with similar weight distributions are grouped together.  For each topic, a list of representative terms, sentences, and abstracts is provided to help the user interpret its biological meaning.
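
The weighting and grouping step described above can be illustrated with a minimal sketch. It is not GeneNarrator's actual implementation: the class and method names are hypothetical, the weight is assumed to be the fraction of a GPM's abstracts assigned to each topic, similarity is assumed to be cosine similarity, and the greedy grouping with a fixed threshold is a stand-in for whatever grouping method the system really uses. The GPM names and topic assignments are made-up example data.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; not part of the GeneNarrator package.
public class GpmTopicWeights {

    // Assumed weighting: fraction of a GPM's abstracts that fall in each topic.
    static double[] topicWeights(int[] abstractTopics, int numTopics) {
        double[] weights = new double[numTopics];
        if (abstractTopics.length == 0) {
            return weights; // no abstracts: all weights stay 0
        }
        for (int topic : abstractTopics) {
            weights[topic] += 1.0;
        }
        for (int t = 0; t < numTopics; t++) {
            weights[t] /= abstractTopics.length;
        }
        return weights;
    }

    // Assumed similarity measure: cosine similarity of two weight vectors.
    static double cosine(double[] a, double[] b) {
        double dot = 0.0, normA = 0.0, normB = 0.0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        if (normA == 0.0 || normB == 0.0) {
            return 0.0;
        }
        return dot / Math.sqrt(normA * normB);
    }

    // Greedy grouping: a GPM joins the first group whose first member
    // is at least `threshold` similar to it; otherwise it starts a new group.
    static List<List<String>> group(Map<String, double[]> weights, double threshold) {
        List<List<String>> groups = new ArrayList<>();
        for (Map.Entry<String, double[]> entry : weights.entrySet()) {
            List<String> home = null;
            for (List<String> candidate : groups) {
                double[] representative = weights.get(candidate.get(0));
                if (cosine(representative, entry.getValue()) >= threshold) {
                    home = candidate;
                    break;
                }
            }
            if (home == null) {
                home = new ArrayList<>();
                groups.add(home);
            }
            home.add(entry.getKey());
        }
        return groups;
    }

    public static void main(String[] args) {
        // Made-up data: the topic index of each abstract associated with a GPM.
        Map<String, double[]> weights = new LinkedHashMap<>();
        weights.put("GPM_A", topicWeights(new int[] {0, 0, 1}, 3));
        weights.put("GPM_B", topicWeights(new int[] {0, 0, 0, 1}, 3));
        weights.put("GPM_C", topicWeights(new int[] {2, 2}, 3));
        System.out.println(group(weights, 0.9)); // prints [[GPM_A, GPM_B], [GPM_C]]
    }
}
```

In this example GPM_A and GPM_B have most of their abstracts in the same topic, so their weight vectors are nearly parallel and they end up in one group, while GPM_C, whose abstracts all fall in a different topic, forms its own group.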

More details can be found in the user manual (last updated 04/30/2005) and this paper (in preparation).

2. System requirements

Linux (required only for the "CrossBow" component; the other components can run on any platform).
Sufficient memory: the more memory available, the larger the dataset that can be analyzed. 1 GB is required for a moderate-sized dataset (~500 GPMs or ~10,000 abstracts).
Java 2 Platform, Standard Edition (J2SE v 1.4 or higher)
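
As a quick way to check the Java side of these requirements, the following sketch prints the running Java version and the maximum heap available to the JVM. It is an illustration under assumptions: the class name `HeapCheck` is hypothetical and not part of the GeneNarrator package, and this page does not say whether the 1 GB figure refers to the JVM heap or to total system memory, so the comparison against 1 GB is only a rough guide for the Java components.

```java
// Hypothetical helper; not part of the GeneNarrator package.
public class HeapCheck {

    static final long ONE_GB = 1024L * 1024L * 1024L;

    // True when the given heap limit meets the required size (both in bytes).
    static boolean heapAtLeast(long maxHeapBytes, long requiredBytes) {
        return maxHeapBytes >= requiredBytes;
    }

    public static void main(String[] args) {
        // Upper bound on the memory the JVM will attempt to use.
        long maxHeap = Runtime.getRuntime().maxMemory();
        System.out.println("Java version: " + System.getProperty("java.version"));
        System.out.println("Max JVM heap: " + (maxHeap / (1024L * 1024L)) + " MB");
        if (!heapAtLeast(maxHeap, ONE_GB)) {
            System.out.println("Heap is below 1 GB; consider raising it with the -Xmx option.");
        }
    }
}
```

The standard JVM option `-Xmx` sets the maximum heap size at launch (for example `java -Xmx2g HeapCheck`); the value the JVM reports may differ slightly from the value requested.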

3. Download

Modified "bow" source code (last updated 09/25/2004) (original "bow" here)
GeneNarrator package (last updated 06/27/2005)
GeneNarrator Java source code (last updated 06/27/2005)
Run BowViewer directly from your web browser (requires Java Web Start, which is included in J2SE 1.4.2)
A sample dataset (155 yeast genes from 10 pathways, last updated 12/18/2006)

Last modified: 12/19/2006

© 2004 Jing Ding. All Rights Reserved

Comments to: jing.ding@osumc.edu