Motivation The availability of entire genome sequences is shifting scientific curiosity toward large-scale identification of genome function, as in genome studies. In the near future, data about cellular processes at the molecular level will accumulate at an accelerating rate as a result of proteomics studies. It is therefore essential to develop tools for storing, integrating, accessing, and analyzing these data effectively.
- Define an ontology for a comprehensive representation of cellular pathways.
- Develop software tools and construct an associated database using this ontology, providing an effective environment for pathway data integration, storage, access, visualization and analysis.
- Design methods for automatic population and annotation of the pathway database.
- Design methods for inferring pathway activity using temporal data such as gene expression data.
- Develop techniques for effective visualization of pathway and gene expression data.
Results We have defined an ontology for a comprehensive representation of cellular events. The ontology enables integration of fragmented or incomplete pathway information and supports manipulation and incorporation of the stored data, as well as multiple levels of abstraction. Based on this ontology, we have designed environments composed of a scalable server-side database and client-side editors, providing an integrated, multi-user environment for visualizing and manipulating networks of cellular events. These tools feature advanced querying and automated pathway layout within a user-friendly graphical interface.

We expect that PATIKA will produce valuable tools for rapid knowledge acquisition, interpretation of large-scale microarray data, disease gene identification, and drug development.

An article on Pathway Informatics and the PATIKA project appeared in the September 2006 issue of Genome Technology magazine.
