1. What's in the database?

The Pathway Interaction Database (PID) contains information about molecular interactions and biological processes in biomolecular pathways. All interactions are assembled into pathways and can be accessed by searching for biomolecules or processes, or by viewing predefined pathways. The Browse pathways page lists the predefined pathways and provides a good overview of the database content.

2. What's the source of data in PID?

There are three sources: NCI-Nature curated data, BioCarta data and Reactome data. Between 2006 and 2012, the NCI-Nature curated data were created by Nature Publishing Group editors and reviewed by experts in the field. Biomolecules are annotated with UniProt protein identifiers and relevant post-translational modifications. Interactions are annotated with evidence codes and references. In contrast, BioCarta data from June 2004 were imported without expert review, and biomolecules are annotated with Entrez Gene identifiers without associated post-translational modifications, evidence codes or references. Human pathways from Reactome were first imported into the PID in December 2007 and correspond to Reactome's Version 22 release. Reactome data are annotated with UniProt identifiers, post-translational modifications and references.

For more detailed information please read the Database content and Data representation sections.

3. How is data represented in the database?

Data is represented in a highly structured and granular data model, which is reviewed in Data representation.

Please also view the Network maps section for descriptions of the text-based and graphical representations of the interaction networks.

4. What sorts of search can I perform?

The simplest way to search the database is to browse the predefined pathways.

In addition, a search box on the homepage allows users to query multiple object types within the database. To query biomolecules, UniProt protein accession numbers, HUGO gene symbols, Entrez Gene identifiers, aliases listed in Entrez Gene, CAS numbers and compound names may be used. To query biological processes, Gene Ontology (GO) identifiers (entered as GO:xxxxxxx), GO biological process terms, NCI thesaurus terms, and NCI thesaurus identifiers may be used. A user may enter any combination of the above-mentioned terms and identifiers.
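The identifier types listed above have recognizable shapes, so a mixed list of terms can be sorted by type before it is entered. The following is a minimal Python sketch, not PID's own parser: the function name classify_term, the labels and the patterns are assumptions made for illustration. The UniProt pattern follows the published accession format, the CAS pattern checks only the digit grouping (not the check digit), and any all-digit term is treated as an Entrez Gene identifier.

```python
import re

# Heuristic patterns, checked in order. These are illustrative assumptions,
# not the rules PID itself applies to the search box.
PATTERNS = [
    ("go_id", re.compile(r"GO:\d{7}")),
    ("cas_number", re.compile(r"\d{2,7}-\d{2}-\d")),
    (
        "uniprot_accession",
        re.compile(
            r"[OPQ][0-9][A-Z0-9]{3}[0-9]"
            r"|[A-NR-Z][0-9](?:[A-Z][A-Z0-9]{2}[0-9]){1,2}"
        ),
    ),
    ("entrez_gene_id", re.compile(r"\d+")),
]


def classify_term(term):
    """Return a label guessing which kind of identifier a query term is."""
    cleaned = term.strip()
    for label, pattern in PATTERNS:
        # fullmatch ensures the whole term has the expected shape.
        if pattern.fullmatch(cleaned):
            return label
    # Gene symbols, aliases, compound names, GO and NCI thesaurus terms
    # cannot be told apart by shape alone.
    return "name_or_symbol"
```

Terms that fall through to "name_or_symbol" still need to be checked by eye, since gene symbols, aliases and thesaurus terms share no fixed format.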

An Advanced search can be performed using biomolecule names or identifiers, pathway names, GO biological process terms or identifiers, NCI thesaurus terms or identifiers or any combination of these, with an option to limit the search by evidence-type tags, called evidence codes, and an option to include upstream and/or downstream interactions.

A Connected molecules search finds one path connecting two or more query biomolecules.

A Batch query allows users to upload potentially long lists of biomolecules, entered in a single column, and view their relationships in pathways or interaction network maps. Two lists can be uploaded simultaneously, with the biomolecules colored accordingly in the search results.
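Because a Batch query expects biomolecules in a single column, a list taken from an analysis pipeline may need light cleanup first. The sketch below shows one way to do that in Python, assuming that one identifier per line is what "single column" means here. The helper name to_single_column is hypothetical, and the exact file requirements should be checked in Searching PID.

```python
def to_single_column(identifiers):
    """Return identifiers as text with one entry per line.

    Blank entries are dropped, surrounding whitespace is stripped, and
    duplicates are removed while keeping the first-seen order.
    """
    seen = set()
    lines = []
    for raw in identifiers:
        entry = raw.strip()
        if not entry or entry in seen:
            continue
        seen.add(entry)
        lines.append(entry)
    # End with a newline so the last entry is a complete line.
    return "".join(line + "\n" for line in lines)
```

When two lists are uploaded together, the same helper can be applied to each list separately, producing one file per list.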

See Searching PID for more detailed query instructions.

5. What sorts of computational analyses can I perform?

On the Batch query page, you can upload long lists of UniProt and/or Entrez Gene names or identifiers derived from, for example, high-throughput expression data, mass spectrometry data or any other high-throughput data. You may then choose to obtain interaction network maps for all biomolecules in your list or to overlay your biomolecules onto predefined pathway(s). Biomolecules from the list(s) will be color-coded within the pathway or network map.

6. Does the PID provide programmatic access to data?

Yes. Access is provided through the caGrid interface to caBIO. Please click here for further information.

7. What if my pathway of interest is either not yet in the database or is out of date?

Please let us know by sending us feedback.

8. Can I obtain a list of biomolecules and a list of references used in a pathway?

In the predefined pathway view you can click on the Molecule List tab located below the 'submit feedback for this pathway' link to obtain a list of biomolecules found within the pathway. Similarly, the References tab brings you to the list of references used to curate the pathway.

9. In which formats is the data available?

Pathways are available in graphical Joint Photographic Experts Group (JPG), Scalable Vector Graphics (SVG) and Silverlight formats, as well as text-based PID Extensible Markup Language (XML) and Biological Pathways Exchange (BioPAX) Level 2 pathway data exchange formats. The SVG format requires an SVG-aware web browser, so you may need to install an SVG plug-in. The Silverlight format also requires a plug-in. Please view the Output formats section for more detailed information on these output formats.
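The text-based formats can be read with any XML parser. As a rough illustration, the Python sketch below counts the top-level BioPAX Level 2 classes in a document using only the standard library. The SAMPLE document is hand-written for this example and is not a real PID export, the function name count_biopax_classes is hypothetical, and the sketch assumes that instances appear as direct children of the rdf:RDF root element.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Namespace of the BioPAX Level 2 ontology.
BP2_NS = "http://www.biopax.org/release/biopax-level2.owl#"

# Hand-written miniature document for illustration only; not PID output.
SAMPLE = """<rdf:RDF
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:bp="http://www.biopax.org/release/biopax-level2.owl#">
  <bp:pathway rdf:ID="pathway1"/>
  <bp:protein rdf:ID="protein1"/>
  <bp:protein rdf:ID="protein2"/>
</rdf:RDF>"""


def count_biopax_classes(xml_text):
    """Count direct children of the root that belong to the BioPAX namespace."""
    root = ET.fromstring(xml_text)
    prefix = "{" + BP2_NS + "}"
    counts = Counter()
    for child in root:
        # ElementTree writes qualified tags as "{namespace}localname".
        if child.tag.startswith(prefix):
            counts[child.tag[len(prefix):]] += 1
    return counts
```

For anything beyond a quick inventory, a dedicated BioPAX or RDF library is likely a better fit than plain XML parsing, since RDF allows the same data to be serialized in more than one shape.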

