Hidden Nodes Algorithm

We are now in an era of high-throughput expression profiling, and advancing technology brings a fundamental biological issue into focus: how can expression profiles be used to recapitulate the role of key regulatory proteins that are functionally relevant to those profiles but remain undetected because their own expression does not change? This situation arises in instances such as:

  • A gene/protein undergoes post-translational modification.
  • The activity of a gene/protein is altered by, or depends on, binding to second messengers or recruitment to a particular sub-cellular locale.
  • The role of a gene/protein differs when it is part of a complex.

This applies to drug targeting analyses where the drug targets may not change in expression but their altered activity is key to the disease outcome.

GeneGo’s new Hidden Node Algorithm is geared to identify “hidden” regulatory proteins and their pathways by analyzing disease- and condition-specific molecular profiles in the context of the global protein interaction network.

How the Algorithm Works

The general concept is to find “hidden” proteins by identifying sets of their likely downstream targets and assessing the enrichment of those sets with differentially expressed genes or proteins. We call this procedure “topological scoring”.

First, we build a dependency graph for each regulatory protein from the input data, using the entire universe of protein-protein interactions from MetaBase™.

For example, the figure on the right shows a dependency graph for node X, consisting of all nodes downstream of X; X could be a potential regulator of any of these nodes.
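The idea of a dependency graph can be illustrated with a minimal sketch. It assumes the interaction network is a plain dictionary mapping each protein to the proteins it acts on, and it uses breadth-first search to collect everything downstream of a regulator. The function name, the data layout, and the toy network are hypothetical and are not taken from MetaBase or from the GeneGo implementation.

```python
from collections import deque


def dependency_graph(network, regulator):
    """Return {node: hop distance} for every node downstream of `regulator`.

    `network` is an assumed representation: a dict mapping each protein
    to an iterable of the proteins it directly acts on.
    """
    distances = {regulator: 0}
    queue = deque([regulator])
    while queue:
        node = queue.popleft()
        for target in network.get(node, ()):
            if target not in distances:  # first visit is the shortest path
                distances[target] = distances[node] + 1
                queue.append(target)
    distances.pop(regulator)  # the regulator is not its own target
    return distances


# Hypothetical toy network: X acts on B and C, which act on D and E.
toy_network = {
    "X": ["B", "C"],
    "B": ["D"],
    "C": ["D", "E"],
    "A": ["E"],
}

downstream_of_x = dependency_graph(toy_network, "X")
```

Here `downstream_of_x` holds B and C at one hop and D and E at two hops, so X is a potential regulator of all four nodes.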

In real protein networks, dependency graphs for many important nodes contain thousands of nodes and are never significantly enriched. Second, we identify parts of this graph that can be more directly regulated by other signaling proteins (such as “A” in the figure below) and exclude them from consideration.

The section marked in yellow is the part of the X dependency graph that is as close to another node, A, as it is to X. In our method, we attribute regulation of this part to A rather than X.
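The exclusion step can be sketched as follows. This is an illustration under stated assumptions, not the published procedure: closeness is measured as hop distance along directed interactions, a downstream node is dropped when any other candidate regulator is as close to it as X is or closer, and the list of other regulators is supplied by the caller. All names and the toy network are hypothetical.

```python
from collections import deque


def hop_distances(network, source):
    """Breadth-first hop distance from `source` to each node downstream of it."""
    distances = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for target in network.get(node, ()):
            if target not in distances:
                distances[target] = distances[node] + 1
                queue.append(target)
    distances.pop(source)  # keep downstream nodes only
    return distances


def prune_dependency_graph(network, regulator, other_regulators):
    """Keep only downstream nodes that are strictly closer to `regulator`
    than to every other candidate regulator (ties go to the other node)."""
    own = hop_distances(network, regulator)
    kept = set(own)
    for other in other_regulators:
        if other == regulator:
            continue
        rival = hop_distances(network, other)
        kept = {n for n in kept if n not in rival or rival[n] > own[n]}
    return kept


# Hypothetical toy network: E and F sit closer to A than to X.
toy_network = {
    "X": ["B", "C"],
    "B": ["D"],
    "C": ["E"],
    "A": ["E"],
    "E": ["F"],
}

remaining = prune_dependency_graph(toy_network, "X", ["A"])
```

In this toy case E is one hop from A but two hops from X, and F is two hops from A but three from X, so both are attributed to A and only B, C, and D remain in the graph for X.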

Finally, we calculate enrichment of the remaining part of the dependency graph with experimentally identified genes or proteins from a condition- or disease-specific profile of interest.
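The text does not say which statistic is used for this enrichment calculation. A common choice for questions of this kind is the one-sided hypergeometric test, shown here as a hedged sketch using only the Python standard library. The function name, the parameters, and the example counts are hypothetical.

```python
from math import comb


def enrichment_p_value(universe_size, differential_count, target_count, overlap):
    """One-sided hypergeometric p-value (an assumed choice of test).

    Probability of seeing at least `overlap` differentially expressed
    nodes among `target_count` downstream targets, when
    `differential_count` of the `universe_size` nodes in the network
    are differentially expressed.
    """
    upper = min(differential_count, target_count)
    total = comb(universe_size, target_count)
    tail = sum(
        comb(differential_count, k)
        * comb(universe_size - differential_count, target_count - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total


# Hypothetical counts: 20 nodes in the network, 5 differentially expressed,
# 4 targets left in the pruned dependency graph, 3 of them differential.
p_value = enrichment_p_value(20, 5, 4, 3)
```

A small p-value suggests that the remaining targets of a regulator are hit by the profile more often than chance would explain, which is the kind of signal that would flag the regulator as a candidate hidden node.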

An example is shown below, where JAK1 and STAT1 were identified as “hidden nodes” (in blue boxes) for a given expression profile. All objects originating from the expression profile are marked with red or blue circles in the upper right-hand corner of the object. In this example, JAK1 and STAT1 were drawn from MetaBase and were not detected in the original expression profile, but they are key to regulating the downstream effects of PLAUR on the targets listed on the right (i.e., ICAM, CCL2, TIMP1, Aquaporin 9). Thus, topological significance is essential for identifying important links in pathways that do not show up in high-throughput expression data.

JAK1 provides an essential network conduit between PLAUR and many differentially expressed targets of STAT1.

For a demonstration of the new Hidden Node Algorithm, download the demonstration webinar recording here.

The paper describing the algorithm can be accessed by clicking here.

Click Here to access the Hidden Nodes Algorithm