Similarity Network Fusion (SNF)

 
 

Similarity Network Fusion (SNF) is a new computational method for data integration. Briefly, SNF combines many different types of measurements (such as mRNA expression, DNA methylation, and miRNA expression, as well as clinical data, questionnaires, image data, etc.) for a given set of samples (e.g. patients). SNF first constructs a sample similarity network for each data type and then iteratively integrates these networks using a novel network fusion method. Working in the sample network space lets SNF avoid dealing with differences in scale, collection bias, and noise across data types. Integrating the data in a non-linear fashion lets SNF take advantage of both the common and the complementary information in the different data types.

 

Description

The example above illustrates the SNF approach on the task of fusing two data types, mRNA expression and DNA methylation, for the same cohort of patients:

1. Construct a patient similarity matrix for each data type using pairwise correlation.
2. Treat each patient similarity matrix as the equivalent patient similarity network, in which patients are nodes and edges represent patients’ pairwise similarities.
3. Starting with the patient similarity networks, run the patient network fusion, iteratively updating each network with information from the other networks and making them more similar with each step.
4. Obtain the final fused network of patients to which the SNF process has converged. Edge color indicates which data type has contributed to the given similarity.
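The four steps above can be sketched in a few lines of plain Python. This is a simplified illustration, not the SNFtool or SNFmatlab code: the function names, the rescaling of correlations to [0, 1], the neighbourhood size `k`, the fixed iteration count, and the toy data are all assumptions made for the example. For real analyses, use the packages listed under CODE.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant feature vectors."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def similarity_matrix(data):
    """Step 1: patient-by-patient similarity (assumption: r rescaled to [0, 1])."""
    return [[max(0.0, (1.0 + pearson(a, b)) / 2.0) for b in data] for a in data]

def row_normalize(m):
    out = []
    for row in m:
        total = sum(row)
        out.append([v / total for v in row])
    return out

def knn_kernel(w, k):
    """Keep each patient's k strongest edges (self included), then row-normalize."""
    sparse = []
    for row in w:
        keep = set(sorted(range(len(row)), key=lambda j: row[j], reverse=True)[:k])
        sparse.append([v if j in keep else 0.0 for j, v in enumerate(row)])
    return row_normalize(sparse)

def transpose(m):
    return [list(col) for col in zip(*m)]

def matmul(a, b):
    bt = transpose(b)
    return [[sum(x * y for x, y in zip(row, col)) for col in bt] for row in a]

def fuse(datasets, k=3, iterations=20):
    """Steps 2-4: diffuse each network through the others, then average them."""
    w = [similarity_matrix(d) for d in datasets]
    n = len(w[0])
    p = [row_normalize(m) for m in w]      # full networks, one per data type
    s = [knn_kernel(m, k) for m in w]      # sparse local-neighbourhood networks
    for _ in range(iterations):
        updated = []
        for v in range(len(p)):
            others = [p[u] for u in range(len(p)) if u != v]
            mean_other = [[sum(m[i][j] for m in others) / len(others)
                           for j in range(n)] for i in range(n)]
            # Update network v with the information from the other networks.
            diffused = matmul(matmul(s[v], mean_other), transpose(s[v]))
            updated.append(row_normalize(diffused))
        p = updated
    fused = [[sum(m[i][j] for m in p) / len(p) for j in range(n)] for i in range(n)]
    # Symmetrize so the result can be read as an undirected patient network.
    return [[(fused[i][j] + fused[j][i]) / 2.0 for j in range(n)] for i in range(n)]

# Toy cohort (made up): patients 0-2 form one group, patients 3-5 another.
mrna = [[1.0, 2.0, 3.0, 4.0], [1.1, 2.0, 3.2, 3.9], [0.9, 2.1, 2.9, 4.2],
        [4.0, 3.0, 2.0, 1.0], [4.1, 2.9, 2.1, 0.8], [3.8, 3.1, 1.9, 1.1]]
meth = [[0.2, 0.8, 0.1, 0.9], [0.25, 0.75, 0.15, 0.85], [0.15, 0.9, 0.1, 0.8],
        [0.9, 0.1, 0.8, 0.2], [0.85, 0.2, 0.9, 0.1], [0.8, 0.15, 0.75, 0.25]]

fused = fuse([mrna, meth], k=3)
for row in fused:
    print([round(v, 3) for v in row])
```

In this toy cohort the fused edges within each group of three patients come out stronger than the edges between the groups, which is the kind of structure a downstream clustering step would pick up.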

CONTACT: For questions or comments about the code, please contact Bo Wang.

CODE: R: SNFtool_v2.1.tar.gz; Matlab: SNFmatlab_v2.1.zip (see the Updates section for more details)

DATA: GBM.zip, Breast.zip, Colon.zip, Kidney.zip, Lung.zip

Disclaimer: these data were used in our paper. For more recent versions of the data collections for these cancers, please see the TCGA website directly.

CITATION: B Wang, A Mezlini, F Demir, M Fiume, T Zu, M Brudno, B Haibe-Kains, A Goldenberg (2014) Similarity Network Fusion: a fast and effective method to aggregate multiple data types on a genome wide scale. Nature Methods. Online. Jan 26, 2014  

As of Jan 26, 2014, our paper is online at Nature Methods.