SNFtool: Similarity Network Fusion
Package ‘SNFtool’
June 11, 2021

Type Package
Title Similarity Network Fusion
Version 2.3.1
Date 2021-06-10
Author Bo Wang, Aziz Mezlini, Feyyaz Demir, Marc Fiume, Zhuowen Tu, Michael Brudno, Benjamin Haibe-Kains, Anna Goldenberg
Maintainer Benjamin Brew
Imports ExPosition, alluvial
Description Similarity Network Fusion takes multiple views of a network and fuses them together to construct an overall status matrix. The input to our algorithm can be feature vectors, pairwise distances, or pairwise similarities. The learned status matrix can then be used for retrieval, clustering, and classification.
License GPL
NeedsCompilation no
Repository CRAN
Date/Publication 2021-06-11 08:40:15 UTC
R topics documented:
affinityMatrix
calNMI
chiDist2
concordanceNetworkNMI
Data1
Data2
dataL
displayClusters
displayClustersWithHeatmap
dist2
estimateNumberOfClustersGivenGraph
getColorsForGroups
groupPredict
heatmapPlus
label
plotAlluvial
rankFeaturesByNMI
SNF
spectralClustering
standardNormalization
affinityMatrix
Affinity matrix calculation
Description
Computes an affinity matrix from a generic distance matrix.

Usage
affinityMatrix(diff, K = 20, sigma = 0.5)

Arguments
diff    Distance matrix
K       Number of nearest neighbors
sigma   Variance for local model

Value
Returns an affinity matrix that represents the neighborhood graph of the data points.

Author(s)
Dr. Anna Goldenberg, Bo Wang, Aziz Mezlini, Feyyaz Demir

References
B Wang, A Mezlini, F Demir, M Fiume, Z Tu, M Brudno, B Haibe-Kains, A Goldenberg (2014) Similarity Network Fusion: a fast and effective method to aggregate multiple data types on a genome wide scale. Nature Methods. Online. Jan 26, 2014
Examples
## First, set all the parameters:
K = 20;      ## number of neighbors, must be greater than 1. usually (10~30)
alpha = 0.5; ## hyperparameter, usually (0.3~0.8)
T = 20;      ## Number of Iterations, usually (10~50)

## Data1 is of size n x d_1,
## where n is the number of patients, d_1 is the number of genes,
## Data2 is of size n x d_2,
## where n is the number of patients, d_2 is the number of methylation
data(Data1)
data(Data2)

## Calculate distance matrices (here we calculate Euclidean Distance,
## you can use other distance, e.g. correlation)
Dist1 = (dist2(as.matrix(Data1),as.matrix(Data1)))^(1/2)
Dist2 = (dist2(as.matrix(Data2),as.matrix(Data2)))^(1/2)

## Next, construct similarity graphs
W1 = affinityMatrix(Dist1, K, alpha)
W2 = affinityMatrix(Dist2, K, alpha)
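
The example takes the square root of the dist2 output before building the graph, which suggests that dist2 returns squared Euclidean distances. The base-R sketch below illustrates that distance step and one simple way to turn distances into affinities with a locally scaled Gaussian kernel. It is not the SNFtool implementation: the names squared_euclidean and simple_affinity are hypothetical, and the bandwidth rule is an assumption chosen only to show how K and sigma could enter such a kernel.

```r
# Base-R sketch (NOT the SNFtool code): distances, then a Gaussian affinity.

# Squared Euclidean distance between every pair of rows of X.
squared_euclidean <- function(X) {
  X <- as.matrix(X)
  sq_norms <- rowSums(X^2)
  D2 <- outer(sq_norms, sq_norms, "+") - 2 * tcrossprod(X)
  D2[D2 < 0] <- 0  # clip tiny negatives caused by rounding
  D2
}

# Hypothetical simplified kernel. Assumption: the bandwidth for a pair is the
# average of the two points' mean distance to their K nearest neighbours,
# scaled by sigma. Requires K + 1 <= nrow(D).
simple_affinity <- function(D, K = 20, sigma = 0.5) {
  D <- (D + t(D)) / 2          # enforce symmetry
  diag(D) <- 0
  # Mean distance to the K nearest neighbours (position 1 is the point itself).
  knn_mean <- apply(D, 1, function(d) mean(sort(d)[2:(K + 1)]))
  bandwidth <- sigma * outer(knn_mean, knn_mean, "+") / 2
  bandwidth[bandwidth <= 0] <- .Machine$double.eps
  exp(-D^2 / (2 * bandwidth^2))
}

set.seed(1)
X <- matrix(rnorm(40), nrow = 20)  # 20 points, 2 features
D <- sqrt(squared_euclidean(X))    # Euclidean distance, as in the example
W <- simple_affinity(D, K = 5, sigma = 0.5)
```

In this sketch a smaller sigma narrows the kernel, so only close neighbours keep a noticeable affinity, while a larger K smooths the local scale over more neighbours.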
calNMI
Mutual Information calculation
Description
Calculate the mutual information between vectors x and y.

Usage
calNMI(x, y)

Arguments
x   a vector
y   a vector

Value
Returns the mutual information between vectors x and y.

Author(s)
Dr. Anna Goldenberg, Bo Wang, Aziz Mezlini, Feyyaz Demir

References
B Wang, A Mezlini, F Demir, M Fiume, Z Tu, M Brudno, B Haibe-Kains, A Goldenberg (2014) Similarity Network Fusion: a fast and effective method to aggregate multiple data types on a genome wide scale. Nature Methods. Online. Jan 26, 2014
Examples
# How to use SNF with multiple views

# Load views into list "dataL"
data(dataL)
data(label)

# Set the other parameters
K = 20      # number of neighbours
alpha = 0.5 # hyperparameter in affinityMatrix
T = 20      # number of iterations of SNF

# Normalize the features in each of the views if necessary
# dataL = lapply(dataL, standardNormalization)

# Calculate the distances for each view
distL = lapply(dataL, function(x) (dist2(x, x))^(1/2))

# Construct the similarity graphs
affinityL = lapply(distL, function(x) affinityMatrix(x, K, alpha))

# Example of how to use SNF to perform subtyping
# Construct the fused network
W = SNF(affinityL, K, T)
# Perform clustering on the fused network.
clustering = spectralClustering(W,3);
# Use NMI to measure the goodness of the obtained labels.
NMI = calNMI(clustering,label);
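
To make the quantity in the last step more concrete, the following base-R sketch computes a normalised mutual information between two label vectors. It is an illustration only, not the calNMI source: the helper names entropy and nmi_sketch are hypothetical, and dividing by sqrt(H(x) * H(y)) is an assumed normalisation (other conventions exist).

```r
# Base-R sketch of normalised mutual information between two label vectors.
# Assumption: NMI = I(x; y) / sqrt(H(x) * H(y)).

# Shannon entropy of a probability vector (zero cells are skipped).
entropy <- function(p) {
  p <- p[p > 0]
  -sum(p * log(p))
}

nmi_sketch <- function(x, y) {
  joint <- table(x, y) / length(x)  # joint distribution of the two labellings
  hx <- entropy(rowSums(joint))
  hy <- entropy(colSums(joint))
  # A constant labelling carries no information about the other one.
  if (hx == 0 || hy == 0) return(0)
  mi <- hx + hy - entropy(as.vector(joint))  # I = H(x) + H(y) - H(x, y)
  max(0, mi / sqrt(hx * hy))
}

truth    <- rep(1:2, each = 50)
relabel  <- ifelse(truth == 1, 2, 1)  # same partition, cluster names swapped
shuffled <- rep(1:2, times = 50)      # statistically independent of `truth`

score_same <- nmi_sketch(truth, relabel)
score_indep <- nmi_sketch(truth, shuffled)
```

Because the score depends only on the partition, swapping cluster names leaves it at its maximum, which is why it is convenient for comparing spectral clustering output against reference labels.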
chiDist2
Pairwise Chi-squared distances
Description
Wrapper for the chi2Dist function imported from the ‘ExPosition’ package. Computes the Chi-squared distances between all pairs of data points given.

Usage
chiDist2(A)

Arguments
A   A data matrix where each row is a different data point

Value
Returns an N x N matrix where N is the number of rows in A. Element (i,j) is the squared Chi-squared distance between the ith data point in A and the jth data point in A.

Author(s)
Dr. Anna Goldenberg, Bo Wang, Aziz Mezlini, Feyyaz Demir
Examples
## Data1 is of size n x d_1,
## where n is the number of patients, d_1 is the number of genes,
## Data2 is of size n x d_2,
## where n is the number of patients, d_2 is the number of methylation
data(Data1)
data(Data2)

## Calculate distance matrices (here we calculate the Chi-squared distance)
Dist1 = chiDist2(as.matrix(Data1))
Dist2 = chiDist2(as.matrix(Data2))
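
For readers unfamiliar with the Chi-squared distance, the sketch below computes the textbook version used in correspondence analysis: rows are converted to profiles and each squared difference is weighted by the inverse column mass. This is an assumption about what the wrapped function computes, not a copy of the ExPosition code, and the name chi2_sketch is hypothetical. It assumes non-negative data with positive row and column sums.

```r
# Base-R sketch of the textbook squared Chi-squared distance between rows.
# Assumption: d2(i, j) = sum_k (p_ik - p_jk)^2 / c_k, with p the row profiles
# and c the column masses. The ExPosition implementation may differ in detail.
chi2_sketch <- function(A) {
  A <- as.matrix(A)
  profiles <- A / rowSums(A)        # each row rescaled to sum to 1
  col_mass <- colSums(A) / sum(A)   # share of the grand total in each column
  # Dividing by sqrt(col_mass) turns the weighted distance into a plain
  # Euclidean distance, which dist() can compute.
  scaled <- sweep(profiles, 2, sqrt(col_mass), "/")
  as.matrix(dist(scaled))^2
}

A <- rbind(c(10, 20, 30),
           c(1, 2, 3),      # same profile as row 1, smaller total
           c(30, 20, 10))
D <- chi2_sketch(A)
```

Rows 1 and 2 differ only by a scale factor, so their distance is zero under this definition; that scale invariance is the main practical difference from the Euclidean distance used elsewhere in the examples.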
concordanceNetworkNMI
Concordance Network NMI calculation

Description
Given a list of affinity matrices, Wall, and the number of clusters, C, returns a matrix containing the NMIs between cluster assignments made with spectral clustering on all matrices provided.

Usage
concordanceNetworkNMI(Wall, C)

Arguments
Wall    List of matrices. Each element of the list is a square, symmetric matrix that shows affinities of the data points from a certain view.
C       Number of clusters

Value
Returns a matrix containing the NMIs between the cluster assignments made with spectral clustering on the matrices provided.

Author(s)
Dr. Anna Goldenberg, Bo Wang, Aziz Mezlini, Feyyaz Demir
Examples
# How to use SNF with multiple views

# Load views into list "dataL"
data(dataL)
data(label)

# Set the other parameters
K = 20      # number of neighbours
alpha = 0.5 # hyperparameter in affinityMatrix
T = 20      # number of iterations of SNF

# Normalize the features in each of the views.
# dataL = lapply(dataL, standardNormalization)

# Calculate the distances for each view
distL = lapply(dataL, function(x) (dist2(x, x)^(1/2)))

# Construct the similarity graphs
affinityL = lapply(distL, function(x) affinityMatrix(x, K, alpha))

# an example of how to use concordanceNetworkNMI
Concordance_matrix = concordanceNetworkNMI(affinityL, 3);

## The output, Concordance_matrix,
## shows the concordance between the fused network and each individual network.
Data1
Data1
Description
Data1 dataset used to demonstrate the use of SNFtool.

Usage
data(Data1)

Format
A data frame with 200 observations on the following 2 variables.
V1   a numeric vector
V2   a numeric vector

Examples
data(Data1)
Data2
Data2
Description
Data2 dataset used to demonstrate the use of SNFtool.

Usage
data(Data2)

Format
A data frame with 200 observations on the following 2 variables.
V3   a numeric vector
V4   a numeric vector

Examples
data(Data2)
dataL
dataL
Description
Dataset used to provide an example of predicting the new labels with label propagation.

Usage
data(dataL)

Format
The format is:
List of 2
 $ : num [1:600, 1:76] 0.0659 0.0491 0.0342 0.0623 0.062 ...
  ..- attr(*, "dimnames")=List of 2
  .. ..$ : chr [1:600] "V1" "V2" "V3" "V4" ...
  .. ..$ : NULL
 $ : int [1:600, 1:240] 0 0 0 0 0 0 0 0 0 0 ...
  ..- attr(*, "dimnames")=List of 2
  .. ..$ : chr [1:600] "V1" "V2" "V3" "V4" ...
  .. ..$ : NULL

Examples
data(dataL)
displayClusters
Plot given similarity matrix by clusters
Description
Visualize the clusters in a given similarity matrix.

Usage
displayClusters(W, group)

Arguments
W       Similarity matrix
group   A vector containing the labels for each sample in W.

Value
Plots the given similarity matrix with patients ordered to form clusters.

Author(s)
Dr. Anna Goldenberg, Bo Wang, Aziz Mezlini, Feyyaz Demir
Examples
## First, set all the parameters:
K = 20;      # number of neighbors, usually (10~30)
alpha = 0.5; # hyperparameter, usually (0.3~0.8)
T = 10;      # Number of Iterations, usually (10~20)

## Data1 is of size n x d_1,
## where n is the number of patients, d_1 is the number of genes,
## Data2 is of size n x d_2,
## where n is the number of patients, d_2 is the number of methylation
data(Data1)
data(Data2)

## Here, the simulation data (SNFdata) has two data types. They are complementary to each other.
## The two data types have the same number of points.
## The first half of the data belongs to the first cluster; the rest belongs to the second cluster.
truelabel = c(matrix(1,100,1),matrix(2,100,1)); ## the ground truth of the simulated data

## Calculate distance matrices
## (here we calculate Euclidean Distance, you can use other distance, e.g., correlation)

## If the data are all continuous values, we recommend that users perform
## standard normalization before using SNF,
## though it is optional depending on the data the users want to use.
# Data1 = standardNormalization(Data1);
# Data2 = standardNormalization(Data2);

## Calculate the pair-wise distance;
## If the data is continuous, we recommend using the function "dist2" as follows
Dist1 = (dist2(as.matrix(Data1),as.matrix(Data1)))^(1/2)
Dist2 = (dist2(as.matrix(Data2),as.matrix(Data2)))^(1/2)

## next, construct similarity graphs
W1 = affinityMatrix(Dist1, K, alpha)
W2 = affinityMatrix(Dist2, K, alpha)

## These similarity graphs have complementary information about clusters.
displayClusters(W1, truelabel);
displayClusters(W2, truelabel);
displayClustersWithHeatmap
Display the similarity matrix by clusters with some sample information

Description
Visualize the clusters present in the given similarity matrix as well as some sample information.

Usage
displayClustersWithHeatmap(W, group, ColSideColors=NULL, ...)

Arguments
W               Similarity matrix
group           A numeric vector containing the group information for each sample in W, such as the result of the spectralClustering function. The order should correspond to the sample order in W.
ColSideColors   (optional) character vector of length ncol(x) containing the color names for a horizontal side bar that may be used to annotate the columns of x, used by the heatmap function, OR a character matrix with number of rows matching number of rows in x. Each column is plotted as a row similar to heatmap()’s ColSideColors by the heatmap.plus function.
...             Other parameters that can be passed on to the heatmap function (if ColSideColors is NULL or a vector) or to the heatmap.plus function (if ColSideColors is a matrix).
Details
The similarity matrix is displayed using the heatmap or heatmap.plus function. For representation purposes, the diagonal of the similarity matrix is set to the median value of W, the matrix is normalised, and W = W + t(W) is applied. No clustering method is run for this presentation; the samples are ordered according to their group label given in the group argument.
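
The preprocessing described in Details can be sketched in base R as follows. This is an illustration under stated assumptions, not the package source: the function name prepare_for_heatmap is hypothetical, and "normalised" is assumed here to mean dividing each row by its row sum.

```r
# Base-R sketch of the preprocessing described in Details.
# Assumption: "normalised" means each row is divided by its row sum.
prepare_for_heatmap <- function(W, group) {
  ord <- order(group)              # order samples by group label, no clustering
  diag(W) <- median(as.vector(W))  # damp the otherwise dominant diagonal
  W <- W / rowSums(W)              # assumed row normalisation
  W <- W + t(W)                    # make the matrix symmetric again
  W[ord, ord]
}

set.seed(1)
group <- sample(rep(1:2, each = 5))  # 10 samples, group labels in shuffled order
# Toy similarity matrix: 0.9 within a group, 0.1 between groups.
W <- outer(group, group, function(a, b) ifelse(a == b, 0.9, 0.1))
P <- prepare_for_heatmap(W, group)
```

After reordering, samples of the same group sit next to each other, so the within-group similarities form blocks along the diagonal. The result could then be passed to stats::heatmap with Rowv = NA and Colv = NA so that the group ordering is preserved.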
Value
Plots the similarity matrix using the heatmap function. Samples are ordered by the clusters provided in the group argument, with sample information displayed as a color bar if the ColSideColors argument is supplied.

Author(s)
Florence Cavalli
Examples
## First, set all the parameters:
K = 20;      # number of neighbors, usually (10~30)
alpha = 0.5; # hyperparameter, usually (0.3~0.8)
T = 20;      # Number of Iterations, usually (10~20)

## Data1 is of size n x d_1,
## where n is the number of patients, d_1 is the number of genes,
## Data2 is of size n x d_2,
## where n is the number of patients, d_2 is the number of methylation
data(Data1)
data(Data2)

## Here, the simulation data (SNFdata) has two data types. They are complementary to each other.
## The two data types have the same number of points.
## The first half of the data belongs to the first cluster; the rest belongs to the second cluster.
truelabel = c(matrix(1,100,1),matrix(2,100,1)); ## the ground truth of the simulated data

## Calculate distance matrices
## (here we calculate Euclidean Distance, you can use other distance, e.g., correlation)

## If the data are all continuous values, we recommend that users perform
## standard normalization before using SNF,
## though it is optional depending on the data the users want to use.
# Data1 = standardNormalization(Data1);
# Data2 = standardNormalization(Data2);

## Calculate the pair-wise distance;
## If the data is continuous, we recommend using the function "dist2" as follows
Dist1 = (dist2(as.matrix(Data1),as.matrix(Data1)))^(1/2)
Dist2 = (dist2(as.matrix(Data2),as.matrix(Data2)))^(1/2)

## next, construct similarity graphs
W1 = affinityMatrix(Dist1, K, alpha)