contextCluster {clusternomics} | R Documentation |
This function fits the context-dependent clustering model to the data using Gibbs sampling. It allows the user to specify a different number of clusters on the global level, as well as on the local level.
contextCluster(datasets, clusterCounts, dataDistributions = "diagNormal", prior = NULL, maxIter = 1000, burnin = NULL, lag = 3, verbose = FALSE)
datasets |
List of data matrices where each matrix represents a context-specific dataset. Each data matrix has the size N times M, where N is the number of data points and M is the dimensionality of the data. The full list of matrices has length C. The number of data points N must be the same for all data matrices. |
clusterCounts |
Number of clusters on the global level and in each context.
List with the following structure: |
dataDistributions |
Distribution of data in each dataset. Can be either a list of
length C where |
prior |
Prior distribution. If |
maxIter |
Number of iterations of the Gibbs sampling algorithm. |
burnin |
Number of burn-in iterations that will be discarded. If not specified,
the algorithm discards the first half of the |
lag |
Used for thinning the samples. |
verbose |
Print progress, by default |
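The input requirements described above can be checked before calling the function. The following is a minimal sketch in base R, not part of the package. The random matrices and the variable names `context1`, `context2` and `kept` are hypothetical. The rule for which iterations survive burn-in and thinning is an assumption made for illustration; the package may index its retained samples differently.

```r
set.seed(42)

# N data points shared by all contexts, M dimensions per context
N <- 100
M <- 2

# Two hypothetical context-specific datasets (C = 2), each N x M
context1 <- matrix(rnorm(N * M), nrow = N, ncol = M)
context2 <- matrix(rnorm(N * M), nrow = N, ncol = M)
datasets <- list(context1, context2)

# Every matrix must have the same number of rows N
rowCounts <- vapply(datasets, nrow, integer(1))
stopifnot(length(unique(rowCounts)) == 1)

# Cluster counts in the form used by the example on this page:
# one global count and one count per context
clusterCounts <- list(global = 10, context = c(3, 3))
stopifnot(length(clusterCounts$context) == length(datasets))

# Illustration of burn-in and thinning (assumed rule, see lead-in):
# drop the first `burnin` iterations, then keep every `lag`-th one
maxIter <- 1000
burnin <- maxIter / 2
lag <- 3
kept <- seq(from = burnin + 1, to = maxIter, by = lag)
length(kept)
```

Under this assumed rule, a larger `lag` reduces autocorrelation between retained samples at the cost of keeping fewer of them, so `maxIter` usually needs to grow with `lag`.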
Returns a list containing the sequence of MCMC states and the log likelihoods of the individual states.
samples |
List of assignments sampled from the posterior,
each state |
logliks |
Log likelihoods during MCMC iterations. |
DIC |
Deviance information criterion to help select the number of clusters. Lower values of DIC correspond to better-fitting models. |
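Because lower DIC indicates a better fit, one way to use the returned value is to fit the model under several cluster-count settings and compare. The sketch below assumes the clusternomics package is installed (it is skipped otherwise) and that `DIC` is returned as a single number. The candidate settings in `candidateContexts` are hypothetical values chosen for illustration, and the very small `maxIter` is only there to keep the run short.

```r
# Hypothetical candidate numbers of clusters per context
candidateContexts <- list(c(2, 2), c(3, 3), c(4, 4))
dics <- rep(NA_real_, length(candidateContexts))

if (requireNamespace("clusternomics", quietly = TRUE)) {
  set.seed(1)
  # Simulated data, generated as in the example on this page
  testData <- clusternomics::generateTestData_2D(c(50, 10, 40, 60),
                                                 c(-1.5, 1.5))

  for (i in seq_along(candidateContexts)) {
    counts <- list(global = 10, context = candidateContexts[[i]])
    fit <- clusternomics::contextCluster(
      testData$data, counts,
      maxIter = 10, burnin = 5, lag = 1,
      dataDistributions = "diagNormal", verbose = FALSE
    )
    # Assumes DIC is a single numeric value
    dics[i] <- fit$DIC
  }

  # Setting with the lowest DIC among the candidates
  best <- candidateContexts[[which.min(dics)]]
  print(dics)
  print(best)
}
```

With so few iterations the DIC values are not reliable; in practice each fit would use a much larger `maxIter`, and the log likelihoods in `logliks` can be inspected to judge whether the sampler has settled.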
# Example with simulated data (see vignette for details)

# Number of elements in each cluster
groupCounts <- c(50, 10, 40, 60)
# Centers of clusters
means <- c(-1.5, 1.5)
testData <- generateTestData_2D(groupCounts, means)
datasets <- testData$data

# Fit the model
# 1. Specify number of clusters
clusterCounts <- list(global = 10, context = c(3, 3))
# 2. Run inference
# Number of iterations is just for demonstration purposes, use
# a larger number of iterations in practice!
results <- contextCluster(datasets, clusterCounts,
                          maxIter = 10, burnin = 5, lag = 1,
                          dataDistributions = 'diagNormal',
                          verbose = TRUE)

# Extract results from the samples
# Final state:
state <- results$samples[[length(results$samples)]]
# 1) Assignment to global clusters
globalAssgn <- state$Global
# 2) Context-specific assignments: assignment in a specific dataset (context)
contextAssgn <- state[, "Context 1"]

# Assess the fit of the model with DIC
results$DIC