Commit fdbe27f5 authored by Berenger Bramas's avatar Berenger Bramas

Remove hmat

parent 7682976d
License Information
===================
The Open Source Clustering Software consists of several packages, which have
different licenses.
* Cluster 3.0 is a GUI-based program for Windows, Mac OS X, Linux, and Unix.
It is based on Michael Eisen's Cluster/TreeView code. Cluster 3.0 is covered
by Michael Eisen's original license, available at
http://rana.lbl.gov/EisenSoftwareSource.htm. The command-line version of
Cluster 3.0 is also covered by this license.
* Pycluster is an extension module to the scripting language Python. It is
covered by the Python License (same license as Python itself).
* Algorithm::Cluster, the interface to the scripting language Perl. It was
released under the Artistic License (same license as Perl itself).
* The routines in the C Clustering Library can also be used directly by calling
them from other C programs. In that case, the Python License applies.
In all cases, copyright notices must be retained in their original form.
Open Source Clustering Software
===============================
The Open Source Clustering Software consists of the most commonly used routines
for clustering analysis of gene expression data. The software packages below all
depend on the C Clustering Library, which is a library of routines for
hierarchical (pairwise single-, complete-, maximum-, and average-linkage)
clustering, k-means clustering, and Self-Organizing Maps on a 2D rectangular
grid. The C Clustering Library complies with the ANSI C standard.
Several packages are available as part of the Open Source Clustering Software:
* Cluster 3.0 is a GUI-based program for Windows, based on Michael Eisen's
Cluster/TreeView code. Cluster 3.0 was written for Microsoft Windows, and
subsequently ported to Mac OS X (Cocoa) and Unix/Linux. Cluster 3.0 can
also be used as a command line program.
* Pycluster (or Bio.Cluster if used as part of Biopython) is an extension
module to the scripting language Python.
* Algorithm::Cluster is an extension module to the scripting language Perl.
* The routines in the C Clustering Library can also be used directly by calling
them from other C programs.
INSTALLATION
============
See the INSTALL file in this directory.
VIEWING CLUSTERING RESULTS
==========================
We recommend using Java TreeView for visualizing clustering results.
Java TreeView is a Java version of Michael Eisen's Treeview program with
extended capabilities. In particular, it is possible to visualize k-means
clustering results in addition to hierarchical clustering results.
Java TreeView was written by Alok Saldanha at Stanford University; it can be
downloaded at http://jtreeview.sourceforge.net.
MANUAL
======
The routines in the C Clustering Library is described in the manual
(cluster.pdf). This manual also describes how to use the routines from Python
and from Perl. Cluster 3.0 has a separate manual (cluster3.pdf). Both of these
manuals can be found in the doc subdirectory. They can also be downloaded from
our website:
http://bonsai.ims.u-tokyo.ac.jp/~mdehoon/software/cluster/cluster.pdf;
http://bonsai.ims.u-tokyo.ac.jp/~mdehoon/software/cluster/cluster3.pdf.
LITERATURE
==========
M.J.L. de Hoon, S. Imoto, J. Nolan, and S. Miyano: "Open Source Clustering
Software", Bioinformatics 20(9): 1453-1454 (2004).
CONTACT
=======
Michiel de Hoon
University of Tokyo, Institute of Medical Science
Human Genome Center, Laboratory of DNA Information Analysis
Currently at
RIKEN Genomic Sciences Center
mdehoon 'AT' gsc.riken.jp
This source diff could not be displayed because it is too large. You can view the blob instead.
/******************************************************************************/
/* The C Clustering Library.
* Copyright (C) 2002 Michiel Jan Laurens de Hoon.
*
* This library was written at the Laboratory of DNA Information Analysis,
* Human Genome Center, Institute of Medical Science, University of Tokyo,
* 4-6-1 Shirokanedai, Minato-ku, Tokyo 108-8639, Japan.
* Contact: mdehoon 'AT' gsc.riken.jp
*
* Permission to use, copy, modify, and distribute this software and its
* documentation with or without modifications and for any purpose and
* without fee is hereby granted, provided that any copyright notices
* appear in all copies and that both those copyright notices and this
* permission notice appear in supporting documentation, and that the
* names of the contributors or copyright holders not be used in
* advertising or publicity pertaining to distribution of the software
* without specific prior permission.
*
* THE CONTRIBUTORS AND COPYRIGHT HOLDERS OF THIS SOFTWARE DISCLAIM ALL
* WARRANTIES WITH REGARD TO THIS SOFTWARE, INCLUDING ALL IMPLIED
* WARRANTIES OF MERCHANTABILITY AND FITNESS, IN NO EVENT SHALL THE
* CONTRIBUTORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY SPECIAL, INDIRECT
* OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES WHATSOEVER RESULTING FROM LOSS
* OF USE, DATA OR PROFITS, WHETHER IN AN ACTION OF CONTRACT, NEGLIGENCE
* OR OTHER TORTIOUS ACTION, ARISING OUT OF OR IN CONNECTION WITH THE USE
* OR PERFORMANCE OF THIS SOFTWARE.
*
*/
// @SCALFMM_PRIVATE
#ifdef WINDOWS
# include <windows.h>
#endif
#define CLUSTERVERSION "1.52a"
/* Chapter 2 */
double clusterdistance (int nrows, int ncolumns, double** data, int** mask,
double weight[], int n1, int n2, int index1[], int index2[], char dist,
char method, int transpose);
double** distancematrix (int ngenes, int ndata, double** data,
int** mask, double* weight, char dist, int transpose);
/* Chapter 3 */
int getclustercentroids(int nclusters, int nrows, int ncolumns,
double** data, int** mask, int clusterid[], double** cdata, int** cmask,
int transpose, char method);
void getclustermedoids(int nclusters, int nelements, double** distance,
int clusterid[], int centroids[], double errors[]);
void kcluster (int nclusters, int ngenes, int ndata, double** data,
int** mask, double weight[], int transpose, int npass, char method, char dist,
int clusterid[], double* error, int* ifound);
void kmedoids (int nclusters, int nelements, double** distance,
int npass, int clusterid[], double* error, int* ifound);
/* Chapter 4 */
typedef struct {int left; int right; double distance;} Node;
/*
* A Node struct describes a single node in a tree created by hierarchical
* clustering. The tree can be represented by an array of n Node structs,
* where n is the number of elements minus one. The integers left and right
* in each Node struct refer to the two elements or subnodes that are joined
* in this node. The original elements are numbered 0..nelements-1, and the
* nodes -1..-(nelements-1). For each node, distance contains the distance
* between the two subnodes that were joined.
*/
Node* treecluster (int nrows, int ncolumns, double** data, int** mask,
double weight[], int transpose, char dist, char method, double** distmatrix);
void cuttree (int nelements, Node* tree, int nclusters, int clusterid[]);
/* Chapter 5 */
void somcluster (int nrows, int ncolumns, double** data, int** mask,
const double weight[], int transpose, int nxnodes, int nynodes,
double inittau, int niter, char dist, double*** celldata,
int clusterid[][2]);
/* Chapter 6 */
int pca(int m, int n, double** u, double** v, double* w);
/* Utility routines, currently undocumented */
void sort(int n, const double data[], int index[]);
double mean(int n, double x[]);
double median (int n, double x[]);
double* calculate_weights(int nrows, int ncolumns, double** data, int** mask,
double weights[], int transpose, char dist, double cutoff, double exponent);
# check if compiling into source directories
STRING(COMPARE EQUAL "${CMAKE_SOURCE_DIR}" "${CMAKE_BINARY_DIR}" insource)
if(insource)
MESSAGE(FATAL_ERROR "${PROJECT_NAME} requires an out of source build. Goto ./Build and tapes cmake ../")
endif(insource)
project(ADDONS_HMAT_SCALFMM CXX C)
set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} ${SCALFMM_CXX_FLAGS}")
# Active language
# -----------------------
ENABLE_LANGUAGE(CXX C)
MESSAGE(STATUS " CXX ${CMAKE_CXX_COMPILER_ID}" )
# Options
OPTION( SCALFMM_ADDON_HMAT "Set to ON to build ScalFMM HMat interface" OFF )
# if ask to build addon
if(SCALFMM_ADDON_HMAT)
list(APPEND FUSE_LIST "SCOTCH")
set(SCOTCH_NOT_FOUND $ENV{SCOTCH_LIB})
if(NOT SCOTCH_NOT_FOUND)
MESSAGE(STATUS " SCOTCH not found " )
else()
OPTION( SCALFMM_USE_SCOTCH "Set to ON to use scotch" ON )
if (SCALFMM_USE_SCOTCH)
include_directories($ENV{SCOTCH_INC})
SET(SCALFMM_LIBRARIES "${SCALFMM_LIBRARIES};-L$ENV{SCOTCH_LIB};-lscotch;-lscotcherr")
endif()
endif()
# first build lib scalfmmhmat
set(LIBRARY_OUTPUT_PATH ../lib/${CMAKE_BUILD_TYPE})
# Searching all cpp file
file( GLOB_RECURSE source_lib_files Src/*.cpp )
# Adding cpp files to project
add_library( scalfmmhmat ${source_lib_files} )
# Add blas library (even if it is set to off)
target_link_libraries( scalfmmhmat scalfmm)
# Adding the entire project dir as an include dir
INCLUDE_DIRECTORIES(
${SCALFMM_BINARY_DIR}/Src
${SCALFMM_SOURCE_DIR}/Src
${SCALFMM_INCLUDES}
)
# Install lib
install( TARGETS scalfmmhmat ARCHIVE DESTINATION lib )
# Install headers
SET(my_include_dirs "Src/Blocks" "Src/Clustering" "Src/Containers" "Src/Utils" "Src/Viewers" "CClusteringLibrary")
FOREACH(my_dir ${my_include_dirs})
file(GLOB
hpp_in_dir
${my_dir}/*.hpp ${my_dir}/*.h
)
INSTALL( FILES ${hpp_in_dir} DESTINATION include/ScalFmm/HMat/${my_dir} )
ENDFOREACH()
# Add C Clustering Library
file( GLOB_RECURSE ccl_lib_files CClusteringLibrary/*.c )
add_library( cclusteringlib ${ccl_lib_files} )
INCLUDE_DIRECTORIES(CClusteringLibrary/)
target_link_libraries( cclusteringlib scalfmm)
install( TARGETS cclusteringlib ARCHIVE DESTINATION lib )
# Tests
file( GLOB_RECURSE source_tests_files Tests/*.cpp )
INCLUDE_DIRECTORIES( ${SCALFMM_BINARY_DIR}/Src )
# Then build test files
SET(hmat_list_execs "")
foreach(exec ${source_tests_files})
get_filename_component(
execname ${exec}
NAME_WE
)
set(compile_exec "TRUE")
foreach(fuse_key ${FUSE_LIST})
file(STRINGS "${exec}" lines_fuse REGEX "@FUSE_${fuse_key}")
if(lines_fuse)
if( NOT SCALFMM_USE_${fuse_key} )
MESSAGE( STATUS "This needs ${fuse_key} = ${exec}" )
set(compile_exec "FALSE")
endif()
endif()
endforeach()
# Dependency are OK
if( compile_exec )
add_executable( ${execname} ${exec} )
# link to scalfmm and scalfmmhmat and cclusteringlib
target_link_libraries(
${execname}
${scalfmm_lib}
scalfmmhmat
cclusteringlib
${SCALFMM_LIBRARIES}
)
LIST(APPEND hmat_list_execs ${execname})
endif()
endforeach(exec)
add_custom_target(hmat DEPENDS ${hmat_list_execs})
endif()
// ===================================================================================
// Copyright ScalFmm 2011 INRIA, Olivier Coulaud, Berenger Bramas, Matthias Messner
// olivier.coulaud@inria.fr, berenger.bramas@inria.fr
// This software is a computer program whose purpose is to compute the FMM.
//
// This software is governed by the CeCILL-C and LGPL licenses and
// abiding by the rules of distribution of free software.
//
// This program is distributed in the hope that it will be useful,
// but WITHOUT ANY WARRANTY; without even the implied warranty of
// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
// GNU General Public and CeCILL-C Licenses for more details.
// "http://www.cecill.info".
// "http://www.gnu.org/licenses".
// ===================================================================================
//
// @SCALFMM_PRIVATE
//
#ifndef FACABLOCK_HPP
#define FACABLOCK_HPP
#include "Utils/FBlas.hpp"
#include "Utils/FAca.hpp"
/*! Choose either \a FULLY_PIVOTED_ACA or \a PARTIALLY_PIVOTED_ACA */
#define FULLY_PIVOTED_ACA
//#define PARTIALLY_PIVOTED_ACA
/*! Choose \a RECOMPRESSED_ACA */
#define RECOMPRESSED_ACA
template <class FReal, int ORDER = 14>
class FACABlock{
protected:
// members
FReal* block;
FReal* U;
FReal* VT;
int nbRows;
int nbCols;
int level;
unsigned int rank;
FReal accuracy;
FACABlock(const FACABlock&) = delete;
FACABlock& operator=(const FACABlock&) = delete;
public:
FACABlock()
: block(nullptr), U(nullptr), VT(nullptr), nbRows(0), nbCols(0), level(0), rank(0), accuracy(FMath::pow(FReal(10.0),static_cast<FReal>(-ORDER))) {
}
// ctor
template <class ViewerClass>
void fill(const ViewerClass& viewer, const int inLevel){
clear();
// Allocate memory
level = inLevel;
nbRows = viewer.getNbRows();
nbCols = viewer.getNbCols();
#if (defined FULLY_PIVOTED_ACA)
block = new FReal[nbRows*nbCols];
for(int idxRow = 0 ; idxRow < nbRows ; ++idxRow){
for(int idxCol = 0 ; idxCol < nbCols ; ++idxCol){
block[idxCol*nbRows+idxRow] = viewer.getValue(idxRow,idxCol);
}
}
#endif
// SVD specific (col major)
rank = std::min(nbRows,nbCols);
//S = new FReal[rank];
FReal* _U ;// = new FReal[nbRows*nbCols]; // Call to computeSVD() copies block into U
FReal* _V;// = new FReal[rank*nbCols];
#if (defined FULLY_PIVOTED_ACA)
// Perform fully pivoted ACA
FAca::fACA<FReal>(block, nbRows, nbCols, accuracy, _U, _V, rank);
#elif (defined PARTIAL_PIVOTED_ACA)
// TODO
// Perform partially pivoted ACA
//FAca::fACA<FReal>(viewer, nbRows, nbCols, accuracy, _U, _V, rank);
#endif
// display rank
std::cout << "rank after ACA=" << rank << " (" << nbRows << "," << nbCols << ")" << std::endl;
#if (defined RECOMPRESSED_ACA)
// Recompression by QR+SVD
FAca::recompress(_U,_V,nbRows,nbCols,accuracy,U,VT,rank);
// display rank after recompression
std::cout << " rank after QR+SVD=" << rank << std::endl;
#else
// Resize U and VT
U = new FReal[nbRows*rank];
for(int idxRow = 0 ; idxRow < nbRows ; ++idxRow)
for(int idxCol = 0 ; idxCol < rank ; ++idxCol)
U[idxCol*nbRows+idxRow] = _U[idxCol*nbRows+idxRow];
VT = new FReal[rank*nbCols];
for(int idxRow = 0 ; idxRow < rank ; ++idxRow)
for(int idxCol = 0 ; idxCol < nbCols ; ++idxCol)
VT[idxCol*rank+idxRow] = _V[idxRow*nbCols+idxCol]; // transposition
#endif
// Free memory
delete [] _U;
delete [] _V;
};
// dtor
~FACABlock(){
// Free memory
clear();
};
void clear(){
nbRows = 0;
nbCols = 0;
level = 0;
rank = 0;
delete[] block;
block = 0;
delete[] U;
U = 0;
delete[] VT;
VT = 0;
}
void gemv(FReal res[], const FReal vec[], const FReal scale = FReal(1.)) const {
//// Apply (dense) block
//FReal* res_dense = new FReal[nbRows];
//FBlas::copy(nbRows,res,res_dense);
//FBlas::gemva(nbRows, nbCols, scale, const_cast<FReal*>(block), const_cast<FReal*>(vec), res_dense);
// Apply low-rank block
FReal* VTvec = new FReal[rank];
FBlas::setzero(rank,VTvec);
// Apply VT
FBlas::gemv(rank, nbCols, scale, const_cast<FReal*>(VT), const_cast<FReal*>(vec), VTvec);
// Apply U
FBlas::gemva(nbRows, rank, scale, const_cast<FReal*>(U), const_cast<FReal*>(VTvec), res);
}
void gemm(FReal res[], const FReal mat[], const int nbRhs, const FReal scale = FReal(1.)) const {
//// Apply (dense) block
//FBlas::gemma(nbRows, nbCols, nbRhs, scale, const_cast<FReal*>(block), nbRows, const_cast<FReal*>(mat), nbCols, res, nbRows);
// Apply low-rank block
FReal* VTmat = new FReal[nbCols*nbRhs];
// Apply VT
FBlas::gemm(rank, nbCols, nbRhs, scale, const_cast<FReal*>(VT), rank, const_cast<FReal*>(mat), nbCols, VTmat, rank);
// Apply U
FBlas::gemma(nbRows, rank, nbRhs, scale, const_cast<FReal*>(U), nbRows, const_cast<FReal*>(VTmat), rank, res, nbRows);
}
int getRank() const{
return rank;
}
};
#endif // FACABLOCK_HPP
// [--END--]
// ===================================================================================
// Copyright ScalFmm 2011 INRIA, Olivier Coulaud, Berenger Bramas, Matthias Messner
// olivier.coulaud@inria.fr, berenger.bramas@inria.fr
// This software is a computer program whose purpose is to compute the FMM.
//
// This software is governed by the CeCILL-C and LGPL licenses and
// abiding by the rules of distribution of free software.
//
// This program is distributed in the hope that it will be useful,
// but WITHOUT ANY WARRANTY; without even the implied warranty of
// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
// GNU General Public and CeCILL-C Licenses for more details.
// "http://www.cecill.info".
// "http://www.gnu.org/licenses".
// ===================================================================================
//
// @SCALFMM_PRIVATE
//
#ifndef FDENSEBLOCK_HPP
#define FDENSEBLOCK_HPP
#include "Utils/FBlas.hpp"
template <class FReal>
class FDenseBlock{
protected:
// members
FReal* block;
int nbRows;
int nbCols;
int level;
FDenseBlock(const FDenseBlock&) = delete;
FDenseBlock& operator=(const FDenseBlock&) = delete;
public:
FDenseBlock()
: block(nullptr), nbRows(0), nbCols(0), level(0) {
}
// ctor
template <class ViewerClass>
void fill(const ViewerClass& viewer, const int inLevel){
clear();
// Allocate memory
level = inLevel;
nbRows = viewer.getNbRows();
nbCols = viewer.getNbCols();
block = new FReal[nbRows*nbCols];
for(int idxRow = 0 ; idxRow < nbRows ; ++idxRow){
for(int idxCol = 0 ; idxCol < nbCols ; ++idxCol){
block[idxCol*nbRows+idxRow] = viewer.getValue(idxRow,idxCol);
}
}
};
void resize(const int inNbRow, const int inNbCol){
if(inNbRow != nbRows ||
inNbCol != nbCols){
clear();
nbRows = inNbRow;
nbCols = inNbCol;
block = new FReal[nbRows*nbCols];
}
memset(block, 0, sizeof(FReal)*nbRows*nbCols);
}
// dtor
~FDenseBlock(){
// Free memory
clear();
};
void clear(){
nbRows = 0;
nbCols = 0;
nbCols = 0;
delete[] block;
block = 0;
}
int getNbRows() const{
return nbRows;
}
int getNbCols() const{
return nbCols;
}
FReal getValue(const int idxRow, const int idxCol) const{
return block[idxCol*nbRows+idxRow];
}
FReal& getValue(const int idxRow, const int idxCol) {
return block[idxCol*nbRows+idxRow];
}
void setValue(const int idxRow, const int idxCol, const FReal& val) {
block[idxCol*nbRows+idxRow] = val;
}
void gemv(FReal res[], const FReal vec[], const FReal scale = FReal(1.)) const {
FBlas::gemva(nbRows, nbCols, scale, const_cast<FReal*>(block), const_cast<FReal*>(vec), res);
}
void gemm(FReal res[], const FReal mat[], const int nbRhs, const FReal scale = FReal(1.)) const {
FBlas::gemma(nbRows, nbCols, nbRhs, scale, const_cast<FReal*>(block), nbRows, const_cast<FReal*>(mat), nbCols, res, nbRows);
}
};
#endif // FDENSEBLOCK_HPP
// [--END--]
// ===================================================================================
// Copyright ScalFmm 2011 INRIA, Olivier Coulaud, Berenger Bramas, Matthias Messner
// olivier.coulaud@inria.fr, berenger.bramas@inria.fr
// This software is a computer program whose purpose is to compute the FMM.
//
// This software is governed by the CeCILL-C and LGPL licenses and
// abiding by the rules of distribution of free software.
//
// This program is distributed in the hope that it will be useful,
// but WITHOUT ANY WARRANTY; without even the implied warranty of
// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
// GNU General Public and CeCILL-C Licenses for more details.
// "http://www.cecill.info".
// "http://www.gnu.org/licenses".
// ===================================================================================
//
// @SCALFMM_PRIVATE
//
#ifndef FSVDBLOCK_HPP
#define FSVDBLOCK_HPP
#include "Utils/FBlas.hpp"
/*
* Compute SVD $A=USV'$ and return $V'$, $S$ and $U$.
* \param A contains input M x N matrix to be decomposed
* \param S contains singular values $S$
* \param U contains $U$
* \param VT contains $V'$
*/
template<class FReal>
static void computeSVD(const FSize nbRows, const FSize nbCols, const FReal* A, FReal* S, FReal* U, FReal* VT){
// verbose
const bool verbose = false;
// copy A
//is_int(size*size);
FBlas::copy(int(nbRows*nbCols),A,U);
// init SVD
const FSize minMN = std::min(nbRows,nbCols);
const FSize maxMN = std::max(nbRows,nbCols);
//const FSize LWORK = 2*4*minMN; // for square matrices
const FSize LWORK = 2*std::max(3*minMN+maxMN, 5*minMN);
FReal *const WORK = new FReal [LWORK];
// singular value decomposition
if(verbose) std::cout << "\nPerform SVD...";
// SO means that first min(m,n) lines of U overwritten on VT and V' on U (A=VTSU)
// AA means all lines
// nothing means OS (the opposite of SO, A=USVT)
//is_int(size); is_int(LWORK);
const unsigned int INFOSVD
= FBlas::gesvd(int(nbRows), int(nbCols), U, S, VT, int(minMN)/*ldVT*/,
int(LWORK), WORK);
if(verbose) {
if(INFOSVD!=0) {std::cout << " failed!" << std::endl;}
else {std::cout << " succeed!" << std::endl;}
}
// free memory
delete[] WORK;
}
/*
* Compute SVD $A=USV'$ and return $V'$, $S$ and $U$.
* \param
*/
template<class FReal>
static void computeNumericalRank(int &rank, const FReal* S, const FReal epsilon){
// verbose
const bool verbose = false;
// init
const FSize maxRank = rank;
FReal sumSigma2 = FReal(0.0);
for(int idxRow = 0 ; idxRow < rank ; ++idxRow)
sumSigma2+=S[idxRow]*S[idxRow];
FReal SqrtSumSigma2 = std::sqrt(sumSigma2);
// set rank to 1
rank = 1;
// increase
FReal sumSigma2r = S[0]*S[0];
while(std::sqrt(sumSigma2r)<(FReal(1.)-epsilon)*SqrtSumSigma2 && rank<maxRank){
sumSigma2r+=S[rank]*S[rank];
rank++;
}
//std::cout << "std::sqrt(sumSigma2r)=" << std::sqrt(sumSigma2r) << std::endl;
//std::cout << "std::sqrt(sumSigma2)=" << std::sqrt(sumSigma2) << std::endl;
//std::cout << "R/S=" << (std::sqrt(sumSigma2)-std::sqrt(sumSigma2r))/std::sqrt(sumSigma2) << std::endl;
}
template <class FReal, int ORDER = 14>
class FSVDBlock{
protected:
// members
FReal* block;
FReal* U;
FReal* S;
FReal* VT;