Our laboratory provides [Tools] for high-dimension, low-sample-size (HDLSS) data. Please read the [License] and use the tools only if you agree to its terms. For details on each analytical method, please refer to the relevant manuals and papers.
- Package Installation
- Tools
- License
Use the following command in the terminal to install packages locally.
```shell
git clone https://github.com/Aoshima-Lab/HDLSS-Tools.git
```

The "Noise-Reduction Methodology (NRM)" gives estimators of the eigenvalues, eigenvectors, and principal component scores.
Reference : K. Yata, M. Aoshima, Effective PCA for High-Dimension, Low-Sample-Size Data with Noise Reduction via Geometric Representations, Journal of Multivariate Analysis, 105 (2012) 193-215.
DOI: [10.1016/j.jmva.2011.09.002]
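As a rough illustration (not the lab's released tool), the NRM eigenvalue estimator subtracts an estimate of the noise contribution from each eigenvalue of the dual sample covariance matrix: λ̃_j = λ̂_j − (tr(S_D) − Σ_{i≤j} λ̂_i)/(n−1−j) for j = 1, …, n−2, following Yata and Aoshima (2012). A minimal Python sketch, assuming an (n, p) data matrix `X`:

```python
import numpy as np

def nrm_eigenvalues(X):
    """Noise-reduction estimates of the leading eigenvalues (sketch).

    For the n x n dual sample covariance matrix S_D, the j-th estimate is
        lam_j - (tr(S_D) - sum_{i<=j} lam_i) / (n - 1 - j),
    in the spirit of Yata and Aoshima (2012).
    """
    n = X.shape[0]
    Xc = X - X.mean(axis=0)                       # center each variable
    S_D = Xc @ Xc.T / (n - 1)                     # (n, n) dual covariance
    lam = np.sort(np.linalg.eigvalsh(S_D))[::-1]  # eigenvalues, descending
    tr = lam.sum()
    j = np.arange(1, n - 1)                       # j = 1, ..., n-2
    return lam[:n - 2] - (tr - np.cumsum(lam)[:n - 2]) / (n - 1 - j)
```

Because the subtracted term is nonnegative for a positive semi-definite S_D, each noise-reduced estimate is no larger than the raw eigenvalue. Consult the package itself for the estimators of eigenvectors and PC scores.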
The "Cross-Data-Matrix (CDM) Methodology" gives estimators of the eigenvalues, eigenvectors, and principal component scores.
Reference : K. Yata, M. Aoshima, Effective PCA for High-Dimension, Low-Sample-Size Data with Singular Value Decomposition of Cross Data Matrix, Journal of Multivariate Analysis, 101 (2010) 2060-2077.
DOI: [10.1016/j.jmva.2010.04.006]
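The core idea can be sketched as follows: split the sample into two halves, form the cross data matrix between the centered halves, and use its singular values as eigenvalue estimates, in the spirit of Yata and Aoshima (2010). This is an illustrative Python sketch (the even split and scaling are simplifying assumptions), not the lab's implementation:

```python
import numpy as np

def cdm_eigenvalues(X):
    """Eigenvalue estimates via a cross-data-matrix construction (sketch).

    Splits the n observations of the (n, p) matrix `X` into two halves,
    centers each half, and returns the singular values of the resulting
    (n1, n2) cross data matrix as estimates of the leading eigenvalues.
    """
    n = X.shape[0]
    n1 = n // 2
    X1, X2 = X[:n1], X[n1:]
    X1c = X1 - X1.mean(axis=0)
    X2c = X2 - X2.mean(axis=0)
    cross = X1c @ X2c.T / np.sqrt((n1 - 1) * (X2.shape[0] - 1))
    return np.linalg.svd(cross, compute_uv=False)  # descending singular values
```

Because the two halves are independent, the cross product is free of the per-observation noise that inflates ordinary sample eigenvalues in HDLSS settings.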
The "Automatic Sparse PCA (A-SPCA)" gives estimators of the eigenvalues and eigenvectors.
Reference : K. Yata, M. Aoshima, Automatic Sparse PCA for High-Dimensional Data, Statistica Sinica, 35 (2025) 1069-1090. DOI: [10.5705/ss.202022.0319] [Supplement]
The "Extended Cross-Data-Matrix (ECDM) Methodology" gives estimators for inference on high-dimensional covariance structures.
Reference : K. Yata, M. Aoshima, High-Dimensional Inference on Covariance Structures via the Extended Cross-Data-Matrix Methodology, Journal of Multivariate Analysis, 151 (2016) 151-166.
DOI: [10.1016/j.jmva.2016.07.011]
The "Automatic Sparse Estimation" provides sparse estimators of cross-covariance matrices and mean vectors. This method automatically determines the sparsification threshold.
Reference : T. Umino, K. Yata and M. Aoshima, Automatic sparse estimation of the high-dimensional cross-covariance matrix, Journal of Multivariate Analysis, (2025) (in press).
DOI: [10.1016/j.jmva.2025.105590]
The "PC-scores-based Outlier Detection (PC-OD)" identifies outliers based on the PC scores. The algorithm is provided in Section 3.2 of Nakayama et al. (2024).
Reference : Y. Nakayama, K. Yata and M. Aoshima, Test for High-Dimensional Outliers with Principal Component Analysis, Japanese Journal of Statistics and Data Science, 7 (2024) 739–766.
DOI : [10.1007/s42081-024-00255-0]
The "Bias-Corrected Support Vector Machine (BC-SVM)" provides bias-corrected classification for high-dimension, low-sample-size data. The algorithm is described in the following references:
Reference : Y. Nakayama, K. Yata, and M. Aoshima, Support vector machine and its bias correction in high-dimension, low-sample-size settings, Journal of Statistical Planning and Inference, 191 (2017) 88–100.
DOI: [10.1016/j.jspi.2017.05.005]
Reference : Y. Nakayama, K. Yata, and M. Aoshima, Bias-corrected support vector machine with Gaussian kernel in high-dimension, low-sample-size settings, Annals of the Institute of Statistical Mathematics, 72 (2020) 1257–1286.
DOI: [10.1007/s10463-019-00727-1]
The "Distance-Based Discriminant Analysis (DBDA)" provides high-dimensional discriminant analysis for multiclass data. The algorithm is provided in Aoshima and Yata (2014).
Reference : M. Aoshima and K. Yata, A distance-based, misclassification rate adjusted classifier for multiclass, high-dimensional data, Annals of the Institute of Statistical Mathematics (2014).
DOI : [10.1007/s10463-013-0435-8]
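The distance-based rule can be sketched as follows: assign an observation to the class whose mean it is closest to, after subtracting a trace term that removes the bias the squared distance carries in high dimension (cf. Aoshima and Yata, 2014). This is an illustrative Python sketch, not the released tool; the exact adjustment in the paper may differ in detail:

```python
import numpy as np

def dbda_classify(x, samples):
    """Bias-adjusted distance-based classification (sketch).

    `samples` is a list of (n_g, p) arrays, one per class.  Assigns `x`
    to the class g minimizing ||x - xbar_g||^2 - tr(S_g)/n_g, where the
    trace term offsets the high-dimensional bias of the squared distance.
    """
    scores = []
    for Xg in samples:
        n_g = Xg.shape[0]
        xbar = Xg.mean(axis=0)
        # tr(S_g) = sum of per-variable sample variances
        tr_S = ((Xg - xbar) ** 2).sum() / (n_g - 1)
        scores.append(np.sum((x - xbar) ** 2) - tr_S / n_g)
    return int(np.argmin(scores))
```

Without the trace term, classes with larger within-class variance or smaller sample size would be systematically penalized when p is large.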
The "Geometrical Quadratic Discriminant Analysis (GQDA)" provides high-dimensional discriminant analysis for multiclass data. The algorithm is provided in Aoshima and Yata (2015).
Reference : M. Aoshima and K. Yata, Geometric Classifier for Multiclass, High-Dimensional Data, Sequential Analysis, 34 (2015) 279-294.
DOI : [10.1080/07474946.2015.1063256]
The "Data Transformation" method provides tools for transforming high-dimensional data and estimating the spiked eigenvalues in HDLSS settings. The algorithm is provided in Aoshima and Yata (2018).
Reference : M. Aoshima, K. Yata, Two-Sample Tests for High-Dimension, Strongly Spiked Eigenvalue Models, Statistica Sinica, 28 (2018) 43-62.
DOI: [10.5705/ss.202016.0063]
The "Covariance Structure Test" module provides hypothesis tests for high-dimensional covariance structures based on the Extended Cross-Data-Matrix (ECDM) methodology.
Reference: A. Ishii, K. Yata and M. Aoshima, Hypothesis tests for high-dimensional covariance structures, Annals of the Institute of Statistical Mathematics, 73 (2021), 599-622.
DOI: [10.1007/s10463-020-00760-5]
Copyright (C) 2026 Makoto Aoshima
This work is licensed under the Creative Commons Attribution-NoDerivatives 4.0 International license.
To view a copy of this license, visit https://creativecommons.org/licenses/by-nd/4.0/ or
send a letter to Creative Commons, PO Box 1866, Mountain View, CA 94042, USA.
Makoto Aoshima, University of Tsukuba
aoshima@math.tsukuba.ac.jp
https://www.math.tsukuba.ac.jp/~aoshima-lab/index.html