PCAfold issues
https://gitlab.multiscale.utah.edu/common/PCAfold/-/issues
2017-11-10T22:10:14Z

https://gitlab.multiscale.utah.edu/common/PCAfold/-/issues/1
Finish moving over Matlab functions to python
2017-11-10T22:10:14Z
Elizabeth Armstrong
[AND] Determine if functions are necessary before moving over:
- [x] plot_convergence: explained variance over npc's
- [x] principal_variables
- [x] r2converge
- [x] set_retained_eigenvalues
- [x] test: can use pickle to save the PCA data (like pca_blessed.mat)
- [x] eq
- [x] ne
- [x] write2file: renamed to write_file_for_cpp in python for clarity
- [x] u_scores: determine if necessary and clarify in documentation
- [x] w_scores: determine if necessary and clarify in documentation

https://gitlab.multiscale.utah.edu/common/PCAfold/-/issues/2
Add documentation for all the PCA python functions
2017-10-21T17:50:02Z
Elizabeth Armstrong

https://gitlab.multiscale.utah.edu/common/PCAfold/-/issues/3
Preprocessing data for PCA
2020-09-12T16:53:48Z
Elizabeth Armstrong
More complex data preprocessing can be added as needed, such as that used in zdc.

https://gitlab.multiscale.utah.edu/common/PCAfold/-/issues/4
Testing/Examples of how to use PCA
2017-10-21T17:50:22Z
Elizabeth Armstrong

https://gitlab.multiscale.utah.edu/common/PCAfold/-/issues/5
Testing python PCA
2020-09-12T16:53:19Z
Elizabeth Armstrong
* [ ] Add a test that x2eta -> eta2x gives back the correct x
* [ ] GitLab CI for Python tests, to run the tests automatically with each commit

https://gitlab.multiscale.utah.edu/common/PCAfold/-/issues/6
Optimize PCA functions
2020-09-12T16:54:20Z
Elizabeth Armstrong
Make PCA process data faster where possible.

https://gitlab.multiscale.utah.edu/common/PCAfold/-/issues/7
Set up continuous build/test
2021-05-11T15:55:00Z
James Sutherland
As part of this, we should also automate the readthedocs page. However, the present way of doing this involves a secret key that would end up being stored in the repository, which may not be ideal.

https://gitlab.multiscale.utah.edu/common/PCAfold/-/issues/8
Add new manifold characterization diagnostics
2021-01-27T23:38:05Z
James Sutherland
* [x] Add the $`\hat{D}`$ metric and any associated functionality.
* [x] Add/update example(s) that show how to use this, including the fractional sampling to detect overlap.
* [x] Update the [readthedocs](https://pcafold.readthedocs.io/) documentation.
Elizabeth Armstrong
2020-10-24

https://gitlab.multiscale.utah.edu/common/PCAfold/-/issues/9
Fix the documentation build at Read the Docs
2020-10-29T09:56:45Z
Kamila Zdybal kamilazdybal@gmail.com
Fix the documentation build at Read the Docs following the [update 20.2.4](https://pip.pypa.io/en/stable/news/#id1) in how `pip` resolves dependencies.
Kamila Zdybal kamilazdybal@gmail.com

https://gitlab.multiscale.utah.edu/common/PCAfold/-/issues/10
Add `idx` fixes and checks in `preprocess.DataSampler`
2021-06-04T19:21:03Z
Kamila Zdybal kamilazdybal@gmail.com
Needs fixing:
- When `verbose=True`, only flattened `idx` array is accepted. Should also allow `(n_observations,1)` arrays.
Needs adding:
- Add check for integer values in `idx`.
- Prevent passing a multidimensional `idx`.
Kamila Zdybal kamilazdybal@gmail.com

https://gitlab.multiscale.utah.edu/common/PCAfold/-/issues/11
Add comparison of two objects of `RegressionAssessment` class
2021-08-19T20:29:40Z
Kamila Zdybal kamilazdybal@gmail.com
Add comparison of two objects of `RegressionAssessment` class:
- an in-class function that will take another object of `RegressionAssessment` class and will bold or color entries that have smaller or larger values than the corresponding values in the current object.
This can be done using `termcolor` for raw text, and using a proper `.style.apply` for `pandas.DataFrame`. See [this notebook](https://gitlab.multiscale.utah.edu/kamila/phd-python/-/blob/master/docs/jupyter-tutorials/highlight-maximum-value-in-a-DataFrame.ipynb) for an example.
Kamila Zdybal kamilazdybal@gmail.com
2021-08-31

https://gitlab.multiscale.utah.edu/common/PCAfold/-/issues/12
Float formatting doesn't work in `pandas.DataFrame` when comparing two objects of the `RegressionAssessment` class
2021-08-26T16:40:20Z
Kamila Zdybal kamilazdybal@gmail.com
Fix the formatting.
Kamila Zdybal kamilazdybal@gmail.com
2021-09-11

https://gitlab.multiscale.utah.edu/common/PCAfold/-/issues/13
Add support for global norm in `analysis.stratified_normalized_root_mean_squared_error`
2021-09-14T18:00:35Z
Kamila Zdybal kamilazdybal@gmail.com
The functionality should be similar to what we have in `analysis.stratified_coefficient_of_determination`.
Kamila Zdybal kamilazdybal@gmail.com
2021-09-30

https://gitlab.multiscale.utah.edu/common/PCAfold/-/issues/14
Add smarter formatting of `pandas.DataFrame`
2021-08-26T16:39:54Z
Kamila Zdybal kamilazdybal@gmail.com
For instance, cluster populations should be displayed as `int`.
Kamila Zdybal kamilazdybal@gmail.com
2021-09-11

https://gitlab.multiscale.utah.edu/common/PCAfold/-/issues/15
Value comparison should be independent of float formatting
2023-11-03T09:11:31Z
Kamila Zdybal kamilazdybal@gmail.com
Otherwise, comparisons are made after the values are truncated to the user-specified format, which sometimes leads to unexpected results.
Affects functions:
- `RegressionAssessment.print_metrics`
- `RegressionAssessment.print_stratified_metrics`
Kamila Zdybal kamilazdybal@gmail.com
2024-01-31

https://gitlab.multiscale.utah.edu/common/PCAfold/-/issues/16
Fix the "name 'sigma_peak' is not defined" NameError
2021-09-02T13:50:03Z
Kamila Zdybal kamilazdybal@gmail.com
Code that reproduces the issue:
```python
import numpy as np
from PCAfold import analysis

# Random data set and source terms: 100 observations of 5 variables
X = np.random.rand(100, 5)
X_source = np.random.rand(100, 5)
variable_names = ['X1', 'X2', 'X3', 'X4', 'X5']
scaling = 'auto'
bandwidth_values = np.logspace(-4, 2, 50)

# This call is reported to raise: NameError: name 'sigma_peak' is not defined
(selected_variables, costs) = analysis.manifold_informed_feature_selection(X, X_source, variable_names, scaling, bandwidth_values, target_manifold_dimensionality=2)
```
Kamila Zdybal kamilazdybal@gmail.com

https://gitlab.multiscale.utah.edu/common/PCAfold/-/issues/17
Create a separate class for sample PCA
2022-05-17T16:06:48Z
Kamila Zdybal kamilazdybal@gmail.com
The class will be called `SamplePCA`. It should absorb the following standalone functions:
- `pca_on_sampled_data_set`
- `analyze_centers_change`
- `analyze_eigenvector_weights_change`
- `analyze_eigenvalue_distribution`
- `equilibrate_cluster_populations`
The last thing left to do:
- [x] Update the tutorials with the new classes.
Kamila Zdybal kamilazdybal@gmail.com
2022-05-31

https://gitlab.multiscale.utah.edu/common/PCAfold/-/issues/18
Stratified regression metrics table should also show min and max observation in each bin
2021-09-29T09:26:26Z
Kamila Zdybal kamilazdybal@gmail.com
Stratified regression metrics table should also show min and max observation in each bin.
Kamila Zdybal kamilazdybal@gmail.com
2021-09-30

https://gitlab.multiscale.utah.edu/common/PCAfold/-/issues/19
Allow the user to pre-select the printed regression metrics
2021-09-28T20:49:52Z
Kamila Zdybal kamilazdybal@gmail.com
Allow the user to only print the relevant metrics by setting a new `metrics` input list. This affects functions `RegressionAssessment.print_metrics` and `RegressionAssessment.print_stratified_metrics`.
- [x] `print_metrics` update done on commit [`9aa14ea6c2b2fb02aab069bdb9e11d1ec1f7175c`](https://gitlab.multiscale.utah.edu/common/PCAfold/-/commit/9aa14ea6c2b2fb02aab069bdb9e11d1ec1f7175c).
- [x] `print_stratified_metrics` update done on commit [`cdf6dbb743d89ded61e86b1baebfa7a9b71313e9`](https://gitlab.multiscale.utah.edu/common/PCAfold/-/commit/cdf6dbb743d89ded61e86b1baebfa7a9b71313e9).
Kamila Zdybal kamilazdybal@gmail.com
2021-09-30

https://gitlab.multiscale.utah.edu/common/PCAfold/-/issues/20
Add 2D scalar regression plotting function
2022-07-11T12:13:15Z
Kamila Zdybal kamilazdybal@gmail.com
Add 2D scalar regression plotting function in the `analysis` module.
Kamila Zdybal kamilazdybal@gmail.com
2022-06-30
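
As a rough illustration of what issue 20 asks for: a 2D scalar regression plot needs the regression model evaluated on a regular grid over two input variables. The sketch below shows only that grid evaluation, in plain Python. The names `linspace`, `evaluate_on_grid`, `toy_model`, and all parameters are hypothetical and chosen for illustration; they are not the PCAfold API, and the toy model merely stands in for a trained regression model.

```python
def linspace(start, stop, n):
    """Return n evenly spaced values from start to stop (inclusive)."""
    if n < 2:
        raise ValueError("n must be at least 2")
    step = (stop - start) / (n - 1)
    return [start + i * step for i in range(n)]


def evaluate_on_grid(model, x_bounds, y_bounds, resolution=(10, 10)):
    """Evaluate a scalar model(x, y) on a regular 2D grid.

    Hypothetical helper, not part of PCAfold. Returns (xs, ys, field),
    where field[j][i] is the model value at (xs[i], ys[j]).
    """
    n_x, n_y = resolution
    xs = linspace(x_bounds[0], x_bounds[1], n_x)
    ys = linspace(y_bounds[0], y_bounds[1], n_y)
    # One row per y value, one column per x value
    field = [[model(x, y) for x in xs] for y in ys]
    return xs, ys, field


# Toy stand-in for a trained regression model (an assumption for this sketch)
def toy_model(x, y):
    return x + 2.0 * y


xs, ys, field = evaluate_on_grid(toy_model, (0.0, 1.0), (0.0, 2.0), resolution=(3, 3))
```

The row-per-`y` layout of `field` matches what matplotlib's `contourf(xs, ys, field)` expects, so a plotting function could draw the regressed scalar this way and scatter the training observations on top.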