Gaia
The Gaussianize analyzer fits each descriptor to a Gaussian distribution.
#include <gaussianize.h>
Public Member Functions | |
Gaussianize (const ParameterMap &params) | |
Transformation | analyze (const DataSet *data, const Region &region) const |
Public Member Functions inherited from gaia2::Analyzer | |
Analyzer (const ParameterMap &params) | |
virtual Transformation | analyze (const DataSet *dataset) const |
void | checkDataSet (const DataSet *dataset) const |
Checks that the given dataset is valid. | |
void | checkMinPoints (const DataSet *dataset, int n) const |
Checks that the given dataset has at least the specified number of points. | |
const Region & | checkFixedLength (const Region &region, const PointLayout &layout) const |
Checks that the given Region only contains fixed-length descriptors and throws an exception if not. | |
Protected Attributes | |
int | _maxDistSize |
Protected Attributes inherited from gaia2::Analyzer | |
ParameterMap | _params |
QStringList | _descriptorNames |
QStringList | _exclude |
Additional Inherited Members | |
Public Attributes inherited from gaia2::Analyzer | |
QString | name |
Name for the algorithm, usually the key that was used to instantiate it from the factory. | |
QStringList | validParams |
List of valid parameters this analyzer accepts. | |
The Gaussianize analyzer fits each descriptor to a Gaussian distribution.
It does so by sorting all the values of a descriptor and assigning each one the value it would have at the same rank in a perfect Gaussian distribution. This completely discards the original values (only their rank is kept), so the result may be less accurate or precise than a BoxCox transformation, but it is much faster to compute.
NB: This algorithm does not yet work with multi-segment points.
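The rank-based mapping described above can be illustrated with a small standalone sketch. It does not use the Gaia API and is not the library's implementation: the function names are hypothetical, the target is assumed to be a standard normal distribution (mean 0, standard deviation 1), the rank-to-probability formula `(rank + 0.5) / n` is one common convention chosen here for illustration, and ties are not treated specially.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <numeric>
#include <vector>

// Standard normal CDF, expressed through the complementary error function.
double normalCdf(double x) {
    return 0.5 * std::erfc(-x / std::sqrt(2.0));
}

// Inverse of the standard normal CDF, found by bisection on [-10, 10].
// (Hypothetical helper; slow but simple, and accurate enough for a sketch.)
double inverseNormalCdf(double p) {
    assert(p > 0.0 && p < 1.0);
    double lo = -10.0, hi = 10.0;
    for (int i = 0; i < 100; ++i) {
        const double mid = 0.5 * (lo + hi);
        if (normalCdf(mid) < p) lo = mid; else hi = mid;
    }
    return 0.5 * (lo + hi);
}

// Replace each value by the standard normal quantile of its rank.
// Only the ordering of the input survives; the magnitudes are discarded.
std::vector<double> gaussianizeByRank(const std::vector<double>& values) {
    const std::size_t n = values.size();

    // order[r] is the index of the value that has rank r after sorting.
    std::vector<std::size_t> order(n);
    std::iota(order.begin(), order.end(), 0);
    std::sort(order.begin(), order.end(),
              [&values](std::size_t a, std::size_t b) { return values[a] < values[b]; });

    std::vector<double> result(n);
    for (std::size_t rank = 0; rank < n; ++rank) {
        // Map the rank to a probability strictly inside (0, 1).
        const double p = (rank + 0.5) / n;
        result[order[rank]] = inverseNormalCdf(p);
    }
    return result;
}
```

With the input `{100.0, 1.0, 10.0}`, the output keeps the same ordering as the input, but the middle value maps to approximately 0 and the two extremes map to symmetric quantiles, regardless of how far apart the original values were.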
distributionSize | When analyzing the distribution of a large number of values, it is not necessary to keep all points as reference; a subset of them is enough. This value represents the maximum number of reference values used to model the distribution. In most cases, 10,000 points should give a sufficiently precise distribution while remaining efficient to compute. (default: 10000) |
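One way to picture the effect of distributionSize is the following standalone sketch, which keeps at most a fixed number of evenly spaced values from the sorted data. The function name and the even-spacing strategy are assumptions made for illustration; the documentation above does not state how Gaia selects its reference values.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: sort the values and keep at most maxSize of them,
// evenly spaced by rank, as a compact model of the distribution.
std::vector<double> subsampleReference(std::vector<double> values, std::size_t maxSize) {
    assert(maxSize >= 2);
    std::sort(values.begin(), values.end());

    // Small inputs are kept whole.
    if (values.size() <= maxSize) return values;

    std::vector<double> reference(maxSize);
    const std::size_t last = values.size() - 1;
    for (std::size_t i = 0; i < maxSize; ++i) {
        // Spread indices evenly from 0 to last, so both extremes are kept.
        reference[i] = values[i * last / (maxSize - 1)];
    }
    return reference;
}
```

In this sketch, a dataset with fewer values than the limit is returned sorted and otherwise unchanged, while a larger one is reduced to exactly `maxSize` reference values that still include the minimum and maximum.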