public abstract class CompressedSizeEstimator extends Object
Modifier and Type | Method and Description |
---|---|
List<CompressedSizeInfoColGroup> | computeCompressedSizeInfos(Collection<int[]> columnLists): Compression size info for a list of specified columns. |
List<CompressedSizeInfoColGroup> | computeCompressedSizeInfos(Collection<int[]> columnLists, int k): Multi-threaded version of extracting compression size info for a list of specified columns. |
CompressedSizeInfo | computeCompressedSizeInfos(int k): Multi-threaded version of extracting compression size info. |
CompressedSizeInfoColGroup | estimateCompressedColGroupSize(): Method used for compressing into one type of colGroup. |
EstimationFactors | estimateCompressedColGroupSize(ABitmap ubm, int[] colIndexes): Method used to extract the CompressedSizeEstimationFactors from a constructed UncompressedBitmap. |
static EstimationFactors | estimateCompressedColGroupSize(ABitmap ubm, int[] colIndexes, int nrRows, CompressionSettings cs) |
CompressedSizeInfoColGroup | estimateCompressedColGroupSize(int[] colIndexes): Method for extracting the compressed size info of the specified columns, together in a single ColGroup. |
abstract CompressedSizeInfoColGroup | estimateCompressedColGroupSize(int[] colIndexes, int nrUniqueUpperBound): Extracts the compressed size info for a given list of columns. This method further limits the estimated number of unique values, since in some cases the number of uniques is estimated higher than the number estimated in sub groups of the given colIndexes. |
CompressedSizeInfoColGroup | estimateJoinCompressedSize(CompressedSizeInfoColGroup g1, CompressedSizeInfoColGroup g2): Join two analyzed column groups together. |
int | getNumColumns() |
int | getNumRows() |
String | toString() |
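The single-threaded and multi-threaded overloads of computeCompressedSizeInfos take the same column lists and differ in the extra parameter k. A minimal sketch of how such a pair of overloads could be scheduled, assuming a fixed thread pool of size k and results returned in input order. ParallelEstimateSketch and the estimate function passed to it are hypothetical stand-ins for illustration and are not part of SystemDS:

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Function;

// Hypothetical stand-in; not the SystemDS implementation.
final class ParallelEstimateSketch {

    // Single-threaded: estimate each column list in iteration order.
    static <R> List<R> compute(Collection<int[]> columnLists, Function<int[], R> estimate) {
        List<R> out = new ArrayList<>(columnLists.size());
        for (int[] cols : columnLists)
            out.add(estimate.apply(cols));
        return out;
    }

    // Multi-threaded: same result order, work spread over k threads.
    static <R> List<R> compute(Collection<int[]> columnLists, Function<int[], R> estimate, int k) {
        if (k <= 1)
            return compute(columnLists, estimate);
        ExecutorService pool = Executors.newFixedThreadPool(k);
        try {
            // Submit one task per column list and remember the futures in order.
            List<Future<R>> futures = new ArrayList<>(columnLists.size());
            for (int[] cols : columnLists) {
                Callable<R> task = () -> estimate.apply(cols);
                futures.add(pool.submit(task));
            }
            // Collect results in submission order.
            List<R> out = new ArrayList<>(futures.size());
            for (Future<R> f : futures)
                out.add(f.get());
            return out;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new RuntimeException(e);
        } catch (ExecutionException e) {
            throw new RuntimeException(e.getCause());
        } finally {
            pool.shutdown();
        }
    }
}
```

Collecting the futures in submission order keeps the output list aligned with columnLists, so callers can treat both overloads interchangeably.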
public int getNumRows()
public int getNumColumns()
public CompressedSizeInfo computeCompressedSizeInfos(int k)
k - The concurrency degree.
public List<CompressedSizeInfoColGroup> computeCompressedSizeInfos(Collection<int[]> columnLists, int k)
columnLists - The specified columns to extract.
k - The parallelization degree.
public List<CompressedSizeInfoColGroup> computeCompressedSizeInfos(Collection<int[]> columnLists)
columnLists - The specified columns to extract.
public CompressedSizeInfoColGroup estimateCompressedColGroupSize()
public CompressedSizeInfoColGroup estimateCompressedColGroupSize(int[] colIndexes)
colIndexes - The columns to group together inside a ColGroup.
public abstract CompressedSizeInfoColGroup estimateCompressedColGroupSize(int[] colIndexes, int nrUniqueUpperBound)
colIndexes - The columns to extract compression information from.
nrUniqueUpperBound - The upper bound on the number of unique elements allowed in the estimate. It can be calculated by multiplying together the numbers of unique elements estimated in the sub columns. This is flexible in the sense that if the sample is small, this bound can be adjusted manually, as in CoCodeCostMatrixMult.
public CompressedSizeInfoColGroup estimateJoinCompressedSize(CompressedSizeInfoColGroup g1, CompressedSizeInfoColGroup g2)
g1 - First group.
g2 - Second group.
public EstimationFactors estimateCompressedColGroupSize(ABitmap ubm, int[] colIndexes)
ubm - The UncompressedBitmap, either extracted from a sample or from the entire dataset.
colIndexes - The columns that are compressed together.
public static EstimationFactors estimateCompressedColGroupSize(ABitmap ubm, int[] colIndexes, int nrRows, CompressionSettings cs)
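The nrUniqueUpperBound parameter is described as computable from the unique-value estimates of the sub columns multiplied together. A minimal sketch of that arithmetic, assuming the product is additionally capped at the number of rows, since a group cannot contain more distinct tuples than rows. UniqueBoundSketch and upperBound are hypothetical names for illustration and are not part of SystemDS:

```java
// Hypothetical helper; not part of the SystemDS API.
final class UniqueBoundSketch {

    // Product of the per-sub-group unique estimates, capped at nrRows.
    static int upperBound(int[] subGroupUniques, int nrRows) {
        long bound = 1;
        for (int u : subGroupUniques) {
            bound *= Math.max(u, 1); // treat an empty estimate as one value
            if (bound >= nrRows)
                return nrRows; // cap early, which also avoids overflow
        }
        return (int) bound;
    }
}
```

For two sub groups with 3 and 4 estimated uniques over 100 rows, this gives a bound of 12; with 50 and 40 uniques over 1000 rows, the product exceeds the row count and the bound is capped at 1000.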
Copyright © 2021 The Apache Software Foundation. All rights reserved.