mirror of
https://github.com/MariaDB/server.git
synced 2025-07-29 05:21:33 +03:00
Introduce analyze_sample_percentage variable
The variable controls the amount of sampling analyze table performs. If ANALYZE table with histogram collection is too slow, one can reduce the time taken by setting analyze_sample_percentage to a lower value of the total number of rows. Setting it to 0 will use a formula to compute how many rows to sample: The number of rows collected is capped to a minimum of 50000 and increases logarithmically with a coffecient of 4096. The coffecient is chosen so that we expect an error of less than 3% in our estimations according to the paper: "Random Sampling for Histogram Construction: How much is enough?” – Surajit Chaudhuri, Rajeev Motwani, Vivek Narasayya, ACM SIGMOD, 1998. The drawback of sampling is that avg_frequency number is computed imprecisely and will yeild a smaller number than the real one.
This commit is contained in:
@ -40,6 +40,20 @@ NUMERIC_BLOCK_SIZE NULL
|
||||
ENUM_VALUE_LIST DEFAULT,COPY,INPLACE,NOCOPY,INSTANT
|
||||
READ_ONLY NO
|
||||
COMMAND_LINE_ARGUMENT OPTIONAL
|
||||
VARIABLE_NAME ANALYZE_SAMPLE_PERCENTAGE
|
||||
SESSION_VALUE 100.000000
|
||||
GLOBAL_VALUE 100.000000
|
||||
GLOBAL_VALUE_ORIGIN COMPILE-TIME
|
||||
DEFAULT_VALUE 100.000000
|
||||
VARIABLE_SCOPE SESSION
|
||||
VARIABLE_TYPE DOUBLE
|
||||
VARIABLE_COMMENT Percentage of rows from the table ANALYZE TABLE will sample to collect table statistics. Set to 0 to let MariaDB decide what percentage of rows to sample.
|
||||
NUMERIC_MIN_VALUE 0
|
||||
NUMERIC_MAX_VALUE 100
|
||||
NUMERIC_BLOCK_SIZE NULL
|
||||
ENUM_VALUE_LIST NULL
|
||||
READ_ONLY NO
|
||||
COMMAND_LINE_ARGUMENT REQUIRED
|
||||
VARIABLE_NAME AUTOCOMMIT
|
||||
SESSION_VALUE ON
|
||||
GLOBAL_VALUE ON
|
||||
|
Reference in New Issue
Block a user