mirror of
https://github.com/facebook/rocksdb.git
synced 2025-09-19 04:21:39 +03:00
Summary: Add the MultiScanArgs::use_async_io option and an implementation that uses ReadAsync() for multiscan. Read requests are submitted during Prepare() and polled during the actual scan.

Pull Request resolved: https://github.com/facebook/rocksdb/pull/13932

Test Plan:
- Updated the existing unit test to use async_io.
- Crash test: `python3 -u ./tools/db_crashtest.py whitebox --iterpercent=60 --prefix_size=-1 --prefixpercent=0 --readpercent=0 --test_batches_snapshots=0 --use_multiscan=1 --read_fault_one_in=0 --kill_random_test=88888 --interval=60 --multiscan_use_async_io=1 --mmap_read=0`

Benchmark:
- Default multiscan benchmark:
```
Set up:
/db_bench --benchmarks="fillseq,compact" --disable_wal=1 --threads=1 --num_levels=1 --compaction_style=2 --fifo_compaction_max_table_files_size_mb=1000 --write_buffer_size=268435456

Without async IO:
./db_bench --db="/tmp/rocksdbtest-543376/dbbench" --use_existing_db=1 --benchmarks=multiscan --disable_auto_compactions=1 --seek_nexts=100 --threads=32 --duration=10 --statistics=1 --use_direct_reads=1 --multiscan_use_async_io=0
multiscan : 415.569 micros/op 75805 ops/sec 10.355 seconds 784968 operations; (multscans:24999)
rocksdb.read.async.micros COUNT : 0

With async IO:
./db_bench --db="/tmp/rocksdbtest-543376/dbbench" --use_existing_db=1 --benchmarks=multiscan --disable_auto_compactions=1 --seek_nexts=100 --threads=32 --duration=10 --statistics=1 --use_direct_reads=1 --multiscan_use_async_io=1
multiscan : 413.236 micros/op 76044 ops/sec 10.375 seconds 788968 operations; (multscans:24999)
rocksdb.read.async.micros COUNT : 3916499

Similar performance.
```
- Larger scans, more scans per multiscan, no IO coalescing (so that async IO can progress while scanning), and a single thread:
```
multiscan_stride = 1000
multiscan_size = 100
seek_nexts = 1000

./db_bench --db="/tmp/rocksdbtest-543376/dbbench" --use_existing_db=1 --benchmarks=multiscan --disable_auto_compactions=1 --threads=1 --duration=10 --statistics=0 --use_direct_reads=1 --cache_size=2097152 --multiscan_size=100 --multiscan_stride=1000 --seek_nexts=1000 --seed=1 --multiscan_coalesce_threshold=0 --multiscan_use_async_io=0

Without async IO:
multiscan : 20495.205 micros/op 48 ops/sec 10.002 seconds 488 operations; (multscans:488)

With async IO:
multiscan : 18337.883 micros/op 54 ops/sec 10.013 seconds 546 operations; (multscans:546)

~10% improvement in throughput
```

Reviewed By: xingbowang

Differential Revision: D82077818

Pulled By: cbi42

fbshipit-source-id: 66e32cf4039183c4841827409286dfbaa6dfbcd8
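
Illustration (not part of the change): the summary describes read requests being submitted during Prepare() and polled while scanning. The sketch below shows only that control flow with an in-memory stand-in. FakeFile, SketchMultiScan, Submit, and Poll are hypothetical names invented for this example; they are not RocksDB APIs, and the real implementation in the pull request may be structured differently.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical in-memory stand-in for a file that supports async reads.
// None of these names are RocksDB APIs.
class FakeFile {
 public:
  // Synchronous read: returns the block contents immediately.
  std::string Read(int block_id) const {
    return "block-" + std::to_string(block_id);
  }

  // Queue a read request and return a handle for later polling.
  std::size_t Submit(int block_id) {
    submitted_.push_back(block_id);
    return submitted_.size() - 1;
  }

  // Complete the request identified by `handle` and return its data.
  std::string Poll(std::size_t handle) const {
    return Read(submitted_.at(handle));
  }

  std::size_t submitted_count() const { return submitted_.size(); }

 private:
  std::vector<int> submitted_;
};

// Hypothetical scanner: submits reads in Prepare(), polls them in Scan().
class SketchMultiScan {
 public:
  explicit SketchMultiScan(FakeFile* file) : file_(file) {}

  // With use_async_io set, every read is submitted up front.
  // Otherwise nothing is issued until Scan() needs the data.
  void Prepare(const std::vector<int>& block_ids, bool use_async_io) {
    block_ids_ = block_ids;
    handles_.clear();
    use_async_io_ = use_async_io;
    if (!use_async_io_) return;
    for (int id : block_ids_) handles_.push_back(file_->Submit(id));
  }

  // Consume blocks in order. On the async path each block is obtained by
  // polling its previously submitted request; otherwise it is read inline.
  std::vector<std::string> Scan() const {
    std::vector<std::string> out;
    for (std::size_t i = 0; i < block_ids_.size(); ++i) {
      out.push_back(use_async_io_ ? file_->Poll(handles_[i])
                                  : file_->Read(block_ids_[i]));
    }
    return out;
  }

 private:
  FakeFile* file_;
  std::vector<int> block_ids_;
  std::vector<std::size_t> handles_;
  bool use_async_io_ = false;
};
```

In this sketch, calling Prepare({1, 2, 3}, true) leaves three requests queued on the FakeFile before Scan() runs, while passing false queues none. Both paths return the same blocks, which mirrors the intent that the option changes when IO is issued, not what the scan returns.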