1
0
mirror of https://github.com/facebook/rocksdb.git synced 2025-09-19 04:21:39 +03:00
Files
rocksdb/file
Changyu Bi 2620c85638 Support async IO for MultiScan (#13932)
Summary:
add option MultiScanArgs::use_async_io option and implementation for using ReadAsync() for multiscan. Read requests are submitted during Prepare() and polled during actual scanning.

Pull Request resolved: https://github.com/facebook/rocksdb/pull/13932

Test Plan:
- updated existing unit test to use async_io.
- crash test: `python3 -u ./tools/db_crashtest.py whitebox --iterpercent=60 --prefix_size=-1 --prefixpercent=0 --readpercent=0 --test_batches_snapshots=0 --use_multiscan=1 --read_fault_one_in=0 --kill_random_test=88888 --interval=60 --multiscan_use_async_io=1 --mmap_read=0`

Benchmark:
- Default multiscan benchmark:
```
Set up: /db_bench --benchmarks="fillseq,compact" --disable_wal=1 --threads=1 --num_levels=1 --compaction_style=2 --fifo_compaction_max_table_files_size_mb=1000 --write_buffer_size=268435456

Without async IO:
./db_bench --db="/tmp/rocksdbtest-543376/dbbench" --use_existing_db=1 --benchmarks=multiscan --disable_auto_compactions=1 --seek_nexts=100 --threads=32 --duration=10 --statistics=1 --use_direct_reads=1 --multiscan_use_async_io=0

multiscan    :     415.569 micros/op 75805 ops/sec 10.355 seconds 784968 operations; (multscans:24999)
rocksdb.read.async.micros COUNT : 0

With asycn IO:
./db_bench --db="/tmp/rocksdbtest-543376/dbbench" --use_existing_db=1 --benchmarks=multiscan --disable_auto_compactions=1 --seek_nexts=100 --threads=32 --duration=10 --statistics=1 --use_direct_reads=1 --multiscan_use_async_io=1

multiscan    :     413.236 micros/op 76044 ops/sec 10.375 seconds 788968 operations; (multscans:24999)
rocksdb.read.async.micros COUNT : 3916499

Similar performance.
```

- Larger scan, more scans per multiscan, do not coalesce IO so that async IO can progress while scanning, and use one thread:
```
multiscan_stride = 1000
multiscan_size = 100
seek_nexts = 1000

./db_bench --db="/tmp/rocksdbtest-543376/dbbench" --use_existing_db=1 --benchmarks=multiscan --disable_auto_compactions=1 --threads=1 --duration=10 --statistics=0 --use_direct_reads=1  --cache_size=2097152 --multiscan_size=100 --multiscan_stride=1000 --seek_nexts=1000 --seed=1 --multiscan_coalesce_threshold=0  --multiscan_use_async_io=0

Without async IO:
multiscan    :   20495.205 micros/op 48 ops/sec 10.002 seconds 488 operations; (multscans:488)

With async IO:
multiscan    :   18337.883 micros/op 54 ops/sec 10.013 seconds 546 operations; (multscans:546)

~10% improvement in throughput
```

Reviewed By: xingbowang

Differential Revision: D82077818

Pulled By: cbi42

fbshipit-source-id: 66e32cf4039183c4841827409286dfbaa6dfbcd8
2025-09-15 11:39:45 -07:00
..