Summary: In `writeCryptoAndAckDataToSocket`, add an additional `writeProbingDataToSocket` call at the end that is limited to the number of CRYPTO frame-containing packets just written, gated by the new `TransportSetttings` field `immediatelyRetransmitInitialPackets`.
Reviewed By: mjoras
Differential Revision: D64485616
fbshipit-source-id: f0927a3796767700fd46673195e1cd4e1bbbcbeb
Summary: This kinda makes sense to just use as a default.
Reviewed By: kvtsoy, sharmafb
Differential Revision: D64066392
fbshipit-source-id: 0915f163c0483af6bec014bde61e82b6ee2ac6cb
Summary: As in title. Without this the keepalive option is less effective since a single PING packet loss can cause issues.
Reviewed By: kvtsoy
Differential Revision: D63397494
fbshipit-source-id: 7ef6b6f54189609e3a96409ac9c035c475dc0569
Summary: This is my second attempt at D61871891. This time, I ran `xplat/cross_plat_devx/somerge_maps/compute_merge_maps.py`, which generated quic/somerge_defs.bzl
Reviewed By: kvtsoy
Differential Revision: D61975459
fbshipit-source-id: bec62acb2b400f4a102574e8c882927f41b9330e
Summary: `PacketEvent` is a very inaccurate and misleading name. We're basically using this as an identifier for cloned packets, so `ClonedPacketIdentifier` is a much better.
Reviewed By: kvtsoy
Differential Revision: D61871891
fbshipit-source-id: f9c626d900c8b7ab7e231c9bad4c1629384ebb77
Summary:
**Context**
The `BufAccessor` is used to access a contiguous section of memory. Right now, it works with a `Buf` under the hood.
**Overall plan**
The plan is to change the `BufAccessor` to use a `uint8_t*` instead. Since we're using section of contiguous memory, there's no need to use a chained buffer abstraction here. This'll move us closer to deprecating the usage `folly::IOBuf`.
**What this diff is doing**
Most use cases of the `BufAccessor` look like the following:
```
auto buf = bufAccessor.obtain();
// Do something with buf, like calling trimEnd
bufAccessor.release(buf)
```
I'm adding APIs to the `BufAccessor` so that there's no need to `obtain()` and `release()` the `Buf`. We'd instead just call an API on the `BufAccessor`, which would call that same API on the underlying `folly::IOBuf`. Later on, we'll change the `BufAccessor` to use a `uint8_t*` under the hood.
I'm currently leaving in the `obtain()`, `release()`, and `buf()` APIs because Fizz and the AsyncUDPSocket expect `folly::IOBuf` as inputs in many of their APIs. Once those callsites are migrated off `folly::IOBuf`, we can remove these APIs.
Reviewed By: mjoras
Differential Revision: D60973166
fbshipit-source-id: 52aa3541d0c4878c7ee8525d70ac280508b61e24
Summary:
Timely reaction to congestion requires relaying any CE marks to the sender as soon as possible.
This change schedules an ack to be sent whenever incoming packets are received with CE marks. This will only happen when the readEcnOnIngress option is enabled.
Reviewed By: mjoras
Differential Revision: D58423959
fbshipit-source-id: 30f8cf8b11d0446985c2d87d7df67c24c0d5afdf
Summary: Useful to track so we can optimize it.
Reviewed By: kvtsoy
Differential Revision: D58196435
fbshipit-source-id: c15c409f998430fd0c6acde0539d0345123a9e15
Summary: We have a lot of optionals that are either integral values or std::chrono::microseconds. These end up wasting memory, where we can instead store sentinel values to encode whether the value is there or not. This reduces the effective range of the type by one value, but that is an acceptable tradeoff.
Reviewed By: kvtsoy
Differential Revision: D57684368
fbshipit-source-id: b406b86011f9b8169b6e5e925265f4829928cc63
Summary:
The idea here is to make it so we can swap out the type we are using for optionality. In the near term we are going to try swapping towards one that more aggressively tries to save size.
For now there is no functional change and this is just a big aliasing diff.
Reviewed By: sharmafb
Differential Revision: D57633896
fbshipit-source-id: 6eae5953d47395b390016e59cf9d639f3b6c8cfe
Summary:
Remove cmsgs from each outstanding packet. Storing this as a map is excessive. Instead store it as a packed enum. This supports one kind of mark per packet.
If we ned to we could in the future support multiple types of marks per packet by using a bitset-style flag, e.g. each bit index representing a different mark. For now just use an enum.
Reviewed By: jbeshay
Differential Revision: D57397865
fbshipit-source-id: 6d34215c9d7e39537c44c6c304a8ce3a5883541e
Summary: This is effectively an unused field encoding duplicated information, but it was widespread.
Reviewed By: kvtsoy
Differential Revision: D57289922
fbshipit-source-id: ca1499e2576e5ae28e3880b865a29c2b8d9a3d1b
Summary:
The existing batch writers do not handle failed writes to the AsyncUDPSocket. A packet that fails to be written is detected as a packet loss later when feedback is received from the peer. This negatively impacts the congestion controller because of the fake loss signal, and artificially inflates the number of retransmitted packets/bytes.
This change adds a new batch writer (SinglePacketBackpressuretBatchWriter) that retains the buffers when a write fails. For subsequent writes, the writer retries the same buffer. No new packets are scheduled until the retried buffer succeeds.
Notes:
- To make sure that retry writes are scheduled, the write callback is installed on the socket when a buffer needs to be retried.
- The retries are for an already scheduled packet. The connection state reflects the timing of the first attempt. This could still have an impact on rtt samples, etc. but it this is a milder impact compared to fake losses/retranmissions.
- Any changes outside of the batch writer only impact the new batch writer. Existing batch writers do not use the fields and are not affected by the changes in this diff.
Reviewed By: kvtsoy
Differential Revision: D57597576
fbshipit-source-id: 9476d71ce52e383c5946466f64bb5eecd4f5d549
Summary:
The time between iterations is not significant, so we can just call `Clock::now` once in the beginning and reuse the same value.
I ran a canary with some counters to get an idea of the amount of time between the start of the first iteration and the end of the last iteration (see D57510979), and:
* Edge p100: 1500 us
* olb p100: 1900 us
* Edge p99: 413 us
* olb p99: 396 us
The wins we're seeing are 0.13% relative CPU.
Reviewed By: kvtsoy
Differential Revision: D57594650
fbshipit-source-id: 9d0f827564179745cd83eb6ca211df68d3f23f8b
Summary: This setting is no longer needed.
Reviewed By: mjoras
Differential Revision: D57112554
fbshipit-source-id: 4720dd864f24ac21a775419522254195c5ea215f
Summary:
std::deque by default allocates a large block on the heap for managing its state. This has a fixed memory cost both per connection (because of the crypto streams) and per stream. CircularDeque by comparison does not have this overhead and is only 32 bytes per structure.
For example:
```
"size": 656,
"name": "readBuffer",
"typePath": ["a0", "conn_", "ptr_val", "cryptoState", "ptr_val", "initialStream", "readBuffer"],
"typeNames": ["std::deque<quic::StreamBuffer, std::allocator<quic::StreamBuffer>>"],
```
This should save about 6kB per connection with no streams, and additional memory per stream.
Reviewed By: jbeshay, hanidamlaj
Differential Revision: D56578219
fbshipit-source-id: ab2b529fa9a4169bea6862b11ccbf178c6f5abb1
Summary:
std::deque by default allocates a large block on the heap for managing its state. This has a fixed memory cost both per connection (because of the crypto streams) and per stream. CircularDeque by comparison does not have this overhead and is only 32 bytes per structure.
For example:
```
"size": 656,
"name": "readBuffer",
"typePath": ["a0", "conn_", "ptr_val", "cryptoState", "ptr_val", "initialStream", "readBuffer"],
"typeNames": ["std::deque<quic::StreamBuffer, std::allocator<quic::StreamBuffer>>"],
```
This should save about 6kB per connection with no streams, and additional memory per stream.
Reviewed By: kvtsoy
Differential Revision: D56496459
fbshipit-source-id: ec4049614939f885f64481bd374c81e74ad47c66
Summary: This reduces the number of stats callbacks when processing multiple packets in rapid succession.
Reviewed By: mjoras
Differential Revision: D56315022
fbshipit-source-id: 750024301f28b21e3125c144ead6f115706736a4
Summary:
The current ackVisitor passed to processAckFrame() gets executed with every acked frame.
Not every check in that visitor needs to be performed for every frame.
This refactors it into two different visitors, one that is called once per acked packet and another that is called once per acked frame.
Reviewed By: mjoras
Differential Revision: D56090734
fbshipit-source-id: 7b0877684e439f9e88c0aae7a294245570cf8309
Summary: Neither QUIC not TransportMonitor is using the `packetsInflight` field of the `OutstandingPacketMetadata`
Reviewed By: hanidamlaj
Differential Revision: D55926288
fbshipit-source-id: 32efd66add1e6374a8d3446ff635fe582b36e644
Summary: Neither QUIC not TransportMonitor is using the `totalBodyBytesSent` field of the `OutstandingPacketMetadata`
Reviewed By: hanidamlaj
Differential Revision: D55897240
fbshipit-source-id: 521f8a016f838dd1fe0593daa7a4e45c4fd222cf
Summary:
In order to exercise the key update path more frequently, this adds an option to initiate one key update early into the connection after a fixed number of packets is written. After that, keys are updated periodically at a significantly lower frequency.
Currently the defaults are to initiate the first key update after 500 packets, then periodically every 1M packets (which is under the confidentiality limit).
Reviewed By: mjoras
Differential Revision: D54082404
fbshipit-source-id: 3edd7489ceb8c643a319edb06e006803f1e736ba
Summary:
Allow the server/client transport to initiate periodic key update. It's defaulted to being disabled.
The new logic for initiating and verifying a key update was handled correctly by the peer is consolidated in QuicTransportFunctions.
Reviewed By: mjoras
Differential Revision: D53109624
fbshipit-source-id: 0c3a944978fc0e0a84252da953dc116aa7c26379
Summary:
This stack adds key update support to Mvfst client and server. This diff adds the main logic for detecting key updates in the QuicReadCodec. When an update is successful, the server transport reacts to it by updating the write phase and cipher.
The high level design is as follows:
- The QuicReadCodec is responsible for detecting incoming key update attempts by the peer, as well as tracking any ongoing locally-initiated key updates.
- Upon detecting a successful key update, the QuicReadCodec updates its state. The Server/Client transport reacts to this change by updating its write phase and cipher.
- A locally initiated key update starts with updating the write phase and key, and signaling the read codec that a key update has been initiated.
- The read codec keeps this in a pending state until a packet is successfully received in the new phase.
- Functions for syncing the read/write phase on incoming key updates, as well as initiating and verifying outgoing key updates are abstracted in QuicTransportFunctions and are used by both the client and server transports.
- Common handshake functions used for rotating the keys are now in HandshakeLayer that is shared by both client and server handshakes.
Reviewed By: mjoras
Differential Revision: D53016559
fbshipit-source-id: 134e965dabd62917193544a9655a4eb8868ab7f8
Summary: This is similar to the previous commit. We use a stack-based IOBuf instead of allocating one on the heap. This saves CPU because allocations/deallocations on the stack are a lot cheaper.
Reviewed By: hanidamlaj
Differential Revision: D53101722
fbshipit-source-id: dd59a7eca6498db19472a62f954db3e2f2f27a42
Summary: Creating an IOBuf on the heap when we use `folly::IOBuf::wrapBuffer` is expensive.
Reviewed By: hanidamlaj
Differential Revision: D52506216
fbshipit-source-id: eed2b77beae0419b542b0461303785cc175e3518
Summary:
This is the major transition that updates mvfst code to use the new interfaces. The new Folly implementations of the interfaces maintain all the existing behavior of folly types so this should not introduce any functional change. The core changes are:
- Update the BatchWriters to use the new interfaces.
- Update the FunctionLooper to use the new interfaces.
- Change QuicServerTransport to take the folly types and wrap them in the new types for use in the QuicTransportBase.
The rest of the diff is for updating all the existing uses of the QuicTrasnport to initialize the necessary types and pass them to the QUIC transport instead of directly passing folly types.
Reviewed By: mjoras
Differential Revision: D51413481
fbshipit-source-id: 5ed607e12b9a52b96148ad9b4f8f43899655d936
Summary:
- Remove setCustomTransportParameter, which (based on the quic v19 rfc), verifies whether a parameter is within the private range [0xff00, 0xffff]
> Values with the first byte in the range 0x00 to 0xfe (in hexadecimal) are assigned via the Specification Required policy [RFC8126].
- Consolidating adding MaxStreamGroups transport parameter into all other transport parameters extension.
More specifically, `QuicClientTransport::maybeEnableStreamGroups()` logic is now moved into `QuicClientTransport::setSupportedExtensionTransportParameters()`
Reviewed By: mjoras
Differential Revision: D50461610
fbshipit-source-id: 802b546c8364586cdcf36a230b156ca140c57ce4
Summary:
This diff changes `QuicAsyncUDPSocketWrapper` so that it is an abstraction layer that inherits from `QuicAsyncUDPSocketType`, instead of simply being a container with aliases.
- Key changes in `QuicAsyncUDPSocketWrapper.h`, the rest of the updates switch us from using `QuicAsyncUDPSocketType` to `QuicAsyncUDPSocketWrapper`.
- It's difficult to mock the UDP socket today given that we expose the entire `folly::AsyncUDPSocket` type to the higher layers of the QUIC stack. This complicates testing and emulation because any mock / fake has to implement low level primitives like `recvmmsg`, and because the `folly::AsyncUDPSocket` interface can change over time.
- Pure virtual functions will be defined in `QuicAsyncUDPSocketWrapper` in a follow up diff to start creating an interface between the higher layers of the mvfst QUIC stack and the UDP socket, and this interface will abstract away lower layer details such as `cmsgs` and `io_vec`, and instead focus on populating higher layer structures such as `NetworkData` and `ReceivedPacket` (D48714615). This will make it easier for us to mock or fake the UDP socket.
This diff relies on changes to `folly::MockAsyncUDPSocket` introduced in D48717389.
--
This diff is part of a larger stack focused on the following:
- **Cleaning up client and server UDP packet receive paths while improving testability.** We currently have multiple receive paths for client and server. Capabilities vary significantly and there are few tests. For instance:
- The server receive path supports socket RX timestamps, abet incorrectly in that it does not store timestamp per packet. In comparison, the client receive path does not currently support socket RX timestamps, although the code in `QuicClientTransport::recvmsg` and `QuicClientTransport::recvmmsg` makes reference to socket RX timestamps, making it confusing to understand the capabilities available when tracing through the code. This complicates the tests in `QuicTypedTransportTests`, as we have to disable test logic that depends on socket RX timestamps for client tests.
- The client currently has three receive paths, and none of them are well tested.
- **Modularize and abstract components in the receive path.** This will make it easier to mock/fake the UDP socket and network layers.
- `QuicClientTransport` and `QuicServerTransport` currently contain UDP socket handling logic that operates over lower layer primitives such `cmsg` and `io_vec` (see `QuicClientTransport::recvmmsg` and `...::recvmsg` as examples).
- Because this UDP socket handling logic is inside of the mvfst transport implementations, it is difficult to test this logic in isolation and mock/fake the underlying socket and network layers. For instance, injecting a user space network emulator that operates at the socket layer would require faking `folly::AsyncUDPSocket`, which is non-trivial given that `AsyncUDPSocket` does not abstract away intricacies arising from the aforementioned lower layer primitives.
- By shifting this logic into an intermediate layer between the transport and the underlying UDP socket, it will be easier to mock out the UDP socket layer when testing functionality at higher layers, and inject fake components when we want to emulate the network between a mvfst client and server. It will also be easier for us to have unit tests focused on testing interactions between the UDP socket implementation and this intermediate layer.
- **Improving receive path timestamping.** We only record a single timestamp per `NetworkData` at the moment, but (1) it is possible for a `NetworkData` to have multiple packets, each with their own timestamps, and (2) we should be able to record both userspace and socket timestamps.
Reviewed By: jbeshay, hanidamlaj
Differential Revision: D48717388
fbshipit-source-id: 4f34182a69ab1e619e454da19e357a6a2ee2b9ab
Summary:
This limits the writableBytes by the available tokens (ie `ThrottlingSignalProvider::bytesToSend`), if the connection is being throttled and that value is provided by the throttling signal provider.
Note that these throttling signals to the transport are provided only for the connections that belong to a specific QE experiment.
Reviewed By: silver23arrow
Differential Revision: D47850261
fbshipit-source-id: 8c52e66198db6d1fee252cacea69b82963a1601a
Summary:
This will allow for the new SinglePacketInplaceBatchWriter to be used
on clients
Reviewed By: mjoras
Differential Revision: D47742872
fbshipit-source-id: da2869d4ea7693f2ced4f6d7901216a20da29fa7