Summary:
Previously, we maintained state and counters to count both, header and body
bytes together. This commit introduces additional counters and state to keep
track of just the body bytes that were sent and acked etc. Body bytes received
will be implemented later.
Reviewed By: bschlinker
Differential Revision: D27312049
fbshipit-source-id: 33f169c9168dfda625e86de45df7c00d1897ba7e
Summary: As in title. There's a bug here somewhere with empty write loops we need to find.
Reviewed By: yangchi
Differential Revision: D27279100
fbshipit-source-id: e1d26fbf8d6df1590d464a6504a8b940b46794e0
Summary:
Previously we would only send probes for the first space which had one available, i.e. Initial before Handshake before AppData. Since we only have one PTO timer this can lead to situations where we perpetually probe with only Initials, which can significantly delay the handshake if we should have probed with Handshakes.
With this diff we will keep the single PTO timer but aggressively write more probes from all spaces if they are available.
Additionally this refactors some counters into EnumArrays
Reviewed By: yangchi
Differential Revision: D27235199
fbshipit-source-id: ef3614a833bf0f02f5806846a1335fa7ac2a4dc8
Summary: try to land this again without the compiler flags change
Differential Revision: D26958681
fbshipit-source-id: d00659aaf819dbb2942da8b41deab3d108a19f0f
Summary:
Update retransmission queue in the stream after BufferMeta is written,
for both new data and lost data cases.
Reviewed By: mjoras
Differential Revision: D26260987
fbshipit-source-id: 15a4fa17826426b4b972b63cf370fe791b3101ff
Summary:
Real stream data and BufferMeta represented data will be scheduled by different
schedulers. For that reason, this diff adds another PriorityQueue into the stream
manager.
Reviewed By: mjoras
Differential Revision: D26132498
fbshipit-source-id: c69cb671c9a9f975d82efab8f1244a2f3c6c9297
Summary:
- Adds counter of the number of ack-eliciting packets sent; useful for understanding lost / retransmission numbers
- Adds the following to `OutstandingPacketMetadata`
- # of packets sent and # of ack-eliciting packets sent; this enables indexing of each packet when processed by an observer on loss / RTT event, including understanding its ordering in the flow of sent packets.
- # of packets in flight; useful during loss analysis to understand if self-induced congestion could be culprit
- Fixes the inflightBytes counter in `OutstandingPacketMetadata`; it was previously _not_ including the packets own bytes, despite saying that it was.
Right now we are passing multiple integers into the `OutstandingPacket` constructor. I've switched to passing in `LossState` to avoid passing in two more integers, although I will need to come back and clean up the existing ones.
Differential Revision: D25861702
fbshipit-source-id: e34c0edcb136bc1a2a6aeb898ecbb4cf11d0aa2c
Summary:
I think this should just work without the trailing `_E`. It was added
when we mixed up our own union based variant and boost::variant. Some compiler
flags didn't like that. Now we no longer have mixed up cases, this should be
fine
Reviewed By: lnicco
Differential Revision: D25589393
fbshipit-source-id: 6430dc20f8e81af0329d89e6990c16826da168b8
Summary:
We currently have a count of the number of packets retransmitted, and the number of packets marked as lost spuriously, but we do *not* have the total number of packets sent or lost, which is necessary to calculate statistics such as % of packets retransmitted % of losses that are spurious.
At the moment, we do not count loss events for D6D packets. Will likely add a separate counter for those in a subsequent diff.
Reviewed By: xttjsn
Differential Revision: D25540531
fbshipit-source-id: 80db729eb8c91f7805d342f4aab302a5b3ca4347
Summary: This reduces dependencies for both testing and instrumentation.
Reviewed By: mjoras
Differential Revision: D23997313
fbshipit-source-id: 5eb3a790c7bb2569dc1e941e3911ad4aac4e9258
Summary: As a drive-by: fixed a bug where I didn't cancel the pending event when timeouts expire.
Reviewed By: mjoras
Differential Revision: D23910767
fbshipit-source-id: 233e590c17a6c6b7a5f4a251bd8f78f6dfae3c0b
Summary:
Due to high number of RTT samples I refactored the OutstandingPacket to
split the packet data and packet metrics. We know have access to metrics
without the need of saving the packet data.
Pull Request resolved: https://github.com/facebookincubator/mvfst/pull/178
Reviewed By: mjoras
Differential Revision: D23711641
Pulled By: bschlinker
fbshipit-source-id: 53791f1f6f6e184f37afca991a873af05909fbd2
Summary:
This is the transport function that sends d6d probe to socket. It
- checks if there's the pending event `sendD6DProbePacket`, which will be set by upon d6d's probe timer expiration.
- uses the current probeSize to instantiate a D6DProbeScheduler, which will be changed by a small d6d state machine.
Reviewed By: yangchi
Differential Revision: D23193967
fbshipit-source-id: dfe6ce831cfd6ff0470e801644b95d4e8da34e87
Summary:
it turns out some client can delay ack for long time, randomly, which
may make us miss a good bandwidth update
Reviewed By: mjoras
Differential Revision: D23686348
fbshipit-source-id: 6524d08b6db03ede59a72049b4af98609740463a
Summary:
According to the spec, loss of d6d probe packet should not affect congestion
control, but AFAIU its ack should be considered a normal ack and has all the
implications of an ack (e.g. sample srtt, increment largest ack num).
Additionally, although d6d probe uses ping frame, neither sending a d6d probe
nor receiving the ack should have any bearing on either pendingEvents.sendPing
or pendingEvents.cancelPingTimeout.
To differentiate a d6d probe from a normal packet in tx/rx path, the isD6DProbe
flag is added. I also added a new struct to store d6d specific states (1 field
in this diff, more to come in subsequent diffs). In updateConnection, we
identify a d6d probe by comparing the packet sequence num with the sequence num
stored in the lastProbe field of the d6d state.
Reviewed By: mjoras
Differential Revision: D22551400
fbshipit-source-id: 85ec30c185666c3d5cf827bf03b4f92e6f22d4ec
Summary: This was probably a premature optimization and introduces complexity for dubious gain. Additionally a sequence of losses could potentially cause multiple updates to be delayed.
Reviewed By: yangchi
Differential Revision: D23628058
fbshipit-source-id: d6cf70baec8c34f0209ea791dadc724795fe0c21
Summary:
This is following a similar pattern than what was done for the client side.
Pull Request resolved: https://github.com/facebookincubator/mvfst/pull/160
Reviewed By: yangchi
Differential Revision: D23560951
Pulled By: xttjsn
fbshipit-source-id: 351417cbfa3230112fff4c4de59b307f88389cf6
Summary:
First step towards d6d. Semantically we need to separate the old `udpSendPacketLen` into `peerMaxPacketSize` as well as `currPMTU`. The former is directly tied to the peer's max_packet_size transport parameter whereas the second is controlled by d6d. To get the actual udp mss, call `conn_->getUdpSendPacketLen()`, which will use the minimum of the two if d6d is enabled, otherwise it will fallback to use `peerMaxPacketSize` only.
During processClientInitialParams and processServerInitialParams, we no longer need to check whether `canIgnorePathMTU` is set because that logic is moved to `setUdpSendPacketLen`. If d6d is enabled, we set both `peerMaxPacketSize` and `currPMTU` to `packetSize` because receiving an initial packet of size x indicates both that the peer accepts x-sized packet and that the PMTU is at least x.
Many call sites and tests are changed.
Faebook:
For now, d6d is considered enabled if `canIgnorePathMTU==false` and `turnoffPMTUD==true`. Down the road, from semantic & practical POV at least one of them should be renamed to something like `enableD6D`, since enabling d6d implies turning off PMTUD and that we should not ignore PMTU. We can keep one for the sake of testing.
Reviewed By: mjoras
Differential Revision: D22049806
fbshipit-source-id: 7a9b30b7e2519c132101509be56a9e63b803dc93
Summary: In the spec these are now lowercase.
Reviewed By: avasylev
Differential Revision: D22104880
fbshipit-source-id: adc9bbcd7bd9e7babb1946648d2867ec7dc9336b
Summary:
The problem with Ping being a simple frame:
(1) All SimpleFrames are in the same scheduler. So sending ping means we may
also send other frames which can be problematic if we send in Initial or
Handshake space
(2) Ping isn't retranmisttable. But other Simple frames are. So we are
certainly setting this wrong when we send pure Ping packet today.
That being said, there are cases where we need to treat Ping as retransmittable.
One is when it comes to update ack state: If peer sends us Ping, we may want to
Ack early rather than late. so it makes sense to treat Ping as retransmittable.
Another place is insertion into OutstandingPackets list. When our API user sends
Ping, then also add a Ping timeout. Without adding pure Ping packets into OP list,
we won't be able to track the acks to our Pings.
Reviewed By: mjoras
Differential Revision: D21763935
fbshipit-source-id: a04e97b50cf4dd4e3974320a4d2cc16eda48eef9
Summary:
On loss timer, currently we knock all handshake packets out of the OP
list and resend everything. This means miss RTT sampling opportunities during
handshake if loss timer fires, and given our initial loss timer is likely not a
good fit for many networks, it probably fires a lot.
This diff keeps handshake packets in the OP list, and add packet cloning
support to handshake packets so we can clone them and send as probes.
With this, the handshake alarm is finally removed. PTO will take care of all
packet number space.
The diff also fixes a bug in the CloningScheduler where we missed cipher
overhead setting. That broke a few unit tests once we started to clone
handshake packets.
The writeProbingDataToSocket API is also changed to support passing a token to
it so when we clone Initial, token is added correctly. This is because during
packet cloning, we only clone frames. Headers are fresh built.
The diff also changed the cloning behavior when there is only one outstanding
packet. Currently we clone it twice and send two packets. There is no point of
doing that. Now when loss timer fires and when there is only one outstanding
packet, we only clone once.
The PacketEvent, which was an alias of PacketNumber, is now a real type that
has both PacketNumber and PacketNumberSpace to support cloning of handshake
packets. I think in the long term we should refactor PacketNumber itself into a
real type.
Reviewed By: mjoras
Differential Revision: D19863693
fbshipit-source-id: e427bb392021445a9388c15e7ea807852ddcbd08
Summary:
The AckScheduler right now has two modes: Immediate mode which always
write acks into the current packet, pending mode which only write if
needsToSendAckImmediately is true. The FrameScheduler choose the Immdiate mode
if there are other data to write as well. Otherwise, it chooses the Pending
mode. But if there is no other data to write and needsToSendAckImmediately is
false, the FrameScheduler will end up writing nothing.
This isn't a problem today because to be on the write path, the shouldWriteData
function would make sure we either have non-ack data to write, or
needsToSendAckImmediately is true for a packet number space. But once we allow
packets in Initial and Handshake space to be cloned, we would be on the write
path when there are probe quota. The FrameScheduler's hasData function doesn't
check needsToSendAckImmediately. It will think it has data to write as long as
AckState has changed, but can ends up writing nothing with the Pending ack
mode.
I think given the write looper won't be schedule to loop when there is no
non-ack data to write and needsToSendAckImmediately is true, it's safe to
remove Pending ack mode from AckScheduler.
Reviewed By: mjoras
Differential Revision: D22044741
fbshipit-source-id: 26fcaabdd5c45c1cae12d459ee5924a30936e209
Summary: 0 is now a valid packet number, so we should make these optional. In cases where they are needed to construct packet builder, it should be safe to use 0 as default since it's only used for computing `twiceDistance` in PacketNumber.cpp.
Reviewed By: yangchi
Differential Revision: D21948454
fbshipit-source-id: af9fdc3e28ff85f1594296c4d436f24685a0acd6
Summary:
During PTO, when we try to write new data, we have been limiting to
only stream data and crypto data. This diff adds a few more types of data to
write during probes if they are available: Simple frame, Window updates, Rst
and Blocked frames.
Right now Acks won't be included here.
Reviewed By: mjoras
Differential Revision: D21460861
fbshipit-source-id: d75a296e96375690eee5e609efd7d1b5c103e8da
Summary: This is just in wrong place. There can be multiple frames per packet that we may be implicitly ACKing.
Reviewed By: lnicco
Differential Revision: D21493553
fbshipit-source-id: 89befab01c5a27d400089478717110bca8137d99
Summary: Without doing this the loss buffer can potentially carry data indefinitely.
Reviewed By: yangchi
Differential Revision: D21484470
fbshipit-source-id: 85ac9f274c9badb51eb431b85f67a654c399a018
Summary:
Currently the packet builder contructor will encode the packet
builder. This is fine when the builder creates its own output buffer. If later
on we decides not to use this builder, or it fails to build packet, the buffer
will be thrown away. But once the builder uses a buffer provided by caller, and
will be reused, we can no longer just throw it away if we decide not to use
this builder. So we have to delay the header encoding until we know we will use
the builder.
This is still not enough to solve the case where we want to use this builder,
it builds, then it fails . For that, we will need to retreat the tail position
of the IOBuf.
Reviewed By: mjoras
Differential Revision: D21000658
fbshipit-source-id: 4d758b3e260463b17c870618ba68bd4b898a7d4c
Summary: This is useful when you want to ensure that the IOBuf you pass in is encrypted inplace, as opposed to potentially creating a new one.
Reviewed By: yangchi
Differential Revision: D21135253
fbshipit-source-id: 89b6e718fc8da1324685c390c721a564bb77d01d
Summary:
As it turns out, the extra indirection from storing a unique_ptr is not worse than the gain from using an `F14ValueMap` versus an `F14VectorMap`.
This reduces the `find` cost measurably in profiles, and doesn't appear to have any real negative effects otherwise.
Reviewed By: yangchi
Differential Revision: D20923854
fbshipit-source-id: a75c4649ea3dbf0e6c89ebfe0d31d082bbdc31fd
Summary:
Currently we return the exact writable bytes number from a real
congestion controller or Path Challenger. This diff round the number up to the
nearest multiple of packet length. Doing so can greatly reduce weird bytes
counting/checking bugs we have around packet writing.
Reviewed By: mjoras
Differential Revision: D20265678
fbshipit-source-id: 2973dde3acc4b2008337127482185f34e16efb43
Summary:
our convention has always been to put the mock in the test dir under
the real class
Reviewed By: lnicco
Differential Revision: D20104476
fbshipit-source-id: 5215ffc9af7a6d7a5ac41109723a71f68f852af7
Summary: When we don't use NiceMock we end up with a ton of spam in failing tests for every callback that we didn't EXPECT. This makes failed test output extremely noisy.
Reviewed By: sharmafb
Differential Revision: D19977113
fbshipit-source-id: 1a083fba13308cd3f2859da364c8106e349775bb
Summary:
(1) The first change is the pacing rate calculation is simplified. It
removes the interval calculation and just uses the timer tick as the interval.
Then it calculates the burst size from there. For most cases these two
calculation should land at the same result, except when the
`cwnd < minBurstSize * tick / RTT`. In that case, the current calculation would
spread writes evenly across one RTT, assuming no new Ack arrives during the RTT;
while the new calculation uses the first a few ticks to finish the cwnd amount
of data.
(2) Then this diff changes how we compensate late timer. Now the pacer will
maintain a nextWriteTime_ and lastWriteTime_, which makes it easier to
calculate time elapsed since last write. Then each time writer tries to write,
it will be allowed to write timeElapsed * pacingRate. This is much more
intuitive than the current logic.
(3) The diff also adds pacing limited tracking into the pacer. An expected
pacing rate is cached when pacing rate is refreshed by congestion controller.
Then with packets sent out, Pacer keeps calculating the current send rate. When
the send rate is lower, Pacer sets pacingLimited_ to true. Otherwise false.
Only when the connection is not pacing limited, the lastWriteTime_ will be
packet sent time, otherwise it will be set to the last nextWriteTime_. In other
words: if the send rate is lower than expected, we use the expected send time
instead of real send time to calculate time elapsed, to allow higher late
timer compenstation, to give pacer a chance to catch up.
(4) Finally this diff removes the token collecting behavior in the pacer. I
think having tokens increaed, instead of reset, when an ack refreshes the pacing
rate or when we compensate late time, is quite confusing to some people. After
all the above changes, I found tperf can still sustain good throughput without
always increase tokens, and rally actualy gives even better results. So i think
we can remove this part of the pacer that's potentially very confusing to
people who don't know how we got there.
Reviewed By: mjoras
Differential Revision: D19252744
fbshipit-source-id: b83e4a01fc812fc52117f3ec0f5c3be1badf211f
Summary:
Previously we stored an `IntervalSet` in each `WriteAckFrame`. We don't need to do this, as by the time we are writing an `ACK` the `IntervalSet` is complete. Instead of bothering with the `IntervalSet` operations, we can simply serialize it to a reverse-sorted `vector.`
Additionally this has some micro-optimizations for filling the ACK frame, with a new function for getting varint size.
Reviewed By: yangchi
Differential Revision: D19397728
fbshipit-source-id: ba6958fb36a4681edaa8394b1bcbbec3472e177d