Summary: We don't know if this pointer remains valid after the callback.
Reviewed By: yangchi
Differential Revision: D22709445
fbshipit-source-id: 7802ab08052b06af0268652a44a080d9d9673122
Summary:
Adds support for timestamping on TX (TX byte events). This allows the application to determine when a byte that it previously wrote to the transport was put onto the wire.
Callbacks are processed within a new function `QuicTransportBase::processCallbacksAfterWriteData`, which is invoked by `writeSocketDataAndCatch`.
Reviewed By: mjoras
Differential Revision: D22008855
fbshipit-source-id: 99c1697cb74bb2387dbad231611be58f9392c99f
Summary:
Introduces framework for handling callbacks when a `ByteEvent` occurs.
- Adds `ByteEventCallback`, a new class that can be used to notify a callback implementation about delivery (ACK) of a byte and is used in the subsequent diff for notifications about TX of a byte.
- Adds `ByteEvent` and `ByteEventCancellation`, both of which are used to communicate ancillary information when a ByteEvent happens to the callback implementation. This makes it easier to add additional fields (such as kernel timestamps, information about ACK delays, etc) in the future, without requiring the function signatures of all callback implementations to be updated.
- For the moment, `DeliveryCallback` inherits from `ByteEventCallback` so that existing implementations do not need to be adjusted; `DeliveryCallback` implements `ByteEventCallback`'s methods and transforms the received `ByteEvents` into the expected callback. I will deprecate `DeliveryCallback` in a subsequent diff.
- Refactors existing logic for handling delivery callbacks to be generic and capable of handling different types of byte events. This allows the same logic for registering and cancelling byte events to be reused for ACK and TX timestamps; custom logic is only required to decide when to trigger the callback.
Reviewed By: mjoras
Differential Revision: D21877803
fbshipit-source-id: 74f344aa9f83ddee2a3d0056c97c62da4a581580
Summary:
Adds `QuicSocket::InstrumentationObserver`, an observer that can be registered to receive various transport events.
The ultimate goal of this class is to provide an interface similar to what we have through TCP tracepoints. This means we need to be able to register multiple callbacks.
- Initially, the first event exposed through the callback is app rate limited. In the future, we will expose retransmissions (which are loss + TLP), loss events (confirmed), spurious retransmits, RTT measurements, and raw ACK / send operations to enable throughput and goodput measurements.
- Multiple callbacks can be registered, but a `folly::small_vector` is used to minimize memory overhead in the common case of between 0 and 2 callbacks registered.
- We currently have a few different callback classes to support instrumentation, including `QuicTransportStatsCallback` and `QLogger`. However, neither of these meet our needs:
- We only support installing a single transport stats callback and QLogger callback, and they're both specialized to specific use cases. TransportStats is about understanding in aggregation how often an event (like CWND limited) is occurring, and QLogger is about logging a specific event, instead of notifying a callback about an event and allowing it to decide how to proceed.
- Ideally, we can find a way to create a callback class that handles all three cases; we can start strategizing around that as we extend `InstrumentationObserver` and identify overlap.
Differential Revision: D21923745
fbshipit-source-id: 9fb4337d55ba3e96a89dccf035f2f6978761583e
Summary:
Notify connection callback implementation when the transport becomes app rate limited
- The callback implementation is not required to define a handler for this callback (not pure virtual).
- D21923745 will add an instrumentation callback class that can be used to allow other components to receive this signal. I'm extending to the connection callback because the overhead seems low, and we already have the connection callback registered for `HQSession`.
Reviewed By: mjoras, lnicco
Differential Revision: D21923744
fbshipit-source-id: 153696aefeab82b7fd8a6bc299c011dcce479995
Summary:
Adds `QuicSocket::LifecycleObserver`, an observer that is notified when a socket is closed and destroyed.
- Can be used by instrumentation that ties its lifetime to that of the socket.
- Multiple observer can be registered, but a `folly::small_vector` is used to minimize memory overhead in the common case of 0 - 2 being registered.
- `folly::AsyncSocket` has a matching interface being added (D21613750), so instrumentation can follow the same paradigm for both QUIC and TCP.
- In the future, will extend to also be triggered when a transport becomes connected, similar to what we have for `AsyncSocket`.
Reviewed By: mjoras
Differential Revision: D21653651
fbshipit-source-id: 0aa374cc0dd9b9c11d81b49597326bd53264531d
Summary: Without this option the read callbacks come in an somewhat random order based on stream ID. This enforces ordering by stream ID, which is useful for some applications.
Reviewed By: yangchi
Differential Revision: D22442837
fbshipit-source-id: cd97de9760deff7b19841f46e23da2004df31492
Summary:
Callbacks notifying that max stream limit has been increased and new streams can be created.
This get triggered every time limit gets bumped, doesn't track if limit was previously exhausted or not.
Reviewed By: mjoras
Differential Revision: D22339814
fbshipit-source-id: 6bcd0b52adb61e1ac440d11559dc8544ad7aa1ac
Summary:
The problem with Ping being a simple frame:
(1) All SimpleFrames are in the same scheduler. So sending ping means we may
also send other frames which can be problematic if we send in Initial or
Handshake space
(2) Ping isn't retranmisttable. But other Simple frames are. So we are
certainly setting this wrong when we send pure Ping packet today.
That being said, there are cases where we need to treat Ping as retransmittable.
One is when it comes to update ack state: If peer sends us Ping, we may want to
Ack early rather than late. so it makes sense to treat Ping as retransmittable.
Another place is insertion into OutstandingPackets list. When our API user sends
Ping, then also add a Ping timeout. Without adding pure Ping packets into OP list,
we won't be able to track the acks to our Pings.
Reviewed By: mjoras
Differential Revision: D21763935
fbshipit-source-id: a04e97b50cf4dd4e3974320a4d2cc16eda48eef9
Summary:
On loss timer, currently we knock all handshake packets out of the OP
list and resend everything. This means miss RTT sampling opportunities during
handshake if loss timer fires, and given our initial loss timer is likely not a
good fit for many networks, it probably fires a lot.
This diff keeps handshake packets in the OP list, and add packet cloning
support to handshake packets so we can clone them and send as probes.
With this, the handshake alarm is finally removed. PTO will take care of all
packet number space.
The diff also fixes a bug in the CloningScheduler where we missed cipher
overhead setting. That broke a few unit tests once we started to clone
handshake packets.
The writeProbingDataToSocket API is also changed to support passing a token to
it so when we clone Initial, token is added correctly. This is because during
packet cloning, we only clone frames. Headers are fresh built.
The diff also changed the cloning behavior when there is only one outstanding
packet. Currently we clone it twice and send two packets. There is no point of
doing that. Now when loss timer fires and when there is only one outstanding
packet, we only clone once.
The PacketEvent, which was an alias of PacketNumber, is now a real type that
has both PacketNumber and PacketNumberSpace to support cloning of handshake
packets. I think in the long term we should refactor PacketNumber itself into a
real type.
Reviewed By: mjoras
Differential Revision: D19863693
fbshipit-source-id: e427bb392021445a9388c15e7ea807852ddcbd08
Summary: Updating signature of writeChain to stop returning IOBuf, as implementation never actually returns back buffer and always writes full buffer.
Reviewed By: mjoras
Differential Revision: D21740607
fbshipit-source-id: f473ed8f3c6c6cbe2dd5db8f1247c912f3e77d0b
Summary:
as title. Instead of checking against the packet size limit, this
leaves a 10 bytes room since we have a bug that writes out packets that's
slightly larger than udpSendPacketLen. Most of such packet will be 1-2 bytes
larger than original packets.
Reviewed By: mjoras
Differential Revision: D21642386
fbshipit-source-id: 6ca68d48828cb7f8ee692e0d5f452f5389a56bfd
Summary: As in title. Also explicitly do some copying of values out of the stream manger.
Reviewed By: yangchi
Differential Revision: D21705460
fbshipit-source-id: 2399b8561a2aa3b6d2b64d154f56ceff22c40186
Summary: There's no particular reason to wait for the drain period before removing state. By doing this a failed migration will immediately trigger the server to drop state, triggered a stateless reset signal to the client sooner.
Reviewed By: yangchi, lnicco
Differential Revision: D21643179
fbshipit-source-id: a60ca2c92935a3e6ba773d7936c25317733f7b97
Summary: The spec suggests doing this, and it is a better semantic than only considering the local one.
Reviewed By: yangchi
Differential Revision: D21433433
fbshipit-source-id: c38abc04810eb8807597991ce8801d81f9edc462
Summary:
Some of these are relevant to performance, others are just cleanups of the API usage.
The performance diffs are generally eschewing multiple lookups into the underlying structures, by reusing the iterator from `find`.
The cleanups are mostly getting rid of checks against `getStream`, which cannot return `nullptr`.
Reviewed By: yangchi
Differential Revision: D20987998
fbshipit-source-id: 1b95fd8b14da934fc921527aa858dcebf80ec8e9
Summary:
We always add stuff into peekable streams set since everything with a
read buffer is peekable. This makes the this set potentially big, and the
find_if quite expensive. For apps that isn't interested in peeking, this is a
waste of time.
Reviewed By: mjoras, lnicco
Differential Revision: D20972192
fbshipit-source-id: d696ded936140c622e019d608c72a646df405111
Summary: Doing an extra lookup here is wasteful.
Reviewed By: yangchi, lnicco
Differential Revision: D20848227
fbshipit-source-id: 8a1fee28597fae3cf1ac284a1bd781936ddff931
Summary:
This is a pretty confusing API behavior. This means that when reading from stream A, stream B could potentially be removed if it doesn't have a read callback set. This effectively means you cannot safely call read from onNew*Stream.
There's no real reason the check for a closed stream here, as the streams will get cleaned up from the onNetworkData or read looper calls.
Reviewed By: yangchi
Differential Revision: D20848090
fbshipit-source-id: 38eba9e5a24b6ddf5bf05ac75787dda9c001c1c6
Summary:
Right now we use un-sanitized error message on onConnectionError only.
This also uses it in app callbacks
Reviewed By: lnicco
Differential Revision: D20843327
fbshipit-source-id: c81896d41b712a7165ac6f6b381d3687ecca2a3a
Summary: It's not a great practice to leak the excpetion string via a conn close, but it is useful for the app to be able to report what the exception string was.
Reviewed By: yangchi
Differential Revision: D20628591
fbshipit-source-id: bf6eb5f33f516cec0034caed53da998643fcc120
Summary:
This ensures they are available to the whole stack rather than the transport only. The validator needs it in the server case, and will soon need it in the client case, so that seems appropriate to make it available.
Pull Request resolved: https://github.com/facebookincubator/mvfst/pull/117
Reviewed By: yangchi
Differential Revision: D20536366
Pulled By: mjoras
fbshipit-source-id: a76d369c0a82b9be1f985aed1f33f7a6b338a2ae
Summary:
Currnetly there isn't a way for apps to unregister a pending
WriteCallbacks for a stream. resetStream() does that if the transport isn't in
Closed state. This diff adds such support even if transport is already in
Closed state. This solves the problem where app has a class that's both stream
ReadCallback and stream WriteCallback and readError would kill the callback
object itself. The new API gives such class a chance to remove itself from the
transport.
Reviewed By: mjoras
Differential Revision: D20545067
fbshipit-source-id: 81d9f025310769aadef062711a49adc47a0639d0
Summary:
This somewhat contrains what applications using mvfst can do with their errors, as it makes the semantics of onConnectionEnd vs onConnectionError tied to the value of GenericApplicationError::NO_ERROR.
For now, we believe this is fine, and fixes a case of connection close mis classification by HQSession.
Reviewed By: yangchi
Differential Revision: D20550139
fbshipit-source-id: eec7d90c33141bfa7f1280bf5b569818890d1130
Summary: Previously we would end up writing a non-application close when the application calls close(folly::none). This isn't correct, as those cases should be an application error with no error.
Reviewed By: afrind
Differential Revision: D20518529
fbshipit-source-id: fe069cccc32bd550fb3ec599694110955816993f
Summary: Resetting a stream has the semantics of abandoning all writes on that stream. As such, the application resetting the stream is an implicit signal that it does not care about further writes. It is reasonable then for an application to not expect further write callbacks to trigger.
Reviewed By: lnicco
Differential Revision: D20462859
fbshipit-source-id: b6701e6a262d618c5cd93fd1531095a134f6554e
Summary:
Similar to the exiting empty write loop callback. The new API will
trigger when we read from socket but back with empty hands.
Reviewed By: lnicco
Differential Revision: D20130432
fbshipit-source-id: 9b61310b4ea4c5c7999742c5a8761a831f20f7b7
Summary:
(1) The first change is the pacing rate calculation is simplified. It
removes the interval calculation and just uses the timer tick as the interval.
Then it calculates the burst size from there. For most cases these two
calculation should land at the same result, except when the
`cwnd < minBurstSize * tick / RTT`. In that case, the current calculation would
spread writes evenly across one RTT, assuming no new Ack arrives during the RTT;
while the new calculation uses the first a few ticks to finish the cwnd amount
of data.
(2) Then this diff changes how we compensate late timer. Now the pacer will
maintain a nextWriteTime_ and lastWriteTime_, which makes it easier to
calculate time elapsed since last write. Then each time writer tries to write,
it will be allowed to write timeElapsed * pacingRate. This is much more
intuitive than the current logic.
(3) The diff also adds pacing limited tracking into the pacer. An expected
pacing rate is cached when pacing rate is refreshed by congestion controller.
Then with packets sent out, Pacer keeps calculating the current send rate. When
the send rate is lower, Pacer sets pacingLimited_ to true. Otherwise false.
Only when the connection is not pacing limited, the lastWriteTime_ will be
packet sent time, otherwise it will be set to the last nextWriteTime_. In other
words: if the send rate is lower than expected, we use the expected send time
instead of real send time to calculate time elapsed, to allow higher late
timer compenstation, to give pacer a chance to catch up.
(4) Finally this diff removes the token collecting behavior in the pacer. I
think having tokens increaed, instead of reset, when an ack refreshes the pacing
rate or when we compensate late time, is quite confusing to some people. After
all the above changes, I found tperf can still sustain good throughput without
always increase tokens, and rally actualy gives even better results. So i think
we can remove this part of the pacer that's potentially very confusing to
people who don't know how we got there.
Reviewed By: mjoras
Differential Revision: D19252744
fbshipit-source-id: b83e4a01fc812fc52117f3ec0f5c3be1badf211f
Summary:
All instancesi of LIKELY and UNLIKELY probably should be removed. We will
add them back in if we see pathologies in performance profiles.
Reviewed By: mjoras
Differential Revision: D19163441
fbshipit-source-id: c4c2494d18ecfd28f00af1e68ecaf1e85c1a2e10
Summary:
Currently, before server generate the destination CID, we route packets with client's address, port and client's source connection ID. But now that client can use 0-len source connection ID, the different connections from the same client address and port will be routed to the same server connections.
This diff changes it to use client's initial destination connection ID as part of the routing key.
Reviewed By: udippant
Differential Revision: D19268354
fbshipit-source-id: 837f5bd2f1e3a74957afacf7aabad922b1719219
Summary:
In QuicExceptions, in the case where the toString method was able to statically determine the response strings, we simply return the string literals in a folly::StringPiece instead of unnecessarily copying them into std::string.
Some toString methods had some dynamically generated responses and thus could not be updated. Added a TODO explaining the fact.
Reviewed By: mjoras
Differential Revision: D19192117
fbshipit-source-id: d9e5f202f9bf240009e8b8fd16f337b0506fbeb0
Summary: This is sanitizing our error strings so that we do not leak them on the wire in connection close reasons.
Reviewed By: yangchi
Differential Revision: D18657317
fbshipit-source-id: 06cdd983fd2c9cade77f8410e124920e4cdfac59
Summary:
Previously we track them since we thought we can get some additional
RTT samples. But these are bad RTT samples since peer can delays the acking of
pure acks. Now we no longer trust such RTT samples, there is no reason to keep
tracking pure ack packets.
Reviewed By: mjoras
Differential Revision: D18946081
fbshipit-source-id: 0a92d88e709edf8475d67791ba064c3e8b7f627a
Summary:
In the current client code we read one packet, go back to epoll, and then read
another packet. This is not very efficient.
This changes it so that we can read multiple packets in one go from an epoll
callback.
This only performs changes on the client
Reviewed By: mjoras
Differential Revision: D18797962
fbshipit-source-id: 81be82111064ade4fe3a07b1d9d3d01e180f29f5
Summary: The state machine logic is quite abstruse, this modifies it to make it more readable.
Reviewed By: siyengar
Differential Revision: D18488301
fbshipit-source-id: c6fd52973880931e34904713e8b147f56d0c4629
Summary:
F14 should be faster and have lower memory urilization for near-empty sets and maps. For most H3 connections these are mosotly going to be near-empty, so CPU wins will likely be minimal.
For usecases that have extremely high numbers of streams, there are likely going to be CPU wins.
Reviewed By: yangchi
Differential Revision: D18484047
fbshipit-source-id: 7f5616d6d6c8651ca5b03468d7d8895d1f51cb53
Summary:
QLogConnectionMigrationEvent:
Allow observing client-side ConnectionMigration attempts (replacing the
socket), and observing the server-side changing the peer address it
is writing to.
QLogPathValidationEvent:
Allow observing successful/failed path validation attempts.
Success is considered as a correct PathResponse being returned.
A Failure is only published on the timeout expiring, not an invalid
PathChallenge frame being returned (we do not cancel this).
There are already QlogEvents for PathChallenge/PathResponse that
can be observed.
Reviewed By: JunqiWang
Differential Revision: D18340999
fbshipit-source-id: 512108f82a6e082021c0bd3254f108c128b17ba3
Summary: inspect PN, and when it reaches 2^62 -2 trigger a transport close through a pending event.
Reviewed By: yangchi
Differential Revision: D18239661
fbshipit-source-id: 1a218678099016693149e12ff121e2a39b95aecc