1
0
mirror of https://gitlab.isc.org/isc-projects/bind9.git synced 2025-04-18 09:44:09 +03:00

42960 Commits

Author SHA1 Message Date
Mark Andrews
ebe751f88c Add dns_zone_getzoneversion
Returns the EDNS ZONEVERSION for the zone.  Return the database
specific version otherwise return a type 0 version (serial).
2025-03-24 22:16:09 +00:00
Mark Andrews
5c126a7e66 Add dns_db_getzoneversion
Provides a database method to return a database specific EDNS
ZONEVERSION option.  The default EDNS ZONEVERSION is serial.
2025-03-24 22:16:09 +00:00
Mark Andrews
998b93acca Add EDNS ZONEVERSION option counter 2025-03-24 22:16:09 +00:00
Mark Andrews
2356b75e8c Add support for EDNS ZONEVERSION to dig
This add the +[no]zoneversion option to dig which adds the
EDNS ZONEVERSION option to requests.
2025-03-24 22:16:09 +00:00
Mark Andrews
ec3cbc5468 Extend message code to display ZONEVERSION 2025-03-24 22:16:09 +00:00
Mark Andrews
b2c2b6755e Check EDNS ZONEVERSION when parsing OPT record 2025-03-24 22:16:09 +00:00
Michal Nowak
687cbc99f4 fix: ci: Set more lenient respdiff limits
After !9950, respdiff's maximal disagreement percentage needs to be
adjusted as target disagreements between the tested version of the
"main" branch and the reference one jumped for the respdiff,
respdiff:asan, and respdiff:tsan jobs from on average 0.07% to 0.16% and
from 0.12% to 0.17% for the respdiff-third-party job.

In !9950, we concluded setting MAX_DISAGREEMENTS_PERCENTAGE to double
the average disagreement percentage works fine in the CI.

Merge branch 'mnowak/more-lenient-respdiff-limits' into 'main'

See merge request isc-projects/bind9!10293
2025-03-24 14:11:03 +00:00
Michal Nowak
9acc0c8543 Set more lenient respdiff limits
After !9950, respdiff's maximal disagreement percentage needs to be
adjusted as target disagreements between the tested version of the
"main" branch and the reference one jumped for the respdiff,
respdiff:asan, and respdiff:tsan jobs from on average 0.07% to 0.16% and
from 0.12% to 0.17% for the respdiff-third-party job.

In !9950, we concluded setting MAX_DISAGREEMENTS_PERCENTAGE to double
the average disagreement percentage works fine in the CI.
2025-03-24 14:10:26 +00:00
Mark Andrews
49ecb158d4 fix: dev: Fix adbname reference
Call `dns_adbname_ref` before calling `dns_resolver_createfetch` to
ensure `adbname->name` remains stable for the life of the fetch.

Closes #5239

Merge branch '5239-fix-adb-reference-counting' into 'main'

See merge request isc-projects/bind9!10290
2025-03-21 00:26:25 +00:00
Mark Andrews
8e7229f641 Fix gaining adbname reference
Call dns_adbname_ref before calling dns_resolver_createfetch to
ensure adbname->name remains stable for the life of the fetch.
2025-03-20 23:25:29 +00:00
Evan Hunt
3415392d01 fix: dev: Optimize key ID check when searching for matching keys
When searching through a DNSKEY or KEY rrset for the key matching a particular algorithm and ID, it's a waste of time to convert every key into a `dst_key` object; it's faster to compute the key ID from the rdata, then do the full key conversion after determining that we've found the right key. This optimization was already used in the validator, but it's been refactored for code clarity, and is now also used in query.c and message.c.

Merge branch 'each-refactor-key-search' into 'main'

See merge request isc-projects/bind9!10258
2025-03-20 18:25:05 +00:00
Evan Hunt
2e6107008d optimize key ID check when searching for matching keys
when searching a DNSKEY or KEY rrset for the key that matches
a particular algorithm and ID, it's a waste of time to convert
every key into a dst_key object; it's faster to compute the key
ID by checksumming the region, and then only do the full key
conversion once we know we've found the correct key.

this optimization was already in use in the validator, but it's
been refactored for code clarity, and is now also used in query.c
and message.c.
2025-03-20 18:22:58 +00:00
Evan Hunt
341b962665 move dns_zonekey_iszonekey() to dns_dnssec module
dns_zonekey_iszonekey() was the only function defined in the
dns_zonekey module, and was only called from one place. it
makes more sense to group this with dns_dnssec functions.
2025-03-20 18:22:58 +00:00
Alessio Podda
d3db9ccf53 chg: dev: Switch symtab to use fxhash hashing
This merge request resolves some performance regressions introduced
with the change from isc_symtab_t to isc_hashmap_t.

The key improvements are:

1. Using a faster hash function than both isc_hashmap_t and
   isc_symtab_t. The previous implementation used SipHash, but the
   hashflood resistance properties of SipHash are unneeded for config
   parsing.
2. Shrinking the initial size of the isc_hashmap_t used inside
   isc_symtab_t. Symtab is mainly used for config parsing, and the
   when used that way it will have between 1 and 50 keys, but the
   previous implementation initialized a map with 128 slots.
   By initializing a smaller map, we speed up mallocs and optimize for
   the typical case of few config keys.
3. Slight optimization of the string matching in the hashmap, so that
   the tail is handled in a single load + comparison, instead of byte
   by byte.
   Of the three improvements, this is the least important.

Merge branch 'alessio/fxhash-symtab' into 'main'

See merge request isc-projects/bind9!10204
2025-03-20 13:00:12 +00:00
alessio
e1e10adc3a Switch symtab to use fxhash hashing
This merge request resolves some performance regressions introduced
with the change from isc_symtab_t to isc_hashmap_t.

The key improvements are:

1. Using a faster hash function than both isc_hashmap_t and
   isc_symtab_t. The previous implementation used SipHash, but the
   hashflood resistance properties of SipHash are unneeded for config
   parsing.
2. Shrinking the initial size of the isc_hashmap_t used inside
   isc_symtab_t. Symtab is mainly used for config parsing, and the
   when used that way it will have between 1 and ~50 keys, but the
   previous implementation initialized a map with 128 slots.
   By initializing a smaller map, we speed up mallocs and optimize for
   the typical case of few config keys.
3. Slight optimization of the string matching in the hashmap, so that
   the tail is handled in a single load + comparison, instead of byte
   by byte.
   Of the three improvements, this is the least important.
2025-03-20 11:26:09 +01:00
Matthijs Mekking
d2214cb704 fix: usr: Fix several small DNSSEC timing issues
The following small issues related to `dnssec-policy` have been fixed:
- In some cases the key manager inside BIND 9 could run every hour, while it could have run less often.
- While `CDS` and `CDNSKEY` records will be removed correctly from the zone when the corresponding `DS` record needs to be updated, the expected timing metadata when this will happen was never set.
- There were a couple of cases where the safety intervals are added inappropriately, delaying key rollovers longer than necessary.
- If you have identical `keys` in your `dnssec-policy`, they may be retired inappropriately. Note that having keys with identical properties is discouraged in all cases.

Closes #5242

Merge branch '5242-several-keymgr-issues' into 'main'

See merge request isc-projects/bind9!10251
2025-03-20 10:13:22 +00:00
Matthijs Mekking
3e836a87e6 Update Retired and Removed if we update lifetime
If we are updating the lifetime, and it was not set before, also
set/update the Retired and Removed timing metadata.
2025-03-20 10:12:16 +00:00
Matthijs Mekking
b93cb2e80e Fix a key generation issue in the tests
The dnssec-keygen command for the ZSK generation for the zone
multisigner-model2.kasp was wrong (no ZSK was generated in the setup
script, but when 'named' is started, the missing ZSK was created
anyway by 'dnssec-policy'.
2025-03-20 10:12:16 +00:00
Matthijs Mekking
6c6b8796d3 Fix keymgr bug wrt setting the next time
Only set the next time the keymgr should run if the value is non zero.
Otherwise we default back to one hour. This may happen if there is one
or more key with an unlimited lifetime.
2025-03-20 10:12:16 +00:00
Matthijs Mekking
8c9d2eb2bf keymgr: also set DeleteCDS when setting PublishCDS
The keymgr never set the expected timing metadata when CDS/CDNSKEY
records for the corresponding key will be removed from the zone. This
is not troublesome, as key states dictate when this happens, but with
the new pytest we use the timing metadata to determine if the CDS and/or
CDNSKEY for the given key needs to be published.
2025-03-20 10:12:16 +00:00
Matthijs Mekking
63edc4435f Fix wrong usage of safety intervals in keymgr
There are a couple of cases where the safety intervals are added
inappropriately:

1. When setting the PublishCDS/SyncPublish timing metadata, we don't
   need to add the publish-safety value if we are calculating the time
   when the zone is completely signed for the first time. This value
   is for when the DNSKEY has been published and we add a safety
   interval before considering the DNSKEY omnipresent.

2. The retire-safety value should only be added to ZSK rollovers if
   there is an actual rollover happening, similar to adding the sign
   delay.

3. The retire-safety value should only be added to KSK rollovers if
   there is an actual rollover happening. We consider the new DS
   omnipresent a bit later, so that we are forced to keep the old DS
   a bit longer.
2025-03-20 10:12:16 +00:00
Matthijs Mekking
ef671919d5 Fix a small keymgr bug
While converting the kasp system test to pytest, I encountered a small
bug in the keymgr code. We retire keys when there is more than one
key matching a 'keys' line from the dnssec-policy. But if there are
multiple identical 'keys' lines, as is the case for the test zone
'checkds-doubleksk.kasp', we retire one of the two keys that have the
same properties.

Fix this by checking if there are double matches. This is not fool proof
because there may be many keys for a few identical 'keys' lines, but it
is good enough for now. In practice it makes no sense to have a policy
that dictates multiple keys with identical properties.
2025-03-20 10:12:16 +00:00
Mark Andrews
329a332708 fix: usr: Fix write after free in validator code
Raw integer pointers were being used for the validator's nvalidations
and nfails values but the memory holding them could be freed before
they ceased to be used.  Use reference counted counters instead.

Closes #5239

Merge branch '5239-use-counter-for-nvalidations-and-nfailss' into 'main'

See merge request isc-projects/bind9!10248
2025-03-20 01:30:11 +00:00
Mark Andrews
bfbaacc9a0 Use reference counted counters for nfail and nvalidations
The fetch context that held these values could be freed while there
were still active pointers to the memory.  Using a reference counted
pointer avoids this.
2025-03-20 09:12:49 +11:00
Andoni Duarte Pintado
8dba96d71e Merge tag 'v9.21.6' 2025-03-19 17:36:14 +01:00
Michal Nowak
1d688e89b8 fix: test: Fix the log-report-channel zones check
The check looks for logs that are not present, fails to make the
possible failure visible, and fails to bump the check enumerator:

    I:checking that log-report-channel zones fail if '*._er/TXT' is missing (129)
    grep: test.out4.129: No such file or directory
    grep: test.out4.129: No such file or directory
    I:checking that raw zone with bad class is handled (129)

The issue appeared in #3659.

Merge branch 'mnowak/checkzone-test-fix' into 'main'

See merge request isc-projects/bind9!10286
2025-03-19 08:00:42 +00:00
Michal Nowak
c2e60f9a5a Fix the log-report-channel zones check
The check looks for logs that are not present, fails to make the
possible failure visible, and fails to bump the check enumerator:

    I:checking that log-report-channel zones fail if '*._er/TXT' is missing (129)
    grep: test.out4.129: No such file or directory
    grep: test.out4.129: No such file or directory
    I:checking that raw zone with bad class is handled (129)
2025-03-19 08:00:19 +00:00
Mark Andrews
ed727ee924 fix: test: Fix failing grep invocation on OpenBSD
Lines starting with A or NSEC are expected but not matched with the
OpenBSD grep. Extended regular expressions with direct use of
parentheses and the pipe symbol is more appropriate.

    I:checking RRSIG query from cache (154)
    I:failed

The issue appeared in #4805.

Merge branch 'mnowak/openbsd-grep-fix' into 'main'

See merge request isc-projects/bind9!10285
2025-03-19 00:04:39 +00:00
Michal Nowak
00584d6f29 Fix failing grep invocation on OpenBSD
Lines starting with A or NSEC are expected but not matched with the
OpenBSD grep. Extended regular expressions with direct use of
parentheses and the pipe symbol is more appropriate.

    I:checking RRSIG query from cache (154)
    I:failed
2025-03-18 23:29:22 +00:00
Arаm Sаrgsyаn
d30b9eb46e fix: usr: Fix resolver statistics counters for timed out responses
When query responses timed out, the resolver could incorrectly increase the regular responses counters, even if no response was received. This has been fixed.

Closes #5193

Merge branch '5193-resolver-statistics-counters-fix' into 'main'

See merge request isc-projects/bind9!10227
2025-03-18 17:05:23 +00:00
Aram Sargsyan
0c7fa8d572 Test resolver statistics when responses time out
Add a test to check that the timed out responses do not skew the
normal responses statistics counters, and that they do update the
timeouts counter.
2025-03-18 16:20:59 +00:00
Aram Sargsyan
830e548111 Fix the resolvers RTT-ranged responses statistics counters
When a response times out the fctx_cancelquery() function
incorrectly calculates it in the 'dns_resstatscounter_queryrtt5'
counter (i.e. >=1600 ms). To avoid this, the rctx_timedout()
function should make sure that 'rctx->finish' is NULL. And in order
to adjust the RTT values for the timed out server, 'rctx->no_response'
should be true. Update the rctx_timedout() function to make those
changes.
2025-03-18 16:20:59 +00:00
Aram Sargsyan
12e7dfa397 Fix resolver responses statistics counter
The resquery_response() function increases the response counter without
checking if the response was successful. Increase the counter only when
the result indicates success.
2025-03-18 16:20:59 +00:00
Michał Kępień
c6e5710846 chg: test: asyncserver.py: TCP improvements
This branch started off as `michal/upforwd-asyncserver`.  It quickly
turned out that the critical `asyncserver.py` change that was needed for
the `upforwd` system test was for the server to be able to read multiple
TCP queries on a single connection.  As currently present in `main`,
`asyncserver.py` closes every client connection after servicing a single
query.  Retaining that behavior would cause the `upforwd` system test to
fail and, in general, capturing all data sent by a client seems more
useful in tests than just closing connections quickly.  `asyncserver.py`
can always be extended in the future (e.g. by adding a new
`ResponseAction` that the networking code would react to) to reinstate
the original behavior, if it turns out to be necessary.

While working on changing that particular `asyncserver.py` behavior, I
noticed a couple of other deficiencies in the TCP connection handling
code, so I started addressing them.  One thing led to another and before
I noticed, enough changes were applied to be worth doing a separate
merge request, particularly given that the actual rewrite of
`upforwd/ans4/ans.pl` using `asyncserver.py` is trivial once the
required changes to `asyncserver.py` itself are applied.

Merge branch 'michal/asyncserver-tcp-improvements' into 'main'

See merge request isc-projects/bind9!10276
2025-03-18 15:30:35 +00:00
Michał Kępień
575a874582
Handle queries indefinitely on each TCP connection
Instead of closing every incoming TCP connection after handling a single
query, continue receiving queries on each TCP connection until the
client disconnects itself.  When coupled with response dropping, this
enables silently receiving all incoming data, simulating an unresponsive
server.
2025-03-18 16:28:18 +01:00
Michał Kępień
68fe9a5df5
Enable receiving chunked TCP DNS messages
A TCP DNS client may send its queries in chunks, causing
StreamReader.read() to return less data than previously declared by the
client as the DNS message length; even the two-octet DNS message length
itself may be split up into two single-octet transmissions.  Sending
data in chunks is valid client behavior that should not be treated as an
error.  Add a new helper method for reading TCP data in a loop, properly
distinguishing between chunked queries and client disconnections.  Use
the new method for reading all TCP data from clients.
2025-03-18 16:28:18 +01:00
Michał Kępień
8c3f673f37
Extend TCP logging
Emit more log messages from TCP connection handling code and extend
existing ones to improve debuggability of servers using asyncserver.py.
2025-03-18 16:28:18 +01:00
Michał Kępień
748ed4259b
Handle connection resets during reading
A TCP peer may reset the connection at any point, but asyncserver.py
currently only handles connection resets when it is sending data to the
client.  Handle connection resets during reading in the same way.
2025-03-18 16:28:18 +01:00
Michał Kępień
a956947fba
Refactor AsyncDnsServer._handle_tcp()
Split up AsyncDnsServer._handle_tcp() into a set of smaller methods to
improve code readability.
2025-03-18 16:28:18 +01:00
Michał Kępień
e4c3186a7c
Gracefully handle TCP client disconnections
Prevent premature client disconnections during reading from triggering
unhandled exceptions in TCP connection handling code.
2025-03-18 16:28:18 +01:00
Michał Kępień
5764a9d660
Simplify peer address formatting
Add a helper class, Peer, which holds the <host, port> tuple of a
connection endpoint and gets pretty-printed when formatted as a string.
This enables passing instances of this new class directly to logging
functions, eliminating the need for the AsyncDnsServer._format_peer()
helper method.
2025-03-18 16:28:18 +01:00
Nicki Křížek
a2042e603e chg: ci: Allow re-run of the shotgun jobs to reduce false positives
The false positive rate is about 10-20 % when evaluating shotgun results
from a single run. Attempt to reduce the false positive rate by allowing
a re-run of failed jobs.

Merge branch 'nicki/ci-shotgun-reduce-false-positives' into 'main'

See merge request isc-projects/bind9!10271
2025-03-18 09:29:10 +00:00
Nicki Křížek
5eab352478 Allow re-run of the shotgun jobs to reduce false positive
The false positive rate is about 10-20 % when evaluating shotgun results
from a single run. Attempt to reduce the false positive rate by allowing
a re-run of failed jobs.

While there is a slight risk that barely noticable decreases in
performance might slip by more easily in MRs, they'd still likely pop up
during nightly or pre-release testing.

Also increase the tolerance threshold for DoH latency comparisons, as
those tests often experience increased jitter in the tail end latencies.
2025-03-18 10:19:32 +01:00
Nicki Křížek
7f8226a039 Adjust the load factor for shotgun:tcp test
With the slightly decreased load for the TCP test, the results appear to
be a little bit more stable.
2025-03-18 10:19:32 +01:00
Michał Kępień
192627db10 chg: test: Use isctest.asyncserver in the "qmin" test
Replace custom DNS servers used in the "qmin" system test with new code
based on the isctest.asyncserver module.  The revised code employs zone
files and a limited amount of custom logic, which massively improves
test readability and maintainability, extends logging, and fixes
non-compliant replies sent by some of the custom servers in response to
certain queries (e.g. AA=0 in authoritative empty non-terminal
responses, non-glue address records in ADDITIONAL section).

Merge branch 'michal/qmin-asyncserver' into 'main'

See merge request isc-projects/bind9!10195
2025-03-18 05:55:17 +00:00
Michał Kępień
dfd37918d6
Broaden vulture exclude glob for ans.py servers
The vulture tool seems to be unable to follow how the parent classes
defined in bin/tests/system/qmin/qmin_ans.py use mandatory properties
specified by child classes in bin/tests/system/qmin/ans*/ans.py.  Make
the tool ignore not just ans.py servers, but also *_ans.py utility
modules above the ansX/ subdirectories to prevent false positives about
unused code from causing CI pipeline failures.
2025-03-18 06:19:01 +01:00
Michał Kępień
f413ddbe5f
Ignore .hypothesis files created by system tests
Some versions of the Hypothesis Python library - notably the one
included in stock OS repositories for Ubuntu 20.04 Focal Fossa - cause a
.hypothesis file to be created in a Python script's working directory
when the hypothesis module is present in its import chain.  Ignore such
files by adding them to the list of expected test artifacts to prevent
pytest teardown checks from failing due to these files appearing in the
file system after running system tests.
2025-03-18 06:19:01 +01:00
Michał Kępień
a799dd04ad
Fix PYTHONPATH set for ans.py servers by start.pl
Commit 6c010a5644324947c8c13b5600cd8d988ae7684f caused the PYTHONPATH
environment variable to be set for ans.py servers started using
start.pl.  However, no system test has actually used the new
isctest.asyncserver module since that change was applied, so it has not
been noticed until now that including the source directory in PYTHONPATH
is only sufficient for in-tree builds.  Include the build directory
instead of the source directory in the PYTHONPATH environment variable
set for ans.py servers started by start.pl so that they work correctly
for both in-tree and out-of-tree builds.
2025-03-18 06:19:01 +01:00
Michał Kępień
7faa34c6ee
Use isctest.asyncserver in the "qmin" test
Replace custom DNS servers used in the "qmin" system test with new code
based on the isctest.asyncserver module.  The revised code employs zone
files and a limited amount of custom logic, which massively improves
test readability and maintainability, extends logging, and fixes
non-compliant replies sent by some of the custom servers in response to
certain queries (e.g. AA=0 in authoritative empty non-terminal
responses, non-glue address records in ADDITIONAL section).
2025-03-18 06:19:01 +01:00
Ondřej Surý
575a2e5f11 rem: dev: Cleanup BIND 8 compatibility code
There was some code in dns_resolver unit meant to keep compatibility with BIND 8 breaking the DNS protocol.  These should not be needed anymore.

Merge branch 'ondrej/resolver-bind-8-cleanup' into 'main'

See merge request isc-projects/bind9!10270
2025-03-18 00:12:31 +00:00