The link check job of the "Check Markdown" workflow has had frequent intermittent spurious failures recently, caused by
links under the docs.github.com domain returning 403 HTTP status.
Others experiencing the same problem reported that they were able to work around the problem by providing a custom
`Accept-Encoding` HTTP request header. Although I was not able to find any explanation of why, it does appear to resolve
the problem.
Since the problem seems to be persistent and the only other workarounds I have identified are unappealing (considering
links returning 403 statuses alive, or ignoring all docs.github.com links), I think it is worth a try.
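As a rough illustration of the workaround, the sketch below builds (without sending) a request that carries a custom `Accept-Encoding` header. The link checker itself is configured through its own configuration file, so this is only a model of the idea; the function name and the header value are assumptions, not taken from the workflow.

```python
import urllib.request


def build_link_check_request(url: str, accept_encoding: str) -> urllib.request.Request:
    """Build (but do not send) a HEAD request with a custom Accept-Encoding header."""
    return urllib.request.Request(
        url,
        method="HEAD",
        headers={"Accept-Encoding": accept_encoding},
    )


# The header value here is an assumption chosen for illustration only.
request = build_link_check_request("https://docs.github.com/en", "gzip, deflate, br")
print(request.header_items())
```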
High quality feedback via GitHub issues is a very valuable contribution to the project. It is important to make the
issue creation and management process as efficient as possible for the contributors, maintainers, and developers.
Issue templates are helpful to the maintainers and developers because they establish a standardized framework for the
issues and encourage the contributors to provide the essential information.
The contributor is now presented with a web form when creating an issue. In the case of the library registration
maintenance requests, the form fields ask for the specific information the registry maintainers require. For more general
bug reports or feature requests, the forms use multi-line input fields that have the same formatting, preview, and attachment
capabilities as the standard GitHub Issue composer, in addition to other form components such as menus and checkboxes
where appropriate.
The use of this form-based system should provide a better experience for the contributors and library maintainers while
also resulting in higher quality issues by establishing a standardized framework for the issues and encouraging
contributors to provide the essential information.
A template chooser allows the contributor to select the appropriate template type, redirects support requests to the
appropriate communication channels via "Contact Links", and provides a prominent link to the security policy to guide any
vulnerability disclosures.
The clear separation of the types of issues encourages the reporter to fit their report into a specific issue category,
resulting in more clarity. Automatic labeling according to template choice allows the reporter to do the initial
classification.
At the time the reference configuration file was developed, it was in JSON, which does not support comments. I found the
need to add some internal explanatory commentary to some of the labels, so I added an arbitrary `notes` key as a
container for this information, and our JSON schema was also configured to accept this field.
I later decided to convert the files to YAML, since that is the language used by the majority of the asset configuration
files. At this point it became possible to use comments, but the `notes` field was already in place so it
seemed pointless to change it.
Validation of the configuration file was added to the "GitHub Label Sync" tool in the recent 2.1.0 release. Before
2.1.0, the tool ignored any additional properties in the label configuration objects. It now errors if there are any
unexpected properties.
This `notes` field now causes the configuration files that contain it to be considered invalid by the tool, and by our
custom JSON schema:
```
.github/label-configuration-files/labels.yml invalid
[
{
instancePath: '/8',
schemaPath: '#/items/additionalProperties',
keyword: 'additionalProperties',
params: { additionalProperty: 'notes' },
message: 'must NOT have additional properties'
}
]
```
So the `notes` field is hereby removed, with the contents moved to comments.
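The stricter validation can be modeled roughly as follows. This is a minimal sketch: the set of allowed properties is an assumption for illustration (the tool's real schema is authoritative), and the label data is hypothetical.

```python
# Assumed set of allowed label properties; the real schema is authoritative.
ALLOWED_LABEL_PROPERTIES = {"name", "color", "description", "aliases"}


def find_unexpected_properties(labels):
    """Return (index, property) pairs for every property outside the allowed set."""
    return [
        (index, key)
        for index, label in enumerate(labels)
        for key in label
        if key not in ALLOWED_LABEL_PROPERTIES
    ]


# Hypothetical label data: the second label still carries a `notes` field.
labels = [
    {"name": "type: example", "color": "ff0000", "description": "Hypothetical label"},
    {"name": "topic: example", "color": "00ffff", "notes": "Internal commentary"},
]
print(find_unexpected_properties(labels))
```

Moving the commentary into YAML comments leaves the parsed objects free of the extra property, so a check like this one passes.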
Although the submission process is completely automated, other requests (e.g., library repository URL update) are still
handled manually due to requiring additional operations on the backend.
The changes to the data files stored in this repository must be coordinated with the associated changes to the Library
Manager database and/or storage systems.
The dedicated "status: pending backend" repository label will be added to the issues and PRs which can be resolved only
after those external operations have been completed. This indicates their status in a clear and machine-readable
manner.
The `carlosperate/download-file-action` action is used by GitHub Actions workflows as a convenient way to download
external resources.
A major version ref has been added to that repository. It will always point to the latest release of the "1" major
version series. This means it is no longer necessary to do a full pin of the action version in use as before.
Use of the major version ref will cause the workflow to use a stable version of the action, while also benefiting from
ongoing development to the action up until such time as a new major release of the action is made. At that time we would
need to evaluate whether any changes to the workflow are required by the breaking change that triggered the major
release before manually updating the major ref (e.g., `uses: carlosperate/download-file-action@v2`). I think this
approach strikes the right balance between stability and maintainability for these workflows.
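To make the behavior of a major version ref concrete, here is a small sketch of which release such a ref would point to. The tag list is hypothetical and the function is only a model; the ref itself is maintained by the action's author, not computed by our workflows.

```python
def resolve_major_ref(release_tags, major):
    """Return the newest release tag in the given major series, or None."""
    candidates = []
    for tag in release_tags:
        parts = tag.lstrip("v").split(".")
        if len(parts) == 3 and all(part.isdigit() for part in parts):
            version = tuple(int(part) for part in parts)
            if version[0] == major:
                candidates.append((version, tag))
    # Versions are compared numerically, so 1.10.0 is newer than 1.9.0.
    return max(candidates)[1] if candidates else None


# Hypothetical release tags, for illustration only.
tags = ["v1.0.0", "v1.9.0", "v1.10.0", "v2.0.0"]
print(resolve_major_ref(tags, 1))
```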
In order to prevent confusing feedback from the bot, parallel runs of the "Manage PRs" workflow for a given PR are
prevented by canceling any in progress runs for that PR whenever it is triggered.
However, sometimes a trigger event does not result in a run. For example, the workflow is triggered by every comment on
the PR thread, but only those containing the text "ArduinoBot" result in a run. With the previous workflow configuration,
this meant that if anyone made an incidental comment on the PR during a workflow run, the true run was canceled by the
otherwise ignored trigger event, causing a loss of automation.
The solution is to adjust the concurrency configuration so that prior runs in progress are canceled only if the current
trigger event will result in a true run.
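The decision can be sketched as a small predicate. This is a simplified model rather than the workflow's actual expression: the function name is made up, and it assumes that every event other than a comment leads to a true run.

```python
def cancels_in_progress_runs(event_name, comment_body=""):
    """Decide whether a trigger event should cancel runs in progress for the same PR."""
    if event_name == "issue_comment":
        # Only comments that mention the bot result in a true run.
        return "ArduinoBot" in comment_body
    # Assumption: every other trigger event results in a true run.
    return True


print(cancels_in_progress_runs("issue_comment", "Thanks for the quick review!"))
print(cancels_in_progress_runs("issue_comment", "@ArduinoBot I fixed the problem"))
```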
I did not indent the added expression because doing so caused validation against the community developed GitHub Actions
workflow JSON schema from the JSON Schema Store to fail.
The reason is that lines with leading whitespace are not folded:
https://yaml.org/spec/1.2.2/#block-folding
Even though the resulting newlines in the expression don't cause any problems for GitHub Actions, the JSON Schema does
not have support for them. Since there is no explicit specification in the GitHub Actions documentation that newlines in
expressions are supported, I am hesitant to propose the necessary change to the schema.
It was previously possible to trigger the "Manage PRs" workflow for a pull request while a previous run for that PR is
already in progress.
When that happens, it can result in erroneous bot comments. For example:
1. Workflow run is automatically triggered by a push.
2. Contributor does not notice this and comments a mention of the bot to trigger the workflow.
3. The first workflow run finds the PR is compliant and merges it.
4. The second workflow run finds the PR is compliant and attempts to merge it.
5. The second workflow run fails the merge (because it is already merged) and informs the contributor that there was a
merge conflict they must resolve.
6. The contributor is not able to resolve the non-existent conflict and is left wondering whether their submission was
successful.
The solution is to configure the "Manage PRs" workflow so that a workflow run in progress is canceled if the workflow is
triggered again for that PR. The "concurrency group" name is the PR number, so workflow runs in progress for other PRs
would not be affected.
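The effect of keying the group on the PR number can be modeled with a dictionary. This is a toy simulation under assumed names (the run IDs and PR numbers are hypothetical); the real cancellation is done by GitHub Actions.

```python
def start_run(runs_in_progress, pr_number, run_id):
    """Start a run for a PR and return the ID of the run it cancels, if any."""
    canceled = runs_in_progress.get(pr_number)
    # The group is keyed on the PR number, so only that PR's run is replaced.
    runs_in_progress[pr_number] = run_id
    return canceled


runs = {}
first = start_run(runs, 42, "run-1")  # triggered by a push
other = start_run(runs, 43, "run-2")  # a different PR, unaffected
second = start_run(runs, 42, "run-3")  # bot mention on the same PR
print(first, other, second, runs)
```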
In order to facilitate the testing and review of proposed changes to the repository label infrastructure, the
"Sync Labels" template workflow does a dry run when triggered under conditions that indicate it would not be appropriate
to make real changes to the repository's labels. The changes that would have resulted are printed to the log, but not
actually made.
One of the criteria used to determine "dry run" mode usage is whether the event occurred on the repository's default
branch. A trigger on a development branch or for a pull request should not result in a change to the labels.
It turns out that GitHub does not define a `github.event.repository.default_branch` context item when a workflow is
triggered by a `schedule` event. This resulted in the workflow always running in "dry run" mode on a `schedule` trigger.
Since `schedule` and `repository_dispatch` triggers are only permitted for the default branch, there is no need to check
whether the event's ref matches the default branch and it is safe to always run in write mode on these events.
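A minimal sketch of the resulting determination is shown below. It covers only the default branch criterion described here; the real workflow may apply additional criteria, and the function and parameter names are assumptions.

```python
def is_dry_run(event_name, ref, default_branch):
    """Return True when the label sync should only print the changes it would make."""
    if event_name in ("schedule", "repository_dispatch"):
        # These events only occur on the default branch, so no ref check is needed.
        # The default branch value may also be undefined for them.
        return False
    return ref != f"refs/heads/{default_branch}"


print(is_dry_run("schedule", "refs/heads/main", None))
print(is_dry_run("push", "refs/heads/some-feature", "main"))
```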
An incorrect context key name resulted in a conditional that was impossible to satisfy, meaning the dry run determination
code was solely dependent on the check for whether the workflow was triggered from the default branch.
The repository contains labels for each of the distinct operations that may be requested for the library registrations or index data. These labels are defined in a local configuration file for management by the "Sync Labels" GitHub Actions workflow, in combination with the shared universal Arduino tooling project repository labels configuration file.
Since there has not yet been an internal request for a type change operation, the associated "topic: type change" label was forgotten when creating the configuration file. It will be best to have the label in place so every standard operation can be accomplished without unnecessary complication.
On every push that changes relevant files, and periodically, configure the repository's issue and pull request labels
according to the universal, shared, and local label configuration files.
The automatically generated access token provided by `${{ secrets.GITHUB_TOKEN }}` is used to automatically merge
submission pull requests if they are compliant with all requirements.
If the pull request's branch is behind the parent repository and the code of any GitHub Actions workflow has been
modified in the parent since that time, the token permissions are downgraded, which causes the GitHub API request for
merging the PR to fail with a 403 status.
Previously, this was treated as an unexpected merge failure caused by some problem not resolvable by the PR author. Since
the PR author can easily resolve the failure by bringing their branch up to date (even through the GitHub web interface),
the "Manage PRs" workflow is hereby changed to provide instructions for doing so.
As before, a review will be requested from the maintainer of this repository so that they can monitor the situation and
provide the PR author with assistance if needed.
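The handling of a failed merge request can be sketched as a mapping from the API response status to the guidance given to the PR author. The category names are hypothetical, and the mapping is a heuristic: neither a 403 nor a 405 status is guaranteed to have a single cause.

```python
def classify_merge_failure(status):
    """Map the merge request's HTTP status to a likely cause (heuristic)."""
    if status == 403:
        # Likely downgraded token permissions: the branch is behind the parent.
        return "branch-out-of-date"
    if status == 405:
        # Believed to be the response for merge conflicts, among other causes.
        return "merge-conflict"
    return "unexpected"


for status in (403, 405, 500):
    print(status, classify_merge_failure(status))
```

In every case a maintainer review is still requested, so a misclassification only affects the wording of the bot's comment.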
The system is designed to allow a submission to be accomplished in a single pull request. This is the case even when
initial passes of checks reveal problems that block acceptance. The checks will automatically re-run any time the PR
author pushes to the PR's branch or mentions the bot.
Although the submitters are welcome to submit a new PR if that is their preference, it is a less efficient approach, both
for them and the maintainer. So it's important to clearly communicate that the submission process can be continued via
the current PR if that is convenient to them.
Usage patterns indicate that this is not clearly communicated via the current messaging from the bot, so perhaps an
additional note with some styling to give it emphasis will improve on the user experience.
A new release of the `arduino/library-manager-submission-parser` tool used by the "Manage PRs" workflow has been made.
This release fixes a bug that caused pull requests that consisted only of newlines to be incorrectly classified as
"modification", resulting in an unexpected failure of the workflow run due to there being no library URLs to populate the
`check-submissions` job matrix:
```
Error when evaluating 'strategy' for job 'check-submissions'. (Line: 219, Col: 21): Unexpected value ''
```
These pull requests will now be assigned the appropriate "other" request type and the workflow run will pass as expected,
requesting the necessary manual review from a maintainer.
Some minor adjustments to the comments in the configuration files for the tools used by the repository's CI system to
bring them into sync with the upstream "template" assets.
The use of the `error` workflow command will cause the important error message output to be surfaced prominently in the
workflow run summary and log. The workflow run logs can be somewhat labyrinthine to those who don't work with them
regularly, so finding the previous output to determine what caused the failure might have been challenging.
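For reference, an `error` workflow command is just a specially formatted line written to standard output. The sketch below formats such a line, including the escaping used for multi-line messages; the helper name and the sample message are made up for illustration.

```python
def error_command(message):
    """Format a message as a GitHub Actions `error` workflow command."""
    escaped = (
        message.replace("%", "%25")  # must be escaped first
        .replace("\r", "%0D")
        .replace("\n", "%0A")
    )
    return f"::error::{escaped}"


# Hypothetical message, for illustration only.
print(error_command("License check failed\nSee the output above for details"))
```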
Even if it works as intended, it is not clear what the effect is of the escaped quote at the end of the environment
variables in the shell commands used to check the license detection results. Wrapping the variable names in braces
ensures they are as expected and also makes the working of the code clear.
There are two file extensions in common use for YAML files: `.yaml` and `.yml`. Although this project uses `.yml`
exclusively for YAML files, this is a standardized workflow which might be applied to projects that have established the
use of the other extension. It will be most flexible if it supports both.
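The intent can be expressed as a check that accepts either extension. This is only a sketch of the matching rule; in the workflow itself the equivalent is done with path filter patterns, and the file paths below are hypothetical.

```python
from pathlib import PurePosixPath

YAML_EXTENSIONS = {".yaml", ".yml"}


def is_yaml_file(path):
    """Return True if the path has one of the two common YAML file extensions."""
    return PurePosixPath(path).suffix in YAML_EXTENSIONS


# Hypothetical paths, for illustration only.
for path in (".github/workflows/check.yml", "config/settings.yaml", "README.md"):
    print(path, is_yaml_file(path))
```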
The `workflow_dispatch` event allows triggering the workflow via the GitHub web interface. This makes it easy to trigger
an immediate workflow run after some relevant external change.
The `repository_dispatch` event allows triggering workflows via the GitHub API. This might be useful for triggering an
immediate check in multiple relevant repositories after an external change, or some automated process. Although we don't
have any specific need for this event at the moment, the event has no impact on the workflow, so there is no reason
against having it. It is the sort of thing that can end up being useful if it is already consistently in place, but
not worth setting up on demand, since the effort to set it up is greater than the effort to trigger all the workflows
manually.
This will make it easier for the maintainers to sync fixes and improvements in either direction between the upstream
"template" workflow and its installation in this repository.
Due to the limitations imposed by using both `pull_request_target` and `issue_comment` events to trigger the
"Manage PRs" workflow, the PR diff used for the validation is procured via a GitHub API request.
It is necessary to check that the pull request state matches that of the diff, which is achieved via the `sha` parameter
of the GitHub API request used to merge. This cannot be determined from the `github` context provided by GitHub Actions
to the workflow for either of the trigger events, so the pull request metadata is requested from the GitHub API at the
same time as the diff.
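The sketch below shows how the head commit from the pull request metadata could be carried into the merge request. It only assembles the request description without sending anything; the function name, the owner and repository values, and the sample metadata are hypothetical.

```python
def build_merge_request(owner, repo, pr_metadata):
    """Describe a merge API request that only succeeds if the PR head is unchanged."""
    return {
        "method": "PUT",
        "path": f"/repos/{owner}/{repo}/pulls/{pr_metadata['number']}/merge",
        # The merge is rejected if the PR head no longer matches this commit,
        # i.e. if the PR changed after the validated diff was fetched.
        "body": {"sha": pr_metadata["head"]["sha"]},
    }


# Hypothetical metadata, shaped like the relevant part of the API's PR object.
metadata = {"number": 123, "head": {"sha": "0123456789abcdef0123456789abcdef01234567"}}
print(build_merge_request("example-owner", "example-repo", metadata))
```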
This situation requires different handling by the `merge-fail` job. Fortunately, the two failure causes result in
different values from the merge request workflow step's `status` output.
Previously, the "Manage PRs" workflow made a comment suggesting to resolve the merge conflict after any failure to merge
a pull request. This comment is worded in a way that makes it somewhat applicable to other causes, but still might cause
the submitter to waste time unnecessarily trying to figure out how to resolve a nonexistent merge conflict when the failure
had a different cause.
The 405 response is not specific to a failure due to merge conflict, but I believe that all failures due to merge
conflict will result in a 405. This means that the check is not perfect, but will make spurious mentions of merge
conflict resolution less likely at least.
A review is requested from a maintainer any time the merge fails, so they will be able to investigate and provide
assistance if necessary.
When possible, if problems are detected in a pull request, the bot will attempt to guide the PR author through the
process of making a valid submission, which should be handled in a completely automated fashion on our end.
It has become clear that we need to prevent the removal of the final newline from `repositories.txt`. The existing system
did not accommodate this requirement. Submissions are validated on a per-library basis, and the bot comments based on
identifying which library the problem applies to. But this newline removal is not necessarily related to any specific
item added to the list. So handling for general problems with a submission PR is needed, which is added here.
Because the PR author is more likely to require assistance with resolving this sort of problem, PR review from a
maintainer is requested.
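The check itself is simple, as this minimal sketch shows. The function name and sample content are assumptions; the actual detection is done by the tooling the workflow uses.

```python
def has_final_newline(content: bytes) -> bool:
    """Return True if the file content ends with a newline character."""
    # Note: empty content is reported as lacking a final newline.
    return content.endswith(b"\n")


# Hypothetical list content, for illustration only.
print(has_final_newline(b"https://github.com/example/FooLib\n"))
print(has_final_newline(b"https://github.com/example/FooLib"))
```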
The workflow result might indicate either that the PR author could require assistance from a maintainer or that something
is wrong with the system. In this case, the situation is brought to the attention of the maintainers by requesting a pull
request review from them.
Due to the need to avoid requesting review from a maintainer when they are the PR author (which is not allowed and thus
would result in a spurious workflow failure), the code for requesting this review is not as trivial as might be expected.
Previously, this code was duplicated at multiple places in the workflow, and would become more so as additional code is
added. The workflow is made cleaner by moving that duplicated code to a single dedicated job, which is facilitated by the
recent reworking of the workflow structure.
This is a pure refactoring and should have no effect on the workflow behavior.
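The core of the review request logic can be sketched as a filter over the maintainer list. The names are hypothetical, and the comparison is case-insensitive on the assumption that GitHub usernames are not case-sensitive.

```python
def reviewers_to_request(maintainers, pr_author):
    """Return the maintainers to request a review from, excluding the PR author."""
    return [name for name in maintainers if name.lower() != pr_author.lower()]


# Hypothetical usernames, for illustration only.
print(reviewers_to_request(["maintainer-a", "maintainer-b"], "Maintainer-A"))
```

An empty result means there is nobody to request a review from, and the request step should be skipped rather than fail.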
Whenever the bot needs to communicate to the user about a blocking issue with their pull request that they are able to
resolve, a standardized prefix is added to the situation-specific error message ("❌ **ERROR:**") to draw their
attention to this information. This standardized text occurred multiple times in the workflow, which might lead to it
becoming inconsistent over time, or just more work to improve the text. Use of an environment variable ensures that all
uses of the prefix will be consistent and allows it to be edited once in a single place.
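The same idea, expressed as a sketch outside the workflow: the prefix is defined once and every message is built from it. The helper name and the sample message are made up for illustration.

```python
# Defined once, so the wording can be changed in a single place.
ERROR_PREFIX = "❌ **ERROR:**"


def error_comment(message):
    """Prepend the standardized prefix to a situation-specific error message."""
    return f"{ERROR_PREFIX} {message}"


print(error_comment("The final newline was removed from the list."))
```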
A new release of `arduino/library-registry-submission-parser` is out with some minor improvements to the error messages
the bot comments to a pull request when problems with a submission are found.
In the event the checks on a submission PR fail, the bot comments with instructions for how the checks can be triggered
to run again once the user has resolved the issue. One option is to comment on the PR thread, mentioning ArduinoBot.
GitHub automatically linkifies mentions to the user's profile page, which occurs in these instructions. The ArduinoBot
profile page is not of relevance or interest in this context, so there is no benefit to providing a one click path for
its access. In addition, the link makes the text more difficult to copy. So it's better to prevent its linkification,
which is achieved by wrapping the text in backticks.
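A small sketch of the formatting follows. It assumes the mention is written with the usual `@` prefix; the helper name and the instruction text are hypothetical.

```python
def unlinked_mention(username):
    """Format a mention as inline code so that GitHub does not turn it into a link."""
    return f"`@{username}`"


instructions = f"Comment here, mentioning {unlinked_mention('ArduinoBot')} in the comment."
print(instructions)
```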
In the event the cause of a submission check failure is resolved externally, the user can trigger the "Manage PRs"
workflow run by mentioning ArduinoBot in a reply to the PR thread. Unlike a workflow run triggered by a push to the pull
request, GitHub Actions does not provide any visible indication on the PR page of the workflow run in progress. Instead,
we have configured the workflow so that the bot immediately comments on the PR thread. This way, the user is not left
wondering whether their comment had any effect while the longer process of the submission checks runs to completion and
the feedback about the result is prepared.
With the idea that some users might like to get a progress indicator in the time between the initial comment and the
final feedback from the workflow run, I added a link to the workflow runs page. However, we received feedback from
testers that encountering the fairly cryptic workflow run logs causes confusion. So we are trying to avoid leading users
toward those logs. Since the link does just that and is not necessary, it's best to simply remove it.
The "Manage PRs" GitHub Actions workflow processes pull requests submitted to the repository. It is intended to allow
completely automated submissions of libraries. The feedback mechanism used by the system is comments on the pull request
thread. These comments should provide all the necessary feedback about the process, including whatever is needed to
bring a submission into compliance in the event the automated checks find it to not meet the requirements.
From feedback provided by testers, we learned that they navigated to the workflow run logs provided by GitHub Actions
while trying to learn what the problem was with their submission.
The workflow run logs provide output from the internal workings of the system that is only of interest for developers
troubleshooting a malfunction of the system itself. We never intended to use them as a channel for communicating to the
regular users of the system. Users of the system may find them quite cryptic. Since this is an interface generated by
GitHub Actions, without much capability for customization, it would be quite difficult for us to improve their
readability for the normal user. Those efforts would also require an increase in the complexity of the workflow and make
it more difficult to maintain.
I think there are two possible paths a normal user would be likely to follow to the workflow logs while trying to
understand why their submission was not accepted:
- The check status UI shown at the bottom of the PR comment thread ("Some checks were not successful")
- Workflow run failure email notification ("Manage PRs: Some jobs were not successful: View workflow run")
These pathways are either less enticing or absent when a workflow run is successful.
Previously, there were two possible causes for a run of the "Manage PRs" workflow to fail:
- Submitted library did not meet the Library Manager requirements
- An unexpected error from one of the workflow steps
Since we already are using comments from the bot to communicate about the former, the workflow run failure status
indicators provided by GitHub Actions are superfluous. The latter only occurs under extraordinary circumstances, so
its effect on the user experience is not of concern.
So the way to improve the user experience is to configure the workflow to only fail on unexpected errors, only commenting
and blocking merge in the event of expected errors.
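The intended behavior can be summarized in a short sketch. The function and key names are hypothetical; the point is that a non-compliant submission blocks the merge without failing the run.

```python
def run_outcome(problems_found: bool, unexpected_error: bool) -> dict:
    """Model the workflow run conclusion and whether the PR may be merged."""
    if unexpected_error:
        # Only a malfunction of the system itself fails the run.
        return {"conclusion": "failure", "merge": False}
    if problems_found:
        # Expected errors are reported by bot comment; the run still succeeds.
        return {"conclusion": "success", "merge": False}
    return {"conclusion": "success", "merge": True}


print(run_outcome(problems_found=True, unexpected_error=False))
```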
Since we already have the Dependabot infrastructure in place for managing dependencies of the project's Go code and
GitHub Actions workflows, it makes sense to do the same for the newly introduced Go and Python dependencies as well.
This configuration is applied to the `production` branch (using the `target-branch` key), but it must be added to the
configuration file in the default branch (`main`) because Dependabot only pays attention to the default branch's
configuration file.
Reference:
https://docs.github.com/en/code-security/supply-chain-security/configuration-options-for-dependency-updates#about-the-dependabotyml-file
In the event of a problem with a submission, the bot comments on the pull request thread. Due to the use of a matrix job to
support submissions of any number of libraries in a single PR, this might consist of multiple comments. Adding a standard
prominent prefix (❌ **ERROR:**) to all error messages will ensure that the most important part of this information is
not missed.
Dependabot will periodically check the versions of all actions used in the GitHub Actions workflows of the `production`
branch. If any are found to be outdated, it will submit a pull request to update them.
NOTE: Dependabot's PRs will occasionally propose to pin to the patch version of the action (e.g., updating
`uses: foo/bar@v1` to `uses: foo/bar@v2.3.4`). When the action author has provided a major version ref, use that instead
(e.g., `uses: foo/bar@v2`). Dependabot will automatically close its PR once the workflow has been updated.
More information:
https://docs.github.com/en/github/administering-a-repository/keeping-your-actions-up-to-date-with-dependabot
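The adjustment described in the note can be sketched as a string rewrite from a fully pinned version to its major version ref. This assumes the action author actually provides such a ref, which must be checked manually; the function name is made up.

```python
import re


def to_major_ref(uses):
    """Rewrite e.g. 'foo/bar@v2.3.4' to 'foo/bar@v2'; leave other refs unchanged."""
    match = re.fullmatch(r"(?P<action>[^@]+)@v(?P<major>\d+)(?:\.\d+){0,2}", uses)
    if match is None:
        # Not a version tag (e.g. a branch name or commit hash).
        return uses
    return f"{match['action']}@v{match['major']}"


print(to_major_ref("foo/bar@v2.3.4"))
```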