* billing: marketplace UI
adds UI in billing section for managing user and org-bound skus
add more unit tests for org binding
changed endpoint for bulk attaching skus to orgs
* update reconciliationworker to use webCustomerId instead of
ebsAccountNumber
* fix reconciler where it was incorrectly using the ebsAccountNumber to
create subscriptions
* add job to reconciler so that it reconciles different ids between the
database and the user api
* separate skus to be used by billing and skus to be used by reconciler
* chore: pass config to isort as it doesn't always detect it
* chore: mark package "test" as local, not stdlib
* chore: remove "isort: skip_file"
* chore: fix app in test_load_security_information
* chore: fix app in test_notification
* chore: fix app in test_index_report
* add migration for orgrhskus table
* add endpoints for managing and listing skus bound to an org
* create checks in billing flow to look for org-bound skus
* refactor RH marketplace api objects to be more usable in tests
* update cypress test db data and exclude it from pre-commit hook formatting
Garbage collect manifests no longer referenced in Quay from the
security scanner service.
Also moved quota related code from data/registry_model/ to data/model/
to avoid circular dependencies.
* Aggregate stripe plans and subscriptions.
* Validate end date for subscriptions when fetching from marketplace.
* Check subscription returned from stripe api is non-null value when
finding stripe plan for sku.
Allows superusers to trigger a calculation of the deduplicated registry size. A superuser can go to the organization panel of the superuser page and select Calculate to queue a calculation of the registry total. The total will only be calculated when requested. Includes a warning to the user about the increased database load while the calculation runs.
Allow the replication worker to retry immediately without having to
wait and exhaust the queueitem's retries. This addresses transient
unreliable storage service issues.
Counts only unique blobs at the namespace and repository level. The calculation includes manifest list sizes.
Adds the following internal configurations that default to true:
QUOTA_INVALIDATE_TOTALS: Invalidates calculated totals when FEATURE_QUOTA_MANAGEMENT is set to false
RESET_CHILD_MANIFEST_EXPIRATION: Resets the expiry for child manifests on push of the manifest list for immediate GC eligibility
PERMANENTLY_DELETE_TAGS: Enables features related to the permanent deletion of tags outside the configured time machine window
* repomirror: Use skopeo list-tags to get repo tags
`skopeo inspect foo` returns information about the image `foo:latest`,
and repository tags. Quay needs only the list of tags, so it should use
`skopeo list-tags`, which doesn't fail if `foo:latest` doesn't exist.
* Update type hints
* On local-dev Quay does not provide a valid TLS certificate
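A minimal sketch of how the tag list could be read from `skopeo list-tags`, assuming its JSON output carries a `Tags` array. The function names are hypothetical and are not taken from the Quay code base; the `verify_tls` switch is only an illustration of how a local-dev registry without a valid certificate might be handled.

```python
import json
import subprocess
from typing import List


def list_tags_command(repository: str, verify_tls: bool = True) -> List[str]:
    """Build the argv for `skopeo list-tags` against a docker:// repository."""
    args = ["skopeo", "list-tags"]
    if not verify_tls:
        # Assumption: used only where the registry has no valid TLS certificate.
        args.append("--tls-verify=false")
    args.append("docker://" + repository)
    return args


def parse_tags(stdout: str) -> List[str]:
    """Extract the tag names from the JSON document skopeo prints."""
    return list(json.loads(stdout).get("Tags") or [])


def list_tags(repository: str, verify_tls: bool = True) -> List[str]:
    """Run skopeo and return the repository's tags (requires skopeo on PATH)."""
    result = subprocess.run(
        list_tags_command(repository, verify_tls),
        check=True,
        capture_output=True,
        text=True,
    )
    return parse_tags(result.stdout)
```

Because no image reference is inspected, this path does not depend on a `latest` tag existing in the repository.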
Currently Quay creates tags for Docker V2 schema 1 manifests in manifest lists. This makes it appear a tag was mirrored successfully when it had actually failed. This change rolls back those failed tags when the sync fails.
Adds the `REPO_MIRROR_ROLLBACK` option to specify whether the mirror will roll back the state of the repo when any one of the tags fails to sync. Defaults to false. Adds an additional `PARTIAL_SYNC` error status, which logs the tags that failed to sync to the console.
The previous logic for claiming a mirror ("locking") relied on the value
returned from updating the database row. Since this was always being
updated with a new expiration time, it would always succeed, even when
another process had already claimed the same mirror.
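The commit message does not say how the claim was fixed. One common approach, sketched here with `sqlite3` and a hypothetical `mirror` table and `claim_expires` column, is to make the update conditional on the row not being claimed, so the affected-row count reflects whether this caller actually won.

```python
import sqlite3
import time
from typing import Optional


def claim_mirror(
    conn: sqlite3.Connection,
    mirror_id: int,
    ttl_seconds: int,
    now: Optional[float] = None,
) -> bool:
    """Try to claim a mirror row; return True only if this caller won the claim."""
    now = time.time() if now is None else now
    cursor = conn.execute(
        # The WHERE clause makes the claim exclusive: the row is updated only
        # when there is no unexpired claim, so rowcount is 0 for the loser.
        "UPDATE mirror SET claim_expires = ? "
        "WHERE id = ? AND (claim_expires IS NULL OR claim_expires < ?)",
        (now + ttl_seconds, mirror_id, now),
    )
    conn.commit()
    return cursor.rowcount == 1
```

An unconditional update that always writes a new expiration time would report one affected row for every caller, which is the failure mode described above.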
Having only one worker index recent manifests doesn't keep up with the
rate at which new manifests are pushed, given the time it takes for an
index request to complete. Adding the option to bypass the global lock
allows for more workers, but also increases the chance of duplicate work.
Index recent manifests in a separate background process, allowing the
main process to correctly select random slabs from the entire table
set and mark them completed in the allocator (rbtree). This avoids
the worker having to start iterating from the beginning of the table
whenever it is restarted.
During a rollback the mirror worker checks for new tags that were created in the repository in the time that the mirror operation has been running. If it encounters older tags that have been updated it will attempt to create a new tag that will point to the previous manifest. Currently for large lists of tags this will fail since we only retrieve the 100 latest tags. The mirror worker will never reach the tags that have been updated and will never recreate them, leading to the behavior of deleting tags during a rollback.
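A sketch of walking every page of tags instead of stopping at the first 100, under the assumption that the tag lookup can be called repeatedly with a continuation token. `fetch_page` is a hypothetical callable standing in for whatever the mirror worker uses to read tags; it is not a Quay API.

```python
from typing import Callable, Iterator, List, Optional, Tuple

# A page is (tags, next_token); next_token is None on the last page.
Page = Tuple[List[str], Optional[int]]


def iter_all_tags(
    fetch_page: Callable[[Optional[int], int], Page],
    page_size: int = 100,
) -> Iterator[str]:
    """Yield every tag by following page tokens until none is returned."""
    token: Optional[int] = None
    while True:
        tags, token = fetch_page(token, page_size)
        yield from tags
        if token is None:
            return
```

With this shape, a rollback that compares tags against their pre-sync state sees older tags as well, not only the most recent page.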
Add a global lock on security worker iterations, based on the value of
the current pagination token. This is to avoid multiple worker
processes possibly doing work on the same set of manifests.
Currently the CI breaks because `click`, a dependency of black, fails in its latest release with `ImportError: cannot import name '_unicodefun' from 'click'`. Since black does not pin its version of click, it pulls in the latest version containing the breaking change and fails the CI check. This updates black to a release that includes the patch. [See the original issue here.](https://github.com/psf/black/issues/2964) The rest of the changes are format updates introduced with the latest version of black.
Currently when attempting to mirror a registry containing unsigned images the mirror will fail due to not finding the source signature. This is caused by the updated version of Skopeo blocking unsigned images by default. This change lets users allow unsigned images to be pulled on a per-repository basis. The Skopeo version is also now pinned.
Currently blobs leftover in the uploads directory during cancelled uploads do not get cleaned up since they are no longer tracked. This change cleans up the uploads storage directory directly.
Adds ACCOUNT_RECOVERY_MODE to allow Quay to run with some core
features disabled. When this is set, the instance should only be used
by existing users who haven't linked their accounts to an
external login service, after database authentication has been
disabled.
Since NamespaceGCWorker does a superset of RepositoryGCWorker's
operations, make sure that quay_gc_repos_purged is incremented if
either worker deletes a repository.
GlobalLock had a dependency on app, which would cause a circular
dependency if imported from the main app. Work around this by requiring
the configuration to be passed to GlobalLock instead (this is done by a
classmethod, due to the use of Redlock's factory). This means that
"configure" must be called once per process before GlobalLock is used.
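A minimal sketch of the configure-once pattern described above. The class name, the `LOCK_REDIS` key, and the exception are hypothetical, and the Redlock factory itself is left out, so this shows only how class-level configuration removes the import of `app`.

```python
from typing import Any, Dict, Optional


class LockNotConfigured(Exception):
    """Raised when a lock is created before configure() has been called."""


class GlobalLockSketch:
    # Class-level state shared by every instance in this process.
    _config: Optional[Dict[str, Any]] = None

    @classmethod
    def configure(cls, config: Dict[str, Any]) -> None:
        """Store the configuration once per process instead of importing app."""
        cls._config = dict(config)

    def __init__(self, name: str, lock_ttl: int = 600) -> None:
        if self._config is None:
            raise LockNotConfigured("call GlobalLockSketch.configure(config) first")
        self.name = name
        self.lock_ttl = lock_ttl
        # Hypothetical key; a real implementation would read its Redis settings here.
        self.redis_info = self._config.get("LOCK_REDIS", {})
```

Each worker process would call `GlobalLockSketch.configure(config)` during startup; constructing a lock before that fails loudly rather than silently using missing settings.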
By default, Redlock creates a new client per instance. Using the
provided factory allows Redlock to reuse a single connection per
instance and avoid running out of connections. For example, when a
worker tries to get a lock, it should not open new connections every time.
Increase sleep duration between queue polls on
WorkerSleepException. This will give more time before retrying after
failing to acquire a lock.
Prevents the queueworker from setting the event to stop the poll_queue
job when a WorkerSleepException is raised. On WorkerSleepException,
the worker should instead skip this iteration (go to sleep), for example
when the NamespaceGCWorker can't acquire a lock because it is already
taken by some other worker.
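A simplified sketch of the polling behavior described in the two changes above: a `WorkerSleepException` skips the iteration and sleeps for a longer period instead of stopping the job. The loop is bounded and the `sleep` callable is injectable purely for illustration; the parameter names are assumptions, not the queueworker's actual signature.

```python
import time
from typing import Callable


class WorkerSleepException(Exception):
    """Signals that the worker should skip this iteration and sleep."""


def poll_queue(
    process_item: Callable[[], None],
    iterations: int,
    poll_period: float = 0.0,
    sleep_period: float = 0.0,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Run a bounded polling loop; return how many iterations were skipped."""
    skipped = 0
    for _ in range(iterations):
        try:
            process_item()
        except WorkerSleepException:
            # Do not stop the job: skip this iteration and back off for longer.
            skipped += 1
            sleep(sleep_period)
            continue
        sleep(poll_period)
    return skipped
```

Choosing `sleep_period` larger than `poll_period` gives the current lock holder more time to finish before the next attempt.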
Reverts the gcworkers job timeout from 24h to 3h. In case of a
deadlock between processes (for example, redeploying the app will not
clear the existing Redis keys), 24h is too long to wait for the locks to
expire so that the workers can resume work.
Add missing Counter increment on row deletion in the Manifest table.
Correctly converts the given ttl from seconds to milliseconds when
passed to Redis (redlock uses 'px', not 'ex'). Also increase the lock
timeout of gc workers to 1 day.
Some iterations, for repos with large numbers of tags (1000s), will
take more than 15 minutes to complete. This change will prevent multiple
workers GCing the same repo, and one possibly preempting
another. GlobalLock's ttl will make the lock available again when
expired, but will not actually stop execution of the current GC
iteration until the GlobalLock context is done. Having a 1 day timeout
should be enough.
NOTE: The correct solution would have GlobalLock either renew
the lock until the caller is done, or signal to the caller that it is
no longer valid.
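The seconds-to-milliseconds conversion mentioned above can be illustrated as follows. Redis `SET` takes `EX` in seconds and `PX` in milliseconds, so a TTL held in seconds has to be multiplied by 1000 before it is sent as `PX`. The helper names are hypothetical.

```python
from typing import List, Union


def ttl_to_px(ttl_seconds: float) -> int:
    """Convert a TTL in seconds to the integer milliseconds Redis expects for PX."""
    return int(ttl_seconds * 1000)


def set_lock_args(key: str, token: str, ttl_seconds: float) -> List[Union[str, int]]:
    """Arguments for a Redis `SET key token NX PX <ms>` lock acquisition."""
    return ["SET", key, token, "NX", "PX", ttl_to_px(ttl_seconds)]


# The 1 day GC lock timeout, expressed in seconds.
ONE_DAY_SECONDS = 24 * 60 * 60
```

Passing the seconds value straight through as `PX` would make a lock expire 1000 times sooner than intended.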
Migrate from using boto2 to boto3. Changes include:
- Removes explicit bucket addressing style: Boto3 will try virtual-hosted-style addressing first, then fall back to path-style addressing (https://github.com/boto/boto3/blob/develop/docs/source/guide/configuration.rst)
- GCS workarounds to use boto3:
  - Handles CORS config
  - Update signed url access key parameter name
  - Uses ListBucket V1 API
- On client-side chunks join, copy using non-multipart api: Use copy_from instead of copy when joining chunks client-side. This is because copy assumes multipart upload should be used, and GCS and Rados are not compatible with S3's version of multipart upload (they have their own parallel upload APIs)
- Update RDS healthcheck to use boto3
* Revert "Set default REPO_MIRROR_SERVER_HOSTNAME value to match SERVER_HOSTNAME (#667)"
This reverts commit 55e11c2bd6.
`REPO_MIRROR_SERVER_HOSTNAME` should match `SERVER_HOSTNAME` if its
value is None (the default), i.e. if it's not set explicitly.
Instead, change the config's jsonschema to allow
`REPO_MIRROR_SERVER_HOSTNAME` to be None.
* Allow null value for REPO_MIRROR_SERVER_HOSTNAME
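A sketch of the intended behavior, assuming the schema entry is expressed as a JSON Schema type union and that the fallback happens where the value is read. The constant and function names are hypothetical; only the two config keys come from the commit message.

```python
from typing import Any, Dict

# Schema fragment in JSON Schema form: the value may be a string or null.
REPO_MIRROR_SERVER_HOSTNAME_SCHEMA = {
    "type": ["string", "null"],
}


def mirror_server_hostname(config: Dict[str, Any]) -> str:
    """Return the mirror hostname, falling back to SERVER_HOSTNAME when unset."""
    value = config.get("REPO_MIRROR_SERVER_HOSTNAME")
    return config["SERVER_HOSTNAME"] if value is None else value
```

Keeping the default as None and resolving it at read time means a later change to `SERVER_HOSTNAME` is picked up without a second setting to update.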