Fixes the error seen with the signature_v2/v4 patch #3041 when using STSS3Storage. The STSS3Storage class uses the connect_kwargs dictionary to initialize the S3Storage class, whereas all other drivers use that dict for the connection parameters; this is misleading, and I did not catch it when submitting the patch for signature v2/v4.
---------
Co-authored-by: Michaela Lang <milang@redhat.com>
* updating Storage drivers to support configurable signature version
* we do not have any signature checks, so remove them to avoid compile errors
* removed signature as we do not check anything
---------
Co-authored-by: Michaela Lang <milang@redhat.com>
* storage: use managed copy for single chunk uploads (PROJQUAY-7328)
We do a multipart copy from the staging location to the
final blob location in 5 GB chunks, sequentially. For large
layers this is extremely slow. Use the managed `copy` to
move the blob to the final location faster.
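For reference, a minimal sketch of what a managed copy looks like with boto3 (bucket and key names are illustrative, not the driver's actual paths):
~~~
import boto3
from boto3.s3.transfer import TransferConfig

s3 = boto3.client("s3")

# The managed copy handles multipart copying and concurrency internally,
# instead of us issuing sequential 5 GB part copies ourselves.
s3.copy(
    {"Bucket": "my-bucket", "Key": "uploads/staging-blob"},  # illustrative source
    "my-bucket",
    "blobs/sha256/aa/aaaa",  # illustrative final blob location
    Config=TransferConfig(multipart_chunksize=64 * 1024 * 1024, max_concurrency=10),
)
~~~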
This adds an optimization to the CloudFlare provider: if a request comes from the primary region, we return the S3 URL instead of redirecting to the CDN, to save egress cost.
* storage: Increase GCP timeout (PROJQUAY-6819)
Currently, the boto timeout for GCP is set to 60 seconds, which causes problems when pushing big layers. This increases the boto timeout to 10 minutes to be more aligned with our other S3 engines. Result:
~~~
root@cyberdyne:~# time { docker push quay.skynet/ibazulic/gcp-test; }
Using default tag: latest
The push refers to repository [quay.skynet/ibazulic/gcp-test]
4335316598de: Pushed
d101c9453715: Pushed
latest: digest: sha256:c6ffbd16c2ef43496ff13c130e31be84ceccdb5408e4f0d3b0f06ae94d378ff9 size: 744
real 7m9.881s
user 0m0.204s
sys 0m0.077s
root@cyberdyne:~#
~~~
* Fix isort sorting
* Made `boto_timeout` configurable, defaults to 60
* Made `boto_timeout` configurable, fix isort issues
* Remove reference to `self.boto_timeout`
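Roughly, the configurable timeout translates into a botocore client `Config`; a sketch, with `boto_timeout` standing in for the driver parameter described above:
~~~
import boto3
from botocore.client import Config

# boto_timeout is the driver setting described above; 600 matches the
# 10-minute timeout, while 60 remains the default.
boto_timeout = 600
client = boto3.client(
    "s3",
    config=Config(connect_timeout=boto_timeout, read_timeout=boto_timeout),
)
~~~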
* cdn: add namespace and username to CDN redirect for usage calculation (PROJQUAY-5939)
We use the Referer header to infer the namespace, but that is not
always reliable, as some clients don't add that header when sending
the request to the CDN.
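The idea, sketched with hypothetical parameter names (`namespace`, `username`) appended to the redirect URL so usage can be attributed without relying on the Referer header:
~~~
from urllib.parse import urlencode

def add_usage_params(cdn_url, namespace, username):
    # Hypothetical helper: carry the namespace/username in the redirect URL
    # itself instead of inferring them from the Referer header on the CDN side.
    sep = "&" if "?" in cdn_url else "?"
    return cdn_url + sep + urlencode({"namespace": namespace, "username": username})
~~~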
* storage: Fix big layer uploads for Ceph/RADOS driver (PROJQUAY-6586)
Currently, uploads of large images usually fail on Ceph/RADOS-compatible implementations (including Noobaa) because during the final assembly, the copy is done all at once. For large layers this takes a long while and boto times out. With this patch, we limit the chunk size to 32 MB, so the final copy is done in parts of up to 32 MB each. The size can be overridden by specifying the parameter `maximum_chunk_size_mb` in the driver settings. For backwards compatibility, an additional parameter was added: if `server_side_assembly: true`, we force server-side assembly and push the final blob in chunks; if `server_side_assembly: false`, we fall back to the default client-side assembly (we increase the boto timeout in this case to still support large layer uploads):
~~~
DISTRIBUTED_STORAGE_CONFIG:
  default:
    - RadosGWStorage
    - ...
      maximum_chunk_size_mb: 100
      server_side_assembly: true
~~~
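A sketch of the server-side assembly in parts described above, assuming a boto3 client and illustrative bucket/key names; each part copies at most the configured chunk size from the source blob:
~~~
import boto3

s3 = boto3.client("s3")
CHUNK = 32 * 1024 * 1024  # default maximum_chunk_size_mb of 32 MB

def assemble_in_parts(bucket, src_key, dst_key, total_size):
    # Copy the source object into the destination in byte ranges so that no
    # single copy exceeds the chunk size and boto does not time out.
    mpu = s3.create_multipart_upload(Bucket=bucket, Key=dst_key)
    parts = []
    for number, start in enumerate(range(0, total_size, CHUNK), start=1):
        end = min(start + CHUNK, total_size) - 1
        resp = s3.upload_part_copy(
            Bucket=bucket,
            Key=dst_key,
            PartNumber=number,
            UploadId=mpu["UploadId"],
            CopySource={"Bucket": bucket, "Key": src_key},
            CopySourceRange=f"bytes={start}-{end}",
        )
        parts.append({"ETag": resp["CopyPartResult"]["ETag"], "PartNumber": number})
    s3.complete_multipart_upload(
        Bucket=bucket,
        Key=dst_key,
        UploadId=mpu["UploadId"],
        MultipartUpload={"Parts": parts},
    )
~~~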
* Fix formatting
* Added backward compatibility switch and increased boto timeout
* Changed name of variable in config
* Small fixes to if statements
* storage: make cloudfront_distribution_org_overrides optional (PROJQUAY-5788)
This was causing issues with the config editor when configuring
the CloudFront provider because of the required override param.
When completing a chunked upload, if the chunk list is empty, do not attempt to assemble anything.
Using oras to copy an artifact from an outside registry to Quay results in a 5XX error. This is because at some point the upload chunk list is empty, and attempting to complete the chunked upload causes an exception. Not trying to write to storage when there are no chunks allows the copy operation to complete successfully.
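The fix boils down to a guard clause before assembly; a simplified sketch with a hypothetical `_assemble_chunks` helper:
~~~
def complete_chunked_upload(chunks, storage, final_path):
    # If nothing was ever written there is nothing to assemble; bail out
    # instead of handing an empty chunk list to storage and raising a 5XX.
    if not chunks:
        return
    _assemble_chunks(chunks, storage, final_path)  # hypothetical assembly helper
~~~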
* storage: Add MultiCDN storage provider (PROJQUAY-5048)
This storage provider can route to different underlying sub-providers
based on a set of criteria. Currently supported filters are source_ip and
namespace.
Example Config:
~~~
- MultiCDNStorage
- providers:
    TargetName1:
      - ProviderName1
      - ProviderConfig1
    TargetName2:
      - ProviderName2
      - ProviderConfig2
  default_provider: TargetName1
  rules:
    - namespace: test
      continent: APAC
      target: TargetName2
~~~
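Conceptually, the provider matches the request against the configured rules and falls back to `default_provider`; a simplified sketch (names are illustrative, not the driver's actual API):
~~~
def pick_target(rules, default_provider, namespace, continent):
    # Return the target of the first rule whose filters all match the request;
    # otherwise fall back to the default provider.
    for rule in rules:
        matches_namespace = rule.get("namespace", namespace) == namespace
        matches_continent = rule.get("continent", continent) == continent
        if matches_namespace and matches_continent:
            return rule["target"]
    return default_provider
~~~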
* storage: Add Cloudflare as a CDN provider for an S3-backed storage (PROJQUAY-3699)
This adds CloudFlare as a CDN provider for Quay for any storage backed
by S3. This requires a worker script that needs to be set up separately
on CloudFlare. More details on the worker at
https://github.com/quay/quay-cloudflare-cdn-worker
* chore: Add server side assembly of chunked metadata for RADOSGW driver (PROJQUAY-0000)
RadosGW did not support multipart copying from keys, so we needed to do a local join and re-upload of the whole blob. This creates issues for blobs that are fairly big.
Since the issue was fixed in 2015 on the Rados side, we no longer need this piece of legacy code.
See [here](https://github.com/ceph/ceph/pull/5139) for more information.
* Fixed linting with black
This optimization returns the direct S3 URL for CloudFront storage only
for requests from the same region, ensuring we don't get charged for
cross-region traffic to S3.
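In rough terms, the check looks like this (illustrative names for the request's region and the storage region):
~~~
def download_url(request_region, storage_region, s3_url, cdn_url):
    # Requests originating in the storage's own region get the direct S3 URL;
    # everything else is redirected to the CDN to avoid cross-region S3 egress.
    if request_region == storage_region:
        return s3_url
    return cdn_url
~~~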
* Update peewee types
Also remove tools/sharedimagestorage.py as it doesn't work anymore.
tools/sharedimagestorage.py:3: error: "ModelSelect[ImageStorage]" has no attribute "annotate"
* Remove endpoints/api/test/test_security.py from exclude list
* Format storage/test/test_azure.py
Currently the CI breaks due to a dependency of black, `click`, breaking with its latest release with `ImportError: cannot import name '_unicodefun' from 'click'`. Since black does not pin its version of click, it pulls in the latest version containing the breaking change and fails the CI check. This updates black with the patch. [See the original issue here.](https://github.com/psf/black/issues/2964) The rest of the changes are format updates introduced with the latest version of black.
Introduces the ability to pull images from external registries
through Quay, storing them locally for faster subsequent pulls.
Closes PROJQUAY-3030 and PROJQUAY-3033
Boto3 behaves unexpectedly when the resource client is not set to use
the correct region. Boto3 can't seem to correctly set the
X-Amz-Credential header when generating presigned URLs if the region
name is not explicitly set, and will always fall back to us-east-1.
To reproduce this:
- Create a bucket in a different region from us-east-1 (e.g.
eu-north-1)
- Create a boto3 client/resource without specifying the region
- Generate a presigned url
This seems to be a DNS issue with AWS that only happens shortly after
a bucket has been created, and resolves itself eventually.
Ref:
- https://github.com/boto/boto3/issues/2989
- https://stackoverflow.com/questions/56517156/s3-presigned-url-works-90-minutes-after-bucket-creation
To work around this, one can specify the bucket endpoint, either
explicitly via endpoint_url, or by setting s3_region, which will be
used to generate the bucket's virtual address.
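A sketch of the workaround with illustrative bucket/region values; setting `region_name` (or `endpoint_url`) lets boto3 sign the presigned URL with the correct X-Amz-Credential scope:
~~~
import boto3

# Without region_name, boto3 may sign against us-east-1 and the
# X-Amz-Credential scope will not match the bucket's actual region.
s3 = boto3.client("s3", region_name="eu-north-1")
url = s3.generate_presigned_url(
    "get_object",
    Params={"Bucket": "my-bucket", "Key": "some/blob"},
    ExpiresIn=3600,
)
~~~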
Currently, blobs left over in the uploads directory from cancelled uploads do not get cleaned up, since they are no longer tracked. This change cleans up the uploads storage directory directly.
* Add dev dependencies mypy and typing
* Add makefile target `types-test`, not yet included in `test` target.
* Generate stubs for imported modules to avoid mypy complaining about missing types.
* Remove generated stubs as there are way too many and they cause tons of mess in the repo. Switched to ignoring untyped modules for now, to concentrate on Quay-only type checking.
* mypy config changed to ignore missing imports
* ignore property decorator as it is not supported by mypy
* mypy annotations for many configuration variables
* re-generate mypy_stubs directory as it is necessary in some classes for base classes to prevent mypy errors
* util/registry/queuefile referred to a non-existent definition of the Empty class in multiprocessing.queues
* ignore type checking for things like monkey patching and exported/re-imported objects that
mypy does not allow.
* Adjust mypy config to warn us about unreachable return paths and useless expressions.
* Add the __annotations__ property to INTERNAL_ONLY_PROPERTIES so that it is not part of the config schema testing
* Remove redundant dependencies `typing` and `typing-extensions` which are NOOP after Python 3.5
* Remove mypy-extensions which only provides a TypedDict implementation but has not been updated since 2019.
* updated mypy to 0.910 which requires all types packages to be installed manually.
* exclude local-dev from type checking until core team can suggest an outcome for __init__.py duplicate packages
* re-add typing dependency which will be needed until Python 3.9
* ignore .mypy_cache
* add mypy stub for features module to replace inline definitions
* import `annotations` from `__future__` in billing.py to postpone annotation evaluation, as it was required to reference a class declared later in the module.
* remove the type definition of V1ProtocolSteps/V2ProtocolSteps to make tox happy
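Two of the points above in miniature (illustrative code, not from the repo): postponed annotation evaluation for forward references, and a per-line ignore for patterns such as monkey patching that mypy cannot follow:
~~~
from __future__ import annotations  # postpone evaluation so Plan can be referenced early

import json


class Subscription:
    # Without the __future__ import, "Plan" would need to be quoted or defined first.
    def plan(self) -> Plan:
        return Plan()


class Plan:
    pass


# Per-line ignores cover patterns mypy cannot follow, such as monkey patching.
json.custom_flag = True  # type: ignore[attr-defined]
~~~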
The way it was formatted (with the top-level parentheses), the extra
comma was causing the connection parameters to be passed as a 1-tuple
instead of a string.
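The class of bug in isolation (the URI value is illustrative):
~~~
# With a trailing comma inside top-level parentheses the value becomes a
# 1-tuple instead of a string; without it, it stays a plain string.
broken = (
    "postgresql://user:pass@localhost/quay",
)
fixed = (
    "postgresql://user:pass@localhost/quay"
)

assert isinstance(broken, tuple)  # what the connection code received
assert isinstance(fixed, str)     # what it expected
~~~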
Explicitly call abort on a multipart upload (MPU) if no bytes were written.
Noobaa and Rados will not clean up the artifacts, resulting in empty
files being stuck in an "uploading" state.
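The shape of the fix, assuming a boto3 client; bucket, key, and helper names are illustrative:
~~~
import boto3

s3 = boto3.client("s3")

def finish_upload(bucket, key, upload_id, bytes_written, parts):
    # Noobaa and RadosGW leave zero-byte multipart artifacts behind unless the
    # upload is explicitly aborted when nothing was written.
    if bytes_written == 0:
        s3.abort_multipart_upload(Bucket=bucket, Key=key, UploadId=upload_id)
        return
    s3.complete_multipart_upload(
        Bucket=bucket,
        Key=key,
        UploadId=upload_id,
        MultipartUpload={"Parts": parts},
    )
~~~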
Migrate from using boto2 to boto3. Changes include:
- Removes explicit bucket addressing style: Boto3 will try virtual-hosted-style addressing first, then fall back to path-style addressing (https://github.com/boto/boto3/blob/develop/docs/source/guide/configuration.rst)
- GCS workarounds to use boto3:
- Handles CORS config
- Update signed URL access key parameter name
- Uses ListBucket V1 API
- On client-side chunk joins, copy using the non-multipart API: use copy_from instead of copy when joining chunks client-side, because copy assumes a multipart upload should be used, which GCS and Rados are not compatible with (S3's version of multipart upload; they have their own parallel upload APIs)
- Update RDS healthcheck to use boto3
- Add Werkzeug's LimitedStream + any binary stream (IOBase) to Swift's type assertion
- Allow LimitingStream from util.registry.filelike to seek backward, since this is required by the Swift client in order to retry operations, if it is configured to do so
- Update use of _pyio to io (io is implemented in C instead of pure Python) in Swift's implementation
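To illustrate the client-side chunk-join bullet (bucket and key names are illustrative): `copy_from` issues a single CopyObject request, while the managed `copy` may switch to S3 multipart copy, which GCS and RadosGW do not support:
~~~
import boto3

s3 = boto3.resource("s3")

# Single-request server-side copy (CopyObject); compatible with GCS and RadosGW.
s3.Object("my-bucket", "blobs/final").copy_from(
    CopySource={"Bucket": "my-bucket", "Key": "uploads/joined-chunks"}
)

# Managed copy; may fall back to S3-style multipart copy for large objects,
# which GCS and RadosGW's S3 compatibility layers do not implement.
s3.Object("my-bucket", "blobs/final").copy(
    {"Bucket": "my-bucket", "Key": "uploads/joined-chunks"}
)
~~~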