In the previous kubernetes executor, the build job persisted in DEBUG mode because the virtual machine in the pod never exited. This kept the job alive for users to view the debug information. The `kubernetesPodman` executor does not run the VM, so the job is cleaned up because `ttlSecondsAfterFinished` is set on it. This change prevents the `ttlSecondsAfterFinished` field from being set when DEBUG is true, allowing the pod to stay alive so the logs can be retrieved.
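A minimal sketch of the idea, assuming the job manifest is assembled as a plain dict. The helper name `build_job_spec`, its parameters, and the default TTL value are hypothetical and not taken from the executor code:

```python
def build_job_spec(name, image, debug=False, ttl_seconds=120):
    """Return a Kubernetes Job manifest as a plain dict (hypothetical helper)."""
    spec = {
        "template": {
            "spec": {
                "restartPolicy": "Never",
                "containers": [{"name": "builder", "image": image}],
            }
        },
    }
    # Only let Kubernetes garbage-collect the finished job outside DEBUG mode,
    # so that the pod and its logs remain available while debugging.
    if not debug:
        spec["ttlSecondsAfterFinished"] = ttl_seconds  # assumed default value
    return {
        "apiVersion": "batch/v1",
        "kind": "Job",
        "metadata": {"name": name},
        "spec": spec,
    }
```

With `debug=True` the returned manifest has no `ttlSecondsAfterFinished` key, so the finished job is left in place until it is removed by other means.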
Sets the backoffLimit to 1 for kubernetes and kubernetesPodman builds. This prevents subsequent attempts from failing due to the token expiring. Having the job recreate pods is unnecessary, as the build manager already has retry logic.
Makes JOB_REGISTRATION_TIMEOUT take effect when generating the build registration token. Also adds the DEBUG option to the kubernetesPodman executor.
Changes made to allow use of a single quay-builder image for kubernetes and kubernetesPodman builds.
Implements the following changes:
- Added EXECUTOR env var to kubernetesPodman job configuration
- Updated the builder ignition config to overwrite the registry.conf file to set short name mode to permissive
- Always run the quay-builder in the VM as root
If not set, TimeoutStartSec for the Docker service is set to
600. Since it's a service of type oneshot, this should either not be
set, or be at least the length of the machine's lifetime.
* buildman: Add proxy variables to builds if they exist (PROJQUAY-2120)
Adds the ability to define proxy variables for builders. The proxy variables are parsed as env. variables and defined in Quay's config.yaml file.
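As a rough illustration of how such variables might be collected, the sketch below assumes the executor configuration is a dict and that the proxy settings use the conventional `HTTP_PROXY`, `HTTPS_PROXY`, and `NO_PROXY` names. Both the key names and the helper `proxy_env` are assumptions, not the actual buildman code:

```python
# Assumed key names; the real config.yaml keys may differ.
PROXY_KEYS = ("HTTP_PROXY", "HTTPS_PROXY", "NO_PROXY")


def proxy_env(executor_config):
    """Return only the proxy settings that are present and non-empty."""
    return {
        key: str(executor_config[key])
        for key in PROXY_KEYS
        if executor_config.get(key)
    }
```

The resulting dict can then be merged into the environment passed to the builder, so that builds without proxy settings are unaffected.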
When set to true, DEBUG will prevent the build nodes from shutting
down after the quay-builder service is done or fails, and will prevent the
build manager from cleaning up the instances (terminating EC2
instances or deleting k8s jobs).
This will allow debugging builder node issues, and should not be set
in a production environment.
The lifetime service will still exist, i.e. the instance will still
shut down after ~2h (EC2 instances will terminate, k8s jobs will
complete).
Setting DEBUG will also affect ALLOWED_WORKER_COUNT, as the
unterminated instances/jobs will still count towards the total number
of running workers.
* Handle non 200 api response from executors
* Allows the CA cert to be specified in the config for server verification
Allow the CA cert used for server verification to be specified in the
config even if client certificate authentication is not used.
Handles non-200 responses from executors when trying to get worker count.
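One way to picture the non-200 handling is sketched below. The `fetch_jobs` callable, the `ExecutorAPIError` exception, and the `items` key in the response body are all hypothetical stand-ins for whatever the executor's API client actually returns:

```python
class ExecutorAPIError(Exception):
    """Raised when an executor API call does not return HTTP 200 (hypothetical)."""


def running_builders_count(fetch_jobs):
    """Count running builders from an executor API response.

    fetch_jobs is a hypothetical callable returning (status_code, json_body).
    """
    status_code, body = fetch_jobs()
    if status_code != 200:
        # Surface the failure instead of treating a bad response as zero workers.
        raise ExecutorAPIError("unexpected status %s from executor API" % status_code)
    return len(body.get("items", []))
```

Raising here, rather than returning 0, keeps a failing API from being mistaken for an idle executor when scheduling decisions depend on the count.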
* Use safe_load when loading the config yaml
* Setup nginx ssl termination for grpc endpoints
* Bootstrap Quay's ca cert in the build executor nodes
* Update certificate mount point in ignition config
Mount the Fedora CoreOS/RHCOS based cert directory to /certs in the
builder container, where it will be installed by the container's
entrypoint.
Allow specifying the container runtime to the templated ignition file
Allow specifying the container runtime in the executor's ignition
file. This allows different runtimes, e.g. Docker or Podman, to run a build.
* Reenable builder in supervisord config
* Rewrites the buildmanager to use gRPC
Rewrite of the current buildmanager using gRPC.
This deprecates the enterprise type builder, as individual nodes will
no longer keep track of build states because of WAMP.
Also removes trollius, which was required by the WAMP servers.
Instead, gRPC uses a threaded model to serve its requests.
Deprecates etcd as state tracking for build states in favor of Redis
only.
Defines a state interface to manage/transition build states, implemented by the
buildmanager.
* Fix incorrect reference to aws connection
* Truncate the "Token" tag in ec2 to 36 char.
Normalize the token tag to 36 char in EC2.
Add an expiration to the original Redis key so that, in the event that
the expiry handler is not able to delete the key, the original is
removed eventually.
* Orchestrator: add context to KeyError
* EXPOSE 50051 in Dockerfiles
* Add buildman/README
Used by the manager to schedule builds based on the current running
count. Uses the specific executors' api to get the count of running
builders instead of Redis/Orchestrator.
This is due to issues encountered in the past where the manager would
have problems scheduling builds, and go into a weird state when
Redis was unavailable.
Remove wamp's REALM/websocket parameters from executor
Remove asyncio from executor
* Update the executor image from Container Linux to Fedora CoreOS
* Move the container cloud config script for templating from devtable to quay's repo
* Ignition config template
* Move dockersystemd from devtable repo
* Remove pinned dependency on devtable/container-cloud-config
* Removes squashed image and logentries
* Update builder image
* Update mounted cert directory for Fedora
* Removes old cloudconfig template
* Pass userdata as firmware config to qemu
* Use CentOS:8 as base image
* Convert all Python2 to Python3 syntax.
* Removes oauth2lib dependency
* Replace mockredis with fakeredis
* byte/str conversions
* Removes nonexisting __nonzero__ in Python3
* Python3 Dockerfile and related
* [PROJQUAY-98] Replace resumablehashlib with rehash
* PROJQUAY-123 - replace gpgme with python3-gpg
* [PROJQUAY-135] Fix unhashable class error
* Update external dependencies for Python 3
- Move github.com/app-registry/appr to github.com/quay/appr
- github.com/coderanger/supervisor-stdout
- github.com/DevTable/container-cloud-config
- Update to latest mockldap with changes applied from coreos/mockldap
- Update dependencies in requirements.txt and requirements-dev.txt
* Default FLOAT_REPR function to str in json encoder and removes keyword assignment
True, False, and str were not keywords in Python2...
* [PROJQUAY-165] Replace package `bencode` with `bencode.py`
- Bencode is not compatible with Python 3.x and is no longer
maintained. Bencode.py appears to be a drop-in replacement/fork
that is compatible with Python 3.
* Make sure monkey.patch is called before anything else (
* Removes the unidecode dependency and replaces it with text_unidecode
* Base64 encode/decode pickle dumps/loads when storing value in DB
Base64 encodes/decodes the serialized values when storing them in the
DB. Also make sure to return a Python3 string instead of a Bytes when
coercing for db, otherwise, Postgres' TEXT field will convert it into
a hex representation when storing the value.
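A minimal sketch of the round trip described above, using only the standard library. The function names `to_db_value` and `from_db_value` are hypothetical and do not correspond to the field implementation in the codebase:

```python
import base64
import pickle


def to_db_value(obj):
    """Pickle and Base64-encode obj, returning a str suitable for a TEXT column."""
    encoded = base64.b64encode(pickle.dumps(obj))
    # Decode to str: passing bytes could be stored as a hex representation.
    return encoded.decode("ascii")


def from_db_value(text):
    """Reverse to_db_value: Base64-decode and unpickle the stored string."""
    return pickle.loads(base64.b64decode(text))
```

Base64 output is pure ASCII, so decoding with `"ascii"` is safe and the stored value survives any text encoding the database applies.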
* Implement __hash__ on Digest class
In Python 3, if a class defines __eq__() but not __hash__(), its
instances will not be usable as items in hashable collections (e.g. sets).
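The behaviour can be illustrated with a simplified stand-in class. This `Digest` is a hypothetical sketch with assumed attribute names, not the class from the codebase:

```python
class Digest:
    """Simplified, hypothetical digest value object."""

    def __init__(self, algorithm, hex_value):
        self.algorithm = algorithm
        self.hex_value = hex_value

    def __eq__(self, other):
        if not isinstance(other, Digest):
            return NotImplemented
        return (self.algorithm, self.hex_value) == (other.algorithm, other.hex_value)

    def __hash__(self):
        # Defining __eq__ alone sets __hash__ to None in Python 3, so hash the
        # same fields that __eq__ compares to keep equal objects hashing equally.
        return hash((self.algorithm, self.hex_value))
```

With `__hash__` defined, two equal digests collapse to a single entry when added to a set, and instances can be used as dict keys.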
* Remove basestring check
* Fix expected message in credentials tests
* Fix usage of Cryptography.Fernet for Python3 (#219)
- Specifically, this addresses the issue where Byte<->String
conversions weren't being applied correctly.
* Fix utils
- tar+stream layer format utils
- filelike util
* Fix storage tests
* Fix endpoint tests
* Fix workers tests
* Fix docker's empty layer bytes
* Fix registry tests
* Appr
* Enable CI for Python 3.6
* Skip buildman tests
Skip buildman tests while it's being rewritten to allow ci to pass.
* Install swig for CI
* Update expected exception type in redis validation test
* Fix gpg signing calls
Fix gpg calls for updated gpg wrapper, and add signing tests.
* Convert / to // for Python3 integer division
* WIP: Update buildman to use asyncio instead of trollius.
This dependency is considered deprecated/abandoned and was only
used as an implementation/backport of asyncio on Python 2.x
This is a work in progress, and is included in the PR just to get the
rest of the tests passing. The builder is actually being rewritten.
* Target Python 3.8
* Removes unused files
- Removes unused files that were added accidentally while rebasing
- Small fixes/cleanup
- TODO tasks comments
* Add TODO to verify rehash backward compat with resumablehashlib
* Revert "[PROJQUAY-135] Fix unhashable class error" and implements __hash__ instead.
This reverts commit 735e38e3c1d072bf50ea864bc7e119a55d3a8976.
Instead, defines __hash__ for the encrypted fields class, using the parent
field's implementation.
* Remove some unused files and imports
Co-authored-by: Kenny Lee Sin Cheong <kenny.lee@redhat.com>
Co-authored-by: Tom McKay <thomasmckay@redhat.com>
This is an attempt to keep the codebase as idiomatic as possible.
An additional benefit of reverting to the default histogram buckets is
that the slowest route durations are more accurate.
This change replaces the metricqueue library with a native Prometheus
client implementation, with the intention of aggregating results with the
Prometheus PushGateway.
This change also adds instrumentation for greenlet context switches.