1
0
mirror of https://github.com/mariadb-corporation/mariadb-columnstore-engine.git synced 2025-11-02 06:13:16 +03:00
Commit Graph

13 Commits

Author SHA1 Message Date
mariadb-AlanMologorsky
c86586c228 feat(cmapi,failover): MCOL-6006 Disable failover when shared storage not detected
- Add SharedStorageMonitor thread to periodically verify shared storage:
  * Writes a temp file to the shared location and validates MD5 from all nodes.
  * Skips nodes with unstable recent heartbeats; retries once; defers decision if any node is unreachable.
  * Updates a cluster-wide stateful flag (shared_storage_on) only on conclusive checks.
- New CMAPI endpoints:
  * PUT /cmapi/{ver}/cluster/check-shared-storage — orchestrates cross-node checks.
  * GET /cmapi/{ver}/node/check-shared-file — validates a given file’s MD5 on a node.
  * PUT /cmapi/{ver}/node/stateful-config — fast path to distribute stateful config updates.
- Introduce in-memory stateful config (AppStatefulConfig) with versioned flags (term/seq) and shared_storage_on flag:
  * Broadcast via helpers.broadcast_stateful_config and enhanced broadcast_new_config.
  * Config PUT is now validated with Pydantic models; supports stateful-only updates and set_mode requests.
- Failover behavior:
  * NodeMonitor keeps failover inactive when shared_storage_on is false or cluster size < 3.
  * Rebalancing DBRoots becomes a no-op when shared storage is OFF (safety guard).
- mcl status improvements: per-node 'state' (online/offline), better timeouts and error reporting.
- Routing/wiring: add dispatcher routes for new endpoints; add ClusterModeEnum.
- Tests: cover shared-storage monitor (unreachable nodes, HB-based skipping), node manipulation with shared storage ON/OFF, and server/config flows.
- Dependencies: add pydantic; minor cleanups and logging.
2025-10-01 21:10:34 +04:00
mariadb-AlanMologorsky
a76e153a1d feat(upgrade): MCOL-6028 upgrade MVP with repo-managed flow, prechecks, and other enhancements
Implements the initial upgrade capability across CMAPI and the CLI, including
repository setup, package operations, environment prechecks, and coordinated
cluster steps with progress reporting.

Details:
- CMAPI upgrade manager:
  - Add `cmapi/cmapi_server/managers/upgrade/` modules:
    - `repo.py`, `packages.py`, `preinstall.py`, `upgrade.py`, `utils.py` and `__init__.py`
  - Extend endpoints and routing to expose upgrade operations and status:
    - `cmapi_server/controllers/{endpoints.py, dispatcher.py, api_clients.py}`
    - `cmapi_server/managers/{application.py, process.py}`
    - Add improved constants and helpers for upgrade flow
- Backup/restore and safety:
  - Add `cmapi_server/managers/backup_restore.py`
  - Fix pre-upgrade backup regressions (due to `mcs_backup_manager.sh 3.17 changes`)
  - Improve cluster version validation; add `ignore_missmatch` override
- CLI enhancements:
  - Progress UI and richer feedback (`mcs_cluster_tool/tools_commands.py`, `README.md`, `mcs.1`)
  - Add steps to start MDB and start MCS during/after upgrade
  - Improved error surfacing for version validation
- Platform and packaging:
  - Ubuntu and Rocky Linux support
  - RHEL/DNF dry-run support
  - Distro detection and platform-dependent logic hardened
  - Logging improvements
- Updater service:
  - Add `cmapi/updater/cmapi_updater.service.template` and `cmapi_updater.sh` to make CMAPI update itself
- Docs:
  - Update mcs cli README and mcs.1 man file
  - Add `cmapi/updater/README.md`
2025-09-30 18:48:32 +04:00
Alexander Presnyakov
bc4c7e27d9 MCOL-5806: added ability to start node in read-only mode
* feat(cmapi): add read_only param for API add node endpoint
* style(cmapi): fixes for string length and quotes

Add dbroots of other nodes to the read-only node

On every node change adjust dbroots in the read-only nodes

Fix logging (trace level) in tests

Remove ExeMgr from constants

Fix tests

Manually remove read-only node from ReadOnlyNodes on node removal (because nodes are only deactivated)

Review fixes (mostly switching to StrEnum analog before py3.11, also changes in ruff config)

Read-only nodes are now called read replica consistently

Don't write hostname into IP fields of the config like PMSx/IPAddr, pmx_WriteEngineServer/IPAddr

We calculate ReadReplicas by finding PMs without WriteEngineServer

In _replace_localhost, replace local IP addrs with resolved IP addrs and local hostnames -- with the resolved hostnames.
ModuleHostName/ModuleIPAddr is kept intact.

Keep only IPv4 in ActiveNodes/DesiredNodes/InactiveNodes

feat: add mock DNS resolution builder for testing hostname/IP mappings

* Fix _add_node_to_PMS: if node is already in PMS, save it to existing items to not miss it during the reconstruction of the list
* Make tests independent from CWD

Fixed for _add_Module_entries

Fixed node removal and tests

Fixes for node manipulation tests
2025-09-26 21:55:10 +04:00
Alexander Presnyakov
4a1b3b9355 feat: add forward/reverse DNS validation when adding nodes by hostname 2025-09-17 17:58:51 +00:00
Alexander Presnyakov
08d46a9e70 Pass git revision from the build process -> cmapi - Sentry (to be able to understand which revision caused some error) 2025-09-03 20:32:03 +04:00
Alexander Presnyakov
58040871bd Fixed "raise None" that happened on signal interception
Also fixed couple of (possibly) unbound variables
2025-08-11 16:48:19 +00:00
mariadb-AlanMologorsky
b6a5c1d71f fix(cmapi): MCOL-5913 cmapi write local address 127.0.1.1 instead real ip address
* feat(cmapi): NetworkManager class for some ip hostname opoerations.
 * fix(cmapi): Use NetworkManager class to resolve ip and hostname in node_manipulation.add_node function
 * fix(cmapi): Minor docstring and formatting fixes
2025-07-08 15:57:28 +03:00
mariadb-AlanMologorsky
27e3b8d808 feat(cmapi): MCOL-5133: Stage3 stand alone cli tool.
MAJOR: Some logic inside node remove changed significantly using active
    nodes list from Columnstore.xml to broadcast config after remove.

 [fix] TransactionManager passsing  extra, remove and optional nodes arguments to start_transaction function
 [fix] commit and rollback methods of TransactionManager adding nodes argument
 [fix] TransactionManager using success_txn_nodes inside
 [fix] remove node logic to use Transaction manager
 [fix] cluster set api key call using totp on a top level cli call
 [add] missed docstrings
 [fix] cluster shutdown timeout for next release
2025-04-01 18:03:07 +04:00
Alan Mologorsky
a60a5288d8 feat(cmapi): MCOL-5133: Stage1 to stand alone cli tool. (#3378)
[add] cluster api client class
[fix] cli cluster_app using cluster api client
[add] ClusterAction Enum
[add] toggle_cluster_state function to reduce code duplication
2025-04-01 18:03:07 +04:00
Alan Mologorsky
a00711cc16 fix!(cmapi): MCOL-5454: Self-signed certificate autorenew. (#3213)
[add] managers/certificate.py with CertificateManger class
[mv] creating self-signed certificate logic into CertificateManger class
[add] renew and days_before_expire methods to CertificateManger class
[mv] several certificate dependent constants to managers/certificate.py
[add] CherryPy BackgroundTask to invoke certificate check hourly (3600 secs)
[fix] tests
[fix] bug with txn timer clean (clean_txn_by_timeout, worker and invoking of it)
2025-04-01 18:03:07 +04:00
Alan Mologorsky
6efffef765 MCOL-5594: Duplicate to stable-23.10 (#3222)
* MCOL-5594: Interactive "mcs cluster stop" command for CMAPI. (#3024)

* MCOL-5594: Interactive "mcs cluster stop" command for CMAPI.

[add] NodeProcessController class to handle Node operations
[add] two endpoints: stop_dmlproc (PUT) and is_process_running (GET)
[add] NodeProcessController.put_stop_dmlproc method to separately stop DMLProc on primary Node
[add] NodeProcessController.get_process_running method to check if specified process running or not
[add] build_url function to helpers.py. It needed to build urls with query_params
[add] MCSProcessManager.gracefully_stop_dmlproc method
[add] MCSProcessManager.is_service_running method as a top level wrapper to the same method in dispatcher
[fix] MCSProcessManager.stop by using new gracefully_stop_dmlproc
[add] interactive option and mode to mcs cluster stop command
[fix] requirements.txt with typer version to 0.9.0 where supports various of features including "Annotated"
[fix] requirements.txt click version (8.1.3 -> 8.1.7) and typing-extensions (4.3.0 -> 4.8.0). This is dependencies for typer package.
[fix] multiple minor formatting, docstrings and comments

* MCOL-5594: Add new CMAPI transaction manager.

- [add] TransactionManager ContextDecorator to manage transactions in less code and in one place
- [add] TransactionManager to cli cluster stop command and to API cluster shutdown command
- [fix] id -> txn_id in ClusterHandler class
- [fix] ClusterHandler.shutdown class to use inside existing transaction
- [add] docstrings in multiple places

* MCOL-5594: Review fixes.

* fix(cmapi): MCOL-5594: Fix after testing.. (#3150)

- [fix] wrong invoke already imported constants and function from signal lib
2024-06-27 14:32:50 +04:00
Roman Nozdrin
3046f7f952 fix(cmapi): handle an exception seen in CMAPI at RHEL 8 2024-05-21 16:19:30 +04:00
mariadb-AlanMologorsky
a079a2c944 MCOL-5496: Merge CMAPI code to engine repo.
[add] cmapi code to engine
2023-06-07 10:00:16 +03:00