mirror of
https://github.com/Mbed-TLS/mbedtls.git
synced 2025-08-08 17:42:09 +03:00
Move PSA documentation to tf-psa-crypto
Move the documentation files that, after the split, will fit better in TF-PSA-Crypto than in Mbed TLS. No comment update. Signed-off-by: Ronald Cron <ronald.cron@arm.com>
@@ -1,467 +0,0 @@
|
||||
Mbed TLS storage specification
|
||||
=================================
|
||||
|
||||
This document specifies how Mbed TLS uses storage.
|
||||
Key storage was originally introduced in a product called Mbed Crypto, which was re-distributed via Mbed TLS and has since been merged into Mbed TLS.
|
||||
This document contains historical information both from before and after this merge.
|
||||
|
||||
Mbed Crypto may be upgraded on an existing device with the storage preserved. Therefore:
|
||||
|
||||
1. Any change may break existing installations and may require an upgrade path.
|
||||
1. This document retains historical information about all past released versions. Do not remove information from this document unless it has always been incorrect or it is about a version that you are sure was never released.
|
||||
|
||||
Mbed Crypto 0.1.0
|
||||
-----------------
|
||||
|
||||
Tags: mbedcrypto-0.1.0b, mbedcrypto-0.1.0b2
|
||||
|
||||
Released in November 2018. <br>
|
||||
Integrated in Mbed OS 5.11.
|
||||
|
||||
Supported backends:
|
||||
|
||||
* [PSA ITS](#file-namespace-on-its-for-0.1.0)
|
||||
* [C stdio](#file-namespace-on-stdio-for-0.1.0)
|
||||
|
||||
Supported features:
|
||||
|
||||
* [Persistent transparent keys](#key-file-format-for-0.1.0) designated by a [slot number](#key-names-for-0.1.0).
|
||||
* [Nonvolatile random seed](#nonvolatile-random-seed-file-format-for-0.1.0) on ITS only.
|
||||
|
||||
This is a beta release, and we do not promise backward compatibility, with one exception:
|
||||
|
||||
> On Mbed OS, if a device has a nonvolatile random seed file produced with Mbed OS 5.11.x and is upgraded to a later version of Mbed OS, the nonvolatile random seed file is preserved or upgraded.
|
||||
|
||||
We do not make any promises regarding key storage, or regarding the nonvolatile random seed file on other platforms.
|
||||
|
||||
### Key names for 0.1.0
|
||||
|
||||
Information about each key is stored in a dedicated file whose name is constructed from the key identifier. The way in which the file name is constructed depends on the storage backend. The content of the file is described [below](#key-file-format-for-0.1.0).
|
||||
|
||||
The valid values for a key identifier are the range from 1 to 0xfffeffff. This limitation on the range is not documented in user-facing documentation: according to the user-facing documentation, arbitrary 32-bit values are valid.
|
||||
|
||||
The code uses the following constant in an internal header (note that despite the name, this value is actually one plus the maximum permitted value):
|
||||
|
||||
#define PSA_MAX_PERSISTENT_KEY_IDENTIFIER 0xffff0000
|
||||
|
||||
There is a shared namespace for all callers.
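As a minimal sketch of the range rule above, the check below accepts identifiers 1 through 0xfffeffff. Only the constant comes from the text; the helper name `key_id_is_valid_0_1_0` is hypothetical and is not a library function.

```c
#include <stdint.h>

/* From the internal header quoted above: one plus the maximum permitted value. */
#define PSA_MAX_PERSISTENT_KEY_IDENTIFIER 0xffff0000

/* Hypothetical helper (illustration only): returns 1 if key_id lies in the
 * valid range 1..0xfffeffff for Mbed Crypto 0.1.0, and 0 otherwise. */
static int key_id_is_valid_0_1_0(uint32_t key_id)
{
    return key_id != 0 && key_id < PSA_MAX_PERSISTENT_KEY_IDENTIFIER;
}
```

Comparing with `<` rather than `<=` reflects the note that the constant is one past the largest valid identifier.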
|
||||
|
||||
### Key file format for 0.1.0
|
||||
|
||||
All integers are encoded in little-endian order in 8-bit bytes.
|
||||
|
||||
The layout of a key file is:
|
||||
|
||||
* magic (8 bytes): `"PSA\0KEY\0"`
|
||||
* version (4 bytes): 0
|
||||
* type (4 bytes): `psa_key_type_t` value
|
||||
* policy usage flags (4 bytes): `psa_key_usage_t` value
|
||||
* policy usage algorithm (4 bytes): `psa_algorithm_t` value
|
||||
* key material length (4 bytes)
|
||||
* key material: output of `psa_export_key`
|
||||
* Any trailing data is rejected on load.
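The layout above could be decoded along the following lines. This is a sketch under stated assumptions, not the library's loader: the struct `key_file_0_1_0` and the functions `read_le32` and `parse_key_file_0_1_0` are hypothetical names introduced for illustration.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical in-memory view of a 0.1.0 key file (not a library type). */
typedef struct {
    uint32_t version;
    uint32_t type;        /* psa_key_type_t value */
    uint32_t usage;       /* psa_key_usage_t value */
    uint32_t alg;         /* psa_algorithm_t value */
    uint32_t data_length; /* key material length */
    const uint8_t *data;  /* key material, points into the input buffer */
} key_file_0_1_0;

/* magic (8) + version, type, usage, algorithm, length (5 * 4) */
#define KEY_FILE_HEADER_SIZE 28u

/* Decode a 32-bit little-endian integer. */
static uint32_t read_le32(const uint8_t *p)
{
    return (uint32_t) p[0] | ((uint32_t) p[1] << 8) |
           ((uint32_t) p[2] << 16) | ((uint32_t) p[3] << 24);
}

/* Returns 0 on success, -1 if the buffer is not a well-formed key file. */
static int parse_key_file_0_1_0(const uint8_t *buf, size_t len,
                                key_file_0_1_0 *out)
{
    if (len < KEY_FILE_HEADER_SIZE)
        return -1;
    if (memcmp(buf, "PSA\0KEY\0", 8) != 0)
        return -1;
    out->version = read_le32(buf + 8);
    if (out->version != 0)
        return -1;
    out->type = read_le32(buf + 12);
    out->usage = read_le32(buf + 16);
    out->alg = read_le32(buf + 20);
    out->data_length = read_le32(buf + 24);
    /* Truncated files and files with trailing data are both rejected. */
    if (out->data_length != len - KEY_FILE_HEADER_SIZE)
        return -1;
    out->data = buf + KEY_FILE_HEADER_SIZE;
    return 0;
}
```

Requiring the declared length to match the remaining bytes exactly is one way to implement the rule that trailing data is rejected on load.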
|
||||
|
||||
### Nonvolatile random seed file format for 0.1.0
|
||||
|
||||
The nonvolatile random seed file contains a seed for the random generator. If present, it is rewritten at each boot as part of the random generator initialization.
|
||||
|
||||
The file format is just the seed as a byte string with no metadata or encoding of any kind.
|
||||
|
||||
### File namespace on ITS for 0.1.0
|
||||
|
||||
Assumption: ITS provides a 32-bit file identifier namespace. The Crypto service can use arbitrary file identifiers and no other part of the system accesses the same file identifier namespace.
|
||||
|
||||
* File 0: unused.
|
||||
* Files 1 through 0xfffeffff: [content](#key-file-format-for-0.1.0) of the [key whose identifier is the file identifier](#key-names-for-0.1.0).
|
||||
* File 0xffffff52 (`PSA_CRYPTO_ITS_RANDOM_SEED_UID`): [nonvolatile random seed](#nonvolatile-random-seed-file-format-for-0.1.0).
|
||||
* Files 0xffff0000 through 0xffffff51, 0xffffff53 through 0xffffffff: unused.
|
||||
|
||||
### File namespace on stdio for 0.1.0
|
||||
|
||||
Assumption: C stdio, allowing names containing lowercase letters, digits and underscores, of length up to 23.
|
||||
|
||||
An undocumented build-time configuration value `CRYPTO_STORAGE_FILE_LOCATION` allows storing the key files in a directory other than the current directory. This value is simply prepended to the file name (so it must end with a directory separator to put the keys in a different directory).
|
||||
|
||||
* `CRYPTO_STORAGE_FILE_LOCATION "psa_key_slot_0"`: used as a temporary file. Must be writable. May be overwritten or deleted if present.
|
||||
* `sprintf(CRYPTO_STORAGE_FILE_LOCATION "psa_key_slot_%lu", key_id)` [content](#key-file-format-for-0.1.0) of the [key whose identifier](#key-names-for-0.1.0) is `key_id`.
|
||||
* Other files: unused.
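The naming rule above can be sketched as follows. The empty default for `CRYPTO_STORAGE_FILE_LOCATION` and the helper name `key_file_name_0_1_0` are assumptions made for this illustration.

```c
#include <stddef.h>
#include <stdio.h>

/* Assumption for this sketch: an empty prefix, i.e. the current directory. */
#ifndef CRYPTO_STORAGE_FILE_LOCATION
#define CRYPTO_STORAGE_FILE_LOCATION ""
#endif

/* Hypothetical helper: writes the file name for key_id into out.
 * Returns 0 on success, -1 if the buffer is too small. */
static int key_file_name_0_1_0(unsigned long key_id, char *out, size_t out_size)
{
    int n = snprintf(out, out_size,
                     CRYPTO_STORAGE_FILE_LOCATION "psa_key_slot_%lu", key_id);
    return (n >= 0 && (size_t) n < out_size) ? 0 : -1;
}
```

With the largest valid identifier, 0xfffeffff (4294901759 in decimal), the name is 13 characters of prefix plus 10 digits, which matches the assumed maximum name length of 23.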
|
||||
|
||||
Mbed Crypto 1.0.0
|
||||
-----------------
|
||||
|
||||
Tags: mbedcrypto-1.0.0d4, mbedcrypto-1.0.0
|
||||
|
||||
Released in February 2019. <br>
|
||||
Integrated in Mbed OS 5.12.
|
||||
|
||||
Supported integrations:
|
||||
|
||||
* [PSA platform](#file-namespace-on-a-psa-platform-for-1.0.0)
|
||||
* [library using PSA ITS](#file-namespace-on-its-as-a-library-for-1.0.0)
|
||||
* [library using C stdio](#file-namespace-on-stdio-for-1.0.0)
|
||||
|
||||
Supported features:
|
||||
|
||||
* [Persistent transparent keys](#key-file-format-for-1.0.0) designated by a [key identifier and owner](#key-names-for-1.0.0).
|
||||
* [Nonvolatile random seed](#nonvolatile-random-seed-file-format-for-1.0.0) on ITS only.
|
||||
|
||||
Backward compatibility commitments: TBD
|
||||
|
||||
### Key names for 1.0.0
|
||||
|
||||
Information about each key is stored in a dedicated file designated by the key identifier. In integrations where there is no concept of key owner (in particular, in library integrations), the key identifier is exactly the key identifier as defined in the PSA Cryptography API specification (`psa_key_id_t`). In integrations where there is a concept of key owner (integration into a service for example), the key identifier is made of an owner identifier (its semantics and type are integration specific) and of the key identifier (`psa_key_id_t`) from the key owner point of view.
|
||||
|
||||
The way in which the file name is constructed from the key identifier depends on the storage backend. The content of the file is described [below](#key-file-format-for-1.0.0).
|
||||
|
||||
* Library integration: the key file name is just the key identifier as defined in the PSA crypto specification. This is a 32-bit value.
|
||||
* PSA service integration: the key file name is `(uint64_t)owner_uid << 32 | key_id` where `key_id` is the key identifier from the owner point of view and `owner_uid` (of type `int32_t`) is the calling partition identifier provided to the server by the partition manager. This is a 64-bit value.
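The PSA service formula above can be written out as a small helper. The function names below are hypothetical; only the expression `(uint64_t)owner_uid << 32 | key_id` comes from the text.

```c
#include <stdint.h>

/* Hypothetical helper: builds the 64-bit key file name used in a PSA service
 * integration from the owner and the owner-relative key identifier. */
static uint64_t key_file_id(int32_t owner_uid, uint32_t key_id)
{
    return ((uint64_t) owner_uid << 32) | key_id;
}

/* Lower 32 bits: the key identifier from the owner's point of view. */
static uint32_t key_file_id_get_key_id(uint64_t file_id)
{
    return (uint32_t) (file_id & 0xffffffffu);
}

/* Upper 32 bits: the bit pattern of the owner identifier. */
static uint32_t key_file_id_get_owner_bits(uint64_t file_id)
{
    return (uint32_t) (file_id >> 32);
}
```

A negative `owner_uid` is sign-extended by the cast, but the shift discards the extended bits, so the upper half of the result is exactly the 32-bit pattern of the owner identifier.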
|
||||
|
||||
### Key file format for 1.0.0
|
||||
|
||||
The layout is identical to [0.1.0](#key-file-format-for-0.1.0) so far. However note that the encoding of key types, algorithms and key material has changed, therefore the storage format is not compatible (despite using the same value in the version field so far).
|
||||
|
||||
### Nonvolatile random seed file format for 1.0.0
|
||||
|
||||
The nonvolatile random seed file contains a seed for the random generator. If present, it is rewritten at each boot as part of the random generator initialization.
|
||||
|
||||
The file format is just the seed as a byte string with no metadata or encoding of any kind.
|
||||
|
||||
This is unchanged since [the feature was introduced in Mbed Crypto 0.1.0](#nonvolatile-random-seed-file-format-for-0.1.0).
|
||||
|
||||
### File namespace on a PSA platform for 1.0.0
|
||||
|
||||
Assumption: ITS provides a 64-bit file identifier namespace. The Crypto service can use arbitrary file identifiers and no other part of the system accesses the same file identifier namespace.
|
||||
|
||||
Assumption: the owner identifier is a nonzero value of type `int32_t`.
|
||||
|
||||
* Files 0 through 0xffffff51, 0xffffff53 through 0xffffffff: unused, reserved for internal use of the crypto library or crypto service.
|
||||
* File 0xffffff52 (`PSA_CRYPTO_ITS_RANDOM_SEED_UID`): [nonvolatile random seed](#nonvolatile-random-seed-file-format-for-0.1.0).
|
||||
* Files 0x100000000 through 0xffffffffffff: [content](#key-file-format-for-1.0.0) of the [key whose identifier is the file identifier](#key-names-for-1.0.0). The upper 32 bits determine the owner.
|
||||
|
||||
### File namespace on ITS as a library for 1.0.0
|
||||
|
||||
Assumption: ITS provides a 64-bit file identifier namespace. The entity using the crypto library can use arbitrary file identifiers and no other part of the system accesses the same file identifier namespace.
|
||||
|
||||
This is a library integration, so there is no owner. The key file identifier is identical to the key identifier.
|
||||
|
||||
* File 0: unused.
|
||||
* Files 1 through 0xfffeffff: [content](#key-file-format-for-1.0.0) of the [key whose identifier is the file identifier](#key-names-for-1.0.0).
|
||||
* File 0xffffff52 (`PSA_CRYPTO_ITS_RANDOM_SEED_UID`): [nonvolatile random seed](#nonvolatile-random-seed-file-format-for-1.0.0).
|
||||
* Files 0xffff0000 through 0xffffff51, 0xffffff53 through 0xffffffff, 0x100000000 through 0xffffffffffffffff: unused.
|
||||
|
||||
### File namespace on stdio for 1.0.0
|
||||
|
||||
This is a library integration, so there is no owner. The key file identifier is identical to the key identifier.
|
||||
|
||||
[Identical to 0.1.0](#file-namespace-on-stdio-for-0.1.0).
|
||||
|
||||
### Upgrade from 0.1.0 to 1.0.0
|
||||
|
||||
* Delete files 1 through 0xfffeffff, which contain keys in a format that is no longer supported.
|
||||
|
||||
### Suggested changes to make before 1.0.0
|
||||
|
||||
The library integration and the PSA platform integration use different sets of file names. This is annoyingly non-uniform. For example, if we want to store non-key files, we have room in different ranges (0 through 0xffffffff on a PSA platform, 0xffff0000 through 0xffffffffffffffff in a library integration).
|
||||
|
||||
It would simplify things to always have a 32-bit owner, with a nonzero value, and thus reserve the range 0–0xffffffff for internal library use.
|
||||
|
||||
Mbed Crypto 1.1.0
|
||||
-----------------
|
||||
|
||||
Tags: mbedcrypto-1.1.0
|
||||
|
||||
Released in early June 2019. <br>
|
||||
Integrated in Mbed OS 5.13.
|
||||
|
||||
Changes since [1.0.0](#mbed-crypto-1.0.0):
|
||||
|
||||
* The stdio backend for storage has been replaced by an implementation of [PSA ITS over stdio](#file-namespace-on-stdio-for-1.1.0).
|
||||
* [Some changes in the key file format](#key-file-format-for-1.1.0).
|
||||
|
||||
### File namespace on stdio for 1.1.0
|
||||
|
||||
Assumption: C stdio, allowing names containing lowercase letters, digits and underscores, of length up to 23.
|
||||
|
||||
An undocumented build-time configuration value `PSA_ITS_STORAGE_PREFIX` allows storing the key files in a directory other than the current directory. This value is simply prepended to the file name (so it must end with a directory separator to put the keys in a different directory).
|
||||
|
||||
* `PSA_ITS_STORAGE_PREFIX "tempfile.psa_its"`: used as a temporary file. Must be writable. May be overwritten or deleted if present.
|
||||
* `sprintf(PSA_ITS_STORAGE_PREFIX "%016llx.psa_its", key_id)`: a key or non-key file. The `key_id` in the name is the 64-bit file identifier, which is the [key identifier](#key-names-for-mbed-tls-2.25.0) for a key file or some reserved identifier for a non-key file (currently: only the [nonvolatile random seed](#nonvolatile-random-seed-file-format-for-1.0.0)). The contents of the file are:
|
||||
    * Magic header (8 bytes): `"PSA\0ITS\0"`
    * File contents.
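A minimal sketch of the file naming and magic header described above. The empty default for `PSA_ITS_STORAGE_PREFIX` and the helper names `its_file_name` and `its_has_magic` are assumptions made for illustration.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Assumption for this sketch: an empty prefix, i.e. the current directory. */
#ifndef PSA_ITS_STORAGE_PREFIX
#define PSA_ITS_STORAGE_PREFIX ""
#endif

/* Hypothetical helper: writes the file name for a 64-bit file identifier.
 * Returns 0 on success, -1 if the buffer is too small. */
static int its_file_name(uint64_t uid, char *out, size_t out_size)
{
    int n = snprintf(out, out_size, PSA_ITS_STORAGE_PREFIX "%016llx.psa_its",
                     (unsigned long long) uid);
    return (n >= 0 && (size_t) n < out_size) ? 0 : -1;
}

/* Hypothetical helper: returns 1 if the buffer starts with the 8-byte
 * magic header, 0 otherwise. */
static int its_has_magic(const uint8_t *buf, size_t len)
{
    return len >= 8 && memcmp(buf, "PSA\0ITS\0", 8) == 0;
}
```

The identifier is zero-padded to 16 hexadecimal digits, so key files and non-key files share a single fixed-width naming scheme.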
|
||||
|
||||
### Key file format for 1.1.0
|
||||
|
||||
The key file format is identical to [1.0.0](#key-file-format-for-1.0.0), except for the following changes:
|
||||
|
||||
* A new policy field, marked as [NEW:1.1.0] below.
|
||||
* The encoding of key types, algorithms and key material has changed, therefore the storage format is not compatible (despite using the same value in the version field so far).
|
||||
|
||||
A self-contained description of the file layout follows.
|
||||
|
||||
All integers are encoded in little-endian order in 8-bit bytes.
|
||||
|
||||
The layout of a key file is:
|
||||
|
||||
* magic (8 bytes): `"PSA\0KEY\0"`
|
||||
* version (4 bytes): 0
|
||||
* type (4 bytes): `psa_key_type_t` value
|
||||
* policy usage flags (4 bytes): `psa_key_usage_t` value
|
||||
* policy usage algorithm (4 bytes): `psa_algorithm_t` value
|
||||
* policy enrollment algorithm (4 bytes): `psa_algorithm_t` value [NEW:1.1.0]
|
||||
* key material length (4 bytes)
|
||||
* key material: output of `psa_export_key`
|
||||
* Any trailing data is rejected on load.
|
||||
|
||||
Mbed Crypto TBD
|
||||
---------------
|
||||
|
||||
Tags: TBD
|
||||
|
||||
Released in TBD 2019. <br>
|
||||
Integrated in Mbed OS TBD.
|
||||
|
||||
### Changes introduced in TBD
|
||||
|
||||
* The layout of a key file now has a lifetime field before the type field.
|
||||
* Key files can store references to keys in a secure element. In such key files, the key material contains the slot number.
|
||||
|
||||
### File namespace on a PSA platform on TBD
|
||||
|
||||
Assumption: ITS provides a 64-bit file identifier namespace. The Crypto service can use arbitrary file identifiers and no other part of the system accesses the same file identifier namespace.
|
||||
|
||||
Assumption: the owner identifier is a nonzero value of type `int32_t`.
|
||||
|
||||
* Files 0 through 0xfffeffff: unused.
|
||||
* Files 0xffff0000 through 0xffffffff: reserved for internal use of the crypto library or crypto service. See [non-key files](#non-key-files-on-tbd).
|
||||
* Files 0x100000000 through 0xffffffffffff: [content](#key-file-format-for-1.0.0) of the [key whose identifier is the file identifier](#key-names-for-1.0.0). The upper 32 bits determine the owner.
|
||||
|
||||
### File namespace on ITS as a library on TBD
|
||||
|
||||
Assumption: ITS provides a 64-bit file identifier namespace. The entity using the crypto library can use arbitrary file identifiers and no other part of the system accesses the same file identifier namespace.
|
||||
|
||||
This is a library integration, so there is no owner. The key file identifier is identical to the key identifier.
|
||||
|
||||
* File 0: unused.
|
||||
* Files 1 through 0xfffeffff: [content](#key-file-format-for-1.0.0) of the [key whose identifier is the file identifier](#key-names-for-1.0.0).
|
||||
* Files 0xffff0000 through 0xffffffff: reserved for internal use of the crypto library or crypto service. See [non-key files](#non-key-files-on-tbd).
|
||||
* Files 0x100000000 through 0xffffffffffffffff: unused.
|
||||
|
||||
### Non-key files on TBD
|
||||
|
||||
File identifiers in the range 0xffff0000 through 0xffffffff are reserved for internal use in Mbed Crypto.
|
||||
|
||||
* Files 0xfffffe02 through 0xfffffeff (`PSA_CRYPTO_SE_DRIVER_ITS_UID_BASE + lifetime`): secure element driver storage. The content of the file is the secure element driver's persistent data.
|
||||
* File 0xffffff52 (`PSA_CRYPTO_ITS_RANDOM_SEED_UID`): [nonvolatile random seed](#nonvolatile-random-seed-file-format-for-1.0.0).
|
||||
* File 0xffffff54 (`PSA_CRYPTO_ITS_TRANSACTION_UID`): [transaction file](#transaction-file-format-for-tbd).
|
||||
* Other files are unused and reserved for future use.
|
||||
|
||||
### Key file format for TBD
|
||||
|
||||
All integers are encoded in little-endian order in 8-bit bytes except where otherwise indicated.
|
||||
|
||||
The layout of a key file is:
|
||||
|
||||
* magic (8 bytes): `"PSA\0KEY\0"`.
|
||||
* version (4 bytes): 0.
|
||||
* lifetime (4 bytes): `psa_key_lifetime_t` value.
|
||||
* type (4 bytes): `psa_key_type_t` value.
|
||||
* policy usage flags (4 bytes): `psa_key_usage_t` value.
|
||||
* policy usage algorithm (4 bytes): `psa_algorithm_t` value.
|
||||
* policy enrollment algorithm (4 bytes): `psa_algorithm_t` value.
|
||||
* key material length (4 bytes).
|
||||
* key material:
|
||||
    * For a transparent key: output of `psa_export_key`.
    * For an opaque key (unified driver interface): driver-specific opaque key blob.
    * For an opaque key (key in a secure element): slot number (8 bytes), in platform endianness.
|
||||
* Any trailing data is rejected on load.
|
||||
|
||||
### Transaction file format for TBD
|
||||
|
||||
The transaction file contains data about an ongoing action that cannot be completed atomically. It exists only if there is an ongoing transaction.
|
||||
|
||||
All integers are encoded in platform endianness.
|
||||
|
||||
All currently existing transactions concern a key in a secure element.
|
||||
|
||||
The layout of a transaction file is:
|
||||
|
||||
* type (2 bytes): the [transaction type](#transaction-types-on-tbd).
|
||||
* unused (2 bytes)
|
||||
* lifetime (4 bytes): `psa_key_lifetime_t` value that corresponds to a key in a secure element.
|
||||
* slot number (8 bytes): `psa_key_slot_number_t` value. This is the unique designation of the key for the secure element driver.
|
||||
* key identifier (4 bytes in a library integration, 8 bytes on a PSA platform): the internal representation of the key identifier. On a PSA platform, this encodes the key owner in the same way as [in file identifiers for key files](#file-namespace-on-a-psa-platform-on-tbd).
|
||||
|
||||
#### Transaction types on TBD
|
||||
|
||||
* 0x0001: key creation. The following locations may or may not contain data about the key that is being created:
|
||||
    * The slot in the secure element designated by the slot number.
    * The file containing the key metadata designated by the key identifier.
    * The driver persistent data.
|
||||
* 0x0002: key destruction. The following locations may or may not still contain data about the key that is being destroyed:
|
||||
    * The slot in the secure element designated by the slot number.
    * The file containing the key metadata designated by the key identifier.
    * The driver persistent data.
|
||||
|
||||
Mbed Crypto TBD
|
||||
---------------
|
||||
|
||||
Tags: TBD
|
||||
|
||||
Released in TBD 2020. <br>
|
||||
Integrated in Mbed OS TBD.
|
||||
|
||||
### Changes introduced in TBD
|
||||
|
||||
* The type field has been split into a type and a bits field of 2 bytes each.
|
||||
|
||||
### Key file format for TBD
|
||||
|
||||
All integers are encoded in little-endian order in 8-bit bytes except where otherwise indicated.
|
||||
|
||||
The layout of a key file is:
|
||||
|
||||
* magic (8 bytes): `"PSA\0KEY\0"`.
|
||||
* version (4 bytes): 0.
|
||||
* lifetime (4 bytes): `psa_key_lifetime_t` value.
|
||||
* type (2 bytes): `psa_key_type_t` value.
|
||||
* bits (2 bytes): `psa_key_bits_t` value.
|
||||
* policy usage flags (4 bytes): `psa_key_usage_t` value.
|
||||
* policy usage algorithm (4 bytes): `psa_algorithm_t` value.
|
||||
* policy enrollment algorithm (4 bytes): `psa_algorithm_t` value.
|
||||
* key material length (4 bytes).
|
||||
* key material:
|
||||
    * For a transparent key: output of `psa_export_key`.
    * For an opaque key (unified driver interface): driver-specific opaque key blob.
    * For an opaque key (key in a secure element): slot number (8 bytes), in platform endianness.
|
||||
* Any trailing data is rejected on load.
|
||||
|
||||
Mbed TLS 2.25.0
|
||||
---------------
|
||||
|
||||
Tags: `mbedtls-2.25.0`, `mbedtls-2.26.0`, `mbedtls-2.27.0`, `mbedtls-2.28.0`, `mbedtls-3.0.0`, `mbedtls-3.1.0`
|
||||
|
||||
First released in December 2020.
|
||||
|
||||
Note: this is the first version that is officially supported. The version number is still 0.
|
||||
|
||||
Backward compatibility commitments: we promise backward compatibility for stored keys when Mbed TLS is upgraded from x to y if x >= 2.25 and y < 4. See [`BRANCHES.md`](../../BRANCHES.md) for more details.
|
||||
|
||||
Supported integrations:
|
||||
|
||||
* [PSA platform](#file-namespace-on-a-psa-platform-on-mbed-tls-2.25.0)
|
||||
* [library using PSA ITS](#file-namespace-on-its-as-a-library-on-mbed-tls-2.25.0)
|
||||
* [library using C stdio](#file-namespace-on-stdio-for-mbed-tls-2.25.0)
|
||||
|
||||
Supported features:
|
||||
|
||||
* [Persistent keys](#key-file-format-for-mbed-tls-2.25.0) designated by a [key identifier and owner](#key-names-for-mbed-tls-2.25.0). Keys can be:
|
||||
    * Transparent, stored in the export format.
    * Opaque, using the PSA driver interface with statically registered drivers. The driver determines the content of the opaque key blob.
    * Opaque, using the deprecated secure element interface with dynamically registered drivers (`MBEDTLS_PSA_CRYPTO_SE_C`). The driver picks a slot number which is stored in the place of the key material.
|
||||
* [Nonvolatile random seed](#nonvolatile-random-seed-file-format-for-mbed-tls-2.25.0) on ITS only.
|
||||
|
||||
### Changes introduced in Mbed TLS 2.25.0
|
||||
|
||||
* The numerical encodings of `psa_key_type_t`, `psa_key_usage_t` and `psa_algorithm_t` have changed.
|
||||
|
||||
### File namespace on a PSA platform on Mbed TLS 2.25.0
|
||||
|
||||
Assumption: ITS provides a 64-bit file identifier namespace. The Crypto service can use arbitrary file identifiers and no other part of the system accesses the same file identifier namespace.
|
||||
|
||||
Assumption: the owner identifier is a nonzero value of type `int32_t`.
|
||||
|
||||
* Files 0 through 0xfffeffff: unused.
|
||||
* Files 0xffff0000 through 0xffffffff: reserved for internal use of the crypto library or crypto service. See [non-key files](#non-key-files-on-mbed-tls-2.25.0).
|
||||
* Files 0x100000000 through 0xffffffffffff: [content](#key-file-format-for-mbed-tls-2.25.0) of the [key whose identifier is the file identifier](#key-names-for-mbed-tls-2.25.0). The upper 32 bits determine the owner.
|
||||
|
||||
### File namespace on ITS as a library on Mbed TLS 2.25.0
|
||||
|
||||
Assumption: ITS provides a 64-bit file identifier namespace. The entity using the crypto library can use arbitrary file identifiers and no other part of the system accesses the same file identifier namespace.
|
||||
|
||||
This is a library integration, so there is no owner. The key file identifier is identical to the key identifier.
|
||||
|
||||
* File 0: unused.
|
||||
* Files 1 through 0xfffeffff: [content](#key-file-format-for-mbed-tls-2.25.0) of the [key whose identifier is the file identifier](#key-names-for-mbed-tls-2.25.0).
|
||||
* Files 0xffff0000 through 0xffffffff: reserved for internal use of the crypto library or crypto service. See [non-key files](#non-key-files-on-mbed-tls-2.25.0).
|
||||
* Files 0x100000000 through 0xffffffffffffffff: unused.
|
||||
|
||||
### File namespace on stdio for Mbed TLS 2.25.0
|
||||
|
||||
Assumption: C stdio, allowing names containing lowercase letters, digits and underscores, of length up to 23.
|
||||
|
||||
An undocumented build-time configuration value `PSA_ITS_STORAGE_PREFIX` allows storing the key files in a directory other than the current directory. This value is simply prepended to the file name (so it must end with a directory separator to put the keys in a different directory).
|
||||
|
||||
* `PSA_ITS_STORAGE_PREFIX "tempfile.psa_its"`: used as a temporary file. Must be writable. May be overwritten or deleted if present.
|
||||
* `sprintf(PSA_ITS_STORAGE_PREFIX "%016llx.psa_its", key_id)`: a key or non-key file. The `key_id` in the name is the 64-bit file identifier, which is the [key identifier](#key-names-for-mbed-tls-2.25.0) for a key file or some reserved identifier for a [non-key file](#non-key-files-on-mbed-tls-2.25.0). The contents of the file are:
|
||||
    * Magic header (8 bytes): `"PSA\0ITS\0"`
    * File contents.
|
||||
|
||||
### Key names for Mbed TLS 2.25.0
|
||||
|
||||
Information about each key is stored in a dedicated file designated by the key identifier. In integrations where there is no concept of key owner (in particular, in library integrations), the key identifier is exactly the key identifier as defined in the PSA Cryptography API specification (`psa_key_id_t`). In integrations where there is a concept of key owner (integration into a service for example), the key identifier is made of an owner identifier (its semantics and type are integration specific) and of the key identifier (`psa_key_id_t`) from the key owner point of view.
|
||||
|
||||
The way in which the file name is constructed from the key identifier depends on the storage backend. The content of the file is described [below](#key-file-format-for-mbed-tls-2.25.0).
|
||||
|
||||
* Library integration: the key file name is just the key identifier as defined in the PSA crypto specification. This is a 32-bit value which must be in the range 0x00000001..0x3fffffff (`PSA_KEY_ID_USER_MIN`..`PSA_KEY_ID_USER_MAX`).
|
||||
* PSA service integration: the key file name is `(uint64_t)owner_uid << 32 | key_id` where `key_id` is the key identifier from the owner point of view and `owner_uid` (of type `int32_t`) is the calling partition identifier provided to the server by the partition manager. This is a 64-bit value.
|
||||
|
||||
### Key file format for Mbed TLS 2.25.0
|
||||
|
||||
All integers are encoded in little-endian order in 8-bit bytes except where otherwise indicated.
|
||||
|
||||
The layout of a key file is:
|
||||
|
||||
* magic (8 bytes): `"PSA\0KEY\0"`.
|
||||
* version (4 bytes): 0.
|
||||
* lifetime (4 bytes): `psa_key_lifetime_t` value.
|
||||
* type (2 bytes): `psa_key_type_t` value.
|
||||
* bits (2 bytes): `psa_key_bits_t` value.
|
||||
* policy usage flags (4 bytes): `psa_key_usage_t` value.
|
||||
* policy usage algorithm (4 bytes): `psa_algorithm_t` value.
|
||||
* policy enrollment algorithm (4 bytes): `psa_algorithm_t` value.
|
||||
* key material length (4 bytes).
|
||||
* key material:
|
||||
    * For a transparent key: output of `psa_export_key`.
    * For an opaque key (unified driver interface): driver-specific opaque key blob.
    * For an opaque key (key in a dynamic secure element): slot number (8 bytes), in platform endianness.
|
||||
* Any trailing data is rejected on load.
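To make the field offsets concrete, here is a sketch of a serializer for this layout. It is an illustration under stated assumptions, not the library's code: `write_le16`, `write_le32` and `write_key_file_2_25` are hypothetical names, and the key material is treated as an opaque byte string supplied by the caller.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* magic (8) + version (4) + lifetime (4) + type (2) + bits (2)
 * + usage (4) + algorithm (4) + enrollment algorithm (4) + length (4) */
#define KEY_FILE_HEADER_SIZE_2_25 36u

static void write_le16(uint8_t *p, uint16_t v)
{
    p[0] = (uint8_t) (v & 0xff);
    p[1] = (uint8_t) (v >> 8);
}

static void write_le32(uint8_t *p, uint32_t v)
{
    p[0] = (uint8_t) (v & 0xff);
    p[1] = (uint8_t) ((v >> 8) & 0xff);
    p[2] = (uint8_t) ((v >> 16) & 0xff);
    p[3] = (uint8_t) (v >> 24);
}

/* Hypothetical serializer: returns the number of bytes written, or 0 if the
 * output buffer is too small. */
static size_t write_key_file_2_25(uint8_t *out, size_t out_size,
                                  uint32_t lifetime, uint16_t type,
                                  uint16_t bits, uint32_t usage,
                                  uint32_t alg, uint32_t alg2,
                                  const uint8_t *material,
                                  uint32_t material_length)
{
    if (out_size < KEY_FILE_HEADER_SIZE_2_25 ||
        out_size - KEY_FILE_HEADER_SIZE_2_25 < material_length)
        return 0;
    memcpy(out, "PSA\0KEY\0", 8);
    write_le32(out + 8, 0);          /* version */
    write_le32(out + 12, lifetime);
    write_le16(out + 16, type);
    write_le16(out + 18, bits);
    write_le32(out + 20, usage);
    write_le32(out + 24, alg);
    write_le32(out + 28, alg2);      /* enrollment algorithm */
    write_le32(out + 32, material_length);
    if (material_length != 0)
        memcpy(out + KEY_FILE_HEADER_SIZE_2_25, material, material_length);
    return (size_t) KEY_FILE_HEADER_SIZE_2_25 + material_length;
}
```

For a key in a dynamic secure element, the caller would pass the 8-byte slot number, already in platform endianness, as the material.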
|
||||
|
||||
### Non-key files on Mbed TLS 2.25.0
|
||||
|
||||
File identifiers that are outside the range of persistent key identifiers are reserved for internal use by the library. The only identifiers currently in use have the owner id (top 32 bits) set to 0.
|
||||
|
||||
* Files 0xfffffe02 through 0xfffffeff (`PSA_CRYPTO_SE_DRIVER_ITS_UID_BASE + lifetime`): dynamic secure element driver storage. The content of the file is the secure element driver's persistent data.
|
||||
* File 0xffffff52 (`PSA_CRYPTO_ITS_RANDOM_SEED_UID`): [nonvolatile random seed](#nonvolatile-random-seed-file-format-for-mbed-tls-2.25.0).
|
||||
* File 0xffffff54 (`PSA_CRYPTO_ITS_TRANSACTION_UID`): [transaction file](#transaction-file-format-for-mbed-tls-2.25.0).
|
||||
* Other files are unused and reserved for future use.
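As a rough illustration, the following sketch classifies a file identifier in a library integration, combining the library namespace ranges with the non-key identifiers listed above. The enum `file_kind_t` and the function `classify_file_id` are hypothetical names; the numeric ranges are the ones given in this document.

```c
#include <stdint.h>

/* Hypothetical classification of a 64-bit file identifier. */
typedef enum {
    FILE_UNUSED,
    FILE_KEY,
    FILE_SE_DRIVER_DATA,
    FILE_RANDOM_SEED,
    FILE_TRANSACTION,
    FILE_RESERVED
} file_kind_t;

/* Library integration only: there is no owner, so the upper 32 bits of a
 * file identifier in use are always zero. */
static file_kind_t classify_file_id(uint64_t uid)
{
    if (uid == 0 || uid > 0xffffffffu)
        return FILE_UNUSED;
    if (uid <= 0xfffeffffu)
        return FILE_KEY;
    if (uid >= 0xfffffe02u && uid <= 0xfffffeffu)
        return FILE_SE_DRIVER_DATA;   /* PSA_CRYPTO_SE_DRIVER_ITS_UID_BASE + lifetime */
    if (uid == 0xffffff52u)
        return FILE_RANDOM_SEED;      /* PSA_CRYPTO_ITS_RANDOM_SEED_UID */
    if (uid == 0xffffff54u)
        return FILE_TRANSACTION;      /* PSA_CRYPTO_ITS_TRANSACTION_UID */
    return FILE_RESERVED;             /* 0xffff0000..0xffffffff, not yet assigned */
}
```

On a PSA platform the classification would differ: identifiers with a nonzero upper half are key files, and identifiers 0 through 0xfffeffff are unused.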
|
||||
|
||||
### Nonvolatile random seed file format for Mbed TLS 2.25.0
|
||||
|
||||
[Identical to Mbed Crypto 0.1.0](#nonvolatile-random-seed-file-format-for-0.1.0).
|
||||
|
||||
### Transaction file format for Mbed TLS 2.25.0
|
||||
|
||||
The transaction file contains data about an ongoing action that cannot be completed atomically. It exists only if there is an ongoing transaction.
|
||||
|
||||
All integers are encoded in platform endianness.
|
||||
|
||||
All currently existing transactions concern a key in a dynamic secure element.
|
||||
|
||||
The layout of a transaction file is:
|
||||
|
||||
* type (2 bytes): the [transaction type](#transaction-types-on-mbed-tls-2.25.0).
|
||||
* unused (2 bytes)
|
||||
* lifetime (4 bytes): `psa_key_lifetime_t` value that corresponds to a key in a secure element.
|
||||
* slot number (8 bytes): `psa_key_slot_number_t` value. This is the unique designation of the key for the secure element driver.
|
||||
* key identifier (4 bytes in a library integration, 8 bytes on a PSA platform): the internal representation of the key identifier. On a PSA platform, this encodes the key owner in the same way as [in file identifiers for key files](#file-namespace-on-a-psa-platform-on-mbed-tls-2.25.0).
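The field list above can be sketched as a serializer for the library integration case (4-byte key identifier). This assumes the fields are stored back to back in the listed order with no padding; whether the real file contains additional padding is not stated here. The names `write_transaction_library` and `TRANSACTION_SIZE_LIBRARY` are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* type (2) + unused (2) + lifetime (4) + slot number (8) + key identifier (4),
 * assuming no padding between or after the fields. */
#define TRANSACTION_SIZE_LIBRARY 20u

/* Hypothetical serializer: returns the number of bytes written, or 0 if the
 * output buffer is too small. */
static size_t write_transaction_library(uint8_t *out, size_t out_size,
                                        uint16_t type, uint32_t lifetime,
                                        uint64_t slot_number, uint32_t key_id)
{
    uint16_t unused = 0;

    if (out_size < TRANSACTION_SIZE_LIBRARY)
        return 0;
    /* memcpy of the native representation gives platform endianness. */
    memcpy(out + 0, &type, 2);
    memcpy(out + 2, &unused, 2);
    memcpy(out + 4, &lifetime, 4);
    memcpy(out + 8, &slot_number, 8);
    memcpy(out + 16, &key_id, 4);
    return TRANSACTION_SIZE_LIBRARY;
}
```

Because the integers are in platform endianness, a transaction file written on one platform is not portable to a platform with a different byte order.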
|
||||
|
||||
#### Transaction types on Mbed TLS 2.25.0
|
||||
|
||||
* 0x0001: key creation. The following locations may or may not contain data about the key that is being created:
|
||||
    * The slot in the secure element designated by the slot number.
    * The file containing the key metadata designated by the key identifier.
    * The driver persistent data.
|
||||
* 0x0002: key destruction. The following locations may or may not still contain data about the key that is being destroyed:
|
||||
    * The slot in the secure element designated by the slot number.
    * The file containing the key metadata designated by the key identifier.
    * The driver persistent data.
|
@@ -1,173 +0,0 @@
|
||||
PSA Cryptography API implementation and PSA driver interface
|
||||
===========================================================
|
||||
|
||||
## Introduction
|
||||
|
||||
The [PSA Cryptography API specification](https://armmbed.github.io/mbed-crypto/psa/#application-programming-interface) defines an interface to cryptographic operations for which the Mbed TLS library provides a reference implementation. The PSA Cryptography API specification is complemented by the PSA driver interface specification which defines an interface for cryptoprocessor drivers.
|
||||
|
||||
This document describes the high level organization of the Mbed TLS PSA Cryptography API implementation which is tightly related to the PSA driver interface.
|
||||
|
||||
## High level organization of the Mbed TLS PSA Cryptography API implementation
|
||||
In one sentence, the Mbed TLS PSA Cryptography API implementation is made of a core and PSA drivers as defined in the PSA driver interface. The key point is that software cryptographic operations are organized as PSA drivers: they interact with the core through the PSA driver interface.
|
||||
|
||||
### Rationale
|
||||
|
||||
* Addressing software and hardware cryptographic implementations through the same C interface reduces the core code size and its call graph complexity. The core and its dispatching to software and hardware implementations are consequently easier to test and validate.
|
||||
* The organization of the software cryptographic implementations in drivers promotes modularization of those implementations.
|
||||
* Like hardware capabilities, software cryptographic functionalities can be described by a JSON driver description file as defined in the PSA driver interface.
|
||||
* Along with JSON driver description files, the PSA driver specification defines the deliverables for a driver to be included into the Mbed TLS PSA Cryptography implementation. This provides a natural framework to integrate third party or alternative software implementations of cryptographic operations.
|
||||
|
||||
## The Mbed TLS PSA Cryptography API implementation core
|
||||
|
||||
The core implements all the APIs as defined in the PSA Cryptography API specification but does not itself perform any cryptographic operation. The core relies on PSA drivers to actually perform the cryptographic operations. The core is responsible for:
|
||||
|
||||
* the key store.
|
||||
* checking PSA API arguments and translating them into valid arguments for the necessary calls to the PSA driver interface.
|
||||
* dispatching the cryptographic operations to the appropriate PSA drivers.
|
||||
|
||||
The sketch of an Mbed TLS PSA cryptographic API implementation is thus:
|
||||
```C
|
||||
psa_status_t psa_api( ... )
|
||||
{
|
||||
psa_status_t status;
|
||||
|
||||
/* Pre driver interface call processing: validation of arguments, building
|
||||
* of arguments for the call to the driver interface, ... */
|
||||
|
||||
...
|
||||
|
||||
/* Call to the driver interface */
|
||||
status = psa_driver_wrapper_<entry_point>( ... );
|
||||
if( status != PSA_SUCCESS )
|
||||
return( status );
|
||||
|
||||
/* Post driver interface call processing: validation of the values returned
|
||||
* by the driver, finalization of the values to return to the caller,
|
||||
* clean-up in case of error ... */
|
||||
}
|
||||
```
|
||||
The code of most PSA APIs is expected to match precisely the above layout. However, it is likely that the code structure of some APIs will be more complicated with several calls to the driver interface, mainly to encompass a larger variety of hardware designs. For example, to encompass hardware accelerators that are capable of verifying a MAC and those that are only capable of computing a MAC, the psa_mac_verify() API could first call psa_driver_wrapper_mac_verify() and then fall back to psa_driver_wrapper_mac_compute().
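The verify-then-compute fallback described above can be sketched as follows. This is only an illustration under stated assumptions: every `sketch_` name is hypothetical, the status values are stand-ins, the "verify" wrapper is a stub that always reports that no driver supports the entry point, and the "MAC" is a toy XOR fold rather than a real algorithm.

```C
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Stand-ins for PSA status codes (illustrative values only). */
typedef int32_t sketch_status_t;
#define SKETCH_SUCCESS                 0
#define SKETCH_ERROR_NOT_SUPPORTED     (-134)
#define SKETCH_ERROR_INVALID_SIGNATURE (-149)
#define SKETCH_MAC_SIZE                4

/* Hypothetical wrapper: pretend no driver offers a "verify" entry point. */
static sketch_status_t sketch_wrapper_mac_verify(const uint8_t *input, size_t input_len,
                                                 const uint8_t *mac, size_t mac_len)
{
    (void) input; (void) input_len; (void) mac; (void) mac_len;
    return SKETCH_ERROR_NOT_SUPPORTED;
}

/* Hypothetical wrapper: toy "MAC" that XOR-folds the input (NOT a real MAC). */
static sketch_status_t sketch_wrapper_mac_compute(const uint8_t *input, size_t input_len,
                                                  uint8_t mac[SKETCH_MAC_SIZE])
{
    memset(mac, 0, SKETCH_MAC_SIZE);
    for (size_t i = 0; i < input_len; i++) {
        mac[i % SKETCH_MAC_SIZE] ^= input[i];
    }
    return SKETCH_SUCCESS;
}

/* Core-side function: try "verify" first, fall back to "compute" + compare. */
sketch_status_t sketch_mac_verify(const uint8_t *input, size_t input_len,
                                  const uint8_t *mac, size_t mac_len)
{
    uint8_t computed[SKETCH_MAC_SIZE];
    uint8_t diff = 0;
    sketch_status_t status;

    if (mac_len != SKETCH_MAC_SIZE) {
        return SKETCH_ERROR_INVALID_SIGNATURE;
    }

    status = sketch_wrapper_mac_verify(input, input_len, mac, mac_len);
    if (status != SKETCH_ERROR_NOT_SUPPORTED) {
        return status; /* a driver handled the request (success or failure) */
    }

    status = sketch_wrapper_mac_compute(input, input_len, computed);
    if (status != SKETCH_SUCCESS) {
        return status;
    }

    /* Compare without an early exit so timing does not depend on the data. */
    for (size_t i = 0; i < SKETCH_MAC_SIZE; i++) {
        diff |= (uint8_t) (computed[i] ^ mac[i]);
    }
    return diff == 0 ? SKETCH_SUCCESS : SKETCH_ERROR_INVALID_SIGNATURE;
}
```

The point of the sketch is the control flow: only a "not supported" result from the first wrapper triggers the fallback, while any other status is returned to the caller unchanged.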
|
||||
|
||||
The implementations of `psa_driver_wrapper_<entry_point>` functions are generated by the build system based on the JSON driver description files of the various PSA drivers making up the Mbed TLS PSA Cryptography API implementation. The implementations are split into two parts: the static ones are generated in a psa_crypto_driver_wrappers.h header file, while the non-static ones are generated in a psa_crypto_driver_wrappers_no_static.c C file, with their prototypes declared in a psa_crypto_driver_wrappers_no_static.h header file.
|
||||
|
||||
The psa_driver_wrapper_<entry_point>() functions dispatch cryptographic operations to accelerator drivers, secure element drivers as well as to the software implementations of cryptographic operations.
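As a rough illustration of such a dispatch function, the sketch below assumes that dispatch is keyed on where the key lives, and that an accelerator reporting "not supported" causes a fallback to the software implementation. All `sketch_` names, the location enum and the status values are hypothetical; the generated wrappers are more elaborate.

```C
#include <stdint.h>

typedef int32_t sketch_status_t;
#define SKETCH_SUCCESS                0
#define SKETCH_ERROR_NOT_SUPPORTED    (-134)
#define SKETCH_ERROR_INVALID_ARGUMENT (-135)

/* Hypothetical key locations. */
typedef enum {
    SKETCH_LOCATION_LOCAL,          /* transparent key, usable by software */
    SKETCH_LOCATION_SECURE_ELEMENT  /* opaque key held by a secure element */
} sketch_location_t;

/* Hypothetical accelerator driver: only supports algorithm 1. */
static sketch_status_t sketch_accel_op(uint32_t alg, uint32_t *handled_by)
{
    if (alg != 1) {
        return SKETCH_ERROR_NOT_SUPPORTED;
    }
    *handled_by = 0xA;
    return SKETCH_SUCCESS;
}

/* Hypothetical software (built-in) driver: supports everything. */
static sketch_status_t sketch_builtin_op(uint32_t alg, uint32_t *handled_by)
{
    (void) alg;
    *handled_by = 0xB;
    return SKETCH_SUCCESS;
}

/* Hypothetical secure element driver. */
static sketch_status_t sketch_se_op(uint32_t alg, uint32_t *handled_by)
{
    (void) alg;
    *handled_by = 0xC;
    return SKETCH_SUCCESS;
}

/* Wrapper: choose the implementation from the key location. */
sketch_status_t sketch_driver_wrapper_op(sketch_location_t location,
                                         uint32_t alg, uint32_t *handled_by)
{
    sketch_status_t status;

    switch (location) {
        case SKETCH_LOCATION_LOCAL:
            /* Try the accelerator first, then fall back to software. */
            status = sketch_accel_op(alg, handled_by);
            if (status != SKETCH_ERROR_NOT_SUPPORTED) {
                return status;
            }
            return sketch_builtin_op(alg, handled_by);
        case SKETCH_LOCATION_SECURE_ELEMENT:
            /* Opaque keys can only be used by the driver that owns them. */
            return sketch_se_op(alg, handled_by);
        default:
            return SKETCH_ERROR_INVALID_ARGUMENT;
    }
}
```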
|
||||
|
||||
Note that the implementation allows building the library with only a C compiler by shipping a generated file corresponding to a pure software implementation. The driver entry points and their code in this generated file are guarded by pre-processor directives based on PSA_WANT_xyz macros (see [Conditional inclusion of cryptographic mechanism through the PSA API in Mbed TLS](psa-conditional-inclusion-c.html)). That way, it is possible to compile and include in the library only the desired cryptographic operations.
|
||||
|
||||
### Key creation
|
||||
|
||||
Key creation implementation in Mbed TLS PSA core is articulated around three internal functions: psa_start_key_creation(), psa_finish_key_creation() and psa_fail_key_creation(). Implementations of key creation PSA APIs, namely psa_import_key(), psa_generate_key(), psa_key_derivation_output_key() and psa_copy_key() go by the following sequence:
|
||||
1. Check the input parameters.
|
||||
2. Call psa_start_key_creation(), which allocates a key slot, prepares it with the specified key attributes, and, in the case of a volatile key, assigns it a volatile key identifier.
|
||||
3. Generate or copy the key material into the key slot. This entails the allocation of the buffer to store the key material.
|
||||
4. Call psa_finish_key_creation() that mostly saves persistent keys into persistent storage.
|
||||
|
||||
If an error occurs at step 3 or 4, psa_fail_key_creation() is called. It wipes and cleans the slot, especially the key material: it resets to zero the RAM that contained the key material and frees the allocated buffer.
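The four steps and the error path can be sketched as below. This is a simplified model, not the library code: the `sketch_` functions are hypothetical stand-ins for the three internal functions, the key store is a single slot, and the status values are illustrative.

```C
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

typedef int32_t sketch_status_t;
#define SKETCH_SUCCESS                   0
#define SKETCH_ERROR_INVALID_ARGUMENT    (-135)
#define SKETCH_ERROR_INSUFFICIENT_MEMORY (-141)

typedef struct {
    int in_use;
    uint8_t *data;
    size_t length;
} sketch_key_slot_t;

static sketch_key_slot_t sketch_slot; /* hypothetical one-slot key store */

/* Stand-in for psa_start_key_creation(): reserve and prepare a slot. */
static sketch_status_t sketch_start_key_creation(sketch_key_slot_t **slot)
{
    if (sketch_slot.in_use) {
        return SKETCH_ERROR_INSUFFICIENT_MEMORY;
    }
    sketch_slot.in_use = 1;
    *slot = &sketch_slot;
    return SKETCH_SUCCESS;
}

/* Stand-in for psa_finish_key_creation(): would save persistent keys. */
static sketch_status_t sketch_finish_key_creation(sketch_key_slot_t *slot)
{
    (void) slot;
    return SKETCH_SUCCESS;
}

/* Stand-in for psa_fail_key_creation(): wipe the material, free, reset. */
static void sketch_fail_key_creation(sketch_key_slot_t *slot)
{
    if (slot->data != NULL) {
        /* Production code needs a zeroization the compiler cannot elide. */
        memset(slot->data, 0, slot->length);
        free(slot->data);
    }
    memset(slot, 0, sizeof(*slot));
}

sketch_status_t sketch_import_key(const uint8_t *data, size_t length)
{
    sketch_key_slot_t *slot = NULL;
    sketch_status_t status;

    /* Step 1: check the input parameters. */
    if (data == NULL || length == 0) {
        return SKETCH_ERROR_INVALID_ARGUMENT;
    }

    /* Step 2: allocate and prepare a slot. */
    status = sketch_start_key_creation(&slot);
    if (status != SKETCH_SUCCESS) {
        return status;
    }

    /* Step 3: allocate the buffer and copy the key material. */
    slot->data = malloc(length);
    if (slot->data == NULL) {
        status = SKETCH_ERROR_INSUFFICIENT_MEMORY;
        goto exit;
    }
    memcpy(slot->data, data, length);
    slot->length = length;

    /* Step 4: finish (persistent keys would be saved here). */
    status = sketch_finish_key_creation(slot);

exit:
    if (status != SKETCH_SUCCESS) {
        sketch_fail_key_creation(slot);
    }
    return status;
}
```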
|
||||
|
||||
|
||||
## Mbed TLS PSA Cryptography API implementation drivers
|
||||
|
||||
A driver of the Mbed TLS PSA Cryptography API implementation (Mbed TLS PSA driver in the following) is a driver in the sense that it is compliant with the PSA driver interface specification. But it is not an actual driver that drives some hardware. It implements cryptographic operations purely in software.
|
||||
|
||||
An Mbed TLS PSA driver C file is named psa_crypto_<driver_name>.c and its associated header file psa_crypto_<driver_name>.h. The functions implementing a driver entry point as defined in the PSA driver interface specification are named as mbedtls_psa_<driver name>_<entry point>(). As an example, the psa_crypto_rsa.c and psa_crypto_rsa.h are the files containing the Mbed TLS PSA driver implementing RSA cryptographic operations. This RSA driver implements among other entry points the "import_key" entry point. The function implementing this entry point is named mbedtls_psa_rsa_import_key().
|
||||
|
||||
## How to implement a new cryptographic mechanism
|
||||
|
||||
Summary of files to modify when adding a new algorithm or key type:
|
||||
|
||||
* [ ] PSA Crypto API draft, if not already done — [PSA standardization](#psa-standardization)
|
||||
* [ ] `include/psa/crypto_values.h` or `include/psa/crypto_extra.h` — [New functions and macros](#new-functions-and-macros)
|
||||
* [ ] `include/psa/crypto_config.h`, `tests/configs/crypto_config_test_driver_extension.h` — [Preprocessor symbols](#preprocessor-symbols)
|
||||
* Occasionally `library/check_crypto_config.h` — [Preprocessor symbols](#preprocessor-symbols)
|
||||
* [ ] `include/mbedtls/config_psa.h` — [Preprocessor symbols](#preprocessor-symbols)
|
||||
* [ ] `library/psa_crypto.c`, `library/psa_crypto_*.[hc]` — [Implementation of the mechanisms](#implementation-of-the-mechanisms)
|
||||
* [ ] `include/psa/crypto_builtin_*.h` — [Translucent data structures](#translucent-data-structures)
|
||||
* [ ] `tests/suites/test_suite_psa_crypto_metadata.data` — [New functions and macros](#new-functions-and-macros)
|
||||
* (If adding `PSA_IS_xxx`) `tests/suites/test_suite_psa_crypto_metadata.function` — [New functions and macros](#new-functions-and-macros)
|
||||
* [ ] `tests/suites/test_suite_psa_crypto*.data`, `tests/suites/test_suite_psa_crypto*.function` — [Unit tests](#unit-tests)
|
||||
* [ ] `framework/scripts/mbedtls_framework/crypto_knowledge.py`, `framework/scripts/mbedtls_framework/asymmetric_key_data.py` — [Unit tests](#unit-tests)
|
||||
* [ ] `ChangeLog.d/*.txt` — changelog entry
|
||||
|
||||
Summary of files to modify when adding new API functions:
|
||||
|
||||
* [ ] `include/psa/crypto.h` and `include/psa/crypto_sizes.h`, or `include/psa/crypto_extra.h` — [New functions and macros](#new-functions-and-macros)
|
||||
* [ ] `library/psa_crypto.c`, `scripts/data_files/driver_templates/*.jinja` — [Implementation of the mechanisms](#implementation-of-the-mechanisms)
|
||||
* [ ] If adding stateful functions: `include/psa/crypto_struct.h`, `include/psa/crypto_builtin_*.h`, `include/psa/crypto_driver_contexts_*.h` — [Translucent data structures](#translucent-data-structures)
|
||||
* [ ] `tests/suites/test_suite_psa_crypto.data`, `tests/suites/test_suite_psa_crypto.function`, `tests/suites/test_suite_psa_crypto_driver_wrappers.*` — [Unit tests](#unit-tests)
|
||||
|
||||
Note that this is just a basic guide. In some cases, you won't need to change all the files listed here. In some cases, you may need to change other files.
|
||||
|
||||
### PSA standardization
|
||||
|
||||
Typically, if there's enough demand for a cryptographic mechanism in Mbed TLS, there's enough demand for it to be part of the official PSA Cryptography specification. Therefore the first step before implementing a new mechanism should be to approach the PSA Cryptography working group in Arm for standardization.
|
||||
|
||||
At the time of writing, all cryptographic mechanisms that are accessible through `psa_xxx` APIs in Mbed TLS are current or upcoming PSA standards. Mbed TLS implements some extensions to the PSA API that offer extra integration customization or extra key policies.
|
||||
|
||||
Mbed TLS routinely implements cryptographic mechanisms that are not yet part of a published PSA standard, but that are scheduled to be part of a future version of the standard. The Mbed TLS implementation validates the feasibility of the upcoming PSA standard. The PSA Cryptography working group and the Mbed TLS development team communicate during the elaboration of the new interfaces.
|
||||
|
||||
### New functions and macros
|
||||
|
||||
If a mechanism requires new functions, they should follow the design guidelines in the PSA Cryptography API specification.
|
||||
|
||||
Functions that are part of the current or upcoming API are declared in `include/psa/crypto.h`, apart from structure accessors defined in `include/psa/crypto_struct.h`. Functions that have output buffers have associated sufficient-output-size macros in `include/psa/crypto_sizes.h`.
|
||||
|
||||
Constants (algorithm identifiers, key type identifiers, etc.) and associated destructor macros (e.g. `PSA_IS_xxx()`) are defined in `include/psa/crypto_values.h`.
|
||||
|
||||
Functions and macros that are not intended for standardization, or that are at a stage where the draft standard might still evolve significantly, are declared in `include/psa/crypto_extra.h`.
|
||||
|
||||
The PSA Cryptography API specification defines both names and values for certain kinds of constants: algorithms (`PSA_ALG_xxx`), key types (`PSA_KEY_TYPE_xxx`), ECC curve families (`PSA_ECC_FAMILY_xxx`), DH group families (`PSA_DH_FAMILY_xxx`). If Mbed TLS defines an algorithm or a key type that is not part of a current or upcoming PSA standard, pick a value with the `VENDOR` flag set. If Mbed TLS defines an ECC curve or DH group family that is not part of a current or upcoming PSA standard, define a vendor key type and use the family identifier only with this vendor key type.
|
||||
|
||||
New constants must have a test case in `tests/suites/test_suite_psa_crypto_metadata.data` that verifies that `PSA_IS_xxx` macros behave properly with the new constant. New `PSA_IS_xxx` macros must be declared in `tests/suites/test_suite_psa_crypto_metadata.function`.
|
||||
|
||||
### Preprocessor symbols
|
||||
|
||||
Each cryptographic mechanism is optional and can be selected by the application at build time. For each feature `PSA_ttt_xxx`:
|
||||
|
||||
* The feature is available to applications when the preprocessor symbol `PSA_WANT_ttt_xxx` is defined. These symbols are set in the application configuration file `include/psa/crypto_config.h` (or `MBEDTLS_PSA_CRYPTO_CONFIG_FILE`, plus `MBEDTLS_PSA_CRYPTO_USER_CONFIG_FILE`), with code in `include/mbedtls/config_psa.h` deducing the necessary underlying `MBEDTLS_xxx` symbols.
|
||||
* For transparent keys (keys that are not in a secure element), the feature is implemented by Mbed TLS if `MBEDTLS_PSA_BUILTIN_ttt_xxx` is defined, and by an accelerator driver if `MBEDTLS_PSA_ACCEL_ttt_xxx` is defined. `MBEDTLS_PSA_BUILTIN_ttt_xxx` constants are set in `include/mbedtls/config_psa.h` based on the application requests `PSA_WANT_ttt_xxx` and the accelerator driver declarations `MBEDTLS_PSA_ACCEL_ttt_xxx`.
|
||||
* For the testing of the driver dispatch code, `tests/configs/crypto_config_test_driver_extension.h` sets additional `MBEDTLS_PSA_ACCEL_xxx` symbols.
|
||||
|
||||
For more details, see *[Conditional inclusion of cryptographic mechanism through the PSA API in Mbed TLS](../proposed/psa-conditional-inclusion-c.html)*.
|
||||
|
||||
Some mechanisms require other mechanisms. For example, you can't do GCM without a block cipher, or RSA-PSS without RSA keys. When mechanism A requires mechanism B, `include/mbedtls/config_psa.h` ensures that B is enabled whenever A is enabled. When mechanism A requires at least one of a set {B1, B2, B3, ...} but there is no particular reason why enabling A would enable any of the specific Bi's, it's up to the application to choose Bi's and the file `library/check_crypto_config.h` contains compile-time constraints to ensure that at least one Bi is enabled.
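The two kinds of preprocessor logic described in this section can be sketched as follows. The mechanism names (`EXAMPLE_MODE`, `EXAMPLE_1`, `EXAMPLE_2`) are hypothetical placeholders, not symbols that exist in the library; only the `PSA_WANT_`, `MBEDTLS_PSA_ACCEL_` and `MBEDTLS_PSA_BUILTIN_` prefixes come from the text above.

```C
/* Application request, as it would appear in the crypto configuration file.
 * EXAMPLE_MODE plays the role of mechanism A, EXAMPLE_1/EXAMPLE_2 of the Bi's. */
#define PSA_WANT_ALG_EXAMPLE_MODE   1
#define PSA_WANT_KEY_TYPE_EXAMPLE_1 1

/* An accelerator driver would declare its support like this (disabled here):
 * #define MBEDTLS_PSA_ACCEL_ALG_EXAMPLE_MODE 1
 */

/* config_psa.h-style deduction: the built-in implementation is needed when
 * the application wants the mechanism and no accelerator provides it. */
#if defined(PSA_WANT_ALG_EXAMPLE_MODE) && \
    !defined(MBEDTLS_PSA_ACCEL_ALG_EXAMPLE_MODE)
#define MBEDTLS_PSA_BUILTIN_ALG_EXAMPLE_MODE 1
#endif

/* check_crypto_config.h-style constraint: A needs at least one of the Bi's,
 * and choosing which one is left to the application. */
#if defined(PSA_WANT_ALG_EXAMPLE_MODE) && \
    !defined(PSA_WANT_KEY_TYPE_EXAMPLE_1) && \
    !defined(PSA_WANT_KEY_TYPE_EXAMPLE_2)
#error "PSA_WANT_ALG_EXAMPLE_MODE requires at least one example key type"
#endif

/* Reports whether the built-in code path was selected in this build. */
int sketch_example_mode_is_builtin(void)
{
#if defined(MBEDTLS_PSA_BUILTIN_ALG_EXAMPLE_MODE)
    return 1;
#else
    return 0;
#endif
}
```

Uncommenting the accelerator declaration would leave `MBEDTLS_PSA_BUILTIN_ALG_EXAMPLE_MODE` undefined, so the software implementation would be compiled out.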
|
||||
|
||||
### Implementation of the mechanisms
|
||||
|
||||
The general structure of a cryptographic operation function is:
|
||||
|
||||
1. API function defined in `library/psa_crypto.c`. The entry point performs generic checks that don't depend on whether the mechanism is implemented in software or in a driver and looks up keys in the key store.
|
||||
2. Driver dispatch code in `scripts/data_files/driver_templates/psa_crypto_driver_wrappers.h.jinja`, `scripts/data_files/driver_templates/psa_crypto_driver_wrappers_no_static.c.jinja` or files included from there.
|
||||
3. Built-in implementation in `library/psa_crypto_*.c` (with function declarations in the corresponding `.h` file). These files typically contain the implementation of modes of operation over basic building blocks that are defined elsewhere. For example, HMAC is implemented in `library/psa_crypto_mac.c` but the underlying hash functions are implemented in `library/sha*.c` and `library/md*.c`.
|
||||
4. Basic cryptographic building blocks in `library/*.c`.
|
||||
|
||||
When implementing a new algorithm or key type, there are typically things to change in `library/psa_crypto.c` (e.g. buffer size calculations, algorithm/key-type compatibility) and in the built-in implementation, but not in the driver dispatch code.
|
||||
|
||||
### Translucent data structures
|
||||
|
||||
Some mechanisms require state to be kept between function calls. Keys and key-like data are kept in the key store, which PSA manages internally. Other state, for example the state of multipart operations, is kept in structures allocated by the caller.
|
||||
|
||||
The size of operation structures needs to be known at compile time, since callers may allocate them on the stack. Therefore these structures are defined in a public header: `include/psa/crypto_struct.h` for the parts that are independent of the underlying implementation, `include/psa/crypto_builtin_*.h` for parts that are specific to the Mbed TLS built-in implementation, `include/psa/crypto_driver_*.h` for structures implemented by drivers.
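One way to picture this layering is an operation structure that holds an implementation selector plus a union of per-implementation contexts, so that its size is fixed at compile time whichever implementation ends up being used. The sketch below is an assumption-laden illustration: all type names, fields and sizes are hypothetical.

```C
#include <stddef.h>
#include <stdint.h>

/* Hypothetical context of the built-in (software) implementation. */
typedef struct {
    uint32_t state[8];
    uint8_t buffer[64];
    size_t buffered;
} sketch_builtin_hash_operation_t;

/* Hypothetical context of an accelerator driver. */
typedef struct {
    uint32_t hw_session;
} sketch_accel_hash_operation_t;

/* Union of all contexts that may back the operation. */
typedef union {
    unsigned dummy; /* keeps the union non-empty if nothing is enabled */
    sketch_builtin_hash_operation_t builtin;
    sketch_accel_hash_operation_t accel;
} sketch_hash_context_t;

/* Implementation-independent part: visible to callers so they can
 * allocate it, but its fields are not meant to be accessed by them. */
typedef struct {
    unsigned int id;            /* 0 = inactive, else the owning implementation */
    sketch_hash_context_t ctx;
} sketch_hash_operation_t;

#define SKETCH_HASH_OPERATION_INIT { 0, { 0 } }
```

A caller can then write `sketch_hash_operation_t op = SKETCH_HASH_OPERATION_INIT;` on the stack without knowing which implementation will handle the operation.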
|
||||
|
||||
### Unit tests
|
||||
|
||||
A number of unit tests are automatically generated by `framework/scripts/generate_psa_tests.py` based on the algorithms and key types declared in `include/psa/crypto_values.h` and `include/psa/crypto_extra.h`:
|
||||
|
||||
* Attempt to create a key with a key type that is not supported.
|
||||
* Attempt to perform an operation with a combination of key type and algorithm that is not valid or not supported.
|
||||
* Storage and retrieval of a persistent key.
|
||||
|
||||
When adding a new key type or algorithm:
|
||||
|
||||
* `framework/scripts/mbedtls_framework/crypto_knowledge.py` contains knowledge about the compatibility of key types, key sizes and algorithms.
|
||||
* `framework/scripts/mbedtls_framework/asymmetric_key_data.py` contains valid key data for asymmetric key types.
|
||||
|
||||
Other things need to be tested manually, either in `tests/suites/test_suite_psa_crypto.data` or in another file. For example (this is not an exhaustive list):
|
||||
|
||||
* Known answer tests.
|
||||
* Potential edge cases (e.g. data less/equal/more than the block size, number equal to zero in asymmetric cryptography).
|
||||
* Tests with invalid keys (e.g. wrong size or format).
|
||||
* Tests with invalid data (e.g. wrong size or format, output buffer too small, invalid padding).
|
||||
* For new functions: incorrect function call sequence, driver dispatch (in `tests/suites/test_suite_psa_crypto_driver_wrappers.*`).
|
||||
* For key derivation algorithms: variation on the sequence of input steps, variation on the output size.
|
||||
|
@@ -1,214 +0,0 @@
|
||||
PSA key store design
|
||||
====================
|
||||
|
||||
## Introduction
|
||||
|
||||
This document describes the architecture of the key storage in memory in the Mbed TLS and TF-PSA-Crypto implementation of the PSA Cryptography API.
|
||||
|
||||
In the PSA Cryptography API, cryptographic operations access key materials via a key identifier (key ID for short). Applications must first create a key object, which allocates storage in memory for the key material and metadata. This storage is under the control of the library and may be located in a different memory space such as a trusted execution environment or a secure element.
|
||||
|
||||
The storage of persistent keys is out of scope of this document. See the [Mbed Crypto storage specification](mbed-crypto-storage-specification.md).
|
||||
|
||||
## Key slot management interface
|
||||
|
||||
### Key store and key slots
|
||||
|
||||
The **key store** consists of a collection of **key slots**. Each key slot contains the metadata for one key, as well as the key material or a reference to the key material.
|
||||
|
||||
A key slot has the type `psa_key_slot_t`. The key store is a global object which is private inside `psa_crypto_slot_management.c`.
|
||||
|
||||
### Key slot entry points
|
||||
|
||||
The following operations allocate a key slot by calling `psa_reserve_free_key_slot()`:
|
||||
|
||||
* **Creating** a key object, through means such as import, random generation, deterministic derivation, copy, or registration of an existing key that is stored in protected hardware (secure element, hardware unique key (HUK)).
|
||||
* **Loading** a persistent key from storage, or loading a built-in key. This is done through `psa_get_and_lock_key_slot()`, which calls `psa_reserve_free_key_slot()` and loads the key if applicable.
|
||||
|
||||
The following operations free a key slot by calling `psa_wipe_key_slot()` and, if applicable, `psa_free_key_slot()`:
|
||||
|
||||
* **Destroying** a key.
|
||||
* **Purging** a persistent key from memory, either explicitly at the application's request or to free memory.
|
||||
|
||||
Deinitializing the PSA Crypto subsystem with `mbedtls_psa_crypto_free()` destroys all volatile keys and purges all persistent keys.
|
||||
|
||||
The library accesses key slots in the following scenarios:
|
||||
|
||||
* while the key is being created or loaded;
|
||||
* while the key is being destroyed or purged;
|
||||
* while the key metadata or key material is being accessed.
|
||||
|
||||
### Key slot states
|
||||
|
||||
The state of a key slot is indicated by its `state` field of type `psa_key_slot_state_t`, which can be:
|
||||
|
||||
* `PSA_SLOT_EMPTY`: a slot that occupies memory but does not currently contain a key.
|
||||
* `PSA_SLOT_FILLING`: a slot that is being filled to create or load a key.
|
||||
* `PSA_SLOT_FULL`: a slot containing a key.
|
||||
* `PSA_SLOT_PENDING_DELETION`: a slot whose key is being destroyed or purged.
|
||||
|
||||
These states are mostly useful for concurrency. See [Concurrency](#concurrency) below and [key slot states in the PSA thread safety specification](psa-thread-safety/psa-thread-safety.md#key-slot-states).
|
||||
|
||||
#### Concurrency
|
||||
|
||||
In a multithreaded environment, since Mbed TLS 3.6.0, each key slot is protected by a reader-writer lock. (In earlier versions, the key store was not thread-safe.) The lock is controlled by a single global mutex `mbedtls_threading_psa_globaldata_mutex`. The concurrency state of the slot is indicated by the state and the `registered_readers` field:
|
||||
|
||||
* `EMPTY` or `FULL` state, `registered_readers == 0`: the slot is not in use by any thread.
|
||||
* `FULL` state, `registered_readers != 0`: the slot is being read.
|
||||
* `FILLING` or `PENDING_DELETION` state: the slot is being written.
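The three cases above amount to a small classification function. The sketch below uses hypothetical `sketch_` types that mirror the state names and the `registered_readers` field; it ignores locking entirely and only shows how the two fields combine.

```C
#include <stddef.h>

typedef enum {
    SKETCH_SLOT_EMPTY,
    SKETCH_SLOT_FILLING,
    SKETCH_SLOT_FULL,
    SKETCH_SLOT_PENDING_DELETION
} sketch_slot_state_t;

typedef enum {
    SKETCH_USE_NONE,   /* not in use by any thread */
    SKETCH_USE_READ,   /* being read */
    SKETCH_USE_WRITE   /* being written */
} sketch_slot_use_t;

typedef struct {
    sketch_slot_state_t state;
    size_t registered_readers;
} sketch_slot_view_t;

/* Classify how a slot is currently used, following the list above. */
sketch_slot_use_t sketch_slot_use(const sketch_slot_view_t *slot)
{
    switch (slot->state) {
        case SKETCH_SLOT_FILLING:
        case SKETCH_SLOT_PENDING_DELETION:
            return SKETCH_USE_WRITE;
        case SKETCH_SLOT_FULL:
            return slot->registered_readers != 0 ? SKETCH_USE_READ
                                                 : SKETCH_USE_NONE;
        case SKETCH_SLOT_EMPTY:
        default:
            /* The text only lists EMPTY together with zero readers. */
            return SKETCH_USE_NONE;
    }
}
```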
|
||||
|
||||
For more information, see [PSA thread safety](psa-thread-safety/psa-thread-safety.md).
|
||||
|
||||
Note that a slot must not be moved in memory while it is being read or written.
|
||||
|
||||
## Key slot management implementations
|
||||
|
||||
### Key store implementation variants
|
||||
|
||||
There are three variants of the key store implementation, responding to different needs.
|
||||
|
||||
* Hybrid key store ([static key slots](#static-key-store) with dynamic key data): the key store is a statically allocated array of slots, of size `MBEDTLS_PSA_KEY_SLOT_COUNT`. Key material is allocated on the heap. This is the historical implementation. It remains the default in the Mbed TLS 3.6 long-term support (LTS) branch when using a handwritten `mbedtls_config.h`, as is common on resource-constrained platforms, because the alternatives have tradeoffs (key size limit and larger RAM usage at rest for the static key store, larger code size and more risk due to code complexity for the dynamic key store).
|
||||
* Fully [static key store](#static-key-store) (since Mbed TLS 3.6.3): the key store is a statically allocated array of slots, of size `MBEDTLS_PSA_KEY_SLOT_COUNT`. Each key slot contains the key representation directly, and the key representation must be no more than `MBEDTLS_PSA_STATIC_KEY_SLOT_BUFFER_SIZE` bytes. This is intended for very constrained devices that do not have a heap.
|
||||
* [Dynamic key store](#dynamic-key-store) (since Mbed TLS 3.6.1): the key store is dynamically allocated as multiple slices on the heap, with a size that adjusts to the application's usage. Key material is allocated on the heap. Compared to the hybrid key store, the code size and RAM consumption are larger. This is intended for higher-end devices where applications are not expected to have a highly predictable resource usage. This is the default implementation when using the default `mbedtls_config.h` file, as is common on platforms such as Linux, starting with Mbed TLS 3.6.1.
|
||||
|
||||
#### Future improvement: merging the key store variants
|
||||
|
||||
In the future, we may reduce the number of key store variants to just two, perhaps even one.
|
||||
|
||||
We introduced the variants other than the hybrid key store in a patch release of a long-term support version. As a consequence, we wanted to minimize changes to the default build (when not using the supplied `mbedtls_config.h`, as explained above), to limit the risk of bugs and the increase in code size. These considerations will not apply in future major or minor releases, so the default key store can change later.
|
||||
|
||||
The static key store could become a runtime decision, where only keys larger than some threshold require the use of heap memory. The reasons not to do this in Mbed TLS 3.6.x are that this increases complexity somewhat (slightly more code size, and more risk), and this changes the RAM usage profile somewhat.
|
||||
|
||||
A major constraint on the design of the dynamic key store is the need to preserve slot pointers while a slot may be accessed by another thread (see [“Concurrency”](#concurrency)). With the concurrency primitives available in Mbed TLS 3.x, it is very hard to move a key slot in memory, because there could be an indefinite wait until some other thread has finished accessing the slot. This pushed towards the slice-based organisation described below, where each slice is allocated for the long term. In particular, slices cannot be compacted (compacting would be moving slots out of a sparsely-used slice to free it). Better concurrency primitives (e.g. condition variables or semaphores), together with a `realloc()` primitive, could allow freeing unused memory more aggressively, which could make the dynamic key store not detrimental in RAM usage compared to the historical hybrid key store.
|
||||
|
||||
#### Slice abstraction
|
||||
|
||||
Some parts of the key slot management code use **key slices** as an abstraction. A key slice is an array of key slots. Key slices are identified by an index which is a small non-negative integer.
|
||||
|
||||
* With a [static key store](#static-key-store), there is a single, statically allocated slice, with the index 0.
|
||||
* With a [dynamic key store](#dynamic-key-store), there is a statically allocated array of pointers to key slices. The index of a slice is the index in that array. The slices are allocated on the heap as needed.
|
||||
|
||||
#### Key identifiers and slot location
|
||||
|
||||
When creating a volatile key, the slice containing the slot and index of the slot in its slice determine the key identifier. When accessing a volatile key, the slice and the slot index in the slice are calculated from the key identifier. The encoding of the slot location in the volatile key identifier is different for a [static](#volatile-key-identifiers-in-the-static-key-store) or [dynamic](#volatile-key-identifiers-in-the-dynamic-key-store) key store.
|
||||
|
||||
### Static key store
|
||||
|
||||
The static key store is the historical implementation. The key store is a statically allocated array of slots, of size `MBEDTLS_PSA_KEY_SLOT_COUNT`. This value is an upper bound for the total number of volatile keys plus loaded keys.
|
||||
|
||||
Since Mbed TLS 3.6.3, there are two variants for the static key store: a hybrid variant (default), and a fully-static variant enabled by the configuration option `MBEDTLS_PSA_STATIC_KEY_SLOTS`. The two variants have the same key store management: the only difference is in how the memory for key data is managed. With fully static key slots, the key data is directly inside the slot, and limited to `MBEDTLS_PSA_STATIC_KEY_SLOT_BUFFER_SIZE` bytes. With the hybrid key store, the slot contains a pointer to the key data, which is allocated on the heap.
|
||||
|
||||
#### Volatile key identifiers in the static key store
|
||||
|
||||
For easy lookup, a volatile key whose identifier is `id` is stored at the index `id - PSA_KEY_ID_VOLATILE_MIN`.
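A minimal sketch of this mapping follows. The constants are placeholders chosen for the example (they stand in for `PSA_KEY_ID_VOLATILE_MIN` and `MBEDTLS_PSA_KEY_SLOT_COUNT` and are not the library's actual values), and the function names are hypothetical.

```C
#include <stddef.h>
#include <stdint.h>

/* Placeholder values, assumed for the example only. */
#define SKETCH_KEY_ID_VOLATILE_MIN 0x40000000u
#define SKETCH_KEY_SLOT_COUNT      32u

/* Map a volatile key identifier to its slot index.
 * Returns 0 on success, -1 if the identifier is out of range. */
int sketch_volatile_id_to_index(uint32_t id, size_t *index)
{
    if (id < SKETCH_KEY_ID_VOLATILE_MIN ||
        id - SKETCH_KEY_ID_VOLATILE_MIN >= SKETCH_KEY_SLOT_COUNT) {
        return -1;
    }
    *index = (size_t) (id - SKETCH_KEY_ID_VOLATILE_MIN);
    return 0;
}

/* Inverse mapping, used when a volatile key is created in a given slot. */
uint32_t sketch_index_to_volatile_id(size_t index)
{
    return SKETCH_KEY_ID_VOLATILE_MIN + (uint32_t) index;
}
```

Because the mapping is a plain offset, looking up a volatile key needs no search through the slot array.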
|
||||
|
||||
#### Key creation with a static key store
|
||||
|
||||
To create a key, `psa_reserve_free_key_slot()` searches the key slot array until it finds one that is empty. If there are none, the code looks for a persistent key that can be purged (see [“Persistent key cache”](#persistent-key-cache)), and purges it. If no slot is free and no slot contains a purgeable key, the key creation fails.
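The search order can be sketched as below. The slot layout and the notion of "purgeable" used here (a persistent key with no registered readers) are assumptions made for the example; the actual criteria are those of the persistent key cache.

```C
#include <stddef.h>

typedef struct {
    int occupied;              /* 0 = empty slot */
    int persistent;            /* loaded copy of a persistent key */
    size_t registered_readers; /* threads currently using the key */
} sketch_static_slot_t;

#define SKETCH_NO_SLOT ((size_t) -1)

/* Reserve a slot: prefer an empty one, otherwise purge a persistent key
 * that is not in use. Returns the slot index, or SKETCH_NO_SLOT. */
size_t sketch_reserve_free_slot(sketch_static_slot_t *slots, size_t count)
{
    size_t purgeable = SKETCH_NO_SLOT;

    for (size_t i = 0; i < count; i++) {
        if (!slots[i].occupied) {
            slots[i].occupied = 1;
            return i;
        }
        if (purgeable == SKETCH_NO_SLOT &&
            slots[i].persistent && slots[i].registered_readers == 0) {
            purgeable = i; /* remember the first candidate, keep looking */
        }
    }

    if (purgeable != SKETCH_NO_SLOT) {
        /* Purging only drops the in-memory copy; the key stays in storage.
         * The slot is then handed to the new key. */
        slots[purgeable].persistent = 0;
        slots[purgeable].registered_readers = 0;
        slots[purgeable].occupied = 1;
        return purgeable;
    }

    return SKETCH_NO_SLOT; /* key creation fails */
}
```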
|
||||
|
||||
#### Freeing a key slot with a static key store
|
||||
|
||||
With a static key store, `psa_wipe_key_slot()` destroys or purges a key by freeing any associated resources, then setting the key slot to the empty state. The slot is then ready for reuse.
|
||||
|
||||
### Dynamic key store
|
||||
|
||||
The dynamic key store allows a large number of keys, at the expense of more complex memory management.
|
||||
|
||||
The dynamic key store was added in Mbed TLS 3.6.1. It is enabled by `MBEDTLS_PSA_KEY_STORE_DYNAMIC`, which is enabled by default since Mbed TLS 3.6.1.
|
||||
|
||||
#### Dynamic key slot performance characteristics
|
||||
|
||||
Key management and key access have $O(1)$ amortized performance, and mostly $O(1)$ performance for actions involving keys. More precisely:
|
||||
|
||||
* Access to an existing volatile key takes $O(1)$ time.
|
||||
* Access to a persistent key (including creation and destruction) takes time that is linear in `MBEDTLS_PSA_KEY_SLOT_COUNT`.
|
||||
* Allocating a key takes amortized $O(1)$ time. Usually the time is $O(s)$ where $s$ is the number of slices (which is a hard-coded value less than $30$), but when creating $k$ volatile keys, at most $\log(k)$ creations will involve calls to `calloc()`, totalling $O(k)$ memory.
|
||||
* Destroying a volatile key takes $O(1)$ time as of Mbed TLS 3.6.1. Later improvements to memory consumption are likely to involve calls to `free()` which may total $O(k)$ memory where $k$ is the maximum number of volatile keys.
|
||||
|
||||
#### Key slices in the dynamic key store
|
||||
|
||||
The key store is organized in slices, which are dynamically allocated arrays of key slots. The number of slices is determined at compile time. The key store contains a static array of pointers to slices.
|
||||
|
||||
Volatile keys and loaded keys (persistent or built-in) are stored in separate slices.
|
||||
Key slices number 0 to `KEY_SLOT_VOLATILE_SLICE_COUNT - 1` contain only volatile keys.
|
||||
One key slice contains only loaded keys: that key slice is thus the cache slice. See [“Persistent key cache”](#persistent-key-cache) for how the cache is managed.
|
||||
|
||||
#### Volatile key identifiers in the dynamic key store
|
||||
|
||||
A volatile key identifier encodes the slice index and the slot index at separate bit positions. That is, `key_id = BASE | slice_index | slot_index` where the bits set in `BASE`, `slice_index` and `slot_index` do not overlap.
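As an illustration of this encoding, the sketch below assumes a 25-bit slot index in the low bits, a 5-bit slice index directly above it, and a base value of 0x40000000. Only the 25-bit width is taken from this document; the other two values and all names are assumptions made for the example.

```C
#include <stdint.h>

/* Assumed layout: [ base | 5-bit slice index | 25-bit slot index ] */
#define SKETCH_VOLATILE_BASE    0x40000000u
#define SKETCH_SLOT_INDEX_BITS  25
#define SKETCH_SLOT_INDEX_MASK  ((1u << SKETCH_SLOT_INDEX_BITS) - 1u)
#define SKETCH_SLICE_INDEX_MASK 0x1fu

/* Build a volatile key identifier from a slot location. */
uint32_t sketch_volatile_key_id(uint32_t slice_index, uint32_t slot_index)
{
    return SKETCH_VOLATILE_BASE |
           (slice_index << SKETCH_SLOT_INDEX_BITS) |
           slot_index;
}

/* Recover the slice index from a volatile key identifier. */
uint32_t sketch_slice_index_of(uint32_t key_id)
{
    return (key_id >> SKETCH_SLOT_INDEX_BITS) & SKETCH_SLICE_INDEX_MASK;
}

/* Recover the slot index within the slice. */
uint32_t sketch_slot_index_of(uint32_t key_id)
{
    return key_id & SKETCH_SLOT_INDEX_MASK;
}
```

Since the three bit fields do not overlap, decoding is a shift and a mask, which is what makes access to an existing volatile key constant-time.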
|
||||
|
||||
#### From key slot to key slice
|
||||
|
||||
Some parts of the slot management code need to determine which key slice contains a key slot when given a pointer to the key slot. In principle, the key slice is uniquely determined from the key identifier which is located in the slot:
|
||||
|
||||
* for a volatile key identifier, the [slice index is encoded in the key identifier](#volatile-key-identifiers-in-the-dynamic-key-store);
|
||||
* for a persistent key identifier or built-in key identifier, [the slot is in the sole cache slice](#key-slices-in-the-dynamic-key-store).
|
||||
|
||||
Nonetheless, we store the slice index as a field in the slot, for two reasons:
|
||||
|
||||
* It is more robust in case the slice assignment becomes more complex in the future or is somehow buggy.
|
||||
* It allows the slot to slice correspondence to work even if the key identifier field has not been filled yet or has been wiped. The implementation in Mbed TLS 3.6.1 requires this because `psa_wipe_key_slot()` wipes the slot, then calls `psa_free_key_slot()`, which needs to determine the slice. Keeping the slice index as a separate field allows us to better separate the concerns of key liveness and slot liveness. A redesign of the internal interfaces could improve this, but would be too disruptive in the 3.6 LTS branch.
|
||||
|
||||
#### Length of the volatile key slices
|
||||
|
||||
The volatile key slices have exponentially increasing length: each slice is twice as long as the previous one. Thus if the length of slice 0 is `B` and there are `N` slices, then there are `B * (2^N - 1)` slots.
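The slot count can be checked with a few lines of code. The function names below are hypothetical, and the base length of 16 used in the usage note is an arbitrary example value.

```C
#include <stddef.h>

/* Length of slice number slice_index when slice 0 has base_length slots. */
size_t sketch_slice_length(size_t base_length, unsigned slice_index)
{
    return base_length << slice_index; /* B * 2^k */
}

/* Total number of slots over the first slice_count slices. */
size_t sketch_total_slots(size_t base_length, unsigned slice_count)
{
    size_t total = 0;
    for (unsigned k = 0; k < slice_count; k++) {
        total += sketch_slice_length(base_length, k);
    }
    return total; /* equals B * (2^N - 1) */
}
```

For example, with a base length of 16 and 3 slices, the lengths are 16, 32 and 64, for a total of 112 = 16 * (2^3 - 1) slots.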
|
||||
|
||||
As of Mbed TLS 3.6.1, the maximum number of volatile key slots is less than the theoretical maximum of 2^30 - 2^16 (0x40000000..0x7ffeffff, the largest range of key identifiers reserved for the PSA Crypto implementation that does not overlap the range for built-in keys). The reason is that we limit the slot index to 2^25-1 so that the [encoding of volatile key identifiers](#volatile-key-identifiers-in-the-dynamic-key-store) has 25 bits for the slot index.
|
||||
|
||||
When `MBEDTLS_TEST_HOOKS` is enabled, the length of key slices can be overridden. We use this in tests that need to fill the key store.
|
||||
|
||||
#### Free list
|
||||
|
||||
Each volatile key slice has a **free list**. This is a linked list of all the slots in the slice that are free. The global data contains a static array of free list heads, i.e. the index of a free slot in the slice. Each free slot contains the index of the next free slot in that slice's free list. The end of the list is indicated by an index that is larger than the length of the slice. If the list is empty, the head contains an index that is larger than the length.
|
||||
|
||||
As a small optimization, a free slot does not actually contain the index of the next slot, but the index of the next free slot on the list _relative to the next slot in the array_. For example, 0 indicates that the next free slot is the slot immediately after the current slot. This fact is the reason for the encoding: a slice freshly obtained from `calloc` has all of its slots in the free list in order. The value 1 indicates that there is one element between this slot and the next free slot. The next element of the free list can come before the current slot: -2 indicates that it's the slot immediately before, -3 is two slots before, and so on (-1 is impossible). In general, the absolute index of the next slot after slot `i` in the free list is `i + 1 + slice[i].next_free_relative_to_next`.
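The relative encoding can be exercised with a small self-contained model of one slice. This is a sketch, not the library code: the structure and function names are hypothetical, and an index greater than or equal to the slice length is treated as the end-of-list marker.

```C
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

typedef struct {
    int occupied;
    int32_t next_free_relative_to_next; /* meaningful only while free */
} sketch_fl_slot_t;

typedef struct {
    sketch_fl_slot_t *slots;
    size_t length;
    size_t first_free; /* >= length means the free list is empty */
} sketch_fl_slice_t;

/* A zero-filled slice is already a valid free list: 0 -> 1 -> 2 -> ... */
int sketch_slice_init(sketch_fl_slice_t *slice, size_t length)
{
    slice->slots = calloc(length, sizeof(*slice->slots));
    if (slice->slots == NULL) {
        return -1;
    }
    slice->length = length;
    slice->first_free = 0;
    return 0;
}

/* Pop the head of the free list. Returns slice->length if the slice is full. */
size_t sketch_slot_alloc(sketch_fl_slice_t *slice)
{
    size_t i = slice->first_free;
    if (i >= slice->length) {
        return slice->length;
    }
    /* Absolute index of the next free slot: i + 1 + relative offset. */
    slice->first_free =
        (size_t) ((ptrdiff_t) i + 1 + slice->slots[i].next_free_relative_to_next);
    slice->slots[i].occupied = 1;
    slice->slots[i].next_free_relative_to_next = 0;
    return i;
}

/* Push slot i onto the head of the free list. */
void sketch_slot_free(sketch_fl_slice_t *slice, size_t i)
{
    slice->slots[i].occupied = 0;
    slice->slots[i].next_free_relative_to_next =
        (int32_t) ((ptrdiff_t) slice->first_free - (ptrdiff_t) i - 1);
    slice->first_free = i;
}
```

With a slice of length 4, allocating three slots and then freeing slots 0 and 2 in that order leaves slot 2 at the head with a relative offset of -3, meaning that the next free slot is two slots before it (slot 0).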
|
||||
|
||||
#### Dynamic key slot allocation
|
||||
|
||||
To create a volatile key, `psa_reserve_free_key_slot()` searches the free lists of each allocated slice until it finds a slice that is not full. If all allocated slices are full, the code allocates a new slice at the lowest possible slice index. If all possible slices are already allocated and full, the key creation fails.
|
||||
|
||||
The newly allocated slot is removed from the slice's free list.
|
||||
|
||||
We only allocate a slice of size `B * 2^k` if there are already `B * (2^k - 1)` occupied slots. Thus the memory overhead is at most `B` slots plus the number of occupied slots, i.e. the memory consumption for slots is at most twice the required memory plus a small constant overhead.
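The bound can be worked out as follows, writing $n$ for the number of occupied slots at the moment slice $k$ is allocated and the new key is stored in it (so $n \ge B(2^k - 1) + 1$). Since slices are never deallocated, $n$ should be read as the peak occupancy.

$$
\text{allocated} = B\,(2^{k+1} - 1) = 2\,B\,(2^k - 1) + B < 2n + B,
\qquad
\text{allocated} - n \le B \cdot 2^k - 1 < n + B .
$$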
|
||||
|
||||
#### Dynamic key slot deallocation
|
||||
|
||||
When destroying a volatile key, `psa_wipe_key_slot()` calls `psa_free_key_slot()`. This function adds the newly freed slot to the head of the free list.
|
||||
|
||||
##### Future improvement: slice deallocation
|
||||
|
||||
As of Mbed TLS 3.6.1, `psa_free_key_slot()` does not deallocate slices. Thus the memory consumption for slots never decreases (except when the PSA crypto subsystem is deinitialized). Freeing key slices intelligently would be a desirable improvement.
|
||||
|
||||
We should not free a key slice as soon as it becomes empty, because that would cause large allocations and deallocations if there are slices full of long-lived keys, and then one slice keeps being allocated and deallocated for the occasional short-lived keys. Rather, there should be some hysteresis, e.g. only deallocate a slice if there are at least T free slots in the previous slice. [#9435](https://github.com/Mbed-TLS/mbedtls/issues/9435)
|
||||
|
||||
Note that currently, the slice array contains one sequence of allocated slices followed by one sequence of unallocated slices. Mixing allocated and unallocated slices may make some parts of the code a little more complex, and should be tested thoroughly.
|
||||
|
||||
### Persistent key cache
|
||||
|
||||
Persistent keys and built-in keys need to be loaded into the in-memory key store each time they are accessed:
|
||||
|
||||
* while creating them;
|
||||
* to access their metadata;
|
||||
* to start performing an operation with the key;
|
||||
* when destroying the key.
|
||||
|
||||
To avoid frequent storage access, we cache persistent keys in memory. This cache also applies to built-in keys.
|
||||
|
||||
With the [static key store](#static-key-store), a non-empty slot can contain either a volatile key or a cache entry for a persistent or built-in key. With the [dynamic key store](#dynamic-key-store), volatile keys and cached keys are placed in separate [slices](#key-slices-in-the-dynamic-key-store).
|
||||
|
||||
The persistent key cache is a fixed-size array of `MBEDTLS_PSA_KEY_SLOT_COUNT` slots. In the static key store, this array is shared with volatile keys. In the dynamic key store, the cache is a separate array that does not contain volatile keys.
|
||||
|
||||
#### Accessing a persistent key
|
||||
|
||||
`psa_get_and_lock_key_slot()` automatically loads persistent and built-in keys if the specified key identifier is in the corresponding range. To that effect, it traverses the key cache to see if a key with the given identifier is already loaded. If not, it loads the key. This cache walk takes time that is proportional to the cache size.
|
||||
|
||||
#### Cache eviction
|
||||
|
||||
A key slot must be allocated in the cache slice:
|
||||
|
||||
* to create a volatile key (static key store only);
|
||||
* to create a persistent key;
|
||||
* to load a persistent or built-in key.
|
||||
|
||||
If the cache slice is full, the code will try to evict an entry. Only slots that do not have readers can be evicted (see [“Concurrency”](#concurrency)). In the static key store, slots containing volatile keys cannot be evicted.
|
||||
|
||||
As of Mbed TLS 3.6.1, there is no tracking of a key's usage frequency or age. The slot eviction code picks the first evictable slot it finds in its traversal order. We have not reasoned about or experimented with different strategies.
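The "first evictable slot" policy can be sketched as below. This is an illustration under assumptions, not the library's code: `demo_cache_slot_t` and `demo_pick_slot` are hypothetical, and the `readers` field stands in for whatever reader tracking the concurrency design uses.

```c
#include <stddef.h>

/* Hypothetical cache slot with only the fields the policy looks at. */
typedef struct {
    int occupied;        /* nonzero if the slot holds a key */
    int is_volatile;     /* volatile keys exist only in memory: never evict */
    unsigned readers;    /* number of in-progress users of the slot */
} demo_cache_slot_t;

/* Return the index of a slot to use for a new cache entry, or n if none
 * is available. A free slot is preferred; otherwise the first slot that
 * has no readers and does not hold a volatile key is chosen. There is no
 * notion of age or usage frequency. */
static size_t demo_pick_slot(const demo_cache_slot_t *cache, size_t n)
{
    size_t candidate = n;
    for (size_t i = 0; i < n; i++) {
        if (!cache[i].occupied) {
            return i;    /* free slot: no eviction needed */
        }
        if (candidate == n && !cache[i].is_volatile && cache[i].readers == 0) {
            candidate = i;
        }
    }
    return candidate;
}
```

A caller that gets back an occupied slot index would wipe the cached copy (the key material stays in storage) before loading the new key into it.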
|
@@ -1,685 +0,0 @@
|
||||
PSA API functions and shared memory
|
||||
===================================
|
||||
|
||||
## Introduction
|
||||
|
||||
This document discusses the security architecture of systems where PSA API functions might receive arguments that are in memory that is shared with an untrusted process. On such systems, the untrusted process might access a shared memory buffer while the cryptography library is using it, and thus cause unexpected behavior in the cryptography code.
|
||||
|
||||
### Core assumptions
|
||||
|
||||
We assume the following scope limitations:
|
||||
|
||||
* Only PSA Crypto API functions are in scope (including Mbed TLS extensions to the official API specification). Legacy crypto, X.509, TLS, or any other function which is not called `psa_xxx` is out of scope.
|
||||
* We only consider [input buffers](https://arm-software.github.io/psa-api/crypto/1.1/overview/conventions.html#input-buffer-sizes) and [output buffers](https://arm-software.github.io/psa-api/crypto/1.1/overview/conventions.html#output-buffer-sizes). Any other data is assumed to be in non-shared memory.
|
||||
|
||||
## System architecture discussion
|
||||
|
||||
### Architecture overview
|
||||
|
||||
We consider a system that has memory separation between partitions: a partition can't access another partition's memory directly. Partitions are meant to be isolated from each other: a partition may only affect the integrity of another partition via well-defined system interfaces. For example, this can be a Unix/POSIX-like system that isolates processes, or isolation between the secure world and the non-secure world relying on a mechanism such as TrustZone, or isolation between secure-world applications on such a system.
|
||||
|
||||
More precisely, we consider such a system where our PSA Crypto implementation is running inside one partition, called the **crypto service**. The crypto service receives remote procedure calls (RPC) from other partitions, validates their arguments (e.g. validation of key identifier ownership), and calls a PSA Crypto API function. This document is concerned with environments where the arguments passed to a PSA Crypto API function may be in shared memory (as opposed to environments where the inputs are always copied into memory that is solely accessible by the crypto service before calling the API function, and likewise with output buffers after the function returns).
|
||||
|
||||
When the data is accessible to another partition, there is a risk that this other partition will access it while the crypto implementation is working. Although this could be prevented by suspending the whole system while crypto is working, such a limitation is rarely desirable and most systems don't offer a way to do it. (Even systems that have absolute thread priorities, and where crypto has a higher priority than any untrusted partition, may be vulnerable due to having multiple cores or asynchronous data transfers with peripherals.)
|
||||
|
||||
The crypto service must guarantee that it behaves as if the rest of the world was suspended while it is executed. A behavior that is only possible if an untrusted entity accesses a buffer while the crypto service is processing the data is a security violation.
|
||||
|
||||
### Risks and vulnerabilities
|
||||
|
||||
We consider a security architecture with two or three entities:
|
||||
|
||||
* a crypto service, which offers PSA crypto API calls over RPC (remote procedure call) using shared memory for some input or output arguments;
|
||||
* a client of the crypto service, which makes an RPC to the crypto service;
|
||||
* in some scenarios, a client of the client, which makes an RPC to the crypto client, which re-shares the memory with the crypto service.
|
||||
|
||||
The behavior of an RPC is defined in terms of the values of its inputs and outputs. This models an ideal world where the content of input and output buffers is not accessible outside the crypto service while it is processing an RPC. It is a security violation if the crypto service behaves in a way that cannot be achieved by setting the inputs before the RPC call, and reading the outputs after the RPC call is finished.
|
||||
|
||||
#### Read-read inconsistency
|
||||
|
||||
If an input argument is in shared memory, there is a risk of a **read-read inconsistency**:
|
||||
|
||||
1. The crypto code reads part of the input and validates it, or injects it into a calculation.
|
||||
2. The client (or client's client) modifies the input.
|
||||
3. The crypto code reads the same part again, and performs an action which would be impossible if the input had had the same value all along.
|
||||
|
||||
Vulnerability example (parsing): suppose the input contains data with a type-length-value or length-value encoding (for example, importing an RSA key). The crypto code reads the length field and checks that it fits within the buffer. (This could be the length of the overall data, or the length of an embedded field.) Later, the crypto code reads the length again and uses it without validation. A malicious client can modify the length field in the shared memory between the two reads and thus cause a buffer overread on the second read.
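A minimal sketch of the safe pattern for the parsing example, assuming a toy format of one length byte followed by the value. The function name `demo_parse_lv` is hypothetical. The length is fetched from shared memory exactly once into a local variable, and only that local copy is validated and used.

```c
#include <stddef.h>
#include <stdint.h>

/* Parse a toy [1-byte length][value] encoding located in shared memory.
 * Returns the value length copied to out, or -1 on error. */
static int demo_parse_lv(const volatile uint8_t *shared, size_t shared_size,
                         uint8_t *out, size_t out_size)
{
    if (shared_size < 1) {
        return -1;
    }
    /* Read the length once, into memory the client cannot modify. */
    size_t len = shared[0];
    if (len > shared_size - 1 || len > out_size) {
        return -1;
    }
    /* Use the validated local copy; shared[0] is never read again, so a
     * concurrent change to it cannot cause an overread. */
    for (size_t i = 0; i < len; i++) {
        out[i] = shared[1 + i];
    }
    return (int) len;
}
```

The vulnerable variant would differ only in re-reading `shared[0]` as the loop bound after the check.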
|
||||
|
||||
Vulnerability example (dual processing): consider an RPC to perform authenticated encryption, using a mechanism with an encrypt-and-MAC structure. The authenticated encryption implementation separately calculates the ciphertext and the MAC from the plaintext. A client sets the plaintext input to `"PPPP"`, then starts the RPC call, then changes the input buffer to `"QQQQ"` while the crypto service is working.
|
||||
|
||||
* Any of `enc("PPPP")+mac("PPPP")`, `enc("PPQQ")+mac("PPQQ")` or `enc("QQQQ")+mac("QQQQ")` are valid outputs: they are outputs that can be produced by this authenticated encryption RPC.
|
||||
* If the authenticated encryption calculates the ciphertext before the client changes the input buffer and calculates the MAC after that change, reading the input buffer again each time, the output will be `enc("PPPP")+mac("QQQQ")`. There is no input that can lead to this output, hence this behavior violates the security guarantees of the crypto service.
|
||||
|
||||
#### Write-read inconsistency
|
||||
|
||||
If an output argument is in shared memory, there is a risk of a **write-read inconsistency**:
|
||||
|
||||
1. The crypto code writes some intermediate data into the output buffer.
|
||||
2. The client (or client's client) modifies the intermediate data.
|
||||
3. The crypto code reads the intermediate data back and continues the calculation, leading to an outcome that would not be possible if the intermediate data had not been modified.
|
||||
|
||||
Vulnerability example: suppose that an RSA signature function works by formatting the data in place in the output buffer, then applying the RSA private-key operation in place. (This is how `mbedtls_rsa_pkcs1_sign` works.) A malicious client may write badly formatted data into the buffer, so that the private-key operation is not a valid signature (e.g. it could be a decryption), violating the RSA key's usage policy.
|
||||
|
||||
Vulnerability example with chained calls: we consider the same RSA signature operation as before. In this example, we additionally assume that the data to sign comes from an attestation application which signs some data on behalf of a final client: the key and the data to sign are under the attestation application's control, and the final client must not be able to obtain arbitrary signatures. The final client shares an output buffer for the signature with the attestation application, and the attestation application re-shares this buffer with the crypto service. A malicious final client can modify the intermediate data and thus sign arbitrary data.
|
||||
|
||||
#### Write-write disclosure
|
||||
|
||||
If an output argument is in shared memory, there is a risk of a **write-write disclosure**:
|
||||
|
||||
1. The crypto code writes some intermediate data into the output buffer. This intermediate data must remain confidential.
|
||||
2. The client (or client's client) reads the intermediate data.
|
||||
3. The crypto code overwrites the intermediate data.
|
||||
|
||||
Vulnerability example with chained calls (temporary exposure): an application encrypts some data, and lets its clients store the ciphertext. Clients may not have access to the plaintext. To save memory, when it calls the crypto service, it passes an output buffer that is in the final client's memory. Suppose the encryption mechanism works by copying its input to the output buffer then encrypting in place (for example, to simplify considerations related to overlap, or because the implementation relies on a low-level API that works in place). In this scenario, the plaintext is exposed to the final client while the encryption is in progress, which violates the confidentiality of the plaintext.
|
||||
|
||||
Vulnerability example with chained calls (backtrack): we consider a provisioning application that provides a data encryption service on behalf of multiple clients, using a single shared key. Clients are not allowed to access each other's data. The provisioning application isolates clients by including the client identity in the associated data. Suppose that an AEAD decryption function processes the ciphertext incrementally by simultaneously writing the plaintext to the output buffer and calculating the tag. (This is how AEAD decryption usually works.) At the end, if the tag is wrong, the decryption function wipes the output buffer. Assume that the output buffer for the plaintext is shared from the client to the provisioning application, which re-shares it with the crypto service. A malicious client can read another client (the victim)'s encrypted data by passing the ciphertext to the provisioning application, which will attempt to decrypt it with associated data identifying the requesting client. Although the operation will fail because the tag is wrong, the malicious client can still read the victim's plaintext from the output buffer before it is wiped.
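The backtrack scenario is avoided if unverified plaintext never reaches the shared output buffer. The sketch below shows only that control flow. The "cipher" is a deliberately toy XOR with a one-byte additive tag, which is not cryptography, and `demo_decrypt_private` is a hypothetical name.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Toy "AEAD" decryption: XOR with key, tag = key + sum of ciphertext
 * bytes (mod 256). NOT real cryptography; illustration of buffer
 * handling only. Returns 0 on success, -1 on failure. */
static int demo_decrypt_private(uint8_t key,
                                const uint8_t *ciphertext, size_t len,
                                uint8_t expected_tag,
                                uint8_t *shared_out)
{
    /* Plaintext is produced in memory owned by the crypto service. */
    uint8_t *local = malloc(len ? len : 1);
    if (local == NULL) {
        return -1;
    }
    uint8_t tag = key;
    for (size_t i = 0; i < len; i++) {
        uint8_t c = ciphertext[i];     /* each input byte is read once */
        tag = (uint8_t) (tag + c);
        local[i] = (uint8_t) (c ^ key);
    }
    int ok = (tag == expected_tag);
    if (ok) {
        /* Only verified plaintext is written to the shared buffer. */
        memcpy(shared_out, local, len);
    }
    /* Real code would use a wipe that cannot be optimized away, such as
     * mbedtls_platform_zeroize(). */
    memset(local, 0, len);
    free(local);
    return ok ? 0 : -1;
}
```

When the tag is wrong, the shared output buffer is left untouched, so there is nothing for a concurrent reader to observe.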
|
||||
|
||||
#### Write-read feedback
|
||||
|
||||
If a function has both an input argument and an output argument in shared memory, and processes its input incrementally to emit output incrementally, the following sequence of events is possible:
|
||||
|
||||
1. The crypto code processes part of the input and writes the corresponding part of the output.
|
||||
2. The client reads the early output and uses that to calculate the next part of the input.
|
||||
3. The crypto code processes the rest of the input.
|
||||
|
||||
There are cryptographic mechanisms for which this breaks security properties. An example is [CBC encryption](https://link.springer.com/content/pdf/10.1007/3-540-45708-9_2.pdf): if the client can choose the content of a plaintext block after seeing the immediately preceding ciphertext block, this gives the client a decryption oracle. This is a security violation if the key policy only allowed the client to encrypt, not to decrypt.
|
||||
|
||||
TODO: is this a risk we want to take into account? Although this extends the possible behaviors of the one-shot interface, the client can do the same thing legitimately with the multipart interface.
|
||||
|
||||
### Possible countermeasures
|
||||
|
||||
In this section, we briefly discuss generic countermeasures.
|
||||
|
||||
#### Copying
|
||||
|
||||
Copying is a valid countermeasure. It is conceptually simple. However, it is often unattractive because it requires additional memory and time.
|
||||
|
||||
Note that although copying is very easy to write into a program, there is a risk that a compiler (especially with whole-program optimization) may optimize the copy away, if it does not understand that copies between shared memory and non-shared memory are semantically meaningful.
|
||||
|
||||
Example: the PSA Firmware Framework 1.0 forbids shared memory between partitions. This restriction is lifted in version 1.1 due to concerns over RAM usage.
|
||||
|
||||
#### Careful accesses
|
||||
|
||||
The following rules guarantee that shared memory cannot result in a security violation other than [write-read feedback](#write-read-feedback):
|
||||
|
||||
* Never read the same input twice at the same index.
|
||||
* Never read back from an output.
|
||||
* Never write to the output twice at the same index.
|
||||
* This rule can usefully be relaxed in many circumstances. It is ok to write data that is independent of the inputs (and not otherwise confidential), then overwrite it. For example, it is ok to zero the output buffer before starting to process the input.
|
||||
|
||||
These rules are very difficult to enforce.
|
||||
|
||||
Example: these are the rules that a GlobalPlatform TEE Trusted Application (application running on the secure side of TrustZone on Cortex-A) must follow.
|
||||
|
||||
## Protection requirements
|
||||
|
||||
### Responsibility for protection
|
||||
|
||||
A call to a crypto service to perform a crypto operation involves the following components:
|
||||
|
||||
1. The remote procedure call framework provided by the operating system.
|
||||
2. The code of the crypto service.
|
||||
3. The code of the PSA Crypto dispatch layer (also known as the core), which is provided by Mbed TLS.
|
||||
4. The driver implementing the cryptographic mechanism, which may be provided by Mbed TLS (built-in driver) or by a third-party driver.
|
||||
|
||||
The [PSA Crypto API specification](https://arm-software.github.io/psa-api/crypto/1.1/overview/conventions.html#stability-of-parameters) puts the responsibility for protection on the implementation of the PSA Crypto API, i.e. (3) or (4).
|
||||
|
||||
> In an environment with multiple threads or with shared memory, the implementation carefully accesses non-overlapping buffer parameters in order to prevent any security risk resulting from the content of the buffer being modified or observed during the execution of the function. (...)
|
||||
|
||||
In Mbed TLS 2.x and 3.x up to and including 3.5.0, there is no defense against buffers in shared memory. The responsibility shifts to (1) or (2), but this is not documented.
|
||||
|
||||
In the remainder of this chapter, we will discuss how to implement this high-level requirement where it belongs: inside the implementation of the PSA Crypto API. Note that this allows two possible levels: in the dispatch layer (independently of the implementation of each mechanism) or in the driver (specific to each implementation).
|
||||
|
||||
#### Protection in the dispatch layer
|
||||
|
||||
The dispatch layer has no control over how the driver layer will access buffers. Therefore the only possible protection method at this layer is to ensure that drivers have no access to shared memory. This means that any buffer located in shared memory must be copied into or out of a buffer in memory owned by the crypto service (heap or stack). This adds inefficiency, mostly in terms of RAM usage.
|
||||
|
||||
For buffers with a small static size limit, this is something we often do for convenience, especially with output buffers. However, as of Mbed TLS 3.5.0, it is not done systematically.
|
||||
|
||||
It is ok to skip the copy if it is known for sure that a buffer is not in shared memory. However, the location of the buffer is not under the control of Mbed TLS. This means skipping the copy would have to be a compile-time or run-time option which has to be set by the application using Mbed TLS. This is both an additional maintenance cost (more code to analyze, more testing burden), and a residual security risk in case the party who is responsible for setting this option does not set it correctly. As a consequence, Mbed TLS will not offer this configurability unless there is a compelling argument.
|
||||
|
||||
#### Protection in the driver layer
|
||||
|
||||
Putting the responsibility for protection in the driver layer increases the overall amount of work since there are more driver implementations than dispatch implementations. (This is true even inside Mbed TLS: almost all API functions have multiple underlying implementations, one for each algorithm.) It also increases the risk to the ecosystem since some drivers might not protect correctly. Therefore having drivers be responsible for protection is only a good choice if there is a definite benefit to it, compared to allocating an internal buffer and copying. An expected benefit in some cases is that there are practical protection methods other than copying.
|
||||
|
||||
Some cryptographic mechanisms are naturally implemented by processing the input in a single pass, with a low risk of ever reading the same byte twice, and by writing the final output directly into the output buffer. For such mechanisms, it is sensible to mandate that drivers respect these rules.
|
||||
|
||||
In the next section, we will analyze how susceptible various cryptographic mechanisms are to shared memory vulnerabilities.
|
||||
|
||||
### Susceptibility of different mechanisms
|
||||
|
||||
#### Operations involving small buffers
|
||||
|
||||
For operations involving **small buffers**, the cost of copying is low. For many of those, the risk of not copying is high:
|
||||
|
||||
* Any parsing of formatted data has a high risk of [read-read inconsistency](#read-read-inconsistency).
|
||||
* An internal review shows that for RSA operations, it is natural for an implementation to have a [write-read inconsistency](#write-read-inconsistency) or a [write-write disclosure](#write-write-disclosure).
|
||||
|
||||
Note that in this context, a “small buffer” is one with a size limit that is known at compile time, and small enough that copying the data is not prohibitive. For example, an RSA key fits in a small buffer. A hash input is not a small buffer, even if it happens to be only a few bytes long in one particular call.
|
||||
|
||||
The following buffers are considered small buffers:
|
||||
|
||||
* Any input or output directly related to asymmetric cryptography (signature, encryption/decryption, key exchange, PAKE), including key import and export.
|
||||
* Note that this does not include inputs or outputs that are not processed by an asymmetric primitive, for example the message input to `psa_sign_message` or `psa_verify_message`.
|
||||
* Cooked key derivation output.
|
||||
* The output of a hash or MAC operation.
|
||||
|
||||
**Design decision: the dispatch layer shall copy all small buffers**.
|
||||
|
||||
#### Symmetric cryptography inputs with small output
|
||||
|
||||
Message inputs to hash, MAC and key derivation operations are at a low risk of [read-read inconsistency](#read-read-inconsistency) because they are unformatted data, and for all specified algorithms, it is natural to process the input one byte at a time.
|
||||
|
||||
**Design decision: require symmetric cryptography drivers to read their input without a risk of read-read inconsistency**.
|
||||
|
||||
TODO: what about IV/nonce inputs? They are typically small, but don't necessarily have a static size limit (e.g. GCM recommends a 12-byte nonce, but also allows large nonces).
|
||||
|
||||
#### Key derivation outputs
|
||||
|
||||
Key derivation typically emits its output as a stream, with no error condition detected after setup other than operational failures (e.g. communication failure with an accelerator) or running out of data to emit (which can easily be checked before emitting any data, since the data size is known in advance).
|
||||
|
||||
(Note that this is about raw byte output, not about cooked key derivation, i.e. deriving a structured key, which is considered a [small buffer](#operations-involving-small-buffers).)
|
||||
|
||||
**Design decision: require key derivation drivers to emit their output without reading back from the output buffer**.
|
||||
|
||||
#### Cipher and AEAD
|
||||
|
||||
AEAD decryption is at risk of [write-write disclosure](#write-write-disclosure) when the tag does not match.
|
||||
|
||||
AEAD encryption and decryption are at risk of [read-read inconsistency](#read-read-inconsistency) if they process the input multiple times, which is natural in a number of cases:
|
||||
|
||||
* when encrypting with an encrypt-and-authenticate or authenticate-then-encrypt structure (one read to calculate the authentication tag and another read to encrypt);
|
||||
* when decrypting with an encrypt-then-authenticate structure (one read to decrypt and one read to calculate the authentication tag);
|
||||
* with SIV modes (not yet present in the PSA API, but likely to come one day) (one full pass to calculate the IV, then another full pass for the core authenticated encryption);
|
||||
|
||||
Cipher and AEAD outputs are at risk of [write-read inconsistency](#write-read-inconsistency) and [write-write disclosure](#write-write-disclosure) if they are implemented by copying the input into the output buffer with `memmove`, then processing the data in place. In particular, this approach makes it easy to fully support overlapping, since `memmove` will take care of overlapping cases correctly, which is otherwise hard to do portably (C99 does not offer an efficient, portable way to check whether two buffers overlap).
|
||||
|
||||
**Design decision: the dispatch layer shall allocate an intermediate buffer for cipher and AEAD plaintext/ciphertext inputs and outputs**.
|
||||
|
||||
Note that this can be a single buffer for the input and the output if the driver supports in-place operation (which it is supposed to, since it is supposed to support arbitrary overlap, although this is not always the case in Mbed TLS, a [known issue](https://github.com/Mbed-TLS/mbedtls/issues/3266)). A side benefit of doing this intermediate copy is that overlap will be supported.
|
||||
|
||||
For all currently implemented AEAD modes, the associated data is only processed once to calculate an intermediate value of the authentication tag.
|
||||
|
||||
**Design decision: for now, require AEAD drivers to read the additional data without a risk of read-read inconsistency**. Make a note to revisit this when we start supporting an SIV mode, at which point the dispatch layer shall copy the input for modes that are not known to be low-risk.
|
||||
|
||||
#### Message signature
|
||||
|
||||
For signature algorithms with a hash-and-sign framework, the input to sign/verify-message is passed to a hash, and thus can follow the same rules as [symmetric cryptography inputs with small output](#symmetric-cryptography-inputs-with-small-output). This is also true for `PSA_ALG_RSA_PKCS1V15_SIGN_RAW`, which is the only non-hash-and-sign signature mechanism implemented in Mbed TLS 3.5. This is not true for PureEdDSA (`PSA_ALG_PURE_EDDSA`), which is not yet implemented: [PureEdDSA signature](https://www.rfc-editor.org/rfc/rfc8032#section-5.1.6) processes the message twice. (However, PureEdDSA verification only processes the message once.)
|
||||
|
||||
**Design decision: for now, require sign/verify-message drivers to read their input without a risk of read-read inconsistency**. Make a note to revisit this when we start supporting PureEdDSA, at which point the dispatch layer shall copy the input for algorithms such as PureEdDSA that are not known to be low-risk.
|
||||
|
||||
## Design of shared memory protection
|
||||
|
||||
This section explains how Mbed TLS implements the shared memory protection strategy summarized below.
|
||||
|
||||
### Shared memory protection strategy
|
||||
|
||||
* The core (dispatch layer) shall make a copy of the following buffers, so that drivers do not receive arguments that are in shared memory:
|
||||
* Any input or output from asymmetric cryptography (signature, encryption/decryption, key exchange, PAKE), including key import and export.
|
||||
* Plaintext/ciphertext inputs and outputs for cipher and AEAD.
|
||||
* The output of a hash or MAC operation.
|
||||
* Cooked key derivation output.
|
||||
|
||||
* A document shall explain the requirements on drivers for arguments whose access needs to be protected:
|
||||
* Hash and MAC input.
|
||||
* Cipher/AEAD IV/nonce (to be confirmed).
|
||||
* AEAD associated data (to be confirmed).
|
||||
* Key derivation input (excluding key agreement).
|
||||
* Raw key derivation output (excluding cooked key derivation output).
|
||||
|
||||
* The built-in implementations of cryptographic mechanisms with arguments whose access needs to be protected shall protect those arguments.
|
||||
|
||||
Justification: see “[Susceptibility of different mechanisms](#susceptibility-of-different-mechanisms)”.
|
||||
|
||||
### Implementation of copying
|
||||
|
||||
Copy what needs copying. This is broadly straightforward, however there are a few things to consider.
|
||||
|
||||
#### Compiler optimization of copies
|
||||
|
||||
It is unclear whether the compiler will attempt to optimize away copying operations.
|
||||
|
||||
Once the copying code is implemented, it should be evaluated to see whether compiler optimization is a problem. Specifically, for the major compilers supported by Mbed TLS:
|
||||
* Write a small program that uses a PSA function which copies inputs or outputs.
|
||||
* Build the program with link-time optimization / full-program optimization enabled (e.g. `-flto` with `gcc`). Try also enabling the most extreme optimization options such as `-Ofast` (`gcc`) and `-Oz` (`clang`).
|
||||
* Inspect the generated code with `objdump` or a similar tool to see if copying operations are preserved.
|
||||
|
||||
If copying behaviour is preserved by all major compilers then assume that compiler optimization is not a problem.
|
||||
|
||||
If copying behaviour is optimized away by the compiler, further investigation is needed. Experiment with using the `volatile` keyword to force the compiler not to optimize accesses to the copied buffers. If the `volatile` keyword is not sufficient, we may be able to use compiler or target-specific techniques to prevent optimization, for example memory barriers or empty `asm` blocks. These may be implemented and verified for important platforms while retaining a C implementation that is likely to be correct on most platforms as a fallback - the same approach taken by the constant-time module.
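As a starting point for such an experiment, a copy through `volatile`-qualified pointers might look like the sketch below. The names `demo_copy_from_shared` and `demo_copy_to_shared` are hypothetical. The C standard requires each access through a volatile lvalue to be performed, so the compiler may not elide or merge these reads and writes; whether this is needed in practice, and what it costs in performance, is exactly the open question.

```c
#include <stddef.h>
#include <stdint.h>

/* Copy from shared memory into a private buffer. Each source byte is
 * read through a volatile lvalue, so the read cannot be optimized away
 * or replaced by a later re-read of the shared buffer. */
static void demo_copy_from_shared(uint8_t *dst,
                                  const volatile uint8_t *src,
                                  size_t len)
{
    for (size_t i = 0; i < len; i++) {
        dst[i] = src[i];
    }
}

/* Copy a finished result from a private buffer out to shared memory.
 * Each destination byte is written exactly once. */
static void demo_copy_to_shared(volatile uint8_t *dst,
                                const uint8_t *src,
                                size_t len)
{
    for (size_t i = 0; i < len; i++) {
        dst[i] = src[i];
    }
}
```

A byte-by-byte volatile loop is likely slower than `memcpy()`, which is one reason to keep it as a fallback behind a single copying interface rather than scatter it through the code.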
|
||||
|
||||
**Open questions: Will the compiler optimize away copies? If so, can it be prevented from doing so in a portable way?**
|
||||
|
||||
#### Copying code
|
||||
|
||||
We may either copy buffers on an ad-hoc basis using `memcpy()` in each PSA function, or use a unified set of functions for copying input and output data. The advantages of the latter are obvious:
|
||||
|
||||
* Any test hooks need only be added in one place.
|
||||
* Copying code need only be reviewed for correctness in one place, rather than in all functions where it occurs.
|
||||
* Copy bypass is simpler as we can just replace these functions with no-ops in a single place.
|
||||
* Any complexity needed to prevent the compiler optimizing copies away does not have to be duplicated.
|
||||
|
||||
On the other hand, the only advantage of ad-hoc copying is slightly greater flexibility.
|
||||
|
||||
**Design decision: Create a unified set of functions for copying input and output data.**
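One possible shape for such a set of functions is sketched below. All names (`demo_local_input_t`, `demo_local_input_alloc`, and so on) are hypothetical and are not the library's actual interface. The sketch uses plain `memcpy()` and ignores the compiler-optimization question discussed above.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Private copy of an input buffer. */
typedef struct {
    uint8_t *buffer;
    size_t length;
} demo_local_input_t;

/* Private staging area for an output buffer. */
typedef struct {
    uint8_t *original;   /* caller's (possibly shared) output buffer */
    uint8_t *buffer;     /* private buffer handed to the driver */
    size_t length;
} demo_local_output_t;

/* Allocate a private buffer and copy the input into it, once. */
static int demo_local_input_alloc(const uint8_t *shared, size_t len,
                                  demo_local_input_t *in)
{
    in->buffer = NULL;
    in->length = 0;
    if (len == 0) {
        return 0;
    }
    in->buffer = malloc(len);
    if (in->buffer == NULL) {
        return -1;
    }
    memcpy(in->buffer, shared, len);
    in->length = len;
    return 0;
}

static void demo_local_input_free(demo_local_input_t *in)
{
    free(in->buffer);
    in->buffer = NULL;
    in->length = 0;
}

/* Allocate a zeroed private buffer that stands in for the output. */
static int demo_local_output_alloc(uint8_t *shared, size_t len,
                                   demo_local_output_t *out)
{
    out->original = shared;
    out->buffer = NULL;
    out->length = 0;
    if (len == 0) {
        return 0;
    }
    out->buffer = calloc(len, 1);
    if (out->buffer == NULL) {
        return -1;
    }
    out->length = len;
    return 0;
}

/* Write the final result to the caller's buffer, once, then release. */
static void demo_local_output_free(demo_local_output_t *out)
{
    if (out->buffer != NULL) {
        memcpy(out->original, out->buffer, out->length);
        free(out->buffer);
    }
    out->buffer = NULL;
    out->length = 0;
}
```

A PSA entry point would call the two `alloc` functions on entry, pass only the private buffers to the driver, and call the two `free` functions on exit. Test hooks or a copy bypass would then live inside these four functions only.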
|
||||
|
||||
#### Copying in multipart APIs
|
||||
|
||||
Multipart APIs may follow one of 2 possible approaches for copying of input:
|
||||
|
||||
##### 1. Allocate a buffer and copy input on each call to `update()`
|
||||
|
||||
This is simple and mirrors the approach for one-shot APIs nicely. However, allocating memory in the middle of a multi-part operation is likely to be bad for performance. Multipart APIs are designed in part for systems that do not have time to perform an operation at once, so introducing poor performance may be a problem here.
|
||||
|
||||
**Open question: Does memory allocation in `update()` cause a performance problem? If so, to what extent?**
|
||||
|
||||
##### 2. Allocate a buffer at the start of the operation and subdivide calls to `update()`
|
||||
|
||||
In this approach, input and output buffers large enough to hold the expected average call to `update()` are allocated at the start of the operation. When `update()` is called with larger buffers than these, the PSA API layer makes multiple calls to the driver, chopping the input into chunks of the temporary buffer size and filling the output from the results until the whole input has been processed.
|
||||
|
||||
This would be more complicated than approach (1) and introduces some extra issues. For example, if one of the intermediate calls to the driver's `update()` returns an error, it is not possible for the driver's state to be rolled back to before the first call to `update()`. It is unclear how this could be solved.
|
||||
|
||||
However, this approach would reduce memory usage in some cases and prevent memory allocation during an operation. Additionally, since the input and output buffers would be fixed-size it would be possible to allocate them statically, avoiding the need for any dynamic memory allocation at all.
|
||||
|
||||
**Design decision: Initially use approach (1) and treat approach (2) as an optimization to be done if necessary.**
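For reference, approach (2) might look roughly like the sketch below. Everything in it is an assumption made for illustration: `CHUNK_SIZE`, the `driver_update_fn` callback type, the stand-in `xor_driver_update()` and `chunked_update()` itself do not exist in the library, and the driver is assumed to produce exactly as many output bytes as it consumes.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define CHUNK_SIZE 16 /* hypothetical size of the fixed local buffers */

/* Hypothetical driver entry point: transforms in_len bytes from in into out
 * and returns 0 on success. */
typedef int (*driver_update_fn)(void *ctx, const uint8_t *in, size_t in_len,
                                uint8_t *out);

/* Stand-in driver used only for illustration: XORs every byte with 0x5a. */
int xor_driver_update(void *ctx, const uint8_t *in, size_t in_len, uint8_t *out)
{
    (void) ctx;
    for (size_t i = 0; i < in_len; i++) {
        out[i] = (uint8_t) (in[i] ^ 0x5a);
    }
    return 0;
}

int chunked_update(driver_update_fn driver_update, void *ctx,
                   const uint8_t *input, size_t input_length, uint8_t *output)
{
    /* Fixed-size local buffers. In a real design these would be set up once
     * at the start of the operation, possibly statically. */
    uint8_t local_in[CHUNK_SIZE];
    uint8_t local_out[CHUNK_SIZE];
    size_t offset = 0;

    while (offset < input_length) {
        size_t n = input_length - offset;
        if (n > CHUNK_SIZE) {
            n = CHUNK_SIZE;
        }
        memcpy(local_in, input + offset, n);
        int ret = driver_update(ctx, local_in, n, local_out);
        if (ret != 0) {
            /* Earlier chunks have already been consumed by the driver, so
             * its state cannot be rolled back from here. */
            return ret;
        }
        memcpy(output + offset, local_out, n);
        offset += n;
    }
    return 0;
}
```

The early return in the loop is exactly where the rollback problem described above shows up.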
|
||||
|
||||
### Validation of copying
|
||||
|
||||
#### Validation of copying by review
|
||||
|
||||
This is fairly self-explanatory. Review all functions that use shared memory and ensure that they each copy memory. This is the simplest strategy to implement but is less reliable than automated validation.
|
||||
|
||||
#### Validation of copying with memory pools
|
||||
|
||||
Proposed general idea: have tests where the test code calling API functions allocates memory in a certain pool, and code in the library allocates memory in a different pool. Test drivers check that needs-copying arguments are within the library pool, not within the test pool.
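The check that a test driver would perform could be as simple as the following sketch. The `test_pool_t` type and `buffer_in_pool()` are hypothetical names, and the sketch assumes each pool is a single contiguous region.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical description of a contiguous memory pool. */
typedef struct {
    const uint8_t *base;
    size_t size;
} test_pool_t;

/* Return 1 if [ptr, ptr + len) lies entirely inside the pool, 0 otherwise. */
int buffer_in_pool(const test_pool_t *pool, const uint8_t *ptr, size_t len)
{
    uintptr_t base = (uintptr_t) pool->base;
    uintptr_t p = (uintptr_t) ptr;

    if (p < base) {
        return 0;
    }
    uintptr_t offset = p - base;
    if (offset > pool->size) {
        return 0;
    }
    /* Compare against the remaining space to avoid overflow in p + len. */
    return len <= pool->size - offset;
}
```

A test driver would call this with the library pool for every needs-copying argument and fail the test when it returns 0.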
|
||||
|
||||
#### Validation of copying by memory poisoning
|
||||
|
||||
Proposed general idea: in test code, “poison” the memory area used by input and output parameters that must be copied. Poisoning means marking the memory so that any access to it is detected, and fails the test, for as long as it remains poisoned. This could be via memory protection (allocate with `mmap` then disable access with `mprotect`), or some kind of poisoning for an analyzer such as MSan or Valgrind.
|
||||
|
||||
In the library, the code that does the copying temporarily unpoisons the memory by calling a test hook.
|
||||
|
||||
```c
static void copy_from_user(void *copy_buffer, const void *input_buffer, size_t length) {
#if defined(MBEDTLS_TEST_HOOKS)
    /* Temporarily lift the poisoning of the caller's buffer so it can be read. */
    if (memory_unpoison_hook != NULL) {
        memory_unpoison_hook(input_buffer, length);
    }
#endif
    memcpy(copy_buffer, input_buffer, length);
#if defined(MBEDTLS_TEST_HOOKS)
    /* Restore the poisoning so that any later direct access is caught. */
    if (memory_poison_hook != NULL) {
        memory_poison_hook(input_buffer, length);
    }
#endif
}
```
|
||||
The reason to poison the memory before calling the library, rather than after the copy-in (and symmetrically for output buffers) is so that the test will fail if we forget to copy, or we copy the wrong thing. This would not be the case if we relied on the library's copy function to do the poisoning: that would only validate that the driver code does not access the memory on the condition that the copy is done as expected.
|
||||
|
||||
##### Options for implementing poisoning
|
||||
|
||||
There are several different ways that poisoning could be implemented:
|
||||
|
||||
1. Using Valgrind's memcheck tool. Valgrind provides a macro `VALGRIND_MAKE_MEM_NOACCESS` that allows manual memory poisoning. Valgrind memory poisoning is already used for constant-flow testing in Mbed TLS.
|
||||
2. Using Memory Sanitizer (MSan), which allows us to mark memory as uninitialized. This is also used for constant-flow testing. It is suitable for input buffers only, since it allows us to detect when a poisoned buffer is read but not when it is written.
|
||||
3. Using Address Sanitizer (ASan). This provides `ASAN_POISON_MEMORY_REGION` which marks memory as inaccessible.
|
||||
4. Allocating buffers in separate pages and calling `mprotect()` to set pages as inaccessible. This has the disadvantage that we will have to manually ensure that buffers sit in their own pages, which likely means making a copy.
|
||||
5. Filling buffers with random data, keeping a copy of the original. For input buffers, keep a copy of the original and copy it back once the PSA function returns. For output buffers, fill them with random data and keep a separate copy of it. In the memory poisoning hooks, compare the copy of random data with the original to ensure that the output buffer has not been written directly.
|
||||
|
||||
Approach (2) is insufficient for the full testing we require as we need to be able to check both input and output buffers.
|
||||
|
||||
Approach (5) is simple and requires no extra tooling. It is likely to have good performance as it does not use any sanitizers. However, it requires the memory poisoning test hooks to maintain extra copies of the buffers, which seems difficult to implement in practice. Additionally, it does not precisely test the property we want to validate, so we are relying on the tests to fail if given random data as input. It is possible (if unlikely) that the PSA function will access the poisoned buffer without causing the test to fail. This becomes more likely when we consider test cases that call PSA functions on incorrect inputs to check that the correct error is returned. For these reasons, this memory poisoning approach seems unsuitable.
|
||||
|
||||
All three remaining approaches are suitable for our purposes. However, approach (4) is more complex than the other two. To implement it, we would need to allocate poisoned buffers in separate memory pages. They would require special handling and test code would likely have to be designed around this special handling.
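To make the extra handling required by approach (4) concrete, here is a minimal sketch assuming a POSIX system with anonymous `mmap` support. The `page_buffer_t` type and the `page_buffer_*` functions are hypothetical names introduced only for this illustration.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE /* exposes MAP_ANONYMOUS on glibc in strict modes */
#endif
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical buffer that owns whole pages so it can be protected. */
typedef struct {
    void *base;  /* start of the page-aligned mapping */
    size_t size; /* mapping size, a multiple of the page size */
} page_buffer_t;

int page_buffer_alloc(page_buffer_t *pb, size_t length)
{
    long page = sysconf(_SC_PAGESIZE);
    if (page <= 0 || length == 0) {
        return -1;
    }
    size_t pages = (length + (size_t) page - 1) / (size_t) page;
    pb->size = pages * (size_t) page;
    pb->base = mmap(NULL, pb->size, PROT_READ | PROT_WRITE,
                    MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (pb->base == MAP_FAILED) {
        pb->base = NULL;
        return -1;
    }
    return 0;
}

/* "Poison": any access to the pages now raises SIGSEGV. */
int page_buffer_poison(const page_buffer_t *pb)
{
    return mprotect(pb->base, pb->size, PROT_NONE);
}

/* "Unpoison": restore normal read/write access. */
int page_buffer_unpoison(const page_buffer_t *pb)
{
    return mprotect(pb->base, pb->size, PROT_READ | PROT_WRITE);
}

void page_buffer_free(page_buffer_t *pb)
{
    munmap(pb->base, pb->size);
    pb->base = NULL;
    pb->size = 0;
}
```

Test code would have to allocate every poisoned argument through something like `page_buffer_alloc()` instead of ordinary stack or heap buffers, which is the special handling referred to above.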
|
||||
|
||||
Meanwhile, approaches (1) and (3) are much more convenient. We are simply required to call a special macro on some buffer that was allocated by us and the sanitizer takes care of everything else. Of these two, ASan appears to have a limitation related to buffer alignment. From code comments quoted in [the documentation](https://github.com/google/sanitizers/wiki/AddressSanitizerManualPoisoning):
|
||||
|
||||
> This function is not guaranteed to poison the whole region - it may poison only subregion of [addr, addr+size) due to ASan alignment restrictions.
|
||||
|
||||
Specifically, ASan will round the buffer size down to a multiple of 8 bytes before poisoning due to details of its implementation. For more information on this, see [Microsoft documentation of this feature](https://learn.microsoft.com/en-us/cpp/sanitizers/asan-runtime?view=msvc-170#alignment-requirements-for-addresssanitizer-poisoning).
|
||||
|
||||
It should be possible to work around this by manually rounding buffer lengths up to the nearest multiple of 8 in the poisoning function, although it's remotely possible that this will cause other problems. Valgrind does not appear to have this limitation (unless Valgrind is simply more poorly documented). However, running tests under Valgrind causes a much greater slowdown compared with ASan. As a result, it would be beneficial to implement support for both Valgrind and ASan, to give the extra flexibility to choose either performance or accuracy as required. This should be simple as both have very similar memory poisoning interfaces.
|
||||
|
||||
**Design decision: Implement memory poisoning tests with both Valgrind's memcheck and ASan manual poisoning.**
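One possible way to work around the ASan alignment restriction is to widen the region to 8-byte boundaries before poisoning. The helper below is a sketch under the assumption that the few extra bytes on either side belong to the test and may safely be poisoned as well; `widen_to_granules()` and `ASAN_GRANULE` are hypothetical names.

```c
#include <stddef.h>
#include <stdint.h>

#define ASAN_GRANULE 8u /* assumed ASan poisoning granularity in bytes */

/* Widen [*p_ptr, *p_ptr + *p_size) so that both ends fall on granule
 * boundaries: the start is rounded down and the end is rounded up. */
void widen_to_granules(const unsigned char **p_ptr, size_t *p_size)
{
    uintptr_t mask = (uintptr_t) (ASAN_GRANULE - 1);
    uintptr_t start = (uintptr_t) *p_ptr;
    uintptr_t end = start + *p_size;

    start &= ~mask;
    end = (end + mask) & ~mask;

    *p_ptr = (const unsigned char *) start;
    *p_size = (size_t) (end - start);
}
```

For example, a 10-byte buffer starting 3 bytes into an aligned block is widened to the first 16 bytes of that block.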
|
||||
|
||||
##### Validation with new tests
|
||||
|
||||
Validation with newly created tests would be simpler to implement than using existing tests, since the tests can be written to take into account memory poisoning. It is also possible to build such a testsuite using existing tests as a starting point - `mbedtls_test_psa_exercise_key` is a test helper that already exercises many PSA operations on a key. This would need to be extended to cover operations without keys (e.g. hashes) and multipart operations, but it provides a good base from which to build all of the required testing.
|
||||
|
||||
Additionally, we can ensure that all functions are exercised by automatically generating test data files.
|
||||
|
||||
##### Validation with existing tests
|
||||
|
||||
An alternative approach would be to integrate memory poisoning validation with existing tests. This has two main advantages:
|
||||
|
||||
* All of the tests are written already, potentially saving development time.
|
||||
* The code coverage of these tests is greater than would be achievable writing new tests from scratch. In practice this advantage is small as buffer copying will take place in the dispatch layer. The tests are therefore independent of the values of parameters passed to the driver, so extra coverage in these parameters does not gain anything.
|
||||
|
||||
It may be possible to transparently implement memory poisoning so that existing tests can work without modification. This would be achieved by replacing the implementation of `malloc()` with one that allocates poisoned buffers. However, there are some difficulties with this:
|
||||
|
||||
* Not all buffers allocated by tests are used as inputs and outputs to PSA functions being tested.
|
||||
* Those buffers that are inputs to a PSA function need to be unpoisoned right up until the function is called, so that they can be filled with input data.
|
||||
* Those buffers that are outputs from a PSA function need to be unpoisoned straight after the function returns, so that they can be read to check the output is correct.
|
||||
|
||||
These issues may be solved by creating some kind of test wrapper around every PSA function call that poisons the memory. However, it is unclear how straightforward this will be in practice. If this is simple to achieve, the extra coverage and time saved on new tests will be a benefit. If not, writing new tests is the best strategy.
|
||||
|
||||
**Design decision: Add memory poisoning transparently to existing tests.**
|
||||
|
||||
#### Discussion of copying validation
|
||||
|
||||
Of all the approaches discussed, validation by memory poisoning appears to be the best. This is because it:
|
||||
|
||||
* Does not require complex linking against different versions of `malloc()` (as is the case with the memory pool approach).
|
||||
* Allows automated testing (unlike the review approach).
|
||||
|
||||
**Design decision: Use a memory poisoning approach to validate copying.**
|
||||
|
||||
### Shared memory protection requirements
|
||||
|
||||
TODO: write document and reference it here.
|
||||
|
||||
### Validation of careful access for built-in drivers
|
||||
|
||||
For PSA functions whose inputs and outputs are not copied, it is important that we validate that the built-in drivers are correctly accessing their inputs and outputs so as not to cause a security issue. Specifically, we must check that each memory location in a shared buffer is not accessed more than once by a driver function. In this section we examine various possible methods for performing this validation.
|
||||
|
||||
Note: We are focusing on read-read inconsistencies for now, as most of the cases where we aren't copying are inputs.
|
||||
|
||||
#### Review
|
||||
|
||||
As with validation of copying, the simplest method of validation we can implement is careful code review. This is the least desirable method of validation for several reasons:
|
||||
|
||||
1. It is tedious for the reviewers.
|
||||
2. Reviewers are prone to make mistakes (especially when performing tedious tasks).
|
||||
3. It requires engineering time linear in the number of PSA functions to be tested.
|
||||
4. It cannot assure the quality of third-party drivers, whereas automated tests can be ported to any driver implementation in principle.
|
||||
|
||||
If all other approaches turn out to be prohibitively difficult, code review exists as a fallback option. However, it should be understood that this is far from ideal.
|
||||
|
||||
#### Tests using `mprotect()`
|
||||
|
||||
Checking that a memory location is not accessed more than once may be achieved by using `mprotect()` on a Linux system to cause a segmentation fault whenever a memory access happens. Tests based on this approach are sketched below.
|
||||
|
||||
##### Linux mprotect+ptrace
|
||||
|
||||
Idea: call `mmap` to allocate memory for arguments and `mprotect` to deny or reenable access. Use `ptrace` from a parent process to react to SIGSEGV from a denied access. On SIGSEGV happening in the faulting region:
|
||||
|
||||
1. Use `ptrace` to execute a `mprotect` system call in the child to enable access. TODO: How? `ptrace` can modify registers and memory in the child, which includes changing parameters of a syscall that's about to be executed, but not directly cause the child process to execute a syscall that it wasn't about to execute.
|
||||
2. Use `ptrace` with `PTRACE_SINGLESTEP` to re-execute the failed load/store instruction.
|
||||
3. Use `ptrace` to execute a `mprotect` system call in the child to disable access.
|
||||
4. Use `PTRACE_CONT` to resume the child execution.
|
||||
|
||||
Record the addresses that are accessed. Mark the test as failed if the same address is read twice.
|
||||
|
||||
##### Debugger + mprotect
|
||||
|
||||
Idea: call `mmap` to allocate memory for arguments and `mprotect` to deny or reenable access. Use a debugger to handle SIGSEGV (GDB: set signal catchpoint). If the segfault was due to accessing the protected region:
|
||||
|
||||
1. Execute `mprotect` to allow access.
|
||||
2. Single-step the load/store instruction.
|
||||
3. Execute `mprotect` to disable access.
|
||||
4. Continue execution.
|
||||
|
||||
Record the addresses that are accessed. Mark the test as failed if the same address is read twice. This part might be hard to do in the gdb language, so we may want to just log the addresses and then use a separate program to analyze the logs, or do the gdb tasks from Python.
|
||||
|
||||
#### Instrumentation (Valgrind)
|
||||
|
||||
An alternative approach is to use a dynamic instrumentation tool (the most obvious being Valgrind) to trace memory accesses and check that each of the important memory addresses is accessed no more than once.
|
||||
|
||||
Valgrind has no tool specifically that checks the property that we are looking for. However, it is possible to generate a memory trace with Valgrind using the following:
|
||||
|
||||
```
valgrind --tool=lackey --trace-mem=yes --log-file=logfile ./myprogram
```
|
||||
This will execute `myprogram` and dump a record of every memory access to `logfile`, with its address and data width. If `myprogram` is a test that does the following:
|
||||
|
||||
1. Set up input and output buffers for a PSA function call.
|
||||
2. Leak the start and end address of each buffer via `printf()`.
|
||||
3. Write data into the input buffer exactly once.
|
||||
4. Call the PSA function.
|
||||
5. Read data from the output buffer exactly once.
|
||||
|
||||
Then it should be possible to parse the output from the program and from Valgrind and check that each location was accessed exactly twice: once by the program's setup and once by the PSA function.
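The parsing step might be sketched as follows. This is a simplified illustration that only counts reads of a single input buffer. It assumes that trace records look like ` L 1ffeffe7e,8` (load), ` S addr,size` (store), ` M addr,size` (modify) and `I  addr,size` (instruction fetch), with hexadecimal addresses. The `read_tracker_t` type, its functions and `TRACKED_MAX` are hypothetical.

```c
#include <stddef.h>
#include <stdio.h>

#define TRACKED_MAX 64 /* hypothetical upper bound on the tracked buffer size */

typedef struct {
    unsigned long long start;    /* buffer address leaked by the test program */
    size_t length;               /* buffer length, at most TRACKED_MAX */
    unsigned reads[TRACKED_MAX]; /* per-byte read counters */
} read_tracker_t;

/* Feed one line of the trace log and update the per-byte read counters. */
void read_tracker_feed(read_tracker_t *t, const char *line)
{
    char kind;
    unsigned long long addr;
    unsigned size;

    if (sscanf(line, " %c %llx,%u", &kind, &addr, &size) != 3) {
        return; /* not a trace record */
    }
    if (kind != 'L' && kind != 'M') {
        return; /* only loads and modifies read data memory */
    }
    for (unsigned i = 0; i < size; i++) {
        unsigned long long a = addr + i;
        if (a >= t->start && a - t->start < t->length) {
            t->reads[a - t->start]++;
        }
    }
}

/* Highest number of reads seen for any byte of the tracked buffer. */
unsigned read_tracker_max(const read_tracker_t *t)
{
    unsigned max = 0;
    for (size_t i = 0; i < t->length; i++) {
        if (t->reads[i] > max) {
            max = t->reads[i];
        }
    }
    return max;
}
```

A checker built on this would feed it every line of `logfile` and compare `read_tracker_max()` against the number of reads the test itself is expected to perform.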
|
||||
|
||||
#### Fixed Virtual Platform testing
|
||||
|
||||
It may be possible to measure double accesses by running tests on a Fixed Virtual Platform such as Corstone 310 ecosystem FVP, available [here](https://developer.arm.com/downloads/-/arm-ecosystem-fvps). There exists a pre-packaged example program for the Corstone 310 FVP available as part of the Open IoT SDK [here](https://git.gitlab.arm.com/iot/open-iot-sdk/examples/sdk-examples/-/tree/main/examples/mbedtls/cmsis-rtx/corstone-310) that could provide a starting point for a set of tests.
|
||||
|
||||
Running on an FVP allows two approaches to careful-access testing:
|
||||
|
||||
* Convenient scripted use of a debugger with [Iris](https://developer.arm.com/documentation/101196/latest/). This allows memory watchpoints to be set, perhaps more flexibly than with GDB.
|
||||
* Tracing of all memory accesses with [Tarmac Trace](https://developer.arm.com/documentation/100964/1123/Plug-ins-for-Fast-Models/TarmacTrace). To validate the single-access properties, the [processor memory access trace source](https://developer.arm.com/documentation/100964/1123/Plug-ins-for-Fast-Models/TarmacTrace/Processor-memory-access-trace) can be used to output all memory accesses happening on the FVP. This output can then be easily parsed and processed to ensure that the input and output buffers are accessed only once. The addresses of buffers can either be leaked by the program through printing to the serial port or set to fixed values in the FVP's linker script.
|
||||
|
||||
#### Discussion of careful-access validation
|
||||
|
||||
The best approach for validating the correctness of memory accesses is an open question that requires further investigation. To answer this question, each of the test strategies discussed above must be prototyped as follows:
|
||||
|
||||
1. Take 1-2 days to create a basic prototype of a test that uses the approach.
|
||||
2. Document the prototype - write a short guide that can be followed to arrive at the same prototype.
|
||||
3. Evaluate the prototype according to its usefulness. The criteria of evaluation should include:
|
||||
* Ease of implementation - Was the prototype simple to implement? Having implemented it, is it simple to extend it to do all of the required testing?
|
||||
* Flexibility - Could the prototype be extended to cover other careful-access testing that may be needed in future?
|
||||
* Performance - Does the test method perform well? Will it cause significant slowdown to CI jobs?
|
||||
* Ease of reproduction - Does the prototype require a particular platform or tool to be set up? How easy would it be for an external user to run the prototype?
|
||||
* Comprehensibility - Accounting for the lower code quality of a prototype, would developers unfamiliar with the tests based on the prototype be able to understand them easily?
|
||||
* Portability - How well can this approach be ported to multiple platforms? This would allow us to ensure that there are no double-accesses due to a bug that only affects a specific target.
|
||||
|
||||
Once each prototype is complete, choose the best approach to implement the careful-access testing. Implement tests using this approach for each of the PSA interfaces that require careful-access testing:
|
||||
|
||||
* Hash
|
||||
* MAC
|
||||
* AEAD (additional data only)
|
||||
* Key derivation
|
||||
* Asymmetric signature (input only)
|
||||
|
||||
##### New vs existing tests
|
||||
|
||||
Most of the test methods discussed above need extra setup. Some require leaking of buffer bounds, predictable memory access patterns or allocation of special buffers. FVP testing even requires the tests to be run on a non-host target.
|
||||
|
||||
With this complexity in mind it does not seem feasible to run careful-access tests using existing test suites. Instead, new tests should be written that exercise the drivers in the required way. Fortunately, the only interfaces that need testing are hash, MAC, AEAD (testing over AD only), key derivation and asymmetric signature, which limits the number of new tests that must be written.
|
||||
|
||||
#### Validation of validation for careful-access
|
||||
|
||||
In order to ensure that the careful-access validation works, it is necessary to write tests to check that we can correctly detect careful-access violations when they occur. To do this, write a test function that:
|
||||
|
||||
* Reads its input multiple times at the same location.
|
||||
* Writes to its output multiple times at the same location.
|
||||
|
||||
Then, write a careful-access test for this function and ensure that it fails.
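A deliberately misbehaving function of this kind could be as small as the sketch below. The name `careless_access()` is hypothetical, and `volatile` is used only so that the compiler does not merge the duplicate accesses.

```c
#include <stdint.h>

/* Deliberately careless: reads input[0] twice and writes output[0] twice.
 * A working careful-access test must flag both violations. */
void careless_access(const uint8_t *input, uint8_t *output)
{
    const volatile uint8_t *in = input;
    volatile uint8_t *out = output;

    uint8_t first = in[0];  /* first read of input[0] */
    uint8_t second = in[0]; /* second read of the same location */

    out[0] = first;         /* first write to output[0] */
    out[0] = second;        /* second write to the same location */
}
```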
|
||||
|
||||
## Analysis of argument protection in built-in drivers
|
||||
|
||||
TODO: analyze the built-in implementations of mechanisms for which there is a requirement on drivers. By code inspection, how satisfied are we that they meet the requirement?
|
||||
|
||||
## Copy bypass
|
||||
|
||||
For efficiency, we are likely to want mechanisms to bypass the copy and process buffers directly in builds that are not affected by shared memory considerations.
|
||||
|
||||
Expand this section to document any mechanisms that bypass the copy.
|
||||
|
||||
Make sure that such mechanisms preserve the guarantees when buffers overlap.
|
||||
|
||||
## Detailed design
|
||||
|
||||
### Implementation by module
|
||||
|
||||
Module | Input protection strategy | Output protection strategy | Notes
---|---|---|---
Hash and MAC | Careful access | Careful access | Low risk of multiple-access as the input and output are raw unformatted data.
Cipher | Copying | Copying |
AEAD | Copying (careful access for additional data) | Copying |
Key derivation | Careful access | Careful access |
Asymmetric signature | Careful access | Copying | Inputs to signatures are passed to a hash. This will no longer hold once PureEdDSA support is implemented.
Asymmetric encryption | Copying | Copying |
Key agreement | Copying | Copying |
PAKE | Copying | Copying |
Key import / export | Copying | Copying | Keys may be imported and exported in DER format, which is a structured format and therefore susceptible to read-read inconsistencies and potentially write-read inconsistencies.
|
||||
|
||||
### Copying functions
|
||||
|
||||
As discussed in [Copying code](#copying-code), it is simpler to use a single unified API for copying. Therefore, we create the following functions:
|
||||
|
||||
* `psa_crypto_copy_input(const uint8_t *input, size_t input_length, uint8_t *input_copy, size_t input_copy_length)`
|
||||
* `psa_crypto_copy_output(const uint8_t *output_copy, size_t output_copy_length, uint8_t *output, size_t output_length)`
|
||||
|
||||
These may look like two copies of the same function. However, it is useful to retain separate functions for input and output parameters so that we can use different test hooks in each when using memory poisoning for tests.
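A minimal sketch of the two copying functions is given below. The `psa_status_t` typedef and status macros are local stand-ins so that the block compiles on its own, and the choice of which error to return on a length mismatch is an assumption of this sketch rather than something specified above. The comments mark where the separate test hooks would go.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Stand-ins for the real PSA definitions, for illustration only. */
typedef int32_t psa_status_t;
#define PSA_SUCCESS ((psa_status_t) 0)
#define PSA_ERROR_BUFFER_TOO_SMALL ((psa_status_t) -138)
#define PSA_ERROR_CORRUPTION_DETECTED ((psa_status_t) -151)

psa_status_t psa_crypto_copy_input(const uint8_t *input, size_t input_length,
                                   uint8_t *input_copy, size_t input_copy_length)
{
    if (input_length > input_copy_length) {
        return PSA_ERROR_CORRUPTION_DETECTED; /* assumed error code */
    }
    /* An input-specific test hook would unpoison `input` here. */
    if (input_length > 0) {
        memcpy(input_copy, input, input_length);
    }
    /* ...and poison it again here. */
    return PSA_SUCCESS;
}

psa_status_t psa_crypto_copy_output(const uint8_t *output_copy, size_t output_copy_length,
                                    uint8_t *output, size_t output_length)
{
    if (output_length < output_copy_length) {
        return PSA_ERROR_BUFFER_TOO_SMALL; /* assumed error code */
    }
    /* An output-specific test hook would unpoison `output` here. */
    if (output_copy_length > 0) {
        memcpy(output, output_copy, output_copy_length);
    }
    /* ...and poison it again here. */
    return PSA_SUCCESS;
}
```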
|
||||
|
||||
Given that the majority of functions will be allocating memory on the heap to copy, it is helpful to build convenience functions that allocate the memory as well.
|
||||
|
||||
In order to keep track of allocated copies on the heap, we can create new structs:
|
||||
|
||||
```c
typedef struct psa_crypto_local_input_s {
    uint8_t *buffer;
    size_t length;
} psa_crypto_local_input_t;

typedef struct psa_crypto_local_output_s {
    uint8_t *original;
    uint8_t *buffer;
    size_t length;
} psa_crypto_local_output_t;
```
|
||||
|
||||
These may be used to keep track of input and output copies' state, and ensure that their length is always stored with them. In the case of output copies, we keep a pointer to the original buffer so that it is easy to perform a writeback to the original once we have finished outputting.
|
||||
|
||||
With these structs we may create 2 pairs of functions, one pair for input copies:
|
||||
|
||||
```c
psa_status_t psa_crypto_local_input_alloc(const uint8_t *input, size_t input_len,
                                          psa_crypto_local_input_t *local_input);

void psa_crypto_local_input_free(psa_crypto_local_input_t *local_input);
```
|
||||
|
||||
* `psa_crypto_local_input_alloc()` calls `calloc()` to allocate a new buffer of length `input_len` and copies the contents across from `input`. It then stores `input_len` and the pointer to the copy in the struct `local_input`.
|
||||
* `psa_crypto_local_input_free()` calls `free()` on the local input that is referred to by `local_input` and sets the pointer in the struct to `NULL`.
|
||||
|
||||
We also create a pair of functions for output copies:
|
||||
|
||||
```c
psa_status_t psa_crypto_local_output_alloc(uint8_t *output, size_t output_len,
                                           psa_crypto_local_output_t *local_output);

psa_status_t psa_crypto_local_output_free(psa_crypto_local_output_t *local_output);
```
|
||||
|
||||
* `psa_crypto_local_output_alloc()` calls `calloc()` to allocate a new buffer of length `output_len` and stores `output_len` and the pointer to the buffer in the struct `local_output`. It also stores a pointer to `output` in `local_output->original`.
|
||||
* `psa_crypto_local_output_free()` copies the contents of the output buffer `local_output->buffer` into the buffer `local_output->original`, calls `free()` on `local_output->buffer` and sets it to `NULL`.
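The output pair could be implemented along the following lines. This is a sketch, not the library's code: the `psa_status_t` typedef and status macros are local stand-ins, and the handling of zero-length outputs and of the invalid state (copy present but no original pointer) is an assumption.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Stand-ins for the real PSA definitions, for illustration only. */
typedef int32_t psa_status_t;
#define PSA_SUCCESS ((psa_status_t) 0)
#define PSA_ERROR_INSUFFICIENT_MEMORY ((psa_status_t) -141)
#define PSA_ERROR_CORRUPTION_DETECTED ((psa_status_t) -151)

typedef struct psa_crypto_local_output_s {
    uint8_t *original;
    uint8_t *buffer;
    size_t length;
} psa_crypto_local_output_t;

psa_status_t psa_crypto_local_output_alloc(uint8_t *output, size_t output_len,
                                           psa_crypto_local_output_t *local_output)
{
    local_output->original = NULL;
    local_output->buffer = NULL;
    local_output->length = 0;

    if (output_len == 0) {
        return PSA_SUCCESS; /* nothing to allocate */
    }
    local_output->buffer = calloc(output_len, 1);
    if (local_output->buffer == NULL) {
        return PSA_ERROR_INSUFFICIENT_MEMORY;
    }
    local_output->length = output_len;
    local_output->original = output;
    return PSA_SUCCESS;
}

psa_status_t psa_crypto_local_output_free(psa_crypto_local_output_t *local_output)
{
    psa_status_t status = PSA_SUCCESS;

    if (local_output->buffer == NULL) {
        local_output->length = 0;
        return PSA_SUCCESS;
    }
    if (local_output->original == NULL) {
        status = PSA_ERROR_CORRUPTION_DETECTED; /* invalid state */
    } else {
        /* Write the result back to the caller's buffer. */
        memcpy(local_output->original, local_output->buffer, local_output->length);
    }
    free(local_output->buffer);
    local_output->buffer = NULL;
    local_output->length = 0;
    return status;
}
```

Until `psa_crypto_local_output_free()` runs, the caller's buffer is untouched, which is what allows it to stay poisoned during the operation.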
|
||||
|
||||
Some PSA functions may not use these convenience functions as they may have local optimizations that reduce memory usage. For example, ciphers may be able to use a single intermediate buffer for both input and output.
|
||||
|
||||
In order to abstract the management of the copy state further, to make it simpler to add, we create the following 6 convenience macros:
|
||||
|
||||
For inputs:
|
||||
|
||||
* `LOCAL_INPUT_DECLARE(input, input_copy_name)`, which declares and initializes a `psa_crypto_local_input_t` and a pointer with the name `input_copy_name` in the current scope.
|
||||
* `LOCAL_INPUT_ALLOC(input, input_size, input_copy)`, which tries to allocate an input using `psa_crypto_local_input_alloc()`. On failure, it sets an error code and jumps to an exit label. On success, it sets `input_copy` to point to the copy of the buffer.
|
||||
* `LOCAL_INPUT_FREE(input, input_copy)`, which frees the input copy using `psa_crypto_local_input_free()` and sets `input_copy` to `NULL`.
|
||||
|
||||
For outputs:
|
||||
|
||||
* `LOCAL_OUTPUT_DECLARE(output, output_copy_name)`, analogous to `LOCAL_INPUT_DECLARE()` for `psa_crypto_local_output_t`.
|
||||
* `LOCAL_OUTPUT_ALLOC(output, output_size, output_copy)`, analogous to `LOCAL_INPUT_ALLOC()` for outputs, calling `psa_crypto_local_output_alloc()`.
|
||||
* `LOCAL_OUTPUT_FREE(output, output_copy)`, analogous to `LOCAL_INPUT_FREE()` for outputs. If the `psa_crypto_local_output_t` is in an invalid state (the copy pointer is valid, but the original pointer is `NULL`) this macro sets an error status.
|
||||
|
||||
These macros allow PSA functions to have copying added while keeping the code mostly unmodified. Consider a hypothetical PSA function:
|
||||
|
||||
```c
psa_status_t psa_foo(const uint8_t *input, size_t input_length,
                     uint8_t *output, size_t output_size, size_t *output_length)
{
    /* Do some operation on input and output */
}
```
|
||||
|
||||
By changing the name of the input and output parameters, we can retain the original variable name as the name of the local copy while using a new name (e.g. with the suffix `_external`) for the original buffer. This allows copying to be added near-seamlessly as follows:
|
||||
|
||||
```c
psa_status_t psa_foo(const uint8_t *input_external, size_t input_length,
                     uint8_t *output_external, size_t output_size, size_t *output_length)
{
    psa_status_t status;

    LOCAL_INPUT_DECLARE(input_external, input);
    LOCAL_OUTPUT_DECLARE(output_external, output);

    LOCAL_INPUT_ALLOC(input_external, input_length, input);
    LOCAL_OUTPUT_ALLOC(output_external, output_size, output);

    /* Do some operation on input and output */

exit:
    LOCAL_INPUT_FREE(input_external, input);
    LOCAL_OUTPUT_FREE(output_external, output);

    return status;
}
```
|
||||
|
||||
A second advantage of using macros for the copying (other than simple convenience) is that it allows copying to be easily disabled by defining alternate macros that function as no-ops. Since buffer copying is specific to systems where shared memory is passed to PSA functions, it is useful to be able to disable it where it is not needed, to save code size.
|
||||
|
||||
To this end, the macros above are defined conditionally on a new config option, `MBEDTLS_PSA_ASSUME_EXCLUSIVE_BUFFERS`, which may be set whenever PSA functions are assumed to have exclusive access to their input and output buffers. When `MBEDTLS_PSA_ASSUME_EXCLUSIVE_BUFFERS` is set, the macros do not perform copying.
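The conditional definition might look like the sketch below, shown for the input macros only. It is a simplification: the copying variant allocates inline with `malloc()` instead of going through `psa_crypto_local_input_alloc()` and `psa_crypto_local_input_t`, the `psa_status_t` definitions are stand-ins, and `example_sum_bytes()` is a hypothetical function that exists only to exercise the macros.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Stand-ins for the real PSA definitions, for illustration only. */
typedef int32_t psa_status_t;
#define PSA_SUCCESS ((psa_status_t) 0)
#define PSA_ERROR_INSUFFICIENT_MEMORY ((psa_status_t) -141)

#if !defined(MBEDTLS_PSA_ASSUME_EXCLUSIVE_BUFFERS)
/* Copying variants (simplified). */
#define LOCAL_INPUT_DECLARE(input, input_copy_name) \
    uint8_t *LOCAL_INPUT_COPY_OF_##input = NULL; \
    const uint8_t *input_copy_name = NULL
#define LOCAL_INPUT_ALLOC(input, length, input_copy) \
    do { \
        if ((length) != 0) { \
            LOCAL_INPUT_COPY_OF_##input = malloc(length); \
            if (LOCAL_INPUT_COPY_OF_##input == NULL) { \
                status = PSA_ERROR_INSUFFICIENT_MEMORY; \
                goto exit; \
            } \
            memcpy(LOCAL_INPUT_COPY_OF_##input, input, length); \
        } \
        input_copy = LOCAL_INPUT_COPY_OF_##input; \
    } while (0)
#define LOCAL_INPUT_FREE(input, input_copy) \
    do { \
        free(LOCAL_INPUT_COPY_OF_##input); \
        LOCAL_INPUT_COPY_OF_##input = NULL; \
        input_copy = NULL; \
    } while (0)
#else
/* No-op variants: the "copy" is simply an alias of the caller's buffer. */
#define LOCAL_INPUT_DECLARE(input, input_copy_name) \
    const uint8_t *input_copy_name = input
#define LOCAL_INPUT_ALLOC(input, length, input_copy) do { } while (0)
#define LOCAL_INPUT_FREE(input, input_copy) do { } while (0)
#endif

/* Hypothetical function using the macros: sums the bytes of its input. */
psa_status_t example_sum_bytes(const uint8_t *input_external, size_t input_length,
                               uint32_t *sum)
{
    psa_status_t status = PSA_SUCCESS;
    LOCAL_INPUT_DECLARE(input_external, input);
    LOCAL_INPUT_ALLOC(input_external, input_length, input);

    *sum = 0;
    for (size_t i = 0; i < input_length; i++) {
        *sum += input[i];
    }

exit:
    LOCAL_INPUT_FREE(input_external, input);
    return status;
}
```

The body of `example_sum_bytes()` is identical in both configurations; only the macro definitions change.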
|
||||
|
||||
### Implementation of copying validation
|
||||
|
||||
As discussed in the [design exploration of copying validation](#validation-of-copying), the best strategy for validation of copies appears to be validation by memory poisoning, implemented using Valgrind and ASan.
|
||||
|
||||
To perform memory poisoning, we must implement the functions alluded to in [Validation of copying by memory poisoning](#validation-of-copying-by-memory-poisoning):
|
||||
```c
void mbedtls_test_memory_poison(const unsigned char *ptr, size_t size);
void mbedtls_test_memory_unpoison(const unsigned char *ptr, size_t size);
```
|
||||
These should poison or unpoison the given buffer, respectively.
|
||||
|
||||
* `mbedtls_test_memory_poison()` is equivalent to calling `VALGRIND_MAKE_MEM_NOACCESS(ptr, size)` or `ASAN_POISON_MEMORY_REGION(ptr, size)`.
|
||||
* `mbedtls_test_memory_unpoison()` is equivalent to calling `VALGRIND_MAKE_MEM_DEFINED(ptr, size)` or `ASAN_UNPOISON_MEMORY_REGION(ptr, size)`.
|
||||
|
||||
The PSA copying function must then have test hooks implemented as outlined in [Validation of copying by memory poisoning](#validation-of-copying-by-memory-poisoning).
|
||||
|
||||
As discussed in [the design exploration](#validation-with-existing-tests), the preferred approach for implementing copy-testing is to implement it transparently using existing tests. This is specified in more detail below.
|
||||
|
||||
#### Transparent allocation-based memory poisoning
|
||||
|
||||
In order to implement transparent memory poisoning we require a wrapper around all PSA function calls that poisons any input and output buffers.
|
||||
|
||||
The easiest way to do this is to create wrapper functions that poison the memory and then `#define` PSA function names to be wrapped versions of themselves. For example, to replace `psa_aead_update()`:
|
||||
```c
psa_status_t mem_poison_psa_aead_update(psa_aead_operation_t *operation,
                                        const uint8_t *input,
                                        size_t input_length,
                                        uint8_t *output,
                                        size_t output_size,
                                        size_t *output_length)
{
    mbedtls_test_memory_poison(input, input_length);
    mbedtls_test_memory_poison(output, output_size);
    psa_status_t status = psa_aead_update(operation, input, input_length,
                                          output, output_size, output_length);
    mbedtls_test_memory_unpoison(input, input_length);
    mbedtls_test_memory_unpoison(output, output_size);

    return status;
}

#define psa_aead_update(...) mem_poison_psa_aead_update(__VA_ARGS__)
```
|
||||
|
||||
There now exists a more generic mechanism for making exactly this kind of transformation - the PSA test wrappers, which exist in the files `tests/include/test/psa_test_wrappers.h` and `tests/src/psa_test_wrappers.c`. These are wrappers around all PSA functions that allow testing code to be inserted at the start and end of a PSA function call.
|
||||
|
||||
The test wrappers are generated by a script, although they are not automatically generated as part of the build process. Instead, they are checked into source control and must be manually updated when functions change by running `framework/scripts/generate_psa_wrappers.py`.
|
||||
|
||||
Poisoning code is added to these test wrappers where relevant in order to pre-poison and post-unpoison the parameters to the functions.
|
||||
|
||||
#### Configuration of poisoning tests
|
||||
|
||||
Since the memory poisoning tests will require the use of interfaces specific to the sanitizers used to poison memory, they must only be enabled when we are building with ASan or Valgrind. For now, we can auto-detect ASan at compile-time and set an option: `MBEDTLS_TEST_MEMORY_CAN_POISON`. When this option is enabled, we build with memory-poisoning support. This enables transparent testing with ASan without needing any extra configuration options.
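
As an illustration of compile-time detection, here is a minimal sketch. The names `DEMO_MEMORY_CAN_POISON`, `DEMO_POISON`, `DEMO_UNPOISON` and the `demo_*` functions are hypothetical and are not the library's own. The sketch assumes the toolchain-provided `__SANITIZE_ADDRESS__` macro (GCC), `__has_feature(address_sanitizer)` (Clang), and the manual poisoning macros from `<sanitizer/asan_interface.h>`.

```c
#include <assert.h>
#include <stddef.h>

/* GCC defines __SANITIZE_ADDRESS__ when building with -fsanitize=address. */
#if defined(__SANITIZE_ADDRESS__)
#define DEMO_MEMORY_CAN_POISON
#endif
/* Clang reports ASan through __has_feature. The nested #if avoids a
 * preprocessor error on compilers that do not provide __has_feature. */
#if !defined(DEMO_MEMORY_CAN_POISON) && defined(__has_feature)
#if __has_feature(address_sanitizer)
#define DEMO_MEMORY_CAN_POISON
#endif
#endif

#if defined(DEMO_MEMORY_CAN_POISON)
#include <sanitizer/asan_interface.h>
#define DEMO_POISON(ptr, size)   ASAN_POISON_MEMORY_REGION(ptr, size)
#define DEMO_UNPOISON(ptr, size) ASAN_UNPOISON_MEMORY_REGION(ptr, size)
#else
/* Without a sanitizer, poisoning compiles to nothing. */
#define DEMO_POISON(ptr, size)   ((void) (ptr), (void) (size))
#define DEMO_UNPOISON(ptr, size) ((void) (ptr), (void) (size))
#endif

/* Report whether this build can poison memory (1) or not (0). */
int demo_can_poison(void)
{
#if defined(DEMO_MEMORY_CAN_POISON)
    return 1;
#else
    return 0;
#endif
}

/* Poison a buffer, then unpoison it and use it again. */
int demo_poison_roundtrip(unsigned char *buf, size_t len)
{
    assert(buf != NULL && len > 0);
    DEMO_POISON(buf, len);
    /* Any access to buf at this point would be reported by ASan. */
    DEMO_UNPOISON(buf, len);
    buf[0] = 1; /* Legal again after unpoisoning. */
    return buf[0];
}
```

With this shape, test code can call the poisoning macros unconditionally, and builds without ASan behave as before.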
|
||||
|
||||
Auto-detection and memory-poisoning with Valgrind is left for future work.
|
||||
|
||||
#### Validation of validation for copying
|
||||
|
||||
To make sure that we can correctly detect functions that access their input/output buffers rather than the copies, it would be best to write a test function that misbehaves and test it with memory poisoning. Specifically, the function should:
|
||||
|
||||
* Read its input buffer before and after calling the input-buffer-copying function to create a local copy of its input.
|
||||
* Write to its output buffer before and after calling the output-buffer-copying function to copy-back its output.
|
||||
|
||||
Then, we could write a test that uses this function with memory poisoning and ensure that it fails. Since we are expecting a failure due to memory-poisoning, we would run this test separately from the rest of the memory-poisoning testing.
|
||||
|
||||
This testing is implemented in `programs/test/metatest.c`, which is a program designed to check that test failures happen correctly. It may be run via the script `tests/scripts/run-metatests.sh`.
|
@@ -1,536 +0,0 @@
|
||||
# PSA storage resilience design
|
||||
|
||||
## Introduction
|
||||
|
||||
The PSA crypto subsystem includes a persistent key store. It is possible to create a persistent key and read it back later. This must work even if the underlying storage exhibits non-nominal behavior. In this document, _resilience_ means correct behavior of the key store even if the underlying platform behaves in a non-nominal, but still partially controlled way.
|
||||
|
||||
At this point, we are only concerned about one specific form of resilience: to a system crash or power loss. That is, we assume that the underlying platform behaves nominally, except that occasionally it may restart. In the field, this can happen due to a sudden loss of power.
|
||||
|
||||
This document explores the problem space, then defines a library design and a test design.
|
||||
|
||||
## Resilience goals for API functions
|
||||
|
||||
**Goal: PSA Crypto API functions are atomic and committing.**
|
||||
|
||||
_Atomic_ means that when an application calls an API function, as far as the application is concerned, at any given point in time, the system is either in a state where the function has not started yet, or in a state where the function has returned. The application never needs to worry about an intermediate state.
|
||||
|
||||
_Committing_ means that when a function returns, the data has been written to the persistent storage. As a consequence, if the system restarts during a sequence of storage modifications $M_1, M_2, \ldots, M_n$, we know that when the system restarts, a prefix of the sequence has been performed. For example, there will never be a situation where $M_2$ has been performed but not $M_1$.
|
||||
|
||||
The committing property is important not only for sequences of operations, but also when reporting the result of an operation to an external system. For example, if a key creation function in the PSA Crypto API reports to the application that a key has been created, and the application reports to a server that the key has been created, it is guaranteed that the key exists even if the system restarts.
|
||||
|
||||
## Assumptions on the underlying file storage
|
||||
|
||||
PSA relies on a PSA ITS (Internal Trusted Storage) interface, which exposes a simple API. There are two functions to modify files:
|
||||
|
||||
* `set()` writes a whole file (either creating it, or replacing the previous content).
|
||||
* `remove()` removes a file (returning a specific error code if the file does not exist).
|
||||
|
||||
**Assumption: the underlying ITS functions are atomic and committing.**
|
||||
|
||||
Since the underlying functions are atomic, the content of a file is always a version that was previously passed to `set()`. We do not try to handle the case where a file might be partially written.
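
To make the storage model concrete, here is a minimal in-memory sketch of the two modifying functions. The `toy_its_*` names, the status codes and the size limits are hypothetical and do not match the real PSA ITS signatures. The sketch only illustrates the whole-file semantics assumed above: a file is always a complete version previously passed to `set()`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define TOY_ITS_MAX_FILES 4
#define TOY_ITS_MAX_SIZE 32

typedef enum { TOY_OK, TOY_DOES_NOT_EXIST, TOY_NO_SPACE } toy_status_t;

typedef struct {
    int used;
    uint64_t uid;
    size_t length;
    uint8_t data[TOY_ITS_MAX_SIZE];
} toy_file_t;

static toy_file_t toy_files[TOY_ITS_MAX_FILES];

/* Return the slot holding `uid`, or NULL if there is no such file. */
toy_file_t *toy_its_find(uint64_t uid)
{
    for (size_t i = 0; i < TOY_ITS_MAX_FILES; i++) {
        if (toy_files[i].used && toy_files[i].uid == uid) {
            return &toy_files[i];
        }
    }
    return NULL;
}

/* Write a whole file: create it, or replace its previous content. */
toy_status_t toy_its_set(uint64_t uid, const void *data, size_t length)
{
    assert(data != NULL || length == 0);
    if (length > TOY_ITS_MAX_SIZE) {
        return TOY_NO_SPACE;
    }
    toy_file_t *slot = toy_its_find(uid);
    for (size_t i = 0; slot == NULL && i < TOY_ITS_MAX_FILES; i++) {
        if (!toy_files[i].used) {
            slot = &toy_files[i];
        }
    }
    if (slot == NULL) {
        return TOY_NO_SPACE;
    }
    /* Build the new version aside, then publish it in one assignment:
     * in this model a reader sees either the old file or the new one. */
    toy_file_t next = { 1, uid, length, { 0 } };
    if (length != 0) {
        memcpy(next.data, data, length);
    }
    *slot = next;
    return TOY_OK;
}

/* Remove a file, reporting a specific error if it does not exist. */
toy_status_t toy_its_remove(uint64_t uid)
{
    toy_file_t *slot = toy_its_find(uid);
    if (slot == NULL) {
        return TOY_DOES_NOT_EXIST;
    }
    slot->used = 0;
    return TOY_OK;
}
```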
|
||||
|
||||
## Overview of API functions
|
||||
|
||||
For a transparent key, all key management operations (creation or destruction) on persistent keys rely on a single call to the underlying storage (`set()` for a key creation, `remove()` for a key destruction). This also holds for an opaque key stored in a secure element that does not have its own key store: in this case, the core stores a wrapped (i.e. encrypted) copy of the key material, but this does not impact how the core interacts with the storage. Other API functions do not modify the storage.
|
||||
|
||||
The following case requires extra work related to resilience:
|
||||
|
||||
* [Key management for stateful secure element keys](#designing-key-management-for-secure-element-keys).
|
||||
|
||||
As a consequence, apart from the case listed above, the API calls inherit directly from the [resilience properties of the underlying storage](#assumptions-on-the-underlying-file-storage). We do not need to take any special precautions in the library design, and we do not need to perform any testing of resilience for transparent keys.
|
||||
|
||||
(This section was last updated for Mbed TLS 3.4.0 implementing PSA Crypto API 1.1.)
|
||||
|
||||
## Designing key management for secure element keys
|
||||
|
||||
In this section, we use “(stateful) secure element key” to mean a key stored in a stateful secure element, i.e. a secure element that stores keys. This excludes keys in a stateless secure element for which the core stores a wrapped copy of the key. We study the problem of how key management in stateful secure elements interacts with storage and explore the design space.
|
||||
|
||||
### Assumptions on stateful secure elements
|
||||
|
||||
**Assumption: driver calls for key management in stateful secure elements are atomic and committing.**
|
||||
|
||||
(For stateless secure elements, this assumption is vacuously true.)
|
||||
|
||||
### Dual management of keys: the problem
|
||||
|
||||
For a secure element key, key management requires a commitment on both sides. For example, consider a successful key creation operation:
|
||||
|
||||
1. The core sends a request to the secure element to create a key.
|
||||
2. The secure element modifies its key store to create the key.
|
||||
3. The secure element reports to the core that the key has been created.
|
||||
4. The core reports to the application that the key has been created.
|
||||
|
||||
If the core loses power between steps 1 and 2, the key does not exist yet. This is fine from an application's perspective since the core has not committed to the key's existence, but the core needs to take care not to leave resources in storage that are related to the non-existent key. If the core loses power between steps 2 and 3, the key exists in the secure element. From an application's perspective, the core may either report that the key exists or that it does not exist, but in the latter case, the core needs to free the key in the secure element, to avoid leaving behind inaccessible resources.
|
||||
|
||||
As a consequence, the content of the storage cannot remain the same between the end of step 1 and the end of step 3, since the core must behave differently depending on whether step 2 has taken place.
|
||||
|
||||
Accomplishing a transaction across system boundaries is a well-known problem in database management, with a well-known solution: two-phase commit.
|
||||
|
||||
### Overview of two-phase commit with stateful secure elements
|
||||
|
||||
With a key in a stateful secure element, a successful creation process goes as follows (see [“Key management in a secure element with storage” in the driver interface specification](../../proposed/psa-driver-interface.html#key-management-in-a-secure-element-with-storage)):
|
||||
|
||||
1. The core calls the driver's `"allocate_key"` entry point.
|
||||
2. The driver allocates a unique identifier _D_ for the key. This is unrelated to the key identifier _A_ used by the application interface. This step must not modify the state of the secure element.
|
||||
3. The core updates the storage to indicate that key identifier _A_ has the identifier _D_ in the driver, and that _A_ is in a half-created state.
|
||||
4. The core calls the driver's key creation entry point, passing it the driver's chosen identifier _D_.
|
||||
5. The driver creates the key in the secure element. When this happens, it concludes the voting phase of the two-phase commit: effectively, the secure element decides to commit. (It is however possible to revert this commitment by giving the secure element the order to destroy the key.)
|
||||
6. The core updates the storage to indicate that _A_ is now in a fully created state. This concludes the commit phase of the two-phase commit.
|
||||
|
||||
If there is a loss of power:
|
||||
|
||||
* Before step 3: the system state has not changed at all. As far as the world is concerned, the key creation attempt never happened.
|
||||
* Between step 3 and step 6: upon restart, the core needs to find out whether the secure element completed step 5 or not, and reconcile the state of the storage with the state of the secure element.
|
||||
* After step 6: the key has been created successfully.
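
The three power-loss cases for key creation can be sketched with a toy model that tracks only the core's storage state and the secure element's state. All names here (`toy_world_t`, `toy_create_key`, `toy_needs_recovery`) are hypothetical, and the power loss is simulated by a step budget rather than by a real restart.

```c
#include <assert.h>
#include <stdbool.h>

typedef enum { TOY_ABSENT, TOY_HALF_CREATED, TOY_CREATED } toy_storage_t;

typedef struct {
    toy_storage_t storage; /* what the core's storage says about key A */
    bool key_in_se;        /* whether the secure element holds key D */
} toy_world_t;

/* Run key creation, losing power after `steps_before_power_loss` state
 * changes. Returns true if creation ran to completion. */
bool toy_create_key(toy_world_t *w, int steps_before_power_loss)
{
    assert(w->storage == TOY_ABSENT && !w->key_in_se);
    /* Steps 1-2 ("allocate_key") do not change any persistent state. */
    if (steps_before_power_loss-- == 0) {
        return false;
    }
    w->storage = TOY_HALF_CREATED; /* step 3 */
    if (steps_before_power_loss-- == 0) {
        return false;
    }
    w->key_in_se = true;           /* steps 4-5 */
    if (steps_before_power_loss-- == 0) {
        return false;
    }
    w->storage = TOY_CREATED;      /* step 6 */
    return true;
}

/* After a restart, recovery is needed only in the half-created state. */
bool toy_needs_recovery(const toy_world_t *w)
{
    return w->storage == TOY_HALF_CREATED;
}
```

A budget of 0 corresponds to a power loss before step 3, budgets of 1 and 2 to a power loss between steps 3 and 6 (without and with the key in the secure element), and a budget of 3 to the nominal case.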
|
||||
|
||||
Key destruction goes as follows:
|
||||
|
||||
1. The core updates the storage indicating that the key is being destroyed.
|
||||
2. The core calls the driver's `"destroy_key"` entry point.
|
||||
3. The secure element destroys the key.
|
||||
4. The core updates the storage to indicate that the key has been destroyed.
|
||||
|
||||
If there is a loss of power:
|
||||
|
||||
* Before step 1: the system state has not changed at all. As far as the world is concerned, the key destruction attempt never happened.
|
||||
* Between step 1 and step 4: upon restart, the core needs to find out whether the secure element completed step 3 or not, and reconcile the state of the storage with the state of the secure element.
|
||||
* After step 4: the key has been destroyed successfully.
|
||||
|
||||
In both cases, upon restart, the core needs to perform a transaction recovery. When a power loss happens, the core decides whether to commit or abort the transaction.
|
||||
|
||||
Note that the analysis in this section assumes that the driver does not update its persistent state during a key management operation, or at least not in a way that influences the key management process (for example, it might renew an authorization token).
|
||||
|
||||
### Optimization considerations for transactions
|
||||
|
||||
We assume that power failures are rare. Therefore we will primarily optimize for the normal case. Transaction recovery needs to be practical, but does not have to be fully optimized.
|
||||
|
||||
The main quantity we will optimize for is the number of storage updates in the nominal case. This is good for performance: for key management operations that require little computation, storage writes are likely to dominate the runtime on hardware where storage writes are slow and communication with the secure element is fast. In addition, minimizing the number of storage updates is good for the longevity of flash media.
|
||||
|
||||
#### Information available during recovery
|
||||
|
||||
The PSA ITS API does not support enumerating files in storage: an ITS call can only access one file identifier. Therefore transaction recovery cannot be done by traversing files whose name is or encodes the key identifier. It must start by traversing a small number of files whose names are independent of the key identifiers involved.
|
||||
|
||||
#### Minimum effort for a transaction
|
||||
|
||||
Per the [assumptions on the underlying file storage](#assumptions-on-the-underlying-file-storage), each atomic operation in the internal storage concerns a single file: either removing it, or setting its content. Furthermore there is no way to enumerate the files in storage.
|
||||
|
||||
A key creation function must transform the internal storage from a state where file `id` does not exist, to a state where file `id` exists and has its desired final content (containing the key attributes and the driver's key identifier). The situation is similar with key destruction, except that the initial and final states are exchanged. Neither the initial state nor the final state reference `id` otherwise.
|
||||
|
||||
For a key that is not in a stateful secure element, the transaction consists of a single write operation. As discussed previously, this is not possible with a stateful secure element because the state of the internal storage needs to change both before and after the state change in the secure element. No other single-write algorithm works.
|
||||
|
||||
If there is a power failure around the time of changing the state of the secure element, there must be information in the internal storage that indicates that key `id` has a transaction in progress. The file `id` cannot be used for this purpose because there is no way to enumerate all keys (and even if there was, it would not be practical). Therefore the transaction will need to modify some other file `t` with a fixed name (a name that doesn't depend on the key). Since the final system state should be identical to the initial state except for the file `id`, the minimum number of storage operations for a transaction is 3:
|
||||
|
||||
* Write (create or update) a file `t` referencing `id`.
|
||||
* Write the final state of `id`.
|
||||
* Restore `t` to its initial state.
|
||||
|
||||
The strategies discussed in the [overview above](#overview-of-two-phase-commit-with-stateful-secure-elements) follow this pattern, with `t` being the file containing the transaction list that the recovery consults. We have just proved that this pattern is optimal.
|
||||
|
||||
Note that this pattern requires the state of `id` to be modified only once. In particular, if a key management operation involves writing an intermediate state for `id` before modifying the secure element state and writing a different state after that, this will require a total of 4 updates to internal storage. Since we want to minimize the number of storage updates, we will not explore designs that involve updating `id` twice or more.
|
||||
|
||||
### Recovery strategies
|
||||
|
||||
When the core starts, it needs to know about transaction(s) that need to be resumed. This information will be stored in a persistent “transaction list”, with one entry per key. In this section, we explore recovery strategies, and we determine what the transaction list needs to contain as well as when it needs to be updated. Other sections will explore the format of the transaction list, as well as how many keys it needs to contain.
|
||||
|
||||
#### Exploring the recovery decision tree
|
||||
|
||||
There are four cases for recovery when a transaction is in progress. In each case, the core can either decide to commit the transaction (which may require replaying the interrupted part) or abort it (which may require a rewind in the secure element). It may call the secure element driver's `"get_key_attributes"` entry point to find out whether the key is present.
|
||||
|
||||
* Key creation, key not present in the secure element:
|
||||
* Committing means replaying the driver call in the key creation. This requires all the input, for example the data to import. This seems impractical in general. Also, the second driver call requires a new call to `"allocate_key"`, which will in general change the key's driver identifier, which complicates state management in the core. Given the likely complexity, we exclude this strategy.
|
||||
* Aborting means removing any trace of the key creation.
|
||||
* Key creation, key present in the secure element:
|
||||
* Committing means finishing the update of the core's persistent state, as would have been done if the transaction had not been interrupted.
|
||||
* Aborting means destroying the key in the secure element and removing any local storage used for that key.
|
||||
* Key destruction, key not present in the secure element:
|
||||
* Committing means finishing the update of the core's persistent state, as would have been done if the transaction had not been interrupted, by removing any remaining local storage used for that key.
|
||||
* Aborting would mean re-creating the key in the secure element, which is impossible in general since the key material is no longer present.
|
||||
* Key destruction, key present in the secure element:
|
||||
* Committing means finishing the update of the core's persistent state, as would have been done if the transaction had not been interrupted, by removing any remaining local storage used for that key and destroying the key in the secure element.
|
||||
* Aborting means keeping the key. This requires no action on the secure element, and is only practical locally if the local storage is intact.
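
The four cases above can be summarized as a small lookup function. This is only a sketch of the analysis, with hypothetical names (`toy_op_t`, `toy_recovery_options`, `TOY_CAN_COMMIT`, `TOY_CAN_ABORT`). It returns a bitmask of the recovery actions that the analysis treats as practical.

```c
#include <assert.h>
#include <stdbool.h>

typedef enum { TOY_OP_CREATE, TOY_OP_DESTROY } toy_op_t;

#define TOY_CAN_COMMIT 0x1u
#define TOY_CAN_ABORT  0x2u

/* Return the practical recovery actions for an interrupted transaction,
 * given the operation and whether the key is in the secure element. */
unsigned toy_recovery_options(toy_op_t op, bool key_in_se)
{
    assert(op == TOY_OP_CREATE || op == TOY_OP_DESTROY);
    if (op == TOY_OP_CREATE) {
        /* Key absent: committing would mean replaying the creation with
         * all of its inputs, which the analysis above excludes. */
        return key_in_se ? (TOY_CAN_COMMIT | TOY_CAN_ABORT) : TOY_CAN_ABORT;
    }
    /* Key absent: aborting would mean re-creating lost key material.
     * Key present: aborting additionally needs intact local storage. */
    return key_in_se ? (TOY_CAN_COMMIT | TOY_CAN_ABORT) : TOY_CAN_COMMIT;
}
```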
|
||||
|
||||
#### Comparing recovery strategies
|
||||
|
||||
From the analysis above, assuming that all keys are treated in the same way, there are 4 possible strategies.
|
||||
|
||||
* [Always follow the state of the secure element](#exploring-the-follow-the-secure-element-strategy). This requires the secure element driver to have a `"get_key_attributes"` entry point. Recovery means resuming the operation where it left off. For key creation, this means that the key metadata needs to be saved before calling the secure element's key creation entry point.
|
||||
* Minimize the information processing: [always destroy the key](#exploring-the-always-destroy-strategy), i.e. abort all key creations and commit all key destructions. This does not require querying the state of the secure element. This does not require any special precautions to preserve information about the key during the transaction. It simplifies recovery in that the recovery process might not even need to know whether it's recovering a key creation or a key destruction.
|
||||
* Follow the state of the secure element for key creation, but always go ahead with key destruction. This requires the secure element driver to have a `"get_key_attributes"` entry point. Compared to always following the state of the secure element, this has the advantage of maximizing the chance that a command to destroy key material is effective. Compared to always destroying the key, this has a performance advantage if a key creation is interrupted. These do not seem like decisive advantages, so we will not consider this strategy further.
|
||||
* Always abort key creation, but follow the state of the secure element for key destruction. I can't think of a good reason to choose this strategy.
|
||||
|
||||
Requiring the driver to have a `"get_key_attributes"` entry point is potentially problematic because some secure elements don't have room to store key attributes: a key slot always exists, and it's up to the user to remember what, if anything, they put in it. The driver has to remember anyway, so that it can find a free slot when creating a key. But with a recovery strategy that doesn't involve a `"get_key_attributes"` entry point, the driver design is easier: the driver doesn't need to protect the information about slots in use against a power failure, the core takes care of that.
|
||||
|
||||
#### Exploring the follow-the-secure-element strategy
|
||||
|
||||
Each entry in the transaction list contains the API key identifier, the key lifetime (or at least the location), the driver key identifier (not constant-size), and an indication of whether the key is being created or destroyed.
|
||||
|
||||
For key creation, we have all the information to store in the key file once the `"allocate_key"` call returns. We must store all the information that will go in the key file before calling the driver's key creation entry point. Therefore the normal sequence of operations is:
|
||||
|
||||
1. Call the driver's `"allocate_key"` entry point.
|
||||
2. Add the key to the transaction list, indicating that it is being created.
|
||||
3. Write the key file.
|
||||
4. Call the driver's key creation entry point.
|
||||
5. Remove the key from the transaction list.
|
||||
|
||||
During recovery, for each key in the transaction list that was being created:
|
||||
|
||||
* If the key exists in the secure element, just remove it from the transaction list.
|
||||
* If the key does not exist in the secure element, first remove the key file if it is present, then remove the key from the transaction list.
|
||||
|
||||
For key destruction, we need to preserve the key file until after the key has been destroyed. Therefore the normal sequence of operations is:
|
||||
|
||||
1. Add the key to the transaction list, indicating that it is being destroyed.
|
||||
2. Call the driver's `"destroy_key"` entry point.
|
||||
3. Remove the key file.
|
||||
4. Remove the key from the transaction list.
|
||||
|
||||
During recovery, for each key in the transaction list that was being destroyed:
|
||||
|
||||
* If the key exists in the secure element, call the driver's `"destroy_key"` entry point, then remove the key file, and finally remove the key from the transaction list.
|
||||
* If the key does not exist in the secure element, remove the key file if it is still present, then remove the key from the transaction list.
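
The recovery rules of this strategy can be sketched on a toy single-key model. All names are hypothetical. `toy_se_has_key` stands in for a query through the driver's `"get_key_attributes"` entry point, and the three booleans stand in for real storage and secure element state.

```c
#include <assert.h>
#include <stdbool.h>

typedef enum { TOY_TXN_NONE, TOY_TXN_CREATING, TOY_TXN_DESTROYING } toy_txn_t;

typedef struct {
    toy_txn_t txn;  /* transaction list entry for the key, if any */
    bool key_file;  /* key file present in internal storage */
    bool key_in_se; /* key present in the (simulated) secure element */
} toy_world_t;

/* Stand-in for querying the driver's "get_key_attributes" entry point. */
bool toy_se_has_key(const toy_world_t *w)
{
    return w->key_in_se;
}

/* Recover one key by following the state of the secure element. */
void toy_recover_follow_se(toy_world_t *w)
{
    if (w->txn == TOY_TXN_CREATING) {
        /* Creation writes the key file before calling the driver, so a
         * key in the secure element implies that the key file exists. */
        assert(!w->key_in_se || w->key_file);
        if (!toy_se_has_key(w)) {
            w->key_file = false; /* abort: remove the key file if present */
        }
    } else if (w->txn == TOY_TXN_DESTROYING) {
        if (toy_se_has_key(w)) {
            w->key_in_se = false; /* call "destroy_key" */
        }
        w->key_file = false; /* remove the key file if still present */
    }
    w->txn = TOY_TXN_NONE; /* finally, drop the transaction list entry */
}
```

In every reachable starting state, recovery ends with no transaction in progress and with the key file present exactly when the key is in the secure element.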
|
||||
|
||||
#### Exploring the always-destroy strategy
|
||||
|
||||
Each entry in the transaction list contains the API key identifier, the key lifetime (or at least the location), and the driver key identifier (not constant-size).
|
||||
|
||||
For key creation, we do not need to store the key's metadata until it has been created in the secure element. Therefore the normal sequence of operations is:
|
||||
|
||||
1. Call the driver's `"allocate_key"` entry point.
|
||||
2. Add the key to the transaction list.
|
||||
3. Call the driver's key creation entry point.
|
||||
4. Write the key file.
|
||||
5. Remove the key from the transaction list.
|
||||
|
||||
For key destruction, we can remove the key file before contacting the secure element. Therefore the normal sequence of operations is:
|
||||
|
||||
1. Add the key to the transaction list.
|
||||
2. Remove the key file.
|
||||
3. Call the driver's `"destroy_key"` entry point.
|
||||
4. Remove the key from the transaction list.
|
||||
|
||||
Recovery means removing all traces of all keys on the transaction list. This means following the destruction process, starting after the point where the key has been added to the transaction list, and ignoring any failure of a removal action if the item to remove does not exist:
|
||||
|
||||
1. Remove the key file, treating `DOES_NOT_EXIST` as a success.
|
||||
2. Call the driver's `"destroy_key"` entry point, treating `DOES_NOT_EXIST` as a success.
|
||||
3. Remove the key from the transaction list.
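
The three recovery steps above can be sketched on a toy single-key model, together with a check that recovery still converges if it is itself interrupted. All names are hypothetical, and a power loss is simulated by a step budget. Setting a flag that is already false models treating `DOES_NOT_EXIST` as a success.

```c
#include <assert.h>
#include <stdbool.h>

typedef struct {
    bool in_txn_list; /* key is listed in the transaction list */
    bool key_file;    /* key file present in internal storage */
    bool key_in_se;   /* key present in the (simulated) secure element */
} toy_world_t;

/* Run recovery, losing power after `steps_before_power_loss` state
 * changes. Returns true if recovery ran to completion. */
bool toy_recover_always_destroy(toy_world_t *w, int steps_before_power_loss)
{
    assert(steps_before_power_loss >= 0);
    if (!w->in_txn_list) {
        return true; /* no transaction in progress */
    }
    if (steps_before_power_loss-- == 0) {
        return false;
    }
    w->key_file = false;    /* step 1: remove the key file */
    if (steps_before_power_loss-- == 0) {
        return false;
    }
    w->key_in_se = false;   /* step 2: "destroy_key" */
    if (steps_before_power_loss-- == 0) {
        return false;
    }
    w->in_txn_list = false; /* step 3: drop the transaction list entry */
    return true;
}

/* From every starting combination, interrupt recovery once at every
 * point, rerun it to completion, and check that no trace remains. */
bool toy_recovery_survives_interruption(void)
{
    for (int start = 0; start < 4; start++) {
        for (int cut = 0; cut < 3; cut++) {
            toy_world_t w = { true, (start & 1) != 0, (start & 2) != 0 };
            if (toy_recover_always_destroy(&w, cut)) {
                return false; /* should have been interrupted */
            }
            if (!toy_recover_always_destroy(&w, 3)) {
                return false; /* should have completed */
            }
            if (w.in_txn_list || w.key_file || w.key_in_se) {
                return false;
            }
        }
    }
    return true;
}
```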
|
||||
|
||||
#### Always-destroy strategy with a simpler transaction file
|
||||
|
||||
We can modify the [always-destroy strategy](#exploring-the-always-destroy-strategy) to make the transaction file simpler: if we ensure that the key file always exists if the key exists in the secure element, then the transaction list does not need to include the driver key identifier: it can be read from the key file.
|
||||
|
||||
For key creation, we need to store the key's metadata before creating the key in the secure element. Therefore the normal sequence of operations is:
|
||||
|
||||
1. Call the driver's `"allocate_key"` entry point.
|
||||
2. Add the key to the transaction list.
|
||||
3. Write the key file.
|
||||
4. Call the driver's key creation entry point.
|
||||
5. Remove the key from the transaction list.
|
||||
|
||||
For key destruction, we need to contact the secure element before removing the key file. Therefore the normal sequence of operations is:
|
||||
|
||||
1. Add the key to the transaction list.
|
||||
2. Call the driver's `"destroy_key"` entry point.
|
||||
3. Remove the key file.
|
||||
4. Remove the key from the transaction list.
|
||||
|
||||
Recovery means removing all traces of all keys on the transaction list. This means following the destruction process, starting after the point where the key has been added to the transaction list, and ignoring any failure of a removal action if the item to remove does not exist:
|
||||
|
||||
1. Load the driver key identifier from the key file. If the key file does not exist, skip to step 4.
|
||||
2. Call the driver's `"destroy_key"` entry point, treating `DOES_NOT_EXIST` as a success.
|
||||
3. Remove the key file, treating `DOES_NOT_EXIST` as a success.
|
||||
4. Remove the key from the transaction list.
|
||||
|
||||
Compared with the basic always-destroy strategy:
|
||||
|
||||
* The transaction file handling is simpler since its entries have a fixed size.
|
||||
* The flow of information is somewhat different from transparent keys and keys in stateless secure elements: we aren't just replacing “create the key material” by “tell the secure element to create the key material”, those happen at different times. But there's a different flow for stateful secure elements anyway, since the call to `"allocate_key"` has no analog in the stateless secure element or transparent cases.
|
||||
|
||||
#### Assisting secure element drivers with recovery
|
||||
|
||||
The actions of the secure element driver may themselves be non-atomic. So the driver must be given a chance to perform recovery.
|
||||
|
||||
To simplify the design of the driver, the core should guarantee that the driver will know if a transaction was in progress and the core cannot be sure about the state of the secure element. Merely calling a read-only entry point such as `"get_key_attributes"` does not provide enough information to the driver for it to know that it should actively perform recovery related to that key.
|
||||
|
||||
This gives an advantage to the “always destroy” strategy. Under this strategy, if the key might be in a transitional state, the core will request a key destruction from the driver. This means that, if the driver has per-key auxiliary data to clean up, it can bundle that as part of the key's destruction.
|
||||
|
||||
### Testing non-atomic processes
|
||||
|
||||
In this section, we discuss how to test non-atomic processes that must implement an atomic and committing interface. As discussed in [“Overview of API functions”](#overview-of-api-functions), this concerns key management in stateful secure elements.
|
||||
|
||||
#### Naive test strategy for non-atomic processes
|
||||
|
||||
Non-atomic processes consist of a series of atomic, committing steps.
|
||||
|
||||
Our general strategy to test them is as follows: every time there is a modification of persistent state, either in storage or in the (simulated) secure element, try both the nominal case and simulating a power loss. If a power loss occurs, restart the system (i.e. clean up and call `psa_crypto_init()`), and check that the system ends up in a consistent state.
|
||||
|
||||
Note that this creates a binary tree of possibilities: after each state modification, there may or may not be a restart, and after that different state modifications may occur, each of which may or may not be followed by a restart.
|
||||
|
||||
For example, consider testing of one key creation operation (see [“Overview of two-phase commit with stateful secure elements”](#overview-of-two-phase-commit-with-stateful-secure-elements)), under the simplifying assumption that each storage update step, as well as the recovery after a restart, makes a single (atomic) storage modification and no secure element access. The nominal case consists of three state modifications: storage modification (start transaction), creation on the secure element, storage modification (commit transaction). We need to test the following sequences:
|
||||
|
||||
* Start transaction, restart, recovery.
|
||||
* Start transaction, secure element operation, restart, recovery.
|
||||
* Start transaction, secure element operation, commit transaction.
|
||||
|
||||
If, for example, recovery consists of two atomic steps, the tree of possibilities expands and may be infinite:
|
||||
|
||||
* Start transaction, restart, recovery step 1, restart, recovery step 1, recovery step 2.
|
||||
* Start transaction, restart, recovery step 1, restart, recovery step 1, restart, recovery step 1, recovery step 2.
|
||||
* Start transaction, restart, recovery step 1, restart, recovery step 1, restart, recovery step 1, restart, recovery step 1, recovery step 2.
|
||||
* etc.
|
||||
* Start transaction, secure element operation, restart, ...
|
||||
* Start transaction, secure element operation, commit transaction.
|
||||
|
||||
In order to limit the possibilities, we need to make some assumptions about the recovery step. For example, if we have confidence that recovery step 1 is idempotent (i.e. doing it twice is the same as doing it once), we don't need to test what happens in execution sequences that take recovery step 1 more than twice in a row.
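
The naive strategy can be sketched as a loop over power-loss points. This is a toy model with hypothetical names, not test code from the library. Key creation is three state modifications, recovery is modelled as one atomic step that aborts the transaction, and, as a simplification of this sketch, that step also resets the simulated secure element.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_CREATE_STEPS 3

typedef struct {
    bool txn_open;  /* a transaction is recorded in storage */
    bool key_in_se; /* key present in the (simulated) secure element */
    bool key_file;  /* key file present in internal storage */
} toy_world_t;

/* Run key creation, losing power after `steps_before_power_loss` state
 * modifications. Returns true if creation ran to completion. */
bool toy_create_key(toy_world_t *w, int steps_before_power_loss)
{
    assert(steps_before_power_loss >= 0);
    if (steps_before_power_loss-- == 0) {
        return false;
    }
    w->txn_open = true;   /* start transaction */
    if (steps_before_power_loss-- == 0) {
        return false;
    }
    w->key_in_se = true;  /* secure element operation */
    if (steps_before_power_loss-- == 0) {
        return false;
    }
    w->key_file = true;   /* commit transaction, modelled as */
    w->txn_open = false;  /* a single atomic storage update  */
    return true;
}

/* Simulated restart: abort any open transaction in one atomic step. */
void toy_restart_and_recover(toy_world_t *w)
{
    if (w->txn_open) {
        w->key_in_se = false;
        w->key_file = false;
        w->txn_open = false;
    }
}

bool toy_consistent(const toy_world_t *w)
{
    return !w->txn_open && w->key_file == w->key_in_se;
}

/* Try a power loss before each state modification plus the nominal run.
 * Returns the number of scenarios that ended in a consistent state. */
int toy_count_consistent_scenarios(void)
{
    int ok = 0;
    for (int cut = 0; cut <= TOY_CREATE_STEPS; cut++) {
        toy_world_t w = { false, false, false };
        bool finished = toy_create_key(&w, cut);
        if (!finished) {
            toy_restart_and_recover(&w);
        }
        /* The key must exist exactly when creation reported success. */
        if (toy_consistent(&w) && w.key_file == finished) {
            ok++;
        }
    }
    return ok;
}
```

Because recovery is a single atomic step here, the tree of possibilities stays linear. A multi-step recovery would need the same loop applied to recovery itself.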
|
||||
|
||||
### Splitting normal behavior and transaction recovery
|
||||
|
||||
We introduce an abstraction level in transaction recovery:
|
||||
|
||||
* Normal operation must maintain a certain invariant on the state of the world (internal storage and secure element).
|
||||
* Transaction recovery is defined over all states of the world that satisfy this invariant.
|
||||
|
||||
This separation of concerns greatly facilitates testing, since it is now split into two parts:
|
||||
|
||||
* During the testing of normal operation, we can use read-only invasive testing to ensure that the invariant is maintained. No modification of normal behavior (such as simulated power failures) is necessary.
|
||||
* Testing of transaction recovery is independent of how the system state was reached. We only need to artificially construct a representative sample of system states that match the invariant. Transaction recovery is itself an operation that must respect the invariant, and so we do not need any special testing for the case of an interrupted recovery.
|
||||
|
||||
Another benefit of this approach is that it is easier to specify and test what happens if the library is updated on a device with leftovers from an interrupted transaction. We will require and test that the new version of the library supports recovery of the old library's states, without worrying how those states were reached.
|
||||
|
||||
#### Towards an invariant for transactions
|
||||
|
||||
As discussed in the section [“Recovery strategies”](#recovery-strategies), the information about active transactions is stored in a transaction list file. The name of the transaction list file does not depend on the identifiers of the keys in the list, but there may be more than one transaction list, for example one per secure element. If so, each transaction list can be considered independently.
|
||||
|
||||
When no transaction is in progress, the transaction list does not exist, or is empty. The empty case must be supported because this is the initial state of the filesystem. When no transaction is in progress, the state of the secure element must be consistent with references to keys in that secure element contained in key files. More generally, if a key is not in the transaction list, then the key must be present in the secure element if and only if the key file is in the internal storage.
|
||||
|
||||
For the purposes of the state invariant, it matters whether the transaction list file contains the driver key identifier, or if the driver key identifier is only stored in the key file. This is because the core needs to know the driver key id in order to access the secure element. If the transaction list does not contain the driver key identifier, and the key file does not exist, the key must not be present in the secure element.
|
||||
|
||||
We thus have two scenarios, each with their own invariant: one where the transaction list contains only key identifiers, and one where it also contains the secure element's key identifier (as well as the location of the secure element if this is not encoded in the name of the transaction list file).
|
||||
|
||||
#### Storage invariant if the transaction list contains application key identifiers only
|
||||
|
||||
Invariants:
|
||||
|
||||
* If the file `id` does not exist, then no resources corresponding to that key are in a secure element. This holds whether `id` is in the transaction list or not.
|
||||
* If `id` is not in the transaction list and the file `id` exists and references a key in a stateful secure element, then the key is present in the secure element.
|
||||
|
||||
If `id` is in the transaction list and the file `id` exists, the key may or may not be present in the secure element.
|
||||
|
||||
The invariant imposes constraints on the [order of operations for the two-phase commit](#overview-of-two-phase-commit-with-stateful-secure-elements): key creation must create `id` before calling the secure element's key creation entry point, and key destruction must remove `id` after calling the secure element's key destruction entry point.
|
||||
|
||||
For recovery:
|
||||
|
||||
* If the file `id` does not exist, then nothing needs to be done for recovery, other than removing `id` from the transaction list.
|
||||
* If the file `id` exists:
|
||||
* It is correct to destroy the key in the secure element (treating a `DOES_NOT_EXIST` error as a success), then remove `id`.
|
||||
* It is correct to check whether the key exists in the secure element, and if it does, keep it and keep `id`. If not, remove `id` from the internal storage.
|
||||
|
||||
#### Storage invariant if the transaction list contains driver key identifiers
|
||||
|
||||
Invariants:
|
||||
|
||||
* If `id` is not in the transaction list and the file `id` does not exist, then no resources corresponding to that key are in a secure element.
|
||||
* If `id` is not in the transaction list and the file `id` exists, then the key is present in the secure element.
|
||||
|
||||
If `id` is in the transaction list, neither the state of `id` in the internal storage nor the state of the key in the secure element is known.
|
||||
|
||||
For recovery:
|
||||
|
||||
* If the file `id` does not exist, then destroy the key in the secure element (treating a `DOES_NOT_EXIST` error as a success).
|
||||
* If the file `id` exists:
|
||||
* It is correct to destroy the key in the secure element (treating a `DOES_NOT_EXIST` error as a success), then remove `id`.
|
||||
* It is correct to check whether the key exists in the secure element, and if it does, keep it and keep `id`. If not, remove `id` from the internal storage.
|
||||
|
||||
#### Coverage of states that respect the invariant
|
||||
|
||||
For a given key, we have to consider three a priori independent boolean states:
|
||||
|
||||
* Whether the key file exists.
|
||||
* Whether the key is in the secure element.
|
||||
* Whether the key is in the transaction list.
|
||||
|
||||
There is full coverage for one key if we have tests of recovery for the states among these $2^3 = 8$ possibilities that satisfy the storage invariant.
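
The enumeration can be sketched in a few lines of C. This is an illustrative model only: `key_state_t`, `invariant_holds` and `count_valid_states` are hypothetical names, the key is assumed to be located in a stateful secure element, and the predicate encodes the invariant for a transaction list that contains application key identifiers only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical model of the per-key state; not part of the Mbed TLS API. */
typedef struct {
    bool file_exists;   /* key file `id` is present in the internal storage */
    bool in_se;         /* key is present in the secure element */
    bool in_txn_list;   /* `id` is listed in the transaction list */
} key_state_t;

/* Invariant for a transaction list holding application key identifiers only. */
static bool invariant_holds(key_state_t s)
{
    if (!s.file_exists && s.in_se) {
        return false;   /* no key file => no resources in the secure element */
    }
    if (s.file_exists && !s.in_txn_list && !s.in_se) {
        return false;   /* settled key file => key must be in the secure element */
    }
    return true;
}

/* Enumerate all 2^3 states, print them, and count those needing a recovery test. */
static int count_valid_states(void)
{
    int valid = 0;
    for (int bits = 0; bits < 8; bits++) {
        key_state_t s = {
            .file_exists = (bits & 4) != 0,
            .in_se       = (bits & 2) != 0,
            .in_txn_list = (bits & 1) != 0,
        };
        bool ok = invariant_holds(s);
        printf("file=%d se=%d list=%d -> %s\n",
               s.file_exists, s.in_se, s.in_txn_list,
               ok ? "needs a recovery test" : "violates the invariant");
        valid += ok;
    }
    return valid;
}
```

Under this model, five of the eight states satisfy the invariant, so full single-key coverage amounts to five recovery test cases.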
|
||||
|
||||
In addition, testing should adequately cover the case of multiple keys in the transaction list. How much coverage is adequate depends on the layout of the list as well as white-box considerations of how the list is manipulated.
|
||||
|
||||
### Choice of a transaction design
|
||||
|
||||
#### Chosen transaction algorithm
|
||||
|
||||
Based on [“Optimization considerations for transactions”](#optimization-considerations-for-transactions), we choose a transaction algorithm that consists in the following operations:
|
||||
|
||||
1. Add the key identifier to the transaction list.
|
||||
2. Call the secure element's key creation or destruction entry point.
|
||||
3. Remove the key identifier from the transaction list.
|
||||
|
||||
In addition, before or after step 2, create or remove the key file in the internal storage.
|
||||
|
||||
In order to conveniently support multiple transactions at the same time, we pick the simplest possible layout for the transaction list: a simple array of key identifiers. Since the transaction list does not contain the driver key identifier:
|
||||
|
||||
* During key creation, create the key file in the internal storage before calling the secure element's key creation entry point.
|
||||
* During key destruction, call the secure element's key destruction entry point before removing the key file in internal storage.
|
||||
|
||||
This choice of algorithm does not require the secure element driver to have a `"get_key_attributes"` entry point.
|
||||
|
||||
#### Chosen storage invariant
|
||||
|
||||
The [storage invariant](#storage-invariant-if-the-transaction-list-contains-application-key-identifiers-only) is as follows:
|
||||
|
||||
* If the file `id` does not exist, then no resources corresponding to that key are in a secure element. This holds whether `id` is in the transaction list or not.
|
||||
* If `id` is not in the transaction list and the file `id` exists and references a key in a stateful secure element, then the key is present in the secure element.
|
||||
* If `id` is in the transaction list and a key exists by that identifier, the key's location is a stateful secure element.
|
||||
|
||||
#### Chosen recovery process
|
||||
|
||||
To [assist secure element drivers with recovery](#assisting-secure-element-drivers-with-recovery), we pick the [always-destroy recovery strategy with a simple transaction file](#always-destroy-strategy-with-a-simpler-transaction-file). The recovery process is as follows:
|
||||
|
||||
* If the file `id` does not exist, then nothing needs to be done for recovery, other than removing `id` from the transaction list.
|
||||
* If the file `id` exists, call the secure element's key destruction entry point (treating a `DOES_NOT_EXIST` error as a success), then remove `id`.
|
||||
|
||||
## Specification of key management in stateful secure elements
|
||||
|
||||
This section only concerns stateful secure elements as discussed in [“Designing key management for secure element keys”](#designing-key-management-for-secure-element-keys), i.e. secure elements with an `"allocate_key"` entry point. The design follows the general principle described in [“Overview of two-phase commit with stateful secure elements”](#overview-of-two-phase-commit-with-stateful-secure-elements) and the specific choices justified in [“Choice of a transaction design”](#choice-of-a-transaction-design).
|
||||
|
||||
### Transaction list file manipulation
|
||||
|
||||
The transaction list is a simple array of key identifiers.
|
||||
|
||||
To add a key identifier to the list:
|
||||
|
||||
1. Load the current list from the transaction list file if it exists and is not already cached in memory.
|
||||
2. Append the key identifier to the array.
|
||||
3. Write the updated list file.
|
||||
|
||||
To remove a key identifier from the list:
|
||||
|
||||
1. Load the current list if it is not already cached in memory. It is an error if the file does not exist since it must contain this identifier.
|
||||
2. Remove the key identifier from the array. If it wasn't the last element in the array, move the following elements to fill the hole.
|
||||
3. If the list is now empty, remove the transaction list file. Otherwise write the updated list to the file.
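
A minimal sketch of the two list operations on the in-memory copy follows. The names `txn_list_t`, `txn_list_add`, `txn_list_remove` and the fixed capacity `TXN_LIST_MAX` are hypothetical, and the file load and write steps are only indicated by comments.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define TXN_LIST_MAX 8              /* hypothetical capacity for this sketch */

/* In-memory copy of the transaction list; an empty list models a missing file. */
typedef struct {
    uint64_t ids[TXN_LIST_MAX];
    size_t count;
} txn_list_t;

/* Append a key identifier. Returns false if the list is full. */
static bool txn_list_add(txn_list_t *list, uint64_t id)
{
    if (list->count == TXN_LIST_MAX) {
        return false;
    }
    list->ids[list->count++] = id;
    /* A real implementation would now write the updated list file. */
    return true;
}

/* Remove a key identifier, moving later elements down to fill the hole.
 * Returns false if the identifier is not in the list, which is an error. */
static bool txn_list_remove(txn_list_t *list, uint64_t id)
{
    for (size_t i = 0; i < list->count; i++) {
        if (list->ids[i] == id) {
            memmove(&list->ids[i], &list->ids[i + 1],
                    (list->count - i - 1) * sizeof(list->ids[0]));
            list->count--;
            /* If count is now 0, remove the file; otherwise rewrite it. */
            return true;
        }
    }
    return false;
}
```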
|
||||
|
||||
### Key creation process in the core
|
||||
|
||||
Let _A_ be the application key identifier.
|
||||
|
||||
1. Call the driver's `"allocate_key"` entry point, obtaining the driver key identifier _D_ chosen by the driver.
|
||||
2. Add _A_ [to the transaction list file](#transaction-list-file-manipulation).
|
||||
3. Create the key file _A_ in the internal storage. Note that this is done at a different time from what happens when creating a transparent key or a key in a stateless secure element: in those cases, creating the key file happens after the actual creation of the key material.
|
||||
4. Call the secure element's key creation entry point.
|
||||
5. Remove _A_ [from the transaction list file](#transaction-list-file-manipulation).
|
||||
|
||||
If any step fails:
|
||||
|
||||
* If the secure element's key creation entry point has been called and succeeded, call the secure element's destroy entry point.
|
||||
* If the key file has been created in the internal storage, remove it.
|
||||
* Remove the key from the transaction list.
|
||||
|
||||
Note that this process is identical to key destruction, except that the key is already in the transaction list.
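
The ordering of steps 2 to 5 and the cleanup path can be checked against the storage invariant with a small model. This is a sketch under stated assumptions: `key_model_t`, `invariant_holds` and `model_create_key` are hypothetical names, a failing step is assumed to change nothing, and the cleanup actions are assumed to succeed.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of one key; not part of the Mbed TLS API. */
typedef struct {
    bool in_txn_list;   /* A is in the transaction list */
    bool file_exists;   /* key file A is in the internal storage */
    bool in_se;         /* key material is in the secure element */
} key_model_t;

/* Chosen storage invariant, restated for this model. */
static bool invariant_holds(const key_model_t *k)
{
    if (!k->file_exists && k->in_se) {
        return false;
    }
    if (k->file_exists && !k->in_txn_list && !k->in_se) {
        return false;
    }
    return true;
}

/* Steps 2 to 5 of key creation. `fail_step` (2 to 5) makes that step fail
 * before it changes anything; 0 means that no step fails. The invariant is
 * asserted after every state change, as if power could be lost there. */
static bool model_create_key(key_model_t *k, int fail_step)
{
    bool file_created = false;
    bool se_created = false;

    if (fail_step == 2) {
        goto cleanup;
    }
    k->in_txn_list = true;              /* step 2: add A to the list */
    assert(invariant_holds(k));

    if (fail_step == 3) {
        goto cleanup;
    }
    k->file_exists = true;              /* step 3: create key file A */
    file_created = true;
    assert(invariant_holds(k));

    if (fail_step == 4) {
        goto cleanup;
    }
    k->in_se = true;                    /* step 4: secure element creation */
    se_created = true;
    assert(invariant_holds(k));

    if (fail_step == 5) {
        goto cleanup;
    }
    k->in_txn_list = false;             /* step 5: remove A from the list */
    assert(invariant_holds(k));
    return true;

cleanup:
    if (se_created) {
        k->in_se = false;               /* secure element destroy entry point */
        assert(invariant_holds(k));
    }
    if (file_created) {
        k->file_exists = false;         /* remove the key file */
        assert(invariant_holds(k));
    }
    k->in_txn_list = false;             /* remove A from the list */
    assert(invariant_holds(k));
    return false;
}
```

Swapping steps 3 and 4 in this model makes the assertion after the secure element creation fail, which is the constraint that the invariant places on the order of operations.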
|
||||
|
||||
### Key destruction process in the core
|
||||
|
||||
Let _A_ be the application key identifier.
|
||||
|
||||
We assume that the key is loaded in a key slot in memory: the core needs to know the key's location in order to determine whether the key is in a stateful secure element, and if so to know the driver key identifier. A possible optimization would be to load only that information in local variables, without occupying a key slot; this has the advantage that key destruction works even if the key store is full.
|
||||
|
||||
1. Add _A_ [to the transaction list file](#transaction-list-file-manipulation).
|
||||
2. Call the secure element's `"destroy_key"` entry point.
|
||||
3. Remove the key file _A_ from the internal storage.
|
||||
4. Remove _A_ [from the transaction list file](#transaction-list-file-manipulation).
|
||||
5. Free the corresponding key slot in memory.
|
||||
|
||||
If any step fails, remember the error but continue the process, to destroy the resources associated with the key as much as is practical.
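
The "remember the error but continue" pattern can be sketched as follows. The types and names (`status_t`, `destroy_step_t`, `run_all_steps`, and the two example steps) are hypothetical, and reporting the first error rather than the last is an assumption of this sketch.

```c
#include <assert.h>
#include <stddef.h>

typedef int status_t;               /* stand-in for psa_status_t */
#define STATUS_OK 0

/* Hypothetical step type: each step returns STATUS_OK or an error code. */
typedef status_t (*destroy_step_t)(void *ctx);

/* Run every step even if an earlier one failed; report the first error. */
static status_t run_all_steps(const destroy_step_t *steps, size_t n, void *ctx)
{
    status_t overall = STATUS_OK;
    for (size_t i = 0; i < n; i++) {
        status_t status = steps[i](ctx);
        if (overall == STATUS_OK) {
            overall = status;       /* remember the first error only */
        }
    }
    return overall;
}

/* Example steps for illustration: each counts its call in *ctx. */
static status_t step_ok(void *ctx)
{
    ++*(int *) ctx;
    return STATUS_OK;
}

static status_t step_fail(void *ctx)
{
    ++*(int *) ctx;
    return -1;
}
```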
|
||||
|
||||
### Transaction recovery
|
||||
|
||||
For each key _A_ in the transaction list file, if the file _A_ exists in the internal storage:
|
||||
|
||||
1. Load the key into a key slot in memory (to get its location and the driver key identifier, although we could get the location from the transaction list).
|
||||
2. Call the secure element's `"destroy_key"` entry point.
|
||||
3. Remove the key file _A_ from the internal storage.
|
||||
4. Remove _A_ [from the transaction list file](#transaction-list-file-manipulation).
|
||||
5. Free the corresponding key slot in memory.
|
||||
|
||||
The transaction list file can be processed in any order.
|
||||
|
||||
It is correct to update the transaction list after recovering each key, or to only delete the transaction list file once the recovery is over.
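
As an illustration, the always-destroy recovery can be modelled on a per-key state record. The names `key_model_t`, `recover_key` and `recover_all` are hypothetical, and the model updates the transaction list after each key, which is one of the two options described above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-key model of storage and secure element state. */
typedef struct {
    bool in_txn_list;   /* A is in the transaction list file */
    bool file_exists;   /* key file A is in the internal storage */
    bool in_se;         /* key material is in the secure element */
} key_model_t;

/* Always-destroy recovery for one key. */
static void recover_key(key_model_t *k)
{
    if (!k->in_txn_list) {
        return;                 /* not part of an interrupted transaction */
    }
    if (k->file_exists) {
        k->in_se = false;       /* destroy_key; DOES_NOT_EXIST counts as success */
        k->file_exists = false; /* remove the key file */
    }
    k->in_txn_list = false;     /* remove the identifier from the list */
}

/* The list can be processed in any order; here, simply front to back. */
static void recover_all(key_model_t *keys, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        recover_key(&keys[i]);
    }
}
```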
|
||||
|
||||
### Concrete format of the transaction list file
|
||||
|
||||
The transaction list file contains a [fixed header](#transaction-list-header-format) followed by a list of [fixed-size elements](#transaction-list-element-format).
|
||||
|
||||
The file uid is `PSA_CRYPTO_ITS_TRANSACTION_LIST_UID` = 0xffffff53.
|
||||
|
||||
#### Transaction list header format
|
||||
|
||||
* Version (2 bytes): 0x0003. (Chosen to differ from the first two bytes of a [dynamic secure element transaction file](#dynamic-secure-element-transaction-file), to reduce the risk of a mix-up.)
|
||||
* Key name size (2 bytes): `sizeof(psa_storage_uid_t)`. Storing this size avoids reading bad data if Mbed TLS is upgraded to a different integration that names keys differently.
|
||||
|
||||
#### Transaction list element format
|
||||
|
||||
In practice, there will rarely be more than one active transaction at a time, so the size of an element is not critical for efficiency. Therefore, in addition to the required key identifier, we store some extra information in case it becomes useful later. We do not store the driver key identifier because its size is not constant.
|
||||
|
||||
* Key id: `sizeof(psa_storage_uid_t)` bytes.
|
||||
* Key lifetime: 4 bytes (`sizeof(psa_key_lifetime_t)`). Currently unused during recovery.
|
||||
* Operation type: 1 byte. Currently unused during recovery.
|
||||
* 0: destroy key.
|
||||
* 1: import key.
|
||||
* 2: generate key.
|
||||
* 3: derive key.
|
||||
* 4: import key.
|
||||
* Padding: 3 bytes. Reserved for future use. Currently unused during recovery.
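
The layout above can be illustrated with a small encoder. This is a sketch only: the function and macro names are hypothetical, `psa_storage_uid_t` is taken to be a 64-bit integer, and both the little-endian byte order and the zero-filled padding are assumptions that this document does not specify.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define TXN_LIST_VERSION  0x0003u
#define TXN_UID_SIZE      8u    /* sizeof(psa_storage_uid_t), assumed uint64_t */
#define TXN_HEADER_SIZE   4u    /* version + key name size */
#define TXN_ELEMENT_SIZE  (TXN_UID_SIZE + 4u + 1u + 3u)

/* Store the low `n` bytes of `value` in little-endian order (an assumption). */
static void put_le(uint8_t *out, uint64_t value, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        out[i] = (uint8_t) (value >> (8 * i));
    }
}

/* Header: version (2 bytes), then key name size (2 bytes). */
static void txn_encode_header(uint8_t out[TXN_HEADER_SIZE])
{
    put_le(out, TXN_LIST_VERSION, 2);
    put_le(out + 2, TXN_UID_SIZE, 2);
}

/* Element: key id, key lifetime (4 bytes), operation type (1 byte), padding. */
static void txn_encode_element(uint8_t out[TXN_ELEMENT_SIZE],
                               uint64_t key_id, uint32_t lifetime,
                               uint8_t operation)
{
    put_le(out, key_id, TXN_UID_SIZE);
    put_le(out + TXN_UID_SIZE, lifetime, 4);
    out[TXN_UID_SIZE + 4] = operation;
    memset(out + TXN_UID_SIZE + 5, 0, 3);   /* padding, zero-filled here */
}
```

With these assumptions each element occupies 16 bytes, so a list with a single active transaction is a 20-byte file.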
|
||||
|
||||
#### Dynamic secure element transaction file
|
||||
|
||||
Note that the code base already references a “transaction file” (`PSA_CRYPTO_ITS_TRANSACTION_UID` = 0xffffff54), used by dynamic secure elements (feature enabled with `MBEDTLS_PSA_CRYPTO_SE_C`). This is a deprecated feature that has not been fully implemented: when this feature is enabled, the transaction file gets written during transactions, but if it exists when PSA crypto starts, `psa_crypto_init()` fails because [recovery has never been implemented](https://github.com/ARMmbed/mbed-crypto/issues/218).
|
||||
|
||||
For the new kind of secure element driver, we pick a different file name to avoid any mix-up.
|
||||
|
||||
## Testing key management in secure elements
|
||||
|
||||
### Instrumentation for checking the storage invariant
|
||||
|
||||
#### Test hook locations
|
||||
|
||||
When `MBEDTLS_TEST_HOOKS` is enabled, each call to `psa_its_set()` or `psa_its_remove()` also calls a test hook, passing the file UID as an argument to the hook.
|
||||
|
||||
When a stateful secure element driver is present in the build, we use this hook to verify that the storage respects the [storage invariant](#chosen-storage-invariant). In addition, if there is information about an ongoing key operation (set explicitly by the test function as a global variable in the test framework), the hook tests that the content of the storage is compatible with that operation.
|
||||
|
||||
#### Test hook behavior
|
||||
|
||||
The storage invariant check cannot check all keys in storage, and does not need to (for example, it would be pointless to check anything about transparent keys). It checks the following keys:
|
||||
|
||||
* When invoked from the test hook on a key file: on that key.
|
||||
* When invoked from the test hook on the transaction file: on all the keys listed in the transaction file.
|
||||
* When invoked from a test secure element: on the specified key.
|
||||
|
||||
#### Test hook extra data
|
||||
|
||||
Some tests set global variables to indicate which persistent keys they manipulate. We instrument at least some of these tests to also indicate what operation is in progress on the key. See the GitHub issues or the source code for details.
|
||||
|
||||
### Testing of transaction recovery
|
||||
|
||||
When no secure element driver is present in the build, the presence of a transaction list file during initialization is an error.
|
||||
|
||||
#### Recovery testing process
|
||||
|
||||
When the stateful test secure element driver is present in the build, we run test cases on a representative selection of states of the internal storage and the test secure element. Each test case for transaction recovery has the following form:
|
||||
|
||||
1. Create the initial state:
|
||||
* Create a transaction list file with a certain content.
|
||||
* Create key files that we want to have in the test.
|
||||
* Call the secure element test driver to create keys without going through the PSA API.
|
||||
2. Call `psa_crypto_init()`. Expect success if the initial state satisfies the [storage invariant](#chosen-storage-invariant) and failure otherwise.
|
||||
3. On success, check that the expected keys exist, and that keys that are expected to have been destroyed by recovery do not exist.
|
||||
4. Clean up the storage and the secure element test driver's state.
|
||||
|
||||
#### States to test recovery on
|
||||
|
||||
For a given key located in a secure element, the following combinations of states are possible:
|
||||
|
||||
* Key file: present, absent.
|
||||
* Key in secure element: present, absent.
|
||||
* Key in the transaction file: no, creation (import), destruction.
|
||||
|
||||
We test all $2 \times 2 \times 3 = 12$ possibilities, each in its own test case. In each case, call the test function that checks the storage invariant and check that its result is as expected. Then, if the storage invariant is met, follow the [recovery testing process](#recovery-testing-process).
|
||||
|
||||
In addition, have at least one positive test case for each creation method other than import, to ensure that we don't reject a valid value.
|
||||
|
||||
Note: testing of a damaged filesystem (including a filesystem that doesn't meet the invariant) is out of scope of the present document.
|
Binary file not shown.
@@ -1,367 +0,0 @@
|
||||
# Thread-safety of the PSA subsystem
|
||||
|
||||
Currently, PSA Crypto API calls in Mbed TLS releases are not thread-safe.
|
||||
|
||||
As of Mbed TLS 3.6, an MVP for making the [PSA Crypto key management API](https://arm-software.github.io/psa-api/crypto/1.1/api/keys/management.html) and [`psa_crypto_init`](https://arm-software.github.io/psa-api/crypto/1.1/api/library/library.html#c.psa_crypto_init) thread-safe has been implemented. Implementations which only ever call PSA functions from a single thread are not affected by this new feature.
|
||||
|
||||
Summary of recent work:
|
||||
|
||||
- Key Store:
|
||||
- Slot states are described in the [Key slot states](#key-slot-states) section. They guarantee safe concurrent access to slot contents.
|
||||
- Key slots are protected by a global mutex, as described in [Key store consistency and abstraction function](#key-store-consistency-and-abstraction-function).
|
||||
- Key destruction strategy abiding by [Key destruction guarantees](#key-destruction-guarantees), with an implementation discussed in [Key destruction implementation](#key-destruction-implementation).
|
||||
- `global_data` variables in `psa_crypto.c` and `psa_crypto_slot_management.c` are now protected by mutexes, as described in the [Global data](#global-data) section.
|
||||
- The testing system has now been made thread-safe. Tests can now spin up multiple threads; see [Thread-safe testing](#thread-safe-testing) for details.
|
||||
- Some multithreaded testing of the key management API has been added; this is outlined in [Testing and analysis](#testing-and-analysis).
|
||||
- The solution uses the pre-existing `MBEDTLS_THREADING_C` threading abstraction.
|
||||
- The core makes no additional guarantees for drivers. See [Driver policy](#driver-policy) for details.
|
||||
|
||||
The other functions in the PSA Crypto API are planned to be made thread-safe in future, but currently we are not testing this.
|
||||
|
||||
## Overview of the document
|
||||
|
||||
* The [Guarantees](#guarantees) section describes the properties that are followed when PSA functions are invoked by multiple threads.
|
||||
* The [Usage guide](#usage-guide) section gives guidance on initializing, using and freeing PSA when using multiple threads.
|
||||
* The [Current strategy](#current-strategy) section describes how thread-safety of key management and `global_data` is achieved.
|
||||
* The [Testing and analysis](#testing-and-analysis) section discusses the state of our testing, as well as how this testing will be extended in future.
|
||||
* The [Future work](#future-work) section outlines our long-term goals for thread-safety; it also analyses how we might go about achieving these goals.
|
||||
|
||||
## Definitions
|
||||
|
||||
*Concurrent calls*
|
||||
|
||||
The PSA specification defines concurrent calls as: "In some environments, an application can make calls to the Crypto API in separate threads. In such an environment, concurrent calls are two or more calls to the API whose execution can overlap in time." (See PSA documentation [here](https://arm-software.github.io/psa-api/crypto/1.1/overview/conventions.html#concurrent-calls).)
|
||||
|
||||
*Thread-safety*
|
||||
|
||||
In general, a system is thread-safe if any valid set of concurrent calls is handled as if the effect and return code of every call is equivalent to some sequential ordering. We implement a weaker notion of thread-safety: we only guarantee thread-safety in the circumstances described in the [PSA Concurrent calling conventions](#psa-concurrent-calling-conventions) section.
|
||||
|
||||
## Guarantees
|
||||
|
||||
### Correctness out of the box
|
||||
|
||||
Building with `MBEDTLS_PSA_CRYPTO_C` and `MBEDTLS_THREADING_C` gives code which is correct; there are no race-conditions, deadlocks or livelocks when concurrently calling any set of PSA key management functions once `psa_crypto_init` has been called (see the [Initialization](#initialization) section for details on how to correctly initialize the PSA subsystem when using multiple threads).
|
||||
|
||||
We do not test or support calling other PSA API functions concurrently.
|
||||
|
||||
There is no busy-waiting in our implementation: every API call completes in a finite number of steps regardless of the locking policy of the underlying mutexes.
|
||||
|
||||
When only considering key management functions, Mbed TLS 3.6 abides by the minimum expectation for concurrent calls set by the PSA specification (see [PSA Concurrent calling conventions](#psa-concurrent-calling-conventions)).
|
||||
|
||||
#### PSA Concurrent calling conventions
|
||||
|
||||
These are the conventions which are planned to be added to the PSA 1.2 specification. Mbed TLS 3.6 abides by these when only considering [key management functions](https://arm-software.github.io/psa-api/crypto/1.1/api/keys/management.html):
|
||||
|
||||
> The result of two or more concurrent calls must be consistent with the same set of calls being executed sequentially in some order, provided that the calls obey the following constraints:
|
||||
>
|
||||
> * There is no overlap between an output parameter of one call and an input or output parameter of another call. Overlap between input parameters is permitted.
|
||||
>
|
||||
> * A call to `psa_destroy_key()` must not overlap with a concurrent call to any of the following functions:
|
||||
> - Any call where the same key identifier is a parameter to the call.
|
||||
> - Any call in a multi-part operation, where the same key identifier was used as a parameter to a previous step in the multi-part operation.
|
||||
>
|
||||
> * Concurrent calls must not use the same operation object.
|
||||
>
|
||||
> If any of these constraints are violated, the behaviour is undefined.
|
||||
>
|
||||
> The consistency requirement does not apply to errors that arise from resource failures or limitations. For example, errors resulting from resource exhaustion can arise in concurrent execution that do not arise in sequential execution.
|
||||
>
|
||||
> As an example of this rule: suppose two calls are executed concurrently which both attempt to create a new key with the same key identifier that is not already in the key store. Then:
|
||||
> * If one call returns `PSA_ERROR_ALREADY_EXISTS`, then the other call must succeed.
|
||||
> * If one of the calls succeeds, then the other must fail: either with `PSA_ERROR_ALREADY_EXISTS` or some other error status.
|
||||
> * Both calls can fail with error codes that are not `PSA_ERROR_ALREADY_EXISTS`.
|
||||
>
|
||||
> If the application concurrently modifies an input parameter while a function call is in progress, the behaviour is undefined.
|
||||
|
||||
### Backwards compatibility
|
||||
|
||||
Code which was working prior to Mbed TLS 3.6 will still work. Implementations which only ever call PSA functions from a single thread, or which protect all PSA calls using a mutex, are not affected by this new feature. If an application previously worked with a 3.X version, it will still work on version 3.6.
|
||||
|
||||
### Supported threading implementations
|
||||
|
||||
Currently, the only threading library with support shipped in the code base is pthread (enabled by `MBEDTLS_THREADING_PTHREAD`). The only concurrency primitives we use are mutexes; see [Condition variables](#condition-variables) for discussion about implementing new primitives in future major releases.
|
||||
|
||||
Users can add support to any platform which has mutexes using the Mbed TLS platform abstraction layer (see `include/mbedtls/threading.h` for details).
|
||||
|
||||
We intend to ship support for other platforms including Windows in future releases.
|
||||
|
||||
### Key destruction guarantees
|
||||
|
||||
Much like all other API calls, `psa_destroy_key` does not block indefinitely, and when `psa_destroy_key` returns:
|
||||
|
||||
1. The key identifier does not exist. This is a functional requirement for persistent keys: any thread can immediately create a new key with the same identifier.
|
||||
2. The resources from the key have been freed. This allows threads to create similar keys immediately after destruction, regardless of resources.
|
||||
|
||||
When `psa_destroy_key` is called on a key that is in use, guarantee 2 may be violated. This is consistent with the PSA specification requirements, as destruction of a key in use is undefined.
|
||||
|
||||
In future versions we aim to enforce stronger requirements for key destruction; see [Long term key destruction requirements](#long-term-key-destruction-requirements) for details.
|
||||
|
||||
### Driver policy
|
||||
|
||||
The core makes no additional guarantees for drivers. Driver entry points may be called concurrently from multiple threads. Threads can concurrently call entry points using the same key, and there is no protection against destroying a key which is in use.
|
||||
|
||||
### Random number generators
|
||||
|
||||
The PSA RNG can be accessed both from various PSA functions, and from application code via `mbedtls_psa_get_random`.
|
||||
|
||||
When using the built-in RNG implementations, i.e. when `MBEDTLS_PSA_CRYPTO_EXTERNAL_RNG` is disabled, querying the RNG is thread-safe. However, `mbedtls_psa_random_init` and `mbedtls_psa_random_seed` are only thread-safe when called while holding `mbedtls_threading_psa_rngdata_mutex`, and `mbedtls_psa_random_free` is not thread-safe.
|
||||
|
||||
When `MBEDTLS_PSA_CRYPTO_EXTERNAL_RNG` is enabled, it is down to the external implementation to ensure thread-safety, should threading be enabled.
|
||||
|
||||
## Usage guide
|
||||
|
||||
### Initialization
|
||||
|
||||
The PSA subsystem is initialized via a call to [`psa_crypto_init`](https://arm-software.github.io/psa-api/crypto/1.1/api/library/library.html#c.psa_crypto_init). This is a thread-safe function, and multiple calls to `psa_crypto_init` are explicitly allowed. It is valid to have multiple threads each calling `psa_crypto_init` followed by a call to any PSA key management function (if the init succeeds).
|
||||
|
||||
### General usage
|
||||
|
||||
Once initialized, threads can use any PSA function if there is no overlap between their calls. All threads share the same set of keys: as soon as one thread returns from creating or loading a key via a key management API call, the key can be used by any thread. If multiple threads attempt to load the same persistent key, with the same key identifier, only one thread can succeed; the others will return `PSA_ERROR_ALREADY_EXISTS`.
|
||||
|
||||
Applications may need careful handling of resource management errors. As explained in [PSA Concurrent calling conventions](#psa-concurrent-calling-conventions), operations in progress can have memory-related side effects. It is possible for a lack of resources to cause errors which do not arise in sequential execution. For example, multiple threads attempting to load the same persistent key can lead to some threads returning `PSA_ERROR_INSUFFICIENT_MEMORY` if the key is not currently in the key store, because a thread temporarily reserves a free key slot while trying to load a persistent key into the key store.
|
||||
|
||||
If a mutex operation fails, which only happens if the mutex implementation fails, the error code `PSA_ERROR_SERVICE_FAILURE` will be returned. If this code is returned, execution of the PSA subsystem must be stopped. All functions which have internal mutex locks and unlocks (except for when the lock/unlock occurs in a function that has no return value) will return with this error code in this situation.
|
||||
|
||||
### Freeing
|
||||
|
||||
There is no thread-safe way to free all PSA resources. This is because any such operation would need to wait for all other threads to complete their tasks before wiping resources.
|
||||
|
||||
`mbedtls_psa_crypto_free` must only be called by a single thread once all threads have completed their operations.
|
||||
|
||||
## Current strategy
|
||||
|
||||
This section describes how we have implemented thread-safety. It discusses the techniques used, the internal properties that enforce thread-safe access, how the system stays consistent, and our abstraction model.
|
||||
|
||||
### Protected resources
|
||||
|
||||
#### Global data
|
||||
|
||||
We have added a mutex `mbedtls_threading_psa_globaldata_mutex` defined in `include/mbedtls/threading.h`, which is used to make `psa_crypto_init` thread-safe.
|
||||
|
||||
There are two `psa_global_data_t` structs, each with a single instance `global_data`:
|
||||
|
||||
* The struct in `library/psa_crypto.c` is protected by `mbedtls_threading_psa_globaldata_mutex`. The RNG fields within this struct are not protected by this mutex, and are not always thread-safe (see [Random number generators](#random-number-generators)).
|
||||
* The struct in `library/psa_crypto_slot_management.c` has two fields: `key_slots` is protected as described in [Key slots](#key-slots), `key_slots_initialized` is protected by the global data mutex.
|
||||
|
||||
#### Mutex usage
|
||||
|
||||
A deadlock would occur if a thread attempts to lock a mutex while already holding it. Functions which need to be called while holding the global mutex have documentation to say this.
|
||||
|
||||
To avoid performance degradation, functions must hold mutexes for as short a time as possible. In particular, they must not start expensive operations (e.g. doing cryptography) while holding the mutex.
|
||||
|
||||
#### Key slots
|
||||
|
||||
|
||||
Keys are stored internally in a global array of key slots known as the "key store", defined in `library/psa_crypto_slot_management.c`.
|
||||
|
||||
##### Key slot states
|
||||
|
||||
Each key slot has a state variable and a `registered_readers` counter. These two variables dictate whether an operation can access a slot, and in what way the slot can be used.
|
||||
|
||||
There are four possible states for a key slot:
|
||||
|
||||
* `PSA_SLOT_EMPTY`: no thread is currently accessing the slot, and no information is stored in the slot. Any thread is able to change the slot's state to `PSA_SLOT_FILLING` and begin to load data into the slot.
|
||||
* `PSA_SLOT_FILLING`: one thread is currently loading or creating material to fill the slot; this thread is responsible for the next state transition. Other threads cannot read the contents of a slot which is in this state.
|
||||
* `PSA_SLOT_FULL`: the slot contains a key, and any thread is able to use the key after registering as a reader, increasing `registered_readers` by 1.
|
||||
* `PSA_SLOT_PENDING_DELETION`: the key within the slot has been destroyed or marked for destruction, but at least one thread is still registered as a reader (`registered_readers > 0`). No thread can register to read this slot. The slot must not be wiped until the last reader unregisters. It is during the last unregister that the contents of the slot are wiped, and the slot's state is set to `PSA_SLOT_EMPTY`.
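
The four states and the reader counter can be modelled as a small state machine. This is a single-threaded sketch, not the library code: all names prefixed with `model_` or `SLOT_` are hypothetical, locking is omitted, and the caller that marks a slot for deletion is assumed to hold a reader registration itself.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef enum {
    SLOT_EMPTY,
    SLOT_FILLING,
    SLOT_FULL,
    SLOT_PENDING_DELETION,
} slot_state_t;

typedef struct {
    slot_state_t state;
    size_t registered_readers;
} slot_model_t;

/* EMPTY -> FILLING: only one caller can claim an empty slot. */
static bool model_begin_fill(slot_model_t *slot)
{
    if (slot->state != SLOT_EMPTY) {
        return false;
    }
    slot->state = SLOT_FILLING;
    return true;
}

/* FILLING -> FULL: done by the caller that claimed the slot. */
static void model_finish_fill(slot_model_t *slot)
{
    assert(slot->state == SLOT_FILLING);
    slot->state = SLOT_FULL;
}

/* A reader may register only on a FULL slot. */
static bool model_register_read(slot_model_t *slot)
{
    if (slot->state != SLOT_FULL) {
        return false;
    }
    slot->registered_readers++;
    return true;
}

/* FULL -> PENDING_DELETION: no new reader can register afterwards. */
static void model_mark_for_deletion(slot_model_t *slot)
{
    assert(slot->state == SLOT_FULL && slot->registered_readers > 0);
    slot->state = SLOT_PENDING_DELETION;
}

/* The last reader to unregister from a PENDING_DELETION slot wipes it. */
static void model_unregister_read(slot_model_t *slot)
{
    assert(slot->registered_readers > 0);
    slot->registered_readers--;
    if (slot->state == SLOT_PENDING_DELETION && slot->registered_readers == 0) {
        slot->state = SLOT_EMPTY;
    }
}
```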
|
||||
|
||||
###### Key slot state transition diagram
|
||||

|
||||
|
||||
In the state transition diagram above, an arrow between two states `q1` and `q2` with label `f` indicates that if the state of a slot is `q1` immediately before `f`'s linearization point, it may be `q2` immediately after `f`'s linearization point. Internal functions have italicized labels. The `PSA_SLOT_PENDING_DELETION -> PSA_SLOT_EMPTY` transition can be done by any function which calls `psa_unregister_read`.
|
||||
|
||||
The state transition diagram can be generated in https://app.diagrams.net/ via this [url](https://viewer.diagrams.net/?tags=%7B%7D&highlight=0000ff&edit=_blank&layers=1&nav=1#R3Vxbd5s4EP4t%2B%2BDH5CBxf6zrJJvW7aYn7W7dFx9qZFstBg7gW379CnMxkoUtY%2BGQ%2BiVISCPQjD59mhnSU98vNg%2BRE84%2FBS7yelBxNz110IMQAEsnf9KabVZjmHnFLMJu3mhf8YxfUF6p5LVL7KKYapgEgZfgkK6cBL6PJglV50RRsKabTQOPHjV0Zuig4nnieIe1%2F2E3mWe1FjT39X8jPJsXIwPDzu4snKJx%2Fibx3HGDdaVKveup76MgSLKrxeY98tLJK%2BYl63dfc7d8sAj5iUiHH%2BBlOP338cP6i%2B37%2Ff7oV%2Fjr442aSVk53jJ%2F4R40PCKv7%2BIVuZyll%2FffhsOimsiv3OE0njvxOEKOi6K4uPszYtuzUnbzk2yLSScPTvRLCv31HCfoOXQm6Z01MbF0hGThkRIgl04cZkqf4g1yS1HVScnnaYWiBG0qVfkkPaBggZJoS5rkdzUrV1hhsUpeXlf0n1fNK6ov6pzc4mal5L1SyEWulzN0BABHSeyM%2Be671NpJaeI5cYwn9ERFwdJ30xkaKKREJifafs9v7QqjamGwqbYbbIvSBidlJ3I9qtTvu6SFoketNuJgGU3QabtMnGiGkiPttKwdcqlVfKjbiu50ju6Kugh5ToJX9NrnKTQf4SnA5M1qTUc3GJvI3jvvVV2rrCDTvrUrP4sSq6mM2GyaDsTurK2chAsMENaiBC7WcBg746UfoRmOExTtEKCy2HH9UieaGzo%2Fya5BL2wPz%2FzUmInloIhUpOsXE1h%2Bl99YYNdNZfQjFOMX5%2BdOXmpzYToLu3nR%2Bz19wLXC48uMRYpyc8lHofCbhyDKLVRMm1LZDbzMwAoxgOkSTKcxakfpIjvD3aenr6O3CfOdQ3lbOsrneK1U8BocxetyXygLo2qhZl9ojvJQEOVBt1CetpwDNBYG%2BRObRcuoXvDSU6g%2BdbA3%2Fo224wkB9QQH%2FlvD9WJhdRHXc8mQEsr2bw%2FkDzf2%2B8fh8PHzQ6exWjVeGas1kb3xrFPTX3%2FcsenVlaSLKOnp7vNgZ%2B6CehrcDe%2B%2BPv7z%2BW3qqHOkx2yL84ifUZudhZtznsKJdYrzwE5xHqiQzc%2FSoAnI2VTTDXoX1DXj1gS6CS1TJwWVES9KiIDBMCvtuozIEkEMLkciZAVFKzSeRgjtuFLsBQmfJwkCDXeYmExAwuViXBw6OWpnOVuBC12kbKUY7VosDfD4hnyYvNWbHA6zXq96POyWEzCFSkUpoNIgqEaDGkhdewVWqpZiNgNLTWHAkti6yphk237B5oA5xT6O5wLHyjcGXOVSvRi5bogVabZJQ5cqx0ItrtQrABmPkzO6nCzJRuqWFOx6YQ1xN1lzRBMNa6idQjStiNmWMdyGHi%2FdYASxB4sawCI24GwrzfLlWf%2FANo2NpqIcfy7ItAcn2mvWMfnkInvipotn0NcmAD9MQu8FLR%2Fxs%2F7uaSN2nq1hpyejMpew0pqwTzNKKjYkMZKx47tjL5j8Lvn2%2BPtFA6VyJ14Q7wj8Wb3CJbHaaq%2BDwf8wel7iuIxdDqgWvZou5Oe5ZJr0Q%2F1ae5zKS6mQQtarG5SgT6PCztuN5GiCG1u3IjnQhJSV6HrDjQ3UOdauxMRV3gmRi1UuipMo2F6OcXLwtLMQVy5jCS4IzTLoM2CxDC403xuaTdktQByXicj32nKJ%2Bym0Oh8X28e3bnltVYbX6k1D1arJOBsEibssi6t3NDR1w3YBeI4uLinUymYc9ZJwBxRujjY9
CNzZuUqSjLAnlIarFj2hon4DvdPwY4Cm8MOkyhjtJUByra547orZHXCpzgKKtPSXFFCKrpKJDO3mbCP9ha%2FXK2VWn4aGJjDUHE50QTjp2Gmtxkt3NpxAhs0Y7WXe8c0O1tKZhr42eZ61NQ4PqdPbdV8dX%2FYywsvlF05yIRGorwSJPKrNaFJ6iKaxX6oryMTEGxoHSFTNvIWWpWtQszUbqpbKyqVCy1AIts6NnpC3qY4CbPohTEW9NaFS%2FtTjbwTso8IAOEeY3vzJ2gnKcLP23%2FKnMcdBQQJgKrpFc0hJFLKNbJwnvNwMp3BsWbMvqx%2F3Hye%2BH3I%2FjJHDGanEmkZf47XGGEWzFruViqMyOTI667YSxmX9hCNNHmPk2pwQYUxxBi%2FCIEsRPMtPP0M%2BipykgYM%2FCM%2BPJaT00kURXu3yfsbBMgmX1DOfn1X9GlB5FB0kIKWuAe65%2BGLvHSX0almMsLMJDCeyCeScfv6wT%2FdEAyKimUz7YFkRebtSbpNNu7IPcs6F8zEZQaIh4L0gqUvww0j7vh7F%2FW9ujL7iR%2FfmYWy1QF0KOy2JxzmWSicnvP4nF93KumPJi9n4UMmQFxOKWea550bW3W9qcrPiuCZdz4yaJ4x1gVwcXb8SyAWwDTlsQmUijIxPogmYkeL%2B3%2BJkzff%2FXEi9%2Bx8%3D).
|
||||
##### Key slot access primitives
|
||||
|
||||
The state of a key slot is updated via the internal function `psa_key_slot_state_transition`. To change the state of `slot` from `expected_state` to `new_state`, when `new_state` is not `PSA_SLOT_EMPTY`, one must call `psa_key_slot_state_transition(slot, expected_state, new_state)`; if the state was not `expected_state` then `PSA_ERROR_CORRUPTION_DETECTED` is returned. The sole reason for having an expected state parameter here is to help guarantee that our functions work as expected; this error code cannot occur without an internal coding error.
|
||||
|
||||
Changing a slot's state to `PSA_SLOT_EMPTY` is done via `psa_wipe_key_slot`; this function wipes the entirety of the key slot.
|
||||
|
||||
The reader count of a slot is incremented via `psa_register_read`, and decremented via `psa_unregister_read`. Library functions register to read a slot via the `psa_get_and_lock_key_slot_X` functions, read from the slot, then call `psa_unregister_read` to make known that they have finished reading the slot's contents.
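
The primitives above can be modelled in a few lines of C. The following is a minimal sketch and not the library's implementation: all `sketch_` names are hypothetical, the slot structure is reduced to three fields, and the error returned when registering on a slot that is not full is an assumption. Every function assumes that the caller already holds the key slot mutex.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified model; the real slot structure has more fields. */
typedef enum {
    SKETCH_SLOT_EMPTY = 0,
    SKETCH_SLOT_FILLING,
    SKETCH_SLOT_FULL,
    SKETCH_SLOT_PENDING_DELETION
} sketch_slot_state_t;

typedef enum {
    SKETCH_SUCCESS = 0,
    SKETCH_ERROR_CORRUPTION_DETECTED
} sketch_status_t;

typedef struct {
    sketch_slot_state_t state;
    size_t registered_readers;
    unsigned char key[16];
} sketch_slot_t;

/* Every function below assumes the caller holds the key slot mutex. */

static sketch_status_t sketch_state_transition(sketch_slot_t *slot,
                                               sketch_slot_state_t expected,
                                               sketch_slot_state_t new_state)
{
    assert(new_state != SKETCH_SLOT_EMPTY); /* emptying goes through a wipe */
    if (slot->state != expected) {
        return SKETCH_ERROR_CORRUPTION_DETECTED;
    }
    slot->state = new_state;
    return SKETCH_SUCCESS;
}

static void sketch_wipe_slot(sketch_slot_t *slot)
{
    /* Zeroing the whole slot also resets the state to SKETCH_SLOT_EMPTY. */
    memset(slot, 0, sizeof(*slot));
}

static sketch_status_t sketch_register_read(sketch_slot_t *slot)
{
    /* Only a full slot accepts new readers (error code is an assumption). */
    if (slot->state != SKETCH_SLOT_FULL) {
        return SKETCH_ERROR_CORRUPTION_DETECTED;
    }
    slot->registered_readers++;
    return SKETCH_SUCCESS;
}

static sketch_status_t sketch_unregister_read(sketch_slot_t *slot)
{
    if (slot->registered_readers == 0) {
        return SKETCH_ERROR_CORRUPTION_DETECTED;
    }
    if (slot->state == SKETCH_SLOT_PENDING_DELETION &&
        slot->registered_readers == 1) {
        sketch_wipe_slot(slot); /* the last reader out wipes the slot */
        return SKETCH_SUCCESS;
    }
    slot->registered_readers--;
    return SKETCH_SUCCESS;
}
```

In this model a slot walks through `EMPTY -> FILLING -> FULL -> PENDING_DELETION -> EMPTY`, and the final step happens inside the last `sketch_unregister_read` call.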
|
||||
|
||||
##### Key store consistency and abstraction function
|
||||
|
||||
The key store is protected by a single global mutex `mbedtls_threading_key_slot_mutex`.
|
||||
|
||||
We maintain the consistency of the key store by ensuring that all reads and writes to `slot->state` and `slot->registered_readers` are performed under `mbedtls_threading_key_slot_mutex`. All the access primitives described above must be called while the mutex is held; there is a convenience function `psa_unregister_read_under_mutex` which wraps a call to `psa_unregister_read` in a mutex lock/unlock pair.
|
||||
|
||||
A thread can only traverse the key store while holding `mbedtls_threading_key_slot_mutex`. The set of keys within the key store which the thread holding the mutex can access is equivalent to the set:
|
||||
|
||||
{mbedtls_svc_key_id_t k : (\exists slot := &global_data.key_slots[i]) [
|
||||
(slot->state == PSA_SLOT_FULL) &&
|
||||
(slot->attr.id == k)]}
|
||||
|
||||
The union of this set and the set of persistent keys not currently loaded into slots is our abstraction function for the key store. Any key not in this union does not currently exist as far as the code is concerned (even if the key is in a slot which has a `PSA_SLOT_FILLING` or `PSA_SLOT_PENDING_DELETION` state). Attempting to start using any key which is not a member of the union will result in a `PSA_ERROR_INVALID_HANDLE` error code.
|
||||
|
||||
##### Locking and unlocking the mutex
|
||||
|
||||
If a lock or unlock operation fails and this is the first failure within a function, the function will return `PSA_ERROR_SERVICE_FAILURE`. If a lock or unlock operation fails after a different failure has been identified, the status code is not overwritten.
|
||||
|
||||
We have defined a set of macros in `library/psa_crypto_core.h` to capture the common pattern of (un)locking the mutex and returning or jumping to an exit label upon failure.
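
As an illustration of that pattern, here is a minimal sketch of a "check and jump to exit" macro. The macro name, the status values and the toy mutex with failure injection are all hypothetical and exist only for this example; the actual macros are the ones in `library/psa_crypto_core.h`. The sketch records a service failure only when no earlier error has been recorded, as described above.

```c
#include <assert.h>

typedef enum {
    SKETCH_SUCCESS = 0,
    SKETCH_ERROR_DOES_NOT_EXIST,
    SKETCH_ERROR_SERVICE_FAILURE
} sketch_status_t;

/* Toy mutex whose unlock can be forced to fail, for illustration only. */
typedef struct {
    int held;
    int fail_next_unlock;
} sketch_mutex_t;

static int sketch_mutex_lock(sketch_mutex_t *m)
{
    assert(!m->held);
    m->held = 1;
    return 0;
}

static int sketch_mutex_unlock(sketch_mutex_t *m)
{
    assert(m->held);
    m->held = 0;
    if (m->fail_next_unlock) {
        m->fail_next_unlock = 0;
        return 1; /* simulated failure */
    }
    return 0;
}

/* Run a lock or unlock call. On failure, record a service failure unless an
 * earlier error is already stored in `status`, then jump to the exit label. */
#define SKETCH_THREADING_CHK_GOTO_EXIT(status, call)         \
    do {                                                     \
        if ((call) != 0) {                                   \
            if ((status) == SKETCH_SUCCESS) {                \
                (status) = SKETCH_ERROR_SERVICE_FAILURE;     \
            }                                                \
            goto exit;                                       \
        }                                                    \
    } while (0)

static sketch_status_t sketch_lookup(sketch_mutex_t *m, int key_exists)
{
    sketch_status_t status = SKETCH_SUCCESS;

    SKETCH_THREADING_CHK_GOTO_EXIT(status, sketch_mutex_lock(m));
    if (!key_exists) {
        status = SKETCH_ERROR_DOES_NOT_EXIST;
    }
    /* An unlock failure here does not overwrite an earlier error. */
    SKETCH_THREADING_CHK_GOTO_EXIT(status, sketch_mutex_unlock(m));

exit:
    return status;
}
```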
|
||||
|
||||
##### Key creation and loading
|
||||
|
||||
To load a new key into a slot, the following internal utility functions are used:
|
||||
|
||||
* `psa_reserve_free_key_slot` - This function, which must be called under `mbedtls_threading_key_slot_mutex`, iterates through the key store to find a slot whose state is `PSA_SLOT_EMPTY`. If found, it reserves the slot by setting its state to `PSA_SLOT_FILLING`. If not found, it checks whether any loaded persistent keys have no readers; if so, it kicks one such key out of the key store.
|
||||
* `psa_start_key_creation` - This function wraps around `psa_reserve_free_key_slot`. If a slot has been found, the slot id is set. This second step is not done under the mutex; at this point the calling thread has exclusive access to the slot.
|
||||
* `psa_finish_key_creation` - After the contents of the key have been loaded (again this loading is not done under the mutex), the thread calls `psa_finish_key_creation`. This function takes the mutex, checks that the key does not exist in the key store (this check cannot be done before this stage), sets the slot's state to `PSA_SLOT_FULL` and releases the mutex. Upon success, any thread is immediately able to use the new key.
|
||||
* `psa_fail_key_creation` - If there is a failure at any point in the key creation stage, this clean-up function takes the mutex, wipes the slot, and releases the mutex. Immediately after this unlock, any thread can start to use the slot for another key load.
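
The four steps above can be sketched on a toy key store. This is a simplified model under stated assumptions: the `sketch_` names, the two-slot store and the integer return codes are hypothetical, and the places where the real code takes `mbedtls_threading_key_slot_mutex` are only marked by comments.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SKETCH_SLOT_COUNT 2

typedef enum {
    SKETCH_SLOT_EMPTY = 0,
    SKETCH_SLOT_FILLING,
    SKETCH_SLOT_FULL
} sketch_slot_state_t;

typedef struct {
    sketch_slot_state_t state;
    size_t registered_readers;
    unsigned id;
    int persistent;
} sketch_slot_t;

static sketch_slot_t sketch_key_slots[SKETCH_SLOT_COUNT];

/* Caller must hold the key slot mutex. Returns NULL if no slot is usable. */
static sketch_slot_t *sketch_reserve_free_slot(void)
{
    sketch_slot_t *chosen = NULL;
    size_t i;

    for (i = 0; i < SKETCH_SLOT_COUNT && chosen == NULL; i++) {
        if (sketch_key_slots[i].state == SKETCH_SLOT_EMPTY) {
            chosen = &sketch_key_slots[i];
        }
    }
    /* No empty slot: evict a persistent key that has no readers. Its copy in
     * storage is untouched, so it can be re-loaded later. */
    for (i = 0; i < SKETCH_SLOT_COUNT && chosen == NULL; i++) {
        sketch_slot_t *s = &sketch_key_slots[i];
        if (s->state == SKETCH_SLOT_FULL && s->persistent &&
            s->registered_readers == 0) {
            memset(s, 0, sizeof(*s));
            chosen = s;
        }
    }
    if (chosen != NULL) {
        chosen->state = SKETCH_SLOT_FILLING;
    }
    return chosen;
}

static sketch_slot_t *sketch_start_key_creation(unsigned id, int persistent)
{
    sketch_slot_t *slot;
    /* (lock mutex) */
    slot = sketch_reserve_free_slot();
    /* (unlock mutex) */
    if (slot != NULL) {
        /* The slot is FILLING, so this thread has exclusive access to it. */
        slot->id = id;
        slot->persistent = persistent;
    }
    return slot;
}

/* Returns 0 on success, -1 if the identifier is already in the key store. */
static int sketch_finish_key_creation(sketch_slot_t *slot)
{
    size_t i;
    assert(slot != NULL && slot->state == SKETCH_SLOT_FILLING);
    /* (lock mutex) */
    for (i = 0; i < SKETCH_SLOT_COUNT; i++) {
        const sketch_slot_t *other = &sketch_key_slots[i];
        if (other != slot && other->state == SKETCH_SLOT_FULL &&
            other->id == slot->id) {
            /* (unlock mutex) */
            return -1;
        }
    }
    slot->state = SKETCH_SLOT_FULL; /* the key becomes visible to others */
    /* (unlock mutex) */
    return 0;
}

static void sketch_fail_key_creation(sketch_slot_t *slot)
{
    /* (lock mutex) */
    memset(slot, 0, sizeof(*slot)); /* back to SKETCH_SLOT_EMPTY */
    /* (unlock mutex) */
}
```

Note that the duplicate check can only be made in `sketch_finish_key_creation`, because another thread may publish a key with the same identifier at any time before that point.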
|
||||
|
||||
##### Re-loading persistent keys
|
||||
|
||||
As described above, persistent keys can be kicked out of the key slot array provided they are not currently being used (`registered_readers == 0`). When attempting to use a persistent key that has been kicked out of a slot, the call to `psa_get_and_lock_key_slot` will see that the key is not in a slot, call `psa_reserve_free_key_slot` and load the key back into the reserved slot. This entire sequence is done during a single mutex lock, which is necessary for thread-safety (see documentation of `psa_get_and_lock_key_slot`).
|
||||
|
||||
If `psa_reserve_free_key_slot` cannot find a suitable slot, the key cannot be loaded back in. This will lead to a `PSA_ERROR_INSUFFICIENT_MEMORY` error.
|
||||
|
||||
##### Using existing keys
|
||||
|
||||
One-shot operations follow a standard pattern when using an existing key:
|
||||
|
||||
* They call one of the `psa_get_and_lock_key_slot_X` functions, which then finds the key and registers the thread as a reader.
|
||||
* They operate on the key slot, usually copying the key into a separate buffer to be used by the operation. This step is not performed under the key slot mutex.
|
||||
* Once finished, they call `psa_unregister_read_under_mutex`.
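
The one-shot pattern above can be sketched as follows. This is an illustrative model only: the `sketch_` names are hypothetical, the "cryptographic operation" is a toy XOR, and the global mutex is replaced by a flag so that the example can assert that the expensive step runs without the mutex held.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SKETCH_KEY_SIZE 4

typedef struct {
    int full;                  /* nonzero when the slot holds a usable key */
    size_t registered_readers;
    unsigned char key[SKETCH_KEY_SIZE];
} sketch_slot_t;

static int sketch_mutex_held;  /* stand-in for the global key slot mutex */

static void sketch_lock(void)
{
    assert(!sketch_mutex_held);
    sketch_mutex_held = 1;
}

static void sketch_unlock(void)
{
    assert(sketch_mutex_held);
    sketch_mutex_held = 0;
}

/* Find the key and register as a reader, all under the mutex. */
static int sketch_get_and_lock_slot(sketch_slot_t *slot)
{
    int ret = -1;
    sketch_lock();
    if (slot->full) {
        slot->registered_readers++;
        ret = 0;
    }
    sketch_unlock();
    return ret;
}

static void sketch_unregister_read_under_mutex(sketch_slot_t *slot)
{
    sketch_lock();
    assert(slot->registered_readers > 0);
    slot->registered_readers--;
    sketch_unlock();
}

/* Toy one-shot operation: XOR the input with the key. */
static int sketch_one_shot(sketch_slot_t *slot,
                           const unsigned char *input, size_t length,
                           unsigned char *output)
{
    unsigned char local_key[SKETCH_KEY_SIZE];
    size_t i;

    if (sketch_get_and_lock_slot(slot) != 0) {
        return -1;
    }
    /* We are a registered reader, so the slot cannot be wiped under us. */
    memcpy(local_key, slot->key, sizeof(local_key));

    assert(!sketch_mutex_held); /* the expensive part runs without the mutex */
    for (i = 0; i < length; i++) {
        output[i] = (unsigned char) (input[i] ^ local_key[i % SKETCH_KEY_SIZE]);
    }

    sketch_unregister_read_under_mutex(slot);
    memset(local_key, 0, sizeof(local_key));
    return 0;
}
```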
|
||||
|
||||
Multi-part and restartable operations each have a "setup" function where the key is passed in; these functions follow the above pattern. The key is copied into the `operation` object, and the thread unregisters from reading the key (the operations do not access the key slots again). The copy of the key will not be destroyed during a call to `psa_destroy_key`; the thread running the operation is responsible for deleting its copy in the clean-up. This may need to change to enforce the long term key requirements ([Long term key destruction requirements](#long-term-key-destruction-requirements)).
|
||||
|
||||
##### Key destruction implementation
|
||||
|
||||
The locking strategy here is explained in `library/psa_crypto.c`. The destroying thread (the thread calling `psa_destroy_key`) does not always wipe the key slot. The destroying thread registers to read the key, sets the slot's state to `PSA_SLOT_PENDING_DELETION`, wipes the slot from memory if the key is persistent, and then unregisters from reading the slot.
|
||||
|
||||
`psa_unregister_read` internally calls `psa_wipe_key_slot` if and only if the slot's state is `PSA_SLOT_PENDING_DELETION` and the slot's registered reader counter is equal to 1. This implements a "last one out closes the door" approach. The final thread to unregister from reading a destroyed key will automatically wipe the contents of the slot; no readers remain to reference the slot post deletion, so there cannot be corruption.
|
||||
|
||||
### Linearizability of the system
|
||||
|
||||
To satisfy the requirements in [Correctness out of the box](#correctness-out-of-the-box), we require our functions to be "linearizable" (under certain constraints). This means that any (constraint-satisfying) set of concurrent calls is performed as if the calls were executed in some sequential order.
|
||||
|
||||
The standard way of reasoning that this is the case is to identify a "linearization point" for each call; this is a single execution step where the function takes effect (usually a step in which the effects of the call become visible to other threads). If every call has a linearization point, the set of calls is equivalent to sequentially performing the calls in order of when their linearization points occurred.
|
||||
|
||||
We only require linearizability to hold in the case where a resource-management error is not returned. In a set of concurrent calls, it is permitted for a call c to fail with a `PSA_ERROR_INSUFFICIENT_MEMORY` return code even if there does not exist a sequential ordering of the calls in which c returns this error. Even if such an error occurs, all calls are still required to be functionally correct.
|
||||
|
||||
To help justify that our system is linearizable, here are the linearization points (actual or planned) of each PSA call:
|
||||
|
||||
* Key creation functions (including `psa_copy_key`) - The linearization point for a successful call is the mutex unlock within `psa_finish_key_creation`; it is at this point that the key becomes visible to other threads. The linearization point for a failed call is the closest mutex unlock after the failure is first identified.
|
||||
* `psa_destroy_key` - The linearization point for a successful destruction is the mutex unlock; the slot is now in the state `PSA_SLOT_PENDING_DELETION`, meaning that the key has been destroyed. For failures, the linearization point is the same.
|
||||
* `psa_purge_key`, `psa_close_key` - The linearization point is the mutex unlock after wiping the slot for a success, or unregistering for a failure.
|
||||
* One-shot operations - The linearization point is the final unlock of the mutex within `psa_get_and_lock_key_slot`, as that is the point at which it is decided whether or not the key exists.
|
||||
* Multi-part operations - The linearization point of the key input function is the final unlock of the mutex within `psa_get_and_lock_key_slot`. All other steps have no side effects other than resource-related ones (except for key derivation, covered in the key creation functions).
|
||||
|
||||
Please note that one-shot operations and multi-part operations are not yet considered thread-safe, as we have not yet tested whether they rely on unprotected global resources. The key slot access in these operations is thread-safe.
|
||||
|
||||
## Testing and analysis
|
||||
|
||||
### Thread-safe testing
|
||||
|
||||
It is now possible for individual tests to spin up multiple threads. This work has made the global variables used in tests thread-safe. If multiple threads fail a test assert, the first failure will be reported with correct line numbers.
|
||||
|
||||
Although the `step` feature used in some tests is thread-safe, it may produce unexpected results for multi-threaded tests. `mbedtls_test_set_step` or `mbedtls_test_increment_step` calls within threads can happen in any order, thus may not produce the desired result when precise ordering is required.
|
||||
|
||||
### Current state of testing
|
||||
|
||||
Our testing is a work in progress. It is not feasible to run our traditional, single-threaded tests in a way that tests concurrency. We need to write new test suites for concurrency testing.
|
||||
|
||||
Our tests currently only run on pthread; we hope to expand this in the future (our API already allows this).
|
||||
|
||||
We run tests using [ThreadSanitizer](https://clang.llvm.org/docs/ThreadSanitizer.html) to detect data races. We test the key store, and test that our key slot state system is enforced. We also test the thread-safety of `psa_crypto_init`.
|
||||
|
||||
Currently, not every API call is tested; we also cannot feasibly test every combination of concurrent API calls. API calls can in general be split into a few categories, each category calling the same internal key management functions in the same order. It is the internal functions that are in charge of locking mutexes and interacting with the key store; we test the thread-safety of these functions.
|
||||
|
||||
Since we do not run every cryptographic operation concurrently, we do not test that operations are free of unexpected global variables.
|
||||
|
||||
### Expanding testing
|
||||
|
||||
Through future work on testing, it would be good to:
|
||||
|
||||
* For every API call, have a test which runs multiple copies of the call simultaneously.
|
||||
* After implementing other threading platforms, expand the tests to these platforms.
|
||||
* Have increased testing for kicking persistent keys out of slots.
|
||||
* Explicitly test that all global variables are protected; for this we would need to cover every operation in a concurrent scenario while running ThreadSanitizer.
|
||||
* Run tests on more threading implementations, once these implementations are supported.
|
||||
|
||||
### Performance
|
||||
|
||||
Key loading does somewhat run in parallel: deriving the key and copying it into the slot are not done under any mutex.
|
||||
|
||||
Key destruction is entirely sequential. This is required for persistent keys to stop issues with re-loading keys, which cannot otherwise be avoided without changing our approach to thread-safety.
|
||||
|
||||
|
||||
## Future work
|
||||
|
||||
### Long term requirements
|
||||
|
||||
As explained previously, we eventually aim to make the entirety of the PSA API thread-safe. This will build on the work that we have already completed. This requires a full suite of testing, see [Expanding testing](#expanding-testing) for details.
|
||||
|
||||
### Long term performance requirements
|
||||
|
||||
Our plan for cryptographic operations is that they are not performed under any global mutex. One-shot operations and multi-part operations will each only hold the global mutex for finding the relevant key in the key slot, and unregistering as a reader after the operation, using their own operation-specific mutexes to guard any shared data that they use.
|
||||
|
||||
We aim to eventually replace some/all of the mutexes with RWLocks, if possible.
|
||||
|
||||
### Long term key destruction requirements
|
||||
|
||||
The [PSA Crypto Key destruction specification](https://arm-software.github.io/psa-api/crypto/1.1/api/keys/management.html#key-destruction) mandates that implementations make a best effort to ensure that the key material cannot be recovered. In the long term, it would be good to guarantee that `psa_destroy_key` wipes all copies of the key material.
|
||||
|
||||
Here are our long term key destruction goals:
|
||||
|
||||
`psa_destroy_key` does not block indefinitely, and when `psa_destroy_key` returns:
|
||||
|
||||
1. The key identifier does not exist. This is a functional requirement for persistent keys: any thread can immediately create a new key with the same identifier.
|
||||
2. The resources from the key have been freed. This allows threads to create similar keys immediately after destruction, regardless of resources.
|
||||
3. No copy of the key material exists. Rationale: this is a security requirement. We do not have this requirement yet, but we need to document this as a security weakness, and we would like to satisfy this security requirement in the future.
|
||||
|
||||
#### Condition variables
|
||||
|
||||
It would be ideal to add these to a future major version; we cannot add these as requirements to the default `MBEDTLS_THREADING_C` for backwards compatibility reasons.
|
||||
|
||||
Condition variables would enable us to fulfil the final requirement in [Long term key destruction requirements](#long-term-key-destruction-requirements). Destruction would then work as follows:
|
||||
|
||||
* When a thread calls `psa_destroy_key`, it continues as normal until the `psa_unregister_read` call.
|
||||
* Instead of calling `psa_unregister_read`, the thread waits until the condition `slot->registered_readers == 1` is true (the destroying thread is the final reader).
|
||||
* At this point, the destroying thread directly calls `psa_wipe_key_slot`.
|
||||
|
||||
A few changes are needed for this to follow our destruction requirements:
|
||||
|
||||
* Multi-part operations will need to remain registered as readers of their key slot until their copy of the key is destroyed, i.e. at the end of the finish/abort call.
|
||||
* The functionality where `psa_unregister_read` can wipe the key slot will need to be removed; slot wiping is now only done by the destroying/wiping thread.
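
A minimal sketch of this destruction scheme using POSIX threads is shown below. It is only an illustration: the `sketch_` names and the reduced slot structure are hypothetical, a POSIX condition variable stands in for whatever primitive a future threading abstraction would provide, and error checking of the pthread calls is omitted for brevity.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <string.h>

typedef struct {
    int pending_deletion;
    size_t registered_readers;
    unsigned char key[16];
} sketch_slot_t;

static pthread_mutex_t sketch_mutex = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t sketch_readers_changed = PTHREAD_COND_INITIALIZER;

/* Reader side: in this scheme unregistering never wipes the slot; it only
 * wakes up a destroying thread that may be waiting. */
static void sketch_unregister_read(sketch_slot_t *slot)
{
    pthread_mutex_lock(&sketch_mutex);
    assert(slot->registered_readers > 0);
    slot->registered_readers--;
    pthread_cond_broadcast(&sketch_readers_changed);
    pthread_mutex_unlock(&sketch_mutex);
}

/* Destroyer side: the caller is assumed to be registered as a reader. */
static void sketch_destroy(sketch_slot_t *slot)
{
    pthread_mutex_lock(&sketch_mutex);
    slot->pending_deletion = 1; /* no new reader may register from now on */
    while (slot->registered_readers != 1) {
        /* Atomically releases the mutex while waiting. */
        pthread_cond_wait(&sketch_readers_changed, &sketch_mutex);
    }
    /* The destroying thread is the final reader: wipe directly. */
    memset(slot, 0, sizeof(*slot));
    pthread_mutex_unlock(&sketch_mutex);
}

/* Thread entry point for a reader that finishes its work and unregisters. */
static void *sketch_reader_thread(void *arg)
{
    sketch_unregister_read((sketch_slot_t *) arg);
    return NULL;
}
```

With this design, when `sketch_destroy` returns, the slot contents are guaranteed to be wiped, which is what the final destruction requirement asks for. The cost is that the destroying thread blocks until all readers are done.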
|
||||
|
||||
### Protecting operation contexts
|
||||
|
||||
Currently, we rely on the crypto service to ensure that the same operation is not invoked concurrently. This abides by the PSA Crypto API Specification ([PSA Concurrent calling conventions](#psa-concurrent-calling-conventions)).
|
||||
|
||||
Concurrent access to the same operation object can compromise the crypto service. For example, if the operation context contains a pointer, a concurrent update may leave it in an inconsistent state: depending on the compiler and the platform, the pointer assignment may or may not be atomic. This violates the functional correctness requirement of the crypto service.
|
||||
|
||||
If, in future, we want to protect against this within the library then operations will require a status field protected by a global mutex. On entry, API calls would check the state and return an error if the state is ACTIVE. If the state is INACTIVE, then the call will set the state to ACTIVE, do the operation section and then restore the state to INACTIVE before returning.
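
The following sketch illustrates such a guard. It is not an existing library feature: the `sketch_` names, the choice of a "bad state" error and the flag standing in for the global mutex are all assumptions made for this example. A second call on an active operation is simulated by re-entering from inside the operation body.

```c
#include <assert.h>

typedef enum { SKETCH_SUCCESS = 0, SKETCH_ERROR_BAD_STATE } sketch_status_t;
typedef enum { SKETCH_OP_INACTIVE = 0, SKETCH_OP_ACTIVE } sketch_op_state_t;

typedef struct {
    sketch_op_state_t state;
} sketch_operation_t;

typedef sketch_status_t (*sketch_op_body_t)(sketch_operation_t *op);

static int sketch_mutex_held; /* stand-in for a global mutex */

static void sketch_lock(void)
{
    assert(!sketch_mutex_held);
    sketch_mutex_held = 1;
}

static void sketch_unlock(void)
{
    assert(sketch_mutex_held);
    sketch_mutex_held = 0;
}

/* Run `body` on `op` unless another call is already active on it. */
static sketch_status_t sketch_guarded_call(sketch_operation_t *op,
                                           sketch_op_body_t body)
{
    sketch_status_t status;

    sketch_lock();
    if (op->state == SKETCH_OP_ACTIVE) {
        sketch_unlock();
        return SKETCH_ERROR_BAD_STATE;
    }
    op->state = SKETCH_OP_ACTIVE;
    sketch_unlock();

    status = body(op); /* the operation itself is not run under the mutex */

    sketch_lock();
    op->state = SKETCH_OP_INACTIVE;
    sketch_unlock();
    return status;
}

static sketch_status_t sketch_last_nested_status;

static sketch_status_t sketch_body_noop(sketch_operation_t *op)
{
    (void) op;
    return SKETCH_SUCCESS;
}

/* Simulates a concurrent call arriving while the operation is active. */
static sketch_status_t sketch_body_reenter(sketch_operation_t *op)
{
    sketch_last_nested_status = sketch_guarded_call(op, sketch_body_noop);
    return SKETCH_SUCCESS;
}
```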
|
||||
|
||||
### Future driver work
|
||||
|
||||
A future policy we may wish to enforce for drivers is:
|
||||
|
||||
* By default, each driver only has at most one entry point active at any given time. In other words, each driver has its own exclusive lock.
|
||||
* Drivers have an optional `"thread_safe"` boolean property. If true, it allows concurrent calls to this driver.
|
||||
* Even with a thread-safe driver, the core never starts the destruction of a key while there are operations in progress on it, and never performs concurrent calls on the same multipart operation.
|
||||
|
||||
In the non-thread-safe case we have these natural assumptions/requirements:
|
||||
|
||||
1. Drivers don't call the core for any operation for which they provide an entry point.
|
||||
2. The core doesn't hold the driver mutex between calls to entry points.
|
||||
|
||||
With these, the only way a deadlock can occur is when there are several drivers with circular dependencies. That is, Driver A makes a call that is dispatched to Driver B; upon executing this call Driver B makes a call that is dispatched to Driver A. For example, Driver A does CCM, which calls Driver B to do CBC-MAC, which in turn calls Driver A to perform AES.
|
||||
|
||||
Potential ways for resolving this:
|
||||
|
||||
1. Non-thread-safe drivers must not call the core.
|
||||
2. Provide a new public API that drivers can safely call.
|
||||
3. Make the dispatch layer public for drivers to call.
|
||||
4. There is a whitelist of core APIs that drivers can call. Drivers providing entry points to these must not make a call to the core when handling these calls. (Drivers are still allowed to call any core API that can't have a driver entry point.)
|
||||
|
||||
The first is too restrictive; the second and the third would require making it a stable API, and would likely increase the code size for a relatively rare feature. We are choosing the fourth as that is the most viable option.
|
||||
|
||||
**Thread-safe drivers:**
|
||||
|
||||
A driver would be thread-safe if its `"thread_safe"` property is set to true.
|
||||
|
||||
To make re-entrancy in non-thread-safe drivers work, thread-safe drivers must not make a call to the core when handling a call that is on the non-thread-safe driver core API whitelist.
|
||||
|
||||
Thread-safe drivers have fewer guarantees from the core and need to implement more complex logic. We can reasonably expect them to be more flexible in terms of re-entrancy as well. At this point it is hard to see what further guarantees would be useful and feasible. Therefore, we don't provide any further guarantees for now.
|
||||
|
||||
Thread-safe drivers must not make any assumption about the operation of the core beyond what is discussed here.
|
@@ -1,597 +0,0 @@
|
||||
# Mbed TLS driver interface test strategy
|
||||
|
||||
This document describes the test strategy for the driver interfaces in Mbed TLS. Mbed TLS has interfaces for secure element drivers, accelerator drivers and entropy drivers. This document is about testing Mbed TLS itself; testing drivers is out of scope.
|
||||
|
||||
The driver interfaces are standardized through PSA Cryptography functional specifications.
|
||||
|
||||
## Secure element driver interface testing
|
||||
|
||||
### Secure element driver interfaces
|
||||
|
||||
#### Opaque driver interface
|
||||
|
||||
The [unified driver interface](../../proposed/psa-driver-interface.md) supports both transparent drivers (for accelerators) and opaque drivers (for secure elements).
|
||||
|
||||
Drivers exposing this interface need to be registered at compile time by declaring their JSON description file.
|
||||
|
||||
#### Dynamic secure element driver interface
|
||||
|
||||
The dynamic secure element driver interface (SE interface for short) is defined by [`psa/crypto_se_driver.h`](../../../tf-psa-crypto/include/psa/crypto_se_driver.h). This is an interface between Mbed TLS and one or more third-party drivers.
|
||||
|
||||
The SE interface consists of one function provided by Mbed TLS (`psa_register_se_driver`) and many functions that drivers must implement. To make a driver usable by Mbed TLS, the initialization code must call `psa_register_se_driver` with a structure that describes the driver. The structure mostly contains function pointers, pointing to the driver's methods. All calls to a driver function are triggered by a call to a PSA crypto API function.
|
||||
|
||||
### SE driver interface unit tests
|
||||
|
||||
This section describes unit tests that must be implemented to validate the secure element driver interface. Note that a test case may cover multiple requirements; for example a “good case” test can validate that the proper function is called, that it receives the expected inputs and that it produces the expected outputs.
|
||||
|
||||
Many SE driver interface unit tests could be covered by running the existing API tests with a key in a secure element.
|
||||
|
||||
#### SE driver registration
|
||||
|
||||
This applies to dynamic drivers only.
|
||||
|
||||
* Test `psa_register_se_driver` with valid and with invalid arguments.
|
||||
* Make at least one failing call to `psa_register_se_driver` followed by a successful call.
|
||||
* Make at least one test that successfully registers the maximum number of drivers and fails to register one more.
|
||||
|
||||
#### Dispatch to SE driver
|
||||
|
||||
For each API function that can lead to a driver call (more precisely, for each driver method call site, but this is practically equivalent):
|
||||
|
||||
* Make at least one test with a key in a secure element that checks that the driver method is called. A few API functions involve multiple driver methods; these should validate that all the expected driver methods are called.
|
||||
* Make at least one test with a key that is not in a secure element that checks that the driver method is not called.
|
||||
* Make at least one test with a key in a secure element with a driver that does not have the requisite method (i.e. the method pointer is `NULL`) but has the substructure containing that method, and check that the return value is `PSA_ERROR_NOT_SUPPORTED`.
|
||||
* Make at least one test with a key in a secure element with a driver that does not have the substructure containing that method (i.e. the pointer to the substructure is `NULL`), and check that the return value is `PSA_ERROR_NOT_SUPPORTED`.
|
||||
* At least one test should register multiple drivers with a key in each driver and check that the expected driver is called. This does not need to be done for all operations (use a white-box approach to determine if operations may use different code paths to choose the driver).
|
||||
* At least one test should register the same driver structure with multiple lifetime values and check that the driver receives the expected lifetime value.
|
||||
|
||||
Some methods only make sense as a group (for example a driver that provides the MAC methods must provide all or none). In those cases, test with all of them null and none of them null.
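
The two `NULL` cases above correspond to two distinct checks at the driver method call site. The sketch below shows the shape of such a check and of the matching tests; the structure and function names are hypothetical stand-ins for the driver structure, its substructures and the dispatch code, and a hit counter plays the role of the test instrumentation.

```c
#include <assert.h>
#include <stddef.h>

typedef enum {
    SKETCH_SUCCESS = 0,
    SKETCH_ERROR_NOT_SUPPORTED
} sketch_status_t;

/* Hypothetical substructure grouping related driver methods. */
typedef struct {
    sketch_status_t (*p_sign)(unsigned *hits);
} sketch_asymmetric_methods_t;

/* Hypothetical driver description: pointers to optional substructures. */
typedef struct {
    const sketch_asymmetric_methods_t *asymmetric;
} sketch_se_driver_t;

/* Call site: the method is unavailable if either the substructure pointer
 * or the method pointer is NULL. */
static sketch_status_t sketch_dispatch_sign(const sketch_se_driver_t *drv,
                                            unsigned *hits)
{
    assert(drv != NULL && hits != NULL);
    if (drv->asymmetric == NULL || drv->asymmetric->p_sign == NULL) {
        return SKETCH_ERROR_NOT_SUPPORTED;
    }
    return drv->asymmetric->p_sign(hits);
}

/* Test driver method: records that it was called. */
static sketch_status_t sketch_test_sign(unsigned *hits)
{
    (*hits)++;
    return SKETCH_SUCCESS;
}
```

A test would then register three variants of the driver (method present, method pointer null, substructure pointer null) and check both the returned status and the hit counter.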
|
||||
|
||||
#### SE driver inputs
|
||||
|
||||
For each API function that can lead to a driver call (more precisely, for each driver method call site, but this is practically equivalent):
|
||||
|
||||
* Wherever the specification guarantees parameters that satisfy certain preconditions, check these preconditions whenever practical.
|
||||
* If the API function can take parameters that are invalid and must not reach the driver, call the API function with such parameters and verify that the driver method is not called.
|
||||
* Check that the expected inputs reach the driver. This may be implicit in a test that checks the outputs if the only realistic way to obtain the correct outputs is to start from the expected inputs (as is often the case for cryptographic material, but not for metadata).
|
||||
|
||||
#### SE driver outputs
|
||||
|
||||
For each API function that leads to a driver call, call it with parameters that cause a driver to be invoked and check how Mbed TLS handles the outputs.
|
||||
|
||||
* Correct outputs.
|
||||
* Incorrect outputs such as an invalid output length.
|
||||
* Expected errors (e.g. `PSA_ERROR_INVALID_SIGNATURE` from a signature verification method).
|
||||
* Unexpected errors. At least test that if the driver returns `PSA_ERROR_GENERIC_ERROR`, this is propagated correctly.
|
||||
|
||||
Key creation functions invoke multiple methods and need more complex error handling:
|
||||
|
||||
* Check the consequence of errors detected at each stage (slot number allocation or validation, key creation method, storage accesses).
|
||||
* Check that the storage ends up in the expected state. At least make sure that no intermediate file remains after a failure.
|
||||
|
||||
#### Persistence of SE keys
|
||||
|
||||
The following tests must be performed at least once for each key creation method (import, generate, ...).
|
||||
|
||||
* Test that keys in a secure element survive `psa_close_key(); psa_open_key()`.
|
||||
* Test that keys in a secure element survive `mbedtls_psa_crypto_free(); psa_crypto_init()`.
|
||||
* Test that the driver's persistent data survives `mbedtls_psa_crypto_free(); psa_crypto_init()`.
|
||||
* Test that `psa_destroy_key()` does not leave any trace of the key.
|
||||
|
||||
#### Resilience for SE drivers
|
||||
|
||||
Creating or removing a key in a secure element involves multiple storage modifications (M<sub>1</sub>, ..., M<sub>n</sub>). If the operation is interrupted by a reset at any point, it must be either rolled back or completed.
|
||||
|
||||
* For each potential interruption point (before M<sub>1</sub>, between M<sub>1</sub> and M<sub>2</sub>, ..., after M<sub>n</sub>), call `mbedtls_psa_crypto_free(); psa_crypto_init()` at that point and check that this either rolls back or completes the operation that was started.
|
||||
* This must be done for each key creation method and for key destruction.
|
||||
* This must be done for each possible flow, including error cases (e.g. a key creation that fails midway due to `OUT_OF_MEMORY`).
|
||||
* The recovery during `psa_crypto_init` can itself be interrupted. Test those interruptions too.
|
||||
* Two things need to be tested: the key that is being created or destroyed, and the driver's persistent storage.
|
||||
* Check both that the storage has the expected content (this can be done by e.g. using a key that is supposed to be present) and does not have any unexpected content (for keys, this can be done by checking that `psa_open_key` fails with `PSA_ERROR_DOES_NOT_EXIST`).
|
||||
|
||||
This requires instrumenting the storage implementation, either to force it to fail at each point or to record successive storage states and replay each of them. Each `psa_its_xxx` function call is assumed to be atomic.
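
As a rough illustration of the "force it to fail at each point" approach, the sketch below models a key creation made of three atomic storage modifications and interrupts it at every possible point. Everything here is hypothetical: the storage is two boolean "files", the journal-based protocol and the roll-back-only recovery are assumptions chosen to keep the example short, and interruptions of the recovery itself are not covered.

```c
#include <assert.h>
#include <stdbool.h>

typedef struct {
    bool transaction_file;   /* journal: "a key creation is in progress" */
    bool key_file;           /* the key itself */
    int writes_until_reset;  /* negative: never reset */
} sketch_storage_t;

/* One atomic storage modification. Returns false if the simulated reset
 * happens before the modification takes place. */
static bool sketch_modify(sketch_storage_t *s, bool *file, bool value)
{
    if (s->writes_until_reset == 0) {
        return false;
    }
    if (s->writes_until_reset > 0) {
        s->writes_until_reset--;
    }
    *file = value;
    return true;
}

/* M1: write the journal. M2: write the key. M3: remove the journal. */
static bool sketch_create_key(sketch_storage_t *s)
{
    return sketch_modify(s, &s->transaction_file, true) &&
           sketch_modify(s, &s->key_file, true) &&
           sketch_modify(s, &s->transaction_file, false);
}

/* Recovery at initialization: roll back any creation still in progress. */
static bool sketch_recover(sketch_storage_t *s)
{
    if (!s->transaction_file) {
        return true;
    }
    return sketch_modify(s, &s->key_file, false) &&
           sketch_modify(s, &s->transaction_file, false);
}

/* Interrupt the creation before M1, between each pair of modifications and
 * after M3, and count the final states that are neither a clean rollback
 * nor a completed creation. */
static int sketch_count_inconsistent_states(void)
{
    int bad = 0;
    for (int point = 0; point <= 3; point++) {
        sketch_storage_t s = { false, false, point };
        bool created = sketch_create_key(&s);
        s.writes_until_reset = -1; /* after the reset, storage works again */
        if (!sketch_recover(&s)) {
            bad++;
        }
        if (s.transaction_file) {
            bad++; /* an intermediate file remains */
        }
        if (created != s.key_file) {
            bad++; /* key present after a failure, or absent after success */
        }
    }
    return bad;
}
```

Because this toy recovery always rolls back, the key must be present exactly when the creation call reported success; a real test would also accept a recovery that completes the operation.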
|
||||
|
||||
### SE driver system tests
|
||||
|
||||
#### Real-world use case
|
||||
|
||||
We must have at least one driver that is close to real-world conditions:
|
||||
|
||||
* With its own source tree.
|
||||
* Running on actual hardware.
|
||||
* Run the full driver validation test suite (which does not yet exist).
|
||||
* Run at least one test application (e.g. the Mbed OS TLS example).
|
||||
|
||||
This requirement shall be fulfilled by the [Microchip ATECC508A driver](https://github.com/ARMmbed/mbed-os-atecc608a/).
|
||||
|
||||
#### Complete driver
|
||||
|
||||
We should have at least one driver that covers the whole interface:
|
||||
|
||||
* With its own source tree.
|
||||
* Implementing all the methods.
|
||||
* Run the full driver validation test suite (which does not yet exist).
|
||||
|
||||
A PKCS#11 driver would be a good candidate. It would be useful as part of our product offering.
|
||||
|
||||
## Unified driver interface testing
|
||||
|
||||
The [unified driver interface](../../proposed/psa-driver-interface.md) defines interfaces for accelerators.
|
||||
|
||||
### Test requirements
|
||||
|
||||
#### Requirements for transparent driver testing
|
||||
|
||||
Every cryptographic mechanism for which a transparent driver interface exists (key creation, cryptographic operations, …) must be exercised in at least one build. The test must verify that the driver code is called.
|
||||
|
||||
#### Requirements for fallback
|
||||
|
||||
The driver interface includes a fallback mechanism so that a driver can reject a request at runtime and let another driver handle the request. For each entry point, there must be at least three test runs with two or more drivers available with driver A configured to fall back to driver B, with one run where A returns `PSA_SUCCESS`, one where A returns `PSA_ERROR_NOT_SUPPORTED` and B is invoked, and one where A returns a different error and B is not invoked.
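
The three required runs can be modelled as below. This is a sketch in Python under stated assumptions: `StubDriver`, `dispatch` and `run_case` are hypothetical names, and the dispatch loop is a simplified stand-in for the library's driver wrappers. The numeric values mirror the PSA Crypto API status codes.

```python
PSA_SUCCESS = 0
PSA_ERROR_NOT_SUPPORTED = -134
PSA_ERROR_HARDWARE_FAILURE = -147


class StubDriver:
    """Driver stub that always returns a fixed status and counts its calls."""

    def __init__(self, status):
        self.status = status
        self.hits = 0

    def entry_point(self):
        self.hits += 1
        return self.status


def dispatch(drivers):
    """Try drivers in order; move on only when one returns NOT_SUPPORTED."""
    status = PSA_ERROR_NOT_SUPPORTED
    for driver in drivers:
        status = driver.entry_point()
        if status != PSA_ERROR_NOT_SUPPORTED:
            return status
    return status


def run_case(status_a):
    """Run one test with driver A returning status_a and driver B succeeding."""
    a = StubDriver(status_a)
    b = StubDriver(PSA_SUCCESS)
    return dispatch([a, b]), a.hits, b.hits
```

Calling `run_case` with `PSA_SUCCESS`, `PSA_ERROR_NOT_SUPPORTED` and `PSA_ERROR_HARDWARE_FAILURE` covers the three runs: B must be hit only in the second one, and the third must propagate A's error unchanged.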
|
||||
|
||||
### Test drivers
|
||||
|
||||
We have test drivers that are enabled by `PSA_CRYPTO_DRIVER_TEST` (not present
|
||||
in the usual config files, must be defined on the command line or in a custom
|
||||
config file). Those test drivers are implemented in `framework/tests/src/drivers/*.c`
|
||||
and their API is declared in `framework/tests/include/test/drivers/*.h`.
|
||||
|
||||
We have two test drivers registered: `mbedtls_test_opaque_driver` and
|
||||
`mbedtls_test_transparent_driver`. These are described in
|
||||
`scripts/data_files/driver_jsons/mbedtls_test_xxx_driver.json` (as much as our
|
||||
JSON support currently allows). Each of the drivers can potentially implement
|
||||
support for several mechanisms; conversely, each of the files mentioned in the
|
||||
previous paragraph can potentially contribute to both the opaque and the
|
||||
transparent test driver.
|
||||
|
||||
Each entry point is instrumented to record the number of hits for each part of
|
||||
the driver (same division as the files) and the status of the last call. It is
|
||||
also possible to force the next call to return a specified status, and
|
||||
sometimes more things can be forced: see the various
|
||||
`mbedtls_test_driver_XXX_hooks_t` structures declared by each driver (and
|
||||
subsections below).
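
The instrumentation pattern described above (hit counter, status of the last call, forced status) can be illustrated with a small Python model. The real hooks are C structures; `HashHooks`, `hash_hooks` and `driver_hash_compute` below are hypothetical names, and SHA-256 from the standard library stands in for the actual back-end call.

```python
import hashlib

PSA_SUCCESS = 0
PSA_ERROR_INSUFFICIENT_MEMORY = -141


class HashHooks:
    """Model of a per-family hooks structure."""

    def __init__(self):
        self.hits = 0                       # number of calls into this part of the driver
        self.forced_status = PSA_SUCCESS    # any other value is returned without doing the work
        self.driver_status = PSA_SUCCESS    # status of the last call


hash_hooks = HashHooks()


def driver_hash_compute(data):
    """Instrumented entry point: count the hit, honour a forced status, record the result."""
    hash_hooks.hits += 1
    if hash_hooks.forced_status != PSA_SUCCESS:
        hash_hooks.driver_status = hash_hooks.forced_status
        return hash_hooks.driver_status, b""
    hash_hooks.driver_status = PSA_SUCCESS
    return PSA_SUCCESS, hashlib.sha256(data).digest()
```

A test would typically reset the hooks, optionally set `forced_status`, call the public API, and then check both the returned status and `hits`.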
|
||||
|
||||
The drivers can use one of two back-ends:
|
||||
- internal: this requires the built-in implementation to be present.
|
||||
- libtestdriver1: this allows the built-in implementation to be omitted from
|
||||
the build.
|
||||
|
||||
Historical note: internal was initially the only back-end; then support for
|
||||
libtestdriver1 was added gradually. Support for libtestdriver1 is now complete
|
||||
(see following sub-sections), so we could remove internal now. Note it's
|
||||
useful to have builds with both a driver and the built-in, in order to test
|
||||
fallback to built-in, which is currently done only with internal, but this can
|
||||
be achieved with libtestdriver1 just as well.
|
||||
|
||||
Note on instrumentation: originally, when only the internal backend was
|
||||
available, hits were how we knew that the driver was called, as opposed to
|
||||
directly calling the built-in code. With libtestdriver1, we can check that by
|
||||
ensuring that the built-in code is not present, so if the operation gives the
|
||||
correct result, only a driver call can have calculated that result. So,
|
||||
nowadays there is low value in checking the hit count. There is still some
|
||||
value for hit counts, e.g. checking that we don't call a multipart entry point
|
||||
when we intended to call the one-shot entry point, but it's limited.
|
||||
|
||||
Note: our test drivers tend to provide all possible entry points (with a few
|
||||
exceptions that may not be intentional, see the next sections). However, in
|
||||
some cases, when an entry point is not available, the core is supposed to
|
||||
implement it using other entry points, for example:
|
||||
- `mac_verify` may use `mac_compute` if the driver does not provide verify;
|
||||
- for things that have both one-shot and multi-part API, the driver can
|
||||
provide only the multi-part entry points, and the core is supposed to
|
||||
implement one-shot on top of it (but still call the one-shot entry points when
|
||||
they're available);
|
||||
- `sign/verify_message` can be implemented on top of `sign/verify_hash` for
|
||||
some algorithms;
|
||||
- (not sure if the list is exhaustive).
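
As an illustration of the first case, the sketch below shows a core-side verify that prefers the driver's `mac_verify` and otherwise builds on `mac_compute`. It is a Python model only: `ComputeOnlyDriver` and `core_mac_verify` are hypothetical names, HMAC-SHA-256 from the standard library stands in for the driver's algorithm, and the numeric values mirror the PSA Crypto API status codes.

```python
import hashlib
import hmac

PSA_SUCCESS = 0
PSA_ERROR_INVALID_SIGNATURE = -149


class ComputeOnlyDriver:
    """Driver that provides mac_compute but no mac_verify."""

    def mac_compute(self, key, message):
        return PSA_SUCCESS, hmac.new(key, message, hashlib.sha256).digest()


def core_mac_verify(driver, key, message, expected_mac):
    """Use the driver's mac_verify when present, else compute and compare."""
    verify = getattr(driver, "mac_verify", None)
    if verify is not None:
        return verify(key, message, expected_mac)
    status, mac = driver.mac_compute(key, message)
    if status != PSA_SUCCESS:
        return status
    # Constant-time comparison, as expected of MAC verification.
    if not hmac.compare_digest(mac, expected_mac):
        return PSA_ERROR_INVALID_SIGNATURE
    return PSA_SUCCESS
```

Testing both paths requires one driver with `mac_verify` and one without, which is exactly the kind of build variation the following paragraph asks for.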
|
||||
|
||||
Ideally, we'd want build options for the test drivers so that we can test with
|
||||
different combinations of entry points present, and make sure the core behaves
|
||||
appropriately when some entry points are absent but other entry points allow
|
||||
implementing the operation. This will remain hard to test until we have proper
|
||||
support for JSON-defined drivers with auto-generation of dispatch code.
|
||||
(The `MBEDTLS_PSA_ACCEL_xxx` macros we currently use are not expressive enough
|
||||
to specify which entry points are supported for a given mechanism.)
|
||||
|
||||
Our implementation of PSA Crypto is structured in a way that the built-in
|
||||
implementation of each operation follows the driver API, see
|
||||
[`../architecture/psa-crypto-implementation-structure.md`](../architecture/psa-crypto-implementation-structure.html).
|
||||
This makes implementing the test drivers very easy: each entry point has a
|
||||
corresponding `mbedtls_psa_xxx()` function that it can call as its
|
||||
implementation - with the `libtestdriver1` back-end the function is called
|
||||
`libtestdriver1_mbedtls_psa_xxx()` instead.
|
||||
|
||||
A nice consequence of that strategy is that when an entry point has
|
||||
test-driver support, most of the time, it automatically works for all
|
||||
algorithms and key types supported by the library. (The exception being when
|
||||
the driver needs to call a different function for different key types, as is
|
||||
the case with some asymmetric key management operations.) (Note: it's still
|
||||
useful to test drivers in configurations with partial algorithm support, and
|
||||
that can still be done by configuring libtestdriver1 and the main library as
|
||||
desired.)
|
||||
|
||||
The renaming process for `libtestdriver1` is implemented as a few Perl regexes
|
||||
applied to a copy of the library code, see the `libtestdriver1.a` target in
|
||||
`tests/Makefile`. Another modification that's done to this copy is appending
|
||||
`tests/configs/crypto_config_test_driver_extension.h` to `psa/crypto_config.h`.
|
||||
This file reverses the `ACCEL`/`BUILTIN` macros so that `libtestdriver1`
|
||||
includes as built-in what the main `libmbedcrypto.a` will have accelerated;
|
||||
see that file's initial comment for details. See also `helper_libtestdriver1_`
|
||||
functions and the preceding comment in `all.sh` for how libtestdriver is used
|
||||
in practice.
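
The effect of the renaming step can be approximated as follows. This is a Python sketch of the idea only: the real expressions are Perl and live in `tests/Makefile`, and they are not reproduced here. `rename_for_libtestdriver1` is a hypothetical name, and the sketch handles only lowercase `mbedtls_`/`psa_` identifiers.

```python
import re

# An identifier that starts with "mbedtls_" or "psa_" at a word boundary.
# The underscore is a word character, so "psa_" inside
# "libtestdriver1_psa_..." or "mbedtls_psa_..." does not match again.
_SYMBOL = re.compile(r"\b((?:mbedtls|psa)_\w+)")


def rename_for_libtestdriver1(source):
    """Prefix library symbols so the copy can be linked next to the main library."""
    return _SYMBOL.sub(r"libtestdriver1_\1", source)
```

Because an already prefixed identifier no longer matches, applying the function twice gives the same result as applying it once.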
|
||||
|
||||
This general framework needs specific code for each family of operations. At a
|
||||
given point in time, not all operations have the same level of support. The
|
||||
following sub-sections describe the status of the test driver support, mostly
|
||||
following the structure and order of sections 9.6 and 10.2 to 10.10 of the
|
||||
[PSA Crypto standard](https://arm-software.github.io/psa-api/crypto/1.1/) as
|
||||
that is also a natural division for implementing test drivers (that's how the
|
||||
code is divided into files).
|
||||
|
||||
#### Key management
|
||||
|
||||
The following entry points are declared in `test/drivers/key_management.h`:
|
||||
|
||||
- `"init"` (transparent and opaque)
|
||||
- `"generate_key"` (transparent and opaque)
|
||||
- `"export_public_key"` (transparent and opaque)
|
||||
- `"import_key"` (transparent and opaque)
|
||||
- `"export_key"` (opaque only)
|
||||
- `"get_builtin_key"` (opaque only)
|
||||
- `"copy_key"` (opaque only)
|
||||
|
||||
The transparent driver fully implements the declared entry points, and can use
|
||||
any backend: internal or libtestdriver1.
|
||||
|
||||
The opaque driver's implementation status is as follows:
|
||||
- `"generate_key"`: not implemented, always returns `NOT_SUPPORTED`.
|
||||
- `"export_public_key"`: implemented only for ECC and RSA keys, both backends.
|
||||
- `"import_key"`: implemented except for DH keys, both backends.
|
||||
- `"export_key"`: implemented for built-in keys (ECC and AES), and for
|
||||
non-builtin keys except DH keys. (Backend not relevant.)
|
||||
- `"get_builtin_key"`: implemented - provisioned keys: AES-128 and ECC
|
||||
secp256r1. (Backend not relevant.)
|
||||
- `"copy_key"`: implemented - emulates a SE without storage. (Backend not
|
||||
relevant.)
|
||||
|
||||
Note: the `"init"` entry point is not part of the "key management" family, but
|
||||
listed here as it's declared and implemented in the same file. With the
|
||||
transparent driver and the libtestdriver1 backend, it calls
|
||||
`libtestdriver1_psa_crypto_init()`, which partially but not fully ensures
|
||||
that this entry point is called before other entry points in the test drivers.
|
||||
With the opaque driver, this entry point just does nothing and returns success.
|
||||
|
||||
The following entry points are defined by the driver interface but missing
|
||||
from our test drivers:
|
||||
- `"allocate_key"`, `"destroy_key"`: this is for opaque drivers that store the
|
||||
key material internally.
|
||||
|
||||
Note: the instrumentation also allows forcing the output and its length.
|
||||
|
||||
#### Message digests (Hashes)
|
||||
|
||||
The following entry points are declared (transparent only):
|
||||
- `"hash_compute"`
|
||||
- `"hash_setup"`
|
||||
- `"hash_clone"`
|
||||
- `"hash_update"`
|
||||
- `"hash_finish"`
|
||||
- `"hash_abort"`
|
||||
|
||||
The transparent driver fully implements the declared entry points, and can use
|
||||
any backend: internal or libtestdriver1.
|
||||
|
||||
This family is not part of the opaque driver as it doesn't use keys.
|
||||
|
||||
#### Message authentication codes (MAC)
|
||||
|
||||
The following entry points are declared (transparent and opaque):
|
||||
- `"mac_compute"`
|
||||
- `"mac_sign_setup"`
|
||||
- `"mac_verify_setup"`
|
||||
- `"mac_update"`
|
||||
- `"mac_sign_finish"`
|
||||
- `"mac_verify_finish"`
|
||||
- `"mac_abort"`
|
||||
|
||||
The transparent driver fully implements the declared entry points, and can use
|
||||
any backend: internal or libtestdriver1.
|
||||
|
||||
The opaque driver only implements the instrumentation but not the actual
|
||||
operations: entry points will always return `NOT_SUPPORTED`, unless another
|
||||
status is forced.
|
||||
|
||||
The following entry points are not implemented:
|
||||
- `"mac_verify"`: this mostly makes sense for opaque drivers; the core will fall
|
||||
back to using `"mac_compute"` if this is not implemented. So, perhaps
|
||||
ideally we should test both with `"mac_verify"` implemented and with it not
|
||||
implemented? Anyway, we have a test gap here.
|
||||
|
||||
#### Unauthenticated ciphers
|
||||
|
||||
The following entry points are declared (transparent and opaque):
|
||||
- `"cipher_encrypt"`
|
||||
- `"cipher_decrypt"`
|
||||
- `"cipher_encrypt_setup"`
|
||||
- `"cipher_decrypt_setup"`
|
||||
- `"cipher_set_iv"`
|
||||
- `"cipher_update"`
|
||||
- `"cipher_finish"`
|
||||
- `"cipher_abort"`
|
||||
|
||||
The transparent driver fully implements the declared entry points, and can use
|
||||
any backend: internal or libtestdriver1.
|
||||
|
||||
The opaque driver is not implemented at all, neither instrumentation nor the
|
||||
operation: entry points always return `NOT_SUPPORTED`.
|
||||
|
||||
Note: the instrumentation also allows forcing a specific output and output
|
||||
length.
|
||||
|
||||
#### Authenticated encryption with associated data (AEAD)
|
||||
|
||||
The following entry points are declared (transparent only):
|
||||
- `"aead_encrypt"`
|
||||
- `"aead_decrypt"`
|
||||
- `"aead_encrypt_setup"`
|
||||
- `"aead_decrypt_setup"`
|
||||
- `"aead_set_nonce"`
|
||||
- `"aead_set_lengths"`
|
||||
- `"aead_update_ad"`
|
||||
- `"aead_update"`
|
||||
- `"aead_finish"`
|
||||
- `"aead_verify"`
|
||||
- `"aead_abort"`
|
||||
|
||||
The transparent driver fully implements the declared entry points, and can use
|
||||
any backend: internal or libtestdriver1.
|
||||
|
||||
The opaque driver does not implement or even declare entry points for this
|
||||
family.
|
||||
|
||||
Note: the instrumentation records the number of hits per entry point, not just
|
||||
the total number of hits for this family.
|
||||
|
||||
#### Key derivation
|
||||
|
||||
Not covered at all by the test drivers.
|
||||
|
||||
That's a test gap which reflects a feature gap: the driver interface does
|
||||
define a key derivation family of entry points, but we don't currently
|
||||
implement that part of the driver interface, see #5488 and related issues.
|
||||
|
||||
#### Asymmetric signature
|
||||
|
||||
The following entry points are declared (transparent and opaque):
|
||||
|
||||
- `"sign_message"`
|
||||
- `"verify_message"`
|
||||
- `"sign_hash"`
|
||||
- `"verify_hash"`
|
||||
|
||||
The transparent driver fully implements the declared entry points, and can use
|
||||
any backend: internal or libtestdriver1.
|
||||
|
||||
The opaque driver is not implemented at all, neither instrumentation nor the
|
||||
operation: entry points always return `NOT_SUPPORTED`.
|
||||
|
||||
Note: the instrumentation also allows forcing a specific output and output
|
||||
length, and has two instances of the hooks structure: one for sign, the other
|
||||
for verify.
|
||||
|
||||
Note: when a driver implements only the `"xxx_hash"` entry points, the core is
|
||||
supposed to implement the `psa_xxx_message()` functions by computing the hash
|
||||
itself before calling the `"xxx_hash"` entry point. Since the test driver does
|
||||
implement the `"xxx_message"` entry point, it's not exercising that part of
|
||||
the core's expected behaviour.
|
||||
|
||||
#### Asymmetric encryption
|
||||
|
||||
The following entry points are declared (transparent and opaque):
|
||||
|
||||
- `"asymmetric_encrypt"`
|
||||
- `"asymmetric_decrypt"`
|
||||
|
||||
The transparent driver fully implements the declared entry points, and can use
|
||||
any backend: internal or libtestdriver1.
|
||||
|
||||
The opaque driver implements the declared entry points, and can use any
|
||||
backend: internal or libtestdriver1. However it does not implement the
|
||||
instrumentation (hits, forced output/status), as this [was not an immediate
|
||||
priority](https://github.com/Mbed-TLS/mbedtls/pull/8700#issuecomment-1892466159).
|
||||
|
||||
Note: the instrumentation also allows forcing a specific output and output
|
||||
length.
|
||||
|
||||
#### Key agreement
|
||||
|
||||
The following entry points are declared (transparent and opaque):
|
||||
|
||||
- `"key_agreement"`
|
||||
|
||||
The transparent driver fully implements the declared entry points, and can use
|
||||
any backend: internal or libtestdriver1.
|
||||
|
||||
The opaque driver is not implemented at all, neither instrumentation nor the
|
||||
operation: entry points always return `NOT_SUPPORTED`.
|
||||
|
||||
Note: the instrumentation also allows forcing a specific output and output
|
||||
length.
|
||||
|
||||
#### Other cryptographic services (Random number generation)
|
||||
|
||||
Not covered at all by the test drivers.
|
||||
|
||||
The driver interface defines a `"get_entropy"` entry point, as well as a
|
||||
"Random generation" family of entry points. None of those are currently
|
||||
implemented in the library. Part of it will be planned for 4.0, see #8150.
|
||||
|
||||
#### PAKE extension
|
||||
|
||||
The following entry points are declared (transparent only):
|
||||
- `"pake_setup"`
|
||||
- `"pake_output"`
|
||||
- `"pake_input"`
|
||||
- `"pake_get_implicit_key"`
|
||||
- `"pake_abort"`
|
||||
|
||||
Note: the instrumentation records hits per entry point and allows forcing the
|
||||
output and its length, as well as forcing the status of setup independently
|
||||
from the others.
|
||||
|
||||
The transparent driver fully implements the declared entry points, and can use
|
||||
any backend: internal or libtestdriver1.
|
||||
|
||||
The opaque driver does not implement or even declare entry points for this
|
||||
family.
|
||||
|
||||
### Driver wrapper test suite
|
||||
|
||||
We have a test suite dedicated to driver dispatch, which takes advantage of the
|
||||
instrumentation in the test drivers described in the previous section, in
|
||||
order to check that drivers are called when they're supposed to, and that the
|
||||
core behaves as expected when they return errors (in particular, that we fall
|
||||
back to the built-in implementation when the driver returns `NOT_SUPPORTED`).
|
||||
|
||||
This is `test_suite_psa_crypto_driver_wrappers`, which is maintained manually
|
||||
(that is, the test cases in the `.data` files are not auto-generated). The
|
||||
entire test suite depends on the test drivers being enabled
|
||||
(`PSA_CRYPTO_DRIVER_TEST`), which is not the case in the default or full
|
||||
config.
|
||||
|
||||
The test suite is focused on driver usage (mostly by checking the expected
|
||||
number of hits) but also does some validation of the results: for
|
||||
deterministic algorithms, known-answer tests are used, and for the rest, some
|
||||
consistency checks are done (more or less detailed depending on the algorithm
|
||||
and build configuration).
|
||||
|
||||
#### Configurations coverage
|
||||
|
||||
The driver wrappers test suite has cases that expect both the driver and the
|
||||
built-in to be present, and also cases that expect the driver to be present
|
||||
but not the built-in. As such, it's impossible for a single configuration to
|
||||
run all test cases, and we need at least two: driver+built-in, and
|
||||
driver-only.
|
||||
|
||||
- The driver+built-in case is covered by `test_psa_crypto_drivers` in `all.sh`.
|
||||
This covers all areas (key types and algs) at once.
|
||||
- The driver-only case is split into multiple `all.sh` components whose names
|
||||
start with `test_psa_crypto_config_accel`; we have one or more components per
|
||||
area, see below.
|
||||
|
||||
Here's a summary of driver-only coverage, grouped by families of key types.
|
||||
|
||||
Hash (key types: none)
|
||||
- `test_psa_crypto_config_accel_hash`: all algs, default config, no parity
|
||||
testing.
|
||||
- `test_psa_crypto_config_accel_hash_use_psa`: all algs, full config, with
|
||||
parity testing.
|
||||
|
||||
HMAC (key type: HMAC)
|
||||
- `test_psa_crypto_config_accel_hmac`: all algs, full config except a few
|
||||
exclusions (PKCS5, PKCS7, HMAC-DRBG, legacy HKDF, deterministic ECDSA), with
|
||||
parity testing.
|
||||
|
||||
Cipher, AEAD and CMAC (key types: DES, AES, ARIA, CHACHA20, CAMELLIA):
|
||||
- `test_psa_crypto_config_accel_cipher_aead_cmac`: all key types and algs, full
|
||||
config with a few exclusions (NIST-KW), with parity testing.
|
||||
- `test_psa_crypto_config_accel_des`: only DES (with all algs), full
|
||||
config, no parity testing.
|
||||
- `test_psa_crypto_config_accel_aead`: only AEAD algs (with all relevant key
|
||||
types), full config, no parity testing.
|
||||
|
||||
Key derivation (key types: `DERIVE`, `RAW_DATA`, `PASSWORD`, `PEPPER`,
|
||||
`PASSWORD_HASH`):
|
||||
- No testing as we don't have driver support yet (see previous section).
|
||||
|
||||
RSA (key types: `RSA_KEY_PAIR_xxx`, `RSA_PUBLIC_KEY`):
|
||||
- `test_psa_crypto_config_accel_rsa_crypto`: all 4 algs (encryption &
|
||||
signature, v1.5 & v2.1), config `crypto_full`, with parity testing excluding
|
||||
PK.
|
||||
|
||||
DH (key types: `DH_KEY_PAIR_xxx`, `DH_PUBLIC_KEY`):
|
||||
- `test_psa_crypto_config_accel_ffdh`: all key types and algs, full config,
|
||||
with parity testing.
|
||||
- `test_psa_crypto_config_accel_ecc_ffdh_no_bignum`: with also bignum removed.
|
||||
|
||||
ECC (key types: `ECC_KEY_PAIR_xxx`, `ECC_PUBLIC_KEY`):
|
||||
- Single algorithm accelerated (both key types, all curves):
|
||||
- `test_psa_crypto_config_accel_ecdh`: default config, no parity testing.
|
||||
- `test_psa_crypto_config_accel_ecdsa`: default config, no parity testing.
|
||||
- `test_psa_crypto_config_accel_pake`: full config, no parity testing.
|
||||
- All key types, algs and curves accelerated (full config with exceptions,
|
||||
with parity testing):
|
||||
- `test_psa_crypto_config_accel_ecc_ecp_light_only`: `ECP_C` mostly disabled
|
||||
- `test_psa_crypto_config_accel_ecc_no_ecp_at_all`: `ECP_C` fully disabled
|
||||
- `test_psa_crypto_config_accel_ecc_no_bignum`: `BIGNUM_C` disabled (DH disabled)
|
||||
- `test_psa_crypto_config_accel_ecc_ffdh_no_bignum`: `BIGNUM_C` disabled (DH accelerated)
|
||||
- Other - all algs accelerated but only some algs/curves (full config with
|
||||
exceptions, no parity testing):
|
||||
- `test_psa_crypto_config_accel_ecc_some_key_types`
|
||||
- `test_psa_crypto_config_accel_ecc_non_weierstrass_curves`
|
||||
- `test_psa_crypto_config_accel_ecc_weierstrass_curves`
|
||||
|
||||
Note: `analyze_outcomes.py` provides a list of test cases that are not
|
||||
executed in any configuration tested on the CI. We're missing driver-only HMAC
|
||||
testing, but no test is flagged as never executed there; this reveals we don't
|
||||
have "fallback not available" cases for MAC, see #8565.
|
||||
|
||||
#### Test case coverage
|
||||
|
||||
Since `test_suite_psa_crypto_driver_wrappers.data` is maintained manually,
|
||||
we need to make sure it exercises all the cases that need to be tested. In the
|
||||
future, this file should be generated in order to ensure exhaustiveness.
|
||||
|
||||
In the meantime, one way to observe (lack of) completeness is to look at line
|
||||
coverage in test driver implementations - this doesn't reveal all gaps, but it
|
||||
does reveal cases where we thought about something when writing the test
|
||||
driver, but not when writing test functions/data.
|
||||
|
||||
Key management:
|
||||
- `mbedtls_test_transparent_generate_key()` is not tested with RSA keys.
|
||||
- `mbedtls_test_transparent_import_key()` is not tested with DH keys.
|
||||
- `mbedtls_test_opaque_import_key()` is not tested with unstructured keys nor
|
||||
with RSA keys (nor DH keys since that's not implemented).
|
||||
- `mbedtls_test_opaque_export_key()` is not tested with non-built-in keys.
|
||||
- `mbedtls_test_transparent_export_public_key()` is not tested with RSA or DH keys.
|
||||
- `mbedtls_test_opaque_export_public_key()` is not tested with non-built-in keys.
|
||||
- `mbedtls_test_opaque_copy_key()` is not tested at all.
|
||||
|
||||
Hash:
|
||||
- `mbedtls_test_transparent_hash_finish()` is not tested with a forced status.
|
||||
|
||||
MAC:
|
||||
- The following are not tested with a forced status:
|
||||
- `mbedtls_test_transparent_mac_sign_setup()`
|
||||
- `mbedtls_test_transparent_mac_verify_setup()`
|
||||
- `mbedtls_test_transparent_mac_update()`
|
||||
- `mbedtls_test_transparent_mac_verify_finish()`
|
||||
- `mbedtls_test_transparent_mac_abort()`
|
||||
- No opaque entry point is tested (they're not implemented either).
|
||||
|
||||
Cipher:
|
||||
- The following are not tested with a forced status nor with a forced output:
|
||||
- `mbedtls_test_transparent_cipher_encrypt()`
|
||||
- `mbedtls_test_transparent_cipher_finish()`
|
||||
- No opaque entry point is tested (they're not implemented either).
|
||||
|
||||
AEAD:
|
||||
- The following are not tested with a forced status:
|
||||
- `mbedtls_test_transparent_aead_set_nonce()`
|
||||
- `mbedtls_test_transparent_aead_set_lengths()`
|
||||
- `mbedtls_test_transparent_aead_update_ad()`
|
||||
- `mbedtls_test_transparent_aead_update()`
|
||||
- `mbedtls_test_transparent_aead_finish()`
|
||||
- `mbedtls_test_transparent_aead_verify()`
|
||||
- `mbedtls_test_transparent_aead_verify()` is not tested with an invalid tag
|
||||
(though it might be in another test suite).
|
||||
|
||||
Signature:
|
||||
- `sign_hash()` is not tested with RSA-PSS.
|
||||
- No opaque entry point is tested (they're not implemented either).
|
||||
|
||||
Key agreement:
|
||||
- `mbedtls_test_transparent_key_agreement()` is not tested with FFDH.
|
||||
- No opaque entry point is tested (they're not implemented either).
|
||||
|
||||
PAKE:
|
||||
- All lines are covered.
|
@@ -1,127 +0,0 @@
|
||||
# Mbed TLS PSA keystore format stability testing strategy
|
||||
|
||||
## Introduction
|
||||
|
||||
The PSA crypto subsystem includes a persistent key store. It is possible to create a persistent key and read it back later. This must work even if Mbed TLS has been upgraded in the meantime (except for deliberate breaks in the backward compatibility of the storage).
|
||||
|
||||
The goal of this document is to define a test strategy for the key store that not only validates that it's possible to load a key that was saved with the version of Mbed TLS under test, but also that it's possible to load a key that was saved with previous versions of Mbed TLS.
|
||||
|
||||
Interoperability is not a goal: PSA crypto implementations are not intended to have compatible storage formats. Downgrading is not required to work.
|
||||
|
||||
## General approach
|
||||
|
||||
### Limitations of a direct approach
|
||||
|
||||
The goal of storage format stability testing is: as a user of Mbed TLS, I want to store a key under version V and read it back under version W, with W ≥ V.
|
||||
|
||||
Doing the testing this way would be difficult because we'd need to have version V of Mbed TLS available when testing version W.
|
||||
|
||||
An alternative, semi-direct approach consists of generating test data under version V, and reading it back under version W. Done naively, this would require keeping a large amount of test data (full test coverage multiplied by the number of versions that we want to preserve backward compatibility with).
|
||||
|
||||
### Save-and-compare approach
|
||||
|
||||
Importing and saving a key is deterministic. Therefore we can ensure the stability of the storage format by creating test cases under a version V of Mbed TLS, where the test case parameters include both the parameters to pass to key creation and the expected state of the storage after the key is created. The test case creates a key as indicated by the parameters, then compares the actual state of the storage with the expected state.
|
||||
|
||||
In addition, the test case also loads the key and checks that it has the expected data and metadata. Import-and-save testing and load-and-check testing can be split into separate test functions with the same payloads.
|
||||
|
||||
If the test passes with version V, this means that the test data is consistent with what the implementation does. When the test later runs under version W ≥ V, it creates and reads back a storage state which is known to be identical to the state that V would have produced. Thus, this approach validates that W can read storage states created by V.
|
||||
|
||||
Note that it is the combination of import-and-save passing on version V and load-and-check passing on version W with the same data that proves that version W can read back what version V wrote. From the perspective of a particular version of the library, the import-and-save tests guarantee forward compatibility while the load-and-check tests guarantee backward compatibility.
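
The split into import-and-save and load-and-check can be sketched with a toy key file format. Everything in this Python model is an assumption made for illustration: `TOY_MAGIC`, the field layout and the function names are hypothetical and do not describe the real key file format, which is defined in the storage specification.

```python
import struct

TOY_MAGIC = b"TOYKEY01"     # hypothetical; not the real key file magic
TOY_HEADER = "<8sHHI"       # magic, key type, bit size, material length


def serialize_key(key_type, bits, material):
    """Deterministic toy encoding of a key file."""
    header = struct.pack(TOY_HEADER, TOY_MAGIC, key_type, bits, len(material))
    return header + material


def parse_key(blob):
    """Inverse of serialize_key; rejects files with the wrong magic."""
    magic, key_type, bits, length = struct.unpack_from(TOY_HEADER, blob)
    if magic != TOY_MAGIC:
        raise ValueError("bad magic")
    offset = struct.calcsize(TOY_HEADER)
    return key_type, bits, blob[offset:offset + length]


def import_and_save(storage, uid, key_type, bits, material, expected_blob):
    """Create the key, then compare the storage state with the recorded one."""
    storage[uid] = serialize_key(key_type, bits, material)
    return storage[uid] == expected_blob


def load_and_check(expected_blob, key_type, bits, material):
    """Start from the recorded storage state and check what is read back."""
    return parse_key(expected_blob) == (key_type, bits, material)
```

The same `expected_blob` is a parameter of both test functions. It is recorded once under version V and then kept unchanged, so a later version W that alters `serialize_key` fails import-and-save, and one that alters `parse_key` fails load-and-check.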
|
||||
|
||||
Use a similar approach for files other than keys where possible and relevant.
|
||||
|
||||
### Keeping up with storage format evolution
|
||||
|
||||
Test cases should normally not be removed from the code base: if something has worked before, it should keep working in future versions, so we should keep testing it.
|
||||
|
||||
This cannot be enforced solely by looking at a single version of Mbed TLS, since there would be no indication that more test cases used to exist. It can only be enforced through review of library changes. The review is assisted by a tool that compares the old and the new version, which is implemented in `scripts/abi_check.py`. This tool fails the CI if a load-and-check test case disappears (changed test cases are raised as false positives).
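
The core of such a comparison can be sketched as a set difference over test case descriptions. This Python model is not the code of `scripts/abi_check.py`: `case_descriptions` and `missing_cases` are hypothetical names, and the sketch assumes a `.data` layout where test cases are separated by blank lines and start with a description line.

```python
def case_descriptions(data_text):
    """Return the description (first non-comment line) of each test case."""
    descriptions = set()
    for paragraph in data_text.split("\n\n"):
        lines = [line for line in paragraph.splitlines()
                 if line.strip() and not line.startswith("#")]
        if lines:
            descriptions.add(lines[0])
    return descriptions


def missing_cases(old_text, new_text):
    """List test cases present in the old version but absent from the new one."""
    return sorted(case_descriptions(old_text) - case_descriptions(new_text))
```

A check of this kind would fail whenever `missing_cases` returns a non-empty list. Comparing descriptions alone treats a renamed test case as a removed one, which is one way a changed test case can show up as a false positive.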
|
||||
|
||||
If the way certain keys are stored changes, and we don't deliberately decide to stop supporting old keys (which should only be done by retiring a version of the storage format), then we should keep the corresponding test cases in load-only mode: create a file with the expected content, load it and check the data that it contains.
|
||||
|
||||
## Storage architecture overview
|
||||
|
||||
The PSA subsystem provides storage on top of the PSA trusted storage interface. The state of the storage is a mapping from file identifier (a 64-bit number) to file content (a byte array). These files include:
|
||||
|
||||
* [Key files](#key-storage) (files containing one key's metadata and, except for some secure element keys, key material).
|
||||
* The [random generator injected seed or state file](#random-generator-state) (`PSA_CRYPTO_ITS_RANDOM_SEED_UID`).
|
||||
* [Storage transaction file](#storage-transaction-resumption).
|
||||
* [Driver state files](#driver-state-files).
|
||||
|
||||
For a more detailed description, refer to the [Mbed TLS storage specification](../mbed-crypto-storage-specification.md).
|
||||
|
||||
In addition, Mbed TLS includes an implementation of the PSA trusted storage interface on top of C stdio. This document addresses the test strategy for [PSA ITS over file](#psa-its-over-file) in a separate section below.
|
||||
|
||||
## Key storage testing
|
||||
|
||||
This section describes the desired test cases for keys created with the current storage format version. When the storage format changes, if backward compatibility is desired, old test data should be kept as described under [“Keeping up with storage format evolution”](#keeping-up-with-storage-format-evolution).
|
||||
|
||||
### Keystore layout
|
||||
|
||||
Objective: test that the key file name corresponds to the key identifier.
|
||||
|
||||
Method: Create a key with a given identifier (using `psa_import_key`) and verify that a file with the expected name is created, and no other. Repeat for different identifiers.
|
||||
|
||||
### General key format
|
||||
|
||||
Objective: test the format of the key file: which field goes where and how big it is.
|
||||
|
||||
Method: Create a key with certain metadata with `psa_import_key`. Read the file content and validate that it has the expected layout, deduced from the storage specification. Repeat with different metadata. Ensure that there are test cases covering all fields.
|
||||
|
||||
### Enumeration of test cases for keys
|
||||
|
||||
Objective: ensure that the coverage is sufficient to have assurance that all keys are stored correctly. This requires a sufficient selection of key types, sizes, policies, etc.
|
||||
|
||||
In particular, the tests must validate that each `PSA_xxx` constant that is stored in a key is covered by at least one test case:
|
||||
|
||||
* Lifetimes: `PSA_KEY_LIFETIME_xxx`, `PSA_KEY_PERSISTENCE_xxx`, `PSA_KEY_LOCATION_xxx`.
|
||||
* Usage flags: `PSA_KEY_USAGE_xxx`.
|
||||
* Algorithms in policies: `PSA_ALG_xxx`.
|
||||
* Key types: `PSA_KEY_TYPE_xxx`, `PSA_ECC_FAMILY_xxx`, `PSA_DH_FAMILY_xxx`.
|
||||
|
||||
In addition, the coverage of key material must ensure that any variation in key representation is detected. See [“Considerations on key material representations”](#considerations-on-key-material-representations) for considerations regarding key types.
|
||||
|
||||
Method: Each test case creates a key with `psa_import_key`, purges it from memory, then reads it back and exercises it.
|
||||
|
||||
Generate test cases automatically based on an enumeration of available constants and some knowledge of what attributes (sizes, algorithms, …) and content to use for keys of a certain type.
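
A generator of this kind could look like the sketch below. It is a Python illustration under stated assumptions: the `KEY_SIZES` table, the description format and `storage_test_cases` are hypothetical, and all-zero key material is used only to keep the example short.

```python
# Hypothetical knowledge table: which sizes to exercise for each key type.
KEY_SIZES = {
    "PSA_KEY_TYPE_AES": (128, 192, 256),
    "PSA_KEY_TYPE_HMAC": (128, 160, 256),
}


def storage_test_cases(key_sizes):
    """Yield (description, key type, bit size, key material) for each combination."""
    for key_type in sorted(key_sizes):
        short_name = key_type.replace("PSA_KEY_TYPE_", "")
        for bits in key_sizes[key_type]:
            yield f"save {short_name} {bits}-bit", key_type, bits, bytes(bits // 8)
```

Iterating over the key types in sorted order keeps the generated output stable from one run to the next, which matters when generated test data is compared between versions.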
|
||||
|
||||
### Testing with alternative lifetime values
|
||||
|
||||
Objective: have test coverage for lifetimes other than the default persistent lifetime (`PSA_KEY_LIFETIME_PERSISTENT`).
|
||||
|
||||
Method:
|
||||
|
||||
* For alternative locations: have tests conditional on the presence of a driver for that location.
|
||||
* For alternative persistence levels: have load-and-check tests for supported persistence levels. We may also want to have negative tests ensuring that keys with a not-supported persistence level are not accidentally created.
|
||||
|
||||
### Considerations on key material representations
|
||||
|
||||
The risk of incompatibilities in key representations depends on the key type and on the presence of drivers. Compatibility of and with drivers is currently out of scope of this document.
|
||||
|
||||
Some types only have one plausible representation. Others admit alternative plausible representations (different encodings, or non-canonical representations).
|
||||
Here are some areas to watch for, with an identified risk of incompatibilities.
|
||||
|
||||
* HMAC keys longer than the block size: pre-hashed or not?
|
||||
* DES keys: was parity enforced?
|
||||
* RSA keys: can invalid DER encodings (e.g. leading zeros, ignored sign bit) have been stored?
|
||||
* RSA private keys: can invalid CRT parameters have been stored?
|
||||
* Montgomery private keys: were they stored in masked form?
|
||||
|
||||
## Random generator state
|
||||
|
||||
TODO
|
||||
|
||||
## Driver state files
|
||||
|
||||
Not yet implemented.
|
||||
|
||||
TODO
|
||||
|
||||
## Storage transaction resumption
|
||||
|
||||
Only relevant for secure element support. Not yet fully implemented.
|
||||
|
||||
TODO
|
||||
|
||||
## PSA ITS over file
|
||||
|
||||
TODO
|