1
0
mirror of https://github.com/Mbed-TLS/mbedtls.git synced 2025-10-26 00:37:41 +03:00
Commit Graph

60 Commits

Author SHA1 Message Date
Valerio Setti
779a1a5b20 aria: remove leftover in comments
Signed-off-by: Valerio Setti <valerio.setti@nordicsemi.no>
2024-01-30 16:28:09 +01:00
Valerio Setti
ea3a6114e6 aria: replace ARIA_VALIDATE_RET() with a simple "if" block
Signed-off-by: Valerio Setti <valerio.setti@nordicsemi.no>
2024-01-29 12:00:21 +01:00
Valerio Setti
a45a399a6b lib: remove NULL pointer checks performed with MBEDTLS_INTERNAL_VALIDATE[_RET]
Symbols defined starting from MBEDTLS_INTERNAL_VALIDATE[_RET]
are removed as well.

Signed-off-by: Valerio Setti <valerio.setti@nordicsemi.no>
2024-01-29 12:00:15 +01:00
Yanray Wang
690ee81533 Merge remote-tracking branch 'origin/development' into support_cipher_encrypt_only 2023-11-23 10:31:26 +08:00
Dave Rodgman
16799db69a update headers
Signed-off-by: Dave Rodgman <dave.rodgman@arm.com>
2023-11-02 19:47:20 +00:00
Yanray Wang
b67b47425e Rename MBEDTLS_CIPHER_ENCRYPT_ONLY as MBEDTLS_BLOCK_CIPHER_NO_DECRYPT
Signed-off-by: Yanray Wang <yanray.wang@arm.com>
2023-10-31 17:22:06 +08:00
Yanray Wang
9141ad1223 aria/camellia/des: guard setkey_dec by CIPHER_ENCRYPT_ONLY
This is a pre-step to remove *setkey_dec_func in cipher_wrap ctx
when CIPHER_ENCRYPT_ONLY is enabled.

Signed-off-by: Yanray Wang <yanray.wang@arm.com>
2023-09-01 17:06:38 +08:00
Gilles Peskine
449bd8303e Switch to the new code style
Signed-off-by: Gilles Peskine <Gilles.Peskine@arm.com>
2023-01-11 14:50:10 +01:00
Dave Rodgman
2d0f27d0fc Make use of optimised bswap from ARIA
Signed-off-by: Dave Rodgman <dave.rodgman@arm.com>
2022-11-30 12:16:21 +00:00
Dave Rodgman
5a1d00f03d Merge remote-tracking branch 'origin/development' into fast_xor
Signed-off-by: Dave Rodgman <dave.rodgman@arm.com>
2022-11-25 17:10:25 +00:00
Gilles Peskine
5a34b36bbd Remove more now-redundant definitions of inline
Signed-off-by: Gilles Peskine <Gilles.Peskine@arm.com>
2022-11-25 13:26:44 +01:00
Dave Rodgman
7bb6b84b29 Use mbedtls_xor in ARIA
Signed-off-by: Dave Rodgman <dave.rodgman@arm.com>
2022-11-22 17:32:43 +00:00
Gilles Peskine
744fd37d23 Merge pull request #6467 from davidhorstmann-arm/fix-unusual-macros-0
Fix unusual macros
2022-10-25 19:55:29 +02:00
David Horstmann
9b0eb90131 Rename ARIA_SELF_TEST_IF_FAIL
Change to ARIA_SELF_TEST_ASSERT

Signed-off-by: David Horstmann <david.horstmann@arm.com>
2022-10-25 10:23:34 +01:00
David Horstmann
0763ccf04f Refactor ARIA_SELF_TEST_IF_FAIL macro
Change the ARIA_SELF_TEST_IF_FAIL macro to be more code-style friendly.
Currently it expands to the body of an if statement, which causes
problems for automatic brace-addition for if statements.

Convert the macro to a function-like macro that takes the condition as
an argument and expands to a full if statement inside a do {} while (0)
idiom.

Signed-off-by: David Horstmann <david.horstmann@arm.com>
2022-10-06 14:32:30 +01:00
Gilles Peskine
a7aa80c058 Include platform.h unconditionally: second automatic part
Some source files included platform.h in a nested conditional. The previous
commit "Include platform.h unconditionally: automatic part" only removed
the outer conditional. This commit removes the inner conditional.

This commit once again replaces most occurrences of conditional inclusion of
platform.h, using the following code:

```
perl -i -0777 -pe 's!#if.*\n#include "mbedtls/platform.h"\n(#else.*\n(#define (mbedtls|MBEDTLS)_.*\n|#include <(stdarg|stddef|stdio|stdlib|string|time)\.h>\n)*)?#endif.*!#include "mbedtls/platform.h"!mg' $(git grep -l '#include "mbedtls/platform.h"')
```

Signed-off-by: Gilles Peskine <Gilles.Peskine@arm.com>
2022-09-15 20:34:10 +02:00
Gilles Peskine
945b23c46f Include platform.h unconditionally: automatic part
We used to include platform.h only when MBEDTLS_PLATFORM_C was enabled, and
to define ad hoc replacements for mbedtls_xxx functions on a case-by-case
basis when MBEDTLS_PLATFORM_C was disabled. The only reason for this
complication was to allow building individual source modules without copying
platform.h. This is not something we support or recommend anymore, so get
rid of the complication: include platform.h unconditionally.

There should be no change in behavior since just including the header should
not change the behavior of a program.

This commit replaces most occurrences of conditional inclusion of
platform.h, using the following code:

```
perl -i -0777 -pe 's!#if.*\n#include "mbedtls/platform.h"\n(#else.*\n(#define (mbedtls|MBEDTLS)_.*\n|#include <(stdarg|stddef|stdio|stdlib|string|time)\.h>\n)*)?#endif.*!#include "mbedtls/platform.h"!mg' $(git grep -l '#include "mbedtls/platform.h"')
```

Signed-off-by: Gilles Peskine <Gilles.Peskine@arm.com>
2022-09-15 20:33:07 +02:00
Joe Subbiani
54550f7fca Replace 3 byte shift with appropriate macro
aria.c has a shift by 3 bytes, but does not use the 0xff masking.
aparently this is not a problem and it is tidier to use the maco.

Signed-off-by: Joe Subbiani <joe.subbiani@arm.com>
2021-08-19 09:55:42 +01:00
Joe Subbiani
cd84d76e9b Add Character byte reading macros
These cast to an unsigned char rather than a uint8_t
like with MBEDTLS_BYTE_x
These save alot of space and will improve maintence by
replacing the appropriate code with MBEDTLS_CHAR_x

Signed-off-by: Joe Subbiani <joe.subbiani@arm.com>
2021-08-19 09:55:41 +01:00
Joe Subbiani
6a50631497 GET macros use a target variable
The GET macros used to write to a macro parameter, but now
they can be used to assign a value to the desired variable
rather than pass it in as an argument and have it modified
in the macro function.

Due to this MBEDTLS_BYTES_TO_U32_LE is the same as
MBEDTLS_GET_UINT32_LE and was there for replaced in the
appropriate files and removed from common.h

Signed-off-by: Joe Subbiani <joe.subbiani@arm.com>
2021-08-19 09:31:55 +01:00
Joe Subbiani
394bdd662b Document common.h and remove changelog
Added documenttion comments to common.h and removed the changelog
as it is not really necessary for refactoring.

Also modified a comment in aria.c to be clearer

Signed-off-by: Joe Subbiani <joe.subbiani@arm.com>
2021-08-19 09:31:55 +01:00
Joe Subbiani
5ecac217f0 Prefixed macros with MBEDTLS
As per tests/scripts/check-names.sh, macros in
library/ header files should be prefixed with
MBEDTLS_
The macro functions in common.h where also indented
to comply with the same test

Signed-off-by: Joe Subbiani <joe.subbiani@arm.com>
2021-08-19 09:31:54 +01:00
Joe Subbiani
54c6134ff7 Move UINT32_LE macros to common.h
32-bit integer manipulation macros (little edian):
GET_UINT32_LE and PUT_UINT32_LE appear in several
files in library/.
Removes duplicate code and save vertical
space the macro has been moved to common.h.
Improves maintainability.

Also provided brief comment in common.h for
BYTES_TO_U32_LE. comment/documentation will
probably need to be edited further for all
recent additions to library/common.h

Signed-off-by: Joe Subbiani <joe.subbiani@arm.com>
2021-08-19 09:31:53 +01:00
Gilles Peskine
be89fea1a7 ARIA: add missing context init/free
This fixes the self-test with alternative implementations.

Signed-off-by: Gilles Peskine <Gilles.Peskine@arm.com>
2021-05-25 09:23:10 +02:00
Bence Szépkúti
1e14827beb Update copyright notices to use Linux Foundation guidance
As a result, the copyright of contributors other than Arm is now
acknowledged, and the years of publishing are no longer tracked in the
source files.

Also remove the now-redundant lines declaring that the files are part of
MbedTLS.

This commit was generated using the following script:

# ========================
#!/bin/sh

# Find files
find '(' -path './.git' -o -path './3rdparty' ')' -prune -o -type f -print | xargs sed -bi '

# Replace copyright attribution line
s/Copyright.*Arm.*/Copyright The Mbed TLS Contributors/I

# Remove redundant declaration and the preceding line
$!N
/This file is part of Mbed TLS/Id
P
D
'
# ========================

Signed-off-by: Bence Szépkúti <bence.szepkuti@arm.com>
2020-08-19 10:35:41 +02:00
Gilles Peskine
db09ef6d22 Include common.h instead of config.h in library source files
In library source files, include "common.h", which takes care of
including "mbedtls/config.h" (or the alternative MBEDTLS_CONFIG_FILE)
and other things that are used throughout the library.

FROM=$'#if !defined(MBEDTLS_CONFIG_FILE)\n#include "mbedtls/config.h"\n#else\n#include MBEDTLS_CONFIG_FILE\n#endif' perl -i -0777 -pe 's~\Q$ENV{FROM}~#include "common.h"~' library/*.c 3rdparty/*/library/*.c scripts/data_files/error.fmt scripts/data_files/version_features.fmt

Signed-off-by: Gilles Peskine <Gilles.Peskine@arm.com>
2020-07-02 11:26:57 +02:00
Andrzej Kurek
c470b6b021 Merge development commit 8e76332 into development-psa
Additional changes to temporarily enable running tests:
ssl_srv.c and test_suite_ecdh use mbedtls_ecp_group_load instead of
mbedtls_ecdh_setup
test_suite_ctr_drbg uses mbedtls_ctr_drbg_update instead of 
mbedtls_ctr_drbg_update_ret
2019-01-31 08:20:20 -05:00
Ron Eldor
d1a4762adb Use mbedtls_printf instead of printf
Replace usages of `printf()` with `mbedtls_printf()` in `aria.c`
which were accidently merged. Fixes #1908
2018-08-13 13:49:52 +03:00
Manuel Pégourié-Gonnard
7124fb63be Use zeroize function from new platform_util 2018-05-22 16:05:33 +02:00
Manuel Pégourié-Gonnard
2df4bfe803 Fix typo in comments 2018-05-22 13:39:01 +02:00
Manuel Pégourié-Gonnard
565e4e0fb2 Use more appropriate type for local variable 2018-05-22 13:30:28 +02:00
Manuel Pégourié-Gonnard
08c337d058 Remove useless parameter from function 2018-05-22 13:18:01 +02:00
Manuel Pégourié-Gonnard
89924ddc7e Wipe sensitive info from the stack 2018-05-22 13:07:07 +02:00
Manuel Pégourié-Gonnard
12e2fbdf29 Style adjustments 2018-05-22 13:01:09 +02:00
Manuel Pégourié-Gonnard
d418b0dcba Fix typo in comment 2018-05-22 12:56:11 +02:00
Manuel Pégourié-Gonnard
366e1b0464 aria: fix comment on aria_a function
The new version of the comment has been generated by the following python3
script, when the first constant is copy-pasted from RFC 5794 2.4.3.

 #!/usr/bin/python3

RFC_A = """
      y0  = x3 ^ x4 ^ x6 ^ x8  ^ x9  ^ x13 ^ x14,
      y1  = x2 ^ x5 ^ x7 ^ x8  ^ x9  ^ x12 ^ x15,
      y2  = x1 ^ x4 ^ x6 ^ x10 ^ x11 ^ x12 ^ x15,
      y3  = x0 ^ x5 ^ x7 ^ x10 ^ x11 ^ x13 ^ x14,
      y4  = x0 ^ x2 ^ x5 ^ x8  ^ x11 ^ x14 ^ x15,
      y5  = x1 ^ x3 ^ x4 ^ x9  ^ x10 ^ x14 ^ x15,
      y6  = x0 ^ x2 ^ x7 ^ x9  ^ x10 ^ x12 ^ x13,
      y7  = x1 ^ x3 ^ x6 ^ x8  ^ x11 ^ x12 ^ x13,
      y8  = x0 ^ x1 ^ x4 ^ x7  ^ x10 ^ x13 ^ x15,
      y9  = x0 ^ x1 ^ x5 ^ x6  ^ x11 ^ x12 ^ x14,
      y10 = x2 ^ x3 ^ x5 ^ x6  ^ x8  ^ x13 ^ x15,
      y11 = x2 ^ x3 ^ x4 ^ x7  ^ x9  ^ x12 ^ x14,
      y12 = x1 ^ x2 ^ x6 ^ x7  ^ x9  ^ x11 ^ x12,
      y13 = x0 ^ x3 ^ x6 ^ x7  ^ x8  ^ x10 ^ x13,
      y14 = x0 ^ x3 ^ x4 ^ x5  ^ x9  ^ x11 ^ x14,
      y15 = x1 ^ x2 ^ x4 ^ x5  ^ x8  ^ x10 ^ x15.
"""

matrix = []
for l in RFC_A.split('\n')[1:-1]:
    rhs = l.split('=')[1][:-1]
    row = tuple(hex(int(t[2:]))[2:] for t in rhs.split('^'))
    matrix.append(row)

out = {}
out['a'] = tuple(''.join(w) for w in zip(*(matrix[0:4])))
out['b'] = tuple(''.join(w) for w in zip(*(matrix[4:8])))
out['c'] = tuple(''.join(w) for w in zip(*(matrix[8:12])))
out['d'] = tuple(''.join(w) for w in zip(*(matrix[12:])))

out2 = {}
for o, r in out.items():
    row = list(r)
    for i in range(len(r) - 1):
        w1 = row[i]
        if len(set(w1)) == 2:
            w2 = row[i+1]
            nw1 = nw2 = ''
            for j in range(len(w1)):
                if w1[j] in nw1:
                    nw1 += w2[j]
                    nw2 += w1[j]
                else:
                    nw1 += w1[j]
                    nw2 += w2[j]
            row[i] = nw1
            row[i+1] = nw2

    out2[o] = row

for o in 'abcd':
    print(o,   '=', ' + '.join(out[o]))
    print(' ', '=', ' + '.join(out2[o]))
2018-03-01 14:48:10 +01:00
Manuel Pégourié-Gonnard
21662148f7 aria: improve compiler compat by using __asm
gcc --std=c99 doesn't like the shorter "asm" (this broke all.sh)
2018-03-01 11:28:51 +01:00
Manuel Pégourié-Gonnard
2078725feb aria: check arm arch version for asm
rev and rev16 are only supported from v6 (all profiles) and up.

arm-none-eabi-gcc picks a lower architecture version by default, which means
before this commit it would fail to build (assembler error) unless you
manually specified -march=armv6-m -mthumb or similar, which broke all.sh.

Source for version-checking macros:
- GCC/Clang: use the -E -dM - </dev/null trick
- armcc5: http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.dui0472k/chr1359125007083.html
- armclang 6: http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.dui0774g/chr1383660321827.html

Tested with the following script:

 #!/bin/sh

set -eu

ARMCLANG="env ARM_TOOL_VARIANT=ult $ARMC6_BIN_DIR/armclang"

build() {
    echo "$@"
    "$@" -Iinclude -c library/aria.c

}

build arm-none-eabi-gcc
build arm-none-eabi-gcc -march=armv5

build clang --target=arm-none-eabi
build clang --target=arm-none-eabi -march=armv5

build armcc
build armcc --gnu
build armcc --cpu=5T
build armcc --cpu=5T --gnu

build $ARMCLANG --target=arm-arm-none-eabi

check_asm() {
    rm -f aria.o
    build "$@"
    arm-none-eabi-objdump -d aria.o | grep rev16
}

check_asm arm-none-eabi-gcc -march=armv6-m -mthumb
check_asm arm-none-eabi-gcc -march=armv7-m -mthumb
check_asm arm-none-eabi-gcc -march=armv8-m.base -mthumb

check_asm arm-none-eabi-gcc -march=armv7-a -mthumb
check_asm arm-none-eabi-gcc -march=armv8-a -mthumb
check_asm arm-none-eabi-gcc -march=armv7-a -marm
check_asm arm-none-eabi-gcc -march=armv8-a -marm

check_asm clang --target=arm-none-eabi -march=armv6-m
check_asm clang --target=arm-none-eabi -march=armv7-a
check_asm clang --target=arm-none-eabi -march=armv7-m
check_asm clang --target=arm-none-eabi -march=armv7-r
check_asm clang --target=arm-none-eabi -march=armv8-a

check_asm armcc -O0 --cpu=6-M
check_asm armcc -O0 --cpu=7-M
check_asm armcc -O0 --cpu=6
check_asm armcc -O0 --cpu=7-A

check_asm $ARMCLANG --target=arm-arm-none-eabi -march=armv6-m
check_asm $ARMCLANG --target=arm-arm-none-eabi -march=armv7-a
check_asm $ARMCLANG --target=arm-arm-none-eabi -march=armv7-m
check_asm $ARMCLANG --target=arm-arm-none-eabi -march=armv7-r
check_asm $ARMCLANG --target=arm-arm-none-eabi -march=armv8-a
check_asm $ARMCLANG --target=arm-arm-none-eabi -march=armv8-m.base
2018-03-01 11:28:51 +01:00
Manuel Pégourié-Gonnard
7fc08795c1 aria: more whitespace fixes 2018-03-01 09:33:20 +01:00
Manuel Pégourié-Gonnard
5ad88b6d0d aria: define constants for block size and max rounds 2018-03-01 09:25:31 +01:00
Manuel Pégourié-Gonnard
f3a46a9b4f aria: fix some typos in comments 2018-03-01 09:25:05 +01:00
Manuel Pégourié-Gonnard
c0bb66f47e aria: improve compiler inline compatibility 2018-03-01 09:25:05 +01:00
Manuel Pégourié-Gonnard
4231e7f46f Fix some whitespace and other style issues
In addition to whitespace:
- wrapped a few long lines
- added parenthesis to return statements
2018-02-28 11:34:01 +01:00
Manuel Pégourié-Gonnard
377b2b624d aria: optimize byte perms on Arm
Use specific instructions for moving bytes around in a word. This speeds
things up, and as a side-effect, slightly lowers code size.

ARIA_P3 and ARIA_P1 are now 1 single-cycle instruction each (those
instructions are available in all architecture versions starting from v6-M).
Note: ARIA_P3 was already translated to a single instruction by Clang 3.8 and
armclang 6.5, but not arm-gcc 5.4 nor armcc 5.06.

ARIA_P2 is already efficiently translated to the minimal number of
instruction (1 in ARM mode, 2 in thumb mode) by all tested compilers

Manually compiled and inspected generated code with the following compilers:
arm-gcc 5.4, clang 3.8, armcc 5.06 (with and without --gnu), armclang 6.5.

Size reduction (arm-none-eabi-gcc -march=armv6-m -mthumb -Os): 5288 -> 5044 B

Effect on executing time of self-tests on a few boards:
FRDM-K64F   (Cortex-M4):    444 ->  385 us (-13%)
LPC1768     (Cortex-M3):    488 ->  432 us (-11%)
FRDM-KL64Z  (Cortex-M0):   1429 -> 1134 us (-20%)

Measured using a config.h with no cipher mode and the following program with
aria.c and aria.h copy-pasted to the online compiler:

 #include "mbed.h"
 #include "aria.h"

int main() {
    Timer t;
    t.start();
    int ret = mbedtls_aria_self_test(0);
    t.stop();
    printf("ret = %d; time = %d us\n", ret, t.read_us());
}
2018-02-27 12:39:12 +01:00
Manuel Pégourié-Gonnard
fb0e4f0d1a aria: optimise byte perms on Intel
(A similar commit for Arm follows.)

Use specific instructions for moving bytes around in a word. This speeds
things up, and as a side-effect, slightly lowers code size.

ARIA_P3 (aka reverse byte order) is now 1 instruction on x86, which speeds up
key schedule. (Clang 3.8 finds this but GCC 5.4 doesn't.)

I couldn't find an Intel equivalent of ARM's ret16 (aka ARIA_P1), so I made it
two instructions, which is still much better than the code generated with
the previous mask-shift-or definition, and speeds up en/decryption. (Neither
Clang 3.8 nor GCC 5.4 find this.)

Before:
O	aria.o	ins
s	7976	43,865
2	10520	37,631
3	13040	28,146

After:
O	aria.o	ins
s	7768	33,497
2	9816	28,268
3	11432	20,829

For measurement method, see previous commit:
"aria: turn macro into static inline function"
2018-02-27 12:39:12 +01:00
Manuel Pégourié-Gonnard
cac5008b17 aria: define P3 macro
This will allow to replace it with an optimised implementation later
2018-02-27 12:39:12 +01:00
Manuel Pégourié-Gonnard
f205a012b8 aria: comment implementation of A transform
The line-by-line comments were generated using the following Python 3 script:

 #!/usr/bin/python3

class Atom:
    def __init__(self, val):
        self.v = val

    def __str__(self):
        return self.v

    def p1(self):
        v = self.v
        return Atom(v[1] + v[0] + v[3] + v[2])

    def p2(self):
        v = self.v
        return Atom(v[2] + v[3] + v[0] + v[1])

    def __xor__(self, other):
        return Sum(self.tuple() + other.tuple())

    def tuple(self):
        return (self,)

class Sum:
    def __init__(self, terms):
        self.t = terms
        assert(type(terms) == tuple)
        for t in terms:
            assert(type(t) == Atom)

    def __str__(self):
        return '+'.join(sorted((str(t) for t in self.t),
                        key=lambda v: int(v, 16)))

    def p1(self):
        return Sum(tuple(t.p1() for t in self.t))

    def p2(self):
        return Sum(tuple(t.p2() for t in self.t))

    def tuple(self):
        return self.t

    def __xor__(self, other):
        return Sum(self.t + other.tuple())

class LoggingDict(dict):
    def __setitem__(self, key, val):
        print(key, '=', val)
        dict.__setitem__(self, key, val)

    def set(self, key, val):
        dict.__setitem__(self, key, val)

env = LoggingDict()

env.set('ra', Atom('0123'))
env.set('rb', Atom('4567'))
env.set('rc', Atom('89ab'))
env.set('rd', Atom('cdef'))
env.set('ARIA_P1', lambda x: x.p1())
env.set('ARIA_P2', lambda x: x.p2())

code = """
ta  =   rb;
rb  =   ra;
ra  =   ARIA_P2( ta );
tb  =   ARIA_P2( rd );
rd  =   ARIA_P1( rc );
rc  =   ARIA_P1( tb );
ta  ^=  rd;
tc  =   ARIA_P2( rb );
ta  =   ARIA_P1( ta ) ^ tc ^ rc;
tb  ^=  ARIA_P2( rd );
tc  ^=  ARIA_P1( ra );
rb  ^=  ta ^ tb;
tb  =   ARIA_P2( tb ) ^ ta;
ra  ^=  ARIA_P1( tb );
ta  =   ARIA_P2( ta );
rd  ^=  ARIA_P1( ta ) ^ tc;
tc  =   ARIA_P2( tc );
rc  ^=  ARIA_P1( tc ) ^ ta;
"""

exec(code, env)
2018-02-27 12:39:12 +01:00
Manuel Pégourié-Gonnard
35ad891aee aria: internal names closer to standard document 2018-02-27 12:39:12 +01:00
Manuel Pégourié-Gonnard
64744f88b6 aria: define SLA() as sl(a())
This decreases the size with -Os by nearly 1k while
not hurting performance too much with -O2 and -O3

Before:
O	aria.o	ins
s	8784	41,408
2	11112	37,001
3	13096	27,438

After:
O	aria.o	ins
s	7976	43,865
2	10520	37,631
3	13040	28,146

(See previous commit for measurement details.)
2018-02-27 12:39:12 +01:00
Manuel Pégourié-Gonnard
8c76a9489e aria: turn macro into static inline function
Besides documenting types better and so on, this give the compiler more room
to optimise either for size or performance.

Here are some before/after measurements of:
- size of aria.o in bytes (less is better)
- instruction count for the selftest function (less is better)
with various -O flags.

Before:
O	aria.o	ins
s	10896	37,256
2	11176	37,199
3	12248	27,752

After:
O	aria.o	ins
s	8784	41,408
2	11112	37,001
3	13096	27,438

The new version allows the compiler to reach smaller size with -Os while
maintaining (actually slightly improving) performance with -O2 and -O3.

Measurements were done on x86_64 (but since this is mainly about inlining
code, this should transpose well to other platforms) using the following
helper program and script, after disabling CBC, CFB and CTR in config.h, in
order to focus on the core functions.

==> st.c <==
 #include "mbedtls/aria.h"

int main( void ) {
    return mbedtls_aria_self_test( 0 );
}

==> p.sh <==
 #!/bin/sh

set -eu

ccount () {
    (
    valgrind --tool=callgrind --dump-line=no --callgrind-out-file=/dev/null --collect-atstart=no --toggle-collect=main $1
    ) 2>&1 | sed -n -e 's/.*refs: *\([0-9,]*\)/\1/p'
}

printf "O\taria.o\tins\n"
for O in s 2 3; do
    GCC="gcc -Wall -Wextra -Werror -Iinclude"

    $GCC -O$O -c library/aria.c
    $GCC -O1 st.c aria.o -o st
   ./st

    SIZE=$( du -b aria.o | cut -f1 )
    INS=$( ccount ./st )

    printf "$O\t$SIZE\t$INS\n"
done
2018-02-27 12:39:12 +01:00