    crypto: x86/aes-ni-xts - use direct calls to and 4-way stride · 8e970771
    Ard Biesheuvel authored
    commit 86ad60a6 upstream.
    
    The XTS asm helper arrangement is a bit odd: the 8-way stride helper
    consists of back-to-back calls to the 4-way core transforms, which
    are called indirectly, based on a boolean that indicates whether we
    are performing encryption or decryption.
    
    Given how costly indirect calls are on x86, let's switch to direct
    calls, and given how the 8-way stride doesn't really add anything
    substantial, use a 4-way stride instead, and make the asm core
    routine deal with any multiple of 4 blocks. Since 512 byte sectors
    or 4 KB blocks are the typical quantities XTS operates on, increase
    the stride exported to the glue helper to 512 bytes as well.
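
    The arrangement described above can be sketched in C as follows. This is
    only a toy model, not the kernel code: the names toy_enc4, toy_dec4,
    toy_xts_encrypt, toy_xts_decrypt and toy_glue are hypothetical, and the
    byte-wise add/subtract merely stands in for the real 4-way AES core
    transforms. The sketch assumes the glue layer still selects the routine
    through a function pointer, once per 512-byte chunk.

```c
#include <stddef.h>
#include <stdint.h>

#define BLOCK_SIZE    16                        /* AES block size in bytes */
#define STRIDE_BLOCKS 4                         /* 4-way stride of the core routine */
#define STRIDE_BYTES  (STRIDE_BLOCKS * BLOCK_SIZE)
#define GLUE_BLOCKS   (512 / BLOCK_SIZE)        /* 512-byte stride seen by the glue layer */

/* Hypothetical stand-ins for the 4-way core transforms (not real AES). */
static void toy_enc4(uint8_t *dst, const uint8_t *src)
{
	for (size_t i = 0; i < STRIDE_BYTES; i++)
		dst[i] = (uint8_t)(src[i] + 1);
}

static void toy_dec4(uint8_t *dst, const uint8_t *src)
{
	for (size_t i = 0; i < STRIDE_BYTES; i++)
		dst[i] = (uint8_t)(src[i] - 1);
}

/*
 * One routine per direction. Each handles any multiple of 4 blocks and
 * reaches its core transform through a direct call inside the loop.
 * Returns the number of blocks processed.
 */
static size_t toy_xts_encrypt(uint8_t *dst, const uint8_t *src, size_t nblocks)
{
	size_t done = 0;

	while (nblocks - done >= STRIDE_BLOCKS) {
		toy_enc4(dst + done * BLOCK_SIZE, src + done * BLOCK_SIZE);
		done += STRIDE_BLOCKS;
	}
	return done;
}

static size_t toy_xts_decrypt(uint8_t *dst, const uint8_t *src, size_t nblocks)
{
	size_t done = 0;

	while (nblocks - done >= STRIDE_BLOCKS) {
		toy_dec4(dst + done * BLOCK_SIZE, src + done * BLOCK_SIZE);
		done += STRIDE_BLOCKS;
	}
	return done;
}

typedef size_t (*toy_xts_fn)(uint8_t *dst, const uint8_t *src, size_t nblocks);

/* Glue layer: a single indirect call covers each 512-byte chunk. */
static size_t toy_glue(toy_xts_fn fn, uint8_t *dst, const uint8_t *src,
		       size_t nblocks)
{
	size_t done = 0;

	while (nblocks - done >= GLUE_BLOCKS)
		done += fn(dst + done * BLOCK_SIZE, src + done * BLOCK_SIZE,
			   GLUE_BLOCKS);
	return done;
}
```

    In this model, passing toy_xts_encrypt and then toy_xts_decrypt to
    toy_glue over the same 1 KB buffer round-trips the data, with the
    direction chosen by the routine that is passed in rather than by a
    boolean checked inside the loop.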
    
    As a result, the number of indirect calls is reduced from 3 per 64 bytes
    of in/output to 1 per 512 bytes of in/output, which produces a 65% speedup
    when operating on 1 KB blocks (measured on an Intel(R) Core(TM) i7-8650U CPU).
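
    The call counts quoted above can be checked with a little arithmetic.
    The helper below is a hypothetical illustration (indirect_calls is not a
    kernel function); it only multiplies the number of chunks by the number
    of indirect calls per chunk, using the figures from this message.

```c
/*
 * Number of indirect calls needed to process `bytes` of data when every
 * chunk of `chunk_bytes` bytes costs `calls_per_chunk` indirect calls.
 * Partial chunks are rounded up to a full chunk.
 */
static unsigned int indirect_calls(unsigned int bytes,
				   unsigned int calls_per_chunk,
				   unsigned int chunk_bytes)
{
	unsigned int chunks = (bytes + chunk_bytes - 1) / chunk_bytes;

	return chunks * calls_per_chunk;
}
```

    For a 1 KB block, indirect_calls(1024, 3, 64) gives 48 for the old
    arrangement and indirect_calls(1024, 1, 512) gives 2 for the new one.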
    
    Fixes: 9697fa39 ("x86/retpoline/crypto: Convert crypto assembler indirect jumps")
    Tested-by: Eric Biggers <ebiggers@google.com> # x86_64
    Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
    Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
aesni-intel_glue.c 32.5 KB