TY - CHAP
T1 - Compact Ring-LWE Cryptoprocessor
AU - Roy, Sujoy Sinha
AU - Vercauteren, Frederik
AU - Mentens, Nele
AU - Chen, Donald Donglong
AU - Verbauwhede, Ingrid
PY - 2014/9/23
Y1 - 2014/9/23
N2 - In this paper we propose an efficient and compact processor for a ring-LWE-based encryption scheme. We present three optimizations for the Number Theoretic Transform (NTT) used for polynomial multiplication: we avoid pre-processing in the negative wrapped convolution by merging it with the main algorithm, we reduce the fixed computation cost of the twiddle factors, and we propose an advanced memory access scheme. These optimization techniques reduce both the cycle and memory requirements. Finally, we also propose an optimization of the ring-LWE encryption system that reduces the number of NTT operations from five to four, resulting in a 20% speed-up. We use these computational optimizations along with several architectural optimizations to design an instruction-set ring-LWE cryptoprocessor. For dimension 256, our processor performs encryption/decryption operations in 20/9 μs on a Virtex 6 FPGA and requires only 1349 LUTs, 860 FFs, 1 DSP-MULT and 2 BRAMs. Similarly, for dimension 512, the processor takes 48/21 μs to perform encryption/decryption operations and requires only 1536 LUTs, 953 FFs, 1 DSP-MULT and 3 BRAMs. Our processors are therefore more than three times smaller than the current state-of-the-art hardware implementations, whilst running somewhat faster.
AB - In this paper we propose an efficient and compact processor for a ring-LWE-based encryption scheme. We present three optimizations for the Number Theoretic Transform (NTT) used for polynomial multiplication: we avoid pre-processing in the negative wrapped convolution by merging it with the main algorithm, we reduce the fixed computation cost of the twiddle factors, and we propose an advanced memory access scheme. These optimization techniques reduce both the cycle and memory requirements. Finally, we also propose an optimization of the ring-LWE encryption system that reduces the number of NTT operations from five to four, resulting in a 20% speed-up. We use these computational optimizations along with several architectural optimizations to design an instruction-set ring-LWE cryptoprocessor. For dimension 256, our processor performs encryption/decryption operations in 20/9 μs on a Virtex 6 FPGA and requires only 1349 LUTs, 860 FFs, 1 DSP-MULT and 2 BRAMs. Similarly, for dimension 512, the processor takes 48/21 μs to perform encryption/decryption operations and requires only 1536 LUTs, 953 FFs, 1 DSP-MULT and 3 BRAMs. Our processors are therefore more than three times smaller than the current state-of-the-art hardware implementations, whilst running somewhat faster.
KW - Lattice-based cryptography
KW - ring-LWE
KW - Polynomial multiplication
KW - Number Theoretic Transform
KW - Hardware Implementation
U2 - 10.1007/978-3-662-44709-3_21
DO - 10.1007/978-3-662-44709-3_21
M3 - Chapter
SN - 9783662447086
T3 - Lecture Notes in Computer Science
SP - 371
EP - 391
BT - Cryptographic Hardware and Embedded Systems - CHES 2014
A2 - Batina, Lejla
A2 - Robshaw, Matthew
PB - Springer
T2 - 16th International Workshop on Cryptographic Hardware and Embedded Systems (CHES 2014)
Y2 - 23 September 2014 through 26 September 2014
ER -