Implementation of a Digital TRNG Using Jitter Based Multiple Entropy Source on FPGA

Abstract: In this study, the hardware implementation and evaluation of a true random number generator (TRNG) are presented. The implementation uses Field Programmable Gate Array (FPGA) hardware, on which algorithmic digital processing is carried out. In the system, ring oscillators (ROs) with similar structures were used as a noise source, and true randomness was obtained by sampling the jitter signals originating from the oscillators. However, the most critical cryptographic disadvantage of jitter-based TRNGs is the statistical inadequacy of the system. To address this, and in contrast to existing designs, entropy sources derived from subsets of the ROs were used in the sampling and post-processing stages. The statistical quality of the system was improved by using the true random numbers/inputs obtained from these entropy sources in these stages. With these sampling and post-processing inputs, complex post-processing techniques that limit the output bit rate of the generator were not required. Thus, a high-performance, adaptable TRNG model with reduced hardware resource consumption is obtained. The TRNG was tested in 6 different scenarios, covering two separate ring oscillator (RO) architectures and three different operating frequencies, and its statistical validation was performed with the NIST 800-22 and AIS31 test packages.


Introduction
In computer science, random numbers are used in many different fields such as programming, simulation, statistical sampling, chance games, and cryptography. While simple statistical features are often sufficient, random numbers must meet strict requirements when it comes to cryptography because the understanding of security in cryptographic systems is based on the confidentiality of randomly generated numbers used for performing critical functions in the system. In addition to their excellent statistical properties, random numbers, which are the complementary element of cryptographic systems, should not contain hidden or explicit patterns between their elements and should be unpredictable. Random numbers, which do not meet these characteristic requirements, jeopardize the reliability of cryptographic systems in which they are used.
Generating random numbers that meet the requirements of cryptography is an essential and challenging problem area [1]. Obtaining these numbers requires dedicated components known as Random Number Generators (RNGs). Random numbers are obtained from two separate design classes: PRNG (pseudo RNG) and TRNG (true RNG). A TRNG is a mechanism that uses physical events/situations as an entropy source to generate true random numbers, which are difficult to predict and impossible to reproduce. The TRNG design architecture presented in Figure 1 consists of three hierarchical components: the noise source, the sampler, and post-processing. The uncontrollability of the physical processes used as the noise source makes the outputs of the generator unpredictable and unreproducible. The raw random numbers obtained by digitizing the noise sources have poor statistical quality, so they are post-processed before being passed to the output. The overall design architecture and characteristic behavior of any RNG should coincide with the ideal definition of cryptographic random numbers. In contrast to a TRNG, a PRNG is, by its standard definition, a deterministic function. Although the statistical quality of PRNG outputs is good, the outputs only appear to be random and are in fact predictable, which limits the use of PRNGs in sensitive cryptographic applications. Although TRNGs are usually slow, costly, and hardware-dependent, they are frequently preferred since they can meet cryptographic requirements.
Random numbers should not be generated on uncontrolled hardware and should not be taken out of the system. If possible, cryptographic systems should be implemented as a whole on embedded systems, in a secure computing zone where direct access and programming are not possible [2,3]. Therefore, the implementation of random number generators is critical. On these devices, in which algorithmic processes are performed at the hardware level, non-deterministic events/situations adversely affect the behavior of the system. Although device manufacturers minimize non-deterministic processes, they cannot eliminate them completely. This shows that randomness/noise sources, which are the most critical design components of a TRNG, can be obtained from these devices. Jitter [4][5] and metastability [6][7] are frequently used randomness sources for FPGA-based TRNG designs that rely on digital design techniques. ROs in particular are used as a source of jitter signals. The edges of the clock signals obtained from ROs deviate from their ideal positions because of the unstable propagation delay in the delay chain. The fact that jitter signals are truly random and easily obtainable has made ROs an important component of TRNGs [8].
In this study, the hardware implementation and evaluation of a TRNG on an FPGA in which ROs were used as the source of noise/randomness are presented. In contrast to the studies in the literature, additional sources of true randomness are used for triggering signal sampling and as an additional input for the post-processing components of the system. In addition to improving the statistical quality of the system, the way in which the entropy sources for additional inputs are implemented further simplifies the system in terms of hardware. Besides simplifying the system and improving its statistical quality, the additional inputs have also eliminated the need for complicated post-processing techniques that limit the output bit rate. Therefore, the output bit rate of the TRNG is high, although the sampling input is non-periodic. Furthermore, the fact that the additional inputs used in the hierarchical components of the system are truly random made the system cryptographically secure.
The remaining part of the study is organized as follows: Developments in the literature are presented in Section II. The architectures of the ROs used and the conceptual infrastructure of the proposed system are described in Section III. Detailed information about the hardware implementation of the TRNG and the results of the experimental analysis are presented in Sections IV and V, respectively. Finally, the study is concluded with the conclusions and recommendations in Section VI.

Related works
In cryptography, implementation of an RNG, which is the most important component of the system, in a computation zone where access and manipulation are not possible, is essential regarding the security of the system. Again, the possible attacks on the principles representing the generator and the sub-components of the generator can change the constant theoretical safety limit of TRNGs over time. This causes analytic attacks to occur on the cryptographic system in which the generator is used in a shorter time than expected. By keeping the theoretical safety limit constant, the ability to re-configure implementation platforms to minimize the impact of possible attacks is another important step in system security. For providing these basic requirements, FPGA hardware, in which cryptosystems can be applied as a whole, is a popular platform. Therefore, obtaining the noise source, which is the most critical design component of a TRNG, on these devices is a desired feature [9,10].
Noise sources such as jitter, metastability, and clock jitter, which mostly emerge as a result of the use of digital devices such as FPGA, are inefficient in terms of cryptographic competencies [11]. In the literature, there are many TRNG designs in which these randomness sources are used. The focus of the designs is to improve the basic design parameters and cryptographic competencies of the generator, as in this study. Some of these studies are as follows: Phase-Locked Loop (PLL) [4], [12] and multi-ROs [13][14] are used to obtain jitter on digital devices. In [13], 114 free-running ROs, each consisting of 13 inverters, were used in the system. The sampling frequency and output bit rate of the system, in which resilient functions are used as the post-processing technique, are 40 MHz and 2 Mbps, respectively. Due to the complexity of the system, the energy consumption is high, and the output bit rate is low. An improved version of the model proposed by Sunar was proposed by Wold in [14]. Post-processing was not required in the system, in which the number of inverters and free-running ROs was reduced. The most significant deficiency of the system proposed by Wold in [11] was that the generator became insecure due to the entropy loss caused by the reduction of the number of oscillators in the system and the occurrence of deterministic randomness. In another study [15], in which the model of Sunar was referenced, a total of 110 free-running oscillators, each comprising 3 inverters, were used in the system. The output bit rate of the system implemented on the Xilinx Virtex II Pro FPGA was measured to be 2.5 Mbps. Other TRNG designs using ROs were also proposed by Kohlbrenner and Gaj [16], Golic [17], Dichtl and Golic [18], Tuncer [19], and Avaroglu [8].
In the literature, ROs, in which jitter represents the source of randomness, are also used in Physically Unclonable Function (PUF) based TRNG designs [20][21]. In [20], for two separate PUF circuits in the system, 64 ROs, each consisting of 13 inverters, were used. Random numbers obtained from the RO-PUF (Ring Oscillator-Physically Unclonable Function) implemented on two different FPGA cores against the same query were passed through the post-processing technique, and successful results were obtained. In [21], the query input of the PUF circuit with 128 ROs, each consisting of 3 inverters, was obtained from the logistic map with chaotic behavior. In [4] and [12], instead of ROs, jitter signals obtained by the numerical implementation of analog PLL components were used. The most significant disadvantage of both systems is that a limited number of PLL components can be used on digital devices and a limited number of outputs can be obtained from these components.
A hard-core TRNG design in which the R-S flip-flop is used as a metastability-based source of randomness was presented in [22]. In the system in which the NAND gates of the R-S latch are applied as LUT (look-up table), the outputs of 64, 128, 256 parallelly connected R-S latches were sampled by combining them with the XOR (exclusive OR) operation. The TRNG which did not need post-processing passed NIST tests successfully. Another metastability-based TRNG design was presented in [23]. The system, in which cross-linked NAND gates were used as a source of metastability, was implemented on the Xilinx Virtex XC5VLX50T FPGA chip. The system that reached an output bit rate of 30 Mbps by using the Von Neumann and XOR post-processing techniques passed the NIST tests successfully.

Generic architecture of the proposed TRNG
Within the FPGA, each of the digital circuit elements that make up the integrated structure has its own specific unstable time delay. This instability can also be observed on closed or open loop combinational structures, such as ROs, which are used for generating clock signals and which consist of a certain number of delay elements. The temporal deviations caused by the propagation delay occurring on the inverters of the RO cause a period irregularity (jitter) in clock signals [2]. The amount of this irregularity is one of the most important performance metrics of TRNGs and directly affects the quality of the generated random numbers. The jitter occurs in two different ways as deterministic and random. The random jitter which is unpredictable for any sampling time is usually expressed with the Gaussian distribution presented in Equation 1. The jitter occurs as a natural result of flicker, shot and thermal noise which depend on the generation and operating conditions of the logic circuit elements. Periodic irregularity and uncertainty occurring in clock signals due to jitter increases over time as depicted in Figure 2 (a) [2,9,19].
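The growth of timing uncertainty over time can be illustrated with a small simulation. The following is only a sketch under simplifying assumptions that are not taken from the study: each period receives an independent Gaussian jitter term (thermal-noise-like; correlated flicker noise and deterministic jitter are ignored), and the function name, nominal period, and sigma value are hypothetical.

```python
import random
import statistics

def edge_time_spread(n_periods, t0=1.0, sigma=0.01, trials=2000, seed=1):
    """Standard deviation of the arrival time of the n-th edge over many runs."""
    rng = random.Random(seed)
    arrival_times = []
    for _ in range(trials):
        t = 0.0
        for _ in range(n_periods):
            # each period = nominal period + independent Gaussian jitter
            t += t0 + rng.gauss(0.0, sigma)
        arrival_times.append(t)
    return statistics.pstdev(arrival_times)

# Uncertainty of the edge position after 10 and after 100 periods.
spread_10 = edge_time_spread(10)
spread_100 = edge_time_spread(100)
print(spread_10, spread_100)
```

Under this independence assumption the spread grows roughly with the square root of the number of elapsed periods, which is why a later sampling instant sees a less predictable edge position.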
The RO (Figure 2 (b)) is a combinational structure in which an odd number of inverters is sequentially connected forming a delay chain. In addition to their simple combinational definitions, free-running ROs have a structure that is easy to implement on digital devices. Therefore, they are frequently used in TRNG designs for obtaining jitter signals. The fundamental characteristic of the jitter occurring on ROs is as follows [3,13]: The RO-based architecture used as the true randomness source of the TRNG is depicted in Figure 3. In the system, N denotes the number of ROs, and f_i is the ideal square wave signal obtained at each oscillator output. The average period T_0 of the f_i signal at any oscillator output is given by Equation 2, where n is the number of inverters and τ is the delay of a single inverter. The periodic nature of f_i is given by Equation 3.
However, the f_i signals at the oscillator outputs are not in an ideal form due to the instability of the delay occurring on the inverters. In the system in which this unstable delay is represented by T_GAUSS, the actual period of the oscillator output signal f_i is given by Equation 4. T_GAUSS, which represents the jitter, is the random variable of the system and can take values from (-T_0/2, T_0/2) for any time t. The true randomness of TRNGs required in terms of cryptography is based on the jitter's Gaussian distribution in Equation 1 [3].
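Equations 1 to 4 are not reproduced in this text. A plausible reconstruction from the surrounding definitions is sketched below; the mean μ and standard deviation σ in Equation 1 and the symbol T for the actual period in Equation 4 are assumptions introduced here, and the exact forms used in the study may differ.

```latex
% Eq. 1 (assumed form): Gaussian density of the random jitter
f(x) = \frac{1}{\sigma\sqrt{2\pi}} \exp\!\left(-\frac{(x-\mu)^{2}}{2\sigma^{2}}\right)

% Eq. 2: average period of an RO with n inverters of delay \tau each
T_0 = 2 n \tau

% Eq. 3: periodicity of the ideal output signal
f_i(t) = f_i(t + T_0)

% Eq. 4: actual period including the jitter term
T = T_0 + T_{GAUSS}, \qquad T_{GAUSS} \in \left(-\tfrac{T_0}{2},\, \tfrac{T_0}{2}\right)
```

The factor 2 in Equation 2 reflects that a transition must travel around the ring twice (once for each logic level) to complete one output period.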
The generic design architecture of the system proposed within the scope of the study is depicted in Figure 3. The system consists of three separate oscillator-based hierarchical true randomness sources, which are used for the sampling and post-processing inputs together with the noise source. In the system, two separate RO scenarios, taken from [8,13] and consisting of 3 and 13 inverters, respectively, were tested as the noise/entropy source. These oscillator scenarios are represented by the block structure in Figure 3 (A). The block structures in Figure 3 (B) and 3 (C) are other entropy sources obtained with a minimum design cost from the noise source of the generator. True random signals obtained from these entropy sources are used as the sampling and post-processing inputs of the TRNG in the system.
The operation logic of the system can be briefly described as follows: The RO outputs of each entropy source were combined with the XOR operation and sampled through D-type flip-flops to obtain the f_s, f_p, and f_k true random outputs. In the system, the non-periodic f_s signal is obtained by sampling from the entropy source in Figure 3 (B). The RO clusters are formed as follows: The RO cluster selected for the sampling frequency (f_s) in Figure 3 (B) is B_R = {R_1, R_3, R_5, …, R_{2k-1}}, and the RO cluster selected for the post-processing input (f_p) in Figure 3 (C) is C_R = {R_2, R_4, R_6, …, R_{2k}}. The use of free-running ROs in the TRNG design was proposed by Sunar in [13]. In that design, it was assumed that the ROs are independent. References [2] and [11] indicate that dependency (also described as phase locking) occurs in 25% of ROs that are supposed to be independent, which causes a loss of entropy in the system. In order to minimize the possible loss of entropy due to this dependence, the B_R and C_R oscillator clusters from which the additional inputs were obtained were uniformly spread across the whole set of A_R oscillators.
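The interleaved cluster selection can be sketched as follows. This is an illustrative model only: the variable names are hypothetical, oscillators are represented by their indices, and `xor_combine` operates on instantaneous logic levels rather than on real waveforms.

```python
from functools import reduce
from operator import xor

N = 114  # number of ROs in the noise source A_R (value used in the study)

a_r = list(range(1, N + 1))            # indices of R_1 ... R_N
b_r = [i for i in a_r if i % 2 == 1]   # odd indices  -> sampling input f_s
c_r = [i for i in a_r if i % 2 == 0]   # even indices -> post-processing input f_p

def xor_combine(levels):
    """XOR-reduce the instantaneous logic levels (0/1) of a set of ROs."""
    return reduce(xor, levels, 0)

print(len(b_r), len(c_r))        # size of each derived cluster
print(xor_combine([1, 0, 1]))    # combined level of three example outputs
```

Taking alternating indices spreads both clusters evenly over the whole oscillator set, so that neighbouring oscillators, which are the most likely to lock to each other, end up in different clusters.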
The oscillators in Figure 3  open cyclical structures will not change the oscillator outputs at the logic level. However, they will randomly modify the time-dependent periodic irregularity of the output signals. At this point, the basic idea is to increase true randomness time-dependently by making the random behavior of ROs as different as possible from each other and the noise source.
The relationship between the prime number of inverters in an RO and randomness is explained in [13]. Attention was paid to this relationship when increasing the number of inverters. Let τ_a, τ_b, and τ_c denote the number of inverters (which must be prime). Then τ_c = τ_a + p, τ_b = τ_c + k, and k ≥ p (see Figure 2 (a)). In order to minimize the power consumption of the ROs in the system, the values of k and p were kept at a minimum.
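One way to read "kept at a minimum" is to take the next two primes above τ_a as the inverter counts of the derived clusters. The sketch below follows that reading, which is an assumption rather than the documented selection procedure; the function names are hypothetical, and the sketch only keeps p and k small without checking any further constraint on them.

```python
def is_prime(n):
    """Trial-division primality check, sufficient for small inverter counts."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True

def next_prime(n):
    """Smallest prime strictly greater than n."""
    candidate = n + 1
    while not is_prime(candidate):
        candidate += 1
    return candidate

def derived_lengths(tau_a):
    """Return (tau_c, tau_b): the next two primes above the noise-source length."""
    tau_c = next_prime(tau_a)
    tau_b = next_prime(tau_c)
    return tau_c, tau_b

print(derived_lengths(3))
print(derived_lengths(13))
```

For τ_a = 3 this reading gives 5 and 7 inverters, which agrees with the (57,5) and (57,7) clusters described in the hardware implementation; every prime above 2 is odd, so the resulting chains remain valid ring oscillators.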

Hardware implementation
TRNG scenarios were created with dataflow and schematic design techniques on the Quartus II development platform. In order to measure the real-time performance of the system, the Altera EP-C4GX150 FPGA development board was used during the implementation stage. The overall structure of the proposed system is presented in Figure 3. In the system, two separate RO architectures, (114,3) and (114,13), were used as noise sources. The TRNG was tested in six different scenarios for two different noise source architectures and three different operating frequencies. Entropy sources were obtained by modifying the design parameters of these architectures.
Figure 4: Hardware modeling of the TRNG for a (114,13) scenario
The hardware implementation of a TRNG for the (114,3) oscillator architecture is depicted in Figure 4. Figure 4 (A) is the core oscillator structure, which is used as the noise source of the TRNG and consists of three inverters. A 2x1 mux was used as the control element of the oscillator (Figure 2 (a)). The oscillator output is obtained by applying the output of the mux to the input of the block structure containing the inverters that form the delay chain of the oscillator. The data0 and data1 pins of the mux are the enable and feedback inputs, respectively. The selection pin (sel) of the mux is an excitation signal obtained from the physical environment. When the selection pin is at logic '0', the oscillator outputs are constant. When it is at logic '1', the RO is in the feedback position, and the outputs oscillate. The noise source is formed by connecting 114 core structures in parallel. For synchronization of the system, the data0 and sel pins of the muxes at the inputs of the ring oscillators are common, and the data0 pin is connected to +VCC. In the system, the R_1, R_2, and R_3 outputs were obtained by combining, with the XOR operation (Figure 4 (B)), the high-frequency RO outputs used for the noise/randomness source, the post-processing input, and the sampling operation. Unlike Reference [13], in which a fixed sampling frequency was used, the R_1 and R_2 outputs were sampled with truly random signals (f_s) obtained from R_3.
In Figure 4, R_1, R_2, and R_3 are the combined RO outputs. The oscillator architecture was used for obtaining the combined oscillator outputs R_2 and R_3. Even-numbered outputs of the (114,3) ROs, the noise source, were used for the output R_3, while odd-numbered outputs were used for the output R_2. The high-frequency oscillator outputs feeding R_2 and R_3 were first passed through the block structures depicted in Figure 4 (C) and (D) and then combined with the XOR operation. In Figure 4 (C), passing each single oscillator output through two extra inverters yields the (57,5) RO cluster that provides the R_2 post-processing input. In Figure 4 (D), four extra inverters are used to obtain the (57,7) oscillator cluster that provides the R_3 non-periodic sampling signals. The true random numbers/signals (f_s) in Figure 4 (E), which are used for sampling the outputs R_1 and R_2 in the system, were obtained from the output R_3. For this purpose, the output R_3 was sampled with a D-type flip-flop at three different frequencies of 50, 100, and 200 MHz obtained from the PLL for each scenario. The obtained true random numbers and the true random outputs obtained by a synchronized sampling of the R_1 and R_2 outputs were combined with the XOR operation in Figure 4 (E), and the outputs of the TRNG were obtained. The simulation results of the random numbers obtained from the system for the 50 MHz sampling frequency are shown in Figure 5.
In order to measure the statistical adequacy of the TRNG, two separate test techniques with different application methods were used. Therefore, two different embedded memory components, shown in Figure 4 (F) and (G), were used to obtain random numbers suitable for each test technique from the TRNG. Raw true random numbers (digital noise) sampled from the noise source and post-processed (internal) random numbers were recorded to the memory components, respectively.
The width of the memory components using the 16-bit up counter for addressing is 1 bit, and depth is 65536 bits.
Therefore, the sample length of each random number sequence obtained for statistical verification from the TRNG is 65536 bits.
In Figure 4, the non-periodic sampling input (f_s) was also used as the clock signal of the counter circuit and memory component. Thus, each truly random bit sampled in the system is simultaneously recorded to the memory cell indicated by the counter. The other embedded memory architecture given in Figure 4 (H) is used to measure the output bit rate of the TRNG. Unlike the other memory architectures, it uses the periodic sampling input (f_clk) as the clock signal of the counter and memory component. Thus, the non-periodic f_s signal is also recorded to the memory component in Figure 4 for the output bit rate analysis of the TRNG.

Experimental results
Cryptographically, one of the most important design evaluation criteria of any TRNG is the statistical verification of randomness. This process is essential for the security of the system.
The NIST 800-22 test suite consists of 15 different sub-test criteria. For each sub-test criterion, a p-value (probability value) is computed, which indicates how consistent the tested number sequence is with the hypothesis of randomness. For each test criterion, this value is expected to be greater than the significance level α, which is typically chosen in the range [0.001, 0.01] for cryptographic applications [1,8]. If this condition is not met for any test criterion, the statistical validation of the random number sequence is considered unsuccessful for that criterion.
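As an illustration of how a p-value is compared with the significance level, the simplest sub-test of NIST 800-22, the frequency (monobit) test, can be sketched as follows. This is a minimal stand-alone version for explanation, not the official test package used in the study; the example sequences and the chosen α = 0.01 are assumptions.

```python
import math

def monobit_p_value(bits):
    """Frequency (monobit) test p-value for a sequence of 0/1 values."""
    n = len(bits)
    s = sum(1 if b else -1 for b in bits)   # map 0 -> -1, 1 -> +1 and sum
    s_obs = abs(s) / math.sqrt(n)
    return math.erfc(s_obs / math.sqrt(2))

ALPHA = 0.01  # significance level; the text gives the range 0.001-0.01

balanced = [0, 1] * 500              # equal number of zeros and ones
biased = [1] * 900 + [0] * 100       # strongly biased towards ones

for name, seq in (("balanced", balanced), ("biased", biased)):
    p = monobit_p_value(seq)
    print(name, p, "pass" if p >= ALPHA else "fail")
```

Note that the balanced example passes this sub-test even though it is perfectly periodic, which is why the suite combines many complementary criteria rather than relying on a single one.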
The real-time test setup, in which statistical results were obtained for NIST 800-22, is presented in Figure 5.
The test technique was applied in three stages in order to observe the effect of the additional true random inputs on the statistical results of the system. In the first stage, the combined outputs of the (114,3) and (114,13) ROs were sampled independently with 50, 100, and 200 MHz clock signals. Then, the sampling process was repeated with true random signals/numbers, and the effect of the sampling input on the statistical results was observed. Only partial statistical success was achieved in this stage; the successful results presented in Table 1 were obtained after including the post-processing input in the system.
The AIS31 test proposed by the BSI (German Federal Office for Information Security) was used as another statistical validation tool for the TRNG. The AIS31 test, which can be used as a statistical verification tool for generators, is also accepted as an international standardization process for RNG designs. For this reason, it is frequently used as a test technique in recent studies. AIS31 consists of two separate test procedures, A and B, applied to internal random numbers (after post-processing) and raw random numbers (digital noise), respectively. In procedure A, the statistical tests cover only the randomness of the bits and do not cover their unpredictability. In other words, the statistical tests may detect defects of the randomness source, but they cannot verify its randomness. For this reason, entropy tests (Coron's test, the collision test, etc.) are added in procedure B; these can verify the unpredictability of the bits and thus the randomness of the source [24,25].
The test technique consists of a total of 9 separate statistical test criteria; 6 of these (T0-T5) belong to procedure A and 3 (T6-T8) to procedure B [26,27].
As in the NIST 800-22 test, the AIS31 test was applied to the TRNG offline as in Figure 1, and the results in Table 2 were obtained. All tests from test procedure A (T0-T5) were executed on the internal random numbers, and all tests from test procedure B (T6-T8) were executed on the raw random numbers. The target random numbers to which procedures A and B are applied are therefore different from each other. For this reason, a second, additional memory architecture as in Figure 5 (G) was used to obtain the raw random numbers needed for procedure B. In addition, for procedures A and B, the minimum length of the target random number sequences must be 5,140,000 bits (20000 * 257 = 5,140,000 bits) and 7,200,000 bits, respectively.
However, the maximum length of each random number sequence obtained from the system for the test is equal to the depth (65536 bits) of the memory units addressed by the 16-bit counter. Quartus II allows the block memory contents to be exported to a text file in address format at run time. In order to obtain sufficiently long random number sequences for the test, the memory contents of Figure 5 (F) and (G) were consecutively exported to a text file 80 times (80 * 65536 = 5,242,880 bits) and 110 times (110 * 65536 = 7,208,960 bits), respectively. Then, the memory contents in text format were combined in MATLAB, and two different test files of sufficient length were obtained for the AIS31 test. The number of export operations required for all six scenarios is quite high. Therefore, the AIS31 test was applied only to the random numbers obtained from the 50 MHz sampling scenario, and the test results are given in Table 2. The 100 and 200 MHz sampling scenarios were not evaluated with AIS31.
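The export counts follow from simple arithmetic on the memory depth and the minimum sequence lengths. A short sketch of that calculation is given below; the constant and function names are hypothetical.

```python
import math

MEMORY_DEPTH_BITS = 2 ** 16      # 65536 bits, set by the 16-bit address counter
PROC_A_MIN_BITS = 20000 * 257    # 5,140,000 bits required for procedure A
PROC_B_MIN_BITS = 7_200_000      # bits required for procedure B

def min_exports(required_bits, depth=MEMORY_DEPTH_BITS):
    """Smallest number of full memory exports that covers required_bits."""
    return math.ceil(required_bits / depth)

for name, required in (("A", PROC_A_MIN_BITS), ("B", PROC_B_MIN_BITS)):
    n = min_exports(required)
    print(name, n, n * MEMORY_DEPTH_BITS)
```

The 110 exports used for the longer sequence are exactly the minimum, while the 80 exports used for the shorter one leave a small margin over the minimum of 79.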
Upon examining the results in Table 1 and Table 2, it is observed that the TRNG provides the statistical quality needed for cryptography in six different scenarios. The test results demonstrate that the proposed system can also be used for cryptographic applications. By using additional inputs derived from the noise source to provide statistical randomness, a TRNG model was obtained that is simple in terms of hardware and easy to implement on digital devices with push-button control. Instead of complicated post-processing techniques, the XOR post-processing technique was used in the system, and the post-processing input did not reduce the output bit rate of the system. Furthermore, the results given in Table 2 show that the TRNG has high entropy per bit and that its outputs are unpredictable.
The output bit rate of the TRNG depends directly on the frequency of the non-periodic sampling input f_s obtained from the R_3 input, because each one-bit true random output of the TRNG is produced on a change in the logic level of f_s. The f_s sampling signal obtained from both noise source scenarios is also non-periodic; in other words, it is truly random. In addition to the statistical verification of the TRNG, the measurement of the output bit rate, another important evaluation criterion, is based on the offline analysis of the non-periodic f_s sampling signal in MATLAB.
Random changes of the non-periodic sampling input f_s are recorded to the memory architecture given in Figure 5 (H). Thus, 20 different text files, each consisting of 65536 bits, were obtained at run time from the memory architecture for the analysis of the two separate scenarios. In the system, rising-edge-triggered D-type flip-flops were used for sampling. Therefore, the output bit rate of the system is determined from the total number of "01" logic transitions that occur randomly in each text file for f_s. The logical validation of the method is shown in Figure 6. In Figure 6, R1 and Q represent the combined noise source outputs (R_1) and the raw true random numbers (f_k) obtained from the noise source in Figure 4, respectively. The total number of sampling transitions of f_s for the two separate RO scenarios (57,7) and (57,19) is given in Table 3. The f_clk is 50 MHz for both RO scenarios in the system. The total number of logic transitions given in Table 3 also represents the average output bit rate of the TRNG. The minimum average output bit rate of the TRNG is 30.82 and 61.64 Mbps for f_clk values of 100 and 200 MHz, respectively. The performance of the proposed TRNG architecture in terms of output bit rate is considerably higher than that of the other known oscillator-based studies in the literature. When the comparison results given in Table 4 are examined, it is seen that the TRNG is successful in terms of output bit rate, which is another important evaluation criterion besides security.

Conclusions
For TRNGs implemented on digital devices, the entropy of the sources of randomness is low. Accordingly, it was observed that the statistical quality of random numbers obtained by directly sampling the ROs in the system was not sufficient to meet cryptographic requirements. To meet these requirements, true random inputs obtained from the noise source were used, which also simplified the design architecture. The proposed TRNG passed the statistical tests successfully for six different scenarios. The hardware cost of the system is very low, since it requires neither additional external inputs nor complex post-processing techniques that limit the bit generation rate of the generator. The number of programmable logic elements required for the implementation is less than 1% of the programmable logic elements of the FPGA device used. Thus, the design can even be applied to resource-constrained devices.
The designed TRNG has low power consumption, and the final output can pass the NIST 800-22 and AIS31 statistical tests at minimum output bit rates of 15.41, 30.82, and 61.64 Mbps. The average power consumption of the TRNG for any scenario is 131 milliwatts (mW). This can still constitute a disadvantage when the increased structural complexity of cryptographic applications, which depends on security needs, is considered. Therefore, for critical applications in which energy consumption is important, the generator can be stopped and restarted as needed. The result is a true RNG that can be controlled easily, is fast enough for cryptographic applications, and is easily integrated into a system. Besides, the TRNG has a structure that is easily adaptable to different scenarios, frequencies, and post-processing techniques. Finally, the fact that the additional inputs obtained from the entropy sources are truly random makes the generator more secure against possible attacks.