It had been almost two years since NIST SP800-90B, draft 2 was released. When the final special publication was released on January 10, 2018, we hadn’t expect it to change as dramatically as between draft 1 to draft 2. After a cursory review, it would appear there are only minor changes to the quantitative elements. With the new published status, we will expect many Common Criteria schemes — if they don’t already — to soon mandate quantitative analysis of the raw entropy source.
While SP 800-90B was in draft form, North American schemes (NIAP and CSE) have permitted labs to evaluate a quantitative analysis (if available) or a qualitative analysis of a vendor’s entropy source. Qualitative analysis is usually relied upon when raw entropy is not easily obtained (such as from hardware sources or from closed-source systems), but is often onerous to author and often inefficient to get through evaluation. By contrast, quantitative analysis can bypass significant discussions on the merits of otherwise opaque hardware and software constructs and quantify the raw entropy as a single number. In this technical post, we will discuss one structured approach to quantitative analysis of a raw entropy source.
Caveat: Entropy is a complex topic. There are often many small but significant details to consider in the noise source under test. We gloss over some of these to, instead, focus on our discussion of the analysis approach. We also are not attempting to reproduce the requirements set down in NIST SP 800-90B.
To begin, it is important to be able to collect from the raw entropy noise source. This is often easier said than done. Assuming there is a way to tap the noise source and have it spill a minimum of 1,000,000 consecutive samples (not bits or bytes), then a quantitative analysis is essentially composed of the following tasks:
- Transform the data from the raw output format into a format more easily digested for analysis;
- Determine and/or show the noise is independent and identically distributed (IID) (or not);
- Bit-rank large sample structures;
- Run through an 800-90B compliant test tool;
- Analyse the results; and
- Use the results to influence the entropy gathering mechanism for the DRBG.
Entropy collection can be difficult; assuming non-IID for all cases is okay for now; decoding the data can be very hard; and testing and analysing the data can lead to necessary development changes in the entropy collection engine. Make sure to start this process early on!
Transform the Sample Data
The initial transformation is important to be able to make sense of the noise samples. Among the first questions we ask at Lightship when we receive an entropy sample file is “what format is the data in?” and “what does each sample represent?”. The first question might have answers such as “it is straight binary data, with each bit representing an independent 1 or a 0 from our TRNG”, or “it is hexadecimal format, with each ASCII hex digit pair representing a single sample byte of binary data”. Ultimately the quantitative toolkit that will be used will demand the necessary transformation. Most likely is that you will need to transform the universe of sample data into a single aggregated bitstring with each sample organized from most significant bit to least significant bit.
Secondly, each sample represents something. In many hardware-based entropy noise sources, the raw sample might be a single bit or a single byte from the hardware mechanism used to generate it. However, when it comes to software noise, the sample is usually a larger, more complicated structure. In Linux (eg. kernel 3.10.14) on a single-CPU x86-64 Intel-compatible virtual machine, we might find each sample based from the timer information to be 16 bytes long:
# from linux-3.10.14/drivers/char/random.c:add_timer_randomness()
(The actual size of the structure can depend on the compiler and whether this is a 32-bit or 64-bit kernel since long integers can be 32-bits or 64-bits on those systems, respectively)
The timer samples from the Linux entropy system might be represented as the following 32-character hex strings:
The remainder of this post will refer back to these two samples as well as the struct code snippet above.
Independent and Identically Distributed (IID)
Once it is clear what each sample represents structurally, it may be possible to easily determine if the samples are IID or not. Statistical samples are independent and identically distributed if each random variable has the same probability distribution as the others and are all mutually independent. In the case of our Linux examples above, the data is clearly not IID because they are (definitely) not identically distributed and they are not independent (primarily because they are related to my single CPU high precision timer — more on this in a moment). If it is difficult to determine the IID-ness of the data quickly, then the IID sanity checks described in section 5 of NIST 800-90B can be run.
If the data is deemed and checked to be IID, then the test cases in section 6.1 of the SP can be run. Else the tests in section 6.2 must be run.
Note that it is safe to always consider data to be non-IID since the IID test cases are a subset of the non-IID test cases. Be aware, however, that running all of the tests can take longer and may end up with a more conservative estimate.
Some of the tests in section 6.3 of 800-90B are intensive for time or space on the number of bits in the input sample space. For this reason, it is somewhat impractical to analyse the entire sample. Therefore, the tester must choose which bits to analyse as per section 6.4 of the SP. Based on the knowledge of the structure of the sample, we can determine the bits most likely to be of value in the calculations.
Using the Linux samples above, we will break down the timing information based on the structure previously presented:
The very next sample is decoded similarly:
There are patterns in here. Specifically, there is a varying amount of uncertainty depending on which bits are being examined. (It is obviously easier to find structure when there are significantly more than 2 samples, but in the interest of brevity, we’ll only consider these two.)
We might consider the jiffies component to be basically deterministic as it is based on the frequency of the programmable interrupt timer. After going through our exercise, if we are desperate for bits to use, then we might consider including the least significant nibble of the jiffie counter in our bit-ranking.
The cycles is, by default on Intel x86 systems, based on the value of a 64-bit Time-Stamp Counter register and shows some variance between the two samples. (Note that the cycles component is only 32-bits regardless as the upper 32-bits of the TSC are discarded leaving the higher-entropy bits available.) Based on these two consecutive samples, we might consider only the lower two bytes (eg. the least significant four hex digits) to be entropy-significant.
Finally, the num is actually an encoding of a specific piece of information tied back to one of the add_input_randomness() and add_disk_randomness() functions and may have very little entropy. For the time-being we can ignore num, unless, again, we are desperate.
Once we go through this mental structural exercise, we can highly rank those previously identified bits we believe have the most entropy as per the description in section 6.4 of NIST SP800-90B. Note that it isn’t really necessary to rank each individual bit when dealing with large sample structures (with the exception of the Markov Test which only uses the lower six bits of the input). You should ensure that the highest ranking bits fall within the input symbol space in a consistent fashion — ideally with the top six in the bottom six bits of the byte. The algorithm presented in section 6.4 of SP 800-90B is one way to provide that consistency. The NIST SP and the associated tool (described below) has an upper limit of 8-bits per symbol (refer to section 3.1.3 of the SP) which means that any entropy-significant bits in the noise sample must lie within those 8 bits.
Of course, any bit-ranking actions must be justifiable and consistent with the sampling and conditioning actions performed in the live system. If the live system is throwing away bits, but the bit-ranking processing stage is not, this will result in an artificially high or low min-entropy value compared with what is being done in the system.
NIST 800-90B Test Tool
With NIST 800-90B now published, it will be important to adhere to the standard. Certainly, NIST’s own Entropy Assessment tool over at Github is a good place to start, though it was originally constructed for draft 1 and draft 2 of the SP. Be sure the tool you select is compliant with the final standard.
Assuming you are able to use the NIST tool, the input bitstring must be an aggregate of the (optionally ranked) sample data organized from most significant bit to least significant bit. The NIST tool doesn’t carve individual bits from each input byte, so a noise source that generates 4-bit samples will encode the 4-bit value within the least significant byte nibble and leave the most significant byte nibble zeroed. If the samples are larger than 8 bits, then they will need to be bit-ranked and truncated since that is the maximum symbol size that the NIST tool deals with. Most interestingly, for 1-bit samples, each byte in the input stream represents a single bit in the least significant bit position. So for a bit-oriented noise noise producing 1 million samples (eg. 1 million bits of noise), the input file to the NIST tool must be 1 million BYTES, where each byte is either 0x00 or 0x01 depending on the discrete bit value.
The output from the tool can then be analysed to determine min-entropy. Be sure to run on a fast machine with lots of RAM.
Analysing the Output of min-entropy
After determining the min-entropy using either the NIST tool above, or another justified statistical analysis package, it is important to know what to do with the result. The NIST tool above will yield a min-entropy from each of the given test cases. The minimum min-entropy from all tests is the final number. Keep in mind that for validation purposes, there might be up to three min-entropy results to sift through to determine the final overall min-entropy. See section 3.1.3 in the SP for more details.
How Much Noise to Seed the DRBG
Once the entropy source data has been boiled down to a floating point value representing the level of uncertainty found in each sample, a developer must ensure their DRBG is being seeded with enough entropy to satisfy strength of function claims. For example, generating AES-256 bit symmetric keys requires a DRBG capable of 256-bits of initial strength. If using the CTR_DRBG with AES256, then as per table 3 of NIST 800-90A, it needs to be supplied with 256 bits of entropy. Using the min-entropy value from the test tool will yield the entropy per sample that is needed to seed the DRBG. This is where one needs to be careful: a sample from the test tool’s perspective may have been a truncated version (eg. from 96-bits to 8-bits). Therefore, ensure that an appropriate entropy rate is being calculated by min-entropy/sizeof(sample). This value can be used to determine the total number of bits from the system needed to successfully seed the DRBG with the required entropy.
While this is all fine and good, it is important to understand the ramifications of data transformations made to the entropy noise data prior to feeding it to the DRBG. Data transformations such as XOR operations, hashing, bit-mixing, etc. may or may not have affects on the entropy and the entropy rate. Such operations can be classified as destructive (eg. throwing away bits) or non-destructive. There are no operations that increase entropy; some operations can only increase the rate of entropy (ie. the density spread over the bit string). Operations that are destructive to the noise sample prior to seeding the DRBG will obviously necessitate more entropy gathering up-front to accommodate the later loss.
One widely used conditioning function to help distribute entropy over a fixed bit string is the use of a cryptographic hashing function such as SHA256. However, carefully consider how the use of hash functions can affect entropy as per section 22.214.171.124.2 of NIST SP800-90B.
Finally, after knowing how much input entropy is required from the raw source(s), a developer will be able to influence how much noise must be gathered prior to seeding the DRBG. If min-entropy is low, this necessitates more noise gathering which might adversely affect startup times (especially in low-entropy environments such as headless Linux servers with low-latency storage disks). Thus, it is important to perform entropy analysis tests early on in the development cycle to ensure there is sufficient time to make any changes if necessary.
For more information about approaches to collecting, decoding and making sense of your entropy sources, talk to Lightship Security.
Greg McLearn is Lightship’s Technical Director. He has been doing Common Criteria and security certifications for 10+ years and enjoys getting his hands on some of the latest technology. He has authored several tools to help facilitate security testing.