The mighty SlashDot has a post about the Random Number Generators that come supplied with the Linux OS may not be random enough, or specifically may not emulate entropy well enough*. As I find RNG in programming to be darn interesting (also know as PRNG – pseudorandom number generation, because you know it can never be really truly random) I’m sharing it here so I can ponder it and thrash it out.
As a fan of this stuff the /. post page is wonderful. Arguments back and forth, supposition and analysis on the arguments and allegedly brutal attacks between Linus and the community, and the fall out when cooler heads prevail. It is a great example of the online debate that happens around the open source community, where we can actually read for ourselves and appreciate the breadth of the impacts and discussion. The impacts from the choices made in the development of RdRand, vs /dev/random and /dev/urandom are very important within the scope of PRNG, but they are also a very small part of the overall Linux kernel, such that the scope of change and the community based observations can be consumed.
I love it for the fact that it is a debate amongst a niche community (relative to the Internet) where passions drive separate and disparate goals. Here is a bit of info and a quick meandering.
As the header in the SlashDot article quotes from the source article:
“As a followup to Linus’s opinion about people skeptical of the Linux random number generator, a new paper analyzes the robustness of /dev/urandom and /dev/random . From the paper: ‘From a practical side, we also give a precise assessment of the security of the two Linux PRNGs, /dev/random and /dev/urandom. In particular, we show several attacks proving that these PRNGs are not robust according to our definition, and do not accumulate entropy properly.
These attacks are due to the vulnerabilities of the entropy estimator and the internal mixing function of the Linux PRNGs. These attacks against the Linux PRNG show that it does not satisfy the “robustness” notion of security, but it remains unclear if these attacks lead to actual exploitable vulnerabilities in practice.'”
It is ponderous to think of how random and how entropic a PRND process has to be when shipped with an OS. I can follow that the best possible solution should ideally be used, but can also see a point at which the degree of randomness/entropy desired is provided by a range of tools; and perhaps for most installations that vulnerability in the RNG’s breadth of entropy is actually fine. Meaning – how random does a media box for a TV need to be? I’d start the analysis with a purpose in mind to keep to the best of the communities use for the OS, and consider the edge case where a specialised implementation might be performed anyway, which renders the default delivered in the OS package somewhat moot.
i.e. A server running primarily as a file server, or providing innocuous functions probably needs very little in advanced RNG as it has very little Crypto base tasks to perform. Sure there might be a few impacts which are less than perfect around the edges, but not many. Conversely a system which is being depended on for very high security functions or transactions does not want a vulnerability which could be exploited, or a value which could predicted.
If the default OS packages have a security flaw inherent rather than a point of vulnerability in their implementation then that is fixable. The debate and banter really got into high gear in the comments when the specialists joined the discussions.
The debate also breached the possibility of some of these tools for randomness having vulnerabilities which were either exploited by, or designed by the American NSA. Respectfully it is an discussion that I find distracting at best. If the community can not find over time the vulnerabilities in the code, be they either intentional or not, then the experiment has failed. In truth so far, the experiment is working. At least from where I am sitting looking at the body of work for Linux, and all the derivations.
Firstly, what in hell does the entropy have to do with the random numbers in a program?
If I understand it (which is questionable) then the entropy as in use by an algorithm needs to be sourced so that it is not predictable. This is not “entropy” meaning “decay”, this is “entropy” meaning “unpredictable”.
How random does a program’s RNG need to be?
For me, a 1 in 1000 generated number range being guessed correctly 1 in 200 times shows a problem. That is not a huge problem if app itself is just a toy, but in a commercial product I think that can be considered undesirable. This is the range of random that one of the default tools in the Linux distro potentially could create if an attack was performed and it was not setup correctly. I think that is a narrow attack probability, but it is there.
Like everything else in IT, the context of use is really critical to how severe the impact is, and how much work is invested to getting to a better result.
i.e. N 1 to 1000 can be correctly guessed roughly every 200 attempts – that is poor. I guess that 1 in 200 is really low by layman’s standards, but exceedingly high by crypto and pRNG standards.
A chance of guessing a 1 in 1000 number being 1 in 1000 is obviously darn reasonable, and not an issue at all. Another part of the setup which can sometimes be performed incorrectly (especially in Windows based apps using the default RNG functions) is when the app is not seeded properly.
I found this a long time ago when writing a simple dice roller program where my seed value was going through a small algorithm, but about a third of the time the result was the same. This meant that it was likely (1 in 4) that the app would have the same result for the first “random” number. Not sufficiently random at all. My implementation in this case was the problem, which I fixed, but it did get me thinking about how better to do it.
I’m still pondering and reading.
What is true random, what is PRNG (computer algorithms) vs TRNG (a physical phenomenon)?
Random is quite a theoretical and philosophical concept. The Pseudo in P-RNG is really saying that the machine always need a way to get a source number (seed value) and they use computed values and algorithms to do so. TRNG (or True RNG) uses non-machine generated sources for these source numbers, and so are considered True-er. TRNG is better, but PRNG is very efficient, reliable, and often faster to create. In highly complex systems the TRNG vs PRNG numbers could yield very very similar results.
Random.org has a summary worth reading if you’re curious. I found much of the site to be darn complex and heavy reading, but also very fun. Their service gives TRNG numbers from atmospheric noise. That alone is funky.
Yes, yes, incorrect and nice blather Andrew, but where are the web comics?
Straight from the powerful and godlike comments in SlashDot, here are a few most trustworthy and darn sly web-comics about random from Dilbert and XKCD.
I’m sure you follow their gist, fair point well made; although plainly ridiculous. I really like them.
I hope you enjoyed this post at least 1/8th as much as I did thinking about it.
* How random/entropic was (probably) the key point of discussion, and segwayed into how rambunctious Linus’s response to the feedback was when he refused to default the system to exclude the pRNGs that were deemed by some to be not good enough.