# Primes, parameters and moduli

## First a brief history of Diffie-Hellman for those not familiar with it

The short version of Diffie-Hellman is that two parties (Alice and Bob) want to share a secret so they can encrypt their communications and talk securely without an eavesdropper (Eve) listening in. So Alice and Bob first share a public prime number and modulus (which Eve can see). Alice and Bob then each choose a large random number (referred to as their private key) and apply some modular arithmetic using the shared prime and modulus. If everything goes as planned Alice sends her answer to Bob, and Bob sends his answer to Alice. They each take the number sent by the other party, and using modular arithmetic, and their respective private keys are able to derive a number that will be the same for both of them, known as the shared secret. Even if Eve listens in she cannot easily derive the shared secret.

However if Eve has sufficiently advanced cryptographic experts, and sufficient computing power it is conceivable that she can derive the private keys for exchanges when a sufficiently small modulus is used. Most keys, today, are in the range of 1024 bit or larger meaning that the modulus is at least several hundred digits long.

Essentially if Alice and Bob agree to a poorly chosen modulus number then Eve will have a much easier time deriving the secret keys and listening in on their conversation. Poor examples of modulus numbers include numbers that are not actually prime, and modulus numbers that aren’t sufficiently large (e.g. a 1024 bit modulus provides vastly less protection than a 2048 bit modulus).

## Why you need a good prime for your modulus

A prime number is needed for your modulus. For this example we’ll use 23. Is 23 prime? 23 is small enough that you can easily walk through all the possible factors (1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12), divide 23 by them and see if there is a remainder. But much larger prime numbers, such as ones that are in the mid to high hundreds of digits long, are essentially impossible to factor unless you have a lot of computational resources, and some really efficient ways to try and factor prime numbers. Fortunately there is a simple solution to this, just use an even larger prime number, such as a 2048, 4096 or even 16384 bit prime number. But when picking such a number how can you be sure it’s a prime and not easily factored? Ignoring the obvious give-aways (like all numbers ending in 0, 2, 4, 5, 6 and 8) there are several clever mathematical algorithms for testing the primality of numbers and for generating prime numbers.

## Miller-Rabin, Shawe-Taylor and FIPS 186-4

The Miller-Rabin primality test was first proposed in 1976, and the Shawe-Taylor strong prime generation was first proposed in 1986. One thing that is important to remember, is that back when these algorithms were made public the amount of computing power available to generate/factorize prime numbers was much smaller than is now available. The Miller-Rabin test is a probabilistic primality test, you cannot conclusively prove a number is prime, but by running the Miller-Rabin test multiple times with different parameters you can be reasonably certain that the number in question is probably prime and with enough tests your confidence can approach almost 100%. Shawe-Taylor is also probabilistic, you’re not 100% guaranteed to get a good prime, but the chances of something going wrong and getting a non-prime number are very small.

FIPS 186-4 covers the math and usage of both Miller-Rabin and Shawe-Taylor, and gives specific information on how to use them securely (e.g. how many rounds of Miller-Rabin you’ll need to use, etc.). The main difference between Rabin-Miller and Shawe-Taylor is that Shawe-Taylor generates something that is probably a prime, whereas with Miller-Rabin you generate a number that might be prime, and then test it. As such you may immediately generate a good number, or it may take several tries. In testing on a 3Ghz CPU, using a single core it took me between less than a second and over 10 minutes to generate a 2048 bit prime using the Miller-Rabin method.

The easiest way to deal with the time and computing resources needed to generate primes is to generate them in advance. Also because they are shared publicly during the exchange you can even distribute them in advance, which is what has happened for example with OpenSSL. Unfortunately many of the primes currently in use were generated a long time ago when the attacks available were not as well understood, and thus are not very large. Additionally, there are relatively few primes in use and it appears that there may be a literal handful of primes in wide use (especially the default ones in OpenSSL and OpenSSH for example). There is now public research to indicate that at least one large organization may have successfully attacked several of these high value prime numbers, and as computational power increases this becomes more and more likely.

To generate Diffie-Hellman primes in advance is easy, for example with OpenSSL:

openssl dhparam [bit size] -text > filename

so to generate a 2048 bit prime:

openssl dhparam 2048 -text

Or for example with OpenSSH you first generate a list of candidate primes and then test them:

ssh-keygen -G candidates -b 2048
ssh-keygen -T moduli -f candidates

Please note that OpenSSH uses a list of multiple primes so generation can take some time, especially with larger key sizes.

## Defending – larger primes

The best defense against someone factoring your primes is to use really large primes. In theory every time you add a single bit to the prime you are increasing the workload significantly (assuming no major advances in math/quantum computing that we don’t know about that make factorization much easier). As such moving from a 1024 bit to 2048 bit prime is a huge improvement, and moving to something like a 4096 bit prime should be safe for a decade or more (or maybe not, I could be wrong). So why don’t we simply use very large primes? CPU power and battery power are still finite resources, and very large primes take much more computational power to use, so much so that very large primes like 16384 bit primes become impractical to use, introducing noticeable delays in connections. The best thing we can do here is set a minimum prime size such as 2048 bits now, and hopefully move to 4096 bit primes within the next few years.

## Defending – diversity of primes

But what happens if you cannot use larger primes? The next best thing is to use custom generated prime numbers, this means that an attacker will have to factor your specific prime(s), increasing their workload. Please note that even if you can use large primes, prime diversity is still a good idea, but prime diversity increases the amount of work for an attacker at a much slower rate than using larger prime does. The following apps and locations contain primes you may want to replace:

OpenSSH: /etc/ssh/moduli

Apache with mod_ssl: “SSLOpenSSLCondCmd DHParameters [filename]” or append the DH param to the SSLCertificateFile

Some excellent articles on securing a variety of other services and clients are:

## Current and future state

So what should we do?

I think the best plan for dealing with this in the short term is deploying larger primes (2048 bits minimum, ideally 4096 bits) right now wherever possible. For systems that cannot have larger primes (e.g. some are limited to 768, 2013 bits or other related small sizes) we should ensure that default primes are not used and instead custom primes are used, ideally for limited periods of time, replacing the primes as often as possible (which is easier since they are small and quick to generate).

In the medium term we need to ensure as many systems as possible can handle larger prime sizes, and we need to make default primes much larger, or at least provide easy mechanisms (such as firstboot scripts) to replace them.

Longer term we need to understand the size of primes needed to avoid decryption due to advances in math and quantum computing. We also need to ensure software has manageable entry points for these primes so that they can easily be replaced and rotated as needed.

## Why not huge primes?

Why not simply use really large primes? Because computation is expensive, battery life matters more than ever and latency will become problems that users will not tolerate. Additionally the computation time and effort needed to find huge primes (say 16k) is difficult at best for many users and not possible for many (anyone using a system on a chip for example).

## Why not test all the primes?

Why not test the DH params passed by the remote end, and refuse the connection if the primes used are too small? There is at least one program (wvstreams [wvstreams]) that tests the DH params passed to it, however it does not check for a minimum size, it simply tests the DH params for correctness. The problem with this is twofold, one there would be a significant performance impact (adding time to each connection) and two, most protocols and programs don’t really support error messages from the remote end related to the DH params, so apart from dropping the connection there is not a lot you can do.

## Summary

As bad as things sound there is some good news. Fixing this issue is pretty trivial, and mostly requires some simple operational changes such as using moderately sized DH Parameters (e.g. 2048 bits now, 4096 within a few years). The second main fix for this issue is to ensure any software in use that handles DH Parameters can handle larger key sizes, if this is not possible then you will need to place a proxy in front that can handle proper key sizes (so all your old Java apps will need proxies). This also has the benefit of decoupling client-server encryption from old software which will allow you to solve future problems more easily as well.

# The SLOTH attack and IKE/IPsec

Executive Summary: The IKE daemons in RHEL7 (libreswan) and RHEL6 (openswan) are not vulnerable to the SLOTH attack. But the attack is still interesting to look at .

The SLOTH attack released today is a new transcript collision attack against some security protocols that use weak or broken hashes such as MD5 or SHA1. While it mostly focuses on the issues found in TLS, it also mentions weaknesses in the “Internet Key Exchange” (IKE) protocol used for IPsec VPNs. While the TLS findings are very interesting and have been assigned CVE-2015-7575, the described attacks against IKE/IPsec got close but did not result in any vulnerabilities. In the paper, the authors describe a Chosen Prefix collision attack against IKEv2 using RSA-MD5 and RSA-SHA1 to perform a Man-in-the-Middle (MITM) attack and a Generic collision attack against IKEv1 HMAC-MD5.

We looked at libreswan and openswan-2.6.32 compiled with NSS as that is what we ship in RHEL7 and RHEL6. Upstream openswan with its custom crypto code was not evaluated. While no vulnerability was found, there was some hardening that could be done to make this attack less dangerous that will be added in the next upstream version of libreswan.

Specifically, the attack was prevented because:

• The SPI’s in IKE are random and part of the hash, so it requires an online attack of 2^77 – not an offline attack as suggested in the paper.
• MD5 is not enabled per default for IKEv2.
• Weak Diffie-Hellman groups DH22, DH23 and DH24 are not enabled per default.
• Libreswan as a server does not re-use nonces for multiple clients.
• Libreswan destroys nonces when an IKE exchange times out (default 60s).
• Bogus ID payloads in IKEv1 cause the connection to fail authentication.

The rest of this article explains the IKEv2 protocol and the SLOTH attack.

## The IKEv2 protocol

The IKE exchange starts with an IKE_INIT packet exchange to perform the Diffie-Hellman Key Exchange. In this exchange, the initiator and responder exchange their nonces. The result of the DH exchange is that both parties now have a shared secret called SKEYSEED. This is fed into a mutually agreed PRF algorithm (which could be MD5, SHA1 or SHA2) to generate as much pseudo-random key material as needed. The first key(s) are for the IKE exchange itself (called the IKE SA  or Parent SA), followed by keys for one or more IPsec SAs (also called Child SAs).

But before the SKEYSEED can be used, both ends need to perform an authentication step. This is the second packet exchange, called IKE_AUTH. This will bind the Diffie-Hellman channel to an identity to prevent the MITM attack. Usually these are digital signatures over the session data to prove ownership of the identity’s private key. In IKE, it signs the session data. In TLS that signature is only over a hash of the session data which made TLS more vulnerable to the SLOTH attack.

The attack is to trick both parties to sign a hash which the attacker can replay to the other party to fake the authentication of both entities.

They call this a “transcript collision”. To facilitate the creation of the same hash, the attacker needs to be able to insert its own data in the session to the first party so that the hash of that data will be identical to the hash of the session to the second party. It can then just pass on the signatures without needing to have private keys for the identities of the parties involved. It then needs to remain in the middle to decrypt and re-encrypt and pass on the data, while keeping a copy of the decrypted data.

The initial IKE_INIT exchange does not have many payloads that can be used to manipulate the outcome of the hashing of the session data. The only candidate is the NOTIFY payload of type COOKIE.

The SLOTH attacker is the MITM between the VPN client and VPN server. It prepares an IKE_INIT request to the VPN server but waits for the VPN client to connect. Once the VPN client connects, it does some work with the received data that includes the proposals and nonce to calculate a malicious COOKIE payload and sends this COOKIE to the VPN client. The VPN client will re-send the IKE_INIT request with the COOKIE to the MITM. The MITM now sends this data to the real VPN server to perform an IKE_INIT there. It includes the COOKIE payload even though the VPN server did not ask for a COOKIE. Why does the VPN server not reject this connection? Well, the IKEv2 RFC-7296 states:

When one party receives an IKE_SA_INIT request containing a cookie whose contents do not match the value expected, that party MUST ignore the cookie and process the message as if no cookie had been included

The paper actually pointed out a common implementation error:

To implement the attack, we must first find a collision between m1 amd m’1. We observe that in IKEv2 the length of the cookie is supposed to be at most 64 octets but we found that many implementations allow cookies of up to 2^16 bytes. We can use this flexibility in computing long collisions.

The text that limits the COOKIE to 64 byte is hidden deep down in the RFC when it talks about a special use case. It is not at all clearly defined:

When a responder detects a large number of half-open IKE SAs, it
SHOULD reply to IKE_SA_INIT requests with a response containing the
be between 1 and 64 octets in length (inclusive), and its generation
is described later in this section. If the IKE_SA_INIT response
unchanged.

A few implementations (including libreswan/openswan) missed this 64 byte limitation. So instead, those implements only looked at the COOKIE value as a NOTIFY PAYLOAD. These payloads have a two byte Payload Length value, so NOTIFY data is legitimately 2^16 (65535) bytes. Libreswan will fix this in the next release and limit the COOKIE to 64 bytes.

## Attacking the AUTH hash

Assuming the above works, it needs to find a collision between m1 and m’1. The only numbers they claim could be feasible is when MD5 would be used for the authentication step in IKE_AUTH. An offline attack could then be computed of 2^16 to 2^39 which they say would take about 5 hours. As the paper states, IKEv2 implementations either don’t support MD5, or if they do it is not part of the default proposal set. It makes a case that the weak SHA1 is widely supported in IKEv2 but admits using SHA1 will need more computing power (they listed 2^61 to 2^67 or 20 years). Note that libreswan (and openswan in RHEL) requires manual configuration to enable MD5 in IKEv2, but SHA1 is still allowed for compatibility.

## The final step of the attack – Diffie-Hellman

Assuming the above succeeds the attacker needs to ensure that g^xy’ = g^x’y. To facilitate that, they use a subgroup confinement attack, and illustrate this with an example of picking x’ = y’ = 0. Then the two shared secrets would have the value 1. In practice this does not work according to the authors because most IKEv2 implementations validate
the received Diffie-Hellman public value to ensure that it is larger than 1 and smaller than p – 1.They did find that Diffie-Hellman groups 22 to 24 are known to have many small subgroups, and implementations tend to not validate these. Which led to an interesting discussion on one of the cypherpunks mailinglists about the mysterious nature of the DH groups in RFC-5114. Which are not enabled in libreswan (or openswan in RHEL) by default, and require manual configuration precisely because the origin of these groups is a mystery.

## The IKEv1 attack

The paper briefly brainstorms about a variant of this attack using IKEv1. It would be interesting because MD5 is very common with IKEv1, but the article is not really clear on how that attack should work. It mentions filling the ID payload with malicious data to trigger the collision, but such an ID would never pass validation.

## Counter measures

Work was already started on updating the cryptographic algorithms deemed mandatory to implement for IKE. Note that it does not state which algorithms are valid to use, or which to use per default. This work is happening at the IPsec working group at the IETF and can be found at draft-ietf-ipsecme-rfc4307bis. It is expected to go through a few more rounds of discussion and one of the topics that will be raised are the weak DH groups specified in RFC-5114.

Upstream Libreswan has hardened its cookie handling code, preventing the attacker from sending an uninvited cookie to the server without having their connection dropped.

# DevOps On The Desktop: Containers Are Software As A Service

It seems that everyone has a metaphor to explain what containers “are”. If you want to emphasize the self-contained nature of containers and the way in which they can package a whole operating system’s worth of dependencies, you might say that they are like virtual machines. If you want to emphasize the portability of containers and their role as a distribution mechanism, you might say that they are like a platform. If you want to emphasize the dangerous state of container security nowadays, you might say that they are equivalent to root access. Each of these metaphors emphasizes one aspect of what containers “are”, and each of these metaphors is correct.

It is not an exaggeration to say that Red Hat employees have spent man-years clarifying the foggy notion invoked by the buzzword “the cloud”. We might understand cloudiness as having three dimensions: (1) irrelevant location, (2) external responsibility, and (3) the abstraction of resources. The different kinds of cloud offerings distinguish themselves from one another by their emphasis on these qualities. The location of the resources that comprise the cloud is one aspect of the cloud metaphor and the abstraction of resources is another aspect of the cloud metaphor. This understanding was Red Hat’s motivation for both its private-platform offerings and its infrastructure-as-a-service offerings (IaaS/PaaS). Though the hardware is self-hosted and administered, developers are still able to think either in terms of pools of generic computational resources that they assign to virtual machines (in the case of IaaS) or in terms of applications (in the case of PaaS).

What do containers and the cloud have in common? Software distribution. Software that is distributed via container or via statically-linked binary is essentially software-as-a-service (SaaS). The implications of this are far-reaching.

Given the three major dimensions of cloudiness, what is software as a service? It is a piece of software hosted and administered externally to you that you access mainly through a network layer (either an API or a web interface). With this definition of software as a service, we can declare that  99% of the container-distributed and statically-linked Go software is SaaS that happens to run on your own silicon powered by your own electricity. Despite being run locally, this software is still accessed through a network layer and this software is still—in practice—administered externally.

A static binary is a black box. A container is modifiable only if it was constructed as an ersatz VM. Even if the container has been constructed as an ersatz VM, it is only as flexible as (1) the underlying distribution in the container and (2) your familiarity with that distribution. Apart from basic networking, the important parts of administration must be handled by a third party: the originating vendor. For most containers, it is the originating vendor that must take responsibility for issues like Heartbleed that might be present in software’s underlying dependencies.

This trend, which shows no signs of slowing down, is a natural extension to the blurring of the distinction between development and operations. The term for this collaboration is one whose definition is even harder to pin down than “cloud”: DevOps. The DevOps movement has seen some traditional administration responsibilities—such as handling dependencies—become shared between operational personnel and developers. We have come to expect operations to consume their own bespoke containers and static binaries in order to ensure consistency and to ensure that needed runtime dependencies are always available. But now, a new trend is emerging—operational groups are now embedding the self-contained artifacts of other operational groups into their own stack. Containers and static blobs, as a result, are now emerging as a general software distribution method.

The security implications are clear. Self-contained software such as containers and static binaries must be judged as much by their vendor’s commitments to security as by their feature set because it is that vendor who will be acting as the system administrator. Like when considering the purchase of a phone, the track record for appropriate, timely, and continuous security updates is as important as any feature matrix.

Some security experts might deride the lack of local control over security that this trend represents. However, that analysis ignores economies of scale and the fact that—by definition—the average system administrator is worse than the best. Just as the semi-centralized hosting of the cloud has allowed smaller businesses to achieve previously impossible reliability for their size, so too does this trend offer the possibility of a better overall security environment.

Of course, just as the unique economic, regulatory, and feature needs of enterprise customers pushed those customers to private clouds, so too must there be offerings of more customizable containers.

Red Hat is committed to providing both “private cloud” flexibility and to helping ISVs leverage the decades of investment that we have made in system administration. We release fresh containers at a regular cadence and at the request of our security team. By curating containers in this way, we provide a balance between the containers becoming dangerously out of date and the fragility that naturally occurs when software used within a stack updates “too often”. However, just as important is our commitment to all of our containers being updatable in the ways our customers have come to expect from their servers and VMs: `yum update` for RPM based content, and zips and patches for content such as our popular JBoss products. This means that if you build a system on a RHEL-based container you can let “us” administer it by simply keeping up with the latest container releases *or* you can take control yourself using tools you already know.

Sadly, 2016 will probably not be the year of the Linux desktop, but it may well be the year of DevOps on the desktop. In the end, that may be much more exciting.

# Risk report update: April to October 2015

In April 2015 we took a look at a years worth of branded vulnerabilities, separating out those that mattered from those that didn’t. Six months have passed so let’s take this opportunity to update the report with the new vulnerabilities that mattered across all Red Hat products.

ABRT (April 2015) CVE-2015-3315:

ABRT (Automatic Bug Reporting Tool) is a tool to help users to detect defects in applications and to create a bug report. ABRT was vulnerable to multiple race condition and symbolic link flaws. A local attacker could use these flaws to potentially escalate their privileges on an affected system to root.

This issue affected Red Hat Enterprise Linux 7 and updates were made available. A working public exploit is available for this issue. Other products and versions of Enterprise Linux were either not affected or not vulnerable to privilege escalation.

JBoss Operations Network open APIs (April 2015) CVE-2015-0297:

Red Hat JBoss Operations Network is a middleware management solution that provides a single point of control to deploy, manage, and monitor JBoss Enterprise Middleware, applications, and services. The JBoss Operations Network server did not correctly restrict access to certain remote APIs which could allow a remote, unauthenticated attacker to execute arbitrary Java methods. We’re not aware of active exploitation of this issue. Updates were made available.

“Venom” (May 2015) CVE-2015-3456:

Venom was a branded flaw which affected QEMU. A privileged user of a guest virtual machine could use this flaw to crash the guest or, potentially, execute arbitrary code on the host with the privileges of the host’s QEMU process corresponding to the guest.

A number of Red Hat products were affected and updates were released. Red Hat products by default would block arbitrary code execution as SELinux sVirt protection confines each QEMU process.

“LogJam” (May 2015) CVE-2015-4000:

TLS connections using the Diffie-Hellman key exchange protocol were found to be vulnerable to an attack in which a man-in-the-middle attacker could downgrade vulnerable TLS connections to weak cryptography which could then be broken to decrypt the connection.

Like Poodle and Freak, this issue is hard to exploit as it requires a man in the middle attack. We’re not aware of active exploitation of this issue. Various packages providing cryptography were updated.

BIND DoS (July 2015) CVE-2015-5477:

A flaw in the Berkeley Internet Name Domain (BIND) allowed a remote attacker to cause named (functioning as an authoritative DNS server or a DNS resolver) to exit, causing a denial of service against BIND.

This issue affected the versions of BIND shipped with all versions of Red Hat Enterprise Linux. A public exploit exists for this issue. Updates were available the same day as the issue was public.

libuser privilege escalation (July 2015) CVE-2015-3246:

The libuser library implements a interface for manipulating and administering user and group accounts. Flaws in libuser could allow authenticated local users with shell access to escalate privileges to root.

Red Hat Enterprise Linux 6 and 7 were affected and updates available same day as issue was public. Red Hat Enterprise Linux 5 was affected and a mitigation was published.  A public exploit exists for this issue.

Firefox lock file stealing via PDF reader (August 2015) CVE-2015-4495:

A flaw in Mozilla Firefox could allow an attacker to access local files with the permissions of the user running Firefox. Public exploits exist for this issue, including as part of Metasploit, and targeting Linux systems.

This issue affected Firefox shipped with versions of Red Hat Enterprise Linux and updates were available the next day after the issue was public.

Firefox add-on permission warning (August 2015) CVE-2015-4498:

Mozilla Firefox normally warns a user when trying to install an add-on if initiated by a web page.  A flaw allowed this dialog to be bypassed.

This issue affected Firefox shipped with Red Hat Enterprise Linux versions and updates were available the same day as the issue was public.

Conclusion

The issues examined in this report were included because they were meaningful.  This includes the issues that are of a high severity and are likely easy to be exploited (or already have a public working exploit), as well as issues that were highly visible or branded (with a name or logo), regardless of their severity.

Between 1 April 2015 and 31 October 2015 for every Red Hat product there were 39 Critical Red Hat Security Advisories released, addressing 192 Critical vulnerabilities.  Aside from the issues in this report which were rated as having Critical security impact, all other issues with a Critical rating were part of Red Hat Enterprise Linux products and were browser-related: Firefox, Chromium, Adobe Flash, and Java (due to the browser plugin).

Our dedicated Product Security team continue to analyse threats and vulnerabilities against all our products every day, and provide relevant advice and updates through the customer portal. Customers can call on this expertise to ensure that they respond quickly to address the issues that matter.  Hear more about vulnerability handling in our upcoming virtual event: Secure Foundations for Today and Tomorrow.