Remote code execution via serialized data

Most programming languages contain features that, used correctly, are incredibly powerful, but used incorrectly can be incredibly dangerous. Serialization (and deserialization) is one such feature, available in most modern programming languages. As mentioned in a previous article:

“Serialization is a feature of programming languages that allows the state of in-memory objects to be represented in a standard format, which can be written to disk or transmitted across a network.”

So why is deserialization dangerous?

Serialization and, more importantly, deserialization of data is unsafe due to the simple fact that the data being processed is implicitly trusted to be “correct.” So if you accept data such as program variables from an untrusted source, you make it possible for an attacker to control program flow. Additionally, many programming languages now support serialization not just of data (e.g. strings, arrays, etc.) but also of code objects. For example, with Python pickle() you can serialize user-defined classes: you can take a section of code, ship it to a remote system, and have it executed there.

Of course this means that anyone with the ability to send a serialized object to such a system can now execute arbitrary code easily, with the full privileges of the program running it.
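
To make this concrete, here is a minimal Python sketch (the class name and the command it runs are purely illustrative) showing how a crafted pickle payload executes code at deserialization time:

import os
import pickle

class Malicious:
    # __reduce__ tells pickle how to rebuild the object; here it asks the
    # unpickler to call os.system("id") instead of restoring any data.
    def __reduce__(self):
        return (os.system, ("id",))

payload = pickle.dumps(Malicious())

# On the receiving side, merely loading the data runs the command with
# the privileges of this process.
pickle.loads(payload)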

Some examples of failure

Unlike many classes of security vulnerabilities, you cannot really create a deserialization flaw by accident. Memory management flaws, for example, can easily occur due to a single off-by-one calculation or the misuse of a variable type, but the only way to create a deserialization flaw is to use deserialization. Some quick examples of failure include:

CVE-2012-4406 – OpenStack Swift (an object store) used Python pickle() to store metadata in memcached (which is a simple key/value store and does not support authentication), so an attacker with access to memcached could cause arbitrary code execution on all the servers using Swift.

CVE-2013-2165 – In JBoss’s RichFaces, ResourceBuilderImpl.java did not restrict the classes which could be deserialized, allowing an attacker to interact with classes that could result in arbitrary code execution.

Unfortunately, there are many more examples spanning virtually every major OS and platform vendor. Please note that virtually every modern language includes serialization facilities that are not safe to use by default (Perl Storable, Ruby Marshal, etc.).

So how do we serialize safely?

The simplest way to serialize and deserialize data safely is to use a format that does not include support for code objects. Your best bet for serializing almost all forms of data safely in a widely supported format is JSON. And when I say widely supported I mean everything from Cobol and Fortran to Awk, Tcl and Qt. JSON supports key:value pairs and arrays, and within these a variety of data types including strings, numbers, objects (JSON objects), arrays, true, false and null. JSON objects can contain additional JSON objects, so you can, for example, serialize a number of things into discrete JSON objects and then combine those into a single larger JSON document (using an array, for example).
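
For example, in Python this might look like the following sketch (the field names are illustrative); json.loads() only ever produces basic data types such as dicts, lists, strings, numbers, booleans and None, never executable code:

import json

state = {"user": "alice", "roles": ["admin"], "active": True}

# Serialize: the result is plain text with no code objects in it.
data = json.dumps(state)

# Deserialize: only basic data types come back, nothing is executed.
restored = json.loads(data)
assert restored == state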

Legacy code

But what if you are dealing with legacy code and can’t convert to JSON? On the receiving (deserializing) end you can attempt to monkey patch the code to restrict the objects allowed in the serialized data. However, most languages do not make this easy or safe, and a determined attacker will be able to bypass the restrictions in most cases. An excellent paper is available from BlackHat USA 2011 which covers a number of clever techniques to exploit Python pickle().
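
As a sketch of what such a restriction can look like in Python, the “restricting globals” approach subclasses pickle.Unpickler and rejects everything outside a small allow-list (the allow-list below is illustrative, and, as noted above, a determined attacker may still find a way around this kind of filter):

import builtins
import io
import pickle

SAFE_BUILTINS = {"range", "complex", "set", "frozenset", "slice"}

class RestrictedUnpickler(pickle.Unpickler):
    def find_class(self, module, name):
        # Permit only a handful of harmless builtins; anything else,
        # such as os.system or subprocess.Popen, is rejected outright.
        if module == "builtins" and name in SAFE_BUILTINS:
            return getattr(builtins, name)
        raise pickle.UnpicklingError(
            "global '%s.%s' is forbidden" % (module, name))

def restricted_loads(data):
    # Drop-in replacement for pickle.loads() with the allow-list applied.
    return RestrictedUnpickler(io.BytesIO(data)).load()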

What if you need to serialize code objects?

But what if you actually need to serialize and deserialize code objects? Since it’s impossible to determine whether arbitrary code is safe, you have to trust the code you are running. One way to establish that the code comes from a trusted source and has not been modified in transit is to use code signing. Code signing is very difficult to do correctly and very easy to get wrong. For example you need to:

  1. Ensure the data is from a trusted source
  2. Ensure the data has not been modified, truncated or added to in transit
  3. Ensure that the data is not being replayed (e.g. sending valid code objects out of order can result in manipulation of the program state)
  4. Ensure that if data is blocked (e.g. blocking code that should be executed but is not, leaving the program in an inconsistent state) you can return to a known good state

These are just a few of the major concerns. Creating a trusted framework for remote code execution is outside the scope of this article; however, there are a number of such frameworks available.
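
As a very rough sketch of the first two points (origin and integrity), serialized data can be accompanied by an HMAC computed with a shared secret; key management and replay protection (points 3 and 4) are deliberately left out here, and the key shown is only a placeholder:

import hashlib
import hmac
import json

SECRET_KEY = b"replace-with-a-real-shared-secret"  # placeholder key

def sign(payload: bytes) -> bytes:
    return hmac.new(SECRET_KEY, payload, hashlib.sha256).digest()

# Sender: serialize to JSON, then prepend a 32-byte SHA-256 HMAC tag.
payload = json.dumps({"seq": 1, "cmd": "reload"}).encode()
message = sign(payload) + payload

# Receiver: verify the tag in constant time before deserializing anything.
tag, body = message[:32], message[32:]
if not hmac.compare_digest(tag, sign(body)):
    raise ValueError("message failed authentication")
data = json.loads(body)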

Conclusion

If data must be transported in a serialized format, use JSON. At the very least this will ensure that you have access to high quality libraries for parsing the data, and that code cannot be directly embedded as it can with other formats such as Python pickle(). Additionally, you should ideally encrypt and authenticate the data if it is sent over a network; an attacker who can manipulate program variables can almost certainly modify program execution in a way that allows privilege escalation or other malicious behavior. Finally, you should authenticate the data and prevent replay attacks (e.g. where the attacker records and re-sends a previous session’s data); chances are that if you are using JSON you can simply wrap the session in TLS with an authentication layer (such as certificates, a username and password, or tokens).

libuser vulnerabilities

Updated 2015-07-24 @ 12:33 UTC

It was discovered that the libuser library contains two vulnerabilities which, in combination, allow unprivileged local users to gain root privileges. libuser is a library that provides read and write access to files like /etc/passwd, which constitute the system user and group database. On Red Hat Enterprise Linux it is a central system component.

What is being disclosed today?

Qualys reported two vulnerabilities, CVE-2015-3245 and CVE-2015-3246.

It turns out that the CVE-2015-3246 vulnerability, by itself or in conjunction with CVE-2015-3245, can be exploited by an unprivileged local user to gain root privileges on an affected system. However, due to the way libuser works, only users who have accounts already listed in /etc/passwd can exploit this vulnerability, and the user needs to supply the account password as part of the attack. These requirements mean that exploitation by accounts listed only in LDAP (or some other NSS data source) or by system accounts without a valid password is not possible. Further analysis showed that the first vulnerability, CVE-2015-3245, is also due to a missing check in libuser. Qualys has disclosed full technical details in their security advisory posted to the oss-security mailing list.

Which system components are affected by these vulnerabilities?

libuser is a library, which means that in order to exploit it, a program which employs it must be used. From an attacker’s point of view, an ideal target program has the following properties:

  1. It uses libuser.
  2. It is SUID-root.
  3. It allows putting almost arbitrary content into /etc/passwd.

Without the third item, exploitation may still be possible, but it will be much more difficult. If the program is not SUID-root, a user will not have unlimited attempts to exploit the race condition. A survey of programs processing /etc/passwd and related files presents this picture:

  • passwd is SUID-root, but it uses PAM to change the password, which has custom code to modify /etc/passwd not affected by the race condition. The account locking functionality in passwd does use libuser, but it is restricted to root.
  • chsh from util-linux is SUID-root and uses libuser to change /etc/passwd (the latter depending on how util-linux was compiled), but it has fairly strict filters controlling what users can put into these files.
  • lpasswd, lchfn, lchsh and related utilities from libuser are not SUID-root.
  • userhelper (in the usermode package) and chfn (in the util-linux package) have all three qualifications: libuser-based, SUID-root, and lack of filters.

This is why userhelper and chfn are plausible targets for exploitation, and other programs such as passwd and chsh are not.

How can these vulnerabilities be addressed?

System administrators can apply updates from their operating system vendor. Details of affected Red Hat products and security advisories are available in the knowledge base article on the Red Hat Customer Portal. This security update changes libuser to apply additional checks to the values written to the user and group files (so that injecting newlines is no longer possible), and replaces the locking and file update code to follow the same procedures as the rest of the system. The first change is sufficient to prevent newline injection through userhelper as well, which means that only libuser needs to be updated. If software updates are not available or cannot be applied, it is possible to block access to the vulnerable functionality with a PAM configuration change: system administrators can edit the files /etc/pam.d/chfn and /etc/pam.d/chsh to deny access to non-root users by using pam_warn (for logging) and pam_deny:

#%PAM-1.0
auth       sufficient   pam_rootok.so
auth       required     pam_warn.so
auth       required     pam_deny.so
auth       include      system-auth
account    include      system-auth
password   include      system-auth
session    include      system-auth

This will prevent users from changing their login shells and their GECOS field. userhelper identifies itself to PAM as “chfn”, which means this change is effective for this program as well.

Acknowledgements

Red Hat would like to thank Qualys for reporting these vulnerabilities.

Update (2015-07-24): Clarified that chfn is affected as well and linked to Qualys security advisory.

Single sign-on with OpenConnect VPN server over FreeIPA

In March of 2015, version 0.10.0 of the OpenConnect VPN server was released. One of its main features is the addition of MS-KKDCP support and GSSAPI authentication. Putting the acronyms aside, this means that authentication against FreeIPA, which uses Kerberos, is greatly simplified for VPN users. Before explaining more, let’s first explore what the typical login process is on a VPN network.

Currently, with a typical VPN server or product, one needs to log in to the VPN server using a username-password pair, and then sign into the Kerberos realm using, again, a username-password pair. Many times exactly the same pair is used for both logins. That is, we have two independent secure authentication methods to log in to a network, one after the other, consuming the user’s time without necessarily increasing the security level. Can things be simplified to achieve single sign-on over the VPN? We believe so, and that’s why we combined the two independent authentications into a single authentication instance. The user logs into the Kerberos realm once and uses the obtained credentials to log in to the VPN server as well. That way, the necessary passwords are asked for only once, minimizing login time and frustration.

How is that done? If the user needs to connect to the VPN in order to access the Kerberos realm, how could they perform Kerberos authentication prior to that? To answer that question we’ll first explain the protocols in use. The protocol followed by the OpenConnect VPN server is HTTPS based; hence, any authentication method available for HTTPS is available to the VPN server as well. In this particular case, we take advantage of the SPNEGO and MS-KKDCP protocols. The former enables GSSAPI negotiation over HTTPS, thus allowing a Kerberos ticket to be used to authenticate to the server. The MS-KKDCP protocol allows an HTTPS server to behave as a proxy to a Kerberos Authentication Server, and that’s the key point which allows the user to obtain the Kerberos ticket over the VPN server protocol. Thus, the combination of the two protocols allows the OpenConnect VPN server to operate both as a proxy to the KDC and as a Kerberos-enabled service. Furthermore, the usage of HTTPS ensures that all transactions with the Kerberos server are protected using the OpenConnect server’s key, ensuring the privacy of the exchange. However, there is a catch: since the OpenConnect server is now a proxy for Kerberos messages, the Kerberos Authentication Server cannot see the real IPs of the clients, and thus cannot prevent a flood of requests which could cause a denial of service. To address that, we introduced a point system in the OpenConnect VPN server that bans IP addresses when they perform more than a pre-configured number of requests.

As a consequence, with the above setup the login process is simplified by reducing the steps required to log in to a network managed by FreeIPA. The user logs into the Kerberos Authentication Server, and the VPN to the FreeIPA-managed network is made available with no additional prompts.

Wouldn’t that reduce security? Isn’t it more secure to ask the user for one set of credentials to connect to the home network and a different set to access the services within it? That’s a valid concern. There are networks where this is indeed a good design choice, but in other networks it may not be. By stacking multiple authentication methods you may end up with users trying their various credentials at each login prompt, effectively training the less security-oriented among them to enter the passwords they were provided anywhere until something works. However, it is desirable to increase the authentication strength when coming from untrusted networks. For that, it is possible, and recommended, to configure FreeIPA to require a second factor authenticator (OTP) as part of the login process.

Another, equally important concern with single sign-on is that, without further measures, the VPN would remain accessible without re-authentication for the whole validity time of a Kerberos ticket. That is, given the long lifetime of Kerberos tickets, how can we prevent a stolen laptop from being able to access the VPN? We address that by enforcing a configurable TGT lifetime limit on the VPN server. This way, VPN authentication will only occur if the user’s ticket is fresh, and the user’s password will be required otherwise.

Setting everything up

The next paragraphs move from theory to practice and describe the minimum set of steps required to set up the OpenConnect VPN server and client with FreeIPA. At this point we assume that a FreeIPA setup is already in place and that a realm named KERBEROS.REALM exists. See the Fedora FreeIPA guide for information on how to set up FreeIPA.

Server side: Fedora 22, RHEL7

The first step is to install the latest release from the 0.10.x branch of the OpenConnect VPN server (ocserv) on the server system. You can use the following command. On RHEL7 you will also need to set up the EPEL7 repository.

yum install -y ocserv

That will install the server in an unconfigured state. The server uses a single configuration file, found in /etc/ocserv/ocserv.conf, which contains several directives documented inline. To allow authentication with Kerberos tickets as well as with a password (e.g., for clients that cannot obtain a ticket, such as clients on mobile phones), you need to enable both PAM and GSSAPI authentication with the following two lines in the configuration file.

auth = pam
enable-auth = gssapi[tgt-freshness-time=360]

The option ‘tgt-freshness-time’, available with OpenConnect VPN server 0.10.5, specifies the lifetime, in seconds, for which a Kerberos ticket (TGT) is considered valid for VPN authentication. A user will have to re-authenticate if this time is exceeded. In effect, that prevents a single Kerberos ticket from being usable for VPN access for its whole lifetime.

The following line will enable the MS-KKDCP proxy on ocserv. You’ll need to replace KERBEROS.REALM with your realm and KDC-IP-ADDRESS with the address of your KDC.

kkdcp = /KdcProxy KERBEROS.REALM tcp@KDC-IP-ADDRESS:88

Note that for PAM authentication to operate you will also need to set up an /etc/pam.d/ocserv file. We recommend using SSSD (pam_sss) for that, although the file can contain anything that best suits the local policy. An example of an SSSD PAM configuration is shown in the Fedora Deployment Guide.
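
As an illustration only, a minimal /etc/pam.d/ocserv that delegates to SSSD might look like the following sketch; adjust it to whatever best suits the local policy:

#%PAM-1.0
auth     required     pam_sss.so
account  required     pam_sss.so
password required     pam_sss.so
session  optional     pam_sss.so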

The remaining options in ocserv.conf concern the VPN network setup; the comments in the default configuration file should be self-explanatory. At minimum you’ll need to specify a range of IPs for the VPN network, the addresses of the DNS servers, and the routes to push to the clients. At this point the server can be run with the following commands.

systemctl enable ocserv
systemctl start ocserv

The status of the server can be checked using “systemctl status ocserv”.

Client side: Fedora 21, RHEL7

The first step is to install the OpenConnect VPN client, named openconnect, on the client system. The version must be 7.05 or later. On RHEL7 you will need to set up the EPEL7 repository.

yum install -y openconnect NetworkManager-openconnect

Set up Kerberos to use ocserv as its KDC. For that you’ll need to modify /etc/krb5.conf to contain the following:

[realms]
KERBEROS.REALM = {
    kdc = https://ocserv.example.com/KdcProxy
    http_anchors = FILE:/path-to-your/ca.pem
    admin_server = ocserv.example.com
    auth_to_local = DEFAULT
}

[domain_realm]
.kerberos.test = KERBEROS.REALM
kerberos.test = KERBEROS.REALM

Note that ocserv.example.com should be replaced with the DNS name of your server, and /path-to-your/ca.pem should be replaced with a PEM-formatted file which holds the server’s Certificate Authority. For the kdc option, the server’s DNS name is preferred to an IP address in order to simplify server name verification for the Kerberos libraries. At this point you should be able to use kinit to authenticate and obtain a ticket from the Kerberos Authentication Server. Note, however, that kinit is very terse in the errors it prints, and a server certificate verification error will not be easy to debug. Ensure that the http_anchors file is in PEM format, that it contains the Certificate Authority that signed the server’s certificate, and that the server’s certificate DNS name matches the DNS name set up in the file. Note also that this approach requires the user to always go through OpenConnect’s KDC proxy. To avoid that restriction, and allow the user to use the KDC directly when on the LAN, we are currently working towards auto-discovery of the KDC.

Then, at a terminal run:

$ kinit

If the command succeeds, a ticket is obtained, and at this point you will be able to set up openconnect from the NetworkManager GUI and connect using your Kerberos credentials. To set up the VPN via NetworkManager: from the system menu select VPN under Network Settings and add a new network of type “Cisco AnyConnect Compatible VPN (openconnect)”. In the Gateway field fill in the server’s DNS name, add the server’s CA certificate, and that’s all that is required.

To use the command line client with Kerberos, the following trick is recommended. It creates a tun device up front so that the openconnect client can then run as a normal user rather than via sudo; running the client under sudo would prevent access to the user’s Kerberos credentials.

$ sudo ip tuntap add vpn0 mode tun user my-user-name
$ openconnect server.example.com -i vpn0

Client side: Windows

A Windows client for OpenConnect VPN is available at this web site. Its setup, similarly to NetworkManager, requires setting the server’s DNS name and its certificate. Configuring Windows for use with FreeIPA is outside the scope of this text, but more information can be found in this FreeIPA manual.

Conclusion

A single sign-on solution using FreeIPA and the OpenConnect VPN has many benefits. The core optimization, a single login prompt for the user to authorize access to network resources, saves user time and frustration. It is important to note that these optimizations are possible because VPN access is made part of the deployed infrastructure rather than an afterthought. With careful planning, an OpenConnect VPN solution can provide a secure and easy approach to network authentication.

The hidden costs of embargoes

It’s 2015 and it’s pretty clear the Open Source way has largely won as a development model for large and small projects. But when it comes to security we still practice a less-than-open model of embargoes with minimal or, in some cases, no community involvement. With the transition to more open development tools, such as Gitorious and GitHub, it is now time for the security process to change and become more open.

The problem

In general the argument for embargoes consists of “we’ll fix these issues in private and release the update in a coordinated fashion in order to minimize the window in which attackers know about the issue but no update is available.” So why not reduce the risk of security issues being exploited by simply embargoing all security issues, fixing them privately, and coordinating vendor updates? Well, for one thing, we can’t be certain that an embargoed issue is known only to the people who found and reported it and the people working on it. If one person is able to find the flaw, a second person can find it too. This exact scenario happened with the high profile OpenSSL “Heartbleed” issue, which was found independently by an engineer at Codenomicon and by the Google security team.

Additionally, there is a mismatch between how Open Source software is built and how security flaws are handled in it. Open Source development, testing, QA and distribution of the software mostly happen in public now. Most Open Source projects have public source code trees that anyone can view and, in many cases, submit change requests to. Many projects have also grown in size, not only code-wise but developer-wise, and some now involve hundreds or even thousands of developers (OpenStack, the Linux kernel, etc.). It is clear that the Open Source way has scaled and works well for these projects; the old-fashioned way of handling security flaws in secret has not scaled nearly as well.

Process and tools

The Open Source development method generally looks something like this: the project has a public source code repository that anyone can copy from, and that specific people can commit to. The project may have a continuous integration platform like Jenkins, followed by QE testing to make sure everything still works. Fewer and fewer projects have a private or hidden repository, partly because there is little benefit to doing so and partly because many providers do not allow private repositories on their free plans. The same applies to the continuous integration and testing environments used by many projects (especially when using free services). So handling a security issue in secret, without exposing it, means that many projects cannot use their existing infrastructure and processes, but must instead email patches around privately and do builds and testing on their own workstations.

Code expertise

But let’s assume that an embargoed issue has been discovered and reported to upstream, and only the original reporter and upstream know about it. Now we have to hope that between the reporter and upstream there is enough time and expertise to properly research the issue and fully understand it. The researcher may have found the issue through fuzzing and may only have a test case that causes a crash, with no idea which section of code is affected or how. Alternatively, the researcher may know what code is affected but may not fully understand it or how it relates to other parts of the program, in which case the issue may be more severe than they think, less severe, or not exploitable at all. The upstream project may also not have the time or resources to understand the code properly: many of these people are volunteers, projects have turnover, and the person who originally wrote the code may be long gone. In this case making the issue public means that additional people, such as vendor security teams, can also participate in looking at the issue and helping to fully understand it.

Patch creation for an embargoed issue means only the researcher and upstream participate. The end result is often patches that are incomplete and do not fully address the issue. This happened with the Bash Shellshock issue (CVE-2014-6271), where the initial patch, and even subsequent patches, were incomplete, resulting in several more CVEs (CVE-2014-6277, CVE-2014-6278, CVE-2014-7169). For a somewhat complete listing of such examples, simply search the CVE database for “because of an incomplete fix for”.

So assuming we now have a fully understood issue and a correct patch, we still need to apply the patch and run the software through QA before release. If the issue is embargoed, this has to be done in secret. However, many Open Source projects use public or open infrastructure, or services which do not support selective privacy (a project is either public or private). Thus for many projects the patching and QA cycle must happen outside of their normal infrastructure. For some projects this may not even be possible; if they have a large Jenkins infrastructure, replicating it privately can be very difficult.

And once we finally have a fully patched and tested source code release, we may still need to coordinate the release with other vendors, which carries significant overhead and time constraints for all concerned. Obviously, if the issue is public, the entire effort spent on privately coordinating it is not needed, and that effort can be spent on other things such as ensuring the patches are correct and that they address the flaw properly.

The Fix

The answer to the embargo question is surprisingly simple: we should only embargo the issues that really matter, and even then use embargoes sparingly. Making an issue public gets more eyes on it: bringing in additional security experts who would not be aware of it under an embargo, rather than just the original researcher and the upstream project, increases the chances of the issue being properly understood and patched the first time around. And for the majority of lower severity issues (e.g. most XSS and temporary file vulnerabilities), attackers have little to no interest in them, so the cost of an embargo makes no sense. In short: why not treat most security bugs like normal bugs and get them fixed quickly and properly the first time around?