Secure distribution of RPM packages

This blog post looks at the final part of creating secure software: shipping it to users in a safe way. It explains how to use transport security and package signatures to achieve this goal.

yum versus rpm

There are two commonly used tools related to RPM package management, yum and rpm. (Recent Fedora versions have replaced yum with dnf, a rewrite with similar functionality.) The yum tool inspects package sources (repositories), downloads RPM packages, and makes sure that required dependencies are installed along with fresh package installations and package updates. yum uses rpm as a library to install packages. yum repositories are defined by .repo files in /etc/yum.repos.d, or by yum plugins for repository management (such as subscription-manager for Red Hat subscription management).

rpm is the low-level tool which operates on an explicit set of RPM packages. It provides both a set of command-line tools and a library to process RPM packages. In contrast to yum, rpm checks package dependencies but does not resolve violations automatically. This means that rpm typically relies on yum to tell it exactly what to do; the recipe for a change to a package set is called a transaction.

Securing package distribution at the yum layer resembles transport layer security. The rpm security mechanism is more like end-to-end security (in fact, rpm uses OpenPGP internally, which has traditionally been used for end-to-end email protection).

Transport security with yum

Transport security is comparatively easy to implement. The web server just needs to serve the package repository metadata (repomd.xml and its descendants) over HTTPS instead of HTTP. On the client, a .repo file in /etc/yum.repos.d has to look like this:

[gnu-hello]
name=gnu-hello for Fedora $releasever
baseurl=https://download.example.com/dist/fedora/$releasever/os/
enabled=1

$releasever expands to the Fedora version at run time (like “22”). By default, end-to-end security with RPM signatures is enabled (see the next section), but we will focus on transport security first.

yum will verify the cryptographic digests contained in the metadata files, so serving the metadata over HTTPS is sufficient, but offering the .rpm files over HTTPS as well is a sensible precaution. The metadata can instruct yum to download packages from absolute, unrelated URLs, so it is necessary to inspect the metadata to make sure it does not contain such absolute “http://” URLs. However, transport security with a third-party mirror network is quite meaningless, particularly if anyone can join the mirror network (as is the case with CentOS, Debian, Fedora, and others). Rather than attacking the HTTPS connections directly, an attacker could simply become part of the mirror network. There are two fundamentally different approaches to achieving some degree of transport security in spite of this.

Fedora provides a centralized, non-mirrored, Fedora-run metalink service which provides a list of active mirrors and the expected cryptographic digest of the repomd.xml files. yum uses this information to select a mirror and to verify that it serves the up-to-date, untampered repomd.xml. The chain of cryptographic digests is verified from there, eventually leading to verification of the .rpm file contents. This is how the long-standing Fedora bug 998 was eventually fixed.
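For reference, a stock Fedora repository definition consumes this metalink service instead of pointing at a fixed baseurl. A slightly simplified entry modeled on the fedora.repo file shipped with Fedora looks like this (the exact URL and options vary between releases):

[fedora]
name=Fedora $releasever - $basearch
metalink=https://mirrors.fedoraproject.org/metalink?repo=fedora-$releasever&arch=$basearch
enabled=1
gpgcheck=1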

Red Hat uses a different option to distribute Red Hat Enterprise Linux and its RPM-based products: a content distribution network managed by a trusted third party. Furthermore, the repositories provided by Red Hat use a separate public key infrastructure which is managed by Red Hat, so breaches in the browser PKI (that is, compromises of certificate authorities or misissued individual certificates) do not affect the transport security checks yum provides. Organizations that wish to implement something similar can use the sslcacert configuration switch of yum. This is also how Red Hat Satellite 6 implements transport security.

Transport security has the advantage that it is straightforward to set up (it is not more difficult than enabling HTTPS). It also guards against manipulation at a lower level, and will detect tampering before data is passed to complex file format parsers such as SQLite, RPM, or the XZ decompressor. However, end-to-end security is often more desirable, and we cover that in the next section.

End-to-end security with RPM signatures

RPM package signatures can be used to implement cryptographic integrity checks for RPM packages. This approach is end-to-end in the sense that the package build infrastructure at the vendor can use an offline or half-online private key (such as one stored in a hardware security module), and the final system which consumes these packages can directly verify the signatures because they are built into the .rpm package files. Intermediates such as proxies and caches (which are sometimes used to separate production servers from the Internet) cannot tamper with these signatures. In contrast, transport security protections are weakened or lost in such an environment.

Generating RPM signatures

To add an RPM signature to a .rpm file, you need to generate a GnuPG key first, using gpg --gen-key. Let’s assume that this key has the user ID “rpmsign@example.com”. We first export the public key part to a file in a special directory; otherwise, rpm will not be able to verify the signatures we create, because it uses the RPM database (and not the user’s GnuPG keyring) as its source of trusted signing keys:

$ mkdir $HOME/rpm-signing-keys
$ gpg --export -a rpmsign@example.com > $HOME/rpm-signing-keys/example-com.key

The name of the directory $HOME/rpm-signing-keys does not matter, but the name of the file containing the public key must end in “.key”. On Red Hat Enterprise Linux 7, CentOS 7, and Fedora, you may have to install the rpm-sign package, which contains the rpmsign program. The rpmsign command to create the signature looks like this:

$ rpmsign -D '_gpg_name rpmsign@example.com' --addsign hello-2.10.1-1.el6.x86_64.rpm
Enter pass phrase:
Pass phrase is good.
hello-2.10.1-1.el6.x86_64.rpm:

(On success, there is no output after the file name on the last line, and the shell prompt reappears.) The file hello-2.10.1-1.el6.x86_64.rpm is overwritten in place, with a variant that contains the signature embedded into the RPM header. The presence of a signature can be checked with this command:

$ rpm -Kv -D "_keyringpath $HOME/rpm-signing-keys" hello-2.10.1-1.el6.x86_64.rpm
hello-2.10.1-1.el6.x86_64.rpm:
    Header V4 RSA/SHA1 Signature, key ID de337997: OK
    Header SHA1 digest: OK (b2be54480baf46542bcf395358aef540f596c0b1)
    V4 RSA/SHA1 Signature, key ID de337997: OK
    MD5 digest: OK (6969408a8d61c74877691457e9e297c6)

If the output of this command contains “NOKEY” lines instead, like the following, it means that the public key in the directory $HOME/rpm-signing-keys has not been loaded successfully:

hello-2.10.1-1.el6.x86_64.rpm:
    Header V4 RSA/SHA1 Signature, key ID de337997: NOKEY
    Header SHA1 digest: OK (b2be54480baf46542bcf395358aef540f596c0b1)
    V4 RSA/SHA1 Signature, key ID de337997: NOKEY
    MD5 digest: OK (6969408a8d61c74877691457e9e297c6)

Afterwards, the RPM files can be distributed as usual and served over HTTP or HTTPS, as if they were unsigned.

Consuming RPM signatures

To enable RPM signature checking in yum explicitly, the yum repository file must contain a gpgcheck=1 line, as in:

[gnu-hello]
name=gnu-hello for Fedora $releasever
baseurl=https://download.example.com/dist/fedora/$releasever/os/
enabled=1
gpgcheck=1

Once signature checks are enabled in this way, package installation will fail with a NOKEY error until the key used to sign the .rpm files in the repository is added to the system RPM database. This can be achieved with a command like this:

$ rpm --import https://download.example.com/keys/rpmsign.asc

The file needs to be transported over a trusted channel, hence the use of an https:// URL in the example. (It is also possible to instruct the user to download the file from a trusted web site, copy it to the target system, and import it directly from the file system.) Afterwards, package installation works as before.

After a key has been imported, it will appear in the output of the “rpm -qa” command:

$ rpm -qa | grep ^gpg-pubkey-
…
gpg-pubkey-ab0e12ef-de337997
…

More information about the key can be obtained with “rpm -qi gpg-pubkey-ab0e12ef-de337997”, and the key can be removed again using “rpm --erase gpg-pubkey-ab0e12ef-de337997”, just as if it were a regular RPM package.

Note: Package signatures are only checked by yum if the package is downloaded from a repository for which checking is enabled. This happens if the package is specified as a name or name-version-release on the yum command line. If the yum command line names a file or URL instead, or if the rpm command is used, no signature check is performed in current versions of Red Hat Enterprise Linux, Fedora, or CentOS.

Issues to avoid

When publishing RPM software repositories, the following should be avoided:

  1. The recommended yum repository configuration uses baseurl lines containing http:// URLs.
  2. The recommended yum repository configuration explicitly disables RPM signature checking with gpgcheck=0.
  3. There are optional instructions to import RPM keys, but these instructions do not tell the system administrator to remove the gpgcheck=0 line from the default yum configuration provided by the independent software vendor.
  4. The recommended “rpm --import” command refers to the public key file using an http:// URL.

The first three deficiencies in particular open the system up to a straightforward man-in-the-middle attack on package downloads. An attacker can replace the repository or RPM files while they are downloaded, thus gaining the ability to execute arbitrary commands when they are installed. As outlined in the article on the PKI used by the Red Hat CDN, some enterprise networks perform TLS interception, and HTTPS downloads will fail there. This possibility is not sufficient to justify weakening package authentication for all customers, such as recommending http:// instead of https:// URLs in the yum configuration. Similarly, some customers do not want to perform the extra step involving “rpm --import”, but again, this is not an excuse to disable verification for everyone, as long as RPM signatures are actually available in the repository. (Some software delivery processes make it difficult to create such end-to-end verifiable signatures.)

Summary

If you are creating a repository of packages, you should give your users a secure way to consume them. You can do this by following these recommendations:

  • Use https:// URLs everywhere in configuration advice regarding RPM repository setup for yum.
  • Create a signing key and use it to sign RPM packages, as outlined above.
  • Make sure RPM signature checking is enabled in the yum configuration.
  • Use an https:// URL to download the public key in the setup instructions.

We acknowledge that package signing might not be possible for everyone, but software downloads over HTTPS are straightforward to implement and should always be used.

MVEL as an attack vector

Java-based expression languages provide significant flexibility when using middleware products such as a Business Rules Management System (BRMS). This flexibility comes at a price, as there are significant security concerns in their use. This article uses MVEL in JBoss BRMS to demonstrate some of the problems; other products might be exposed to the same risks.

MVEL is an expression language, mostly used for making basic logic available in application-specific languages and configuration files, such as XML. It is not intended for serious object-oriented programming, just for simple expressions such as “data.value == 1”. On the surface it does not look like something inherently dangerous.

JBoss BRMS is a middleware product designed to implement business rules. The open source counterpart of JBoss BRMS is called drools. The product is intended to allow businesses (especially financial ones) to implement the decision logic used in their organization’s operations. The product contains a rules repository, an execution engine, and some authoring tools. The business rules themselves are written in the drools rule language. An interesting approach was chosen for the implementation of the drools rule language: it is compiled into MVEL for execution, and it allows the use of MVEL expressions directly where expressions are applicable.

There is, however, an implementation detail that makes MVEL usage in middleware products a security concern. MVEL is compiled into plain Java and, as such, allows access to any Java objects and methods that are available to the hosting application. It was initially intended as an expression language that allowed simple programmatic expressions in otherwise non-programmatic configuration files, so this was never a concern: configuration files are usually editable only by the site administrators anyway, so from a security perspective, adding an expression to a config file is not much different from adding a call to a Java class of an application and deploying it.

The same was true for BRMS up to version 5: any drools rule would be deployed as a separate file in the repository, so code in drools rules could only be deployed by authorized personnel, usually as part of a company workflow involving code review and similar procedures.

This changed in BRMS (and BPMS) 6. A new WYSIWYG tool was introduced that allowed constructing the rules graphically in a browser session and testing them right away, so any person with rule authoring permissions (a role known as “analyst” rather than “admin”) would be able to do this. The drools rules allow writing arbitrary MVEL expressions, which in turn allow unrestricted calls to any Java classes deployed on the application server, including the system ones. This means an analyst would be able to write System.exit() in a rule, and testing that rule would shut down the server! Basically, the graphical rule editor allowed authenticated arbitrary code execution for non-admin users.

A similar problem existed in JBoss Fuse Service Works 6. The drools engine that ships with it does not come with any graphical tool to author rules, so the rules must be deployed on the server as before, but it comes with the RTGov component, which has some MVEL interfaces exposed. Sending an RTGov request with an MVEL expression in it would again allow authenticated arbitrary code execution for any user that has RTGov permissions.

This behaviour was caught early in the development cycle for BxMS/FSW version 6, and a fix was implemented. The fix involves running the application server with the Java Security Manager (JSM) turned on and adding extra configuration files for MVEL-only security policies. With the fix applied, only a limited number of Java classes, safe for use in legitimate drools rules and RTGov interfaces, were allowed inside MVEL expressions, and the specific remote code execution vulnerability was considered solved.

Further problems arose when the products went into testing with the fix applied and regression tests were run. It was discovered that making the JSM-based fix the default setup for production servers was not a good idea: the servers ran slow. Very slow. Resource consumption was excessive and performance suffered dramatically. It became obvious that making the MVEL/JSM fix the default for high-performance production environments was not an option.

A solution was found after considerable consultation between Development, QE, and Project Management. The following proposals were made for any company running BRMS:

  • When deploying BRMS/BPMS on a high-performance production server, it is suggested to disable JSM, but at the same time not to allow any “analyst”-role users to use these systems for rule development. These servers should be used only for running the rules and applications developed elsewhere, achieving maximum performance while eliminating the vulnerability by disallowing rule development, and with it the whole attack vector, altogether.
  • When BRMS is deployed on development servers used by rule developers and analysts, it is suggested to run these servers with JSM enabled. Since these are not production servers processing real-time customer data, they do not require mission-critical performance; they are only used for application and rule development. As such, a small sacrifice in performance on a non-mission-critical server is a fair trade-off for a tighter security model.
  • The toughest situation arises when a server is deployed in a “BRMS-as-a-service” configuration, in other words, when rule development is exposed to customers over the web (even through a VPN-protected extranet). In this case there is no choice but to enable complete JSM protection and accept the consequences of the performance hit. Without it, any customer with minimal “rule writing and testing” privileges could completely take over the server (and any other co-hosted customers’ data as well), a very undesirable result.

Similar solutions are recommended for FSW. Since only RTGov exposes the weakness, it is recommended to run RTGov as a separate server with JSM enabled. For high-performance production servers, it is recommended not to install or enable the RTGov component; this eliminates the exposure to MVEL-based attack vectors and makes it possible to run these servers without JSM at full speed.

Other approaches are being considered by the development team for a new implementation of the MVEL fix in future BRMS versions. One such idea was to run a dedicated MVEL-only application server under JSM, separate from the main application server that runs all other parts of the applications, but other proposals were discussed as well. Stay tuned for more information once the decisions are made.

Remote code execution via serialized data

Most programming languages contain features that, used correctly, are incredibly powerful, but, used incorrectly, can be incredibly dangerous. Serialization (and deserialization) is one such feature, available in most modern programming languages. As mentioned in a previous article:

“Serialization is a feature of programming languages that allows the state of in-memory objects to be represented in a standard format, which can be written to disk or transmitted across a network.”

So why is deserialization dangerous?

Serialization and, more importantly, deserialization of data are unsafe due to the simple fact that the data being processed is implicitly trusted as being “correct.” So if you accept data such as program variables from an untrusted source, you make it possible for an attacker to control program flow. Additionally, many programming languages now support serialization not just of data (e.g. strings, arrays, etc.) but also of code objects. For example, with Python pickle() you can actually serialize user-defined classes: you can take a section of code, ship it to a remote system, and have it executed there.

Of course this means that anyone with the ability to send a serialized object to such a system can now execute arbitrary code easily, with the full privileges of the program running it.
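To make this concrete, here is a minimal Python sketch of the pickle problem. The class name and command are made up for illustration; the point is that deserializing attacker-supplied bytes runs code chosen by the attacker:

import os
import pickle

class Malicious:
    # pickle calls __reduce__ to learn how to rebuild the object;
    # returning (os.system, ("id",)) makes deserialization run a command
    def __reduce__(self):
        return (os.system, ("id",))

payload = pickle.dumps(Malicious())   # what an attacker would send
pickle.loads(payload)                 # merely loading the data runs "id"

Here the command is harmless, but it could just as easily download and execute arbitrary code with the full privileges of the deserializing process.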

Some examples of failure

Unlike many classes of security vulnerabilities, you cannot really create a deserialization flaw by accident. Memory management flaws, for example, can easily occur through a single off-by-one calculation or the misuse of a variable type, but the only way to create a deserialization flaw is to use deserialization. Some quick examples of failure include:

CVE-2012-4406 – OpenStack Swift (an object store) used Python pickle() to store metadata in memcached (which is a simple key/value store and does not support authentication), so an attacker with access to memcached could cause arbitrary code execution on all the servers using Swift.

CVE-2013-2165 – In JBoss’s RichFaces ResourceBuilderImpl.java, the classes which could be called were not restricted, allowing an attacker to interact with classes that could result in arbitrary code execution.

There are many more examples spanning virtually every major OS and platform vendor, unfortunately. Please note that virtually every modern language includes serialization facilities that are not safe to use by default (Perl Storable, Ruby Marshal, etc.).

So how do we serialize safely?

The simplest way to serialize and deserialize data safely is to use a format that does not include support for code objects. Your best bet for serializing almost all forms of data safely in a widely supported format is JSON. And when I say widely supported, I mean everything from Cobol and Fortran to Awk, Tcl and Qt. JSON supports key:value pairs, arrays, and within these a wide variety of data types, including strings, numbers, objects (JSON objects), arrays, true, false and null. JSON objects can contain additional JSON objects, so you can, for example, serialize a number of things into discrete JSON objects and then combine them into a single large JSON document (using an array, for example).
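As a simple illustration (the record shown is hypothetical), the same round trip with Python’s standard json module only ever reproduces plain data, never code:

import json

record = {"user": "alice", "groups": ["wheel", "audio"], "active": True}

encoded = json.dumps(record)   # serialize to a plain-text JSON string
decoded = json.loads(encoded)  # deserialize: only dicts, lists, strings,
                               # numbers, booleans and None can come back
assert decoded == record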

Legacy code

But what if you are dealing with legacy code and can’t convert to JSON? On the receiving (deserializing) end, you can attempt to monkey patch the code to restrict the objects allowed in the serialized data. However, most languages do not make this very easy or safe, and a determined attacker will be able to bypass such restrictions in most cases. An excellent paper from BlackHat USA 2011 covers any number of clever techniques to exploit Python pickle().
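In Python, for example, one such restriction can be implemented by subclassing pickle.Unpickler and overriding find_class, along the lines of the “restricting globals” example in the Python documentation. The allow-list below is purely illustrative, and, as noted above, filters like this are fragile and are not a substitute for a safe format:

import builtins
import io
import pickle

ALLOWED_BUILTINS = {"range", "list", "dict", "set", "frozenset"}   # hypothetical allow-list

class RestrictedUnpickler(pickle.Unpickler):
    def find_class(self, module, name):
        # resolve only a small allow-list of harmless built-ins
        if module == "builtins" and name in ALLOWED_BUILTINS:
            return getattr(builtins, name)
        raise pickle.UnpicklingError("%s.%s is not allowed" % (module, name))

def restricted_loads(data):
    return RestrictedUnpickler(io.BytesIO(data)).load()

print(restricted_loads(pickle.dumps([1, 2, 3])))   # plain data still loads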

What if you need to serialize code objects?

But what if you actually need to serialize and deserialize code objects? Since it is impossible to determine whether code is safe or not, you have to trust the code you are running. One way to establish that the code has not been modified in transit and that it comes from a trusted source is to use code signing. Code signing is very difficult to do correctly and very easy to get wrong. For example you need to:

  1. Ensure the data is from a trusted source
  2. Ensure the data has not been modified, truncated or added to in transit
  3. Ensure that the data is not being replayed (e.g. sending valid code objects out of order can result in manipulation of the program state)
  4. Ensure that if data is blocked (e.g. blocking code that should be executed but is not, leaving the program in an inconsistent state) you can return to a known good state

These are just a few of the major concerns. Creating a trusted framework for remote code execution is outside the scope of this article; however, a number of such frameworks exist.
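As a rough sketch of the first two requirements only (origin and integrity under a pre-shared secret, with no replay protection or error recovery), a message authentication code can be attached to the serialized payload before it is sent and verified before anything is deserialized. The key, payload and framing below are all made up for illustration:

import hashlib
import hmac
import json

SECRET_KEY = b"shared-secret-managed-elsewhere"   # hypothetical pre-shared key

def sign(payload):
    # prepend a hex-encoded HMAC-SHA256 tag, separated by a newline
    tag = hmac.new(SECRET_KEY, payload, hashlib.sha256).hexdigest().encode()
    return tag + b"\n" + payload

def verify_and_load(blob):
    tag, _, payload = blob.partition(b"\n")
    expected = hmac.new(SECRET_KEY, payload, hashlib.sha256).hexdigest().encode()
    if not hmac.compare_digest(tag, expected):
        raise ValueError("integrity check failed; refusing to deserialize")
    return json.loads(payload.decode())

blob = sign(json.dumps({"job": "rebuild-index"}).encode())
print(verify_and_load(blob))

A real framework would also have to address key distribution, replay protection, and recovery from blocked or reordered messages, which is why using an existing, reviewed framework is preferable.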

Conclusion

If data must be transported in a serialized format, use JSON. At the very least this will ensure that you have access to high-quality libraries for parsing the data, and that code cannot be directly embedded as it can with other formats such as Python pickle(). Additionally, you should ideally encrypt and authenticate the data if it is sent over a network: an attacker that can manipulate program variables can almost certainly modify the program execution in a way that allows privilege escalation or other malicious behavior. Finally, you should authenticate the data and prevent replay attacks (e.g. where the attacker records and re-sends a previous session’s data); chances are that if you are using JSON you can simply wrap the session in TLS with an authentication layer (such as certificates, or usernames and passwords, or tokens).

libuser vulnerabilities

Updated 2015-07-24 @ 12:33 UTC

It was discovered that the libuser library contains two vulnerabilities which, in combination, allow unprivileged local users to gain root privileges. libuser is a library that provides read and write access to files like /etc/passwd, which constitute the system user and group database. On Red Hat Enterprise Linux it is a central system component.

What is being disclosed today?

Qualys reported two vulnerabilities:

  • CVE-2015-3245: a missing check in libuser allows field values containing newline characters (for example, supplied through GECOS field changes) to be written into /etc/passwd.
  • CVE-2015-3246: libuser modifies /etc/passwd directly, and its locking and file update code contains a race condition that can leave the file in an inconsistent state.

It turns out that the CVE-2015-3246 vulnerability, by itself or in conjunction with CVE-2015-3245, can be exploited by an unprivileged local user to gain root privileges on an affected system. However, due to the way libuser works, only users who have accounts already listed in /etc/passwd can exploit this vulnerability, and the user needs to supply the account password as part of the attack. These requirements mean that exploitation by accounts listed only in LDAP (or some other NSS data source) or by system accounts without a valid password is not possible. Further analysis showed that the first vulnerability, CVE-2015-3245, is also due to a missing check in libuser. Qualys has disclosed full technical details in their security advisory posted to the oss-security mailing list.

Which system components are affected by these vulnerabilities?

libuser is a library, which means that in order to exploit it, a program which employs it must be used. Ideally, such a program has the following properties:

  1. It uses libuser.
  2. It is SUID-root.
  3. It allows putting almost arbitrary content into /etc/passwd.

Without the third item, exploitation may still be possible, but it will be much more difficult. If the program is not SUID-root, a user will not have unlimited attempts to exploit the race condition. A survey of programs processing /etc/passwd and related files presents this picture:

  • passwd is SUID-root, but it uses PAM to change the password, and PAM has custom code to modify /etc/passwd that is not affected by the race condition. The account locking functionality in passwd does use libuser, but it is restricted to root.
  • chsh from util-linux is SUID-root and (depending on how util-linux was compiled) uses libuser to change /etc/passwd, but it has fairly strict filters controlling what users can put into these files.
  • lpasswd, lchfn, lchsh and related utilities from libuser are not SUID-root.
  • userhelper (in the usermode package) and chfn (in the util-linux package) have all three qualifications: libuser-based, SUID-root, and lack of filters.

This is why userhelper and chfn are plausible targets for exploitation, and other programs such as passwd and chsh are not.

How can these vulnerabilities be addressed?

System administrators can apply updates provided by their operating system vendor. Details of affected Red Hat products and security advisories are available in the knowledge base article on the Red Hat Customer Portal. This security update changes libuser to apply additional checks to the values written to the user and group files (so that injecting newlines is no longer possible), and replaces the locking and file update code to follow the same procedures as the rest of the system. The first change is sufficient to prevent newline injection with userhelper as well, which means that only libuser needs to be updated.

If software updates are not available or cannot be applied, it is possible to block access to the vulnerable functionality with a PAM configuration change. System administrators can edit the files /etc/pam.d/chfn and /etc/pam.d/chsh and block access for non-root users by using pam_warn (for logging) and pam_deny:

#%PAM-1.0
auth       sufficient   pam_rootok.so
auth       required     pam_warn.so
auth       required     pam_deny.so
auth       include      system-auth
account    include      system-auth
password   include      system-auth
session    include      system-auth

This will prevent users from changing their login shells and their GECOS field. userhelper identifies itself to PAM as “chfn”, which means this change is effective for this program as well.

Acknowledgements

Red Hat would like to thank Qualys for reporting these vulnerabilities.

Update (2015-07-24): Clarified that chfn is affected as well and linked to Qualys security advisory.