CVE-2024-11616: Netskope EPDLP Double-Fetch

Around a year ago, I decided to orient myself more keenly towards vulnerability research. It had become apparent to me that it was the area of offensive security, and the process, that I had enjoyed dipping my toes into the most. I spent much of the last year shoring up fundamentals - finishing Ret2Systems’ excellent Wargames, a bunch of OST2 modules, and the Trainsec Kernel Programming modules - before jumping into some real world targets and starting to develop my workflow. This blog post documents the first vulnerability that I found and reported, CVE-2024-11616, in a driver created by Netskope. The bug is a pretty typical double-fetch, and requires elevated privileges to exploit, but due to the risk of Bring Your Own Vulnerable Driver (BYOVD) attacks, these bugs are still pretty important for vendors to catch. Despite it being a simple bug, successfully reporting something was a milestone I was eager to hit, so a small writeup seemed justified!

Target

For those unfamiliar, Netskope is a leading provider in the Cloud Security and Data Protection space. Like most security companies now, they offer a platform consisting of a multitude of different products and tools. Chief among these are their Data Loss Prevention (DLP) and Security Service Edge (SSE) products. By their own account, they are used by at least one third of the Fortune 100.

The vulnerability I’ll be covering was found in the “Endpoint DLP” (EPDLP) feature, an optional add-on in the Netskope client. EPDLP aims to provide visibility and control over files moving between a host and a storage device - specifically a USB storage device, printer, network file share, or device connected over Bluetooth. EPDLP has two constituent parts - Device Control and Content Control. Device Control policies determine whether a device should be allowed, blocked, or made read-only. Content Control policies house more of the actual DLP functionality, allowing you to permit or block file transfers based on various criteria. For example, you may decide to only permit certain file types to be written to the drive, or to only allow transfers of files downloaded from specific applications.

On Windows, these two different parts of EPDLP are implemented in separate drivers - Device Control in epdlp_dev_ctrl.sys and Content Control in epdlpdrv.sys. The bug was found in the Content Control driver, which unsurprisingly has a much larger attack surface than its Device Control counterpart.

epdlpdrv.sys is a particular type of Windows driver known as a file system minifilter. These drivers allow you to read, intercept, and modify any I/O request making its way to the actual file system driver. Minifilter drivers don’t sit directly in the driver stack, but are instead registered with the Filter Manager (FltMgr), which Microsoft introduced to simplify the process of creating file system filter drivers. For a proper overview, I would strongly recommend reading the inimitable James Forshaw’s article on researching filesystem mini-filters, a post that I returned to dozens of times whilst reverse engineering the driver.

Filter Communication Port Background

If you are familiar with Windows drivers, you can probably skip this bit. I’ll try to retread only the parts of James Forshaw’s post that are most relevant to the bug.

The Filter Manager introduces a unique method of communication that minifilter drivers can use - Filter Communication Ports. These communication ports are provided by the Filter Manager to support bidirectional communication between a user-mode app and a kernel-mode minifilter.

Paraphrasing from the docs, communication is established like so:

The minifilter driver calls FltCreateCommunicationPort to create the listening server port. Included in its parameters are a security descriptor that is applied to the communication port object, as well as three callback functions: ConnectNotifyCallback, DisconnectNotifyCallback, and MessageNotifyCallback. Once the port is created, the minifilter begins listening for incoming connections.
A user-mode app calls FilterConnectCommunicationPort to attempt to connect to the port. The Filter Manager calls the ConnectNotifyCallback, passing it a handle to the newly created connection. When the callback completes, the Filter Manager passes the user-mode caller a separate file handle that represents the user-mode endpoint to the connection.
The user-mode application can then call FilterSendMessage to deliver raw buffers to the MessageNotifyCallback that was supplied when the port was created.

The typical way to communicate with a driver from userland (at least with any flexibility) is to use DeviceIoControl, which, without going off on too much of a tangent, involves passing a 4-byte value known as an I/O Control Code (IOCTL). This value is passed in the I/O Request Packet (IRP) to a callback function that the driver registers when it is loaded. This callback will usually then have a switch statement on the IOCTL that calls different functions depending on its value.

Why use a communication port over the typical non-minifilter driver method of communication, DeviceIoControl? According to Microsoft, it’s faster and more efficient because it isn’t buffered, has more granular security controls due to security descriptors being attached to the port objects (rather than access being tied to the device object), and provides message queueing. James Forshaw sheds some light on why this might be: "…under the hood it’s implemented using the device IO control code 0x8801B. As this code uses the METHOD_NEITHER method means the InputBuffer and OutputBuffer parameters are pointers into user-mode memory. The filter manager does check them before calling the callback with ProbeForRead and ProbeForWrite calls."

This is very relevant for the bug we are going to look at. One of the bits of data used to construct the IOCTL is a Method value that indicates to the I/O Manager how we want to handle the input and output buffers passed in the DeviceIoControl call. The options are as follows:

METHOD_BUFFERED - Causes the I/O Manager to create a buffer in kernelspace that the input buffer is copied to. The driver works with this buffer instead of touching the original buffer in userspace, and then the driver writes output to the same kernel buffer, which the I/O Manager then copies back to the output buffer passed in the DeviceIoControl call. From what I’ve seen, this is the most common case. Documented here
METHOD_IN_DIRECT and METHOD_OUT_DIRECT - For both of these options, the input buffer is copied to a kernelspace buffer, as with METHOD_BUFFERED. For the second buffer supplied, the output buffer, the I/O Manager locks the physical memory so it can’t be paged out, and prepares a Memory Descriptor List (MDL) describing that userspace memory. If METHOD_IN_DIRECT has been used, the I/O Manager will probe to ensure the executing thread has read access to the buffer (as it is intended to be used as another input buffer). If METHOD_OUT_DIRECT is used, the I/O Manager will probe to ensure the buffer is writable. Documented here
METHOD_NEITHER - The sketchiest option, whereby the I/O Manager just… does nothing. With this method, the driver is responsible for validating the pointers supplied for the input/output buffer. It’s the fastest because the buffers aren’t copied or mapped anywhere, but the most error-prone. Documented here
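To make the Method field concrete, here is a minimal sketch that packs and unpacks an IOCTL value using the same bit layout as the WDK's CTL_CODE macro. The helper names (make_ioctl, decode_ioctl, ioctl_fields_t) are made up for illustration; only the bit positions and the METHOD_* values are taken from the Windows headers.

```c
#include <stdint.h>

/* Transfer methods, matching the values defined in the Windows headers. */
#define METHOD_BUFFERED   0u
#define METHOD_IN_DIRECT  1u
#define METHOD_OUT_DIRECT 2u
#define METHOD_NEITHER    3u

/* Hypothetical container for the four fields packed into an IOCTL. */
typedef struct {
    uint32_t device_type; /* bits 16-31 */
    uint32_t access;      /* bits 14-15 */
    uint32_t function;    /* bits 2-13  */
    uint32_t method;      /* bits 0-1   */
} ioctl_fields_t;

/* Same layout as CTL_CODE(DeviceType, Function, Method, Access). */
static uint32_t make_ioctl(uint32_t device_type, uint32_t function,
                           uint32_t method, uint32_t access)
{
    return (device_type << 16) | (access << 14) | (function << 2) | method;
}

/* Split an IOCTL back into its fields. */
static ioctl_fields_t decode_ioctl(uint32_t code)
{
    ioctl_fields_t f;
    f.device_type = code >> 16;
    f.access      = (code >> 14) & 0x3u;
    f.function    = (code >> 2) & 0xFFFu;
    f.method      = code & 0x3u;
    return f;
}
```

Decoding the 0x8801B control code mentioned in the quote above gives a Method of 3, which is METHOD_NEITHER, consistent with the buffers being raw user-mode pointers.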

The Filter Manager mitigates some of the risk of using METHOD_NEITHER by probing the addresses passed in the buffer arguments, checking they aren’t in kernelspace and that they are correctly aligned. However, because the input buffer isn’t copied to kernel memory, the risk of standard double-fetch bugs still remains. A double-fetch is a type of “Time-Of-Check to Time-Of-Use” bug, usually shortened to TOCTOU, a class of bug more generally caused by race conditions. The name is sort of self-explanatory - a program checks a value, usually validating that it falls within acceptable limits, and then later in the program the value is used again, on the assumption that it is still within acceptable limits. The race is whether the value can be changed in between being validated and being used for something.

When it comes to double-fetch bugs in drivers or the kernel, there is a common pattern:

An input buffer is supplied that exists in user memory
An address in that input buffer is dereferenced and some validation is performed
Later in the program, that same address in the input buffer is dereferenced again and used as a value in a function call (usually as a length or count of some other data)

The issue here is that the value is dereferenced from user memory on both occasions, and can theoretically be changed by the user that allocated the memory. The fix is simple - the driver needs to copy the value to a local variable, and perform both the validation and the actual use of the value using that variable rather than fetching it directly from the input buffer both times. For more information on double-fetches, there is an incredible whitepaper by Mateusz ‘j00ru’ Jurczyk and Gynvael Coldwind, where they find numerous double-fetches at scale across Windows using the Bochs emulator to analyse memory access patterns. They also cover exploitation and how the odds of winning the race condition can be made more favourable.
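The pattern and its fix can be sketched in portable C. This is a deterministic simulation rather than driver code: request_t, fetch_len, g_flip_mask and MAX_LEN are all hypothetical, and the "racing thread" is faked by flipping the length after every fetch so that the outcome is repeatable.

```c
#include <stddef.h>
#include <stdint.h>

#define MAX_LEN 16u /* hypothetical upper bound enforced by the check */

/* Stand-in for a request that lives in user memory. */
typedef struct {
    volatile uint32_t len; /* attacker-controlled, may change at any time */
} request_t;

/* Simulated attacker: XORs the length with this mask after every fetch. */
static uint32_t g_flip_mask = 0;

static uint32_t fetch_len(request_t *req)
{
    uint32_t value = req->len; /* one read of "user memory" */
    req->len ^= g_flip_mask;   /* the other thread changes it straight away */
    return value;
}

/* Vulnerable: the check and the use each fetch the length separately. */
static size_t length_used_vulnerable(request_t *req)
{
    if (fetch_len(req) > MAX_LEN) /* time of check: first fetch */
        return 0;
    return fetch_len(req);        /* time of use: second fetch */
}

/* Fixed: fetch once into a local, then check and use the local. */
static size_t length_used_fixed(request_t *req)
{
    uint32_t len = fetch_len(req);
    if (len > MAX_LEN)
        return 0;
    return len;
}
```

With the length starting at 8 and a flip mask of 8 ^ 48, the vulnerable function validates 8 but goes on to use 48, while the fixed function can only ever use the value it validated.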

As you’ve probably guessed, this pattern is what I found in the EPDLP driver - a double-fetch of user memory provided in the input buffer sent to a filter communication port call.

The Vulnerability

During initialisation in the DriverEntry function, two filter communication ports are created, EpdlpPort and EpdlpPort1. The first one, EpdlpPort, has a security descriptor returned from FltBuildDefaultSecurityDescriptor, resulting in the port only being accessible to users with system or administrator privileges. The second port, EpdlpPort1, uses the same security descriptor, but only after a call to RtlSetDaclSecurityDescriptor that nulls out the DACL on the security descriptor, effectively giving all users unconditional access to the port object.

The first communication port is restricted to privileged users because it is the interface exposing the administrative functionality for the Endpoint DLP service. It is called by the EPDLP userland service, epdlp.exe.

if (FltBuildDefaultSecurityDescriptor(&SecurityDescriptor, DesiredAccess: 0x1f0001) s< STATUS_SUCCESS)
	goto label_140021630
                        
OBJECT_ATTRIBUTES ObjectAttributes
ObjectAttributes.ObjectName = &commPortName
ObjectAttributes.SecurityDescriptor = SecurityDescriptor
ObjectAttributes.Length = 0x30
ObjectAttributes.RootDirectory = 0
ObjectAttributes.Attributes = 0x240
ObjectAttributes.SecurityQualityOfService = 0

FltCreateCommunicationPort(Filter: fltFilter, ServerPort: &serverPort1, &ObjectAttributes, ServerPortCookie: nullptr, ConnectNotifyCallback: CreateEpdlpHandle, DisconnectNotifyCallback: disconnectNotifyCallback1, MessageNotifyCallback: messageNotifyCallbackFunc, MaxConnections: 1)

The MessageNotifyCallback registered to the EpdlpPort communication port simply calls another function - which here I’ve just named messageNotify - passing only the PortCookie, InputBuffer, and InputBufferLen arguments from the callback.

Disassembly of messageNotify function

At 1, we can see that the messageNotify function dereferences the first 4 bytes of the InputBuffer and uses this value in a switch statement to determine which administrative function should be performed. Most of the options here relate to EPDLP functionality that isn’t enabled by default - the driver actually contains the capability to inject a DLL into userland processes via a Kernel APC, and there are administrative options here to manually trigger that process, or to update a list of excluded processes that shouldn’t be hooked.

In this case we care about the first option, as EpdlpSetUsbAction is where we find our double-fetch. At 2 we can see some initial validation checks on the arguments supplied, namely that the InputBuffer isn’t NULL, and that its length isn’t less than 4 bytes. Then at 3 in the case 1 block, there is also a comparison made between the InputBufferLen and a dereferenced value at InputBuffer+4, indicating that we likely have other length fields supplied by the user. The InputBuffer is then passed to EpdlpSetUsbAction as the only argument at 4.

Disassembly of EpdlpSetUsbAction function

The EPDLP driver maintains a global array of objects that describe mounted volumes on the system. The EpdlpSetUsbAction function, as the name suggests, allows the caller to update a value on a specific one of these objects, determining whether the USB is accessible. The caller supplies the ID (field name DeviceInstanceId) of the target volume as a wide char string in InputBuffer+0x10, along with a corresponding string length at InputBuffer+0xc. The function later loops through the global array of objects and compares the supplied ID in order to find the right one. The DeviceInstanceId contained in the object, however, is a UNICODE_STRING, so the EpdlpSetUsbAction function copies the supplied string into an allocated buffer and constructs the UNICODE_STRING object - and this is where the bug lies.
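The offsets mentioned so far can be collected into a rough picture of the message layout. This is a sketch reconstructed only from the offsets described in this post: the struct name and field names are invented, the purpose of the field at +0x08 is unknown, and the real structure may well contain more than this.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout of the message that reaches EpdlpSetUsbAction. */
typedef struct {
    uint32_t Command;                /* +0x00: switch value read by messageNotify  */
    uint32_t Length;                 /* +0x04: compared against InputBufferLen     */
    uint32_t Unknown08;              /* +0x08: purpose not covered in this post    */
    uint32_t DeviceInstanceIdLength; /* +0x0c: string length in bytes (the field
                                               that is fetched twice)              */
    uint16_t DeviceInstanceId[];     /* +0x10: wide-char device instance ID        */
} epdlp_set_usb_action_msg_t;
```

The string is declared as uint16_t rather than wchar_t because wide characters are 16 bits on Windows but not necessarily elsewhere.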

As shown at 1 in the EpdlpSetUsbAction disassembly, the supplied string length at InputBuffer+0xc is used as the NumberOfBytes argument in the ExAllocatePoolWithTag call. The returned pointer is assigned as the Buffer value in the new UNICODE_STRING. Then, RtlCopyMemory is called in order to copy the caller-supplied string into the newly allocated pool memory, and the Length argument is a second dereference of InputBuffer+0xc, noted as 2.

This is the double-fetch. If the ExAllocatePoolWithTag call executes with a small NumberOfBytes value, and that value has grown into a larger one by the time the subsequent RtlCopyMemory call fetches it again, we can overflow the pool allocation with user-supplied data.
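For contrast, here is a user-mode sketch of what a single-fetch version of that logic could look like. It is not Netskope's actual fix: build_device_id and unicode_string_t are hypothetical names, malloc and memcpy stand in for ExAllocatePoolWithTag and RtlCopyMemory, and a real driver would also need to guard the user-memory accesses with exception handling.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define ID_LENGTH_OFFSET 0x0cu /* byte length of the ID string */
#define ID_STRING_OFFSET 0x10u /* start of the wide-char ID string */

/* Simplified stand-in for the kernel's UNICODE_STRING. */
typedef struct {
    uint16_t  Length;
    uint16_t  MaximumLength;
    uint16_t *Buffer;
} unicode_string_t;

/* Returns 0 on success, -1 if the message is rejected. */
static int build_device_id(const uint8_t *input, size_t input_len,
                           unicode_string_t *out)
{
    uint32_t id_len;

    if (input == NULL || out == NULL || input_len < ID_STRING_OFFSET)
        return -1;

    /* Single fetch: capture the length into a local exactly once. */
    memcpy(&id_len, input + ID_LENGTH_OFFSET, sizeof id_len);

    /* Validate the captured copy against the buffer we were given. */
    if (id_len == 0 || id_len > UINT16_MAX ||
        id_len > input_len - ID_STRING_OFFSET)
        return -1;

    /* Allocation and copy both use the same captured value. */
    out->Buffer = malloc(id_len);
    if (out->Buffer == NULL)
        return -1;
    memcpy(out->Buffer, input + ID_STRING_OFFSET, id_len);

    out->Length = (uint16_t)id_len;
    out->MaximumLength = (uint16_t)id_len;
    return 0;
}
```

Because the allocation size and the copy size come from the same local variable, changing the value in the input buffer after the fetch no longer has any effect on either call.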

PoC

Exploiting a double-fetch for a PoC typically involves a few steps:

Allocate some memory to use as our input buffer
Create a “flipping” thread that runs a while loop that continually XORs the value of a global variable. This global variable will be the value in the buffer that is double-fetched by the driver. (For the purpose of a crash, we can simply XOR the value to make it alternate between a small number and a bigger one.)
In our main thread, run a function with another while loop that continually calls FilterSendMessage with our input buffer.
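The flipping-thread idea can be sketched portably with pthreads. This is not the published PoC: the double-fetching "driver" is simulated in the same process, all names (g_len, flipper, simulated_double_fetch, run_race, SMALL_LEN, LARGE_LEN) are hypothetical, and in a real PoC the flipped value would sit inside the input buffer while the main loop calls FilterSendMessage instead.

```c
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SMALL_LEN 0x10u   /* value we want the allocation to see */
#define LARGE_LEN 0x1000u /* value we want the copy to see */

static _Atomic uint32_t g_len = SMALL_LEN; /* the double-fetched field */
static atomic_bool g_stop = false;

/* Flipping thread: XOR toggles the field between SMALL_LEN and LARGE_LEN. */
static void *flipper(void *arg)
{
    (void)arg;
    while (!atomic_load(&g_stop))
        atomic_fetch_xor(&g_len, SMALL_LEN ^ LARGE_LEN);
    return NULL;
}

/* Stand-in for the driver: two separate fetches of the same field.
 * Returns 1 when the "copy" size exceeds the "allocation" size. */
static int simulated_double_fetch(void)
{
    uint32_t alloc_size = atomic_load(&g_len); /* first fetch  */
    uint32_t copy_size  = atomic_load(&g_len); /* second fetch */
    return copy_size > alloc_size;
}

/* Main loop: hammer the target while the flipper runs, count race wins. */
static unsigned run_race(unsigned attempts)
{
    pthread_t thread;
    unsigned wins = 0;

    atomic_store(&g_stop, false);
    if (pthread_create(&thread, NULL, flipper, NULL) != 0)
        return 0;
    for (unsigned i = 0; i < attempts; i++)
        wins += (unsigned)simulated_double_fetch();
    atomic_store(&g_stop, true);
    pthread_join(thread, NULL);
    return wins;
}
```

The number of wins returned by run_race varies from run to run, which is the nature of the race: each attempt only overflows if the flip lands between the two fetches, so the PoC simply keeps sending messages until one does.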

In this particular case, there is one other thing we need to do that I haven’t mentioned yet. Astute readers may have noticed earlier that the first FltCreateCommunicationPort call passes a MaxConnections argument with its value set to 1. This connection will already be in use by the EPDLP userland service. For us to be able to connect to the port and reach the double-fetch, we need to kill the service to free the connection. Seeing as we need elevated privileges to connect to the communication port, this isn’t much of a hurdle. If we didn’t need elevated privileges to call the communication port, it would have been worthwhile looking for a way to DoS/restart the EPDLP service as an unprivileged user in order to create a window where the communication port connection is available.

PoC code is here: https://github.com/inb1ts/CVE-2024-11616

Also shoutout to the Netskope PSIRT team who were always responsive and easy to work with throughout the reporting process!
