Security

Building the inverse: the design decisions behind a three-layer dropper

The same architecture from the other direction. Why a layered .NET dropper has three layers and not two or four, what each layer is solving for, and which defender assumptions the design quietly presupposes. A composite reference design drawn from the artefact dissected in Three layers deep and adjacent commodity infostealer campaigns through 2024 and 2025. Architecture and decisions, not source. Companion to the reverse-engineering writeup.

Arthur Dutra · 34 min read

Defining the framework

Reverse engineering reveals what an artefact does, though only rarely does it explain why the artefact is shaped the way it is. The decisions behind a layered dropper are seldom the obvious ones; the choices that matter are made under defender pressure: which detection surface to attack and which to concede, what cost to pay in execution complexity, and which telemetry must be absorbed because it cannot be evaded. The aim of what follows is to document those decisions.

The reference design is a composite, drawn from a class of three-layer .NET droppers observed in commodity infostealer campaigns through 2024 and 2025, inter alia the artefact dissected in Three layers deep. Its general shape consists of a batch script that serves as initial carrier, a PowerShell stage that loads shellcode into memory, and a Donut-packed .NET implant injected into a long-running host process. None of the individual techniques discussed below is novel; what merits attention is the architecture that combines them, and more precisely the trade-offs that bind one layer to the next.

Two clarifications are owed to the reader before the substantive material begins. The first is that what follows is not a build log. No source is provided for any layer; techniques are referenced by name rather than by implementation, and the prose is organised as a sequence of decision records rather than as code listings. The second is that this is not a defender's tutorial either. Although every section concludes with a defender read, those reads exist to render the architectural reasoning legible to both sides of the engagement, not to furnish ready-made detection rules.

The sections that follow treat each layer as a response to a particular defender capability. Layer one is shaped principally by content scanning and the dynamics of sandbox detonation, whereas layer two answers to the constraints imposed by AMSI and ScriptBlock logging. Layer three, situated furthest from the initial intrusion, must contend with memory scanning and the telemetry that surrounds process injection. The seven-byte memory marker examined in section five is a curiosity unto itself, because it solves not for the defender at all but for the operator's own self-inflicted noise. Section six closes the loop by enumerating what each layer presupposed about the defender, and by identifying the points at which those presuppositions break.

Section 1. The constraint set

The shape of a modern dropper is not chosen; it is forced. To explain why a layered architecture exists at all, one must begin not with the operator's ambitions but with the defender's surface, because the latter dictates what is possible and the former merely chooses among what remains.

The defender surface, considered as a whole, has changed substantively in the past decade. Static signature matching, although it once constituted the bulk of endpoint defence, now occupies the lowest rung of the detection stack, where it is consulted chiefly to discharge known-bad samples cheaply before more expensive analysis begins. Above it sits behavioural detection, instantiated in most environments as a commercial EDR product, which observes process behaviour qua behaviour, i.e. independently of the file from which the behaviour originated. Alongside behavioural detection on Windows hosts one finds script logging, comprising AMSI, ScriptBlock logging, and Module logging, which exposes the substance of interpreted code to the host operating system at the moment of evaluation and thereby renders compile-time obfuscation largely inert. Event Tracing for Windows makes available a stream of kernel-mediated events that a sufficiently invested defender can subscribe to, and which is increasingly consulted by EDR products as a primary telemetry source. The sandbox, treated as a logical artefact rather than as a particular product, applies a small amount of dynamic execution to a candidate sample in the hope that its true behaviour will manifest within a tractable budget of time and resources.

Each of these capabilities, considered in isolation, can be defeated by an operator with sufficient discipline. The compounding effect arises from the fact that they cannot be defeated simultaneously by the same artefact. A payload optimised to evade static signatures by means of polymorphism remains visible to behavioural detection the moment it executes; a payload that minimises its behavioural footprint by forgoing reflective loading exposes itself to file-based scanning at every stage of disk persistence; a payload that consolidates its functionality into a single small artefact to defeat sandbox heuristics about size and complexity ipso facto becomes a more legible static target. The constraint set is, in formal terms, overdetermined: no single artefact can satisfy all of it at once.

The operator's response to this overdetermination is decomposition. The functional work of a dropper, which may be described in the abstract as the movement of an operator-controlled implant from the network into stable execution on a target host, is decomposed into stages, each of which optimises against a different subset of the defender surface. The shape of that decomposition is not arbitrary; it is governed by what may be called the principle of detection-surface separation, viz. that each stage of the dropper should present a separate detection surface, such that no single defender mechanism can observe the totality of the operation.

The three-layer shape that this post examines is the minimum viable expression of that principle in the .NET ecosystem on contemporary Windows hosts. A two-layer composition, in which a single script loads an implant directly, can be made to work, although only in environments where one of the major defender capabilities is degraded or absent. A four-layer composition, by contrast, is observed in more sophisticated campaigns and offers additional separation, at the cost of additional moving parts and of additional opportunities for the operator to introduce errors that betray the operation. The three-layer arrangement chosen as the reference design for this post represents the architectural minimum that survives a typical enterprise endpoint without further accommodation.

The constraints may be enumerated more concretely. The carrier, designated here as layer one, must survive content scanning by an inline mail or web gateway, and must survive a brief sandbox detonation without revealing its intent. The loader, layer two, must operate within an interpreted runtime that is heavily instrumented for script logging, and must transition from interpreted text to executable bytes without writing those bytes to disk where a file-based scanner can observe them. The implant, layer three, must operate within a long-running process without producing the cross-process artefacts that EDR products are configured to flag, and must do so for the duration of the operator's interest in the host, which may be days or weeks.

To these external constraints, three operator-internal ones are added: idempotence, because the operator cannot tolerate double-execution of any layer; recoverability, because the loss of any single layer must not require the rebuild of the entire chain; and minimal self-telemetry, because the operator's own actions, ill-considered, generate observable artefacts as readily as the defender's instrumentation does. The seven-byte memory marker examined in section five exists precisely because the first of these constraints, idempotence, was at one point violated by a real campaign with consequences expensive enough to motivate the design.

Defender read. The most useful observation a defender can draw from this section is that the layered shape is not chosen for novelty or cleverness, but is the structural consequence of effective defence. A defender who has driven the operator into a three-layer architecture has already succeeded at the strategic level; the remaining task is to detect any one of the three layers reliably enough that the operator cannot rely on the others to compensate. Detection programmes that focus exclusively on the final stage, viz. the implant, leave the earlier and more legible stages unattended, which is precisely where the operator's compromises and infelicities tend to manifest.
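The arithmetic behind the observation that detecting any one layer suffices can be made explicit. What follows is a minimal sketch in Python, assuming (purely for illustration) that each layer is detected independently of the others; the function name and the probabilities are hypothetical and are not drawn from any measurement.

```python
from math import prod


def chain_detection_probability(per_layer):
    """Probability that at least one layer is detected.

    Assumes, for illustration only, that detections are independent.
    """
    return 1.0 - prod(1.0 - p for p in per_layer)


# Hypothetical per-layer detection probabilities: carrier, loader, implant.
implant_only = chain_detection_probability([0.0, 0.0, 0.5])
all_layers = chain_detection_probability([0.5, 0.5, 0.5])

print(implant_only)  # 0.5
print(all_layers)    # 0.875
```

Under these assumed figures, a programme that watches only the implant catches the chain half the time, whereas equally modest coverage of every layer raises that to 0.875. Real detections are seldom independent, so the figures indicate direction rather than magnitude.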

Section 2. Layer one: the batch carrier

The first decision the operator must make is the choice of carrier, viz. the format of the initial artefact that crosses the boundary from operator-controlled infrastructure into the target environment. The set of viable carriers on a Windows host is narrower than it appears at first inspection, because each candidate format imposes its own constraints on what can be done within it, on what defender mechanisms it must traverse, and on what the recipient is willing to execute without further intervention.

The .bat extension, taken on its own merits, is an unpromising choice. It is conspicuous, it has been associated with malicious activity for the entirety of its existence, and it executes in a runtime, viz. the Windows command processor, that is comparatively meagre in its expressive capacity. That batch is nevertheless a recurring choice in droppers of this class follows from three considerations that, taken together, outweigh its disadvantages.

The most immediate of these is interpreter availability. The Windows command processor is present on every Windows host, by default, in a version that has changed little in two decades. The operator therefore need not concern themselves with version-skew, with the presence or absence of an interpreter that a later .NET or PowerShell stage might require, or with the host's language mode configuration. The interpreter is, in this respect, a load-bearing assumption that may be made without verification.

A second consideration, less immediate but more consequential, is signature ambiguity. Batch files, considered as a corpus, are dominated overwhelmingly by legitimate administrative scripts of all conceivable shapes and sizes, with the consequence that the signal-to-noise ratio confronting a defender attempting to triage batch files at scale is poor, especially in enterprise environments in which legacy scripts are common. The operator exploits this not by hiding the maliciousness of the script, which is difficult, but by hiding the script itself in the statistical noise of legitimate batch activity, which is markedly easier.

The third and least appreciated of the three is telemetry asymmetry. Behavioural detection products on Windows are heavily invested in PowerShell instrumentation, owing both to its prevalence in offensive tradecraft and to the substantial logging surface (AMSI, ScriptBlock, Module) that Microsoft has exposed for the purpose. The command processor, by contrast, is comparatively under-instrumented at the script-content level. A defender can observe that cmd.exe was invoked and what arguments it received, although the content of the batch script itself is, in most deployments, not subject to the same kind of inline scrutiny that PowerShell receives. This asymmetry makes batch a useful place to perform work that the operator would prefer not to perform under the gaze of script logging.

These three considerations, taken together, define layer one's purpose with some precision. The carrier exists not to do harm directly, which would expose it to the very detection mechanisms its choice of format was meant to evade, but to deliver layer two into the runtime in which layer two is designed to operate, namely a PowerShell session. Layer one is, in this sense, a postal envelope: its contents do the work, and the envelope itself is required only to survive transit and to remain unremarkable upon inspection.

Survival in transit imposes its own constraints. The most consequential of these is the behaviour of dynamic analysis platforms, both the on-premises sandboxes that mail gateways and EDR products consult and the cloud-hosted services with which security teams enrich their alerts. Such platforms apply a small budget of execution time to each candidate sample, on the order of two to four minutes per artefact, and treat the absence of overtly malicious behaviour within that budget as a weak signal of benignity. The operator's response to this constraint is to ensure that layer one consumes a non-trivial portion of the sandbox's budget on uninteresting work before any portion of the dropper's actual functionality is invoked.

The padding observed in the reference design, comprising approximately 1.1 megabytes of inert text added to a script whose functional content amounts to perhaps a few hundred bytes, performs three simultaneous roles in service of this strategy. It induces additional parsing overhead, because the command processor must traverse the entirety of the file before reaching the functional content, and this traversal consumes sandbox time without producing telemetry. Beyond the matter of time, the same padding defeats hash-based reputation through small mutations introduced between deployments, which produce wholly distinct hashes and prevent the defender from accruing reputation against any single sample. There is, lastly, a scanner heuristic common to several commercial products by which very large text files are treated as a priori unlikely to be malicious in the absence of other indicators, on the assumption that genuine malware is small and tightly packed; the padding meets this threshold by deliberate design.

The composition of the padding is itself a design decision of some consequence. Random bytes, although effective at defeating hash-based reputation, are conspicuous under entropy analysis and may themselves serve as an indicator. The reference design instead populates the padding with comment lines, of which the artefact dissected in Three layers deep contained approximately three and a half thousand. Comments are syntactically valid batch content, they bear an entropy signature indistinguishable from that of legitimate scripts, and they admit of a further use that random bytes do not, viz. the steganographic concealment of operational data among them. Layer one's transition to layer two, which entails the construction of a PowerShell command line from substrings drawn from particular comment positions, is performed by the script itself at runtime; the comments are therefore not inert decoration but the substrate from which the next stage's command is assembled.
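The entropy contrast between random bytes and comment text can be illustrated from the analyst's side with a short calculation. What follows is a minimal sketch, assuming byte-level Shannon entropy as the measure (a given scanner may use a different estimator); the sample comment line is invented for the example.

```python
import math
import os
from collections import Counter


def shannon_entropy(data: bytes) -> float:
    """Shannon entropy of a byte string, in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0
    counts = Counter(data)
    total = len(data)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


# Invented comment line standing in for text-like padding.
comment_padding = b"REM nightly backup of the reporting share\r\n" * 1000
# Random bytes of the same length, for comparison.
random_padding = os.urandom(len(comment_padding))

print(round(shannon_entropy(comment_padding), 2))
print(round(shannon_entropy(random_padding), 2))
```

Text of this kind scores far below the ceiling of eight bits per byte, whereas random bytes sit close to it, which is why a block of random padding stands out under entropy analysis and comment text does not.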

This design carries non-trivial costs. File-size anomaly detection, where it is configured at all, will flag a batch script of the dimensions described above as worthy of attention, because legitimate batch scripts of that size are uncommon. Some commercial scanners have, in recent years, begun to subject very large text files to closer inspection rather than to the cursory triage that the heuristic above presupposes. The command processor's parsing of a heavily commented script, furthermore, produces a characteristic process-timing profile that a sufficiently invested defender can observe through ETW. None of these costs is prohibitive in the present defender landscape, although each represents a direction in which the layer's effectiveness is likely to degrade as defender capability matures.

Defender read. The most productive observation a defender can draw from the carrier layer is that its disguises are quantitative rather than qualitative. A batch script of approximately a megabyte, the majority of which is comments, is a strong indicator on its own, because legitimate use cases rarely produce that shape. Detection rules that flag batch scripts above a size threshold, or that compute the ratio of comment lines to functional lines, identify this carrier class with high precision and acceptable recall. The asymmetry of script-content instrumentation between cmd.exe and PowerShell, which the operator exploits, is a configuration gap rather than a fundamental limitation of the platform, and several open-source projects now provide content-level visibility into batch execution at acceptable cost. A defender programme that addresses these two observations closes most of the surface that this layer was designed to exploit.
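The size and comment-ratio checks described in this read can be sketched in a few lines. This is an illustration rather than a ready-made rule: the function names and both thresholds are hypothetical, and a real deployment would tune them against its own corpus of legitimate scripts.

```python
def batch_comment_profile(text: str) -> dict:
    """Count comment and functional lines in the text of a batch script."""
    comment = functional = 0
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue  # blank lines count as neither
        lowered = line.lower().lstrip("@")
        is_comment = (
            lowered.startswith("::")
            or lowered == "rem"
            or lowered.startswith(("rem ", "rem\t"))
        )
        if is_comment:
            comment += 1
        else:
            functional += 1
    total = comment + functional
    return {
        "size_bytes": len(text.encode("utf-8")),
        "comment_lines": comment,
        "functional_lines": functional,
        "comment_ratio": comment / total if total else 0.0,
    }


def is_suspicious(profile: dict, min_size: int = 512 * 1024,
                  min_ratio: float = 0.95) -> bool:
    """Flag large scripts that are almost entirely comments.

    Both thresholds are hypothetical defaults chosen for the example.
    """
    return (profile["size_bytes"] >= min_size
            and profile["comment_ratio"] >= min_ratio)
```

A script of roughly a megabyte containing some three and a half thousand comment lines and a handful of functional ones trips both conditions, whereas an ordinary administrative script of a few kilobytes trips neither.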

Section 3. Layer two: the PowerShell loader

If layer one functions as the envelope, layer two performs the work of translation. Its task is to receive a sequence of bytes delivered by the carrier, transform those bytes into executable code, and surrender control to that code without leaving artefacts on disk or producing telemetry that would distinguish the operation from legitimate administrative scripting. The runtime in which this translation occurs is, in the reference design and in most droppers of the class examined here, the Windows PowerShell host.

The choice of PowerShell is, on its face, perverse. Of all the interpreted runtimes available on a Windows host, PowerShell is the one most heavily instrumented by Microsoft for the purpose of detecting exactly the operations a dropper performs. AMSI exposes every string and every script block to a registered antimalware provider at the moment of evaluation; ScriptBlock logging records the substance of every compiled script block to the Windows event log irrespective of whether it executes successfully; Module logging records the parameter values of cmdlets as they are invoked. Taken together these capabilities furnish a defender with a complete, structured record of what a PowerShell session attempted, which is, prima facie, the worst possible runtime in which to perform sensitive work.

The operator's persistence in choosing PowerShell, despite this instrumentation, follows from three properties of the runtime that no alternative offers in combination. PowerShell is present on every supported Windows host without administrative intervention. Its language permits direct interaction with the Win32 API and the Common Language Runtime, which is the precondition for any in-memory loading of native or managed code. And it is sufficiently general-purpose that legitimate administrative use is widespread, which establishes the same statistical noise that batch enjoys at the carrier layer, although with a less favourable signal-to-noise ratio.

Within this runtime, layer two's first concern is to evade the instrumentation rather than to operate beneath it, because operating beneath it, viz. through Constrained Language Mode or AppLocker policy, is generally not available in the environments the operator targets. Evasion of the script-content logging surface entails two distinct operations. The first is AMSI bypass, which prevents the antimalware provider from seeing the substantive content of the script before it is evaluated. The second is ScriptBlock logging mitigation, which addresses the kernel-mediated logging that records script content even if AMSI has been blinded.

The reference design's approach to AMSI is the well-documented method of patching the AmsiScanBuffer function in memory within the current process such that subsequent calls to it return a clean result. The patch is small, the addresses are stable across patch-Tuesday updates, and the operation occurs entirely within the operator's own process space, which means no cross-process artefacts of the kind EDR products typically observe. The cost of this approach is that the patch is itself recognisable: several detection rules in commercial products now look for the specific byte sequences and offsets that the most common AMSI bypasses produce. The operator's response, in the reference design, is to generate the patch bytes at runtime from arithmetic operations on integer constants, such that the patch never appears as a literal string in the script.

ScriptBlock logging is addressed less elegantly. The kernel-mediated logging of script content cannot, in general, be defeated without administrative privilege, which the dropper does not, at this stage, possess. The reference design therefore accepts that the script content of layer two will be logged, and addresses the cost of that logging by ensuring that the logged content is operationally inert. The patch bytes, the shellcode itself, the C2 addresses, and any other sensitive material are introduced into the process by means that bypass the script-block scope, viz. through reflection against types loaded at runtime and through arithmetic generation of constants whose values are not present in the script text. What ScriptBlock logging records is, in this design, a structurally complete account of the script's behaviour that omits the specific values that would render the behaviour actionable.

The rename of powershell.exe to an arbitrary name is a further technique that warrants treatment here, because of what it reveals about name-based detection. Several commercial EDR products, and a larger number of in-house detection rules built atop the Windows event channels, key their PowerShell-specific logic on the process name, viz. on whether the executable that emitted a given event is named powershell.exe. The operator exploits this by copying the PowerShell executable to a different filename and invoking the copy, with the consequence that events emitted by the renamed process are matched against the wrong detection rules, or against no detection rules at all. The technique is unsophisticated, it has been documented for the better part of a decade, and it nevertheless continues to succeed against an embarrassing fraction of production environments, because the underlying configuration is brittle in a way that vendors and defenders have been slow to remedy.

The transition from layer two to layer three is, by design, the most architecturally significant moment in the dropper. Layer two has, at this point, decoded the shellcode delivered through the carrier, written it into a freshly allocated region of its own process memory, and resolved the addresses of the Win32 functions necessary to execute it. The operation that follows, viz. the transfer of execution from interpreted PowerShell into native shellcode, is the boundary at which the operator commits to all the costs of in-memory code execution: the EDR userland hooks on memory-protection primitives, the ETW providers that emit events on certain calls to NtAllocateVirtualMemory, the behavioural detection rules that key on PowerShell invoking VirtualProtect to mark memory executable. The reference design absorbs these costs because the alternative, viz. dropping the shellcode to disk and invoking it through any of the available mechanisms, exposes a still more legible surface to the file-based scanners that layer one was carefully designed to evade.

Defender read. Layer two repays defender attention in inverse proportion to the attention it has historically received. Detection programmes that fixate on AMSI bypass patterns at the byte level are participating in a contest that the operator can iterate cheaply, whereas detection programmes that monitor for the structural indicators of in-memory code loading, viz. PowerShell processes that allocate executable memory, that resolve specific Win32 APIs through reflection, and that emit child processes or threads with unusual call stacks, observe an operation that the operator cannot perform any other way. The renamed PowerShell binary is a configuration problem rather than a detection problem: detection rules should key on the digital signature, the original filename in the version resource, or the loaded modules of a process, rather than on the basename of the executable on disk. None of these mitigations is novel, although each remains under-implemented in production environments that this layer continues to traverse without difficulty.
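The recommendation to key on the original filename rather than the basename can be sketched as follows. The sketch assumes process-creation records shaped like Sysmon Event ID 1, with `Image` and `OriginalFileName` fields; the set of original-filename values and the function name are assumptions made for the example and should be checked against the binaries actually present in an environment.

```python
from ntpath import basename  # parses Windows paths on any platform

# Assumed version-resource names for the PowerShell hosts.
POWERSHELL_ORIGINAL_NAMES = {"powershell.exe", "pwsh.exe", "pwsh.dll"}
# Basenames under which those hosts are expected to run.
EXPECTED_BASENAMES = {"powershell.exe", "pwsh.exe"}


def is_renamed_powershell(event: dict) -> bool:
    """True when the version resource says PowerShell but the file name does not."""
    original = event.get("OriginalFileName", "").lower()
    image = basename(event.get("Image", "")).lower()
    return (original in POWERSHELL_ORIGINAL_NAMES
            and image not in EXPECTED_BASENAMES)


# Hypothetical event: a copy of the PowerShell host under another name.
renamed = {
    "Image": r"C:\Users\Public\svcupdate.exe",
    "OriginalFileName": "PowerShell.EXE",
}
print(is_renamed_powershell(renamed))  # True
```

The same comparison applies to any interpreter whose detection logic is keyed on process name; the version resource travels with a copied binary, whereas the basename does not.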

Section 4. Layer three: the Donut-packed implant

By the moment layer three executes, the dropper has already absorbed most of the architectural costs of its choices. The carrier has been parsed, layer two has been blinded against the relevant script-content instrumentation, and shellcode has been loaded into a memory region that the operator controls. What remains to be decided is where the implant ultimately resides and in what form. These two decisions, taken together, determine the operator's relationship with the host for the remainder of the engagement, which may extend over weeks, and the constraints that govern them are correspondingly stringent.

The form of the implant is, in the reference design, a position-independent block produced by the Donut framework from a managed .NET assembly. Donut performs three operations on its input that are individually unremarkable and that collectively define the implant's character. It encrypts the assembly, which prevents memory-resident scanners from identifying it by signature once it is loaded. It prepends a small loader stub that, when invoked, decrypts the assembly and loads it into a private instance of the Common Language Runtime within the host process. And it produces, as output, a single contiguous region of position-independent bytes that may be loaded at any address without further relocation.

The choice of position-independent code, although it adds complexity at the build step, simplifies layer three's relationship with its eventual host process to a degree that no alternative achieves. A conventional .NET assembly, taken from disk and loaded by name, leaves a trail through the Common Language Runtime's assembly resolution machinery, populates AppDomain.GetAssemblies with a recognisable entry, and remains discoverable through reflection by any process that knows where to look. Position-independent code, by contrast, is invisible to the runtime's own bookkeeping, because it is loaded by the operator's own logic into a private CLR instance that does not register itself with the host process's primary AppDomain. A defender who enumerates loaded assemblies through documented APIs observes a host process that is, to all appearances, unmodified.

The choice of host process is governed by a different set of constraints, which may be summarised as the operator's need to be where the defender is not looking. Several Windows processes immediately suggest themselves as candidates, and the reference design rejects most of them for reasons worth enumerating. The Local Security Authority Subsystem Service (lsass.exe) is heavily monitored, owing to the persistent interest credential-theft operations have shown in it, and modern EDR products instrument cross-process memory access to it as a matter of course. The Service Host (svchost.exe) is, similarly, the subject of focused attention, with the further disadvantage that any anomalous behaviour within it is readily attributable, because the legitimate behaviour of any given svchost instance is narrowly constrained by the service it hosts. Creating a new process under the operator's control sidesteps the cross-process injection question entirely, although it raises the question of what the new process should be and what it should appear to do, neither of which admits of a satisfactory answer in environments that audit process creation aggressively.

The reference design's choice of explorer.exe satisfies several constraints simultaneously. It is, on any interactive Windows host, a long-running process that is not periodically restarted, which means the implant's residence in it is durable across the kinds of timescales the operator requires. It is permitted, by both Windows network policy and the configuration of most enterprise firewalls, to make outbound network connections of a variety of kinds, which provides the implant with cover for its command-and-control traffic. Its legitimate behaviour is sufficiently diverse, encompassing shell extension loading, COM activation, file-system enumeration, and a wide range of user-driven activity, that anomalous behaviour within it is more difficult to attribute than it would be in a more narrowly scoped host. And, perhaps most importantly, it is not, as a matter of empirical observation, the subject of the same cross-process scrutiny that lsass and svchost receive.

The mechanics of injection into explorer.exe are themselves a design decision of some consequence. The classical sequence (OpenProcess, VirtualAllocEx, WriteProcessMemory, CreateRemoteThread) produces a characteristic ETW event sequence that EDR products have, in the last several years, learned to recognise reliably. The reference design substitutes a less conspicuous primitive, viz. the queueing of an Asynchronous Procedure Call to a thread that the operator's process has identified through a documented enumeration API, which permits the shellcode to execute in the target process without the creation of a new thread and therefore without the most telemetry-laden of the classical operations. The cost of this substitution is that APC injection requires the target thread to enter an alertable wait state, which is not guaranteed and which the operator must arrange or wait for; the benefit is a substantial reduction in cross-process injection telemetry.

Once the implant is resident in explorer.exe, the operator's remaining concerns are persistence and operational hygiene rather than evasion. The implant must survive a reboot, which it cannot do from within explorer.exe alone, because explorer.exe is itself recreated on login. The reference design therefore introduces a small auxiliary persistence mechanism, the discussion of which lies outside the present section, and the implant's responsibility is restricted to operating reliably while it is resident. Operational hygiene includes the avoidance of long-running network connections that a behavioural detector might flag, the use of jitter in command-and-control beacons to defeat timing analysis, and the periodic relocation of the implant's working memory within the host process to evade memory scanners that may have been keyed against the initial allocation. None of these techniques is novel, although the discipline with which they are applied distinguishes successful operations from those that conclude prematurely.

Defender read. Layer three is the layer at which detection becomes both most consequential and most difficult, because the implant is, by the time it is resident, indistinguishable from its host by most of the means available to a defender who is not already paying particular attention. The productive direction for defender effort is therefore not to look harder at the implant once it is in place, but to instrument the moment of injection. APC injection, although less conspicuous than thread-based injection, is observable through ETW providers that surface NtQueueApcThread and its variants, and detection rules that flag cross-process APC queues against explorer.exe from processes other than a small set of known-legitimate sources will identify this technique with adequate precision. The choice of explorer.exe as host is itself a useful indicator, because the set of processes that legitimately inject into explorer.exe is small and stable, although it is not empty.
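The allowlist rule described in this read can be sketched over already-collected telemetry. The event shape (`api`, `source_pid`, `target_pid`, `source_image`, `target_image`) is hypothetical and stands in for whatever the defender's ETW pipeline or EDR export provides; the allowlist entry in the usage example is likewise invented.

```python
from ntpath import basename  # parses Windows paths on any platform

# NtQueueApcThread and one of its variants, lower-cased for comparison.
APC_CALLS = {"ntqueueapcthread", "ntqueueapcthreadex"}


def flag_apc_into_explorer(events, allowed_sources):
    """Return cross-process APC events that target explorer.exe
    and originate from a source outside the allowlist."""
    allowed = {path.lower() for path in allowed_sources}
    flagged = []
    for ev in events:
        if ev.get("api", "").lower() not in APC_CALLS:
            continue
        if ev.get("source_pid") == ev.get("target_pid"):
            continue  # same-process APCs are not cross-process injection
        if basename(ev.get("target_image", "")).lower() != "explorer.exe":
            continue
        if ev.get("source_image", "").lower() in allowed:
            continue
        flagged.append(ev)
    return flagged


# Hypothetical allowlist of known-legitimate sources.
allowlist = [r"C:\Program Files\ExampleVendor\shellhelper.exe"]
```

The value of the rule rests almost entirely on the quality of the allowlist, which the text above describes as small and stable but not empty; it has to be built from observation of the environment rather than assumed.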

Section 5. The seven-byte memory marker

The technical material in the preceding sections concerns the operator's contest with the defender. The material in this section concerns the operator's contest with themselves. Idempotence is a property the dropper requires for the same reasons any distributed system requires it, viz. that the operation may be invoked more than once, that double-execution produces consequences different in kind from single execution, and that the operator cannot, in general, distinguish a first invocation from a subsequent one without consulting some external state. The mechanism by which idempotence is achieved in the reference design is a seven-byte marker written to a fixed location in shared memory, and the design considerations that govern that mechanism are, in their own way, more interesting than any of the techniques discussed so far.

The problem the marker addresses is not hypothetical. Re-execution of a dropper, whether caused by the user double-clicking the carrier, by a scheduled task firing at an unintended moment, or by the operator's own infrastructure delivering a duplicate, produces several failure modes that betray the operation more readily than any single defender capability can. A second instance of layer two attempts AMSI patching against an already-patched process, which may fail in ways that produce diagnostic output. A second instance of layer three injects into explorer.exe a second copy of an implant, producing two beacons from the same host, which the operator's command-and-control infrastructure observes and which is generally regarded as an indication that something has gone wrong. A second instance of either layer increases the host's telemetry surface in ways that no single instance would.

The constraints on the marker are stringent and somewhat contradictory. The marker must be readable from layer two at the moment layer two begins execution, which requires it to survive between invocations within the same boot cycle. It must not survive across reboots, because the operator's persistence mechanism is designed to handle reboot recovery separately and re-execution after a reboot is, in this design, desired rather than avoided. It must not produce a forensic artefact that a defender could use to identify the dropper's presence, which excludes the registry, the file system, and most of the named-object namespaces that Windows exposes. And it must not consume so much memory as to be observable through process working-set analysis or memory diffing.

These constraints, taken together, identify a narrow class of acceptable locations for the marker. Named shared memory created through CreateFileMapping with INVALID_HANDLE_VALUE as the file handle, which produces a section backed by the system paging file rather than by a file on disk, satisfies all of them, with one qualification: such a section exists only while at least one handle to it remains open, so it outlives the invocation that created it only for as long as some longer-lived process continues to hold a handle, and it never survives a reboot. The marker is written at a fixed offset within this region and consulted by each subsequent invocation of layer two before any sensitive operation is undertaken. If the marker is present, the invocation terminates without action. If it is absent, the invocation proceeds and writes the marker before continuing.

The choice of seven bytes specifically, rather than four or eight or sixteen, follows from a consideration that is not architecturally important but that is illustrative of the kind of micro-decision that distinguishes designs that work in practice from designs that work in theory. The marker must be improbable enough that legitimate processes do not accidentally produce it, which excludes natural alignments such as four bytes of a recognisable magic constant or eight bytes corresponding to a standard pointer or timestamp. Seven bytes falls one byte short of a 64-bit register on x86_64 and is not a width that common compiler idioms produce, with the consequence that a defender who scans shared memory regions for the marker observes a value that does not naturally arise from legitimate use. The value of the seven bytes is itself drawn from a generator that produces, for each campaign, a marker distinct from the markers of any previous campaign, which prevents a defender who has recovered a marker from one operation from using it to identify the presence of an unrelated operation.

The forensic implications of this design are worth noting briefly, because they illustrate a property of memory-resident operational state that is poorly appreciated outside the offensive community. A defender who acquires a memory image of an infected host observes the marker in the shared memory region. A defender who acquires a disk image, or a registry hive, or any of the artefact categories that retrospective forensic analysis typically consults, does not. The marker is, in this respect, an artefact whose existence is contingent on the host being live when the analysis is performed, which constrains the conditions under which the marker can be discovered to a narrow window that the operator may, with discipline, avoid.

Defender read. The seven-byte marker is the kind of artefact a memory-resident detector finds and a disk-based detector does not, which makes it a useful test of whether a defender programme has invested in live-memory analysis or has limited itself to retrospective examination of disk and event-log artefacts. The shared-memory namespace as a category warrants more defender attention than it currently receives, because it is one of the few locations in which an operator can deposit cross-process state with low forensic exposure, and tooling that enumerates shared memory regions and flags those that do not correspond to known legitimate uses will identify markers of this class with high precision. The design's reliance on the marker being unlikely to occur by accident is the same reliance every steganographic scheme makes, and the same answer applies, viz. that any defender who looks for the right shape, in the right place, will find it.
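
The enumeration-and-baseline approach suggested in the defender read can be sketched as follows. This is an illustration under stated assumptions: the SectionObject record, the baseline prefixes, and the size cut-off are all hypothetical, and the sketch assumes the list of named sections has already been exported by whatever enumeration tooling the defender uses.

```python
from dataclasses import dataclass
from typing import Iterable, List, Tuple

@dataclass(frozen=True)
class SectionObject:
    """Hypothetical record for one named shared-memory section on a live host."""
    name: str          # object name as reported by the enumeration tool
    size_bytes: int    # size of the section
    owner_image: str   # image name of a process holding a handle

# Placeholder baseline of name prefixes known to belong to legitimate software.
BASELINE_PREFIXES: Tuple[str, ...] = (
    "Local\\KnownVendor_",
    "Global\\BaselineApp_",
)

# Assumed cut-off: a section that exists only to hold a short marker is likely
# to be small. The value is illustrative and would need tuning per environment.
SMALL_SECTION_BYTES = 4096

def unexplained_sections(
    sections: Iterable[SectionObject],
    baseline: Tuple[str, ...] = BASELINE_PREFIXES,
    max_size: int = SMALL_SECTION_BYTES,
) -> List[SectionObject]:
    """Return small named sections whose names match no baseline prefix."""
    return [
        s for s in sections
        if s.size_bytes <= max_size and not s.name.startswith(baseline)
    ]
```

The sketch deliberately looks for the shape and the place rather than for a specific value, since the marker value changes per campaign while the choice of location does not.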

Section 6. The synthesis

The defender read appended to each preceding section addressed the layer it accompanied. This section addresses the architecture in aggregate, on the proposition that the most useful defender observation is not about any single layer but about the assumptions the architecture presupposes about the defender as a whole.

The first such assumption is that the defender's instrumentation is unevenly distributed across the available runtimes. Layer one's choice of batch, layer two's reliance on the asymmetry between cmd.exe and PowerShell instrumentation, and layer three's preference for explorer.exe over more heavily scrutinised hosts share a common structural feature, viz. that the operator is choosing, at each decision point, the runtime or location in which the defender has invested least. The architecture is shaped, in this respect, by the defender's investment portfolio rather than by any inherent property of the platform. A defender programme that addresses this assumption directly, by extending instrumentation into the under-instrumented runtimes in proportion to their offensive utility rather than to their historical prevalence, closes the asymmetry the architecture exploits.

The second assumption is that detection is keyed on artefact properties rather than on operations. The renamed PowerShell binary defeats detection rules that match on executable basename. The Donut-packed implant defeats detection rules that match on PE structures on disk or on registered .NET assemblies. The seven-byte marker defeats detection rules that scan registry and file-system artefacts. In each case, the operator is exploiting the gap between the artefact a detection rule is configured to find and the operation the rule is intended to detect. A defender programme that addresses this assumption shifts its detection logic from artefact-matching to operation-matching, viz. from "is there a file named powershell.exe doing X" to "is there a process executing PowerShell code doing X", where the latter formulation does not depend on the filename, signature, or version-resource details that the operator can manipulate at low cost.
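
The difference between the two formulations can be made concrete with a small sketch. The ProcessSnapshot record and its fields are hypothetical, and the operation rule rests on an assumption: that the presence of the PowerShell engine assembly, System.Management.Automation, in a process's loaded modules is an acceptable proxy for "executing PowerShell code". That proxy will not hold in every case, so the sketch illustrates the shift in approach and nothing more.

```python
from dataclasses import dataclass
from typing import Tuple

@dataclass(frozen=True)
class ProcessSnapshot:
    """Hypothetical record: one process and the modules it has loaded."""
    image_name: str
    loaded_modules: Tuple[str, ...]

# Basenames an artefact-matching rule would typically look for.
POWERSHELL_BASENAMES = frozenset({"powershell.exe", "pwsh.exe"})

# Substring of the PowerShell engine assembly name, used here as the
# assumed proxy for the operation itself.
POWERSHELL_ENGINE = "system.management.automation"

def artefact_rule(proc: ProcessSnapshot) -> bool:
    """Match on what the executable is called."""
    return proc.image_name.lower() in POWERSHELL_BASENAMES

def operation_rule(proc: ProcessSnapshot) -> bool:
    """Match on what the process has loaded, whatever it is called."""
    return any(POWERSHELL_ENGINE in m.lower() for m in proc.loaded_modules)
```

A renamed copy of the PowerShell binary fails the first rule and still satisfies the second, which is the gap the paragraph above describes.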

The third assumption, more subtle than the first two, is that the defender's attention is unevenly distributed across the dropper's timeline. The carrier and the loader are observed at the moment of execution, which is the moment they are most defended. The implant is observed continuously thereafter, although the implant is, by design, hardest to detect once resident. The moments of transition between layers, viz. the construction of the PowerShell command line from the batch script, the transfer of execution from PowerShell to shellcode, and the injection of the implant into explorer.exe, each occupy a narrow window in which the operator is at his most exposed and during which most defender programmes are not specifically configured to look. A defender programme that addresses this assumption, by instrumenting the transitions rather than the steady states, observes operations that the operator cannot, by the nature of the architecture, avoid performing.

The fourth assumption, which closes the synthesis, is that the defender is engaged in detection of artefacts in isolation rather than in correlation of behaviours across time. Each individual element of the dropper, considered on its own, is either innocuous (a large batch file with many comments) or marginally suspicious in a way that produces a high false-positive rate (a PowerShell process allocating executable memory, a process injecting into explorer.exe). The architecture's resilience derives from the absence of a single high-confidence indicator in any layer; what would identify the operation reliably is the correlation of several low-confidence indicators across the layers, viz. a batch script of unusual size invoking a renamed PowerShell process that allocates executable memory and injects into explorer.exe within the same execution context. A defender programme that addresses this assumption invests in cross-event correlation rather than in per-event detection sensitivity.
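
Cross-event correlation of this kind can be sketched as a scoring pass over low-confidence indicators. Everything specific in the sketch is an assumption made for illustration: the Indicator record, the notion of a context_id that ties events to one execution context, the weights, the window, and the threshold. Repeated indicators of the same kind are counted once, so that a single noisy signal cannot reach the threshold on its own.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, Iterable, List

@dataclass(frozen=True)
class Indicator:
    """Hypothetical low-confidence signal attributed to an execution context."""
    timestamp: float   # seconds since an arbitrary epoch
    context_id: str    # assumed identifier linking events in one process tree
    kind: str          # e.g. "large_batch", "renamed_interpreter"
    weight: float      # confidence contribution, well below the threshold

def correlate(
    indicators: Iterable[Indicator],
    window_seconds: float = 120.0,
    threshold: float = 1.0,
) -> Dict[str, float]:
    """Return the best windowed score for each context that reaches threshold."""
    by_context: Dict[str, List[Indicator]] = defaultdict(list)
    for ind in indicators:
        by_context[ind.context_id].append(ind)

    alerts: Dict[str, float] = {}
    for ctx, items in by_context.items():
        items.sort(key=lambda i: i.timestamp)
        for start, first in enumerate(items):
            # Collect the distinct kinds seen within the window opened by `first`.
            best_per_kind: Dict[str, float] = {}
            for ind in items[start:]:
                if ind.timestamp - first.timestamp > window_seconds:
                    break
                best_per_kind[ind.kind] = max(
                    best_per_kind.get(ind.kind, 0.0), ind.weight
                )
            score = sum(best_per_kind.values())
            if score >= threshold:
                alerts[ctx] = max(alerts.get(ctx, 0.0), score)
    return alerts
```

No single indicator in the sketch carries enough weight to alert; only several distinct kinds, observed in the same context within the window, do so, which mirrors the argument made in the paragraph above.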

These four assumptions, taken together, constitute the strategic surface that the architecture attacks. None of them is fundamental to the Windows platform or to the defender's tooling; each is a configuration choice that may be revisited. The architecture is not unbeatable. It is, however, well-tuned to the configuration of defender programmes as those programmes are typically deployed in 2025 and 2026, which is the period during which droppers of this class are most often observed. A defender programme that addresses any one of the four assumptions raises the operator's cost meaningfully. A defender programme that addresses all four pushes the operator into a class of architectures that this post does not examine, and that the operator would prefer not to be forced into.

Closing

The reverse-engineering post that this one accompanies, Three layers deep, describes a single artefact in detail. This post describes a class of artefacts in their architectural commonalities, of which the artefact in Three layers deep is one instance. The two posts together attempt to do what neither could accomplish alone, viz. to render the design legible from both directions, with the defender's read in each section serving as the bridge that connects the operator's intent to the analyst's recovered evidence.

The decision to publish architectural decisions of this kind is not without controversy in the offensive community, which prefers, on the whole, that the techniques on which it relies be communicated through more restrictive channels than a public weblog. The case for publishing, which I find ultimately the more persuasive, is that the techniques discussed here are not novel, that their defensive countermeasures are well-understood by the analyst community whether or not they are well-deployed in production environments, and that the gap between what a competent operator can do and what the average defender programme can detect is closed faster by candid discussion than by the alternative.

If a single observation has shaped my own approach to this work over the past several years, it is that the architectural reasoning behind offensive tradecraft is, almost without exception, more interesting than the implementation details, and that defenders who understand the former tend to detect the latter even in forms they have never seen before. The aim of this post has been to make a small contribution toward that understanding.