GEMINI Malware-Research Lab

Authorized-use lab. All analysis and detonation were performed on owned infrastructure in an air-gapped lab network. No third parties were ever in scope. Samples are public MalwareBazaar specimens. Nothing here constitutes a 0-day or novel exploit. This is defensive DFIR and detection-engineering research.

The constraint that drove the design

The analysis host (CT215) is an unprivileged LXC container. Docker's overlay filesystem mount is denied at the kernel level, so the intended triage stack couldn't build or run. Rather than wait on a hypervisor-host config change, I built a Docker-free Python venv backend that produces the identical triage.json schema, so every downstream artifact (case notes, detections, dashboard) stayed backend-agnostic.

The venv stack: capa 9.4.0 (+ 1045 rules, FLIRT sigs), floss 3.1.1, pefile, yara-python, pyzipper (for MalwareBazaar AES-encrypted zips). Rebuild is one script: gemini_build_venv_stack.sh. The P1 static-triage gate flipped from BLOCKED to READY without touching the hypervisor.

The detonation lab runs separately on WIN3060: a VirtualBox internal network (intnet) with no host adapter and no uplink. The victim VM's only reachable node is the REMnux sinkhole. Containment was proven with a four-point dry run before any real sample ran: internet FAIL, LAN gateway FAIL, host FAIL, sinkhole SUCCESS. The sinkhole-containment screenshot below is the dry-run proof; the victim's browser sees nothing but INetSim's fake HTTP page.

Containment dry-run: inside the sealed lab the victim VM reaches only the INetSim sinkhole, never the real internet.
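The four-point dry run can be expressed as a pass/fail check. The following is a minimal sketch, assuming TCP reachability as the probe (the lab's actual probe method is not stated here). `EXPECTED`, `tcp_reachable`, `containment_ok`, and `failures` are illustrative names, not lab tooling, and any target addresses would have to be supplied by the operator.

```python
import socket

# Expected reachability from the victim VM, per the four-point dry run.
EXPECTED = {
    "internet": False,
    "lan_gateway": False,
    "host": False,
    "sinkhole": True,
}

def tcp_reachable(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:  # refused, unreachable, timed out, or unresolvable
        return False

def failures(observed, expected=EXPECTED):
    """Names of probes whose observed result differs from the expectation."""
    return sorted(n for n in expected if observed.get(n) != expected[n])

def containment_ok(observed, expected=EXPECTED):
    """Pass only if exactly the expected probes ran and every one matched."""
    return observed.keys() == expected.keys() and not failures(observed, expected)

# Usage (hypothetical targets; substitute the lab's real addresses):
# targets = {"internet": ("<public-ip>", 443), "sinkhole": ("<remnux-ip>", 80), ...}
# observed = {name: tcp_reachable(*addr) for name, addr in targets.items()}
# assert containment_ok(observed), failures(observed)
```

Treating a missing probe as a failure is a deliberate choice in this sketch: a dry run that skipped one of the four checks should not count as proof of containment.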

Static triage: three SmokeLoader structural profiles and an attribution correction

Four MalwareBazaar samples (three tagged SmokeLoader, one tagged RemcosRAT) were triaged statically against the venv backend. Each had a distinct structural profile, and the triage data drives specific, falsifiable claims.

Sample 1: packed stage-0 (PE32 x86, 43 KB). Single .text section, Shannon entropy 7.95, empty import table. capa reports exactly one capability: contain loop. An empty IAT combined with a section entropy of 7.95 (the ceiling is 8.0 bits per byte) is the canonical packed-PE signature: no static behavior is reachable because the payload is compressed or encrypted inside the section and unpacked at runtime. This is the delivery vehicle, not the payload.

Sample 2: unpacked x64 payload (PE32+, 93 KB). capa: encode data using XOR, execute syscall, contain loop, terminate process. The execute syscall capability is the interesting signal: a user-mode PE that issues a syscall instruction directly, rather than calling through ntdll, bypasses the user-mode layer where EDR hooks live. This is consistent with EDR-evasion tradecraft (direct syscalls are most often mapped to ATT&CK T1106, Native API; T1027 covers the XOR-encoded data). HYPOTHESIS pending disassembly of the syscall site to confirm the target function.

Sample 3: size-padded dropper (32.8 MB). The mapped PE body is 1.9 MB; the remaining 30.9 MB is non-null overlay appended after the last section's raw data. The figure comes from section-table arithmetic: 30.9 / 32.8 = 94.2% of the file is overlay. Many AV and sandbox engines decline to scan files above a size ceiling, so inflating the PE to 32 MB can evade them entirely. ATT&CK T1027.001 (Binary Padding). The padding is proven by the arithmetic alone; no execution is required.
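The section-table arithmetic can be sketched as follows. This is a minimal illustration, not the triage backend's code: `overlay_stats` is a hypothetical helper, and the single synthetic section below merely stands in for the sample's real section table, using the rounded sizes quoted above.

```python
def overlay_stats(sections, file_size):
    """Compute overlay size from raw section extents.

    sections: iterable of (PointerToRawData, SizeOfRawData) pairs.
    Returns (end_of_image, overlay_bytes, overlay_ratio).
    """
    # The mapped image ends where the furthest section's raw data ends.
    end_of_image = max(
        (ptr + size for ptr, size in sections if size > 0), default=0
    )
    overlay = max(0, file_size - end_of_image)
    ratio = overlay / file_size if file_size else 0.0
    return end_of_image, overlay, ratio

# Rounded figures from the text: 1.9 MB mapped body in a 32.8 MB file.
end, overlay, ratio = overlay_stats([(0, 1_900_000)], 32_800_000)
print(f"overlay = {overlay:,} bytes ({ratio:.1%} of file)")
```

With pefile, the pairs would come from each entry's `PointerToRawData` and `SizeOfRawData` in `pe.sections`; whether the overlay is non-null still has to be checked separately by reading the bytes.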

Sample 4: MalwareBazaar-tagged RemcosRAT (PE32 x86, 59 KB). Attribution correction. The on-disk artifact presents as a Windows service and named-pipe launcher, not as Remcos. Specifically:

(a) Import table: two DLLs only, ADVAPI32 and KERNEL32.
(b) ADVAPI32 imports: StartServiceCtrlDispatcherA, OpenSCManagerA, OpenProcessToken, AllocateAndInitializeSid, SetSecurityDescriptorDacl (service management plus token/DACL manipulation).
(c) KERNEL32 imports include CreateNamedPipeA, ConnectNamedPipe, CreateProcessA (a named-pipe server that launches a child).
(d) capa: resolve function by hash, link many functions at runtime, PEB access. The real API set is hash-resolved at runtime, so the visible IAT understates behavior by design.
(e) floss: 542 static strings, zero decoded, zero stack strings, zero Remcos/C2/keylog indicators.
(f) Resources: no RCDATA SETTINGS resource (the Remcos config blob), no high-entropy resource, no overlay.

The RemcosRAT tag is a dynamic/campaign label corroborated by 8 vendor sandboxes. That label applies to the delivery chain, not this specific file. The static artifact is the loader/installer stage. Inheriting a dynamic family label as a static claim would not be defensible here.

DFIR timeline: Volatility 3 pins the encryption sweep

After the Makop detonation, Volatility 3 (framework 2.28.0) was run against the 4.13 GiB VirtualBox ELF core (Elf64Layer → WindowsIntel32e, Win10 build 19041). The process list anchors the incident timeline: the encryption sweep runs 20:47-20:53 UTC, and notepad.exe opening +README-WARNING+.txt appears at 20:51:06, so the note was dropped mid-sweep, not after. The ransomware process itself had already exited by capture time (encryption complete, wallpaper set). The excerpt below is from the live run; the [ransomware] placeholder row and the cmd: lines are annotations layered on the pslist rows.

vol3 windows.pslist (incident-makop): key processes pinning the 20:47-20:53 UTC encryption window
Volatility 3 Framework 2.28.0

PID   PPID  ImageFileName    Offset(V)         Threads  SessionId  Wow64  CreateTime                      ExitTime

4     0     System           0x94873d09d200    100      N/A        False  2026-06-07 19:57:33.000000 UTC  N/A
624   492   services.exe     0x948743ebe080    5        0          False  2026-06-07 19:57:38.000000 UTC  N/A
640   492   lsass.exe        0x948743e34080    8        0          False  2026-06-07 19:57:38.000000 UTC  N/A
2848  624   MsMpEng.exe      0x948744ef1280    20       0          False  2026-06-07 19:57:41.000000 UTC  N/A
4100  3796  explorer.exe     0x9487456e7340    40       1          False  2026-06-07 19:57:44.000000 UTC  N/A
5616  4100  msedge.exe       0x948745e87080    0        1          False  2026-06-07 19:57:48.000000 UTC  2026-06-07 20:47:41.000000 UTC
6204  1324  VBoxService.ex   0x9487451e10c0    1        1          False  2026-06-07 20:47:17.000000 UTC  N/A
--   --    [ransomware]      --                --       --         --     20:47:xx UTC (exited pre-capture)
3556  6492  notepad.exe      0x948744708080    1        1          False  2026-06-07 20:51:06.000000 UTC  N/A
                             cmd: notepad C:\Users\victim\Desktop\+README-WARNING+.txt
6048  916   notepad.exe      0x948745fd02c0    1        1          False  2026-06-07 20:53:32.000000 UTC  N/A
                             cmd: notepad C:\Users\victim\Desktop\+README-WARNING+.txt
1836  756   RuntimeBroker.   0x948745f3a300    2        1          False  2026-06-07 20:53:30.000000 UTC  N/A
5352  2652  SearchFilterHo   0x9487463b3080    0        0          False  2026-06-07 20:53:30.000000 UTC  2026-06-07 20:56:03.000000 UTC

-- Timeline anchor:
   20:47:17 UTC  VBoxService guest-session 2 opens (operator pre-detonation step)
   20:47:41 UTC  msedge.exe exits (all 6 Edge child PIDs exit simultaneously -- wallpaper set)
   20:51:06 UTC  notepad.exe opens +README-WARNING+.txt -- note dropped mid-sweep
   20:53:30 UTC  RuntimeBroker + SearchFilterHost wake -- encryption sweep complete
   20:53:32 UTC  second notepad.exe opens note (operator verification step)

Live ransomware DFIR: Makop detonation and memory key recovery

A real Makop ransomware sample (Phobos lineage, MalwareBazaar) was detonated in the sealed VirtualBox lab. The sample encrypted the victim desktop (clients.csv, financials.txt, contract.doc, vacation.jpg) with the characteristic .[ID].[contact].mkp extension appended. The ransom wallpaper and +README-WARNING+.txt note were set. The process exited after the sweep.

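The single highest-priority action in a ransomware incident is to freeze RAM while key material is still resident, ideally before the process exits. Here the process had already exited by capture time, so any of its key material could only survive in physical pages that had not yet been reused. The full guest memory was captured from the running VM: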

VBoxManage debugvm malware-victim-win10 dumpvmcore --filename victim_core.elf
# 4,435,351,940 bytes (complete physical memory of the victim VM)

gemini_keyfind.py scanned the 4.13 GiB core in approximately 6 minutes, using numpy-vectorized pre-filtering on the first FIPS-197 key-expansion step. It recovered 21 valid AES key schedules (14 × AES-256, 7 × AES-128). A Windows host carries many AES keys in RAM (BitLocker, LSASS, TLS stacks), so not all of them are the ransomware's.

Two structural signals narrow the candidates: a cluster of 5 AES-256 schedules within approximately 12 KB at offsets 906,863,292-906,874,812 (consistent with a per-file round-key table), and one 256-bit key that appears at two distinct offsets, 464,286,140 and 826,177,212 (duplication of this kind is consistent with a key in active use). The recovery of the schedules is proven; which key, if any, is the ransomware's remains a hypothesis. Confirming it requires obtaining an encrypted file and validating a decrypt round-trip. That step was not completed, and no decryptor is claimed.
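Both narrowing signals can be computed mechanically from the scanner's hit list. The sketch below assumes hits are dictionaries with `offset`, `aes_bits`, and `key_hex` fields; `repeated_keys` and `offset_clusters` are hypothetical helper names, and the `max_gap` and `min_size` thresholds are illustrative choices, not values taken from the lab tooling. The demo data is synthetic.

```python
from collections import defaultdict

def repeated_keys(hits):
    """Map key_hex -> sorted offsets for keys found at more than one offset."""
    by_key = defaultdict(list)
    for h in hits:
        by_key[h["key_hex"]].append(h["offset"])
    return {k: sorted(v) for k, v in by_key.items() if len(v) > 1}

def offset_clusters(hits, aes_bits=256, max_gap=4096, min_size=3):
    """Group same-size schedules whose neighbouring offsets lie within max_gap bytes."""
    offsets = sorted(h["offset"] for h in hits if h["aes_bits"] == aes_bits)
    clusters, current = [], []
    for off in offsets:
        if current and off - current[-1] > max_gap:
            if len(current) >= min_size:
                clusters.append(current)
            current = []
        current.append(off)
    if len(current) >= min_size:
        clusters.append(current)
    return clusters

# Synthetic hits: five tightly spaced AES-256 schedules, one duplicated key,
# and an unrelated AES-128 schedule.
demo_hits = [
    {"offset": 1000 + i * 2880, "aes_bits": 256, "key_hex": f"{i:02x}" + "11" * 31}
    for i in range(5)
] + [
    {"offset": 500_000, "aes_bits": 256, "key_hex": "ab" * 16 + "cd" * 16},
    {"offset": 900_000, "aes_bits": 256, "key_hex": "ab" * 16 + "cd" * 16},
    {"offset": 700_000, "aes_bits": 128, "key_hex": "0f" * 8 + "f0" * 8},
]
print(offset_clusters(demo_hits))
print(repeated_keys(demo_hits))
```

Neither signal identifies the ransomware key on its own; they only rank candidates for the decrypt round-trip.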

gemini_keyfind.py: numpy-vectorized AES key-schedule scanner for multi-GB memory dumps
import numpy as np

# SBOX (the 256-byte AES S-box) and valid_schedule() are defined elsewhere
# in the module.

def scan_fast(path, max_hits, chunk_mb=64):
    """Pre-filter on the first key-expansion boundary (Rcon1), then
    full-verify the handful of candidates. Chunked to bound RAM."""
    sbox = np.frombuffer(SBOX, dtype=np.uint8)
    mm = np.memmap(path, dtype=np.uint8, mode="r")
    hits = []
    seen = set()
    chunk = chunk_mb * 1024 * 1024
    overlap = 64  # must cover the 40-byte look-ahead so no start offset is skipped
    pos = 0
    while pos < len(mm) - 40:
        a = np.asarray(mm[pos : min(pos + chunk + overlap, len(mm))])
        L = len(a) - 40  # number of candidate start offsets in this chunk
        # AES-128: w4[0..3] == SubWord(RotWord(w3)) ^ Rcon1 ^ w0[0..3]
        m = ((sbox[a[13:L+13]] ^ np.uint8(1) ^ a[0:L]) == a[16:L+16]) & \
            ((sbox[a[14:L+14]] ^ a[1:L+1]) == a[17:L+17]) & \
            ((sbox[a[15:L+15]] ^ a[2:L+2]) == a[18:L+18]) & \
            ((sbox[a[12:L+12]] ^ a[3:L+3]) == a[19:L+19])
        # AES-256: same idea, w8 == SubWord(RotWord(w7)) ^ Rcon1 ^ w0
        m2 = ((sbox[a[29:L+29]] ^ np.uint8(1) ^ a[0:L]) == a[32:L+32]) & \
             ((sbox[a[30:L+30]] ^ a[1:L+1]) == a[33:L+33]) & \
             ((sbox[a[31:L+31]] ^ a[2:L+2]) == a[34:L+34]) & \
             ((sbox[a[28:L+28]] ^ a[3:L+3]) == a[35:L+35])
        # Full-verify AES-256 candidates first, then AES-128.
        for arr_off, nk in [(np.nonzero(m2)[0], 8), (np.nonzero(m)[0], 4)]:
            for c in arr_off:
                goff = pos + int(c)  # offset within the whole dump
                if goff in seen:
                    continue
                key = valid_schedule(bytes(mm[goff:goff+240]), 0, nk)
                # Skip degenerate keys made of a single repeated byte.
                if key and len(set(key)) > 1:
                    seen.add(goff)
                    hits.append({"offset": goff, "aes_bits": nk*32,
                                 "key_hex": bytes(key).hex()})
                    if len(hits) >= max_hits:  # stop exactly at the cap
                        return hits, len(mm)
        pos += chunk
    return hits, len(mm)
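The excerpt relies on two names it does not show, SBOX and valid_schedule. The sketch below is one plausible shape for them, assuming valid_schedule(buf, off, nk) re-derives the full FIPS-197 key schedule from the first 4*nk bytes and returns the key only when the memory matches byte for byte. The helper names `_gmul`, `_build_sbox`, and `expand_key` are illustrative, and the real implementation may differ.

```python
def _gmul(a, b):
    """Multiply two bytes in GF(2^8) modulo the AES polynomial 0x11B."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11B
        b >>= 1
    return r

def _build_sbox():
    """Derive the AES S-box: GF(2^8) inverse followed by the affine transform."""
    rotl = lambda x, n: ((x << n) | (x >> (8 - n))) & 0xFF
    box = []
    for a in range(256):
        # Brute-force inverse; 0 has no inverse and maps to 0.
        inv = next((b for b in range(1, 256) if _gmul(a, b) == 1), 0)
        box.append(inv ^ rotl(inv, 1) ^ rotl(inv, 2) ^ rotl(inv, 3)
                   ^ rotl(inv, 4) ^ 0x63)
    return bytes(box)

SBOX = _build_sbox()

def expand_key(key):
    """FIPS-197 key expansion for a 16- or 32-byte key; returns all round keys."""
    nk = len(key) // 4
    total_words = 4 * (nk + 7)  # 4 * (Nr + 1), with Nr = Nk + 6
    words = [key[4 * i:4 * i + 4] for i in range(nk)]
    rcon = 1
    for i in range(nk, total_words):
        t = words[i - 1]
        if i % nk == 0:
            t = bytes(SBOX[b] for b in t[1:] + t[:1])  # SubWord(RotWord(t))
            t = bytes([t[0] ^ rcon]) + t[1:]
            rcon = _gmul(rcon, 2)
        elif nk > 6 and i % nk == 4:
            t = bytes(SBOX[b] for b in t)  # extra SubWord step for AES-256
        words.append(bytes(x ^ y for x, y in zip(words[i - nk], t)))
    return b"".join(words)

def valid_schedule(buf, off, nk):
    """Return the key if buf[off:] holds a complete, consistent schedule, else None."""
    sched_len = 16 * (nk + 7)  # 176 bytes for AES-128, 240 for AES-256
    window = bytes(buf[off:off + sched_len])
    if len(window) < sched_len:
        return None
    key = window[:4 * nk]
    return key if expand_key(key) == window else None
```

Full verification is what makes the cheap pre-filter safe: the four-byte Rcon1 check fires by chance roughly once per 2^32 offsets, while a 176- or 240-byte match is effectively impossible to hit by accident.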

Entropy analysis: gemini_ransom_scan.py and intermittent encryption

gemini_ransom_scan.py was built for one specific gap in commodity DFIR: a whole-file Shannon entropy score will not reliably detect ransomware that uses intermittent or partial-extent encryption. If a 1 MB file has only its header and footer blocks encrypted and the body left plain, the per-file entropy might score 4.5, well below the typical 7.5 detection threshold. The file is useless, but the detector passes it.

The scanner splits each victim file into 4 KB blocks and computes entropy per block. The classification logic:

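full_encryption: >=92% of blocks score >=7.8 entropy
INTERMITTENT_partial_encryption: high-entropy (>=7.8) and low-entropy (<6.0) blocks interleave across the file body, measured by transition count (transitions >= max(3, n//10) with a 10%-92% high-block ratio)
header_or_footer_encryption: high blocks clustered at the file extremes (first or last eighth), plain body, under 50% high blocks overall
plain: <=8% high blocks
partial_or_mixed: anything that matches none of the above
encrypted_or_compressed: single-block files only, when the one block scores >=7.8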

Intermittent encryption is the pattern commodity ransomware is converging on. It shortens the sweep (encrypting every byte of every file takes time) while still leaving the files unusable without the key. Block-level detection also complements a changed-files diff against a pre-detonation baseline snapshot: the hash diff yields the inventory of modified files, and the block profile flags the partially encrypted ones that an entropy-only whole-file scan would miss.

gemini_ransom_scan.py: block-level entropy classifier distinguishing full vs intermittent encryption
def classify(blocks, whole_e):
    """Classify encryption pattern from the block entropy profile.

    blocks: per-block Shannon entropies (bits per byte), in file order.
    whole_e: whole-file Shannon entropy, used for single-block files.
    """
    if not blocks:
        return "empty", {}
    hi = [i for i, e in enumerate(blocks) if e >= 7.8]  # ciphertext-like blocks
    lo = [i for i, e in enumerate(blocks) if e < 6.0]   # plaintext-like blocks
    n = len(blocks)
    high_ratio = len(hi) / n
    meta = {"blocks": n, "high_ratio": round(high_ratio, 3), "whole_entropy": whole_e}
    if n == 1:
        return ("encrypted_or_compressed" if whole_e >= 7.8 else "plain"), meta
    if high_ratio >= 0.92:
        return "full_encryption", meta
    # Note: an encrypted header or footer covering <=8% of the blocks stops here.
    if high_ratio <= 0.08:
        return "plain", meta
    # Count transitions between high and low states; mid-entropy blocks are skipped.
    states = [1 if e >= 7.8 else (0 if e < 6.0 else None) for e in blocks]
    seq = [s for s in states if s is not None]
    trans = sum(1 for a, b in zip(seq, seq[1:]) if a != b)
    meta["transitions"] = trans
    # Test interleaving before the head/tail rule: an intermittent file whose
    # first block is encrypted would otherwise be labelled header_or_footer.
    if trans >= max(3, n // 10) and 0.1 < high_ratio < 0.92:
        return "INTERMITTENT_partial_encryption", meta
    edge = max(1, n // 8)  # the first and last eighth count as the file extremes
    head_hi = sum(1 for i in hi if i < edge)
    tail_hi = sum(1 for i in hi if i >= n - edge)
    body_lo = sum(1 for i in lo if edge <= i < n - edge)
    if (head_hi or tail_hi) and body_lo and high_ratio < 0.5:
        return "header_or_footer_encryption", meta
    return "partial_or_mixed", meta
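classify consumes a list of per-block entropies plus a whole-file score. The sketch below shows one way those inputs could be produced and reproduces the header/footer scenario on synthetic data. `shannon_entropy`, `block_entropies`, and `synthetic_header_footer_file` are illustrative names, not taken from gemini_ransom_scan.py, and the seeded pseudo-random bytes only stand in for ciphertext.

```python
import math
import random
from collections import Counter

def shannon_entropy(data):
    """Shannon entropy of a byte string, in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0
    n = len(data)
    return -sum(c / n * math.log2(c / n) for c in Counter(data).values())

def block_entropies(data, block_size=4096):
    """Entropy of each consecutive block; the last block may be short."""
    return [
        shannon_entropy(data[i:i + block_size])
        for i in range(0, len(data), block_size)
    ]

def synthetic_header_footer_file(n_blocks=256, block_size=4096, seed=0):
    """Plain CSV-like body with one random block at each end (stand-in for ciphertext)."""
    rng = random.Random(seed)
    noise = lambda: bytes(rng.getrandbits(8) for _ in range(block_size))
    pattern = b"id,client,amount,date\n"
    body_len = (n_blocks - 2) * block_size
    body = (pattern * (body_len // len(pattern) + 1))[:body_len]
    return noise() + body + noise()

data = synthetic_header_footer_file()          # 1 MiB, 256 blocks of 4 KB
blocks = block_entropies(data)
whole_e = shannon_entropy(data)
print(f"whole-file entropy: {whole_e:.2f}")
print(f"first block: {blocks[0]:.2f}, last block: {blocks[-1]:.2f}")
print(f"max body block: {max(blocks[1:-1]):.2f}")
```

The whole-file score stays close to the plain body's entropy because the two encrypted blocks are under 1% of the bytes, while the per-block profile exposes them immediately.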

Detection-engineering foundry loop

The external sample source (MalwareBazaar) requires an auth key to pull by hash. Rather than gate detection R&D on credential availability, I built gemini_foundry.py: it authors real Windows PEs with precisely-known behaviors, compiles them with mingw-w64 (-Os to optimize for size, -s to strip symbols), and packages them as MalwareBazaar-style AES zips so the triage pipeline consumes them identically to a real sample.

The loader recipe produces a PE that exercises the detection-relevant behaviors: runtime API resolution via GetProcAddress/LoadLibraryA from the PEB loader list, a XOR-encoded embedded stage (key 0x5A) decoded at runtime, temp-path discovery, a contained file write (all I/O confined to %TEMP%\gemini_lab_<id>), and an IsDebuggerPresent anti-analysis check. The spec is explicit: behaviors are emulations for detector R&D, not capability demonstrations.

The loop end-to-end:

1. Foundry emits C source with the specimen ID embedded.
2. mingw-w64 cross-compiles it to a PE (inert on Linux).
3. capa independently reads the compiled binary and reports the authored behaviors (link function at runtime, encode data using XOR, get common file path, write file), matching the manifest's expected_capa_substrings.
4. A YARA rule is authored against the ground-truth specimen.
5. gemini_validate_detection.py runs the rule against the positive corpus (the authored PE) and a negative corpus (three unrelated PEs) and emits a verdict.

gemini_validate_detection.py enforces one rule: a detection is not VALIDATED until it has been tested against at least one real positive and at least one real negative. YARA structure alone is not enough; Sigma structure alone is explicitly not enough (the tool says so in the output: verdict: DRAFT for Sigma because it needs a log backend to execute). The foundry loop ran to completion on 2026-06-07T15:34:14Z with verdict VALIDATED, TP=1, FP=0, FN=0, but that verdict applies to the self-test specimen, not to in-the-wild malware families.
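The gate can be summarized as a small decision function. This is a hedged sketch of the rule as described, not the validator's code: the `verdict` function and its parameters are hypothetical, and downgrading to DRAFT on any false positive or false negative is an assumption, since the text only states the coverage requirement and the Sigma behavior.

```python
def verdict(rule_type, positives_tested, negatives_tested, tp, fp, fn):
    """Return "VALIDATED" or "DRAFT" for a single detection rule."""
    # Sigma rules cannot be executed without a log backend, so structure
    # alone never earns more than DRAFT.
    if rule_type.lower() == "sigma":
        return "DRAFT"
    # Coverage gate: at least one real positive and one real negative.
    if positives_tested < 1 or negatives_tested < 1:
        return "DRAFT"
    # Assumption: any miss or false alarm keeps the rule in DRAFT.
    if tp < 1 or fp > 0 or fn > 0:
        return "DRAFT"
    return "VALIDATED"

# The foundry self-test: one authored PE, three unrelated PEs, TP=1, FP=0, FN=0.
print(verdict("yara", positives_tested=1, negatives_tested=3, tp=1, fp=0, fn=0))
```

Keeping the coverage check separate from the result check makes the reason for a DRAFT verdict easy to report: untested is a different failure from tested-and-wrong.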

Calibration is the methodology. The lab runs an explicit honesty gate: a claim requires the tool to have fired on that specific artifact. Recovering 21 AES key schedules from a live capture is a real result; naming which one belongs to the ransomware without a decrypt round-trip would not be. A YARA rule validated against a self-authored PE is a pipeline test, not a family detection. Good DFIR is as much about what you decline to claim as what you prove.

The toolchain

Seven focused tools, each removing a specific bottleneck:

Tool	Purpose
`gemini_triage_local.py`	Docker-free static triage, same schema as the container path
`gemini_keyfind.py`	AES key-schedule recovery from memory dumps (numpy vectorized)
`gemini_ransom_scan.py`	Per-block entropy scan: distinguishes full vs intermittent encryption
`gemini_foundry.py`	Authors contained ground-truth PE specimens for detection R&D
`gemini_validate_detection.py`	YARA/Sigma validator: refuses VALIDATED without real positives + negatives
`gemini_dfir_run.sh`	One-shot DFIR orchestrator: FS + memdump + pcap → `cases/<ts>/`
`gemini_build_venv_stack.sh`	Reproducible build of the whole static backend

gemini_ransom_scan.py is worth calling out: it computes 4 KB block-level Shannon entropy over a victim filesystem tree, classifying files as fully encrypted, intermittent/partial-extent encrypted, or plain. Intermittent encryption (alternating high- and low-entropy blocks) defeats naive whole-file entropy detectors, and it is the evasion pattern commodity ransomware is converging on. gemini_validate_detection.py is the other noteworthy design: it refuses to emit VALIDATED for a Sigma rule from structure alone, and refuses VALIDATED for YARA without both positive and negative coverage. The tool's honesty gate disciplines the detection pipeline the same way the PROVEN/HYPOTHESIS tagging disciplines the analysis notes.