diff --git a/AGENTS.md b/AGENTS.md index f6c3fd2..ef98741 100644 --- a/AGENTS.md +++ b/AGENTS.md @@ -31,9 +31,9 @@ For any changes applying new patches, also update research/0_binary_patch_compar | Variant | Boot Chain | CFW | Make Targets | | ------------------- | :--------: | :-------: | ---------------------------------- | -| **Regular** | 38 patches | 10 phases | `fw_patch` + `cfw_install` | -| **Development** | 49 patches | 12 phases | `fw_patch_dev` + `cfw_install_dev` | -| **Jailbreak (WIP)** | 84 patches | 14 phases | `fw_patch_jb` + `cfw_install_jb` | +| **Regular** | 51 patches | 10 phases | `fw_patch` + `cfw_install` | +| **Development** | 64 patches | 12 phases | `fw_patch_dev` + `cfw_install_dev` | +| **Jailbreak (WIP)** | 126 patches | 14 phases | `fw_patch_jb` + `cfw_install_jb` | See `research/` for detailed firmware pipeline, component origins, patch breakdowns, and boot flow documentation. @@ -73,6 +73,7 @@ sources/ ├── VPhoneMenuConnect.swift # Connect menu — devmode, ping, version, file browser ├── VPhoneMenuInstall.swift # Install menu — IPA installation to guest ├── VPhoneMenuRecord.swift # Record menu — screen recording controls + ├── VPhoneMenuBattery.swift # Battery menu — battery status display │ │ # IPA installation ├── VPhoneIPAInstaller.swift # IPA extraction, signing, and installation @@ -89,8 +90,8 @@ scripts/ ├── patchers/ # Python patcher modules │ ├── iboot.py # iBoot patcher (iBSS/iBEC/LLB) │ ├── iboot_jb.py # JB: iBoot nonce skip -│ ├── kernel.py # Kernel patcher (25 patches) -│ ├── kernel_jb.py # JB: kernel patches (~34) +│ ├── kernel.py # Kernel patcher (26 patches) +│ ├── kernel_jb.py # JB: kernel patches (~40) │ ├── txm.py # TXM patcher │ ├── txm_dev.py # Dev: TXM entitlements/debugger/dev mode @@ -111,7 +112,9 @@ scripts/ ├── setup_machine.sh # Full automation (setup → first boot) ├── setup_tools.sh # Install deps, build toolchain, create venv ├── setup_venv.sh # Create Python venv -└── setup_libimobiledevice.sh # Build 
libimobiledevice from source +├── setup_venv_linux.sh # Create Python venv (Linux) +├── setup_libimobiledevice.sh # Build libimobiledevice from source +└── tail_jb_patch_logs.sh # Tail JB patch log output research/ # Detailed firmware/patch documentation ``` diff --git a/research/0_binary_patch_comparison.md b/research/0_binary_patch_comparison.md index 4115a3e..eb8cc45 100644 --- a/research/0_binary_patch_comparison.md +++ b/research/0_binary_patch_comparison.md @@ -77,7 +77,7 @@ | # | Group | Method | Function | Purpose | JB Enabled | | ----- | ----- | ------------------------------------- | ---------------------------------------------------------------------------------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ | :--------: | | JB-01 | A | `patch_amfi_cdhash_in_trustcache` | `AMFIIsCDHashInTrustCache` | Always return true + store hash | Y | -| JB-02 | A | `patch_amfi_execve_kill_path` | AMFI execve kill return site | Convert shared kill return from deny to allow | Y | +| JB-02 | A | `patch_amfi_execve_kill_path` | AMFI execve kill return site | Convert shared kill return from deny to allow (superseded by C21; standalone only) | N | | JB-03 | C | `patch_cred_label_update_execve` | `_cred_label_update_execve` | Reworked C21-v3: C21-v1 already boots; v3 keeps split late exits and additionally ORs success-only helper bits `0xC` after clearing `0x3F00`; still disabled pending boot validation | N | | JB-04 | C | `patch_hook_cred_label_update_execve` | sandbox `mpo_cred_label_update_execve` wrapper (`ops[18]` -> `sub_FFFFFE00093BDB64`) | Faithful upstream C23 trampoline: copy `VSUID`/`VSGID` owner state into pending cred, set `P_SUGID`, then branch back to wrapper | Y | | JB-05 | C | `patch_kcall10` | `sysent[439]` (`SYS_kas_info` replacement) | Rebuilt ABI-correct kcall cave: `target + 7 args -> uint64 x0`; 
re-enabled after focused dry-run validation | Y | @@ -153,12 +153,13 @@ | iBEC | 3 | 3 | 3 | | LLB | 6 | 6 | 6 | | TXM | 1 | 12 | 12 | -| Kernel | 28 | 28 | 53 | -| Boot chain total | 41 | 52 | 78 | +| Kernel (base) | 28 | 28 | 28 | +| Kernel (JB methods) | - | - | 59 | +| Boot chain total | 41 | 52 | 112 | | CFW binary patches | 4 | 5 | 6 | | CFW installed components | 6 | 7 | 8 | | CFW total | 10 | 12 | 14 | -| Grand total | 51 | 64 | 92 | +| Grand total | 51 | 64 | 126 | ## Ramdisk Variant Matrix @@ -166,7 +167,7 @@ | ------------- | ------------------- | -------------------------------- | -------------------------------------------------------------------------------- | --------------------------------------- | --------------------------------------------------- | | `RAMDISK` | `make fw_patch` | release TXM + base TXM patch (1) | base kernel (28), legacy `*.ramdisk` preferred else derive from pristine CloudOS | restore kernel from `fw_patch` (28) | `krnl.ramdisk.img4` preferred, fallback `krnl.img4` | | `DEV+RAMDISK` | `make fw_patch_dev` | release TXM + base TXM patch (1) | base kernel (28), same derivation rule | restore kernel from `fw_patch_dev` (28) | `krnl.ramdisk.img4` preferred, fallback `krnl.img4` | -| `JB+RAMDISK` | `make fw_patch_jb` | release TXM + base TXM patch (1) | base kernel (28), same derivation rule | restore kernel from `fw_patch_jb` (53) | `krnl.ramdisk.img4` preferred, fallback `krnl.img4` | +| `JB+RAMDISK` | `make fw_patch_jb` | release TXM + base TXM patch (1) | base kernel (28), same derivation rule | restore kernel from `fw_patch_jb` (28+59) | `krnl.ramdisk.img4` preferred, fallback `krnl.img4` | ## Cross-Version Dynamic Snapshot diff --git a/research/boot_data_protection_seputil_macf_investigation.md b/research/boot_data_protection_seputil_macf_investigation.md deleted file mode 100644 index 0917a6a..0000000 --- a/research/boot_data_protection_seputil_macf_investigation.md +++ /dev/null @@ -1,115 +0,0 @@ -# Data-Protection Panic 
Investigation (2026-03-05) - -## Symptom - -On JB boot, system reaches `launchd` boot tasks but then panics with: - -- `Boot task failed: data-protection - exited due to exit(60)` -- repeated host-side retries: - - `[control] vsock: ... Connection reset by peer, retrying...` -- and critical kernel log: - - `IOUC AppleSEPUserClient failed MACF in process pid 6, seputil` - -## What This Confirms - -This is not an early-kernel boot hang. The failure occurs in userspace boot task `data-protection`, where `seputil` cannot pass MACF/IOKit authorization to open SEP user client paths. - -## Static Root-Cause Trace (kernelcache.research.vphone600) - -Using local disassembly on the current vphone600 research kernel: - -1. `failed MACF` log string xref resolves to IOKit MAC dispatch function at: - - function start VA: `0xFFFFFE000825B0C0` -2. The deny path emits `failed MACF` after policy callback dispatch. -3. Callback slot used in this path is: - - `policy->ops + 0x648` => index `201` -4. Current sandbox-extended patch set (before this fix) did not include `ops[201..210]`. -5. Sandbox ops table for this kernel has non-null handlers at: - - `201..210` (`0xFFFFFE00093A654C` ... `0xFFFFFE00093A598C`) - -Interpretation: IOKit MAC hooks remained active and could deny `AppleSEPUserClient` access, matching runtime `IOUC ... failed MACF`. - -## Mitigation Implemented - -### 1) Kernel patcher extension - -Updated: - -- `scripts/patchers/kernel_jb_patch_sandbox_extended.py` - -Added hook indices: - -- `201..210` as `iokit_check_201` ... 
`iokit_check_210` - -Patch action per entry remains: - -- `mov x0,#0` -- `ret` - -### 2) Documentation updates - -Updated: - -- `research/kernel_patch_jb/patch_sandbox_hooks_extended.md` -- `research/0_binary_patch_comparison.md` - -## Local Validation (static) - -Ran `patch_sandbox_hooks_extended()` against current `kernelcache.research.vphone600`: - -- before extension: `52` writes (`26` hooks) -- after extension: `72` writes (`36` hooks) - -New emitted entries include: - -- `_hook_iokit_check_201` ... `_hook_iokit_check_210` - -## Runtime Validation Pending - -Not yet executed in this turn. Required E2E confirmation: - -1. `make fw_patch_jb` -2. restore flow (so patched kernel is installed) -3. `make cfw_install_jb` -4. `make boot` -5. verify disappearance of: - - `IOUC AppleSEPUserClient failed MACF ... seputil` - - `Boot task failed: data-protection - exited due to exit(60)` - -## Notes - -- Canonical `mpo_iokit_*` names for these indices are not fully symbol-resolved in local KC symbols; index-based labeling is used intentionally to avoid incorrect naming. - -## 2026-03-06 Follow-up (still failing after ops[201..210] extension) - -Observed runtime still reports: - -- `IOUC AppleAPFSUserClient failed MACF in process pid 4, mount` -- `IOUC AppleSEPUserClient failed MACF in process pid 6, seputil` - -So per-policy sandbox hook stubs alone are insufficient on this path. - -### Additional Mitigation Added - -Introduced a dedicated JB patch: - -- `patch_iouc_failed_macf` in `scripts/patchers/kernel_jb_patch_iouc_macf.py` - -Method: - -- Anchor on `"failed MACF"` xref. -- Resolve centralized IOUC MACF gate function. 
-- Apply low-risk early return at `fn+4/fn+8`: - - `mov x0, xzr` - - `retab` - -Current static hit on this kernel: - -- function start: `0xfffffe000825b0c0` -- patched: - - `0xfffffe000825b0c4` - - `0xfffffe000825b0c8` - -Related doc: - -- `research/kernel_patch_jb/patch_iouc_failed_macf.md` diff --git a/research/boot_hang_b19_mount_dounmount_strategy_compare.md b/research/boot_hang_b19_mount_dounmount_strategy_compare.md deleted file mode 100644 index 5875092..0000000 --- a/research/boot_hang_b19_mount_dounmount_strategy_compare.md +++ /dev/null @@ -1,628 +0,0 @@ -# Boot Hang Focus: B19 + (B11/B12) Strategy Comparison - -Date: 2026-03-05 -Target binary family: `kernelcache.research.vphone600` (iOS 26.1 / 23B85) - -## Final Outcome (2026-03-05) - -This investigation is complete for the current delivery gate (bootability). - -- B19/MNT strategy switching did **not** recover boot: - - matrix artifact: `vm/ab_matrix_b19_mnt_20260305_034127.csv` - - result: 9/9 combinations failed (`code=2`, watchdog timeout) -- JB-only bisect isolated a stable bootable subset: - - PASS: `A1-A4` - - PASS: `A1-A4 + B5-B8` - - FAIL: tested combinations including `B9+` -- Current boot-safe default in `kernel_jb.py`: - - enabled: `A1-A4 + B5-B8` - - disabled: `B9-B20`, `C21-C24` -- E2E success evidence: - - `vm/testing_exec_watch_20260305_050051.log` - - `vm/testing_exec_watch_20260305_050146.log` - - both reached restore-ready markers (USB mux activated + waiting-for-host gate) - -## Scope - -This note compares two patching styles for boot-hang triage: - -1. B19 (`_IOSecureBSDRoot`) patch style mismatch: - - upstream known-to-work fixed site - - current dynamic patcher site -2. B11/B12 (`___mac_mount` / `_dounmount`) patch style mismatch: - - upstream fixed-site patches - - current dynamic strict-shape patches - -The goal is to make A/B testing reproducible with concrete trigger points, pseudocode, and expected runtime effects. 
- ---- - -## 1) B19 `_IOSecureBSDRoot` mismatch - -### 1.1 Trigger points (clean kernel) - -- `0x0128B598` (`VA 0xFFFFFE000828F598`) in `sub_FFFFFE000828F42C` - - before: `b.ne #0x128b5bc` - - upstream patch: `b #0x128b5bc` (`0x14000009`) -- `0x01362090` (`VA 0xFFFFFE0008366090`) in `sub_FFFFFE0008366008` - - before: `cbz w0, #0x1362234` - - current dynamic patch: `b #0x1362234` (`0x14000069`) - -Current patched checkpoint confirms: - -- `0x01362090` is patched (`b`) -- `0x0128B598` remains unpatched (`b.ne`) - -### 1.2 Function logic and pseudocode - -#### A) Upstream site function (`sub_FFFFFE000828F42C`) - -High-level logic: - -1. query `"SecureRootName"` from `IOPlatformExpert` -2. run provider call -3. release objects -4. if return code equals `0xE00002C1`, branch to fallback path (`sub_FFFFFE0007C6AA58`) - -Pseudocode: - -```c -ret = IOPlatformExpert->call("SecureRootName", ...); -release(...); -if (ret == 0xE00002C1) { - return fallback_path(); -} -return ret; -``` - -Patch effect at `0x128B598` (`b.ne -> b`): - -- always take fallback path, regardless of whether `ret == 0xE00002C1`. - -#### B) Dynamic site function (`sub_FFFFFE0008366008`) - -Key branch: - -- `0x136608C`: callback for `"SecureRoot"` -- `0x1366090`: `cbz w0, #0x1362234` (branch into `"SecureRootName"` block) -- `0x1366234` onward: `"SecureRootName"` handling block - -Pseudocode: - -```c -if (matches("SecureRoot")) { - ok = callback("SecureRoot"); - if (ok == 0) goto SecureRootNameBlock; // cbz w0 - // SecureRoot success/failure handling path -} - -SecureRootNameBlock: -if (matches("SecureRootName")) { - // name-based validation + state sync -} -``` - -Patch effect at `0x1362090` (`cbz -> b`): - -- always jump into `SecureRootNameBlock`, regardless of the value of `ok`. - -### 1.3 A/B variants to test - -1. `B19-A` (upstream helper only): - - patch only `0x128B598` - - keep `0x1362090` original -2. `B19-B` (dynamic main only): - - patch only `0x1362090` - - keep `0x128B598` original -3. 
`B19-C` (both): - - patch both sites - -### 1.4 Expected observables - -- Boot logs around: - - `apfs_find_named_root_snapshot_xid` - - `Need authenticator (81)` - - transition into init / panic frame -- Panic signatures: - - null-deref style FAR near low address (for current failure class) - - stack path involving mount/security callback chain - ---- - -## 2) B11/B12 (`___mac_mount` / `_dounmount`) mismatch - -### 2.1 Trigger points (clean kernel) - -#### Upstream fixed-offset style - -- `0x00CA5D54`: `tbnz w28, #5, #0xca5f18` -> `nop` -- `0x00CA5D88`: `ldrb w8, [x8, #1]` -> `mov x8, xzr` -- `0x00CA8134`: `bl #0xc92ad8` -> `nop` - -#### Current dynamic style (checkpoint) - -- `0x00CA4EAC`: `cbnz w0, #0xca4ec8` -> `nop` (B11) -- `0x00CA81FC`: `bl #0xc9bdbc` -> `nop` (B12) - -And in checkpoint: - -- upstream sites remain original (`0xCA5D54`, `0xCA5D88`, `0xCA8134` unchanged) -- dynamic sites are patched (`0xCA4EAC`, `0xCA81FC` are `nop`) - -### 2.2 Function logic and pseudocode - -#### A) `___mac_mount`-related branch (dynamic site near `0xCA4EA8`) - -Disassembly window: - -- `0xCA4EA8`: `bl ...` -- `0xCA4EAC`: `cbnz w0, deny` -- deny target writes non-zero return (`mov w0, #1`) - -Pseudocode: - -```c -ret = mac_policy_check(...); -if (ret != 0) { // cbnz w0 - return EPERM_like_error; -} -continue_mount(); -``` - -Dynamic patch (`0xCA4EAC -> nop`) effect: - -- ignore `ret != 0` branch and continue mount path. - -#### B) Upstream `___mac_mount` two-site style (`0xCA5D54`, `0xCA5D88`) - -Disassembly window: - -- `0xCA5D54`: `tbnz w28, #5, ...` -- `0xCA5D88`: `ldrb w8, [x8, #1]` - -Pseudocode (behavioral interpretation): - -```c -if (flag_bit5_set(w28)) goto restricted_path; -w8 = *(u8 *)(x8 + 1); -... -``` - -Upstream patches: - -- remove bit-5 gate branch (`tbnz -> nop`) -- force register state (`ldrb -> mov x8, xzr`) - -This is broader state manipulation than dynamic deny-branch patching. 
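The instruction rewrites named in sections 1.1 and 2.1-2.2 (`b.ne -> b`, `cbz -> b`, `-> nop`, `ldrb -> mov x8, xzr`) can be reproduced with a small encoder. The following is a minimal sketch, not the patcher's actual code: the helper names `encode_b` and `encode_mov_xzr` are hypothetical, and only the standard A64 encodings are assumed. It is checked against the two branch words quoted in section 1.1.

```python
# Hypothetical helpers (not taken from scripts/patchers): A64 encodings
# for the rewrite styles discussed in this note.

NOP = 0xD503201F  # architectural A64 NOP


def encode_b(src: int, dst: int) -> int:
    """Encode an unconditional `B` located at `src` that jumps to `dst`."""
    delta = dst - src
    if delta % 4:
        raise ValueError("branch target must be 4-byte aligned")
    imm26 = delta // 4
    if not -(1 << 25) <= imm26 < (1 << 25):
        raise ValueError("target outside the +/-128 MiB range of B")
    return 0x14000000 | (imm26 & 0x03FFFFFF)


def encode_mov_xzr(rd: int) -> int:
    """Encode `mov x<rd>, xzr`, i.e. `orr x<rd>, xzr, xzr`."""
    if not 0 <= rd <= 30:
        raise ValueError("rd must be a general-purpose register 0..30")
    return 0xAA1F03E0 | rd


# Cross-check against the words documented in section 1.1.
assert encode_b(0x128B598, 0x128B5BC) == 0x14000009
assert encode_b(0x1362090, 0x1362234) == 0x14000069
```

Because `B` is PC-relative, the same source and target pair yields the same word whether file offsets or virtual addresses are used, as long as both come from the same address space.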
- -#### C) `_dounmount` path - -Upstream site: - -- `0xCA8134`: `bl #0xc92ad8` -> `nop` - -Dynamic site: - -- `0xCA81FC`: `bl #0xc9bdbc` -> `nop` - -Pseudocode (generic): - -```c -... prepare args ... -ret = mac_or_policy_call_X(...); // site differs between two strategies -... -ret2 = mac_or_policy_call_Y(...); -``` - -Difference: - -- upstream and dynamic disable different call sites in unmount path; -- not equivalent by construction. - -### 2.3 A/B variants to test - -1. `MNT-A` (upstream-only style): - - apply `0xCA5D54`, `0xCA5D88`, `0xCA8134` - - keep `0xCA4EAC`, `0xCA81FC` original -2. `MNT-B` (dynamic-only style): - - apply `0xCA4EAC`, `0xCA81FC` - - keep `0xCA5D54`, `0xCA5D88`, `0xCA8134` original -3. `MNT-C` (both styles): - - apply all five sites - ---- - -## 3) Combined test matrix (recommended) - -For minimal triage noise, run a 3x3 matrix: - -- B19 mode: `B19-A`, `B19-B`, `B19-C` -- mount mode: `MNT-A`, `MNT-B`, `MNT-C` - -Total 9 combinations, each from the same clean baseline kernel. - -Record per run: - -1. last APFS logs before failure/success -2. whether `Need authenticator (81)` appears -3. panic presence and panic PC/FAR -4. whether init proceeds past current hang point - ---- - -## 4) Historical A/B Knobs (used during triage, now removed) - -During the triage phase, temporary runtime knobs were introduced to toggle -upstream-vs-dynamic strategies for B11/B12/B13/B14/B19 and execute the matrix. - -Those knobs are no longer part of the default runtime path after stabilization; -the shipped default now hard-selects the boot-safe subset (`A1-A4 + B5-B8`). - -The triage results from those knobs are preserved in this document and in: - -- `vm/ab_matrix_b19_mnt_20260305_034127.csv` -- `TODO.md` (Boot Hang Research + Progress Update sections) -- `research/0_binary_patch_comparison.md` (Kernelcache section) - ---- - -## 5) Practical note - -Do not mix incremental patching across already-patched binaries when comparing these modes. 
-Always regenerate from clean baseline before each combination; otherwise branch-site interactions can mask true causality. - ---- - -## 6) Additional non-equivalent points (beyond B19/B11/B12) - -This section answers "are there any other differences?" with boot-impact-focused mismatches. - -### 6.1 B13 `_bsd_init auth` is not the same logical site - -#### Trigger points - -- upstream fixed site: `0x00F6D95C` in `sub_FFFFFE0007F6D2B8` -- current dynamic site: `0x00FA2A78` in `sub_FFFFFE0007FA2838` - -#### Function logic (high level) - -- `sub_FFFFFE0007F6D2B8` is a workqueue/thread-call state machine. -- `sub_FFFFFE0007FA2838` is another lock/CAS-heavy control path. - -Neither decompilation corresponds to `_bsd_init` body semantics directly. - -#### Pseudocode (site-level) - -`0xF6D95C` neighborhood: - -```c -... -call unlock_or_wakeup(...); // BL at 0xF6D95C -... -``` - -`0xFA2A78` neighborhood: - -```c -... -stats_counter++; -x2 = x9; // MOV at 0xFA2A78 -cas_release(lock, x2, 0); -... -``` - -#### Risk - -- This is a strong false-equivalence signal. -- If this patch is intended as `_bsd_init` auth bypass, current dynamic hit should be treated as suspect. - -### 6.2 B14 `_spawn_validate_persona` strategy changed from 2xNOP to forced branch - -#### Trigger points - -- upstream fixed sites: `0x00FA7024`, `0x00FA702C` (same function `sub_FFFFFE0007FA6F7C`) -- current dynamic site: `0x00FA694C` (function `sub_FFFFFE0007FA6858`) - -#### Function logic and loop relevance - -In `sub_FFFFFE0007FA6858`, there is an explicit spin loop: - -- `0xFA6ACC`: `LDADD ...` -- `0xFA6AD4`: `B.EQ 0xFA6ACC` (self-loop) - -Pseudocode: - -```c -do { - old = atomic_fetch_add(counter, 1); -} while (old == target); // tight spin at 0xFA6ACC/0xFA6AD4 -``` - -The same function also calls: - -- `sub_FFFFFE0007B034E4` (at `0xFA6A94`) -- `sub_FFFFFE0007B040CC` (at `0xFA6AA8`) - -The panic signature previously mapped into this call chain, so this mismatch is high-priority for 100% CPU / hang triage. 
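The `B.EQ` self-loop described in 6.2 is a short backward conditional branch, which is easy to flag mechanically when reviewing a candidate patch site. The sketch below is one possible way to do that. The function names and the 16-byte span threshold are assumptions for illustration, and the example word is synthesized from the addresses in 6.2 rather than read from the kernelcache.

```python
# Hypothetical triage helper: flag short backward B.cond edges (tight loops).

def decode_b_cond(insn: int, pc: int):
    """Return (cond, target) if `insn` is an A64 B.cond word, else None."""
    if insn & 0xFF000010 != 0x54000000:
        return None
    imm19 = (insn >> 5) & 0x7FFFF
    if imm19 & 0x40000:          # sign-extend the 19-bit word offset
        imm19 -= 1 << 19
    return insn & 0xF, pc + imm19 * 4


def find_tight_back_edges(words, base: int, max_span: int = 16):
    """Yield (pc, target) for B.cond branches that jump back <= max_span bytes."""
    for i, insn in enumerate(words):
        pc = base + 4 * i
        decoded = decode_b_cond(insn, pc)
        if decoded is None:
            continue
        _, target = decoded
        if 0 <= pc - target <= max_span:
            yield pc, target


# Synthetic window: two placeholder NOPs, then B.EQ back to the first word.
window = [0xD503201F, 0xD503201F, 0x54FFFFC0]
print([(hex(pc), hex(t)) for pc, t in find_tight_back_edges(window, 0xFA6ACC)])
```

A hit only shows that a tight loop exists near the site. Whether the patched branch can actually enter it still has to be established from the disassembly.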
- -### 6.3 B9 `_vm_fault_enter_prepare` does not hit the same function - -#### Trigger points - -- upstream fixed site: `0x00BA9E1C` in `sub_FFFFFE0007BA9C48` -- current dynamic site: `0x00BA9BB0` in `sub_FFFFFE0007BA9944` - -#### Pseudocode (site-level) - -`0xBA9E1C`: - -```c -// parameter setup right before BL -ldp x4, x5, [sp, ...]; -bl helper(...); -``` - -`0xBA9BB0`: - -```c -if (w25 == 3) w21 = 2; else w21 = w25; // csel -``` - -These are structurally unrelated. - -### 6.4 B10 `_vm_map_protect` site differs in same large function - -#### Trigger points - -- upstream fixed site: `0x00BC024C` -- current dynamic site: `0x00BC012C` -- both inside `sub_FFFFFE0007BBFA48` - -#### Pseudocode (site-level) - -`0xBC012C`: - -```c -perm = cond ? perm_a : perm_b; // csel -``` - -`0xBC024C`: - -```c -// different control block; not the same selection point -... -``` - -Even in the same function, these are not equivalent branch gates. - -### 6.5 B15 `_task_for_pid` and B17 shared-region are also shifted - -#### Trigger points - -- B15 upstream: `0x00FC383C` (`sub_FFFFFE0007FC34B4`) -- B15 dynamic: `0x00FFF83C` (`sub_FFFFFE0007FFF824`) - -- B17 upstream: `0x010729CC` -- B17 dynamic: `0x01072A88` -- both in `sub_FFFFFE000807272C`, but not same instruction role - -#### Risk - -- These are unlikely to explain early APFS/init mount failure alone, but they are still non-equivalent and should not be assumed interchangeable. - ---- - -## 7) Practical triage order for 100% virtualization CPU - -Given current evidence, prioritize: - -1. B14 strategy A/B first (upstream `0xFA7024/0xFA702C` vs dynamic `0xFA694C`). -2. B13 strategy A/B next (`0xF6D95C` vs `0xFA2A78`). -3. Then B19 and MNT matrix. - -Reason: B14 path contains a known tight spin construct and directly calls the function chain previously observed in panic mapping. 
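The priority order above, followed by the 3x3 matrix from section 3, can be written down as an explicit run plan so that every run is labeled the same way. This is a minimal sketch under the assumption that each step is rebuilt from the clean baseline kernel. The `triage_plan` function and the tuple labels are hypothetical and are not the removed A/B knobs.

```python
from itertools import product

B19_MODES = ("B19-A", "B19-B", "B19-C")
MNT_MODES = ("MNT-A", "MNT-B", "MNT-C")


def triage_plan():
    """Return the ordered list of (stage, variant) runs, highest priority first."""
    plan = []
    # 1) B14 and 2) B13: upstream fixed-site vs current dynamic site.
    for stage in ("B14", "B13"):
        for variant in ("upstream", "dynamic"):
            plan.append((stage, variant))
    # 3) B19 x MNT matrix: nine combinations.
    for b19, mnt in product(B19_MODES, MNT_MODES):
        plan.append(("B19xMNT", f"{b19}+{mnt}"))
    return plan


for stage, variant in triage_plan():
    # Each entry is meant to start from a freshly regenerated clean kernel.
    print(stage, variant)
```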
- ---- - -## 8) Normal boot baseline signature (for pass/fail triage) - -Use the following runtime markers as "normal startup reached restore-ready stage" baseline: - -1. USB bring-up checkpoint completes: - - `CHECKPOINT END: MAIN:[0x040E] enable_usb` -2. Network checkpoint enters and exits without device requirement: - - `CHECKPOINT BEGIN: MAIN:[0x0411] config_network_interface` - - `no device required to enable network interface, skipping` - - `CHECKPOINT END: MAIN:[0x0411] config_network_interface` -3. Restore daemon enters host-wait state: - - `waiting for host to trigger start of restore [timeout of 120 seconds]` -4. USB/NCM path activates and host loopback socket churn appears: - - `IOUSBDeviceController::setupDeviceSetConfiguration: configuration 0 -> 1` - - `AppleUSBDeviceMux::message - kMessageInterfaceWasActivated` - - repeated `sock ... accepted ... 62078 ...` then `sock ... closed` -5. BSD network interface bring-up for `anpi0` succeeds: - - `configureDatagramSizeOnBSDInterface() [anpi0] ... returning 0x00000000` - - `enableBSDInterface() [anpi0], returning 0x00000000` - - `configureIPv6LLOnBSDInterface() [anpi0], IPv6 enable returning 0x00000000` - - `disableTrafficShapingOnBSDInterface() [anpi0], disable traffic shaping returning 0x00000000` - -Practical rule: - -- If A/B variant run reaches marker #3 and then shows #4/#5 progression, treat it as "boot path not stuck in early kernel loop". -- If run stalls before marker #1/#2 completion or never reaches #3, prioritize kernel-side loop/panic investigation. - ---- - -## 9) Why the failing sets are currently excluded - -Short answer: they are not equivalent rewrites on this firmware, and multiple -sites land in different control contexts than expected upstream references. - -IDA-backed findings used for exclusion: - -1. B9 differs by function, not just offset: - - dynamic `0xBA9BB0` in `sub_FFFFFE0007BA9944` - - upstream `0xBA9E1C` in `sub_FFFFFE0007BA9C48` -2. 
B10 is same large function but different decision blocks: - - dynamic `0xBC012C` vs upstream `0xBC024C` in `sub_FFFFFE0007BBFA48` -3. B13 differs by function and behavior: - - dynamic `0xFA2A78` in `sub_FFFFFE0007FA2838` - - upstream `0xF6D95C` in `sub_FFFFFE0007F6D2B8` -4. B14 dynamic path sits in the spin-loop-containing function: - - `sub_FFFFFE0007FA6858` has `0xFA6ACC`/`0xFA6AD4` tight loop - - same path calls `sub_FFFFFE0007B034E4` and `sub_FFFFFE0007B040CC` - -Given this mismatch profile, enabling B9+ as a default set is high risk for -boot regressions until each method is re-derived and validated individually on -the exact kernel build. - ---- - -## 10) Final operational state - -- Default JB boot profile: `A1-A4 + B5-B8` only -- Verified by `BASE_PATCH=jb make testing_exec` reproducibility runs: - - `vm/testing_exec_watch_20260305_050051.log` - - `vm/testing_exec_watch_20260305_050146.log` -- Delivery stance: - - prioritize bootability and deterministic restore-ready progression - - reintroduce B9+ / Group C only behind per-method revalidation - ---- - -## 11) New Field Finding: "restore done but system still not fully up" (`make boot`) - -Source: interactive serial output from `make boot` on 2026-03-05 (user report). - -### 11.1 What the log proves - -This run is **not** failing at the old restore-ready gate and **not** the old -early kernel boot-hang class. - -Observed progression: - -1. APFS root/data/xART/preboot mounts complete in kernel/userspace handoff. -2. `launchd` starts and executes boot tasks. -3. `mount-phase-1`, `mount-phase-2`, `finish-restore`, `init-with-data-volume`, - `keybag`, `usermanagerd` tasks are reached. -4. Log shows: - - `Early boot complete. Continuing system boot.` - - `hello from launchdhook.dylib` / `bye from launchdhook.dylib` - -So the pipeline already crossed into JB userspace initialization. - -### 11.2 Suspicious signals in this run - -1. Early `launchd` assertion: - - `com.apple.xpc.launchd ... assertion failed ... 
0xffffffffffffffff` -2. Ignition warning: - - `libignition: cryptex1 sniff: ignition failed: 8` - - then fallback path continues (`ignition disabled`) and boot tasks proceed. -3. `vphoned` host side repeatedly reports: - - `Connection reset by peer` - - indicates daemon channel is not yet stable/ready during this phase. - -### 11.3 Most likely fault domain (ranked) - -1. **JB-1 launchd modification path (highest probability)**: - - `patch-launchd-jetsam` dynamic branch rewrite may select an incorrect - conditional in some launchd builds. - - `inject-dylib /cores/launchdhook.dylib` adds early runtime side effects. - - The assertion appears in `launchd` startup window, matching this stage. -2. **JB hook/runtime environment coupling**: - - `JB_ROOT_PATH` and BaseBin hook expectations under preboot hash path. - - If path/state is incomplete, startup can degrade without immediate kernel panic. -3. **Less likely: kernel B9+ regression** - - Current default already excludes B9+ and this log clearly reaches deep - userspace boot tasks, so this symptom class is different from earlier - watchdog/restore-gate failures. - -### 11.4 Practical triage to confirm - -Use same restored disk and isolate JB userspace phases: - -1. Baseline control: - - Boot with dev/regular userspace flow (no JB-1 launchd dylib injection). -2. Re-enable only JB-1: - - apply jetsam patch alone first, then add dylib injection. -3. Add JB-2/JB-3 incrementally: - - procursus bootstrap, then BaseBin hooks. -4. Capture first regression point and lock to the exact phase. - -### 11.5 Conclusion for this report - -- Current symptom ("restore completes but cannot fully start") is now best - modeled as a **post-restore userspace startup regression**, centered around - JB launchd modification/hook stages, not the previous kernel early-boot hang. 
- ---- - -## 12) Failed vs Successful Boot Log Comparison (same device class, 2026-03-05) - -Compared inputs: - -- Failing side: `vphone-cli` (startup-hang-fix branch) user-provided `make boot` log. -- Successful side: `vphone-cli-dev` (main) user-provided `make boot` log. - -### 12.1 Signals that appear in both logs (low-priority/noise for this issue) - -The following lines appear on the successful boot too, so they are unlikely to -be the direct blocker for "cannot fully start": - -1. `apfs_find_named_root_snapshot_xid ... No such file or directory (2)` -2. `TXM [Error]: selector: 45 | 78` and `failed to set boot uuid ... 78` -3. `libignition ... cryptex1 sniff: ignition failed: 8` then `ignition disabled` -4. `MKB_INIT: No system keybag found on filesystem.` -5. `mount: failed to migrate Media Keys, error = c002` -6. `Overprovision setup failed ... Ignoring...` - -These are therefore weak root-cause candidates for this specific regression. - -### 12.2 Differential signals (high-value) - -Only/primarily observed in failing run: - -1. Early `launchd` assertion: - - `assertion failed ... launchd + 59944 ... 0xffffffffffffffff` -2. JB launchd hook footprint: - - `hello from launchdhook.dylib` - - `set JB_ROOT_PATH = /private/preboot//jb-vphone/procursus` -3. Host control channel never stabilizes to a healthy daemon session - before manual stop (`Connection reset by peer` keeps repeating). - -Observed in successful run (and absent in failing excerpt): - -1. Host eventually reaches: - - `[control] connected to vphoned v1 ...` - - `[control] pushing update ...` -2. No corresponding early `launchd` assertion line in provided success log. - -### 12.3 Most likely causes (ranked by differential evidence) - -1. **`patch-launchd-jetsam` dynamic hit risk (top suspect)** - The patcher selects a conditional branch dynamically using string xref + - backward window + return-block heuristic. 
A wrong branch rewrite can produce - launchd internal assertion failures while still allowing partial boot-task logs. - -2. **`launchd` dylib injection (`/cores/launchdhook.dylib`) side effects** - Hook runs very early in launchd lifecycle; if environment/setup assumptions - are not met, boot can degrade without immediate kernel panic. - -3. **JB-1 combined effect (jetsam patch + dylib injection), not kernel B9+** - Kernel path already reaches deep userspace tasks in both cases; this no longer - matches the previous watchdog/restore-gate kernel hang signature. - -### 12.4 Recommended isolation sequence (to convert suspicion -> proof) - -Use same restored disk, only vary JB-1 components: - -1. `launchd` unmodified control. -2. Apply jetsam patch only. -3. Apply dylib injection only. -4. Apply both (current JB-1). - -Record for each: - -- whether `launchd assertion failed ... 0xffffffffffffffff` appears -- whether `[control] connected to vphoned v1` appears -- time to first stable userspace service set. 
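The first two per-run observations listed above are plain substring checks on the captured log, so they can be recorded consistently across the four variants. The sketch below assumes the whole `make boot` output has been saved as text. The `classify_boot_log` name and the marker keys are hypothetical, and the sample log is assembled from fragments quoted in this note, not from a real run.

```python
# Hypothetical log classifier for the JB-1 isolation runs.
MARKERS = {
    "launchd_assertion": "assertion failed",
    "vphoned_connected": "[control] connected to vphoned v1",
    "connection_reset": "Connection reset by peer",
}


def classify_boot_log(text: str) -> dict:
    """Report which differential markers appear anywhere in the log text."""
    return {name: needle in text for name, needle in MARKERS.items()}


# Illustrative input only.
sample = "\n".join([
    "hello from launchdhook.dylib",
    "[control] vsock: ... Connection reset by peer, retrying...",
])
print(classify_boot_log(sample))
```

Time to the first stable userspace service set is not covered here, since it needs timestamps that the quoted log excerpts do not show.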
diff --git a/research/boot_jb_mount_failure_investigation.md b/research/boot_jb_mount_failure_investigation.md deleted file mode 100644 index fc5c309..0000000 --- a/research/boot_jb_mount_failure_investigation.md +++ /dev/null @@ -1,223 +0,0 @@ -# JB Mount Failure Investigation (2026-03-04) - -## Symptom - -- `make setup_machine JB=1` reached `cfw_install_jb` and failed at: - - `Failed to mount /dev/disk1s1 at /mnt1 (opts=rw).` - -## Runtime Evidence (Normal Boot) - -From `make boot` serial log: - -- APFS mount tasks fail with permission errors: - - `mount_apfs: volume could not be mounted: Operation not permitted` - - `mount: /private/xarts failed with 77` - - `mount: /private/preboot failed with 77` - - launchd panics: `boot task failure: mount-phase-1 - exited due to exit(77)` -- Ignition/boot path shows entitlement-like failure: - - `handle_get_dev_by_role:13101: disk1s1 This operation needs entitlement` - -This indicates failure in APFS role-based device lookup during early boot mount tasks. - -## Runtime Evidence (DEV Control Run, 2026-03-04) - -From a separate `fw_patch_dev + cfw_install_dev` boot log (not JB): - -- `mount-phase-1` succeeded for xART: - - `disk1s3 mount-complete volume xART` - - `/dev/disk1s3 on /private/xarts ...` -- launch progressed to: - - `data-protection` - - `finish-obliteration` - - `detect-installed-roots` - - `mount-phase-2` - -Interpretation: APFS boot-mount path can work on this build/kernel family after recent APFS gate changes. -This does **not** prove JB flow is fixed; it is a control signal showing the kernel-side path is not universally broken. - -## Flow Separation (Critical) - -- The successful `xART mount-complete` / `mount-phase-2` log is from DEV pipeline: - - `fw_patch_dev` + `cfw_install_dev` -- JB pipeline remains: - - `fw_patch_jb` + `cfw_install_jb` -- `cfw_install_jb` does **not** call `cfw_install_dev`; it runs base `cfw_install.sh` first, then JB-only phases. 
- -## Kernel Artifact Checks - -### 1) Ramdisk kernel identity - -- `vm/Ramdisk/krnl.img4` payload hash was byte-identical to: - - `vm/iPhone17,3_26.1_23B85_Restore/kernelcache.research.vphone600` - -So ramdisk boot was using the same restore kernel payload (no accidental file mismatch in `ramdisk_build`). - -### 2) Patchability state (current VM kernel) - -On `vm/iPhone17,3_26.1_23B85_Restore/kernelcache.research.vphone600`: - -- Base APFS patches: - - `patch_apfs_vfsop_mount_cmp` -> not patchable (already applied) - - `patch_apfs_mount_upgrade_checks` -> not patchable (already applied) -- Key JB patches: - - `patch_mac_mount` -> patchable - - `patch_dounmount` -> patchable - - `patch_kcall10` -> patchable - -Interpretation: kernel is base-patched, but critical JB mount/syscall extensions are still missing. - -### 3) Reference hash comparison - -- CloudOS source `kernelcache.research.vphone600` payload: - - `b6846048f3a60eab5f360fcc0f3dcb5198aa0476c86fb06eb42f6267cdbfcae0` -- VM restore kernel payload: - - `b0523ff40c8a08626549a33d89520cca616672121e762450c654f963f65536a0` - -So restore kernel is modified vs source, but not fully JB-complete. - -## IDA Deep-Dive (APFS mount-phase-1 path) - -### 1) Failing function identified - -- APFS function: `sub_FFFFFE000948EB10` (log name: `handle_get_dev_by_role`) -- Trigger string in function: - - `"%s:%d: %s This operation needs entitlement\\n"` (line 13101) -- Caller xref: - - `sub_FFFFFE000947CFE4` dispatches to `sub_FFFFFE000948EB10` - -### 2) Gate logic at failure site - -The deny path is reached if either check fails: - -- Context gate: - - `BL sub_FFFFFE0007CCB994` - - `CBZ X0, deny` -- "Entitlement" gate (APFS role lookup privilege gate): - - `ADRL X1, "com.apple.apfs.get-dev-by-role"` - - `BL sub_FFFFFE000940CFC8` - - `CBZ W0, deny` -- Secondary role-path gate (role == 2 volume-group path): - - `BL sub_FFFFFE000817C240` - - `CBZ W0, deny` (to line 13115 block) - -The deny block logs line `13101` and returns failure. 
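All three gates above end in a `CBZ` on register 0, so a patcher can confirm that shape before overwriting anything. The following is a sketch of such a guard, assuming a little-endian kernel image held in a `bytearray`. `decode_cbz` and `nop_cbz_on_reg0` are hypothetical names, not the method used in `scripts/patchers/kernel.py`, and the buffer is synthetic.

```python
import struct

NOP = 0xD503201F  # architectural A64 NOP


def decode_cbz(insn: int, pc: int):
    """Return (is_64bit, rt, target) if `insn` is CBZ (not CBNZ), else None."""
    if insn & 0x7F000000 != 0x34000000:
        return None
    imm19 = (insn >> 5) & 0x7FFFF
    if imm19 & 0x40000:          # sign-extend the 19-bit word offset
        imm19 -= 1 << 19
    return bool((insn >> 31) & 1), insn & 0x1F, pc + imm19 * 4


def nop_cbz_on_reg0(buf: bytearray, off: int) -> bool:
    """Replace the word at `off` with NOP only if it is `cbz x0/w0, ...`."""
    (insn,) = struct.unpack_from("<I", buf, off)
    decoded = decode_cbz(insn, off)
    if decoded is None or decoded[1] != 0:
        return False             # unexpected shape: leave the bytes untouched
    struct.pack_into("<I", buf, off, NOP)
    return True


# Synthetic buffer: `cbz x0, +0x20` followed by `cbnz w0, +0x20`.
buf = bytearray(struct.pack("<II", 0xB4000100, 0x35000100))
print(nop_cbz_on_reg0(buf, 0), nop_cbz_on_reg0(buf, 4))
```

Refusing to write when the decoded instruction does not match is what reports a site as not patchable on an image where the gate has already been neutralized.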
- -### 3) Patch sites (current vphone600 kernelcache) - -- File offsets: - - `0x0248AB50` — context gate branch (`CBZ X0, deny`) - - `0x0248AB64` — role-lookup privilege gate (`CBZ W0, deny`) - - `0x0248AC24` — secondary role==2 deny branch (`CBZ W0, deny`) -- All three patched to `NOP` in the additive APFS patch. - -### 4) Additional APFS EPERM(1) return paths in `apfs_vfsop_mount` - -Function: - -- `sub_FFFFFE0009478848` (`apfs_vfsop_mount`) - -Observed EPERM-relevant deny blocks: - -- Root-mount privilege deny: - - log string: `"%s:%d: not allowed to mount as root\n"` - - xref site: `0xFFFFFE000947905C` - - error return: sets `W25 = 1` -- Verification-mount privilege deny: - - log string: `"%s:%d: not allowed to do a verification mount of %s (is_suser %s ; uid %d)\n"` - - xref site: `0xFFFFFE0009479CA0` - - error return: sets `W25 = 1` - -Important relation to existing Patch 13: - -- At `0xFFFFFE0009479044` (same function), current code is `CMP X0, X0` (patched form), - which forces the following `B.EQ` path and should bypass one root privilege check in this region. -- Therefore, if JB still reports `mount_apfs ... Operation not permitted`, remaining EPERM candidates - include other deny branches (including the verification-mount gate path above), not only `handle_get_dev_by_role`. - -## Root Cause (Updated, Two-Stage) - -Stage 1 (confirmed and mitigated): - -- APFS `handle_get_dev_by_role` entitlement/role deny gates were a concrete mount-phase-1 blocker. -- Additive patch now NOPs all three relevant deny branches. - -Stage 2 (still under investigation, JB-only): - -- DEV control run can pass `mount-phase-1`/`mount-phase-2`. -- JB failures must be analyzed with JB-only artifacts/logs and likely involve JB-only deltas - (launchd dylib injection, BaseBin hooks, or JB preboot/bootstrap interaction), in addition to any remaining kernel checks. 
- -## Mitigation Implemented - -### A) Ramdisk kernel split (updated implementation) - -- `scripts/fw_patch_jb.py` - - no longer creates a ramdisk snapshot file -- `scripts/ramdisk_build.py` - - derives ramdisk kernel source internally: - - uses legacy `kernelcache.research.vphone600.ramdisk` if present - - otherwise derives from pristine CloudOS `kernelcache.research.vphone600` - under `ipsws/*CloudOS*/` using base `KernelPatcher` - - builds: - - `Ramdisk/krnl.ramdisk.img4` from derived/base source - - `Ramdisk/krnl.img4` from post-JB restore kernel -- `scripts/ramdisk_send.sh` - - prefers `krnl.ramdisk.img4` when present. - -### B) Additive APFS boot-mount gate bypass (new) - -- Added new base kernel patch method: - - `KernelPatchApfsMountMixin.patch_apfs_get_dev_by_role_entitlement()` -- Added to base kernel patch sequence in `scripts/patchers/kernel.py`. -- Behavior: - - NOPs three deny branches in `handle_get_dev_by_role` - - does not modify existing filesystem patches (APFS snapshot/seal/graft/mount/sandbox hooks remain unchanged). - -### C) JB-only differential identified (for next isolation) - -Compared with DEV flow, JB adds unique early-boot risk factors: - -- launchd binary gets `LC_LOAD_DYLIB` injection for `/cores/launchdhook.dylib` -- `launchdhook.dylib`/BaseBin environment strings include: - - `JB_ROOT_PATH` - - `JB_TWEAKLOADER_PATH` - - explicit launchdhook startup logs (`hello` / `bye`) -- procursus/bootstrap content is written under preboot hash path (`/mnt5//jb-vphone`) - -These do not prove causality yet, but they are the primary JB-only candidates after Stage-1 APFS gate mitigation. - -## Next Validation - -1. Kernel/JB isolation run (requested): - - `make fw_patch_jb` - - `make ramdisk_build` - - `make ramdisk_send` - - run `cfw_install_dev` (not JB) on this JB-patched firmware baseline -2. Compare normal boot result: - - If `mount-phase-1/2` succeeds: strong evidence issue is in JB-only userspace phases. 
- - If it still fails with `EPERM`: continue kernel/APFS deny-path tracing. -3. If step 2 succeeds, add back JB phases incrementally: - - first JB-1 (launchd inject + jetsam patch) - - then JB-2 (preboot bootstrap) - - then JB-3 (BaseBin hooks) - and capture first regression point. - -## 2026-03-05 Follow-up (Data-Protection / SEP UserClient MACF) - -A later failure mode moved past mount-phase and failed in `data-protection`: - -- `IOUC AppleSEPUserClient failed MACF ... seputil` -- `Boot task failed: data-protection - exited due to exit(60)` - -This was traced to unpatched IOKit MAC policy hook range (`ops[201..210]`) in -the sandbox extended hook set. Mitigation and patch details are documented in: - -- `research/boot_data_protection_seputil_macf_investigation.md` - -Follow-up (2026-03-06): - -- Even after `ops[201..210]` extension, runtime still showed: - - `IOUC AppleAPFSUserClient failed MACF ...` - - `IOUC AppleSEPUserClient failed MACF ...` -- A second-stage mitigation was added: - - `patch_iouc_failed_macf` (central IOUC MACF gate low-risk early return). diff --git a/research/boot_launchd_data_volume_lookup_analysis.md b/research/boot_launchd_data_volume_lookup_analysis.md deleted file mode 100644 index 20af586..0000000 --- a/research/boot_launchd_data_volume_lookup_analysis.md +++ /dev/null @@ -1,160 +0,0 @@ -# launchd Boot Sequence Analysis (mount-phase-1 / data volume lookup) - -Date: 2026-03-06 -Target binary in IDA: launchd (Darwin Bootstrapper 7.0.0, libxpc_executables-3089.42.1~6) - -## Scope - -This analysis focuses on the log segment: - -- `mount-phase-1` -- `failed to lookup data volume - Attribute not found` -- `mount: data volume missing, but not required in env: 1` - -and answers where launchd is involved vs where `mount`/APFS is actually failing. 
- -## Ground Truth from launchd (IDA) - -### 1) Boot task source is embedded plist in launchd - -- Embedded config plist string contains: - - `Boot` dictionary - - `mount-phase-1` -> `Program=/sbin/mount`, `ProgramArguments=["mount","-P","1"]` - - `data-protection` -> `Program=/usr/libexec/init_data_protection`, `CSIdentityOverride=com.apple.seputil` - -### 2) Boot task dictionary wiring - -- `sub_10004D978`: - - reads launchd `__TEXT,__config` - - extracts `Boot` dictionary - - stores into `qword_10007F138` - -- `sub_100047B94`: - - lookup task from `qword_10007F138` - - runs gate check (`sub_100047C8C`) - - executes task by calling `sub_100047DD0` - -### 3) Boot sequence order (relevant slice) - -- `start` -> `sub_1000489A4` -> async `sub_100048590` -- `sub_100048590` task order includes: - - `mount-phase-1` - - `data-protection` - - `finish-obliteration` - - `detect-installed-roots` - - `mount-phase-2` - -### 4) Task execution and failure handling - -- `sub_100047DD0`: - - logs `Doing boot task` - - `posix_spawnp()` actual executable - - for non-async tasks waits via `sub_100049180` - -- `sub_100049180`: - - checks exit status - - if `RequireSuccess=true` and exit is failure -> calls `sub_100048D0C` - -- `sub_100048D0C`: - - logs `Boot task failed: %s` - - logs `Panicking in 3 seconds.` - - then panics - -## Key Conclusion: where the quoted error originates - -`failed to lookup data volume - Attribute not found` is **not** generated by launchd internals. - -In this path, launchd is only the orchestrator: - -1. launchd runs `/sbin/mount -P 1` for `mount-phase-1` -2. `mount` prints `failed to lookup data volume...` -3. launchd only sees child exit result, then decides whether to panic based on `RequireSuccess` - -So this error’s primary fault domain is in `mount` + APFS/IOKit interactions, not in launchd task scheduler code. 
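The task-execution behavior recovered above (spawn the program, wait for non-async tasks, escalate only when `RequireSuccess` is set) can be modeled in a few lines. This is an illustrative sketch of that control flow, not launchd's code: `run_boot_task` and its return strings are hypothetical, and the child process is a stand-in Python one-liner rather than `/sbin/mount`.

```python
import subprocess
import sys


def run_boot_task(name: str, argv: list[str], require_success: bool) -> str:
    """Model of the flow: spawn, wait, then decide based on RequireSuccess."""
    result = subprocess.run(argv)  # spawn the child and wait for it to exit
    code = result.returncode       # a negative value means killed by a signal
    if code == 0:
        return "ok"
    reason = f"signal {-code}" if code < 0 else f"exit({code})"
    if require_success:
        # The orchestrator never sees the child's log text, only this status.
        return f"Boot task failed: {name} - exited due to {reason}"
    return f"ignored failure: {reason}"


# Stand-in child that exits with status 60.
child = [sys.executable, "-c", "raise SystemExit(60)"]
print(run_boot_task("example-task", child, require_success=True))
print(run_boot_task("example-task", child, require_success=False))
```

The point of the model is that the parent only observes an exit status, which is why a message such as `failed to lookup data volume` has to originate in the child.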
- -## `mount -P 1` call chain (IDA-verified) - -Target binary: `research/artifacts/launchd_23B85/mount.from_vm_disk.current` - -- `start` @ `0x100003DC8` - - phase path calls `sub_100003480(&env)` then `sub_100003674()` -- `sub_100003480` @ `0x100003480` - - reads `IODeviceTree:/filesystems/fstab` property `os_env_type` - - calls `APFSContainerGetBootDevice(&CFString)` - - builds `/dev/` string in global buffer -- `sub_100003674` @ `0x100003674` - - calls `APFSVolumeRoleFind(, 0x40, &CFArray)` - - on non-zero return: - - `fprintf("%sfailed to lookup data volume - %s\n", ..., strerror(ret & 0x3fff))` - - if single match, converts CFString -> data volume path - -Important: - -- `0x40` is the queried role selector in this build. -- `Attribute not found` corresponds to `ENOATTR` (`93`) after `ret & 0x3fff`. -- phase-1 can continue with warning, but this often cascades into `data-protection` failure in your failing trace set. - -## Updated Causality (with new control evidence) - -User-provided control result: - -- `cfw_install` (without JB extras) reproduces the same failure. -- TXM path is known-good in this setup. - -Implication: - -- JB-only userspace deltas are no longer primary suspects for this error. -- Current highest-confidence differentiator is kernel state/patch delta. - -## Plausible causes (re-ranked) - -### A. kernel-side causes (highest probability now) - -1. APFS role/device lookup path is denied/altered by kernel policy path for `mount` (`IOUC AppleAPFSUserClient ... MACF` class of failure). -2. Kernel APFS patch interaction causes role metadata read path to return "attribute not found". -3. Kernel patch ordering or overlap in `fw_patch_jb` modifies behavior that minimal/non-JB flow does not. - -### B. mount userspace path causes (still relevant, but secondary to A) - -1. `mount -P 1` phase logic expects Data role metadata that is unavailable under current kernel behavior. -2. 
Userspace APFS query path receives a transformed errno/status from the kernel and prints the attribute-missing message.
-
-### C. launchd-level causes (currently de-prioritized)
-
-1. launchd task definition mismatch.
-2. spawn-level failures before mount logic.
-3. task gating differences.
-
-These are less consistent with the new control result and with the observed mount-origin log text.
-
-## Practical meaning
-
-- For this failure, reverse engineering of launchd already gives enough certainty that launchd is the orchestrator only.
-- The next decisive work should move to APFS userspace API return-site tracing and the corresponding kernel handlers.
-- The detailed `mount -P 1` failure/hang matrix is documented in:
-  - `research/boot_mount_phase1_failure_matrix.md`
-
-## Most likely kernel-side silent-fail line (current ranking)
-
-1. `APFSVolumeRoleFind` path reaches APFS userclient method and gets transformed deny/error (most consistent with `IOUC AppleAPFSUserClient failed MACF ... mount` history).
-2. Base APFS entitlement bypass patch (`patch_apfs_get_dev_by_role_entitlement`, patch #16) matched the wrong deny branch or altered control flow for role lookup.
-3. Base sandbox-op stubs (mount/vnode related) hit an unintended target due to ops-table drift.
-
-Lower probability for this exact string:
-
-- launchd plist/task ordering itself
-- fstab format/ramdisk missing (would produce different dominant signatures)
-
-## Artifact notes
-
-Current extracted launchd sample (for reproducible local reference):
-
-- `research/artifacts/launchd_23B85/launchd.from_vm_disk.current`
-  - sha256: `411d730c95d99a088e94b673eff3fa73d6d3cc778b24b476cd0b7866cd037443`
-- `research/artifacts/launchd_23B85/launchd.plist.from_vm_disk.current`
-  - sha256: `dc972e30220b3e9e8323d23ce4a4737d849893dd79e305693de902ff65ddacab`
-
-Observed in this sample:
-
-- no `/cores/launchdhook.dylib` load command
-- launchd embedded boot task plist present and readable
diff --git a/research/boot_launchdhook_assertion_handoff_20260306.md b/research/boot_launchdhook_assertion_handoff_20260306.md
deleted file mode 100644
index 6411b6b..0000000
--- a/research/boot_launchdhook_assertion_handoff_20260306.md
+++ /dev/null
@@ -1,254 +0,0 @@
-# Launchdhook Assertion Handoff (2026-03-06)
-
-## Scope
-
-This note captures the current userspace-side findings for the failing `fw_patch_jb + cfw_install_jb` path.
-It is intended as a handoff artifact for follow-up work on the `fix-boot` branch.
-
-The current symptom is no longer "launchd does not start".
-The updated symptom is:
-
-- `launchd` starts
-- injected `launchdhook.dylib` definitely loads
-- `launchd` then hits an early internal assertion before the expected `bash` / follow-on job chain stabilizes
-
-## Executive Summary
-
-### Confirmed
-
-- The original JB `LC_LOAD_DYLIB /cores/launchdhook.dylib` approach is structurally unsafe on the current `launchd` sample because there is not enough load-command slack.
-- A short-path alias experiment fixed the Mach-O header-space problem: - - `/cores/launchdhook.dylib` requires 56 bytes and overruns into `__TEXT,__text` - - `/cores/b` still requires 40 bytes and also overruns - - `/b` requires 32 bytes and fits exactly after removing `LC_CODE_SIGNATURE` -- Runtime test with `/b` proves the short-path alias loads successfully, but the main failure remains: - - `launchdhook.dylib` prints its startup logs - - `launchd` then asserts early: `launchd + 59944 ... 0xffffffffffffffff` - -### Current conclusion - -The short-path `/b` alias fixes the **injection-space** problem, but does **not** fix the **launchd assertion**. -So the remaining problem is now more likely in the hook logic (especially early XPC / daemon config hooks) than in the raw load-command insertion path. - -## Evidence Collected - -### 1. Mach-O injection space audit - -Local dry-run against `vm/.cfw_temp/launchd` established the following: - -- Existing load-command slack before the first section: 16 bytes -- After stripping `LC_CODE_SIGNATURE`: 32 bytes -- Required command sizes: - - `/cores/launchdhook.dylib` -> 56 bytes - - `/cores/b` -> 40 bytes - - `/b` -> 32 bytes - -Observed effect of the original long-path injection: - -- `LC_LOAD_DYLIB /cores/launchdhook.dylib` overwrote the beginning of `__TEXT,__text` -- first instructions at the start of the text section were replaced by injected path bytes - -Observed effect of the short-path injection: - -- `LC_LOAD_DYLIB /b` fits exactly in the available 32 bytes after `LC_CODE_SIGNATURE` removal -- no additional overwrite into `__TEXT,__text` is needed for that path - -### 2. 
Device-side mount and payload verification - -Inside ramdisk shell, manual mount and inspection showed: - -- `/dev/disk1s1` mounted at `/mnt1` -- `/dev/disk1s5` mounted at `/mnt5` -- `/mnt1/b` exists and is a Mach-O dylib -- `/mnt1/cores/launchdhook.dylib` exists and is a Mach-O dylib -- `/mnt1/cores/systemhook.dylib` and `/mnt1/cores/libellekit.dylib` are also present - -Important clarification: - -- `/.b` is an existing hidden root directory on this filesystem and is unrelated to the alias experiment -- the experiment path is `/b`, not `/.b` - -### 3. Runtime serial log after switching to `/b` - -The following lines appeared during boot: - -- `set JB_ROOT_PATH = /private/preboot//jb-vphone/procursus` -- `=========== hello from launchdhook.dylib ===========` -- `=========== bye from launchdhook.dylib ===========` -- `com.apple.xpc.launchd ... assertion failed: ... launchd + 59944 ... 0xffffffffffffffff` - -Interpretation: - -- `/b` injection is working -- `launchdhook.dylib` is loaded and runs its initializer path -- the failure is no longer attributable to the long path not loading or to the Mach-O injection missing outright - -## Source-Backed Analysis from Dopamine BaseBin - -Source tree used: - -- `/Users/qaq/Documents/GitHub/Dopamine/BaseBin` - -### 1. launchdhook initialization order - -From `Dopamine/BaseBin/launchdhook/src/main.m`, the constructor initializes hooks in this order: - -1. `initXPCHooks();` -2. `initDaemonHooks();` -3. `initSpawnHooks();` -4. `initIPCHooks();` -5. `initJetsamHook();` - -This matters because the current assertion happens very early, after `launchdhook` has definitely run. -That makes the earlier hooks higher-priority suspects than spawn-time behavior. - -### 2. 
What `initDaemonHooks()` actually does - -From `Dopamine/BaseBin/launchdhook/src/daemon_hook.m`: - -- hooks `xpc_dictionary_get_value` -- rewrites behavior for these keys: - - `LaunchDaemons` - - `Paths` - - `com.apple.private.xpc.launchd.userspace-reboot` - -Behavior summary: - -- appends jailbreak daemon plist entries from: - - `JBROOT_PATH("/basebin/LaunchDaemons")` - - `JBROOT_PATH("/Library/LaunchDaemons")` -- appends those same directories to `Paths` -- conditionally returns `com.apple.private.iowatchdog.user-access` when `userspace-reboot` is false/missing - -This hook touches exactly the kind of launchd configuration objects that are consulted during early daemon/bootstrap setup. - -### 3. What `initSpawnHooks()` actually does - -From `Dopamine/BaseBin/launchdhook/src/spawn_hook.c`: - -- hooks `__posix_spawn` -- during early boot, it intentionally avoids broad injection until `xpcproxy` appears -- once `xpcproxy` is seen, it flips out of early-boot mode and uses `posix_spawn_hook_shared(...)` - -Interpretation: - -- spawn hook is real, but it is comparatively later than the daemon config hook -- given the current assertion timing, `initSpawnHooks()` is no longer the top suspect - -### 4. What `initXPCHooks()` does - -From `Dopamine/BaseBin/launchdhook/src/xpc_hook.c`: - -- hooks `xpc_receive_mach_msg` -- participates in jbserver message handling and filtering inside launchd/XPC path - -This is also an early-launchd hook and remains a second-tier suspect if daemon-hook isolation does not clear the assertion. - -### 5. 
Runtime jetsam hook vs our static jetsam patch - -From `Dopamine/BaseBin/launchdhook/src/jetsam_hook.c`: - -- Dopamine also installs a runtime hook on `memorystatus_control` -- this is separate from the repo's static `scripts/patchers/cfw_patch_jetsam.py` binary patch - -Therefore two different "jetsam" mechanisms now exist in the failing path: - -- static launchd branch patch -- runtime `memorystatus_control` hook - -This does not prove either is the current cause, but it means the term "jetsam patch" must be disambiguated in future debugging. - -## Current Suspect Ranking - -### Highest probability - -1. **`initDaemonHooks()` / `daemon_hook.m`** - - hooks `xpc_dictionary_get_value` - - mutates `LaunchDaemons` and `Paths` - - timing matches the observed early `launchd` assertion better than spawn-time logic - -### Medium probability - -2. **`initXPCHooks()` / `xpc_hook.c`** - - also runs before spawn hook - - directly changes launchd/XPC message handling - -3. **static `patch-launchd-jetsam` matcher** - - still considered risky because its matching strategy is heuristic and not CFG-constrained - - but the `/b` experiment shows the assertion survives after fixing the obvious load-command overflow issue - -### Lower probability for the current symptom timing - -4. 
**`initSpawnHooks()` / `spawn_hook.c`** - - still relevant for later `bash` / job launch failures - - but no longer the best first suspect for the early `launchd + 59944` assertion - -## Recommended Isolation Order for `fix-boot` - -### Stage 1: no-daemon-hook control - -Goal: - -- keep `launchdhook.dylib` loading -- keep `/b` short-path alias experiment in place -- disable only `initDaemonHooks()` - -Reason: - -- this is the cleanest test of the current top suspect -- if the assertion disappears, the root issue is inside `daemon_hook.m` - -### Stage 2: no-xpc-hook control - -If stage 1 still asserts: - -- restore daemon hook or keep it off, but disable `initXPCHooks()` next -- test whether the assertion is tied to XPC receive hook path instead - -### Stage 3: no-spawn-hook control - -Only after stages 1 and 2: - -- disable `initSpawnHooks()` -- use this to isolate later `bash` / child-process failures if the launchd assertion is already gone or moves later - -### Stage 4: revisit static launchd jetsam patch - -If all runtime-hook controls still fail: - -- re-audit `scripts/patchers/cfw_patch_jetsam.py` -- prefer a source-backed or CFG-backed site selection instead of the current backward-scan heuristic - -## Concrete Handoff Notes for Claude - -### Facts - -- `/b` injection is confirmed working on-device -- `launchdhook.dylib` definitely runs -- launchd still asserts at `launchd + 59944` -- Dopamine source confirms `initDaemonHooks()` runs before `initSpawnHooks()` - -### Inference - -- the early assertion is more likely to be caused by `daemon_hook.m` or `xpc_hook.c` than by `spawn_hook.c` - -### Best next change - -Implement a **minimal no-daemon-hook build** first: - -- edit `Dopamine/BaseBin/launchdhook/src/main.m` -- temporarily disable only `initDaemonHooks();` -- rebuild `launchdhook.dylib` -- keep `/b` alias loading strategy unchanged for the control run - -## Related Files - -- `scripts/cfw_install_jb.sh` -- `scripts/patchers/cfw_inject_dylib.py` -- 
`scripts/patchers/cfw_patch_jetsam.py` -- `research/boot_jb_mount_failure_investigation.md` -- `research/boot_hang_b19_mount_dounmount_strategy_compare.md` -- `Dopamine/BaseBin/launchdhook/src/main.m` -- `Dopamine/BaseBin/launchdhook/src/daemon_hook.m` -- `Dopamine/BaseBin/launchdhook/src/spawn_hook.c` -- `Dopamine/BaseBin/launchdhook/src/xpc_hook.c` diff --git a/research/boot_log_fail_vs_success_analysis.md b/research/boot_log_fail_vs_success_analysis.md deleted file mode 100644 index a6ce8a9..0000000 --- a/research/boot_log_fail_vs_success_analysis.md +++ /dev/null @@ -1,107 +0,0 @@ -# Boot Log Comparison Analysis (fail vs success) - -Date: 2026-03-06 -Scope: compare `/Users/qaq/Desktop/boot.fail.log` and `/Users/qaq/Desktop/boot.success.log` for the current startup failure investigation. - -## Executive Verdict - -- The fail path is a `launchd` userspace panic caused by `data-protection` task `SIGTRAP`, not an APFS kernel panic. -- The immediate trigger in fail log is that mount could not find the APFS Data volume metadata: - - `failed to lookup data volume - Attribute not found` - - `mount: data volume missing, but not required in env: 1` -- APFS itself does load in both logs with the same version (`2632.40.15`) and continues mounting volumes. -- Based on these two logs alone, evidence is stronger for "Data volume discovery/mapping issue" than "APFS patch count too high". - -## Key Evidence - -1. APFS module load is healthy in both runs - -- Fail: `apfs_module_start ... com.apple.filesystems.apfs, v2632.40.15` (line 178) -- Success: same APFS load/version (line 250) -- Interpretation: no direct sign that APFS kext fails to initialize in fail run. - -2. 
First hard divergence is in mount-phase-1 Data volume resolution - -- Fail: - - `failed to lookup data volume - Attribute not found` (line 420) - - `mount: data volume missing, but not required in env: 1` (line 421) -- Success: - - `mount: found boot container: /dev/disk1, data volume: /dev/disk1s2 env: 1` (line 423) -- Interpretation: fail run cannot resolve data volume metadata; success run can. - -3. data-protection task outcome differs immediately after that - -- Fail: - - `(data-protection) : exited due to SIGTRAP` (line 432) - - `Boot task failed: data-protection - exited due to SIGTRAP` (line 433) - - `userspace panic` follows (line 458) -- Success: - - `init_data_protection: Gigalocker initialization completed` (line 434) - - boot continues into `mount-phase-2` and beyond (line 502+) -- Interpretation: the crash is in boot task flow after data volume lookup failure, not in APFS module load. - -4. Success path shows APFS warnings that are non-fatal - -- `mount: failed to migrate Media Keys, error = c002` (line 522) -- `mount_phase_two ... Overprovision setup failed ... Ignoring...` (line 560) -- Interpretation: APFS/AKS warnings can be tolerated when data volume path is intact; these are not the blocking condition here. - -## Additional Differences That Confound Direct "Patch Count" Attribution - -- Different host build/hash inputs: - - vphoned `GIT_HASH` differs (`e4456e9` vs `fd08c43`) - - binary path differs (`vphone-cli` vs `vphone-cli-dev`) - - `vphoned` signed hash differs -- Different device identity: - - ECID differs across logs -- Different APFS checkpoint state: - - Fail: `cleanly-unmounted`, largest xid `198` - - Success: `reloading after unclean unmount`, largest xid `491` - -These differences mean this is not a strict A/B test of only "APFS patch count". - -## Assessment of "APFS patch applied too much?" - -Current confidence: low-to-medium for that hypothesis from logs alone. 
-
-What logs support:
-
-- The failure does involve the APFS mount phase and data-protection.
-
-What logs do not support:
-
-- No APFS module crash/oops/panic.
-- No explicit APFS patch integrity failure.
-- The strongest fail signal is a missing data volume attribute, not an APFS code-path abort.
-
-More likely from current evidence:
-
-- APFS container/volume role metadata mismatch, or
-- environment/image drift between the two runs, causing different boot task assumptions.
-
-## Suggested Next Validation (minimal and decisive)
-
-1. Re-run with identical binaries and the same VM snapshot, toggling only the APFS-related patch set.
-2. Capture APFS volume-role metadata before the boot task runs (expect the `disk1s2` Data role to be discoverable).
-3. Compare checksums of the generated firmware/CFW artifacts between the fail and success pipelines.
-4. If the failure reproduces only with the APFS patch delta, bisect the APFS patch subset around the data-volume lookup path.
-
-## Bottom Line
-
-From these two logs, the actionable breakpoint is:
-
-- "data volume lookup failed" -> "data-protection SIGTRAP" -> userspace panic.
-
-This is a stronger lead than "APFS patch count over-applied", and should be the first branch to validate.
-
-## Update (Control Run)
-
-New control signal from user:
-
-- Same failure reproduces with `cfw_install` (without JB extras).
-- TXM is known working in this control.
-
-Updated implication:
-
-- The prior "JB userspace difference" suspicion should be de-prioritized.
-- Current primary suspect becomes kernel delta (especially APFS/IOUC/MACF-related behavior under `mount -P 1`).
diff --git a/research/boot_mount_phase1_failure_matrix.md b/research/boot_mount_phase1_failure_matrix.md deleted file mode 100644 index 9436861..0000000 --- a/research/boot_mount_phase1_failure_matrix.md +++ /dev/null @@ -1,195 +0,0 @@ -# mount `-P 1` Failure / Hang Matrix (IDA) - -Date: 2026-03-06 -Target: `research/artifacts/launchd_23B85/mount.from_vm_disk.current` - -## What `mount -P 1` actually does - -Main entry: `start` @ `0x100003DC8` - -For `-P 1` (phase-1): - -1. Parse phase from `-P` and set global `dword_1000101F4 = 1`. -2. Call `setfsent()` and iterate fstab entries. -3. Resolve boot container/device via `sub_100003480` (`APFSContainerGetBootDevice`). -4. Resolve data volume via `sub_100003674` (`APFSVolumeRoleFind`). -5. Print either: - - `mount: found boot container: ..., data volume: ..., env: ...` - - or `mount: data volume missing, but not required in env: ...` -6. Continue mounting entries with pass number == phase via `sub_1000045B0` (exec `mount_`). - -Important: for phase-1, missing data volume is normally **not fatal**. - -### Critical implementation detail (IDA) - -- `sub_100003674` calls: - - `APFSVolumeRoleFind(, 0x40, &outArray)` -- If return != 0: - - prints `failed to lookup data volume - %s` with `strerror(ret & 0x3fff)` -- `Attribute not found` in your log maps to Darwin `ENOATTR(93)`. - -### Important caveat on `93` provenance - -- Confirmed fact: - - `mount` sees `APFSVolumeRoleFind` return value whose low 14 bits are `93`. -- Not yet proven: - - that kernel returns `93` directly. -- Also plausible: - - APFS userspace layer maps another kernel/APFS status to `ENOATTR` before returning to `mount`. - -## All plausible causes for phase-1 fail/hang - -### A) Early argument / mode errors (immediate fail) - -1. Invalid `-P` value - - message: `-P flag requires a valid mount phase number` - - location: `start` (`0x1000039C4`..`0x100003A70`) - -2. 
Invalid invocation shape (bad argv combination) - - falls into usage path (`sub_1000043B0`) - -### B) Boot container / APFS role lookup path - -1. Cannot read filesystem info from IORegistry (`os_env_type`) - - message: `failed to get filesystem info` - - function: `sub_100003480` - -2. `APFSContainerGetBootDevice` failure - - message: `failed to get boot device - ...` (with retry loop outside restore env) - - function: `sub_100003480` - -3. `APFSVolumeRoleFind` failure - - message: `failed to lookup data volume - ...` - - function: `sub_100003674` - -4. Multiple Data volumes found - - message: `found multiple data volumes` - - function: `sub_100003674` - -Note: - -- phase-1 usually continues after (3)/(4). -- phase-2 has stricter fatal behavior on missing data volume in env=1. - -### C) fstab traversal / entry filtering issues - -1. `setfsent()` failure - - message: `mount: can't get filesystem checklist` - - fatal for phase run - -2. Entry type / spec/path invalidity - - examples: - - `%s: invalid special file or file system.` - - `%s: unknown special file or file system.` - - `You must specify a filesystem type with -t.` - -These are input/config failures before actual fs-specific helper mount. - -### D) Per-filesystem mount helper failures (major phase-1 failure source) - -Dispatcher: `sub_1000045B0` - -1. FSKit path failure (`sub_100000BC0`) - - messages: - - `File system named %s not found` - - `File system named %s unable to mount` - - `FSKit unavailable` - -2. `fork()` / `waitpid()` / child process control failures - - messages include wait/fork warnings in helper path - -3. `exec` failure for `mount_` helpers - - tries `/sbin/mount_` then fallback paths under `/System/Library/Filesystems/...` - - if all fail -> returns mapped failure code - -4. Helper exits non-zero or gets signaled - - parent treats as mount failure and propagates code - -This bucket is the most common direct reason phase-1 exits non-zero. 
- -### E) Ramdisk special path failures (if ramdisk entry is hit in phase-1) - -Ramdisk path: `sub_100002688` + `sub_100002C34` + command wrapper `sub_100002EA4` - -Possible failures: - -1. preflight format/option parsing fail (`Ramdisk fstab not in expected format.`) -2. `mount_tmpfs` exec or command-run failures -3. copyfile / final mount / umount failures - -Not always relevant, but can fail phase-1 if fstab phase-1 includes ramdisk flow. - -### F) Kernel / IOKit policy-deny mediated failures (high-probability in your current repro) - -From your runtime evidence and control results: - -1. `mount` process can hit IOUC/MACF deny path on APFS UserClient access. -2. userspace may surface this as role/attribute lookup failure string, while root cause is kernel-side deny/altered return. - -Given: - -- same failure reproduces with non-JB `cfw_install` -- TXM known-good - -current priority remains kernel delta analysis. - -## Kernel patch candidates for this specific signature (ranked) - -### 1) Base patch #16 (`patch_apfs_get_dev_by_role_entitlement`) — highest - -Why high: - -- It directly targets APFS get-dev-by-role gate, which is exactly adjacent to `APFSVolumeRoleFind` behavior. -- It NOPs conditional branches by pattern heuristics; a false match can silently alter return path while keeping system alive. -- Symptom shape fits: boot container lookup can still succeed, but role lookup returns `ENOATTR`. -- Live-kernel validation status: - - patch #16 is present (all 3 target branches are `nop` at runtime). - - therefore current question is semantic side effect, not "patch missing". - -### 2) Base sandbox hook patch (`patch_sandbox_hooks`) — medium - -Why medium: - -- Touches mount/vnode MACF paths by ops-table index. -- If ops index resolution drifts, wrong function may be stubbed and produce semantic corruption instead of crash. 
- -### 3) Base APFS mount checks (#13/#14) — lower for this exact error - -Why lower: - -- These primarily alter mount authorization/upgrade checks. -- Less directly tied to role-attribute lookup API return code, but still in APFS mount vicinity. - -## What to do next (action order) - -1. Confirm userland return-site: - - break at `sub_1000036C0` (`BL _APFSVolumeRoleFind`) and inspect `w0` after return. - - expected failing value path: `w0 & 0x3fff == 93`. -2. Correlate with kernel-side return path in the same boot: - - break/trace APFS kernel role lookup function return (`handle_get_dev_by_role` path) and record final returned `w0`. - - determine whether kernel returns `93`, `22`, or other value when userspace later sees `93`. -3. Correlate with kernel log at same timestamp: - - look for `IOUC AppleAPFSUserClient failed MACF in process mount`. -4. Do base-kernel patch isolation first (not JB methods): - - run with base patch #16 disabled (or reverted) while keeping others unchanged. - - if failure clears, root cause is narrowed to entitlement-bypass matcher. -5. If #16 is not root cause, isolate base sandbox hook patch. -6. Only then continue to JB-only methods (`VPHONE_JB_DISABLE_METHODS`), because your latest control says non-JB install still reproduces. - -## Hang/stall-specific points (not just hard fail) - -1. Boot-device lookup retry loop (`APFSContainerGetBootDevice`) with sleep retries. -2. Child `mount_` helper blocking in kernel/IO path (parent waiting in `waitpid`). -3. External command wrapper (`sub_100002EA4`) blocking while waiting for command output/exit. - -These produce "looks stuck" behavior even before explicit non-zero exit. - -## Practical triage checklist for phase-1 - -1. Confirm exact failing subpath: - - preflight/APFS lookup vs helper mount vs waitpid/exec. -2. Correlate with kernel log at same timestamp: - - especially `IOUC AppleAPFSUserClient ... mount`. -3. 
Separate: - - non-fatal data-volume warning in phase-1 - - true fatal return path that makes launchd panic on `RequireSuccess`. diff --git a/research/kernel_jb_patch_notes.md b/research/kernel_jb_patch_notes.md index 346791a..cf40409 100644 --- a/research/kernel_jb_patch_notes.md +++ b/research/kernel_jb_patch_notes.md @@ -1,6 +1,6 @@ # Kernel JB Remaining Patches — Research Notes -Last updated: 2026-03-04 +Last updated: 2026-03-07 ## Overview @@ -12,6 +12,11 @@ Last updated: 2026-03-04 Two methods added since initial document: `patch_shared_region_map`, `patch_io_secure_bsd_root`. Three previously failing patches (`patch_nvram_verify_permission`, `patch_thid_should_crash`, `patch_hook_cred_label_update_execve`) have been implemented — see details below. +On 2026-03-06, three patches were retargeted after IDA-MCP re-analysis revealed their matchers were hitting wrong sites: +- `patch_bsd_init_auth` — was hitting `exec_handle_sugid` instead of the real `bsd_init` rootauth gate +- `patch_io_secure_bsd_root` — was patching the `"SecureRoot"` dispatch branch instead of the `"SecureRootName"` deny-return +- `patch_vm_fault_enter_prepare` — was NOPing a `pmap_lock_phys_page()` call instead of the upstream `cs_bypass` gate + Upstream reference: `/Users/qaq/Documents/GitHub/super-tart-vphone/CFW/patch_fw.py` Test kernel: `vm/iPhone17,3_26.1_23B85_Restore/kernelcache.release.vphone600` (IM4P-wrapped, bvx2 compressed) @@ -383,6 +388,24 @@ Should have moderate caller count (hundreds). **Problem**: Needed `_vfs_context_current` and `_vnode_getattr` — 0 symbols available. **Solution**: Eliminated `_vfs_context_current` entirely — shellcode constructs vfs_context inline on stack via `mrs x8, tpidr_el1` + `stp x8, x0, [sp, #0x70]`. `_vnode_getattr` found via "vnode_getattr" string anchor. Hook index found dynamically (scan first 30 ops entries). Code cave allocated via `_find_code_cave(180)`. 
+### patch_bsd_init_auth — RETARGETED (2026-03-06) + +**Historical repo behavior**: matched `ldr x0,[xN,#0x2b8]; cbz x0; bl` pattern, which landed on `exec_handle_sugid` at `0xFFFFFE0007FB09DC` — a false positive caused by `/dev/null` string overlap in the heuristic scoring. +**Problem**: the old matcher targeted the wrong function entirely; patching `exec_handle_sugid` instead of the real `bsd_init` rootauth gate could break boot by mutating an exec/credential path. +**Current status**: retargeted to the real `FSIOC_KERNEL_ROOTAUTH` return check in `bsd_init`. The new matcher recovers `bsd_init` via in-kernel string xrefs, locates the rootvp panic block (`"rootvp not authenticated after mounting"`), finds the unique in-function indirect call (`BLRAA`) preceded by the `0x80046833` (`FSIOC_KERNEL_ROOTAUTH`) literal, and NOPs the subsequent `CBNZ W0, panic`. Live patch hit: `0xFFFFFE0007F7B98C` / file offset `0x00F7798C`. See `research/kernel_patch_jb/patch_bsd_init_auth.md`. + +### patch_io_secure_bsd_root — RETARGETED (2026-03-06) + +**Historical repo behavior**: fallback heuristic selected the first `BL* + CBZ W0` site in `AppleARMPE::callPlatformFunction`, landing on the `"SecureRoot"` name-match gate at `0xFFFFFE000836E1F0` / file offset `0x0136A1F0`. This changed generic platform-function dispatch routing, not just the deny return. +**Problem**: the patched branch was the `isEqualTo("SecureRoot")` check, not the `"SecureRootName"` policy result used by `IOSecureBSDRoot()`. The old `CBZ->B` rewrite could corrupt control flow for unrelated platform-function calls. +**Current status**: retargeted to the final `"SecureRootName"` deny-return selector: `CSEL W22, WZR, W9, NE` at `0xFFFFFE000836E464` / file offset `0x0136A464` is replaced with `MOV W22, #0`. This preserves the string comparison, callback synchronization, and state updates, and only forces the final policy return from `kIOReturnNotPrivileged` to success. 
See `research/kernel_patch_jb/patch_io_secure_bsd_root.md`. + +### patch_vm_fault_enter_prepare — RETARGETED (2026-03-06) + +**Historical repo behavior**: matcher looked for `BL(rare) + LDRB [xN,#0x2c] + TBZ` and NOPed the BL at `0xFFFFFE0007BB898C`, which was actually a `pmap_lock_phys_page()` call inside the `VM_PAGE_CONSUME_CLUSTERED` macro — breaking lock/unlock pairing in the VM fault path. +**Problem**: the derived matcher overfit the wrong local shape. The upstream 26.1 patch targeted the `cs_bypass` fast-path gate (`TBZ W22, #3`), not the clustered-page lock helper. NOPing only the lock acquire while the unlock still ran caused unbalanced lock state, explaining boot failures. +**Current status**: retargeted to the upstream semantic site — `TBZ W22, #3, ...` (where W22 bit 3 = `fault_info->cs_bypass`) at file offset `0x00BA9E1C` / VA `0xFFFFFE0007BADE1C` is replaced with `NOP`, forcing the `cs_bypass` fast path unconditionally. This matches XNU's `vm_fault_cs_check_violation()` logic and preserves lock pairing and page accounting. See `research/kernel_patch_jb/patch_vm_fault_enter_prepare.md`. 
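
The three retargeted sites above each quote a VA and a file offset, so the pairs can be cross-checked arithmetically. The following is a minimal sketch, not the repo's patcher code: `SITES`, `slide`, `mov_w_zero`, and `le32` are hypothetical names introduced here, and the single constant VA-to-offset slide is an assumption derived from the quoted numbers rather than read from the Mach-O headers.

```python
import struct

# (VA, file offset) pairs copied from the notes above.
SITES = {
    "patch_bsd_init_auth": (0xFFFFFE0007F7B98C, 0x00F7798C),
    "patch_io_secure_bsd_root": (0xFFFFFE000836E464, 0x0136A464),
    "patch_vm_fault_enter_prepare": (0xFFFFFE0007BADE1C, 0x00BA9E1C),
}

NOP = 0xD503201F  # AArch64 NOP


def mov_w_zero(rd: int) -> int:
    """Encode MOVZ Wd, #0, the instruction behind the `MOV Wd, #0` alias."""
    return 0x52800000 | (rd & 0x1F)


def slide(va: int, file_off: int) -> int:
    """Difference between a VA and its file offset (assumed constant here)."""
    return va - file_off


def le32(word: int) -> bytes:
    """Instruction word as it is laid out in the little-endian image."""
    return struct.pack("<I", word)


slides = {name: slide(va, off) for name, (va, off) in SITES.items()}
```

If a newly reported hit gives a different slide from the others, that is a cheap hint that either the VA or the file offset was transcribed incorrectly.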
+ --- ## Environment Notes diff --git a/research/kernel_patch_sandbox_hooks_21_26_validation.md b/research/kernel_patch_sandbox_hooks_17_26_validation.md similarity index 87% rename from research/kernel_patch_sandbox_hooks_21_26_validation.md rename to research/kernel_patch_sandbox_hooks_17_26_validation.md index a3ccdd4..75f2651 100644 --- a/research/kernel_patch_sandbox_hooks_21_26_validation.md +++ b/research/kernel_patch_sandbox_hooks_17_26_validation.md @@ -1,4 +1,4 @@ -# Kernel Patch Validation: Sandbox Hooks 21-26 (Regular/Development) +# Kernel Patch Validation: Sandbox Hooks 17-26 (Regular/Development) Date: 2026-03-05 @@ -6,6 +6,8 @@ Date: 2026-03-05 Validate the following non-JB kernel patches on a freshly prepared (unpatched) firmware kernelcache: +- 17/18 `file_check_mmap`: `mov x0,#0` + `ret` +- 19/20 `mount_check_mount`: `mov x0,#0` + `ret` - 21/22 `mount_check_remount`: `mov x0,#0` + `ret` - 23/24 `mount_check_umount`: `mov x0,#0` + `ret` - 25/26 `vnode_check_rename`: `mov x0,#0` + `ret` @@ -27,6 +29,8 @@ Patch flow under test: 1. `_find_sandbox_ops_table_via_conf()` to locate `mac_policy_conf` 2. `mpc_ops` pointer to read function entries by index 3. `HOOK_INDICES`: + - `file_check_mmap = 36` + - `mount_check_mount = 87` - `mount_check_remount = 88` - `mount_check_umount = 91` - `vnode_check_rename = 120` @@ -73,7 +77,7 @@ From direct `KernelPatcher` run on clean payload (in-memory, no file write): Using IDA DB and disassembly/decompile on the same firmware family: -- Entry sites match the three hook slots above. +- Entry sites match the five hook slots above. - For `vnode_check_rename`, downstream body includes rename-related path monitoring logic (`pathmonitor_prepare_rename`), confirming semantic alignment with rename hook behavior. - Note: current IDA database had these entry points already recognized as patched stubs; additional inspection was performed from `entry+8` into original body for semantic validation. 
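
The patch flow above reads hook entries from `mpc_ops` by index. A minimal sketch of the index-to-slot arithmetic follows, assuming 8-byte function-pointer entries (arm64) and a hypothetical `ops_table_off` value. It does not decode the slot contents, which in a real kernelcache may need pointer decoding before they are usable as addresses.

```python
# Indices copied from the HOOK_INDICES list in the validation notes.
HOOK_INDICES = {
    "file_check_mmap": 36,
    "mount_check_mount": 87,
    "mount_check_remount": 88,
    "mount_check_umount": 91,
    "vnode_check_rename": 120,
}

POINTER_SIZE = 8  # assumption: 64-bit function pointers in mac_policy_ops


def slot_offset(index: int) -> int:
    """Byte offset of entry `index` relative to the start of the ops table."""
    return index * POINTER_SIZE


def slot_offsets(ops_table_off: int) -> dict:
    """Absolute slot offsets for every hook, given a table start offset."""
    return {
        name: ops_table_off + slot_offset(index)
        for name, index in HOOK_INDICES.items()
    }
```

For example, `slot_offsets(0x1000)` (a made-up table offset) maps `file_check_mmap` to `0x1120`. If the struct layout drifts between XNU versions, only `HOOK_INDICES` has to change.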
@@ -81,8 +85,8 @@ Using IDA DB and disassembly/decompile on the same firmware family: Status: **working for now**. -For clean `fw_prepare` kernelcache, the 21-26 sandbox hook patches: +For clean `fw_prepare` kernelcache, the 17-26 sandbox hook patches: - resolve through the correct `mac_policy_ops` table, -- hit the expected three hook entry addresses, +- hit the expected five hook entry addresses, - and rewrite exactly the first two instructions to `mov x0,#0; ret`. diff --git a/research/kernel_patcher_verification.md b/research/kernel_patcher_verification.md index 45a3989..fcde947 100644 --- a/research/kernel_patcher_verification.md +++ b/research/kernel_patcher_verification.md @@ -51,8 +51,8 @@ apply the dynamic patcher to a freshly extracted vresearch101 kernelcache. Result: **byte-identical** output between hardcoded and dynamic patching. -- `KernelPatcher` patches found: 25 -- Hardcoded patches applied: 25 +- `KernelPatcher` patches found: 26 +- Hardcoded patches applied: 26 - `cmp -l /tmp/kc_vphone600_upstream.raw /tmp/kc_vphone600_dynamic.raw`: no output (files identical) @@ -77,16 +77,17 @@ Offsets and 32-bit patch values, taken from `patch_fw.py`: | 13 | 0x2475044 | 0xEB00001F | \_apfs_vfsop_mount cmp x0,x0 | | 14 | 0x2476C00 | 0x52800000 | \_apfs_mount_upgrade_checks mov w0,#0 | | 15 | 0x248C800 | 0x52800000 | \_handle_fsioc_graft mov w0,#0 | -| 16 | 0x23AC528 | 0xD2800000 | \_hook_file_check_mmap mov x0,#0 | -| 17 | 0x23AC52C | 0xD65F03C0 | \_hook_file_check_mmap ret | -| 18 | 0x23AAB58 | 0xD2800000 | \_hook_mount_check_mount mov x0,#0 | -| 19 | 0x23AAB5C | 0xD65F03C0 | \_hook_mount_check_mount ret | -| 20 | 0x23AA9A0 | 0xD2800000 | \_hook_mount_check_remount mov x0,#0 | -| 21 | 0x23AA9A4 | 0xD65F03C0 | \_hook_mount_check_remount ret | -| 22 | 0x23AA80C | 0xD2800000 | \_hook_mount_check_umount mov x0,#0 | -| 23 | 0x23AA810 | 0xD65F03C0 | \_hook_mount_check_umount ret | -| 24 | 0x23A5514 | 0xD2800000 | \_hook_vnode_check_rename mov x0,#0 | -| 25 | 
0x23A5518 | 0xD65F03C0 | \_hook_vnode_check_rename ret | +| 16 | | | \_handle_get_dev_by_role entitlement bypass | +| 17 | 0x23AC528 | 0xD2800000 | \_hook_file_check_mmap mov x0,#0 | +| 18 | 0x23AC52C | 0xD65F03C0 | \_hook_file_check_mmap ret | +| 19 | 0x23AAB58 | 0xD2800000 | \_hook_mount_check_mount mov x0,#0 | +| 20 | 0x23AAB5C | 0xD65F03C0 | \_hook_mount_check_mount ret | +| 21 | 0x23AA9A0 | 0xD2800000 | \_hook_mount_check_remount mov x0,#0 | +| 22 | 0x23AA9A4 | 0xD65F03C0 | \_hook_mount_check_remount ret | +| 23 | 0x23AA80C | 0xD2800000 | \_hook_mount_check_umount mov x0,#0 | +| 24 | 0x23AA810 | 0xD65F03C0 | \_hook_mount_check_umount ret | +| 25 | 0x23A5514 | 0xD2800000 | \_hook_vnode_check_rename mov x0,#0 | +| 26 | 0x23A5518 | 0xD65F03C0 | \_hook_vnode_check_rename ret | ## TXM Patch Details @@ -125,7 +126,7 @@ pyimg4 im4p extract \ Dynamic patcher results: -- Patches found/applied: 25 +- Patches found/applied: 26 - TXM patch location: `0xFA6B98` (NOP `tbnz w8, #0, #0xfa6c80`) - Patched output: `/tmp/kc_vresearch1_dynamic.raw` @@ -134,7 +135,7 @@ Dynamic patcher results: For vphone600, the dynamic patcher output is byte-identical to the legacy hardcoded patch list, indicating functional equivalence on this kernelcache. The same dynamic patcher also successfully patches the freshly extracted -vresearch101 kernelcache with the expected TXM NOP and a full 25-patch set. +vresearch101 kernelcache with the expected TXM NOP and a full 26-patch set. 
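
Rows 17 to 26 of the table above all follow one shape: a `mov x0,#0` word at the hook entry and a `ret` word four bytes later. The sketch below illustrates that shape using the offsets and values from the table. `MOV_X0_0` and `RET` are local stand-ins for the constants the repo imports from `kernel_asm`, and `stub_bytes` and `apply_stub` are hypothetical helpers, not the patcher's API.

```python
import struct

MOV_X0_0 = 0xD2800000  # mov x0, #0 (value from the table)
RET = 0xD65F03C0       # ret        (value from the table)

# Hook entry offsets from rows 17, 19, 21, 23, 25 of the table.
HOOK_ENTRIES = {
    "_hook_file_check_mmap": 0x23AC528,
    "_hook_mount_check_mount": 0x23AAB58,
    "_hook_mount_check_remount": 0x23AA9A0,
    "_hook_mount_check_umount": 0x23AA80C,
    "_hook_vnode_check_rename": 0x23A5514,
}


def stub_bytes() -> bytes:
    """Eight bytes that make a function return 0 immediately."""
    return struct.pack("<II", MOV_X0_0, RET)


def apply_stub(buf: bytearray, entry_off: int) -> None:
    """Overwrite the first two instructions at entry_off with the stub."""
    if entry_off % 4 != 0:
        raise ValueError("AArch64 instructions are 4-byte aligned")
    if entry_off + 8 > len(buf):
        raise ValueError("stub would run past the end of the buffer")
    buf[entry_off:entry_off + 8] = stub_bytes()


# Demo on a small scratch buffer rather than a real kernelcache.
scratch = bytearray(b"\xff" * 16)
apply_stub(scratch, 8)
```

Five hooks with two words each account for the ten table rows 17 to 26.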
## Targeted Re-Verification (2026-03-05) diff --git a/research/txm_jb_patches.md b/research/txm_jb_patches.md index ffd5fce..9d7d67e 100644 --- a/research/txm_jb_patches.md +++ b/research/txm_jb_patches.md @@ -1,6 +1,6 @@ # TXM Jailbreak Patch Analysis -Analysis of 6 logical TXM jailbreak patches (11 instruction modifications) applied by `txm_jb.py` on the RESEARCH variant +Analysis of 6 logical TXM jailbreak patches (11 instruction modifications) applied by `txm_dev.py` on the RESEARCH variant of TXM from iPhone17,3 / PCC-CloudOS 26.x. ## TXM Execution Model @@ -314,7 +314,7 @@ falls through to the version checks which return success for version <= 5. This effectively bypasses CodeSignature hash validation --- the hash data exists in the blob but the hash-present flag is suppressed, so the consistency check passes. -### `txm_jb.py` dynamic finder: `patch_selector24_hash_extraction_nop()` +### `txm_dev.py` dynamic finder: `patch_selector24_force_pass()` Scans for `mov w0, #0xa1` as a unique anchor to locate the CS hash validator function, finds PACIBSP to determine function start, then matches the pattern @@ -412,7 +412,7 @@ Universal entitlement lookup function. When `a1 != 0`, it resolves the manifest' entitlement dictionary and searches for the named key via `sub_FFFFFFF017036294`. Returns a composite status word where bit 0 indicates the entitlement was found. -### `txm_jb.py` dynamic finder: `patch_get_task_allow_force_true()` +### `txm_dev.py` dynamic finder: `patch_get_task_allow_force_true()` Searches for string refs to `"get-task-allow"`, then scans forward for the pattern `BL X / TBNZ w0, #0, Y`. Patches the BL to `MOV X0, #1`. @@ -495,7 +495,7 @@ Since the validator returns the pointer unchanged, `x20` (raw arg) and the valid pointer both refer to the same object. The shellcode's `STRB W0, [X20, #0x30]` writes to the correct location. 
-### `txm_jb.py` dynamic finder: `patch_selector42_29_shellcode()` +### `txm_dev.py` dynamic finder: `patch_selector42_29_shellcode()` 1. Finds the "debugger gate function" via string refs to `"com.apple.private.cs.debugger"` 2. Locates the dispatch stub by matching `BTI j / MOV X0, X20 / BL / MOV X1, X21 / MOV X2, X22 / BL debugger_gate / B` @@ -557,7 +557,7 @@ branches to the success path, bypassing both the entitlement check and the fallback flag check. This allows any process to create debug memory mappings regardless of whether it has `com.apple.private.cs.debugger`. -### `txm_jb.py` dynamic finder: `patch_debugger_entitlement_force_true()` +### `txm_dev.py` dynamic finder: `patch_debugger_entitlement_force_true()` Searches for string refs to `"com.apple.private.cs.debugger"`, then matches the pattern: `mov x0, #0 / mov x2, #0 / bl X / tbnz w0, #0, Y`. Patches the BL @@ -645,7 +645,7 @@ if ( (byte_FFFFFFF017070F24 & 1) == 0 ) return 27; // developer mode not enabled ``` -### `txm_jb.py` dynamic finder: `patch_developer_mode_bypass()` +### `txm_dev.py` dynamic finder: `patch_developer_mode_bypass()` Searches for string refs to `"developer mode enabled due to system policy configuration"`, then scans backwards for a `tbz/tbnz/cbz/cbnz` instruction diff --git a/scripts/patchers/kernel_patch_sandbox.py b/scripts/patchers/kernel_patch_sandbox.py index 09f51de..7328834 100644 --- a/scripts/patchers/kernel_patch_sandbox.py +++ b/scripts/patchers/kernel_patch_sandbox.py @@ -5,11 +5,11 @@ from .kernel_asm import MOV_X0_0, RET class KernelPatchSandboxMixin: def patch_sandbox_hooks(self): - """Patches 16-25: Stub Sandbox MACF hooks with mov x0,#0; ret. + """Patches 17-26: Stub Sandbox MACF hooks with mov x0,#0; ret. Uses mac_policy_ops struct indices from XNU source (xnu-11215+). """ - self._log("\n[16-25] Sandbox MACF hooks") + self._log("\n[17-26] Sandbox MACF hooks") ops_table = self._find_sandbox_ops_table_via_conf() if ops_table is None: