Finding the configuration files
I started by running fs_usage to see what kind of files were being accessed when I triggered summarization in the Notes app. A couple of paths caught my eye:
sudo fs_usage -w -f filesys | grep -i safety
19:41:01.285622 stat64 /System/Library/AssetsV2/com_apple_MobileAsset_UAF_FM_Overrides/purpose_auto/4b6103270669db6c18c7db52de8a466cc3bcd1ed.asset/AssetData 0.000003 GenerativeExperiencesSafetyInfer.11988643
19:41:01.285708 stat64 ary/AssetsV2/locks/com.apple.UnifiedAssetFramework/com.apple.MobileAsset.UAF.FM.Overrides/shared_locks/atomic_instance_3EEBCF9B-B245-4218-B810-D18DF18D77E3.locker 0.000003 GenerativeExperiencesSafetyInfer.11988643
19:41:01.285715 close F=3 0.000006 GenerativeExperiencesSafetyInfer.11988643
19:41:01.285912 close F=4 0.000042 GenerativeExperiencesSafetyInfer.12349812
19:41:03.119508 open F=9 (R_____N________) apple_MobileAsset_UAF_SummarizationKitConfiguration/purpose_auto/f46a9f714d900cc628bc5c42f704d7d6e68fcd29.asset/AssetData/ClientSafetyConfiguration.pbtxt 0.000066 generativeexperiencesd.12349901
19:41:03.119669 RdData[A] D=0x003a6a3d B=0x7000 /dev/disk3s5 ileAsset_UAF_SummarizationKitConfiguration/purpose_auto/f46a9f714d900cc628bc5c42f704d7d6e68fcd29.asset/AssetData/ClientSafetyConfiguration.pbtxt 0.000155 W generativeexperi.12349901
19:41:03.119964 stat64 A/Resources/OTAConfiguration/ClientSafetyConfiguration.pbtxt 0.000009 generativeexperiencesd.12349901
19:41:03.119969 lstat64 A/Resources/OTAConfiguration/ClientSafetyConfiguration.pbtxt 0.000004 generativeexperiencesd.12349901
19:41:03.120066 open F=9 (R_____N________) A/Resources/OTAConfiguration/ClientSafetyConfiguration.pbtxt 0.000093 generativeexperiencesd.12349901
19:41:03.120229 RdData[A] D=0x00fd1557 B=0x7000 /dev/disk3s1s1 A/Resources/OTAConfiguration/ClientSafetyConfiguration.pbtxt 0.000153 W generativeexperi.12349901
19:41:03.197048 open F=3 (R_____Nl_______) sV2/locks/com.apple.UnifiedAssetFramework/com.apple.MobileAsset.UAF.FM.Overrides/shared_locks/atomic_instance_3EEBCF9B-B245-4218-B810-D18DF18D77E3.locker 0.000097 GenerativeExperiencesSafetyInfer.11988643
19:41:03.197472 getattrlist [ 2] /System/Library/UnifiedAssetFramework/MinVersions 0.000004 GenerativeExperiencesSafetyInfer.11988643
19:41:03.197480 getattrlist [ 2] /System/Library/UnifiedAssetFramework/MinVersions 0.000001 GenerativeExperiencesSafetyInfer.11988643
19:41:03.197641 open F=4 (R_____Nl_______) sV2/locks/com.apple.UnifiedAssetFramework/com.apple.MobileAsset.UAF.FM.Overrides/shared_locks/atomic_instance_3EEBCF9B-B245-4218-B810-D18DF18D77E3.locker 0.000061 GenerativeExperiencesSafetyInfer.11988643
It seems like there are two main kinds of file access here. The first is the FM_Overrides asset, which is a deny list of terms that are blocked from being summarized. The second is a binary called generativeexperiencesd opening some pbtxt files related to safety. Searching for these files gave me the following results:
find /System/Library/AssetsV2 -name "*MobileAsset.*UAF*"
/System/Library/AssetsV2/persisted/AutoAssetLocker/AutoAssetLocker_Entry_com.apple.MobileAsset.UAF.SummarizationKitConfiguration_com.apple.summarizationkit.ota.rules_1.0.6.13.202380_0.state
/System/Library/AssetsV2/persisted/AutoAssetLocker/AutoAssetLocker_Entry_com.apple.MobileAsset.UAF.SummarizationKitConfiguration_com.apple.summarizationkit.ota.configuration_1.1.14.13.202380_0.state
/System/Library/AssetsV2/persisted/AutoAssetDescriptors/AutoAssetDescriptors_Entry_com.apple.MobileAsset.UAF.SummarizationKitConfiguration_com.apple.summarizationkit.ota.rules_1.0.6.13.202380_0.state
/System/Library/AssetsV2/persisted/AutoAssetDescriptors/AutoAssetDescriptors_Entry_com.apple.MobileAsset.UAF.SummarizationKitConfiguration_com.apple.summarizationkit.ota.configuration_1.1.14.13.202380_0.state
/System/Library/AssetsV2/com_apple_MobileAsset_UAF_SummarizationKitConfiguration
/System/Library/AssetsV2/com_apple_MobileAsset_UAF_SummarizationKitConfiguration/purpose_auto/com_apple_MobileAsset_UAF_SummarizationKitConfiguration.xml
/System/Library/AssetsV2/com_apple_MobileAsset_UAF_SummarizationKitConfiguration/purpose_auto/3edcc9828b6f280a53f69b8b35ff9d664386ecb4.asset/AssetData/SummarizationOverrideRules.pbtxt
These filenames looked really interesting to me, specifically com_apple_MobileAsset_UAF_SummarizationKitConfiguration.xml and SummarizationOverrideRules.pbtxt. After looking at the contents of the XML file, it became clear that Apple uses a two-layer structure for these files. The outer layer is AEA1 (Apple Encrypted Archive), with decryption keys defined in the XML:
<dict>
<key>ArchiveDecryptionKey</key>
<string>WcCBfaFArWm9RYdRAtxGpmhLXzIjTkWbFHLdOkucv64=</string>
<key>AssetSpecifier</key>
<string>com.apple.summarizationkit.ota.configuration</string>
<key>__BaseURL</key>
<string>https://updates.cdn-apple.com/2024/Iris/mobileassets/023-67517/C5C95674-2976-43AB-9DA4-19EB7764867A/</string>
<key>__RelativePath</key>
<string>com_apple_MobileAsset_UAF_SummarizationKitConfiguration/196573C4-3A88-4E86-B004-7A87C4D947EC.aar</string>
</dict>
We can download the archive and confirm its AEA1 magic bytes using:
curl -sL "https://updates.cdn-apple.com/2024/Iris/mobileassets/023-67517/C5C95674-2976-43AB-9DA4-19EB7764867A/com_apple_MobileAsset_UAF_SummarizationKitConfiguration/196573C4-3A88-4E86-B004-7A87C4D947EC.aar" -o config.aar
xxd config.aar | head -1
00000000: 4145 4131 ... (AEA1 magic)
And then decrypt the archive using the key from the XML:
echo "WcCBfaFArWm9RYdRAtxGpmhLXzIjTkWbFHLdOkucv64=" | base64 -d > key.bin
aea decrypt -i config.aar -o decrypted.aa -key key.bin -v
profile: hkdf_sha256_aesctr_hmac__symmetric__none
raw data size: 34134 B
Now we can list the contents of the decrypted archive, which include the pbtxt configuration files:
aa list -i decrypted.aa -v
F PAT=AssetData/ClassificationConfiguration.pbtxt
F PAT=AssetData/ClientSafetyConfiguration.pbtxt
F PAT=AssetData/ClientSwitchConfiguration.pbtxt
F PAT=Info.plist
Looking at the header of these pbtxt files, we see the magic bytes for skencv1. This is an undocumented encryption scheme that Apple uses for various things.
00000000: 736b 656e 6376 3198 c721 812d 3fe2 c025 skencv1..!.-?..%
00000010: 6ab5 a673 051f 3891 e9da 1b4b b342 39e4 j..s..8....K.B9.
00000020: 6461 fa94 03a3 3e08 4b4d c31f b1e1 0509 da....>.KM......
Decrypting the configuration files
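Before going further, it helps to be able to tell the two layers apart by their leading bytes alone: AEA1 for the outer archive and skencv1 for the pbtxt files inside it. Here is a minimal sketch of that check. The function name sniff_container is my own, and the sample is just the first 16 bytes of the hex dump shown earlier; nothing else about either format is assumed.

```python
# Magic prefixes observed in this post. Only the prefixes are used here;
# the layout of the bytes that follow them is not assumed.
MAGICS = {
    b"AEA1": "AEA1 (Apple Encrypted Archive)",
    b"skencv1": "skencv1",
}


def sniff_container(head: bytes) -> str:
    """Return a label for the container format based on its leading bytes."""
    for magic, label in MAGICS.items():
        if head.startswith(magic):
            return label
    return "unknown"


# First 16 bytes of a pbtxt file, copied from the hex dump in this post.
sample = bytes.fromhex("736b656e63763198c721812d3fe2c025")
print(sniff_container(sample))
```

In practice the input would be the first few bytes read from config.aar or from one of the extracted pbtxt files.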
I couldn’t find any documentation on the skencv1 format. I tried some naive things, like decrypting it with the same key as the AEA1 archive, but that didn’t work, so I turned to the code that reads these files. The SummarizationKit.framework binary isn’t directly accessible as a standalone file: on modern macOS it lives inside the dyld shared cache, a single monolithic file containing many system frameworks. You can run brew install blacktop/tap/ipsw to install a tool that will extract it from /System/Cryptexes/OS/System/Library/dyld/dyld_shared_cache_arm64e.
mkdir -p /tmp/dyld_cache
ipsw dyld extract \
/System/Cryptexes/OS/System/Library/dyld/dyld_shared_cache_arm64e \
SummarizationKit \
-o /tmp/dyld_cache
 • Created /tmp/dyld_cache/SummarizationKit
I figured that somewhere in the binary there would be a reference to the skencv1 string, so I grepped for it:
strings -t x SummarizationKit | grep skencv1
1acc30 skencv1
I loaded SummarizationKit into Ghidra, searched for 73 6b 65 6e 63 76 31 (ASCII for skencv1), and found a reference at 0x29dfaf2b0. After tracing the XREFs, I eventually found the function responsible for handling the skencv1 format, FUN_268cfdaf8. The pseudocode for the function is as follows:
undefined1[16] FUN_268cfdaf8(long param_1, ulong param_2)
{
    // 1. Lazy-initialize skencv1 constant (first time only)
    if (DAT_299467550 != -1) {
        _swift_once(&DAT_299467550, FUN_268cfdacc);
    }
    // 2. Load skencv1 string reference
    uVar2 = DAT_29946b5c0;
    uVar10 = DAT_29946b5b8;
    // 3. Extract first 7 bytes of input
    FUN_268d34fb8(0, 7, param_1, param_2);
    // 4. Compare to "skencv1"
    FUN_268cfeaac(uVar10, uVar2, uVar7, uVar9);
    // 5. If magic matches, proceed to decrypt
    if ((uVar10 & 1) != 0) {
        // === KEY INITIALIZATION ===
        if (DAT_299467558 != -1) {
            _swift_once(&DAT_299467558, FUN_268cfe2bc); // <-- CRITICAL
        }
        // 6. Load key material
        uVar7 = DAT_29946ac00; // <-- Key data
        uVar2 = DAT_29946abf8; // <-- Key data
        // 7. Skip the 7-byte magic header
        FUN_268d34fb8(7, uVar10, param_1, param_2);
        // ... decryption continues ...
    }
}
I tried several ways of using Frida to hook various Apple crypto functions to dump the key or the data being decrypted, but had no luck. I also tried dumping the entire memory of the process after triggering the decryption, in the hope that the key or decrypted data would be floating around somewhere, but again no luck. With some help from Codex, I created an ASLR-aware lldb debugging script that sets a breakpoint on the FUN_268cfdaf8 function. After triggering the decryption in Notes, I was able to hit the breakpoint and inspect the registers to find the key material being used for decryption.
The full script (lldb_skencv1_trace.py) is reproduced further down.
At a high level, it does the following:
- Calculates ASLR slide from SummarizationKit load address
- Sets breakpoints on Ghidra-identified functions at runtime addresses
- Captures register values (x0/x1) at entry and return
- Dumps key-global memory snapshots before/after events
- Decodes Swift-like buffer objects from x1 layouts
- Auto-dumps input/output buffers to /tmp/sktrace-dumps
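The first two steps come down to simple arithmetic: the slide is the difference between where SummarizationKit is loaded at runtime and the image base Ghidra used, and every static address is shifted by that amount. Here is a minimal sketch of that calculation. The two constants are the ones from the script, while the load address 0x287000000 is only the placeholder from the script's usage text, not a measured value.

```python
GHIDRA_IMAGE_BASE = 0x268BC0000  # image base of SummarizationKit in the Ghidra project
MAIN_HANDLER = 0x268CFDAF8       # FUN_268cfdaf8 as it appears in Ghidra


def runtime_address(static_addr: int, module_base: int) -> int:
    """Translate a Ghidra address into a runtime address for a given load base."""
    slide = module_base - GHIDRA_IMAGE_BASE
    return static_addr + slide


# Placeholder load address (taken from the script's usage example).
example_base = 0x287000000
print(hex(runtime_address(MAIN_HANDLER, example_base)))  # 0x28713daf8
```

Because the slide changes every time the process is relaunched, the breakpoints have to be recomputed after each launchctl kill.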
The commands to run the script are:
launchctl kill SIGKILL gui/$(id -u)/com.apple.generativeexperiencesd 2>/dev/null || true
xcrun lldb -w -n generativeexperiencesd
Then while in lldb:
command script import lldb_skencv1_trace.py
sktrace_init
c
Trigger text summarization from the Notes app, then watch the lldb console for the decryption to happen.
#!/usr/bin/env python3
"""
LLDB helper to trace skencv1 anchor functions in SummarizationKit.

Usage:
  (lldb) command script import /path/to/text-to-summary/lldb_skencv1_trace.py
  (lldb) sktrace_init                 # auto-detects SummarizationKit load base
  (lldb) # or: sktrace_init 0x287000000
  (lldb) c

Optional:
  (lldb) sktrace_show_globals
  (lldb) sktrace_set_dump_dir /tmp/sktrace-dumps
  (lldb) sktrace_set_max_dumps 128
"""
import time
import struct
import os

import lldb

# Addresses below are static addresses as shown in Ghidra; they are
# translated to runtime addresses by adding the ASLR slide.
GHIDRA_IMAGE_BASE = 0x268BC0000

ANCHOR_FUNCS = {
    "main_handler": 0x268CFDAF8,  # FUN_268cfdaf8
    "key_init": 0x268CFE2BC,  # FUN_268cfe2bc
    "key_source": 0x268CFEF18,  # FUN_268cfef18
}

KEY_GLOBALS = {
    "key_g0": 0x29946ABF8,
    "key_g1": 0x29946AC00,
    "key_g2": 0x29946AC08,
    "key_g3": 0x29946AC10,
}

MAX_HEX_DUMP = 96
MAX_PTR_DUMP = 64

_STATE = {
    "slide": None,
    "module_base": None,
    "installed": False,
    "hit_counts": {},
    "last_main_ret_sig": None,
    "last_key_init_ret_sig": None,
    "last_key_source_ret_sig": None,
    "last_globals_hex": {},
    "main_thread_seen_at": {},
    "context_window_s": 0.4,
    "main_entry_count": 0,
    "main_return_count": 0,
    "input_dump_count": 0,
    "output_dump_count": 0,
    "dump_dir": "/tmp/sktrace-dumps",
    "max_auto_dumps": 32,
    "key_init_log_limit": 4,
    "key_source_log_limit": 4,
    "key_init_entry_seen": 0,
    "key_source_entry_seen": 0,
}


def _log(msg):
    print("[sktrace] {}".format(msg))


def _read_mem(process, addr, size):
    if not addr:
        return None
    err = lldb.SBError()
    data = process.ReadMemory(addr, size, err)
    if not err.Success():
        return None
    return data


def _untag_ptr(ptr):
    # arm64e user pointers in this target frequently carry top-byte tags (for example 0x40...)
    # TBI means masking to 56 bits is usually the right canonical form for LLDB memory reads.
    return ptr & 0x00FFFFFFFFFFFFFF


def _u64_le(buf, off):
    if buf is None or off + 8 > len(buf):
        return 0
    return struct.unpack_from("<Q", buf, off)[0]


def _hexdump(data, width=16):
    if not data:
        return "<empty>"
    if isinstance(data, str):
        data = data.encode("latin1", errors="ignore")
    lines = []
    for i in range(0, len(data), width):
        chunk = data[i : i + width]
        hx = " ".join("{:02x}".format(b) for b in chunk)
        lines.append(hx)
    return "\n".join(lines)


def _reg_u64(frame, reg_name):
    reg = frame.FindRegister(reg_name)
    if not reg.IsValid():
        return 0
    return reg.GetValueAsUnsigned()


def _dump_ptr(process, label, ptr, size=MAX_PTR_DUMP):
    if ptr == 0:
        _log("{}: 0x0".format(label))
        return
    canonical = _untag_ptr(ptr)
    blob = _read_mem(process, canonical, size)
    if blob is None:
        _log("{}: 0x{:x} (canonical 0x{:x}, unreadable)".format(label, ptr, canonical))
        return
    blob = blob[:MAX_HEX_DUMP]
    _log("{}: 0x{:x} (canonical 0x{:x})\n{}".format(label, ptr, canonical, _hexdump(blob)))


def _ascii_preview(blob):
    if not blob:
        return ""
    out = []
    for b in blob:
        if 32 <= b <= 126:
            out.append(chr(b))
        else:
            out.append(".")
    return "".join(out)


def _extract_swift_like_buffer(process, obj_ptr):
    # Heuristic: treat obj_ptr as an object whose header holds
    # (pointer, length, capacity) at offsets 0x10, 0x18 and 0x20.
    if obj_ptr == 0:
        return None
    obj = _untag_ptr(obj_ptr)
    header = _read_mem(process, obj, 0x30)
    if header is None or len(header) < 0x28:
        return None
    cand_ptr = _u64_le(header, 0x10)
    cand_len = _u64_le(header, 0x18)
    cand_cap = _u64_le(header, 0x20)
    if cand_ptr == 0 or cand_len == 0:
        return None
    if cand_len > 8 * 1024 * 1024:
        return None
    buf_ptr = _untag_ptr(cand_ptr)
    read_len = min(cand_len, 256 * 1024)
    data = _read_mem(process, buf_ptr, read_len)
    if data is None:
        return None
    preview = data[:64]
    return {
        "obj_ptr": obj_ptr,
        "ptr": buf_ptr,
        "len": cand_len,
        "cap": cand_cap,
        "preview": preview,
        "data": data,
    }


def _maybe_dump_buffer(kind, info):
    if info is None:
        return None
    if kind == "input":
        _STATE["input_dump_count"] += 1
        idx = _STATE["input_dump_count"]
    else:
        _STATE["output_dump_count"] += 1
        idx = _STATE["output_dump_count"]
    if idx > _STATE["max_auto_dumps"]:
        return None
    os.makedirs(_STATE["dump_dir"], exist_ok=True)
    path = os.path.join(
        _STATE["dump_dir"],
        "{}_{:03d}_len{}.bin".format(kind, idx, info["len"]),
    )
    with open(path, "wb") as f:
        f.write(info["data"])
    return path


def _dump_swift_like_buffer(process, label, obj_ptr):
    info = _extract_swift_like_buffer(process, obj_ptr)
    if info is None:
        return None
    head8 = info["preview"][:8]
    ascii_preview = _ascii_preview(info["preview"][:32])
    _log(
        "{} decoded-buffer obj=0x{:x} ptr=0x{:x} len={} cap={} head8={} ascii='{}'".format(
            label,
            info["obj_ptr"],
            info["ptr"],
            info["len"],
            info["cap"],
            head8.hex(),
            ascii_preview,
        )
    )
    _log("{} decoded-buffer hex:\n{}".format(label, _hexdump(info["preview"])))
    return info


def _inc_hit(name):
    current = _STATE["hit_counts"].get(name, 0) + 1
    _STATE["hit_counts"][name] = current
    return current


def _remember_main_thread(thread_id):
    _STATE["main_thread_seen_at"][thread_id] = time.time()


def _is_recent_main_thread(thread_id):
    t = _STATE["main_thread_seen_at"].get(thread_id)
    if t is None:
        return False
    return (time.time() - t) <= _STATE["context_window_s"]


def _runtime_addr(static_addr):
    slide = _STATE.get("slide")
    if slide is None:
        return None
    return static_addr + slide


def _snapshot_key_globals(process):
    slide = _STATE.get("slide")
    if slide is None:
        _log("slide not initialized; cannot snapshot key globals")
        return
    for label, static_addr in KEY_GLOBALS.items():
        runtime_addr = static_addr + slide
        blob = _read_mem(process, runtime_addr, 32)
        if blob is None:
            _log("{} @ 0x{:x}: unreadable".format(label, runtime_addr))
            continue
        hx = blob.hex()
        _log("{} @ 0x{:x}: {}".format(label, runtime_addr, hx))
        if _STATE["last_globals_hex"].get(label) == hx:
            continue
        _STATE["last_globals_hex"][label] = hx
        _dump_key_global_candidates(process, label, blob)


def _dump_key_global_candidates(process, label, blob):
    if blob is None or len(blob) < 32:
        return
    q = [struct.unpack_from("<Q", blob, i * 8)[0] for i in range(4)]
    _log(
        "{} qwords: [{}]".format(
            label, ", ".join("0x{:x}".format(x) for x in q)
        )
    )
    seen = set()
    for idx, raw in enumerate(q):
        if raw == 0:
            continue
        ptr = _untag_ptr(raw)
        if ptr in seen:
            continue
        seen.add(ptr)
        if ptr < 0x1000:
            continue
        chunk = _read_mem(process, ptr, 0x40)
        if chunk is None:
            _log("{} q{} ptr=0x{:x} unreadable".format(label, idx, ptr))
            continue
        _log("{} q{} ptr=0x{:x} hdr:\n{}".format(label, idx, ptr, _hexdump(chunk[:0x40])))
        _dump_swift_like_buffer(process, "{} q{}".format(label, idx), raw)


def _set_temp_return_bp(target, return_addr, callback_name):
    # One-shot breakpoint on the caller's return address (taken from x30 at entry).
    bp = target.BreakpointCreateByAddress(return_addr)
    bp.SetOneShot(True)
    bp.SetAutoContinue(True)
    bp.SetScriptCallbackFunction("{}.{}".format(__name__, callback_name))
    return bp


def _common_entry_trace(frame, label):
    target = frame.GetThread().GetProcess().GetTarget()
    pc = frame.GetPCAddress().GetLoadAddress(target)
    tid = frame.GetThread().GetThreadID()
    c = _inc_hit(label)
    x0 = _reg_u64(frame, "x0")
    x1 = _reg_u64(frame, "x1")
    x2 = _reg_u64(frame, "x2")
    x3 = _reg_u64(frame, "x3")
    _log(
        "{} entry #{} t={} pc=0x{:x} x0=0x{:x} x1=0x{:x} x2=0x{:x} x3=0x{:x}".format(
            label, c, tid, pc, x0, x1, x2, x3
        )
    )
    if label == "FUN_268cfdaf8":
        _remember_main_thread(tid)
        _STATE["main_entry_count"] += 1
        if x0 >> 32:
            _log("{} x0_hi32(candidate_len)={}".format(label, x0 >> 32))
    process = frame.GetThread().GetProcess()
    _dump_ptr(process, "{} x0".format(label), x0)
    _dump_ptr(process, "{} x1".format(label), x1)
    info = _dump_swift_like_buffer(process, "{} x1".format(label), x1)
    if label == "FUN_268cfdaf8" and info is not None:
        if info["preview"][:7] == b"skencv1":
            p = _maybe_dump_buffer("input", info)
            if p:
                _log("input buffer dumped: {}".format(p))


def main_handler_ret_cb(frame, bp_loc, _dict):
    process = frame.GetThread().GetProcess()
    target = process.GetTarget()
    pc = frame.GetPCAddress().GetLoadAddress(target)
    x0 = _reg_u64(frame, "x0")
    x1 = _reg_u64(frame, "x1")
    sig = (pc, x0, x1)
    if _STATE.get("last_main_ret_sig") == sig:
        return False
    _STATE["last_main_ret_sig"] = sig
    _STATE["main_return_count"] += 1
    _log("main_handler return pc=0x{:x} x0=0x{:x} x1=0x{:x}".format(pc, x0, x1))
    if x0 >> 32:
        _log("main_handler return x0_hi32(candidate_len)={}".format(x0 >> 32))
    _dump_ptr(process, "main_handler ret x0", x0)
    _dump_ptr(process, "main_handler ret x1", x1)
    out_info = _dump_swift_like_buffer(process, "main_handler ret x1", x1)
    if out_info is not None:
        p = _maybe_dump_buffer("output", out_info)
        if p:
            _log("output buffer dumped: {}".format(p))
    _snapshot_key_globals(process)
    return False


def key_init_ret_cb(frame, bp_loc, _dict):
    process = frame.GetThread().GetProcess()
    target = process.GetTarget()
    pc = frame.GetPCAddress().GetLoadAddress(target)
    x0 = _reg_u64(frame, "x0")
    x1 = _reg_u64(frame, "x1")
    sig = (pc, x0, x1)
    if _STATE.get("last_key_init_ret_sig") == sig:
        return False
    _STATE["last_key_init_ret_sig"] = sig
    _log("key_init return pc=0x{:x} x0=0x{:x} x1=0x{:x}".format(pc, x0, x1))
    _dump_ptr(process, "key_init ret x0", x0)
    _dump_ptr(process, "key_init ret x1", x1)
    _dump_swift_like_buffer(process, "key_init ret x1", x1)
    _snapshot_key_globals(process)
    return False


def key_source_ret_cb(frame, bp_loc, _dict):
    process = frame.GetThread().GetProcess()
    target = process.GetTarget()
    pc = frame.GetPCAddress().GetLoadAddress(target)
    x0 = _reg_u64(frame, "x0")
    x1 = _reg_u64(frame, "x1")
    sig = (pc, x0, x1)
    if _STATE.get("last_key_source_ret_sig") == sig:
        return False
    _STATE["last_key_source_ret_sig"] = sig
    _log("key_source return pc=0x{:x} x0=0x{:x} x1=0x{:x}".format(pc, x0, x1))
    _dump_ptr(process, "key_source ret x0", x0)
    _dump_ptr(process, "key_source ret x1", x1)
    _dump_swift_like_buffer(process, "key_source ret x1", x1)
    _snapshot_key_globals(process)
    return False


def main_handler_entry_cb(frame, bp_loc, _dict):
    _common_entry_trace(frame, "FUN_268cfdaf8")
    target = frame.GetThread().GetProcess().GetTarget()
    lr = _reg_u64(frame, "x30")
    if lr:
        _set_temp_return_bp(target, lr, "main_handler_ret_cb")
    return False


def key_init_entry_cb(frame, bp_loc, _dict):
    # Only trace key_init shortly after the main handler ran on the same thread.
    tid = frame.GetThread().GetThreadID()
    if not _is_recent_main_thread(tid):
        return False
    _STATE["key_init_entry_seen"] += 1
    if _STATE["key_init_entry_seen"] > _STATE["key_init_log_limit"]:
        return False
    _common_entry_trace(frame, "FUN_268cfe2bc")
    target = frame.GetThread().GetProcess().GetTarget()
    lr = _reg_u64(frame, "x30")
    if lr:
        _set_temp_return_bp(target, lr, "key_init_ret_cb")
    return False


def key_source_entry_cb(frame, bp_loc, _dict):
    tid = frame.GetThread().GetThreadID()
    if not _is_recent_main_thread(tid):
        return False
    _STATE["key_source_entry_seen"] += 1
    if _STATE["key_source_entry_seen"] > _STATE["key_source_log_limit"]:
        return False
    _common_entry_trace(frame, "FUN_268cfef18")
    target = frame.GetThread().GetProcess().GetTarget()
    lr = _reg_u64(frame, "x30")
    if lr:
        _set_temp_return_bp(target, lr, "key_source_ret_cb")
    return False


def _detect_module_base(target, module_name_substr="SummarizationKit"):
    n = target.GetNumModules()
    for i in range(n):
        module = target.GetModuleAtIndex(i)
        if not module.IsValid():
            continue
        filename = module.GetFileSpec().GetFilename()
        if not filename:
            continue
        if module_name_substr in filename:
            addr = module.GetObjectFileHeaderAddress()
            if addr.IsValid():
                return addr.GetLoadAddress(target)
    return None


def _install_bp(target, runtime_addr, cb_name, label):
    bp = target.BreakpointCreateByAddress(runtime_addr)
    bp.SetAutoContinue(True)
    bp.SetScriptCallbackFunction("{}.{}".format(__name__, cb_name))
    _log("{} breakpoint #{} @ 0x{:x}".format(label, bp.GetID(), runtime_addr))
    return bp


def sktrace_init(debugger, command, exe_ctx, result, _dict):
    target = debugger.GetSelectedTarget()
    if not target.IsValid():
        result.PutCString("No valid target.")
        return
    arg = command.strip()
    module_base = None
    if arg:
        try:
            module_base = int(arg, 16)
        except ValueError:
            result.PutCString("Invalid address: {}".format(arg))
            return
    else:
        module_base = _detect_module_base(target)
    if module_base is None:
        result.PutCString(
            "Unable to auto-detect SummarizationKit base. Pass one manually: sktrace_init 0x..."
        )
        return
    slide = module_base - GHIDRA_IMAGE_BASE
    _STATE["slide"] = slide
    _STATE["module_base"] = module_base
    _STATE["hit_counts"] = {}
    _STATE["last_main_ret_sig"] = None
    _STATE["last_key_init_ret_sig"] = None
    _STATE["last_key_source_ret_sig"] = None
    _STATE["last_globals_hex"] = {}
    _STATE["main_thread_seen_at"] = {}
    _STATE["main_entry_count"] = 0
    _STATE["main_return_count"] = 0
    _STATE["input_dump_count"] = 0
    _STATE["output_dump_count"] = 0
    _STATE["key_init_entry_seen"] = 0
    _STATE["key_source_entry_seen"] = 0
    _log(
        "module_base=0x{:x} ghidra_base=0x{:x} slide=0x{:x}".format(
            module_base, GHIDRA_IMAGE_BASE, slide
        )
    )
    _install_bp(
        target,
        _runtime_addr(ANCHOR_FUNCS["main_handler"]),
        "main_handler_entry_cb",
        "FUN_268cfdaf8",
    )
    _install_bp(
        target,
        _runtime_addr(ANCHOR_FUNCS["key_init"]),
        "key_init_entry_cb",
        "FUN_268cfe2bc",
    )
    _install_bp(
        target,
        _runtime_addr(ANCHOR_FUNCS["key_source"]),
        "key_source_entry_cb",
        "FUN_268cfef18",
    )
    process = target.GetProcess()
    if process and process.IsValid():
        _snapshot_key_globals(process)
    _STATE["installed"] = True
    result.PutCString(
        "sktrace initialized at {}. Dumps dir: {}. Continue with 'c' and trigger config load.".format(
            time.strftime("%Y-%m-%d %H:%M:%S"), _STATE["dump_dir"]
        )
    )


def sktrace_show_globals(debugger, command, exe_ctx, result, _dict):
    target = debugger.GetSelectedTarget()
    if not target.IsValid():
        result.PutCString("No valid target.")
        return
    process = target.GetProcess()
    if not process.IsValid():
        result.PutCString("No running process.")
        return
    if _STATE.get("slide") is None:
        result.PutCString("Run sktrace_init first.")
        return
    _snapshot_key_globals(process)
    result.PutCString("Done.")


def sktrace_set_dump_dir(debugger, command, exe_ctx, result, _dict):
    dump_dir = command.strip()
    if not dump_dir:
        result.PutCString("Usage: sktrace_set_dump_dir /absolute/or/relative/path")
        return
    _STATE["dump_dir"] = os.path.abspath(dump_dir)
    os.makedirs(_STATE["dump_dir"], exist_ok=True)
    result.PutCString("dump_dir set to: {}".format(_STATE["dump_dir"]))


def sktrace_set_max_dumps(debugger, command, exe_ctx, result, _dict):
    raw = command.strip()
    if not raw:
        result.PutCString("Usage: sktrace_set_max_dumps <positive_int>")
        return
    try:
        val = int(raw)
    except ValueError:
        result.PutCString("Invalid integer: {}".format(raw))
        return
    if val <= 0:
        result.PutCString("max dumps must be > 0")
        return
    _STATE["max_auto_dumps"] = val
    result.PutCString("max_auto_dumps set to: {}".format(val))


def __lldb_init_module(debugger, _dict):
    debugger.HandleCommand(
        "command script add -f {}.sktrace_init sktrace_init".format(__name__)
    )
    debugger.HandleCommand(
        "command script add -f {}.sktrace_show_globals sktrace_show_globals".format(__name__)
    )
    debugger.HandleCommand(
        "command script add -f {}.sktrace_set_dump_dir sktrace_set_dump_dir".format(__name__)
    )
    debugger.HandleCommand(
        "command script add -f {}.sktrace_set_max_dumps sktrace_set_max_dumps".format(__name__)
    )
    _log(
        "Loaded. Commands: sktrace_init [module_base_hex], sktrace_show_globals, "
        "sktrace_set_dump_dir <path>, sktrace_set_max_dumps <n>"
    )
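Once the script has written files into /tmp/sktrace-dumps, a quick offline pass helps decide which dumps are worth opening. The sketch below is my own addition, not part of the script: it flags buffers that still start with the skencv1 magic and otherwise guesses whether the bytes look like text. The 0.95 threshold is an arbitrary assumption, and the helper names are hypothetical.

```python
MAGIC = b"skencv1"


def printable_ratio(data: bytes) -> float:
    """Fraction of bytes that are printable ASCII, tab, newline or carriage return."""
    if not data:
        return 0.0
    printable = sum(1 for b in data if 32 <= b <= 126 or b in (9, 10, 13))
    return printable / len(data)


def triage(data: bytes, threshold: float = 0.95) -> str:
    """Rough classification of a dumped buffer; the threshold is a guess."""
    if data.startswith(MAGIC):
        return "still encrypted (skencv1 header)"
    if printable_ratio(data) >= threshold:
        return "probably text"
    return "binary or unknown"


# The first 16 bytes of the skencv1 hex dump from earlier in the post.
header = bytes.fromhex("736b656e63763198c721812d3fe2c025")
print(triage(header))
```

To use it on real dumps, read each file's bytes (for example with pathlib.Path(path).read_bytes()) and pass them to triage.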
FM Overrides
I found some great research on the FM Override files, done by github.com/BlueFalconHD/apple_generative_model_safety_decrypted.
Their process for finding the decryption key was roughly:
- Use DTrace to identify which process reads .enc files
- Find that GenerativeExperiencesSafetyInferenceProvider calls ModelCatalog.Obfuscation.readObfuscatedContents
- Set an LLDB breakpoint on CryptoKit.AES.GCM.open(_:using:) at offset +36
- Read the SymmetricKey from a register using Xcode’s Swift LLDB
Following their approach, I was able to get this output from lldb:
🔑 dae8ad6ae7cee414a60525b107abbb3ec6d3f34d398d8c38317f67a3ddfc9989
Using this key, we can run their decrypt_overrides.py script to decrypt all of the FM Override files. What we discover is:
| Rule Type | Count | Description |
|---|---|---|
| reject | 56 | Exact phrases that block the entire request |
| remove | 2 | Phrases silently removed from text |
| replace | 4 | Pattern → replacement mappings |
| regexReject | 1,219 | Regex patterns that block the request |
| regexReplace | 880 | Regex patterns with replacements |
| Total | 2,161 | All safety rules |
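To make the five rule types concrete, here is a minimal sketch of how rules like these might be applied to a piece of text. Everything in it is an assumption for illustration: the rules are made up rather than taken from the decrypted files, the dictionary layout is not the real file schema, and the evaluation order, substring matching and case sensitivity are guesses.

```python
import re

# Hypothetical rules for illustration only; none come from the decrypted files.
RULES = {
    "reject": ["forbidden phrase"],
    "remove": ["filler words"],
    "replace": {"colour": "color"},
    "regexReject": [r"\bblocked\d+\b"],
    "regexReplace": {r"\s{2,}": " "},
}


def apply_rules(text, rules):
    """Return None if the request is blocked, otherwise the rewritten text."""
    # Rejections first: any match blocks the whole request.
    if any(phrase in text for phrase in rules["reject"]):
        return None
    if any(re.search(pattern, text) for pattern in rules["regexReject"]):
        return None
    # Then the rewriting rules, literal ones before regex ones.
    for phrase in rules["remove"]:
        text = text.replace(phrase, "")
    for old, new in rules["replace"].items():
        text = text.replace(old, new)
    for pattern, repl in rules["regexReplace"].items():
        text = re.sub(pattern, repl, text)
    return text


print(apply_rules("my  colour", RULES))
```

Checking the rejections before any rewriting is a design choice in this sketch; the real pipeline may order things differently.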
On the off chance that it would work, I tried using the same decryption key found via lldb to decrypt the pbtxt files, but it didn’t work.
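A key-size mismatch, at least, is not the explanation: both key strings quoted in this post decode to 32 bytes. Here is a quick sketch of that check, using only those two strings. It says nothing about the cipher, mode, or key derivation that skencv1 actually uses.

```python
import base64

# Both values are copied verbatim from earlier in this post.
AEA_KEY_B64 = "WcCBfaFArWm9RYdRAtxGpmhLXzIjTkWbFHLdOkucv64="
FM_OVERRIDES_KEY_HEX = "dae8ad6ae7cee414a60525b107abbb3ec6d3f34d398d8c38317f67a3ddfc9989"


def key_lengths():
    """Return the decoded length in bytes of the AEA key and the FM Overrides key."""
    aea_key = base64.b64decode(AEA_KEY_B64)
    fm_key = bytes.fromhex(FM_OVERRIDES_KEY_HEX)
    return len(aea_key), len(fm_key)


print(key_lengths())  # (32, 32)
```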