Finding the configuration files

I started by running fs_usage to see what kind of files were being accessed when I triggered summarization in the Notes app. A couple of paths caught my eye:

sudo fs_usage -w -f filesys | grep -i safety
19:41:01.285622  stat64                                 /System/Library/AssetsV2/com_apple_MobileAsset_UAF_FM_Overrides/purpose_auto/4b6103270669db6c18c7db52de8a466cc3bcd1ed.asset/AssetData                                 0.000003   GenerativeExperiencesSafetyInfer.11988643
19:41:01.285708  stat64                                 ary/AssetsV2/locks/com.apple.UnifiedAssetFramework/com.apple.MobileAsset.UAF.FM.Overrides/shared_locks/atomic_instance_3EEBCF9B-B245-4218-B810-D18DF18D77E3.locker    0.000003   GenerativeExperiencesSafetyInfer.11988643
19:41:01.285715  close             F=3                                                                                                                                                                                        0.000006   GenerativeExperiencesSafetyInfer.11988643
19:41:01.285912  close             F=4                                                                                                                                                                                        0.000042   GenerativeExperiencesSafetyInfer.12349812
19:41:03.119508  open              F=9        (R_____N________)  apple_MobileAsset_UAF_SummarizationKitConfiguration/purpose_auto/f46a9f714d900cc628bc5c42f704d7d6e68fcd29.asset/AssetData/ClientSafetyConfiguration.pbtxt    0.000066   generativeexperiencesd.12349901
19:41:03.119669    RdData[A]       D=0x003a6a3d  B=0x7000   /dev/disk3s5  ileAsset_UAF_SummarizationKitConfiguration/purpose_auto/f46a9f714d900cc628bc5c42f704d7d6e68fcd29.asset/AssetData/ClientSafetyConfiguration.pbtxt    0.000155 W generativeexperi.12349901
19:41:03.119964  stat64                                 A/Resources/OTAConfiguration/ClientSafetyConfiguration.pbtxt                                                                                                          0.000009   generativeexperiencesd.12349901
19:41:03.119969  lstat64                                A/Resources/OTAConfiguration/ClientSafetyConfiguration.pbtxt                                                                                                          0.000004   generativeexperiencesd.12349901
19:41:03.120066  open              F=9        (R_____N________)  A/Resources/OTAConfiguration/ClientSafetyConfiguration.pbtxt                                                                                                 0.000093   generativeexperiencesd.12349901
19:41:03.120229    RdData[A]       D=0x00fd1557  B=0x7000   /dev/disk3s1s1  A/Resources/OTAConfiguration/ClientSafetyConfiguration.pbtxt                                                                                      0.000153 W generativeexperi.12349901
19:41:03.197048  open              F=3        (R_____Nl_______)  sV2/locks/com.apple.UnifiedAssetFramework/com.apple.MobileAsset.UAF.FM.Overrides/shared_locks/atomic_instance_3EEBCF9B-B245-4218-B810-D18DF18D77E3.locker    0.000097   GenerativeExperiencesSafetyInfer.11988643
19:41:03.197472  getattrlist            [  2]           /System/Library/UnifiedAssetFramework/MinVersions                                                                                                                     0.000004   GenerativeExperiencesSafetyInfer.11988643
19:41:03.197480  getattrlist            [  2]           /System/Library/UnifiedAssetFramework/MinVersions                                                                                                                     0.000001   GenerativeExperiencesSafetyInfer.11988643
19:41:03.197641  open              F=4        (R_____Nl_______)  sV2/locks/com.apple.UnifiedAssetFramework/com.apple.MobileAsset.UAF.FM.Overrides/shared_locks/atomic_instance_3EEBCF9B-B245-4218-B810-D18DF18D77E3.locker    0.000061   GenerativeExperiencesSafetyInfer.11988643
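To make a trace like this easier to skim, here is a minimal sketch that groups the path-like tokens by the process that touched them. The line layout it assumes (timestamp, call name, free-form details, elapsed time, optional W flag, then process name with a numeric suffix) is inferred only from the sample output above. `LINE_RE` and `paths_by_process` are names made up for this illustration.

```python
import re
from collections import defaultdict

# Assumed shape of a wide-format fs_usage line, inferred from the sample
# output only: timestamp, call, details, elapsed, optional "W", proc.number
LINE_RE = re.compile(
    r"^(?P<ts>\d{2}:\d{2}:\d{2}\.\d+)\s+"
    r"(?P<call>\S+)\s+"
    r"(?P<rest>.*?)\s+"
    r"(?P<elapsed>\d+\.\d+)\s+"
    r"(?:W\s+)?"
    r"(?P<proc>\S+)\.(?P<num>\d+)\s*$"
)


def paths_by_process(lines):
    """Group path-like tokens by the process name at the end of each line."""
    seen = defaultdict(set)
    for line in lines:
        m = LINE_RE.match(line.strip())
        if m is None:
            continue
        # Treat the last token containing "/" as the path. Paths that
        # contain spaces would be split incorrectly by this simplification.
        candidates = [tok for tok in m.group("rest").split() if "/" in tok]
        if candidates:
            seen[m.group("proc")].add(candidates[-1])
    return dict(seen)


sample = [
    "19:41:01.285715  close             F=3    0.000006   GenerativeExperiencesSafetyInfer.11988643",
    "19:41:03.120066  open              F=9        (R_____N________)  A/Resources/OTAConfiguration/ClientSafetyConfiguration.pbtxt    0.000093   generativeexperiencesd.12349901",
]
for proc, paths in paths_by_process(sample).items():
    print(proc, sorted(paths))
```

Note that fs_usage truncates both long paths and long process names, so the grouped names are only as precise as the trace itself.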

Two things stand out here. The first is FM_Overrides, which is a deny list of terms that are blocked from being summarized. The second is a binary called generativeexperiencesd, which opens some pbtxt files related to safety. Searching for these files gave me the following results:

find /System/Library/AssetsV2 -name "*MobileAsset.*UAF*"
/System/Library/AssetsV2/persisted/AutoAssetLocker/AutoAssetLocker_Entry_com.apple.MobileAsset.UAF.SummarizationKitConfiguration_com.apple.summarizationkit.ota.rules_1.0.6.13.202380_0.state
/System/Library/AssetsV2/persisted/AutoAssetLocker/AutoAssetLocker_Entry_com.apple.MobileAsset.UAF.SummarizationKitConfiguration_com.apple.summarizationkit.ota.configuration_1.1.14.13.202380_0.state
/System/Library/AssetsV2/persisted/AutoAssetDescriptors/AutoAssetDescriptors_Entry_com.apple.MobileAsset.UAF.SummarizationKitConfiguration_com.apple.summarizationkit.ota.rules_1.0.6.13.202380_0.state
/System/Library/AssetsV2/persisted/AutoAssetDescriptors/AutoAssetDescriptors_Entry_com.apple.MobileAsset.UAF.SummarizationKitConfiguration_com.apple.summarizationkit.ota.configuration_1.1.14.13.202380_0.state
/System/Library/AssetsV2/com_apple_MobileAsset_UAF_SummarizationKitConfiguration
/System/Library/AssetsV2/com_apple_MobileAsset_UAF_SummarizationKitConfiguration/purpose_auto/com_apple_MobileAsset_UAF_SummarizationKitConfiguration.xml
/System/Library/AssetsV2/com_apple_MobileAsset_UAF_SummarizationKitConfiguration/purpose_auto/3edcc9828b6f280a53f69b8b35ff9d664386ecb4.asset/AssetData/SummarizationOverrideRules.pbtxt

These filenames looked really interesting to me, specifically com_apple_MobileAsset_UAF_SummarizationKitConfiguration.xml and SummarizationOverrideRules.pbtxt. After looking at the contents of the XML file, it became clear that Apple uses a two-layer file structure for these files. The outer layer is AEA1 (Apple Encrypted Archive) with decryption keys defined in the XML:

<dict>
    <key>ArchiveDecryptionKey</key>
    <string>WcCBfaFArWm9RYdRAtxGpmhLXzIjTkWbFHLdOkucv64=</string>
    <key>AssetSpecifier</key>
    <string>com.apple.summarizationkit.ota.configuration</string>
    <key>__BaseURL</key>
    <string>https://updates.cdn-apple.com/2024/Iris/mobileassets/023-67517/C5C95674-2976-43AB-9DA4-19EB7764867A/</string>
    <key>__RelativePath</key>
    <string>com_apple_MobileAsset_UAF_SummarizationKitConfiguration/196573C4-3A88-4E86-B004-7A87C4D947EC.aar</string>
</dict>
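As a rough sketch of how these fields fit together, the snippet below parses an entry like the one above with Python's plistlib, decodes the base64 key, and joins __BaseURL with __RelativePath. The surrounding plist wrapper is my own assumption so that the fragment parses on its own (the real file has more structure around this dict), and `asset_download_info` is a hypothetical helper name.

```python
import base64
import plistlib

# Fields copied from the XML above. The <?xml?> and <plist> wrapper is an
# assumption added so the fragment is a complete, parseable plist.
PLIST_XML = b"""<?xml version="1.0" encoding="UTF-8"?>
<plist version="1.0">
<dict>
    <key>ArchiveDecryptionKey</key>
    <string>WcCBfaFArWm9RYdRAtxGpmhLXzIjTkWbFHLdOkucv64=</string>
    <key>AssetSpecifier</key>
    <string>com.apple.summarizationkit.ota.configuration</string>
    <key>__BaseURL</key>
    <string>https://updates.cdn-apple.com/2024/Iris/mobileassets/023-67517/C5C95674-2976-43AB-9DA4-19EB7764867A/</string>
    <key>__RelativePath</key>
    <string>com_apple_MobileAsset_UAF_SummarizationKitConfiguration/196573C4-3A88-4E86-B004-7A87C4D947EC.aar</string>
</dict>
</plist>
"""


def asset_download_info(entry):
    """Return (download_url, raw_key_bytes) for one asset entry."""
    key = base64.b64decode(entry["ArchiveDecryptionKey"])
    url = entry["__BaseURL"] + entry["__RelativePath"]
    return url, key


entry = plistlib.loads(PLIST_XML)
url, key = asset_download_info(entry)
print(url)
print(len(key), "byte key")
```

The decoded key is 32 bytes long, which is the size you would expect for a 256-bit symmetric key.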

We can download and confirm the AEA1 magic bytes of the archive using:

curl -sL "https://updates.cdn-apple.com/2024/Iris/mobileassets/023-67517/C5C95674-2976-43AB-9DA4-19EB7764867A/com_apple_MobileAsset_UAF_SummarizationKitConfiguration/196573C4-3A88-4E86-B004-7A87C4D947EC.aar" -o config.aar

xxd config.aar | head -1
00000000: 4145 4131 ...  (AEA1 magic)

And then decrypt the archive using the key from the XML:

echo "WcCBfaFArWm9RYdRAtxGpmhLXzIjTkWbFHLdOkucv64=" | base64 -d > key.bin
aea decrypt -i config.aar -o decrypted.aa -key key.bin -v
profile: hkdf_sha256_aesctr_hmac__symmetric__none
raw data size: 34134 B

Now we can list the contents of the decrypted archive, which shows the pbtxt configuration files:

aa list -i decrypted.aa -v
F PAT=AssetData/ClassificationConfiguration.pbtxt
F PAT=AssetData/ClientSafetyConfiguration.pbtxt  
F PAT=AssetData/ClientSwitchConfiguration.pbtxt
F PAT=Info.plist

Looking at the header of these pbtxt files, we see the magic bytes for skencv1. This is an undocumented encryption scheme that Apple uses for various things.

00000000: 736b 656e 6376 3198 c721 812d 3fe2 c025  skencv1..!.-?..%
00000010: 6ab5 a673 051f 3891 e9da 1b4b b342 39e4  j..s..8....K.B9.
00000020: 6461 fa94 03a3 3e08 4b4d c31f b1e1 0509  da....>.KM......
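Since there are two layers with two different magic values, a small helper for telling them apart can save some confusion when juggling intermediate files. This is only a sketch: the labels are my own wording and `identify_container` is a hypothetical name. It checks nothing beyond the leading bytes.

```python
# Magic prefixes seen so far in this post; the labels are informal.
MAGICS = {
    b"AEA1": "Apple Encrypted Archive (outer layer)",
    b"skencv1": "skencv1 (inner pbtxt layer)",
}


def identify_container(header: bytes):
    """Return (label, magic_length) for a known magic, else (None, 0)."""
    for magic, label in MAGICS.items():
        if header.startswith(magic):
            return label, len(magic)
    return None, 0


# First 16 bytes of the pbtxt hex dump shown above.
header = bytes.fromhex("736b656e63763198c721812d3fe2c025")
label, magic_len = identify_container(header)
print(label, magic_len)
```

Everything after the 7-byte skencv1 magic is opaque at this point, so the helper deliberately makes no claim about what follows it.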

Decrypting the configuration files

I couldn’t find any documentation on the skencv1 format. I tried some naive things, like decrypting it with the same key as the AEA1 archive, but that didn’t work. The SummarizationKit.framework binary isn’t directly accessible as a standalone file: on modern macOS it lives inside the dyld shared cache, a single monolithic binary containing many system frameworks. You can run brew install blacktop/tap/ipsw to install a tool that will extract it from /System/Cryptexes/OS/System/Library/dyld/dyld_shared_cache_arm64e.

mkdir -p /tmp/dyld_cache
ipsw dyld extract \
/System/Cryptexes/OS/System/Library/dyld/dyld_shared_cache_arm64e \
SummarizationKit \
-o /tmp/dyld_cache
    • Created /tmp/dyld_cache/SummarizationKit

I figured there would be a reference to the skencv1 string somewhere in the binary, so I grepped for it:

strings -t x SummarizationKit | grep skencv1
1acc30 skencv1

I loaded SummarizationKit into Ghidra, searched for 73 6b 65 6e 63 76 31 (ASCII for skencv1), and found a reference at 0x29dfaf2b0. After tracing the cross-references (XREFs), I eventually found the function responsible for handling the skencv1 format, FUN_268cfdaf8. The pseudocode for the function is as follows:

undefined1[16] FUN_268cfdaf8(long param_1, ulong param_2)
{
    // 1. Lazy-initialize skencv1 constant (first time only)
    if (DAT_299467550 != -1) {
        _swift_once(&DAT_299467550, FUN_268cfdacc);
    }
    
    // 2. Load skencv1 string reference
    uVar2 = DAT_29946b5c0;
    uVar10 = DAT_29946b5b8;
    
    // 3. Extract first 7 bytes of input
    FUN_268d34fb8(0, 7, param_1, param_2);
    
    // 4. Compare to "skencv1"
    FUN_268cfeaac(uVar10, uVar2, uVar7, uVar9);
    
    // 5. If magic matches, proceed to decrypt
    if ((uVar10 & 1) != 0) {
        
        // === KEY INITIALIZATION ===
        if (DAT_299467558 != -1) {
            _swift_once(&DAT_299467558, FUN_268cfe2bc);  // <-- CRITICAL
        }
        
        // 6. Load key material
        uVar7 = DAT_29946ac00;   // <-- Key data
        uVar2 = DAT_29946abf8;   // <-- Key data
        
        // 7. Skip the 7-byte magic header
        FUN_268d34fb8(7, uVar10, param_1, param_2);
        
        // ... decryption continues ...
    }
}

I tried several ways of using Frida to hook various Apple crypto functions and dump the key or the data being decrypted, but none of them worked. I also tried dumping the entire memory of the process after triggering the decryption, hoping that the key or decrypted data would be floating around somewhere, but again no luck. With some help from Codex, I created an ASLR-aware lldb script that sets a breakpoint on FUN_268cfdaf8. After triggering the decryption in Notes, I was able to hit the breakpoint and inspect the registers to find the key material being used for decryption.

The full script is reproduced below, after the commands for running it.

At a high level, it does the following:

  1. Calculates ASLR slide from SummarizationKit load address
  2. Sets breakpoints on Ghidra-identified functions at runtime addresses
  3. Captures register values (x0/x1) at entry and return
  4. Dumps key-global memory snapshots before/after events
  5. Decodes Swift-like buffer objects from x1 layouts
  6. Auto-dumps input/output buffers to /tmp/sktrace-dumps
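The first two steps are plain arithmetic, and it helps to see them in isolation. The sketch below uses the Ghidra image base and function address from this post. The load address 0x287000000 is only an illustrative value (it will differ on every launch), and the two helper names are made up for this example.

```python
GHIDRA_IMAGE_BASE = 0x268BC0000  # image base of SummarizationKit in Ghidra
MAIN_HANDLER = 0x268CFDAF8       # FUN_268cfdaf8 as shown in Ghidra


def aslr_slide(module_base, static_base=GHIDRA_IMAGE_BASE):
    """Slide = where the module actually loaded minus where Ghidra put it."""
    return module_base - static_base


def runtime_address(static_addr, slide):
    """Translate a Ghidra address into an address valid in the live process."""
    return static_addr + slide


example_base = 0x287000000  # illustrative load address, not a real capture
slide = aslr_slide(example_base)
print(hex(slide))                                 # 0x1e440000
print(hex(runtime_address(MAIN_HANDLER, slide)))  # 0x28713daf8
```

The same slide applies to data addresses such as the key globals, which is why a single subtraction is enough to relocate every breakpoint and memory read.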

The commands to run the script are:

launchctl kill SIGKILL gui/$(id -u)/com.apple.generativeexperiencesd 2>/dev/null || true
xcrun lldb -w -n generativeexperiencesd

Then while in lldb:

command script import lldb_skencv1_trace.py
sktrace_init
c

Trigger text summarization from the Notes app, then watch the lldb console for the decryption to happen.

#!/usr/bin/env python3
"""
LLDB helper to trace skencv1 anchor functions in SummarizationKit.

Usage:
  (lldb) command script import /path/to/text-to-summary/lldb_skencv1_trace.py
  (lldb) sktrace_init  # auto-detects SummarizationKit load base
  (lldb) # or: sktrace_init 0x287000000
  (lldb) c

Optional:
  (lldb) sktrace_show_globals
  (lldb) sktrace_set_dump_dir /tmp/sktrace-dumps
  (lldb) sktrace_set_max_dumps 128
"""
import time
import struct
import os
import lldb

GHIDRA_IMAGE_BASE = 0x268BC0000

ANCHOR_FUNCS = {
    "main_handler": 0x268CFDAF8,  # FUN_268cfdaf8
    "key_init": 0x268CFE2BC,  # FUN_268cfe2bc
    "key_source": 0x268CFEF18,  # FUN_268cfef18
}

KEY_GLOBALS = {
    "key_g0": 0x29946ABF8,
    "key_g1": 0x29946AC00,
    "key_g2": 0x29946AC08,
    "key_g3": 0x29946AC10,
}

MAX_HEX_DUMP = 96
MAX_PTR_DUMP = 64

_STATE = {
    "slide": None,
    "module_base": None,
    "installed": False,
    "hit_counts": {},
    "last_main_ret_sig": None,
    "last_key_init_ret_sig": None,
    "last_key_source_ret_sig": None,
    "last_globals_hex": {},
    "main_thread_seen_at": {},
    "context_window_s": 0.4,
    "main_entry_count": 0,
    "main_return_count": 0,
    "input_dump_count": 0,
    "output_dump_count": 0,
    "dump_dir": "/tmp/sktrace-dumps",
    "max_auto_dumps": 32,
    "key_init_log_limit": 4,
    "key_source_log_limit": 4,
    "key_init_entry_seen": 0,
    "key_source_entry_seen": 0,
}


def _log(msg):
    print("[sktrace] {}".format(msg))


def _read_mem(process, addr, size):
    if not addr:
        return None
    err = lldb.SBError()
    data = process.ReadMemory(addr, size, err)
    if not err.Success():
        return None
    return data


def _untag_ptr(ptr):
    # arm64e user pointers in this target frequently carry top-byte tags (for example 0x40...)
    # TBI means masking to 56 bits is usually the right canonical form for LLDB memory reads.
    return ptr & 0x00FFFFFFFFFFFFFF


def _u64_le(buf, off):
    if buf is None or off + 8 > len(buf):
        return 0
    return struct.unpack_from("<Q", buf, off)[0]


def _hexdump(data, width=16):
    if not data:
        return "<empty>"
    if isinstance(data, str):
        data = data.encode("latin1", errors="ignore")
    lines = []
    for i in range(0, len(data), width):
        chunk = data[i : i + width]
        hx = " ".join("{:02x}".format(b) for b in chunk)
        lines.append(hx)
    return "\n".join(lines)


def _reg_u64(frame, reg_name):
    reg = frame.FindRegister(reg_name)
    if not reg.IsValid():
        return 0
    return reg.GetValueAsUnsigned()


def _dump_ptr(process, label, ptr, size=MAX_PTR_DUMP):
    if ptr == 0:
        _log("{}: 0x0".format(label))
        return
    canonical = _untag_ptr(ptr)
    blob = _read_mem(process, canonical, size)
    if blob is None:
        _log("{}: 0x{:x} (canonical 0x{:x}, unreadable)".format(label, ptr, canonical))
        return
    blob = blob[:MAX_HEX_DUMP]
    _log("{}: 0x{:x} (canonical 0x{:x})\n{}".format(label, ptr, canonical, _hexdump(blob)))


def _ascii_preview(blob):
    if not blob:
        return ""
    out = []
    for b in blob:
        if 32 <= b <= 126:
            out.append(chr(b))
        else:
            out.append(".")
    return "".join(out)


def _extract_swift_like_buffer(process, obj_ptr):
    if obj_ptr == 0:
        return None
    obj = _untag_ptr(obj_ptr)
    header = _read_mem(process, obj, 0x30)
    if header is None or len(header) < 0x28:
        return None
    cand_ptr = _u64_le(header, 0x10)
    cand_len = _u64_le(header, 0x18)
    cand_cap = _u64_le(header, 0x20)
    if cand_ptr == 0 or cand_len == 0:
        return None
    if cand_len > 8 * 1024 * 1024:
        return None
    buf_ptr = _untag_ptr(cand_ptr)
    read_len = min(cand_len, 256 * 1024)
    data = _read_mem(process, buf_ptr, read_len)
    if data is None:
        return None
    preview = data[:64]
    return {
        "obj_ptr": obj_ptr,
        "ptr": buf_ptr,
        "len": cand_len,
        "cap": cand_cap,
        "preview": preview,
        "data": data,
    }


def _maybe_dump_buffer(kind, info):
    if info is None:
        return None
    if kind == "input":
        _STATE["input_dump_count"] += 1
        idx = _STATE["input_dump_count"]
    else:
        _STATE["output_dump_count"] += 1
        idx = _STATE["output_dump_count"]
    if idx > _STATE["max_auto_dumps"]:
        return None
    os.makedirs(_STATE["dump_dir"], exist_ok=True)
    path = os.path.join(
        _STATE["dump_dir"],
        "{}_{:03d}_len{}.bin".format(kind, idx, info["len"]),
    )
    with open(path, "wb") as f:
        f.write(info["data"])
    return path


def _dump_swift_like_buffer(process, label, obj_ptr):
    info = _extract_swift_like_buffer(process, obj_ptr)
    if info is None:
        return None
    head8 = info["preview"][:8]
    ascii_preview = _ascii_preview(info["preview"][:32])
    _log(
        "{} decoded-buffer obj=0x{:x} ptr=0x{:x} len={} cap={} head8={} ascii='{}'".format(
            label,
            info["obj_ptr"],
            info["ptr"],
            info["len"],
            info["cap"],
            head8.hex(),
            ascii_preview,
        )
    )
    _log("{} decoded-buffer hex:\n{}".format(label, _hexdump(info["preview"])))
    return info


def _inc_hit(name):
    current = _STATE["hit_counts"].get(name, 0) + 1
    _STATE["hit_counts"][name] = current
    return current


def _remember_main_thread(thread_id):
    _STATE["main_thread_seen_at"][thread_id] = time.time()


def _is_recent_main_thread(thread_id):
    t = _STATE["main_thread_seen_at"].get(thread_id)
    if t is None:
        return False
    return (time.time() - t) <= _STATE["context_window_s"]


def _runtime_addr(static_addr):
    slide = _STATE.get("slide")
    if slide is None:
        return None
    return static_addr + slide


def _snapshot_key_globals(process):
    slide = _STATE.get("slide")
    if slide is None:
        _log("slide not initialized; cannot snapshot key globals")
        return
    for label, static_addr in KEY_GLOBALS.items():
        runtime_addr = static_addr + slide
        blob = _read_mem(process, runtime_addr, 32)
        if blob is None:
            _log("{} @ 0x{:x}: unreadable".format(label, runtime_addr))
            continue
        hx = blob.hex()
        _log("{} @ 0x{:x}: {}".format(label, runtime_addr, hx))
        if _STATE["last_globals_hex"].get(label) == hx:
            continue
        _STATE["last_globals_hex"][label] = hx
        _dump_key_global_candidates(process, label, blob)


def _dump_key_global_candidates(process, label, blob):
    if blob is None or len(blob) < 32:
        return
    q = [struct.unpack_from("<Q", blob, i * 8)[0] for i in range(4)]
    _log(
        "{} qwords: [{}]".format(
            label, ", ".join("0x{:x}".format(x) for x in q)
        )
    )
    seen = set()
    for idx, raw in enumerate(q):
        if raw == 0:
            continue
        ptr = _untag_ptr(raw)
        if ptr in seen:
            continue
        seen.add(ptr)
        if ptr < 0x1000:
            continue
        chunk = _read_mem(process, ptr, 0x40)
        if chunk is None:
            _log("{} q{} ptr=0x{:x} unreadable".format(label, idx, ptr))
            continue
        _log("{} q{} ptr=0x{:x} hdr:\n{}".format(label, idx, ptr, _hexdump(chunk[:0x40])))
        _dump_swift_like_buffer(process, "{} q{}".format(label, idx), raw)


def _set_temp_return_bp(target, return_addr, callback_name):
    bp = target.BreakpointCreateByAddress(return_addr)
    bp.SetOneShot(True)
    bp.SetAutoContinue(True)
    bp.SetScriptCallbackFunction("{}.{}".format(__name__, callback_name))
    return bp


def _common_entry_trace(frame, label):
    target = frame.GetThread().GetProcess().GetTarget()
    pc = frame.GetPCAddress().GetLoadAddress(target)
    tid = frame.GetThread().GetThreadID()
    c = _inc_hit(label)
    x0 = _reg_u64(frame, "x0")
    x1 = _reg_u64(frame, "x1")
    x2 = _reg_u64(frame, "x2")
    x3 = _reg_u64(frame, "x3")
    _log(
        "{} entry #{} t={} pc=0x{:x} x0=0x{:x} x1=0x{:x} x2=0x{:x} x3=0x{:x}".format(
            label, c, tid, pc, x0, x1, x2, x3
        )
    )
    if label == "FUN_268cfdaf8":
        _remember_main_thread(tid)
        _STATE["main_entry_count"] += 1
    if x0 >> 32:
        _log("{} x0_hi32(candidate_len)={}".format(label, x0 >> 32))
    process = frame.GetThread().GetProcess()
    _dump_ptr(process, "{} x0".format(label), x0)
    _dump_ptr(process, "{} x1".format(label), x1)
    info = _dump_swift_like_buffer(process, "{} x1".format(label), x1)
    if label == "FUN_268cfdaf8" and info is not None:
        if info["preview"][:7] == b"skencv1":
            p = _maybe_dump_buffer("input", info)
            if p:
                _log("input buffer dumped: {}".format(p))


def main_handler_ret_cb(frame, bp_loc, _dict):
    process = frame.GetThread().GetProcess()
    target = process.GetTarget()
    pc = frame.GetPCAddress().GetLoadAddress(target)
    x0 = _reg_u64(frame, "x0")
    x1 = _reg_u64(frame, "x1")
    sig = (pc, x0, x1)
    if _STATE.get("last_main_ret_sig") == sig:
        return False
    _STATE["last_main_ret_sig"] = sig
    _STATE["main_return_count"] += 1
    _log("main_handler return pc=0x{:x} x0=0x{:x} x1=0x{:x}".format(pc, x0, x1))
    if x0 >> 32:
        _log("main_handler return x0_hi32(candidate_len)={}".format(x0 >> 32))
    _dump_ptr(process, "main_handler ret x0", x0)
    _dump_ptr(process, "main_handler ret x1", x1)
    out_info = _dump_swift_like_buffer(process, "main_handler ret x1", x1)
    if out_info is not None:
        p = _maybe_dump_buffer("output", out_info)
        if p:
            _log("output buffer dumped: {}".format(p))
    _snapshot_key_globals(process)
    return False


def key_init_ret_cb(frame, bp_loc, _dict):
    process = frame.GetThread().GetProcess()
    target = process.GetTarget()
    pc = frame.GetPCAddress().GetLoadAddress(target)
    x0 = _reg_u64(frame, "x0")
    x1 = _reg_u64(frame, "x1")
    sig = (pc, x0, x1)
    if _STATE.get("last_key_init_ret_sig") == sig:
        return False
    _STATE["last_key_init_ret_sig"] = sig
    _log("key_init return pc=0x{:x} x0=0x{:x} x1=0x{:x}".format(pc, x0, x1))
    _dump_ptr(process, "key_init ret x0", x0)
    _dump_ptr(process, "key_init ret x1", x1)
    _dump_swift_like_buffer(process, "key_init ret x1", x1)
    _snapshot_key_globals(process)
    return False


def key_source_ret_cb(frame, bp_loc, _dict):
    process = frame.GetThread().GetProcess()
    target = process.GetTarget()
    pc = frame.GetPCAddress().GetLoadAddress(target)
    x0 = _reg_u64(frame, "x0")
    x1 = _reg_u64(frame, "x1")
    sig = (pc, x0, x1)
    if _STATE.get("last_key_source_ret_sig") == sig:
        return False
    _STATE["last_key_source_ret_sig"] = sig
    _log("key_source return pc=0x{:x} x0=0x{:x} x1=0x{:x}".format(pc, x0, x1))
    _dump_ptr(process, "key_source ret x0", x0)
    _dump_ptr(process, "key_source ret x1", x1)
    _dump_swift_like_buffer(process, "key_source ret x1", x1)
    _snapshot_key_globals(process)
    return False


def main_handler_entry_cb(frame, bp_loc, _dict):
    _common_entry_trace(frame, "FUN_268cfdaf8")
    target = frame.GetThread().GetProcess().GetTarget()
    lr = _reg_u64(frame, "x30")
    if lr:
        _set_temp_return_bp(target, lr, "main_handler_ret_cb")
    return False


def key_init_entry_cb(frame, bp_loc, _dict):
    tid = frame.GetThread().GetThreadID()
    if not _is_recent_main_thread(tid):
        return False
    _STATE["key_init_entry_seen"] += 1
    if _STATE["key_init_entry_seen"] > _STATE["key_init_log_limit"]:
        return False
    _common_entry_trace(frame, "FUN_268cfe2bc")
    target = frame.GetThread().GetProcess().GetTarget()
    lr = _reg_u64(frame, "x30")
    if lr:
        _set_temp_return_bp(target, lr, "key_init_ret_cb")
    return False


def key_source_entry_cb(frame, bp_loc, _dict):
    tid = frame.GetThread().GetThreadID()
    if not _is_recent_main_thread(tid):
        return False
    _STATE["key_source_entry_seen"] += 1
    if _STATE["key_source_entry_seen"] > _STATE["key_source_log_limit"]:
        return False
    _common_entry_trace(frame, "FUN_268cfef18")
    target = frame.GetThread().GetProcess().GetTarget()
    lr = _reg_u64(frame, "x30")
    if lr:
        _set_temp_return_bp(target, lr, "key_source_ret_cb")
    return False


def _detect_module_base(target, module_name_substr="SummarizationKit"):
    n = target.GetNumModules()
    for i in range(n):
        module = target.GetModuleAtIndex(i)
        if not module.IsValid():
            continue
        filename = module.GetFileSpec().GetFilename()
        if not filename:
            continue
        if module_name_substr in filename:
            addr = module.GetObjectFileHeaderAddress()
            if addr.IsValid():
                return addr.GetLoadAddress(target)
    return None


def _install_bp(target, runtime_addr, cb_name, label):
    bp = target.BreakpointCreateByAddress(runtime_addr)
    bp.SetAutoContinue(True)
    bp.SetScriptCallbackFunction("{}.{}".format(__name__, cb_name))
    _log("{} breakpoint #{} @ 0x{:x}".format(label, bp.GetID(), runtime_addr))
    return bp


def sktrace_init(debugger, command, exe_ctx, result, _dict):
    target = debugger.GetSelectedTarget()
    if not target.IsValid():
        result.PutCString("No valid target.")
        return
    arg = command.strip()
    module_base = None
    if arg:
        try:
            module_base = int(arg, 16)
        except ValueError:
            result.PutCString("Invalid address: {}".format(arg))
            return
    else:
        module_base = _detect_module_base(target)
        if module_base is None:
            result.PutCString(
                "Unable to auto-detect SummarizationKit base. Pass one manually: sktrace_init 0x..."
            )
            return
    slide = module_base - GHIDRA_IMAGE_BASE
    _STATE["slide"] = slide
    _STATE["module_base"] = module_base
    _STATE["hit_counts"] = {}
    _STATE["last_main_ret_sig"] = None
    _STATE["last_key_init_ret_sig"] = None
    _STATE["last_key_source_ret_sig"] = None
    _STATE["last_globals_hex"] = {}
    _STATE["main_thread_seen_at"] = {}
    _STATE["main_entry_count"] = 0
    _STATE["main_return_count"] = 0
    _STATE["input_dump_count"] = 0
    _STATE["output_dump_count"] = 0
    _STATE["key_init_entry_seen"] = 0
    _STATE["key_source_entry_seen"] = 0
    _log(
        "module_base=0x{:x} ghidra_base=0x{:x} slide=0x{:x}".format(
            module_base, GHIDRA_IMAGE_BASE, slide
        )
    )
    _install_bp(
        target,
        _runtime_addr(ANCHOR_FUNCS["main_handler"]),
        "main_handler_entry_cb",
        "FUN_268cfdaf8",
    )
    _install_bp(
        target,
        _runtime_addr(ANCHOR_FUNCS["key_init"]),
        "key_init_entry_cb",
        "FUN_268cfe2bc",
    )
    _install_bp(
        target,
        _runtime_addr(ANCHOR_FUNCS["key_source"]),
        "key_source_entry_cb",
        "FUN_268cfef18",
    )
    process = target.GetProcess()
    if process and process.IsValid():
        _snapshot_key_globals(process)
    _STATE["installed"] = True
    result.PutCString(
        "sktrace initialized at {}. Dumps dir: {}. Continue with 'c' and trigger config load.".format(
            time.strftime("%Y-%m-%d %H:%M:%S"), _STATE["dump_dir"]
        )
    )


def sktrace_show_globals(debugger, command, exe_ctx, result, _dict):
    target = debugger.GetSelectedTarget()
    if not target.IsValid():
        result.PutCString("No valid target.")
        return
    process = target.GetProcess()
    if not process.IsValid():
        result.PutCString("No running process.")
        return
    if _STATE.get("slide") is None:
        result.PutCString("Run sktrace_init first.")
        return
    _snapshot_key_globals(process)
    result.PutCString("Done.")


def sktrace_set_dump_dir(debugger, command, exe_ctx, result, _dict):
    dump_dir = command.strip()
    if not dump_dir:
        result.PutCString("Usage: sktrace_set_dump_dir /absolute/or/relative/path")
        return
    _STATE["dump_dir"] = os.path.abspath(dump_dir)
    os.makedirs(_STATE["dump_dir"], exist_ok=True)
    result.PutCString("dump_dir set to: {}".format(_STATE["dump_dir"]))


def sktrace_set_max_dumps(debugger, command, exe_ctx, result, _dict):
    raw = command.strip()
    if not raw:
        result.PutCString("Usage: sktrace_set_max_dumps <positive_int>")
        return
    try:
        val = int(raw)
    except ValueError:
        result.PutCString("Invalid integer: {}".format(raw))
        return
    if val <= 0:
        result.PutCString("max dumps must be > 0")
        return
    _STATE["max_auto_dumps"] = val
    result.PutCString("max_auto_dumps set to: {}".format(val))


def __lldb_init_module(debugger, _dict):
    debugger.HandleCommand(
        "command script add -f {}.sktrace_init sktrace_init".format(__name__)
    )
    debugger.HandleCommand(
        "command script add -f {}.sktrace_show_globals sktrace_show_globals".format(__name__)
    )
    debugger.HandleCommand(
        "command script add -f {}.sktrace_set_dump_dir sktrace_set_dump_dir".format(__name__)
    )
    debugger.HandleCommand(
        "command script add -f {}.sktrace_set_max_dumps sktrace_set_max_dumps".format(__name__)
    )
    _log(
        "Loaded. Commands: sktrace_init [module_base_hex], sktrace_show_globals, "
        "sktrace_set_dump_dir <path>, sktrace_set_max_dumps <n>"
    )

FM Overrides

I found some great research on the FM Override files, done by github.com/BlueFalconHD/apple_generative_model_safety_decrypted

Their process for finding the decryption key was roughly:

  1. Used DTrace to identify which process reads the .enc files
  2. Found that GenerativeExperiencesSafetyInferenceProvider calls ModelCatalog.Obfuscation.readObfuscatedContents
  3. Set an LLDB breakpoint on CryptoKit.AES.GCM.open(_:using:) at offset +36
  4. Read the SymmetricKey from a register using Xcode’s Swift LLDB

By following their process, I was able to get this output from lldb:

🔑 dae8ad6ae7cee414a60525b107abbb3ec6d3f34d398d8c38317f67a3ddfc9989

Using this key, we can run their decrypt_overrides.py script to decrypt all of the FM Override files. What we discover is:

Rule Type      Count   Description
reject            56   Exact phrases that block the entire request
remove             2   Phrases silently removed from text
replace            4   Pattern → replacement mappings
regexReject    1,219   Regex patterns that block the request
regexReplace     880   Regex patterns with replacements
Total          2,161   All safety rules
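To get a feel for how the five rule types might interact, here is a small sketch that applies them to a string. It is an assumption throughout: the example rules are invented and do not come from the decrypted files, the evaluation order (rejects first, then removals and replacements) is a guess, and exact phrases are treated as plain substring matches. `apply_rules` is a hypothetical name, not something found in Apple's code.

```python
import re

# Hypothetical rules for illustration only; none of these come from the
# decrypted override files.
RULES = {
    "reject": ["blocked phrase"],
    "remove": ["filler phrase"],
    "replace": {"old term": "new term"},
    "regexReject": [r"\bforbidden\s+\w+"],
    "regexReplace": {r"\bcolou?r\b": "color"},
}


def apply_rules(text, rules):
    """Return the rewritten text, or None if any reject rule matched."""
    # Assumed order: both reject types are checked before any rewriting.
    if any(phrase in text for phrase in rules.get("reject", [])):
        return None
    if any(re.search(p, text) for p in rules.get("regexReject", [])):
        return None
    for phrase in rules.get("remove", []):
        text = text.replace(phrase, "")
    for old, new in rules.get("replace", {}).items():
        text = text.replace(old, new)
    for pattern, new in rules.get("regexReplace", {}).items():
        text = re.sub(pattern, new, text)
    return text


print(apply_rules("old term in colour", RULES))
print(apply_rules("this has a blocked phrase", RULES))
```

The real matching semantics (case sensitivity, word boundaries, ordering) would need to be confirmed against the decrypted files and the framework itself.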

On the off chance that it would work, I tried using the same decryption key that was found via lldb to decrypt the pbtxt files, but it didn’t work.