Finding the configuration files

I started by running fs_usage to see what kind of files were being accessed when I triggered summarization in the Notes app. A couple of paths caught my eye:

sudo fs_usage -w -f filesys | grep -i safety

19:41:01.285622  stat64                                 /System/Library/AssetsV2/com_apple_MobileAsset_UAF_FM_Overrides/purpose_auto/4b6103270669db6c18c7db52de8a466cc3bcd1ed.asset/AssetData                                 0.000003   GenerativeExperiencesSafetyInfer.11988643
19:41:01.285708  stat64                                 ary/AssetsV2/locks/com.apple.UnifiedAssetFramework/com.apple.MobileAsset.UAF.FM.Overrides/shared_locks/atomic_instance_3EEBCF9B-B245-4218-B810-D18DF18D77E3.locker    0.000003   GenerativeExperiencesSafetyInfer.11988643
19:41:01.285715  close             F=3                                                                                                                                                                                        0.000006   GenerativeExperiencesSafetyInfer.11988643
19:41:01.285912  close             F=4                                                                                                                                                                                        0.000042   GenerativeExperiencesSafetyInfer.12349812
19:41:03.119508  open              F=9        (R_____N________)  apple_MobileAsset_UAF_SummarizationKitConfiguration/purpose_auto/f46a9f714d900cc628bc5c42f704d7d6e68fcd29.asset/AssetData/ClientSafetyConfiguration.pbtxt    0.000066   generativeexperiencesd.12349901
19:41:03.119669    RdData[A]       D=0x003a6a3d  B=0x7000   /dev/disk3s5  ileAsset_UAF_SummarizationKitConfiguration/purpose_auto/f46a9f714d900cc628bc5c42f704d7d6e68fcd29.asset/AssetData/ClientSafetyConfiguration.pbtxt    0.000155 W generativeexperi.12349901
19:41:03.119964  stat64                                 A/Resources/OTAConfiguration/ClientSafetyConfiguration.pbtxt                                                                                                          0.000009   generativeexperiencesd.12349901
19:41:03.119969  lstat64                                A/Resources/OTAConfiguration/ClientSafetyConfiguration.pbtxt                                                                                                          0.000004   generativeexperiencesd.12349901
19:41:03.120066  open              F=9        (R_____N________)  A/Resources/OTAConfiguration/ClientSafetyConfiguration.pbtxt                                                                                                 0.000093   generativeexperiencesd.12349901
19:41:03.120229    RdData[A]       D=0x00fd1557  B=0x7000   /dev/disk3s1s1  A/Resources/OTAConfiguration/ClientSafetyConfiguration.pbtxt                                                                                      0.000153 W generativeexperi.12349901
19:41:03.197048  open              F=3        (R_____Nl_______)  sV2/locks/com.apple.UnifiedAssetFramework/com.apple.MobileAsset.UAF.FM.Overrides/shared_locks/atomic_instance_3EEBCF9B-B245-4218-B810-D18DF18D77E3.locker    0.000097   GenerativeExperiencesSafetyInfer.11988643
19:41:03.197472  getattrlist            [  2]           /System/Library/UnifiedAssetFramework/MinVersions                                                                                                                     0.000004   GenerativeExperiencesSafetyInfer.11988643
19:41:03.197480  getattrlist            [  2]           /System/Library/UnifiedAssetFramework/MinVersions                                                                                                                     0.000001   GenerativeExperiencesSafetyInfer.11988643
19:41:03.197641  open              F=4        (R_____Nl_______)  sV2/locks/com.apple.UnifiedAssetFramework/com.apple.MobileAsset.UAF.FM.Overrides/shared_locks/atomic_instance_3EEBCF9B-B245-4218-B810-D18DF18D77E3.locker    0.000061   GenerativeExperiencesSafetyInfer.11988643
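
With this much output it helps to group the paths by the process that touched them. The following is a minimal sketch, not part of my original workflow; the function name paths_by_process is made up for illustration, and it assumes the last column is always "process.threadid" and that the path is the only column containing a slash. Note that fs_usage truncates both process names and long paths, so the keys and values are only as complete as the trace itself.

```python
from collections import defaultdict


def paths_by_process(trace_lines):
    """Group file paths from fs_usage output by the process that accessed them."""
    grouped = defaultdict(set)
    for line in trace_lines:
        fields = line.split()
        if len(fields) < 3:
            continue
        # The last column looks like "<process name>.<thread id>".
        process = fields[-1].rsplit(".", 1)[0]
        for field in fields:
            # Assumption: any column with a slash is a path; skip raw device nodes.
            if "/" in field and not field.startswith("/dev/"):
                grouped[process].add(field)
    return grouped
```

Feeding it the captured lines (for example via sys.stdin) gives one set of paths per process name.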

It seems like there are two main types of files being accessed here. The first is the FM_Overrides asset, which is a deny list of terms that are blocked from being summarized. The second is a set of pbtxt files related to safety, which are opened by a daemon called generativeexperiencesd. Searching for these files gave me the following results:

find /System/Library/AssetsV2 -name "*MobileAsset.*UAF*"

/System/Library/AssetsV2/persisted/AutoAssetLocker/AutoAssetLocker_Entry_com.apple.MobileAsset.UAF.SummarizationKitConfiguration_com.apple.summarizationkit.ota.rules_1.0.6.13.202380_0.state
/System/Library/AssetsV2/persisted/AutoAssetLocker/AutoAssetLocker_Entry_com.apple.MobileAsset.UAF.SummarizationKitConfiguration_com.apple.summarizationkit.ota.configuration_1.1.14.13.202380_0.state
/System/Library/AssetsV2/persisted/AutoAssetDescriptors/AutoAssetDescriptors_Entry_com.apple.MobileAsset.UAF.SummarizationKitConfiguration_com.apple.summarizationkit.ota.rules_1.0.6.13.202380_0.state
/System/Library/AssetsV2/persisted/AutoAssetDescriptors/AutoAssetDescriptors_Entry_com.apple.MobileAsset.UAF.SummarizationKitConfiguration_com.apple.summarizationkit.ota.configuration_1.1.14.13.202380_0.state
/System/Library/AssetsV2/com_apple_MobileAsset_UAF_SummarizationKitConfiguration
/System/Library/AssetsV2/com_apple_MobileAsset_UAF_SummarizationKitConfiguration/purpose_auto/com_apple_MobileAsset_UAF_SummarizationKitConfiguration.xml
/System/Library/AssetsV2/com_apple_MobileAsset_UAF_SummarizationKitConfiguration/purpose_auto/3edcc9828b6f280a53f69b8b35ff9d664386ecb4.asset/AssetData/SummarizationOverrideRules.pbtxt

These filenames looked really interesting to me, specifically com_apple_MobileAsset_UAF_SummarizationKitConfiguration.xml and SummarizationOverrideRules.pbtxt. After looking at the contents of the XML file, it became clear that Apple uses a two-layer file structure for these files. The outer layer is AEA1 (Apple Encrypted Archive) with decryption keys defined in the XML:

<dict>
    <key>ArchiveDecryptionKey</key>
    <string>WcCBfaFArWm9RYdRAtxGpmhLXzIjTkWbFHLdOkucv64=</string>
    <key>AssetSpecifier</key>
    <string>com.apple.summarizationkit.ota.configuration</string>
    <key>__BaseURL</key>
    <string>https://updates.cdn-apple.com/2024/Iris/mobileassets/023-67517/C5C95674-2976-43AB-9DA4-19EB7764867A/</string>
    <key>__RelativePath</key>
    <string>com_apple_MobileAsset_UAF_SummarizationKitConfiguration/196573C4-3A88-4E86-B004-7A87C4D947EC.aar</string>
</dict>
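
As a rough illustration, the download URL and key can be pulled out of an entry like this with Python's plistlib. This is a sketch under assumptions: the fragment above is wrapped in a plist element by hand so that it parses on its own, and the helper name archive_info is made up. The real XML file contains more than this one dict.

```python
import plistlib

# The <dict> shown above, wrapped in <plist> so plistlib can parse it standalone.
ENTRY_XML = b"""<?xml version="1.0" encoding="UTF-8"?>
<plist version="1.0">
<dict>
    <key>ArchiveDecryptionKey</key>
    <string>WcCBfaFArWm9RYdRAtxGpmhLXzIjTkWbFHLdOkucv64=</string>
    <key>AssetSpecifier</key>
    <string>com.apple.summarizationkit.ota.configuration</string>
    <key>__BaseURL</key>
    <string>https://updates.cdn-apple.com/2024/Iris/mobileassets/023-67517/C5C95674-2976-43AB-9DA4-19EB7764867A/</string>
    <key>__RelativePath</key>
    <string>com_apple_MobileAsset_UAF_SummarizationKitConfiguration/196573C4-3A88-4E86-B004-7A87C4D947EC.aar</string>
</dict>
</plist>"""


def archive_info(entry):
    """Collect the fields needed to fetch and decrypt one asset archive."""
    return {
        "specifier": entry["AssetSpecifier"],
        # The archive URL is the base URL followed by the relative path.
        "url": entry["__BaseURL"] + entry["__RelativePath"],
        "key_b64": entry["ArchiveDecryptionKey"],
    }


info = archive_info(plistlib.loads(ENTRY_XML))
print(info["url"])
```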

We can download and confirm the AEA1 magic bytes of the archive using:

curl -sL "https://updates.cdn-apple.com/2024/Iris/mobileassets/023-67517/C5C95674-2976-43AB-9DA4-19EB7764867A/com_apple_MobileAsset_UAF_SummarizationKitConfiguration/196573C4-3A88-4E86-B004-7A87C4D947EC.aar" -o config.aar

xxd config.aar | head -1

00000000: 4145 4131 ...  (AEA1 magic)

And then decrypt the archive using the key from the XML:

echo "WcCBfaFArWm9RYdRAtxGpmhLXzIjTkWbFHLdOkucv64=" | base64 -d > key.bin
aea decrypt -i config.aar -o decrypted.aa -key key.bin -v

profile: hkdf_sha256_aesctr_hmac__symmetric__none
raw data size: 34134 B
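
The key-preparation step can also be done in Python, which makes it easy to sanity-check the key before handing it to aea. This is a small sketch; the function names are made up, and the 32-byte check is my own assumption about what a symmetric key for this profile should look like (the key from the XML does decode to 32 bytes).

```python
import base64


def decode_archive_key(key_b64):
    """Decode the base64 ArchiveDecryptionKey and check that it is 32 bytes long."""
    key = base64.b64decode(key_b64, validate=True)
    if len(key) != 32:
        raise ValueError("expected a 32-byte key, got {} bytes".format(len(key)))
    return key


def write_key_file(key_b64, path="key.bin"):
    """Write the raw key bytes to a file, like `base64 -d > key.bin` does."""
    with open(path, "wb") as f:
        f.write(decode_archive_key(key_b64))
```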

Now we can list the contents of the decrypted archive, which shows the pbtxt configuration files:

aa list -i decrypted.aa -v

F PAT=AssetData/ClassificationConfiguration.pbtxt
F PAT=AssetData/ClientSafetyConfiguration.pbtxt  
F PAT=AssetData/ClientSwitchConfiguration.pbtxt
F PAT=Info.plist

Looking at the header of these pbtxt files, we see the magic bytes for skencv1. This is an undocumented encryption scheme that Apple uses for various things.

00000000: 736b 656e 6376 3198 c721 812d 3fe2 c025  skencv1..!.-?..%
00000010: 6ab5 a673 051f 3891 e9da 1b4b b342 39e4  j..s..8....K.B9.
00000020: 6461 fa94 03a3 3e08 4b4d c31f b1e1 0509  da....>.KM......
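
To double-check a dump like this, the xxd lines can be turned back into raw bytes. The helper below is a sketch that assumes xxd's default layout (offset, colon, hex groups, two spaces, ASCII preview); xxd_to_bytes is a name I made up.

```python
def xxd_to_bytes(dump_lines):
    """Rebuild raw bytes from default-format xxd output lines."""
    out = bytearray()
    for line in dump_lines:
        # Drop the "00000000:" offset column.
        _, _, rest = line.partition(":")
        # The hex groups end at the first run of two spaces; the ASCII preview follows.
        hex_part = rest.strip().split("  ", 1)[0]
        out += bytes.fromhex(hex_part)
    return bytes(out)


HEADER_DUMP = [
    "00000000: 736b 656e 6376 3198 c721 812d 3fe2 c025  skencv1..!.-?..%",
    "00000010: 6ab5 a673 051f 3891 e9da 1b4b b342 39e4  j..s..8....K.B9.",
    "00000020: 6461 fa94 03a3 3e08 4b4d c31f b1e1 0509  da....>.KM......",
]

header = xxd_to_bytes(HEADER_DUMP)
print(header[:7])  # the seven magic bytes
```

Everything after the first seven bytes is opaque at this point; nothing in the dump alone says how it is laid out.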

Decrypting the configuration files

I couldn’t find any documentation on the skencv1 format. I tried some naive things, like decrypting it with the same key as the AEA1 archive, but that didn’t work, so the next step was to look at the code that reads these files. The SummarizationKit.framework binary isn’t directly accessible as a standalone file: on modern macOS it lives inside the dyld shared cache, a single monolithic file containing many system frameworks. You can run brew install blacktop/tap/ipsw to install a tool that will extract it from /System/Cryptexes/OS/System/Library/dyld/dyld_shared_cache_arm64e.

mkdir -p /tmp/dyld_cache
ipsw dyld extract \
/System/Cryptexes/OS/System/Library/dyld/dyld_shared_cache_arm64e \
SummarizationKit \
-o /tmp/dyld_cache

    • Created /tmp/dyld_cache/SummarizationKit

I figured that somewhere in the binary there would be a reference to the skencv1 string, so I grepped for it:

strings -t x SummarizationKit | grep skencv1

1acc30 skencv1
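
The same search can be done in Python, which is handy when the needle is raw bytes rather than a printable string. This is a minimal sketch; find_all is a made-up helper, and the offsets it reports are file offsets, like the ones strings -t x prints.

```python
# The seven ASCII bytes of the magic string.
NEEDLE = bytes.fromhex("73 6b 65 6e 63 76 31")


def find_all(haystack, needle):
    """Return every offset at which needle occurs in haystack."""
    offsets = []
    start = haystack.find(needle)
    while start != -1:
        offsets.append(start)
        start = haystack.find(needle, start + 1)
    return offsets


def find_in_file(path, needle=NEEDLE):
    """Read a binary from disk and list the offsets of needle as hex strings."""
    with open(path, "rb") as f:
        return [hex(off) for off in find_all(f.read(), needle)]
```

Calling find_in_file on the extracted SummarizationKit binary should list the same offset that strings reported.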

I loaded SummarizationKit into Ghidra and searched for 73 6b 65 6e 63 76 31 (ASCII for skencv1) and found a reference at 0x29dfaf2b0. After tracing the XREFs, I eventually found the function that was responsible for handling the skencv1 format, FUN_268cfdaf8. The pseudocode for the function is as follows:

undefined1[16] FUN_268cfdaf8(long param_1, ulong param_2)
{
    // 1. Lazy-initialize skencv1 constant (first time only)
    if (DAT_299467550 != -1) {
        _swift_once(&DAT_299467550, FUN_268cfdacc);
    }
    
    // 2. Load skencv1 string reference
    uVar2 = DAT_29946b5c0;
    uVar10 = DAT_29946b5b8;
    
    // 3. Extract first 7 bytes of input
    FUN_268d34fb8(0, 7, param_1, param_2);
    
    // 4. Compare to "skencv1"
    FUN_268cfeaac(uVar10, uVar2, uVar7, uVar9);
    
    // 5. If magic matches, proceed to decrypt
    if ((uVar10 & 1) != 0) {
        
        // === KEY INITIALIZATION ===
        if (DAT_299467558 != -1) {
            _swift_once(&DAT_299467558, FUN_268cfe2bc);  // <-- CRITICAL
        }
        
        // 6. Load key material
        uVar7 = DAT_29946ac00;   // <-- Key data
        uVar2 = DAT_29946abf8;   // <-- Key data
        
        // 7. Skip the 7-byte magic header
        FUN_268d34fb8(7, uVar10, param_1, param_2);
        
        // ... decryption continues ...
    }
}
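
Steps 3 to 7 above boil down to a magic check followed by skipping the header. A Python sketch of just that part follows; the decryption itself is not shown because it is still unknown at this point, and split_skencv1 is a name I made up.

```python
MAGIC = b"skencv1"


def split_skencv1(blob):
    """Return the bytes after the 7-byte magic, or None if the magic is missing."""
    # Steps 3 and 4: take the first 7 bytes and compare them to "skencv1".
    if blob[:len(MAGIC)] != MAGIC:
        return None
    # Step 7: skip the magic header; the rest is what gets decrypted.
    return blob[len(MAGIC):]
```

For the header shown earlier, the returned payload starts with the bytes 98 c7 21 81.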

I tried several different ways of using Frida to hook various Apple crypto functions and dump the key or the data being decrypted, but I had no luck. I also tried dumping the entire memory of the process after triggering the decryption, in the hope that the key or decrypted data would be floating around somewhere, but again no luck. With some help from Codex, I created an lldb debugging script that was ASLR-aware and would set a breakpoint on the FUN_268cfdaf8 function. After triggering the decryption in Notes, I was able to hit the breakpoint and inspect the registers to find the key material being used for decryption.

The full script, lldb_skencv1_trace.py, is reproduced further down.

At a high level, it does the following:

Calculates ASLR slide from SummarizationKit load address
Sets breakpoints on Ghidra-identified functions at runtime addresses
Captures register values (x0/x1) at entry and return
Dumps key-global memory snapshots before/after events
Decodes Swift-like buffer objects from x1 layouts
Auto-dumps input/output buffers to /tmp/sktrace-dumps
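
The ASLR part is plain arithmetic: the slide is the difference between where the module is loaded and the image base Ghidra used, and every static address is shifted by that amount. Here is a standalone sketch of the calculation; the load address is the example value from the script's usage notes, not one from a real session.

```python
GHIDRA_IMAGE_BASE = 0x268BC0000  # image base used in the Ghidra project


def runtime_addr(static_addr, module_base, image_base=GHIDRA_IMAGE_BASE):
    """Translate a Ghidra address into the address in the running process."""
    slide = module_base - image_base
    return static_addr + slide


# Example load address from the script's docstring (an illustration only).
example_base = 0x287000000
print(hex(runtime_addr(0x268CFDAF8, example_base)))
```

With that example base, FUN_268cfdaf8 ends up at 0x28713daf8.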

The commands to run the script are:

launchctl kill SIGKILL gui/$(id -u)/com.apple.generativeexperiencesd 2>/dev/null || true
xcrun lldb -w -n generativeexperiencesd

Then while in lldb:

command script import lldb_skencv1_trace.py
sktrace_init
c

Trigger text summarization from the Notes app, then watch the lldb console for the decryption to happen.

	#!/usr/bin/env python3
	"""
	LLDB helper to trace skencv1 anchor functions in SummarizationKit.

	Usage:
	(lldb) command script import /path/to/text-to-summary/lldb_skencv1_trace.py
	(lldb) sktrace_init # auto-detects SummarizationKit load base
	(lldb) # or: sktrace_init 0x287000000
	(lldb) c

	Optional:
	(lldb) sktrace_show_globals
	(lldb) sktrace_set_dump_dir /tmp/sktrace-dumps
	(lldb) sktrace_set_max_dumps 128
	"""

	import time
	import struct
	import os
	import lldb

	GHIDRA_IMAGE_BASE = 0x268BC0000

	ANCHOR_FUNCS = {
	    "main_handler": 0x268CFDAF8,  # FUN_268cfdaf8
	    "key_init": 0x268CFE2BC,  # FUN_268cfe2bc
	    "key_source": 0x268CFEF18,  # FUN_268cfef18
	}

	KEY_GLOBALS = {
	    "key_g0": 0x29946ABF8,
	    "key_g1": 0x29946AC00,
	    "key_g2": 0x29946AC08,
	    "key_g3": 0x29946AC10,
	}

	MAX_HEX_DUMP = 96
	MAX_PTR_DUMP = 64

	_STATE = {
	    "slide": None,
	    "module_base": None,
	    "installed": False,
	    "hit_counts": {},
	    "last_main_ret_sig": None,
	    "last_key_init_ret_sig": None,
	    "last_key_source_ret_sig": None,
	    "last_globals_hex": {},
	    "main_thread_seen_at": {},
	    "context_window_s": 0.4,
	    "main_entry_count": 0,
	    "main_return_count": 0,
	    "input_dump_count": 0,
	    "output_dump_count": 0,
	    "dump_dir": "/tmp/sktrace-dumps",
	    "max_auto_dumps": 32,
	    "key_init_log_limit": 4,
	    "key_source_log_limit": 4,
	    "key_init_entry_seen": 0,
	    "key_source_entry_seen": 0,
	}


	def _log(msg):
	    print("[sktrace] {}".format(msg))


	def _read_mem(process, addr, size):
	    if not addr:
	        return None
	    err = lldb.SBError()
	    data = process.ReadMemory(addr, size, err)
	    if not err.Success():
	        return None
	    return data


	def _untag_ptr(ptr):
	    # arm64e user pointers in this target frequently carry top-byte tags (for example 0x40...)
	    # TBI means masking to 56 bits is usually the right canonical form for LLDB memory reads.
	    return ptr & 0x00FFFFFFFFFFFFFF


	def _u64_le(buf, off):
	    if buf is None or off + 8 > len(buf):
	        return 0
	    return struct.unpack_from("<Q", buf, off)[0]


	def _hexdump(data, width=16):
	    if not data:
	        return "<empty>"
	    if isinstance(data, str):
	        data = data.encode("latin1", errors="ignore")
	    lines = []
	    for i in range(0, len(data), width):
	        chunk = data[i : i + width]
	        hx = " ".join("{:02x}".format(b) for b in chunk)
	        lines.append(hx)
	    return "\n".join(lines)


	def _reg_u64(frame, reg_name):
	    reg = frame.FindRegister(reg_name)
	    if not reg.IsValid():
	        return 0
	    return reg.GetValueAsUnsigned()


	def _dump_ptr(process, label, ptr, size=MAX_PTR_DUMP):
	    if ptr == 0:
	        _log("{}: 0x0".format(label))
	        return
	    canonical = _untag_ptr(ptr)
	    blob = _read_mem(process, canonical, size)
	    if blob is None:
	        _log("{}: 0x{:x} (canonical 0x{:x}, unreadable)".format(label, ptr, canonical))
	        return
	    blob = blob[:MAX_HEX_DUMP]
	    _log("{}: 0x{:x} (canonical 0x{:x})\n{}".format(label, ptr, canonical, _hexdump(blob)))


	def _ascii_preview(blob):
	    if not blob:
	        return ""
	    out = []
	    for b in blob:
	        if 32 <= b <= 126:
	            out.append(chr(b))
	        else:
	            out.append(".")
	    return "".join(out)


	def _extract_swift_like_buffer(process, obj_ptr):
	    if obj_ptr == 0:
	        return None
	    obj = _untag_ptr(obj_ptr)
	    header = _read_mem(process, obj, 0x30)
	    if header is None or len(header) < 0x28:
	        return None
	    cand_ptr = _u64_le(header, 0x10)
	    cand_len = _u64_le(header, 0x18)
	    cand_cap = _u64_le(header, 0x20)

	    if cand_ptr == 0 or cand_len == 0:
	        return None
	    if cand_len > 8 * 1024 * 1024:
	        return None

	    buf_ptr = _untag_ptr(cand_ptr)
	    read_len = min(cand_len, 256 * 1024)
	    data = _read_mem(process, buf_ptr, read_len)
	    if data is None:
	        return None
	    preview = data[:64]
	    return {
	        "obj_ptr": obj_ptr,
	        "ptr": buf_ptr,
	        "len": cand_len,
	        "cap": cand_cap,
	        "preview": preview,
	        "data": data,
	    }


	def _maybe_dump_buffer(kind, info):
	    if info is None:
	        return None
	    if kind == "input":
	        _STATE["input_dump_count"] += 1
	        idx = _STATE["input_dump_count"]
	    else:
	        _STATE["output_dump_count"] += 1
	        idx = _STATE["output_dump_count"]
	    if idx > _STATE["max_auto_dumps"]:
	        return None
	    os.makedirs(_STATE["dump_dir"], exist_ok=True)
	    path = os.path.join(
	        _STATE["dump_dir"],
	        "{}_{:03d}_len{}.bin".format(kind, idx, info["len"]),
	    )
	    with open(path, "wb") as f:
	        f.write(info["data"])
	    return path


	def _dump_swift_like_buffer(process, label, obj_ptr):
	    info = _extract_swift_like_buffer(process, obj_ptr)
	    if info is None:
	        return None

	    head8 = info["preview"][:8]
	    ascii_preview = _ascii_preview(info["preview"][:32])
	    _log(
	        "{} decoded-buffer obj=0x{:x} ptr=0x{:x} len={} cap={} head8={} ascii='{}'".format(
	            label,
	            info["obj_ptr"],
	            info["ptr"],
	            info["len"],
	            info["cap"],
	            head8.hex(),
	            ascii_preview,
	        )
	    )
	    _log("{} decoded-buffer hex:\n{}".format(label, _hexdump(info["preview"])))
	    return info


	def _inc_hit(name):
	    current = _STATE["hit_counts"].get(name, 0) + 1
	    _STATE["hit_counts"][name] = current
	    return current


	def _remember_main_thread(thread_id):
	    _STATE["main_thread_seen_at"][thread_id] = time.time()


	def _is_recent_main_thread(thread_id):
	    t = _STATE["main_thread_seen_at"].get(thread_id)
	    if t is None:
	        return False
	    return (time.time() - t) <= _STATE["context_window_s"]


	def _runtime_addr(static_addr):
	    slide = _STATE.get("slide")
	    if slide is None:
	        return None
	    return static_addr + slide


	def _snapshot_key_globals(process):
	    slide = _STATE.get("slide")
	    if slide is None:
	        _log("slide not initialized; cannot snapshot key globals")
	        return
	    for label, static_addr in KEY_GLOBALS.items():
	        runtime_addr = static_addr + slide
	        blob = _read_mem(process, runtime_addr, 32)
	        if blob is None:
	            _log("{} @ 0x{:x}: unreadable".format(label, runtime_addr))
	            continue
	        hx = blob.hex()
	        _log("{} @ 0x{:x}: {}".format(label, runtime_addr, hx))

	        if _STATE["last_globals_hex"].get(label) == hx:
	            continue
	        _STATE["last_globals_hex"][label] = hx
	        _dump_key_global_candidates(process, label, blob)


	def _dump_key_global_candidates(process, label, blob):
	    if blob is None or len(blob) < 32:
	        return
	    q = [struct.unpack_from("<Q", blob, i * 8)[0] for i in range(4)]
	    _log(
	        "{} qwords: [{}]".format(
	            label, ", ".join("0x{:x}".format(x) for x in q)
	        )
	    )
	    seen = set()
	    for idx, raw in enumerate(q):
	        if raw == 0:
	            continue
	        ptr = _untag_ptr(raw)
	        if ptr in seen:
	            continue
	        seen.add(ptr)
	        if ptr < 0x1000:
	            continue
	        chunk = _read_mem(process, ptr, 0x40)
	        if chunk is None:
	            _log("{} q{} ptr=0x{:x} unreadable".format(label, idx, ptr))
	            continue
	        _log("{} q{} ptr=0x{:x} hdr:\n{}".format(label, idx, ptr, _hexdump(chunk[:0x40])))
	        _dump_swift_like_buffer(process, "{} q{}".format(label, idx), raw)


	def _set_temp_return_bp(target, return_addr, callback_name):
	    bp = target.BreakpointCreateByAddress(return_addr)
	    bp.SetOneShot(True)
	    bp.SetAutoContinue(True)
	    bp.SetScriptCallbackFunction("{}.{}".format(__name__, callback_name))
	    return bp


	def _common_entry_trace(frame, label):
	    target = frame.GetThread().GetProcess().GetTarget()
	    pc = frame.GetPCAddress().GetLoadAddress(target)
	    tid = frame.GetThread().GetThreadID()
	    c = _inc_hit(label)
	    x0 = _reg_u64(frame, "x0")
	    x1 = _reg_u64(frame, "x1")
	    x2 = _reg_u64(frame, "x2")
	    x3 = _reg_u64(frame, "x3")
	    _log(
	        "{} entry #{} t={} pc=0x{:x} x0=0x{:x} x1=0x{:x} x2=0x{:x} x3=0x{:x}".format(
	            label, c, tid, pc, x0, x1, x2, x3
	        )
	    )
	    if label == "FUN_268cfdaf8":
	        _remember_main_thread(tid)
	        _STATE["main_entry_count"] += 1
	    if x0 >> 32:
	        _log("{} x0_hi32(candidate_len)={}".format(label, x0 >> 32))
	    process = frame.GetThread().GetProcess()
	    _dump_ptr(process, "{} x0".format(label), x0)
	    _dump_ptr(process, "{} x1".format(label), x1)
	    info = _dump_swift_like_buffer(process, "{} x1".format(label), x1)
	    if label == "FUN_268cfdaf8" and info is not None:
	        if info["preview"][:7] == b"skencv1":
	            p = _maybe_dump_buffer("input", info)
	            if p:
	                _log("input buffer dumped: {}".format(p))


	def main_handler_ret_cb(frame, bp_loc, _dict):
	    process = frame.GetThread().GetProcess()
	    target = process.GetTarget()
	    pc = frame.GetPCAddress().GetLoadAddress(target)
	    x0 = _reg_u64(frame, "x0")
	    x1 = _reg_u64(frame, "x1")
	    sig = (pc, x0, x1)
	    if _STATE.get("last_main_ret_sig") == sig:
	        return False
	    _STATE["last_main_ret_sig"] = sig
	    _STATE["main_return_count"] += 1
	    _log("main_handler return pc=0x{:x} x0=0x{:x} x1=0x{:x}".format(pc, x0, x1))
	    if x0 >> 32:
	        _log("main_handler return x0_hi32(candidate_len)={}".format(x0 >> 32))
	    _dump_ptr(process, "main_handler ret x0", x0)
	    _dump_ptr(process, "main_handler ret x1", x1)
	    out_info = _dump_swift_like_buffer(process, "main_handler ret x1", x1)
	    if out_info is not None:
	        p = _maybe_dump_buffer("output", out_info)
	        if p:
	            _log("output buffer dumped: {}".format(p))
	    _snapshot_key_globals(process)
	    return False


	def key_init_ret_cb(frame, bp_loc, _dict):
	    process = frame.GetThread().GetProcess()
	    target = process.GetTarget()
	    pc = frame.GetPCAddress().GetLoadAddress(target)
	    x0 = _reg_u64(frame, "x0")
	    x1 = _reg_u64(frame, "x1")
	    sig = (pc, x0, x1)
	    if _STATE.get("last_key_init_ret_sig") == sig:
	        return False
	    _STATE["last_key_init_ret_sig"] = sig
	    _log("key_init return pc=0x{:x} x0=0x{:x} x1=0x{:x}".format(pc, x0, x1))
	    _dump_ptr(process, "key_init ret x0", x0)
	    _dump_ptr(process, "key_init ret x1", x1)
	    _dump_swift_like_buffer(process, "key_init ret x1", x1)
	    _snapshot_key_globals(process)
	    return False


	def key_source_ret_cb(frame, bp_loc, _dict):
	    process = frame.GetThread().GetProcess()
	    target = process.GetTarget()
	    pc = frame.GetPCAddress().GetLoadAddress(target)
	    x0 = _reg_u64(frame, "x0")
	    x1 = _reg_u64(frame, "x1")
	    sig = (pc, x0, x1)
	    if _STATE.get("last_key_source_ret_sig") == sig:
	        return False
	    _STATE["last_key_source_ret_sig"] = sig
	    _log("key_source return pc=0x{:x} x0=0x{:x} x1=0x{:x}".format(pc, x0, x1))
	    _dump_ptr(process, "key_source ret x0", x0)
	    _dump_ptr(process, "key_source ret x1", x1)
	    _dump_swift_like_buffer(process, "key_source ret x1", x1)
	    _snapshot_key_globals(process)
	    return False


	def main_handler_entry_cb(frame, bp_loc, _dict):
	    _common_entry_trace(frame, "FUN_268cfdaf8")
	    target = frame.GetThread().GetProcess().GetTarget()
	    lr = _reg_u64(frame, "x30")
	    if lr:
	        _set_temp_return_bp(target, lr, "main_handler_ret_cb")
	    return False


	def key_init_entry_cb(frame, bp_loc, _dict):
	    tid = frame.GetThread().GetThreadID()
	    if not _is_recent_main_thread(tid):
	        return False
	    _STATE["key_init_entry_seen"] += 1
	    if _STATE["key_init_entry_seen"] > _STATE["key_init_log_limit"]:
	        return False
	    _common_entry_trace(frame, "FUN_268cfe2bc")
	    target = frame.GetThread().GetProcess().GetTarget()
	    lr = _reg_u64(frame, "x30")
	    if lr:
	        _set_temp_return_bp(target, lr, "key_init_ret_cb")
	    return False


	def key_source_entry_cb(frame, bp_loc, _dict):
	    tid = frame.GetThread().GetThreadID()
	    if not _is_recent_main_thread(tid):
	        return False
	    _STATE["key_source_entry_seen"] += 1
	    if _STATE["key_source_entry_seen"] > _STATE["key_source_log_limit"]:
	        return False
	    _common_entry_trace(frame, "FUN_268cfef18")
	    target = frame.GetThread().GetProcess().GetTarget()
	    lr = _reg_u64(frame, "x30")
	    if lr:
	        _set_temp_return_bp(target, lr, "key_source_ret_cb")
	    return False


	def _detect_module_base(target, module_name_substr="SummarizationKit"):
	    n = target.GetNumModules()
	    for i in range(n):
	        module = target.GetModuleAtIndex(i)
	        if not module.IsValid():
	            continue
	        filename = module.GetFileSpec().GetFilename()
	        if not filename:
	            continue
	        if module_name_substr in filename:
	            addr = module.GetObjectFileHeaderAddress()
	            if addr.IsValid():
	                return addr.GetLoadAddress(target)
	    return None


	def _install_bp(target, runtime_addr, cb_name, label):
	    bp = target.BreakpointCreateByAddress(runtime_addr)
	    bp.SetAutoContinue(True)
	    bp.SetScriptCallbackFunction("{}.{}".format(__name__, cb_name))
	    _log("{} breakpoint #{} @ 0x{:x}".format(label, bp.GetID(), runtime_addr))
	    return bp


	def sktrace_init(debugger, command, exe_ctx, result, _dict):
	    target = debugger.GetSelectedTarget()
	    if not target.IsValid():
	        result.PutCString("No valid target.")
	        return

	    arg = command.strip()
	    module_base = None
	    if arg:
	        try:
	            module_base = int(arg, 16)
	        except ValueError:
	            result.PutCString("Invalid address: {}".format(arg))
	            return
	    else:
	        module_base = _detect_module_base(target)

	    if module_base is None:
	        result.PutCString(
	            "Unable to auto-detect SummarizationKit base. Pass one manually: sktrace_init 0x..."
	        )
	        return

	    slide = module_base - GHIDRA_IMAGE_BASE
	    _STATE["slide"] = slide
	    _STATE["module_base"] = module_base
	    _STATE["hit_counts"] = {}
	    _STATE["last_main_ret_sig"] = None
	    _STATE["last_key_init_ret_sig"] = None
	    _STATE["last_key_source_ret_sig"] = None
	    _STATE["last_globals_hex"] = {}
	    _STATE["main_thread_seen_at"] = {}
	    _STATE["main_entry_count"] = 0
	    _STATE["main_return_count"] = 0
	    _STATE["input_dump_count"] = 0
	    _STATE["output_dump_count"] = 0
	    _STATE["key_init_entry_seen"] = 0
	    _STATE["key_source_entry_seen"] = 0

	    _log(
	        "module_base=0x{:x} ghidra_base=0x{:x} slide=0x{:x}".format(
	            module_base, GHIDRA_IMAGE_BASE, slide
	        )
	    )

	    _install_bp(
	        target,
	        _runtime_addr(ANCHOR_FUNCS["main_handler"]),
	        "main_handler_entry_cb",
	        "FUN_268cfdaf8",
	    )
	    _install_bp(
	        target,
	        _runtime_addr(ANCHOR_FUNCS["key_init"]),
	        "key_init_entry_cb",
	        "FUN_268cfe2bc",
	    )
	    _install_bp(
	        target,
	        _runtime_addr(ANCHOR_FUNCS["key_source"]),
	        "key_source_entry_cb",
	        "FUN_268cfef18",
	    )

	    process = target.GetProcess()
	    if process and process.IsValid():
	        _snapshot_key_globals(process)

	    _STATE["installed"] = True
	    result.PutCString(
	        "sktrace initialized at {}. Dumps dir: {}. Continue with 'c' and trigger config load.".format(
	            time.strftime("%Y-%m-%d %H:%M:%S"), _STATE["dump_dir"]
	        )
	    )


	def sktrace_show_globals(debugger, command, exe_ctx, result, _dict):
	    target = debugger.GetSelectedTarget()
	    if not target.IsValid():
	        result.PutCString("No valid target.")
	        return
	    process = target.GetProcess()
	    if not process.IsValid():
	        result.PutCString("No running process.")
	        return
	    if _STATE.get("slide") is None:
	        result.PutCString("Run sktrace_init first.")
	        return
	    _snapshot_key_globals(process)
	    result.PutCString("Done.")


	def sktrace_set_dump_dir(debugger, command, exe_ctx, result, _dict):
	    dump_dir = command.strip()
	    if not dump_dir:
	        result.PutCString("Usage: sktrace_set_dump_dir /absolute/or/relative/path")
	        return
	    _STATE["dump_dir"] = os.path.abspath(dump_dir)
	    os.makedirs(_STATE["dump_dir"], exist_ok=True)
	    result.PutCString("dump_dir set to: {}".format(_STATE["dump_dir"]))


	def sktrace_set_max_dumps(debugger, command, exe_ctx, result, _dict):
	    raw = command.strip()
	    if not raw:
	        result.PutCString("Usage: sktrace_set_max_dumps <positive_int>")
	        return
	    try:
	        val = int(raw)
	    except ValueError:
	        result.PutCString("Invalid integer: {}".format(raw))
	        return
	    if val <= 0:
	        result.PutCString("max dumps must be > 0")
	        return
	    _STATE["max_auto_dumps"] = val
	    result.PutCString("max_auto_dumps set to: {}".format(val))


	def __lldb_init_module(debugger, _dict):
	    debugger.HandleCommand(
	        "command script add -f {}.sktrace_init sktrace_init".format(__name__)
	    )
	    debugger.HandleCommand(
	        "command script add -f {}.sktrace_show_globals sktrace_show_globals".format(__name__)
	    )
	    debugger.HandleCommand(
	        "command script add -f {}.sktrace_set_dump_dir sktrace_set_dump_dir".format(__name__)
	    )
	    debugger.HandleCommand(
	        "command script add -f {}.sktrace_set_max_dumps sktrace_set_max_dumps".format(__name__)
	    )
	    _log(
	        "Loaded. Commands: sktrace_init [module_base_hex], sktrace_show_globals, "
	        "sktrace_set_dump_dir <path>, sktrace_set_max_dumps <n>"
	    )
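
The "Swift-like buffer" decoding in the script is a heuristic: it treats the object as having a data pointer at offset 0x10, a length at 0x18 and a capacity at 0x20, and strips the top byte of any pointer before reading it. The same parsing can be tried offline on a dumped header. This is a sketch that mirrors the script's guess; it is not a documented Swift layout, and parse_buffer_header is a made-up name.

```python
import struct

TAG_MASK = 0x00FFFFFFFFFFFFFF  # same 56-bit mask the script uses to untag pointers


def parse_buffer_header(header):
    """Decode pointer, length and capacity from a dumped object header."""
    if len(header) < 0x28:
        return None
    # Three little-endian 64-bit fields starting at offset 0x10.
    ptr, length, capacity = struct.unpack_from("<QQQ", header, 0x10)
    if ptr == 0 or length == 0:
        return None
    return {"ptr": ptr & TAG_MASK, "len": length, "cap": capacity}
```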


FM Overrides

I found some great research on the FM Override files at github.com/BlueFalconHD/apple_generative_model_safety_decrypted

Their process for finding the decryption key was roughly:

Use DTrace to identify which process reads .enc files
Find that GenerativeExperiencesSafetyInferenceProvider calls ModelCatalog.Obfuscation.readObfuscatedContents
Set an LLDB breakpoint on CryptoKit.AES.GCM.open(_:using:) at offset +36
Read the SymmetricKey from a register using Xcode’s Swift LLDB

Through their research, I was able to get this output from lldb:

🔑 dae8ad6ae7cee414a60525b107abbb3ec6d3f34d398d8c38317f67a3ddfc9989

Using this key, we can run their decrypt_overrides.py script to decrypt all of the FM Override files. What we discover is:

Rule Type	Count	Description
`reject`	56	Exact phrases that block the entire request
`remove`	2	Phrases silently removed from text
`replace`	4	Pattern → replacement mappings
`regexReject`	1,219	Regex patterns that block the request
`regexReplace`	880	Regex patterns with replacements
Total	2,161	All safety rules
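
To make the five rule types concrete, here is a sketch of how a filter built from them could behave. It is only an illustration of the descriptions in the table: the order of application, substring matching for reject, and the dictionary layout are my assumptions, and the rules in the usage note are made-up placeholders rather than entries from the decrypted files.

```python
import re


def apply_rules(text, rules):
    """Return the filtered text, or None if any reject rule blocks the request."""
    # reject: a listed phrase blocks the entire request.
    for phrase in rules.get("reject", []):
        if phrase in text:
            return None
    # regexReject: a matching pattern blocks the entire request.
    for pattern in rules.get("regexReject", []):
        if re.search(pattern, text):
            return None
    # remove: listed phrases are silently dropped from the text.
    for phrase in rules.get("remove", []):
        text = text.replace(phrase, "")
    # replace: literal pattern -> replacement mappings.
    for old, new in rules.get("replace", {}).items():
        text = text.replace(old, new)
    # regexReplace: regex pattern -> replacement mappings.
    for pattern, new in rules.get("regexReplace", {}).items():
        text = re.sub(pattern, new, text)
    return text
```

With placeholder rules such as {"reject": ["blocked phrase"], "replace": {"colour": "color"}}, any text containing "blocked phrase" comes back as None, while other text is returned with "colour" rewritten.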

On the off chance that it would work, I tried using the same decryption key that was found via lldb to decrypt the pbtxt files, but it didn’t work.