预备阶段
1.Lab要求
邪恶的邪恶博士在我们班的机器上安放了大量的“二元炸弹”。二进制炸弹是一个由一系列阶段组成的程序。每个阶段都要求你在 stdin 上键入一个特定的字符串。如果你键入了正确的字符串,那么这个阶段就会被拆除,炸弹就会进入下一个阶段。否则,炸弹会通过打印“ BOOM 然后终止。当每个阶段都被拆除时,炸弹就会被拆除。
我们要处理的炸弹太多了,所以我们给每个学生一个炸弹去拆除。你们的任务,你们别无选择,只能接受,就是在due date前拆除炸弹。祝你好运,欢迎加入拆弹小组!
Step1:了解bomb.tar文件
使用以下命令解压获取的压缩包: tar-xvf bomk.tar,这将创建一个名为./bomb的文件夹,包含以下文件:
• README
• bomb: The executable binary bomb.
• bomb.c: Source file with the bomb’s main routine and a friendly greeting from Dr.Evil.
Step 2: 拆除炸弹
你在这个实验室的工作就是拆除炸弹。你必须在其中一台机器上完成作业。事实上,有传言说邪恶博士真的很邪恶,如果把炸弹运到别处,它总是会爆炸。据我们所知,炸弹里还有其他一些防篡改装置。
你可以使用许多工具来帮助你拆除炸弹。请看提示部分获取一些提示和idea。最好的方法是使用调试器逐步通过反汇编的二进制文件拆除炸弹。
每次你的炸弹爆炸,它通知炸弹服务器,你失去了 1/2 点(最多 20 点)的最终分数。所以引爆炸弹是有后果的。你一定要小心!
前四个阶段各值10 分。第五阶段和第六阶段比较困难,所以每个阶段都值15 分。所以你能得到的最高分是70 分。
虽然每个阶段炸弹变得越来越难拆除,但是你从一个phase到另一个phase所获得的经验应该可以缓解这个困难。然而,最后一个phase甚至会难倒最优秀的学生,所以请不要等到最后一分钟才开始。
炸弹会忽略空白输入行。例如,如果您使用命令行参数运行炸弹,
linux> ./bomb psol.txt
linux > ./bomb psol.txt
然后它将从psol.txt读取输入行,直到达到 EOF (文件结束) ,然后切换到 stdin,这样你就不必一直重复输入你已经拆除的阶段的解决方案。
为了避免意外引爆炸弹,你需要学习如何单步通过汇编代码和如何设置断点。你还需要学习如何检查寄存器和内存状态。做实验的一个好的副作用就是你会非常擅长使用调试器。这是一项至关
重要的技能,将在你的职业生涯中带来丰厚的回报。
2.gdb调试器基本操作
下面的内容来自csapp官网所放出来的文件中。
Summary of GDB commands for x86-64 Systems
Starting:
-
gdb
-
gdb <file>
Running and stopping
Command | Effect |
---|---|
quit | Exit gdb |
run | Run program |
run 1 2 3 | Run program with command-line arguments 1 2 3 |
kill | Stop the program |
Ctrl-d | Exit gdb |
Note: Ctrl-C does not exit from gdb, but halts the current gdb command
Breakpoints
Command | Effect |
---|---|
break sum | Set breakpoint at the entry to function sum |
break *0x80483c3 | Set breakpoint at address 0x80483c3 |
delete 1 | Delete breakpoint 1 |
disable 1 | Disable the breakpoint 1(gdb numbers each breakpoint you create) |
enable 1 | Enable breakpoint 1 |
delete | Delete all breakpoints |
clear sum | Clear any breakpoints at the entry to function sum |
Execution
Command | Effect |
---|---|
stepi | Execute one instruction |
stepi 4 | Execute four instructions |
nexti | Like stepi, but proceed through function calls without stopping |
step | Execute one C statement |
continue | Resume execution until the next breakpoint |
until 3 | Continue executing until program hits breakpoint 3 |
finish | Resume execution until current function returns |
call sum(1, 2) | Call sum(1,2) and print return value Examining code |
disas | Disassemble current function |
disas sum | Disassemble function sum |
disas 0x80483b7 | Disassemble function around 0x80483b7 |
disas 0x80483b7 0x80483c7 | Disassemble code within specified address range |
print /x $rip | Print program counter in hex |
print /d $rip | Print program counter in decimal |
print /t $rip | Print program counter in binary |
Examining data
Command | Effect |
---|---|
print /d $rax | Print contents of %rax in decimal |
print /x $rax | Print contents of %rax in hex |
print /t $rax | Print contents of %rax in binary |
print /d (int)$rax | Print contents of %rax in decimal after sign-extending lower 32-bits. |
print 0x100 | Print decimal representation of 0x100 |
print /x 555 | Print hex representation of 555 |
print /x ($rsp+8) | Print (contents of %rsp) + 8 in hex |
print *(int *) 0xbffff890 | Print integer at address 0xbffff890 |
print *(int *) ($rsp+8) | Print integer at address %rsp + 8 |
print (char *) 0xbfff890 | Examine a string stored at 0xbffff890 |
x/w 0xbffff890 | Examine (4-byte) word starting at address 0xbffff890 |
x/w $rsp | Examine (4-byte) word starting at address in $rsp |
x/wd $rsp | Examine (4-byte) word starting at address in $rsp. Print in decimal |
x/2w $rsp | Examine two (4-byte) words starting at address in $rsp |
x/2wd $rsp | Examine two (4-byte) words starting at address in $rsp. Print in decimal |
x/g $rsp | Examine (8-byte) word starting at address in $rsp. |
x/gd $rsp | Examine (8-byte) word starting at address in $rsp. Print in decimal |
x/a $rsp | Examine address in $rsp. Print as offset from previous global symbol. |
x/s 0xbffff890 | Examine a string stored at 0xbffff890 |
x/20b sum | Examine first 20 opcode bytes of function sum |
x/10i sum | Examine first 10 instructions of function sum |
(Note: the format string for the ‘x’ command has the general form x/[NUM][SIZE][FORMAT] whereNUM = number of objects to display
SIZE = size of each object (b=byte, h=half-word, w=word,g=giant (quad-word))
FORMAT = how to display each object (d=decimal, x=hex, o=octal, etc.)
If you don’t specify SIZE or FORMAT, either a default value, or the last
value you specified in a previous ‘print’ or ‘x’ command is used.
)You need print /d (int)$rax to print 32-bit, negative numbers stored in the lower 32 bits of %rax. For example, if the lower 32-bits of %rax store 0xffffffff, you will see
(gdb) print $rax
$1 = 4294967295
(gdb) print (int)$rax
$2 = -1
(gdb)
Useful information
Command | Effect |
---|---|
backtrace | Print the current address and stack backtrace |
where | Print the current address and stack backtrace |
info program | Print current status of the program |
info functions | Print functions in program |
info stack | Print backtrace of the stack |
info frame | Print information about the current stack frame |
info registers | Print registers and their contents |
info breakpoints | Print status of user-settable breakpoints |
display /FMT EXPR | Print expression EXPR using format FMT every time GDB stops |
undisplay | Turn off display mode |
help | Get information about gdb |
3.bomb.c文件解析
#include <stdio.h>
#include <stdlib.h>
#include "support.h"
#include "phases.h"/* * Note to self: Remember to erase this file so my victims will have no* idea what is going on, and so they will all blow up in a* spectaculary fiendish explosion. -- Dr. Evil */FILE *infile;int main(int argc, char *argv[])
{char *input;/* Note to self: remember to port this bomb to Windows and put a * fantastic GUI on it. *//* When run with no arguments, the bomb reads its input lines * from standard input. */if (argc == 1) { infile = stdin;} /* When run with one argument <file>, the bomb reads from <file> * until EOF, and then switches to standard input. Thus, as you * defuse each phase, you can add its defusing string to <file> and* avoid having to retype it. */else if (argc == 2) {if (!(infile = fopen(argv[1], "r"))) {printf("%s: Error: Couldn't open %s\n", argv[0], argv[1]);exit(8);}}/* You can't call the bomb with more than 1 command line argument. */else {printf("Usage: %s [<input_file>]\n", argv[0]);exit(8);}/* Do all sorts of secret stuff that makes the bomb harder to defuse. */initialize_bomb();printf("Welcome to my fiendish little bomb. You have 6 phases with\n");printf("which to blow yourself up. Have a nice day!\n");/* Hmm... Six phases must be more secure than one phase! */input = read_line(); /* Get input */phase_1(input); /* Run the phase */phase_defused(); /* Drat! They figured it out!* Let me know how they did it. */printf("Phase 1 defused. How about the next one?\n");/* The second phase is harder. No one will ever figure out* how to defuse this... */input = read_line();phase_2(input);phase_defused();printf("That's number 2. Keep going!\n");/* I guess this is too easy so far. Some more complex code will* confuse people. */input = read_line();phase_3(input);phase_defused();printf("Halfway there!\n");/* Oh yeah? Well, how good is your math? Try on this saucy problem! */input = read_line();phase_4(input);phase_defused();printf("So you got that one. Try this one.\n");/* Round and 'round in memory we go, where we stop, the bomb blows! */input = read_line();phase_5(input);phase_defused();printf("Good work! On to the next...\n");/* This phase will never be used, since no one will get past the* earlier ones. But just in case, make this one extra hard. */input = read_line();phase_6(input);phase_defused();/* Wow, they got it! But isn't something... missing? Perhaps* something they overlooked? Mua ha ha ha ha! */return 0;
}
拆除炸弹阶段
首先我们使用gdb命令进入调试模式:
Phase_1
首先我们通过disas phase_1命令反汇编phase_1函数得到如下结果:
下面我们对重要部分进行解析:
-
首先,因为有
call 0x401338 <strings_not_equal>
要调用string_not_equal
函数,这一块很容易猜到就是看字符串是否相同,相同的话寄存器%rax返回0,否则返回1,而通过后面的逻辑我们知道只有这两个字符串相同时才能跳过炸弹。 -
那么传入strings_not_equal函数的参数是什么呢,一般来说是指向两个字符串的指针,参数1寄存器%rdi就是我们的输入字符串,参数二寄存器就是%rsi,
mov $0x402400,%esi
说明我们将地址$0x402400存入%esi中,那么地址$0x402400处的字符串就是我们索要输入的字符串。
使用x/s 0x402400
命令查看此处的字符串:
因此phase_1最后的答案就是:Border relations with Canada have never been better.
Phase_2
同样我们通过disas phase_2命令反汇编phase_2函数得到如下结果:
可以看到在+9位置调用了read_six_numbers
函数,反汇编这个函数可以得到:
前两行保存一下被调用者保存寄存器,然后将栈指针%rsp减小$0x28,相当于在栈里面预留出来了28个字节的空间,现在先画出栈帧,用十六进制数简单表示相对于%rsp的偏移字节。
然后将栈指针%rsp移入%esi,调用函数read_six_numbers
。
read_six_numbers
首先明确函数参数,第一个参数通过寄存器%rdi传入,也就是我们的输入字符串,第二个参数存在%rsi中,通过前面描述可知就是上面所画phase_2函数的栈指针
简单分析该函数逻辑把mov $0x4025c3,%esi
之前的转化为c风格混合汇编代码:
int *rdx = rsi; //%rdx存放图中%rsp处地址int *rcx = rsi + 4; //其实也就是%rcx存放图中0x4处地址int *rax = rsi + 0x14; //其实也就是%rax存放图中0x14处地址(%rsp + 8) = rax; //当前栈指针%rsp所指地址 + 8处存放图中0x14处地址rax = rsi + 0x10;//%rax存放图中0x10处地址(%rsp) = rax;//当前栈指针%rsp所指地址处存放图中0x10处地址int *r9 = rsi + 0xc;int *r8 = rsi + 0x8;
上面做的事情其实很简单,就是把上面phase_2栈帧%rsp,%rsp+4,%rsp+8,…%rsp+0x14分别存在%rdx,%rcx, %r8, %r9(后四个参数传递寄存器)和’read_six_number’的栈帧顶上两个位置。
接下来使用<__isoc99_sscanf@plt>
函数读取数据,第一个参数%rdi存放我们输入字符串,第二个参数%rsi保存地址$0x4025c3,用命令看一下里面的内容如下:
再根据后面的比较返回值cmp $0x5,%eax
操作大致可以理解我们需要输入6个整型数据,分别将他们存在phase_2栈帧%rsp,%rsp+4,%rsp+8,…%rsp+0x14的位置。
分析完read_six_numbers
函数后现在我们返回主函数。
phase_2主函数
0x0000000000400f0a <+14>: cmpl $0x1,(%rsp)0x0000000000400f0e <+18>: je 0x400f30 <phase_2+52>0x0000000000400f10 <+20>: call 0x40143a <explode_bomb>
这一段说明第一个参数一定是1,否则会<explode_bomb>
,因此arg1 = 1
接下来跳转到+52处:
0x0000000000400f30 <+52>: lea 0x4(%rsp),%rbx0x0000000000400f35 <+57>: lea 0x18(%rsp),%rbp0x0000000000400f3a <+62>: jmp 0x400f17 <phase_2+27>
%rbx存放%rsp + 0x4的地址, %rbp存放%rsp + 0x18,猜测%rbp可能起一个终止指针的作用,然后跳到+27,应该是要开始循环了
0x0000000000400f17 <+27>: mov -0x4(%rbx),%eax0x0000000000400f1a <+30>: add %eax,%eax0x0000000000400f1c <+32>: cmp %eax,(%rbx)0x0000000000400f1e <+34>: je 0x400f25 <phase_2+41>0x0000000000400f20 <+36>: call 0x40143a <explode_bomb>0x0000000000400f25 <+41>: add $0x4,%rbx0x0000000000400f29 <+45>: cmp %rbp,%rbx0x0000000000400f2c <+48>: jne 0x400f17 <phase_2+27>0x0000000000400f2e <+50>: jmp 0x400f3c <phase_2+64>
- 最开始%rbx存放%rsp + 0x4的地址,首先将%rbx - 0x4(也就是%rsp)指向的内容放入%eax,通过上面的分析知道也就是arg1 = 1,%eax存放1
- 接下来
add %eax,%eax
,%eax存放2 - 比较%rsp + 0x4指向的内容和%eax指向的内容(也就是arg2与2*arg1),只有二者相等才能跳过炸弹,于是arg2 = 2
- 接下来
add $0x4,%rbx
并进行终止条件的比较cmp %rbp,%rbx
开启循环,循环到最后:
arg1 = 1
arg2 = 2 * arg1 = 2
arg3 = 2 * arg2 = 4
arg4 = 2 * arg3 = 8
arg5 = 2 * arg4 = 16
arg6 = 2 * arg5 = 32
因此phase_2的最终答案为:1 2 4 8 16 32
Phase_3
同样我们通过disas phase_3命令反汇编phase_3函数得到如下结果:
还是先大概画出栈帧:
下面来一步步分析,第一部分和phase_2里read_six_numbers函数思路极为相像:
0x0000000000400f43 <+0>: sub $0x18,%rsp0x0000000000400f47 <+4>: lea 0xc(%rsp),%rcx0x0000000000400f4c <+9>: lea 0x8(%rsp),%rdx0x0000000000400f51 <+14>: mov $0x4025cf,%esi0x0000000000400f56 <+19>: mov $0x0,%eax0x0000000000400f5b <+24>: call 0x400bf0 <__isoc99_sscanf@plt>0x0000000000400f60 <+29>: cmp $0x1,%eax0x0000000000400f63 <+32>: jg 0x400f6a <phase_3+39>0x0000000000400f65 <+34>: call 0x40143a <explode_bomb>
%esi所存地址的内容如下:
因此我们知道上面一块的作用实际上是指出我们的输入是两个整数,并且分别存在栈帧的标号0x8和0xc的位置。
接下来继续分析:
0x0000000000400f6a <+39>: cmpl $0x7,0x8(%rsp)0x0000000000400f6f <+44>: ja 0x400fad <phase_3+106>(+106处汇编指令call 0x40143a<explode_bomb>)0x0000000000400f71 <+46>: mov 0x8(%rsp),%eax0x0000000000400f75 <+50>: jmp *0x402470(,%rax,8)
上面代码中+39位置的比较告诉我们0 <= arg1 <= 7(这是因为ja是无符号数的跳转指令),否则炸弹就要爆炸然后把arg1存入%eax中,并且跳转到地址0x402470 + 8 * %rax所存的的内容所表示的地址中,因为arg1的取值范围是0-7,因此我们的跳转地址有可能保存在0x402470-0x0x4024a8地址空间中,通过x/8gx
指令查看可以得到如下结果:
接着看下面的代码:
0x0000000000400f7c <+57>: mov $0xcf,%eax0x0000000000400f81 <+62>: jmp 0x400fbe <phase_3+123>0x0000000000400f83 <+64>: mov $0x2c3,%eax0x0000000000400f88 <+69>: jmp 0x400fbe <phase_3+123>0x0000000000400f8a <+71>: mov $0x100,%eax0x0000000000400f8f <+76>: jmp 0x400fbe <phase_3+123>0x0000000000400f91 <+78>: mov $0x185,%eax0x0000000000400f96 <+83>: jmp 0x400fbe <phase_3+123>0x0000000000400f98 <+85>: mov $0xce,%eax0x0000000000400f9d <+90>: jmp 0x400fbe <phase_3+123>0x0000000000400f9f <+92>: mov $0x2aa,%eax0x0000000000400fa4 <+97>: jmp 0x400fbe <phase_3+123>0x0000000000400fa6 <+99>: mov $0x147,%eax0x0000000000400fab <+104>: jmp 0x400fbe <phase_3+123>0x0000000000400fad <+106>: call 0x40143a <explode_bomb>0x0000000000400fb2 <+111>: mov $0x0,%eax0x0000000000400fb7 <+116>: jmp 0x400fbe <phase_3+123>0x0000000000400fb9 <+118>: mov $0x137,%eax0x0000000000400fbe <+123>: cmp 0xc(%rsp),%eax0x0000000000400fc2 <+127>: je 0x400fc9 <phase_3+134>0x0000000000400fc4 <+129>: call 0x40143a <explode_bomb>
- 当arg1 = 0时,跳转到0x0000000000400f7c,%eax = 0xcf = 207
- 当arg1 = 1时,跳转到0x0000000000400fb9,%eax = 0x137 = 311
- 当arg1 = 2时,跳转到0x0000000000400f83,%eax = 0x2e3 = 707
- 当arg1 = 3时,跳转到0x0000000000400f8a,%eax = 0x100 = 256
- 当arg1 = 4时,跳转到0x0000000000400f91,%eax = 0x185 = 389
- 当arg1 = 5时,跳转到0x0000000000400f98,%eax = 0x137 = 311
- 当arg1 = 6时,跳转到0x0000000000400f9f,%eax = 0xce = 206
- 当arg1 = 7时,跳转到0x0000000000400fa6,%eax = 0x147 = 327
最后进行条件判断,arg2 = %eax时才能正常返回,所以本题有八种可能的情形,也就是有八个正确答案,任填一个即可。
Phase_4
同样我们通过disas phase_4命令反汇编phase_4函数得到如下结果:
+32之前的内容和phase_3完全一致,其实也就是我们的输入是两个整数,将两个整数分别存放在phase_4栈帧%rsp + 0x8和%rsp + 0xc指向的位置上。
0x000000000040102e <+34>: cmpl $0xe,0x8(%rsp)0x0000000000401033 <+39>: jbe 0x40103a <phase_4+46>0x0000000000401035 <+41>: call 0x40143a <explode_bomb>
这一段逻辑也很简单,其实就是告诉我们arg1 <= 0xe = 14,否则炸弹就会爆炸,接下来跳到+46继续看:
0x000000000040103a <+46>: mov $0xe,%edx0x000000000040103f <+51>: mov $0x0,%esi0x0000000000401044 <+56>: mov 0x8(%rsp),%edi0x0000000000401048 <+60>: call 0x400fce <func4>0x000000000040104d <+65>: test %eax,%eax0x000000000040104f <+67>: jne 0x401058 <phase_4+76>0x0000000000401051 <+69>: cmpl $0x0,0xc(%rsp)0x0000000000401056 <+74>: je 0x40105d <phase_4+81>0x0000000000401058 <+76>: call 0x40143a <explode_bomb>0x000000000040105d <+81>: add $0x18,%rsp0x0000000000401061 <+85>: ret
接下来主要是调用func4函数
,第一个参数通过%rdi传递,其实也就是我们的输入的第一个参数arg1,第二个参数通过%rsi传递为0,第三个参数通过%rdx传递,为0xe = 14,再往后看一段就能发现我们要求返回值为0并且arg2也为0,接下来就是求出使func4函数
返回值为0的arg1,反汇编func4函数
如下:
将其转化为类似c风格的代码如下:
//esi = 0, edx = 14 edi:arg1(小于等于14)
//要求返回值eax = 0;
int fun4(int edi, int esi, int edx) {int eax = edx - esi; //eax = 14 - 0 = 14int ecx = eax >> 0x1f; // ecx = rsult右移31位 = 0eax += ecx; //eax = 14 + 0 = 14eax >>= 1; //sal没有用cl寄存器或者立即数指定移位数那么移位数默认为1,因此eax = eax/2 = 7ecx = eax + esi; //ecx = 7 + 0 = 7if (edi >= ecx) //if(edi >= 7)goto L1; edx = ecx - 1; //edx = 7 - 1 = 6eax = fun4(edi, esi, edx); //eax = fun4(edi, 0, 6)eax = eax * 2;goto done;
L1:eax = 0;if (edi <= ecx) //if(edi <= 7)goto done;else {esi = ecx + 1;//esi = 7 + 1 = 8eax = fun4(edi, esi, edx);//eax = fun4(edi, 8, 14);eax = 2 * eax + 1;}
done:return eax;
}
通过这版代码我们只能看出来arg1 = 7的时候是满足题意的,这里我们再往深入去分析一下其他的可能取值,再增加抽象化改写一下上面的c风格代码:
int func4(int x, int a, int b) {int diff = b - a;diff = (diff + (diff >> 31)) / 2;int mid = a + diff;if (x < mid) return 2 * func4(x, a, mid - 1);else if (x > mid)return 2 * func4(x, a, mid - 1) + 1;else return 0;
}
虽然其中有一些细节不解其意(对!说的就是右移31位的操作),但是好在已经能够比较清晰地看出来这是一个二分法的函数,当一直在区间左半边时返回0。
- 当 x = arg1 = 7是,第一次调用就返回0,这个也是我们一眼就看出来的答案
- 否则x < 7返回值才有可能为0,此时调用func4(arg1, 0, 6),x = 3时直接返回0
- 否则x < 3返回值才有可能为0,此时调用func4(arg1, 0, 2),x = 1时直接返回0
- 否则x < 1返回值才有可能为0,此时调用func4(arg1, 0, 0),x = 0时返回0
综合上面说的,我们的最后答案应该有4种可能:
- 0 0
- 1 0
- 3 0
- 7 0
Phase_5
同样我们通过disas phase_5命令反汇编phase_5函数得到如下结果:
为了方便理解我们还是根据汇编代码先画一下栈帧:
然后继续分析接下来这一段:
0x0000000000401078 <+22>: xor %eax,%eax //清零返回值寄存器%rax0x000000000040107a <+24>: call 0x40131b <string_length>0x000000000040107f <+29>: cmp $0x6,%eax0x0000000000401082 <+32>: je 0x4010d2 <phase_5+112>0x0000000000401084 <+34>: call 0x40143a <explode_bomb>
将输入字符串作为参数调用<string_length>
函数,如果返回值不等于6就会爆炸,因此可以判断我们的输入为一个长度为6的字符串,接下来跳转到+112处开始执行。
0x0000000000401067 <+5>: mov %rdi,%rbx //这一段也贴上方便看%rbx的值0x00000000004010d2 <+112>: mov $0x0,%eax0x00000000004010d7 <+117>: jmp 0x40108b <phase_5+41>0x000000000040108b <+41>: movzbl (%rbx,%rax,1),%ecx0x000000000040108f <+45>: mov %cl,(%rsp)0x0000000000401092 <+48>: mov (%rsp),%rdx0x0000000000401096 <+52>: and $0xf,%edx0x0000000000401099 <+55>: movzbl 0x4024b0(%rdx),%edx0x00000000004010a0 <+62>: mov %dl,0x10(%rsp,%rax,1)0x00000000004010a4 <+66>: add $0x1,%rax0x00000000004010a8 <+70>: cmp $0x6,%rax0x00000000004010ac <+74>: jne 0x40108b <phase_5+41>0x00000000004010ae <+76>: movb $0x0,0x16(%rsp)
上面出现了关于操作地址0x4024b0附近的内容的代码,我们先贴出来以防后面要用到:
假设我们的输入字符串为str,那么上面的改写成c风格代码为:
for (int i = 0; i < 6; ++i) {int edx = (int)str[i] & 0xf; //相当于%rdx的低四字保存着str[i]的低四字,str[i]高四字其实是没有起什么作用的,那么答案肯定很多可能(%rsp + 0x10 + i) = (char)(0x4024b0 + edx);
}
写的有点丑陋了,说人话就是先将str[i]的低四字取出来,然后在栈%rsp + 0x10 + i指向的位置存入地址(0x4024b0 + str[i]的低四字处)处的字符,直到进行六次。然后将0移入%rsp + 0x16处(太挤了0就不写了),栈帧如下:
接下来继续看:
0x00000000004010b3 <+81>: mov $0x40245e,%esi0x00000000004010b8 <+86>: lea 0x10(%rsp),%rdi0x00000000004010bd <+91>: call 0x401338 <strings_not_equal>0x00000000004010c2 <+96>: test %eax,%eax0x00000000004010c4 <+98>: je 0x4010d9 <phase_5+119>0x00000000004010c6 <+100>: call 0x40143a <explode_bomb>
这一段用到了phase_1出现的函数,通过这一段逻辑我们就能知道栈上的6个char和地址0x40245e的字符相同。
那么再回头看
我们就能知道六个字符取了偏移量为9,15,14,5,6,7处的字符,于是输入的str的六个字符的低四字分别为:1001,1111,1110,0101,0110,0111
为了在前面补上0110或者0100根据ASCII值表就能得到两个比较正常的答案(当然答案有很多很多种,我只是取了比较好写的两种)
- ionefg
- IONEFG
接下来就是跳转到结尾判断一下金丝雀值是否被破坏整个程序就结束啦!(▽)
Phase_6
同样我们通过disas phase_6命令反汇编phase_6函数,得到的结果超级超级长,我省略掉保存与弹出被调用者保存寄存器的步骤贴出来如下:
0x00000000004010fc <+8>: sub $0x50,%rsp0x0000000000401100 <+12>: mov %rsp,%r130x0000000000401103 <+15>: mov %rsp,%rsi0x0000000000401106 <+18>: call 0x40145c <read_six_numbers>0x000000000040110b <+23>: mov %rsp,%r140x000000000040110e <+26>: mov $0x0,%r12d0x0000000000401114 <+32>: mov %r13,%rbp0x0000000000401117 <+35>: mov 0x0(%r13),%eax0x000000000040111b <+39>: sub $0x1,%eax0x000000000040111e <+42>: cmp $0x5,%eax0x0000000000401121 <+45>: jbe 0x401128 <phase_6+52>0x0000000000401123 <+47>: call 0x40143a <explode_bomb>0x0000000000401128 <+52>: add $0x1,%r12d0x000000000040112c <+56>: cmp $0x6,%r12d0x0000000000401130 <+60>: je 0x401153 <phase_6+95>0x0000000000401132 <+62>: mov %r12d,%ebx0x0000000000401135 <+65>: movslq %ebx,%rax0x0000000000401138 <+68>: mov (%rsp,%rax,4),%eax0x000000000040113b <+71>: cmp %eax,0x0(%rbp)0x000000000040113e <+74>: jne 0x401145 <phase_6+81>0x0000000000401140 <+76>: call 0x40143a <explode_bomb>0x0000000000401145 <+81>: add $0x1,%ebx0x0000000000401148 <+84>: cmp $0x5,%ebx0x000000000040114b <+87>: jle 0x401135 <phase_6+65>0x000000000040114d <+89>: add $0x4,%r130x0000000000401151 <+93>: jmp 0x401114 <phase_6+32>0x0000000000401153 <+95>: lea 0x18(%rsp),%rsi0x0000000000401158 <+100>: mov %r14,%rax0x000000000040115b <+103>: mov $0x7,%ecx0x0000000000401160 <+108>: mov %ecx,%edx0x0000000000401162 <+110>: sub (%rax),%edx0x0000000000401164 <+112>: mov %edx,(%rax)0x0000000000401166 <+114>: add $0x4,%rax0x000000000040116a <+118>: cmp %rsi,%rax0x000000000040116d <+121>: jne 0x401160 <phase_6+108>0x000000000040116f <+123>: mov $0x0,%esi0x0000000000401174 <+128>: jmp 0x401197 <phase_6+163>0x0000000000401176 <+130>: mov 0x8(%rdx),%rdx0x000000000040117a <+134>: add $0x1,%eax0x000000000040117d <+137>: cmp %ecx,%eax0x000000000040117f <+139>: jne 0x401176 <phase_6+130>0x0000000000401181 <+141>: jmp 0x401188 <phase_6+148>0x0000000000401183 <+143>: mov $0x6032d0,%edx0x0000000000401188 <+148>: mov %rdx,0x20(%rsp,%rsi,2)0x000000000040118d <+153>: add $0x4,%rsi0x0000000000401191 <+157>: cmp $0x18,%rsi0x0000000000401195 <+161>: je 0x4011ab <phase_6+183>0x0000000000401197 <+163>: mov (%rsp,%rsi,1),%ecx0x000000000040119a <+166>: cmp $0x1,%ecx0x000000000040119d <+169>: jle 0x401183 <phase_6+143>0x000000000040119f <+171>: mov $0x1,%eax0x00000000004011a4 <+176>: mov $0x6032d0,%edx0x00000000004011a9 <+181>: jmp 0x401176 <phase_6+130>0x00000000004011ab <+183>: mov 0x20(%rsp),%rbx0x00000000004011b0 <+188>: lea 0x28(%rsp),%rax0x00000000004011b5 <+193>: lea 0x50(%rsp),%rsi0x00000000004011ba <+198>: mov %rbx,%rcx0x00000000004011bd <+201>: mov (%rax),%rdx0x00000000004011c0 <+204>: mov %rdx,0x8(%rcx)0x00000000004011c4 <+208>: add $0x8,%rax0x00000000004011c8 <+212>: cmp %rsi,%rax0x00000000004011cb <+215>: je 0x4011d2 <phase_6+222>0x00000000004011cd <+217>: mov %rdx,%rcx0x00000000004011d0 <+220>: jmp 0x4011bd <phase_6+201>0x00000000004011d2 <+222>: movq $0x0,0x8(%rdx)0x00000000004011da <+230>: mov $0x5,%ebp0x00000000004011df <+235>: mov 0x8(%rbx),%rax0x00000000004011e3 <+239>: mov (%rax),%eax0x00000000004011e5 <+241>: cmp %eax,(%rbx)0x00000000004011e7 <+243>: jge 0x4011ee <phase_6+250>0x00000000004011e9 <+245>: call 0x40143a <explode_bomb>0x00000000004011ee <+250>: mov 0x8(%rbx),%rbx0x00000000004011f2 <+254>: sub $0x1,%ebp0x00000000004011f5 <+257>: jne 0x4011df <phase_6+235>0x00000000004011f7 <+259>: add $0x50,%rsp
确实很长,但是不要被吓怕,所有的东西都是可以被一步一步拆解滴(╹▽╹)
最开始非常温和地使用了一个我们见过的函数read_six_numbers
,回顾一下它的作用:把上面phase_4栈帧%rsp,%rsp+4,%rsp+8,…%rsp+0x14分别存在%rdx,%rcx, %r8, %r9(后四个参数传递寄存器)和’read_six_number’的栈帧顶上两个位置。
接下来使用<__isoc99_sscanf@plt>
函数读取数据,第一个参数%rdi存放我们输入字符串,第二个参数%rsi保存地址$0x4025c3,用命令看一下里面的内容如下:
再根据后面的比较返回值cmp $0x5,%eax
操作大致可以理解我们需要输入6个整型数据,分别将他们存在phase_4栈帧%rsp,%rsp+4,%rsp+8,…%rsp+0x14的位置。
0x0000000000401100 <+12>: mov %rsp,%r130x000000000040110e <+26>: mov $0x0,%r12d0x0000000000401114 <+32>: mov %r13,%rbp0x0000000000401117 <+35>: mov 0x0(%r13),%eax0x000000000040111b <+39>: sub $0x1,%eax0x000000000040111e <+42>: cmp $0x5,%eax0x0000000000401121 <+45>: jbe 0x401128 <phase_6+52>0x0000000000401123 <+47>: call 0x40143a <explode_bomb>0x0000000000401128 <+52>: add $0x1,%r12d0x000000000040112c <+56>: cmp $0x6,%r12d0x0000000000401130 <+60>: je 0x401153 <phase_6+95>0x0000000000401132 <+62>: mov %r12d,%ebx0x0000000000401135 <+65>: movslq %ebx,%rax0x0000000000401138 <+68>: mov (%rsp,%rax,4),%eax0x000000000040113b <+71>: cmp %eax,0x0(%rbp)0x000000000040113e <+74>: jne 0x401145 <phase_6+81>0x0000000000401140 <+76>: call 0x40143a <explode_bomb>0x0000000000401145 <+81>: add $0x1,%ebx0x0000000000401148 <+84>: cmp $0x5,%ebx0x000000000040114b <+87>: jle 0x401135 <phase_6+65>0x000000000040114d <+89>: add $0x4,%r130x0000000000401151 <+93>: jmp 0x401114 <phase_6+32>
先来看第一大段,仔细分析一下,将这段代码转化为c风格代码如下,假设我们输入的6个整数组成了一个数组为input[6]。
int index = 0
while (index != 6) {if (input[index] - 1 <= 5) goto L1;explode_bomb();
L1:index++;for (int i = index; i < 6; ++i) {if (input[index - 1] == input[i]) explode_bomb();}
}
- 第一段的if迫使(arg1 - 1)-(arg6 - 1)都小于等于5,又因为使用的是jbe所以针对无符号整数,因此arg1-arg6都是1-6之间的整数。
- 第二段的if是我们输入的这六个整数互不相同,因此我们输入的arg1-arg6是整数1-6的一个排列
0x000000000040110b <+23>: mov %rsp,%r140x0000000000401153 <+95>: lea 0x18(%rsp),%rsi0x0000000000401158 <+100>: mov %r14,%rax0x000000000040115b <+103>: mov $0x7,%ecx0x0000000000401160 <+108>: mov %ecx,%edx0x0000000000401162 <+110>: sub (%rax),%edx0x0000000000401164 <+112>: mov %edx,(%rax)0x0000000000401166 <+114>: add $0x4,%rax0x000000000040116a <+118>: cmp %rsi,%rax0x000000000040116d <+121>: jne 0x401160 <phase_6+108>
逻辑非常简单,其实也就是将7-arg作用于arg上再存入栈中,设现在栈上的元素设为newarg[1]-newarg[6]。接下来就到了重头戏,也是我个人认为这段代码最难理解的部分,下面调整一下汇编代码顺序让它符合跳转逻辑:
0x000000000040116f <+123>: mov $0x0,%esi0x0000000000401174 <+128>: jmp 0x401197 <phase_6+163>0x0000000000401197 <+163>: mov (%rsp,%rsi,1),%ecx0x000000000040119a <+166>: cmp $0x1,%ecx0x000000000040119d <+169>: jle 0x401183 <phase_6+143>for newarg[i] == 1:%rsi存放4i %ecx存放newarg[i]
---0x0000000000401183 <+143>: mov $0x6032d0,%edx0x0000000000401188 <+148>: mov %rdx,0x20(%rsp,%rsi,2) //将地址$0x6032d0存入栈上%rsp + 0x20 + (i - 1) * 8的位置0x000000000040118d <+153>: add $0x4,%rsi0x0000000000401191 <+157>: cmp $0x18,%rsi0x0000000000401195 <+161>: je 0x4011ab <phase_6+183>
---for newarg[i] > 1: %rsi存放4i %ecx存放newarg[i]
---0x000000000040119f <+171>: mov $0x1,%eax0x00000000004011a4 <+176>: mov $0x6032d0,%edx0x00000000004011a9 <+181>: jmp 0x401176 <phase_6+130>0x0000000000401176 <+130>: mov 0x8(%rdx),%rdx0x000000000040117a <+134>: add $0x1,%eax0x000000000040117d <+137>: cmp %ecx,%eax0x000000000040117f <+139>: jne 0x401176 <phase_6+130>0x0000000000401181 <+141>: jmp 0x401188 <phase_6+148>0x0000000000401188 <+148>: mov %rdx,0x20(%rsp,%rsi,2) //将地址$0x6032d0 + 8 * (newarg[i] - 1)存入栈上%rsp + 0x20 + (i - 1) * 8的位置0x000000000040118d <+153>: add $0x4,%rsi0x0000000000401191 <+157>: cmp $0x18,%rsi0x0000000000401195 <+161>: je 0x4011ab <phase_6+183>
---
先来看一下地址0x6032d0里面到底存了什么东西:
仔细观察可以看出来这是一个单链表。那么这段代码实现了一个什么功能呢?
简单来说就是注释里写的这一段所实现的功能将地址$0x6032d0 + 8 * (newarg[i] - 1)存入栈上%rsp + 0x20 + i * 8的位置。
0x00000000004011ab <+183>: mov 0x20(%rsp),%rbx//%rbx是栈上存的内容但是也是地址0x00000000004011b0 <+188>: lea 0x28(%rsp),%rax//%rax是栈上的地址0x00000000004011b5 <+193>: lea 0x50(%rsp),%rsi//终止指针0x00000000004011ba <+198>: mov %rbx,%rcx0x00000000004011bd <+201>: mov (%rax),%rdx 0x00000000004011c0 <+204>: mov %rdx,0x8(%rcx) 0x00000000004011c4 <+208>: add $0x8,%rax0x00000000004011c8 <+212>: cmp %rsi,%rax0x00000000004011cb <+215>: je 0x4011d2 <phase_6+222>0x00000000004011cd <+217>: mov %rdx,%rcx0x00000000004011d0 <+220>: jmp 0x4011bd <phase_6+201>0x00000000004011d2 <+222>: movq $0x0,0x8(%rdx)
这一块整理成C风格代码如下:
for (int i = 1; i < 6; ++i) {node[newarg[i]].next = &node[newarg[i + 1]];
}
node[6].next = nullptr;
终于要到最后一块了!
0x00000000004011ab <+183>: mov 0x20(%rsp),%rbx0x00000000004011da <+230>: mov $0x5,%ebp0x00000000004011df <+235>: mov 0x8(%rbx),%rax0x00000000004011e3 <+239>: mov (%rax),%eax0x00000000004011e5 <+241>: cmp %eax,(%rbx)0x00000000004011e7 <+243>: jge 0x4011ee <phase_6+250>0x00000000004011e9 <+245>: call 0x40143a <explode_bomb>0x00000000004011ee <+250>: mov 0x8(%rbx),%rbx0x00000000004011f2 <+254>: sub $0x1,%ebp0x00000000004011f5 <+257>: jne 0x4011df <phase_6+235>
由这一块逻辑可以得到在上面画的栈上由下到上所表示的node的值降序排列,通过node.val如下进行排序(注意仅对低四字节进行排序):
node3.val > node4.val > node5.val > node6.val > node1.val > node2.val
那么在栈上的对应关系如图所示:
那么newarg1-newarg6分别为 3 4 5 6 1 2
arg1-arg6分别为4 3 2 1 6 5,这就是phase_6最终的答案。
嘿嘿完结撒花!✿✿ヽ(°▽°)ノ✿
但是真的只是这样吗?回顾一下bomb.c文件,在破解phase_6之后:
input = read_line();phase_6(input);phase_defused();/* Wow, they got it! But isn't something... missing? Perhaps* something they overlooked? Mua ha ha ha ha! */
所以遗漏了什么呢?这个phase_defused
函数好像从来没有反汇编去了解过,那么到此我们才算进入最后一个阶段。
Secret_phase
反汇编得到如下代码:
当我们查看0x402619的内容时发现:
这说明我们的输入是两个整数+一个字符串,并且分析可知分别存在了%rsp +0x8,%rsp + 0xc,%rsp + 0x10的位置,字符串的信息通过分析显然很清晰长度为6,通过查看0x402622的内容:
现在字符串已经很清晰了,就剩下两个整数,可当我们去查看0x603870的内容会发现他为空,猜测这个是因为这个地址在未运行时是没有放入东西的,放入的东西是在前几个phase运行产生的,那么我们就在call 0x400bf0 <__isoc99_sscanf@plt>
前面设置一个断点让其运行到此然后再查看该内存地址处的内容。
可以神奇地发现这个就是我们phase_4求出来的答案,因此可以知道我们要进入secret_phase的话就要在phase_4的答案后面加上一个字符串DrEvil
,然后我们就可以在最后处理secret_phase了!phased_defused剩下的没什么重要的,我们直接进入secret_phase的反汇编代码:
这个函数的主干其实是fun7
函数,我们简要介绍一下总体的逻辑:
- 从stdin或文件读入一行字符然后将其转化为long类型并且限制器范围大于0小于等于0x3e9 = 1001
- 然后调用fun7函数,第一个参数为地址0x6030f0,第二个参数为第一步得到的long类型整数
- 当fun7函数返回值为2时我们正常结束函数,否则炸弹爆炸
fun7函数如下:
重点其实就是我们传进去的那个地址,查看一下里面的内容:
观察一下可以发现这其实是一个二叉树,画图如下:
那么将fun7的逻辑转成c风格代码:
struct node {long val;node *left;node *right;
};int fun7(node *head, long val) {if (head->val == val) {return 0;} else if (head->val < val) {return 2 * fun7(head->right, val) + 1;} else {return 2 * fun7(head->left, val);}
}
结合二叉树和代码可知只有22可以使得返回值为2,所以secret_phase最后的答案就是22。
最后总体来运行一下,ans.txt的答案如下(多个答案的选了一部分写):
最后结果:
Summary
最后来一点碎碎念,终于终于写完了,这个Lab从上周五就开始写,本来把第三章又过了一遍自信满满地以为一天就可以搞定,结果因为中间写的很暴躁已经各种汇报啊朋友来参观啊之类的事情,拖到今天才写完+整理完,总体来说前几个phase和secret_phase基本上是水到渠成,但是phase_6卡了好久好久,中间node和参数的对应真的搞得人头大,但好在最后也是搞完了,看到汇编代码不再头疼了而且也学会了一些gdb调试的技巧,我可真棒呀,继续加油ヾ(◍°∇°◍)ノ゙