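I have been using valgrind a lot lately, so here are some notes on how I use it for analysis.
Starting with the simplest example
Below is a program with an obvious bug.
#include <stdlib.h>
Compile it and check it with valgrind; the result should look something like this.
$ gcc leak.c -g -o leak
The 3413 below is the PID. The summary tells us that 32768 bytes of memory had been allocated by the time the program exited, and that 0 bytes were freed.
==3413== HEAP SUMMARY: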
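valgrind sorts leaks into several categories:
definitely lost: certainly a leak. No need to think twice; it must be fixed.
possibly lost: may be a leak. This depends on how the language and the program use pointers, so it needs careful analysis.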
Here is a possibly lost example, which valgrind considers potentially problematic.
#include <stdlib.h>
#include <stdio.h>
#include <string.h>
int main(int argc, char** argv) {
    char* s = "string";
    // this will allocate a new array
    char* p = strdup(s);
    // move the pointer into the array
    // we know we can reset the pointer by subtracting
    // but for valgrind the array is now lost
    p += 1;
    // deliberately trigger a segfault to crash the program
    *s = 'S';
    // reset the pointer to the beginning of the array
    p -= 1;
    // properly free the memory for the array
    free(p);
    return 0;
}
Here is the output:
==4422== 7 bytes in 1 blocks are possibly lost in loss record 1 of 1
==4422== at 0x4C2AB80: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==4422== by 0x4EC0679: strdup (strdup.c:42)
==4422== by 0x40059F: main (invalid.c:8)
==4422==
==4422== LEAK SUMMARY:
==4422== definitely lost: 0 bytes in 0 blocks
==4422== indirectly lost: 0 bytes in 0 blocks
==4422== possibly lost: 7 bytes in 1 blocks
==4422== still reachable: 0 bytes in 0 blocks
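==4422== suppressed: 0 bytes in 0 blocks
By the time the segmentation fault happens, p has already been moved and points one byte into the block rather than at its start. valgrind cannot be sure about the state of this block of memory, so it can only describe it as possibly lost.
still reachable: memory that can still be reached right before the process ends. Since all memory is reclaimed once the process exits, you have to analyze whether this case needs special handling.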
Here is an example:
int *p = malloc(10);
exit(0);
Result:
==4469== HEAP SUMMARY:
==4469== in use at exit: 10 bytes in 1 blocks
==4469== total heap usage: 1 allocs, 0 frees, 10 bytes allocated
==4469==
==4469== 10 bytes in 1 blocks are still reachable in loss record 1 of 1
==4469== at 0x4C2AB80: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==4469== by 0x400595: main (invalid.c:6)
==4469==
==4469== LEAK SUMMARY:
==4469== definitely lost: 0 bytes in 0 blocks
==4469== indirectly lost: 0 bytes in 0 blocks
==4469== possibly lost: 0 bytes in 0 blocks
==4469== still reachable: 10 bytes in 1 blocks
==4469== suppressed: 0 bytes in 0 blocks
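Common errors
Mismatched malloc (new/new[]) and free (delete/delete[])
int *p = new int;
Output
==3558== Mismatched free() / delete / delete []
Address 0x5a1d04 is the address of the memory we allocated. Even though the allocation and deallocation calls do not match, no memory leak results here.
==3558== HEAP SUMMARY:
This example is too simple, though. Mixing the two families is undefined behavior in C++, so not every program that does this gets away with it. When I have time I will come back to why it works here.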
Invalid Read / Write
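int *p = new int[10];
Result
==3687== Invalid write of size 4
The block we were given starts at 0x5a1d040, and valgrind tells us that the read at 0x5a1d068 is out of bounds. 0x5a1d068 - 0x5a1d040 == 0x28 (hex) == 40 (decimal); with a 4-byte int, that is exactly index 10 of the int array.
The invalid write can be explained the same way. We can also see:
Address 0x5a1d068 is 0 bytes after a block of size 40 alloc'd
Address 0x5a1d06c is 4 bytes after a block of size 40 alloc'd
The first line says the read touches memory right at the end of the 40-byte allocation, and the second says the write lands 4 bytes past it (offset 0x2c == 44, i.e. index 11), which matches the analysis above.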
Invalid free
The well-known double free example
int *p = (int *)malloc(100);
With the previous example in mind, the log is much less baffling.
==3956== Invalid free() / delete / delete[] / realloc()
Note that the HEAP SUMMARY tells us we allocated only once but freed twice.
And this line
Address 0x51fd040 is 0 bytes inside a block of size 100 free'd
tells us that 0x51fd040 is the very start of a 100-byte block, but that block has already been freed.
Syscall param uninitialised
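Strictly speaking this is not a bug, but it does raise a security concern, because uninitialised bytes are handed to the kernel and may expose stale data.
char buf[100];
The result is as follows
==3803== Syscall param write(buf) points to uninitialised byte(s)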
Conditional jump or move depends on uninitialised value(s)
This is much like the previous case, except that it happens in user space.
int v;
==3898== Conditional jump or move depends on uninitialised value(s)
Source and destination overlap
When functions such as memcpy or strcpy are called with overlapping src and dst, the result may be wrong.
char src[10];
==4108== Source and destination overlap in memcpy(0xffefffe30, 0xffefffe20, 50)
Using gdb and Valgrind together
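Sometimes you want to drop into a debugger at the moment an error is reported:
$ valgrind --db-attach=yes program argument(s)
When an error occurs, you are offered the following choices
==4222== ---- Attach to debugger ? --- [Return/N/n/Y/y/C/c] ----
Press y to enter gdb and see what happened. Newer Valgrind releases have removed --db-attach; there, start valgrind with --vgdb=yes --vgdb-error=0 and connect from gdb with target remote | vgdb.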