Valgrind memcheck usage analysis

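I have been using valgrind a lot recently, so here are some notes on running it and reading its output.

Starting with the simplest example

The following program has an obvious bug: it allocates memory and never frees it.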

#include <stdlib.h>
int main()
{
    int *p = malloc(32768);  /* allocated but never freed */
    return 0;
}

Compile it and run it under valgrind; you should see output similar to the following.

$ gcc leak.c -g -o leak
$ valgrind --leak-check=full ./leak

The 3413 in the output below is the PID. The report tells us that 32,768 bytes were allocated before the program exited and that 0 bytes were freed.

==3413== HEAP SUMMARY:
==3413== in use at exit: 32,768 bytes in 1 blocks
==3413== total heap usage: 1 allocs, 0 frees, 32,768 bytes allocated
==3413==
==3413== 32,768 bytes in 1 blocks are definitely lost in loss record 1 of 1
==3413== at 0x4C2AB80: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==3413== by 0x40053E: main (leak.c:4)
==3413==
==3413== LEAK SUMMARY:
==3413== definitely lost: 32,768 bytes in 1 blocks
==3413== indirectly lost: 0 bytes in 0 blocks
==3413== possibly lost: 0 bytes in 0 blocks
==3413== still reachable: 0 bytes in 0 blocks
==3413== suppressed: 0 bytes in 0 blocks
==3413==
==3413== For counts of detected and suppressed errors, rerun with: -v
==3413== ERROR SUMMARY: 1 errors from 1 contexts (suppressed: 0 from 0)
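
For comparison, here is a minimal sketch of the same allocation with the missing free added. The helper name allocate_and_release is hypothetical and only serves the illustration; it assumes the buffer is no longer needed once the function returns.

```cpp
#include <cstdlib>
#include <cstring>

// Hypothetical helper (not part of the original program):
// allocates `size` bytes, touches them, and releases them again.
// Returns true if the allocation succeeded.
bool allocate_and_release(std::size_t size) {
    void *p = std::malloc(size);
    if (p == nullptr) {
        return false;          // nothing was allocated, nothing to free
    }
    std::memset(p, 0, size);   // stand-in for real work on the buffer
    std::free(p);              // the call that leak.c is missing
    return true;
}
```

Calling allocate_and_release(32768) from main and running the program under valgrind should give a HEAP SUMMARY with as many frees as allocs.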

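valgrind sorts leaks into several categories:

  • definitely lost: certainly a leak. No further analysis is needed; it has to be fixed.

  • possibly lost: may be a leak. Whether it really is one depends on language and programming idioms (for example, pointers into the middle of a block), so it needs careful analysis.
    Here is a possibly lost example that valgrind flags as suspicious.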

    #include <stdlib.h>
    #include <stdio.h>
    #include <string.h>

    int main(int argc, char** argv) {
        char* s = "string";
        // this will allocate a new array
        char* p = strdup(s);
        // move the pointer into the array
        // we know we can reset the pointer by subtracting
        // but for valgrind the array is now lost
        p += 1;
        // deliberately trigger a segfault to crash the program
        *s = 'S';
        // reset the pointer to the beginning of the array
        p -= 1;
        // properly free the memory for the array
        free(p);
        return 0;
    }

    Here is the output:

    ==4422== 7 bytes in 1 blocks are possibly lost in loss record 1 of 1
    ==4422== at 0x4C2AB80: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
    ==4422== by 0x4EC0679: strdup (strdup.c:42)
    ==4422== by 0x40059F: main (invalid.c:8)
    ==4422==
    ==4422== LEAK SUMMARY:
    ==4422== definitely lost: 0 bytes in 0 blocks
    ==4422== indirectly lost: 0 bytes in 0 blocks
    ==4422== possibly lost: 7 bytes in 1 blocks
    ==4422== still reachable: 0 bytes in 0 blocks
    ==4422== suppressed: 0 bytes in 0 blocks

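    By the time the segmentation fault happens, p has already been moved and points into the middle of the block. valgrind cannot tell whether this block of memory is still in use, so it can only describe it as possibly lost.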

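  • still reachable: memory that the program could still reach through a pointer when the process ended. Since all memory is reclaimed once the process ends, you have to decide case by case whether it needs special handling.
    Here is an example: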

    int *p = malloc(10);
    exit(0);

    Result

    ==4469== HEAP SUMMARY:
    ==4469== in use at exit: 10 bytes in 1 blocks
    ==4469== total heap usage: 1 allocs, 0 frees, 10 bytes allocated
    ==4469==
    ==4469== 10 bytes in 1 blocks are still reachable in loss record 1 of 1
    ==4469== at 0x4C2AB80: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
    ==4469== by 0x400595: main (invalid.c:6)
    ==4469==
    ==4469== LEAK SUMMARY:
    ==4469== definitely lost: 0 bytes in 0 blocks
    ==4469== indirectly lost: 0 bytes in 0 blocks
    ==4469== possibly lost: 0 bytes in 0 blocks
    ==4469== still reachable: 10 bytes in 1 blocks
    ==4469== suppressed: 0 bytes in 0 blocks
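
One way to stay out of the possibly lost and still reachable categories is to keep the pointer returned by the allocator untouched, walk the data with a second pointer, and free the block before leaving. The following is a rough sketch of that idea; the helper length_after_first is hypothetical, and it copies the string with malloc and memcpy instead of strdup purely to stay within standard C++.

```cpp
#include <cstdlib>
#include <cstring>

// Hypothetical helper: copies `text` to the heap, skips the first
// character with a separate cursor, and returns how many characters
// remain. The base pointer is never modified, so a pointer to the
// start of the block exists for as long as the block is alive.
std::size_t length_after_first(const char *text) {
    std::size_t len = std::strlen(text);
    char *base = static_cast<char *>(std::malloc(len + 1));
    if (base == nullptr) {
        return 0;
    }
    std::memcpy(base, text, len + 1);   // includes the terminating '\0'

    const char *cursor = base;          // move the cursor, not the base
    if (*cursor != '\0') {
        ++cursor;
    }
    std::size_t rest = std::strlen(cursor);

    std::free(base);                    // released before returning
    return rest;
}
```

Because base always points at the start of the block and is freed before the function returns, there should be nothing left for the leak checker to classify.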

Common errors

Mismatched malloc (new/new[]) and free (delete/delete[])

int *p = new int;
delete [] p;

Output

==3558== Mismatched free() / delete / delete []
==3558== at 0x4C2C83C: operator delete[](void*) (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==3558== by 0x400675: main (mismatch.cpp:5)
==3558== Address 0x5a1d040 is 0 bytes inside a block of size 4 alloc'd
==3558== at 0x4C2B0E0: operator new(unsigned long) (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==3558== by 0x40065E: main (mismatch.cpp:4)

Address 0x5a1d040 is the memory we allocated. Even though the allocation and the deallocation do not match, no memory leak results in this case.

==3558== HEAP SUMMARY:
==3558== in use at exit: 0 bytes in 0 blocks
==3558== total heap usage: 1 allocs, 1 frees, 4 bytes allocated
==3558==
==3558== All heap blocks were freed -- no leaks are possible

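This example is simply too small to go wrong, though. Mixing the allocation and deallocation forms is undefined behavior in C++, and not every program that does it gets away with it. I will discuss why it passes here when I have time.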
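
As a reminder of the matching pairs, here is a small sketch: new goes with delete, new[] goes with delete[]. The function name sum_with_matched_delete is made up for the example.

```cpp
// Hypothetical helper: allocates one int and one int array, adds the
// values up, and releases each allocation with its matching form.
int sum_with_matched_delete(int n) {
    int *single = new int(n);            // scalar new
    int *many = new int[3]{1, 2, 3};     // array new

    int total = *single + many[0] + many[1] + many[2];

    delete single;                       // scalar delete for scalar new
    delete[] many;                       // array delete for array new
    return total;
}
```

Run under valgrind, code written this way should not produce the "Mismatched free() / delete / delete []" report.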

Double free

See the double free example under Invalid free below.

Invalid Read / Write

int *p = new int[10];
p[10] = 100;
int v = p[11];

Result

==3687== Invalid write of size 4
==3687== at 0x4006D1: main (invalid.cpp:7)
==3687== Address 0x5a1d068 is 0 bytes after a block of size 40 alloc'd
==3687== at 0x4C2B800: operator new[](unsigned long) (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==3687== by 0x4006AE: main (invalid.cpp:5)
==3687==
==3687== Invalid read of size 4
==3687== at 0x4006DB: main (invalid.cpp:8)
==3687== Address 0x5a1d06c is 4 bytes after a block of size 40 alloc'd
==3687== at 0x4C2B800: operator new[](unsigned long) (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==3687== by 0x4006AE: main (invalid.cpp:5)

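The block we were given starts at 0x5a1d040, and valgrind tells us that the read at 0x5a1d06c is out of bounds. 0x5a1d06c - 0x5a1d040 = 0x2c = 44 in decimal; with 4-byte ints that is array index 11, which matches the read of p[11].
The invalid write can be explained the same way: 0x5a1d068 - 0x5a1d040 = 0x28 = 40 in decimal, which is index 10, the write to p[10]. Also note these two lines:

Address 0x5a1d068 is 0 bytes after a block of size 40 alloc'd
Address 0x5a1d06c is 4 bytes after a block of size 40 alloc'd

They say that the write lands exactly at the end of the 40-byte allocation and that the read is 4 bytes past the end, which agrees with the analysis above.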
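
The address arithmetic can be written down as a tiny helper. This is only a worked example of the calculation; element_index is a made-up name, and the addresses are the ones from the log above.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical helper: given the start address of a block, a faulting
// address reported by valgrind, and the element size in bytes,
// return the array index that was accessed.
std::size_t element_index(std::uintptr_t block_start,
                          std::uintptr_t address,
                          std::size_t element_size) {
    return static_cast<std::size_t>(address - block_start) / element_size;
}
```

With a 4-byte int, element_index(0x5a1d040, 0x5a1d068, 4) evaluates to 10 and element_index(0x5a1d040, 0x5a1d06c, 4) to 11, both outside the valid range 0 to 9 of a 10-element array.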

Invalid free

The well-known double free example

int *p = (int *)malloc(100);
free(p);
free(p);

After the previous example, this log should be easier to read:

==3956== Invalid free() / delete / delete[] / realloc()
==3956== at 0x4C2BDEC: free (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==3956== by 0x400600: main (invalid.cpp:9)
==3956== Address 0x51fd040 is 0 bytes inside a block of size 100 free'd
==3956== at 0x4C2BDEC: free (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==3956== by 0x4005F4: main (invalid.cpp:8)
==3956== HEAP SUMMARY:
==3956== in use at exit: 0 bytes in 0 blocks
==3956== total heap usage: 1 allocs, 2 frees, 100 bytes allocated

Look closely: HEAP SUMMARY tells us that we allocated only once but freed twice.
And this line

Address 0x51fd040 is 0 bytes inside a block of size 100 free'd

tells us that 0x51fd040 is the start of a 100-byte block, but that block has already been freed.
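
A common defensive habit, sketched below, is to clear the pointer right after freeing it. Since free on a null pointer does nothing, a second release then becomes harmless. The helper names release and double_release_is_safe are hypothetical, and the approach only helps when every free goes through the same pointer variable.

```cpp
#include <cstdlib>

// Hypothetical helper: frees the block that *slot points to and then
// clears *slot so that a later call sees a null pointer.
void release(int **slot) {
    std::free(*slot);    // free(nullptr) is defined to do nothing
    *slot = nullptr;
}

// Allocates 100 bytes as in the example above and releases them twice.
// Returns true if the pointer ended up cleared.
bool double_release_is_safe() {
    int *p = static_cast<int *>(std::malloc(100));
    if (p == nullptr) {
        return false;
    }
    release(&p);         // real free
    release(&p);         // no-op, p is already nullptr
    return p == nullptr;
}
```

It does not protect against a second copy of the pointer being freed somewhere else; valgrind would still report that case.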

Syscall param uninitialised

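Strictly speaking this is not a bug, but it is a security concern: whatever happened to be in the uninitialised buffer is handed to the system call.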

char buf[100];
write(2, buf, 100);

The result:

==3803== Syscall param write(buf) points to uninitialised byte(s)
==3803== at 0x4F23700: __write_nocancel (syscall-template.S:81)
==3803== by 0x4005C9: main (uninit.cpp:7)
==3803== Address 0xffefffdf0 is on thread 1's stack
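
A minimal sketch of the usual remedy is to give the buffer a defined value before passing it to write. The function name write_zeroed_buffer is hypothetical, and filling with zeros is just one possible choice of initial content.

```cpp
#include <cstring>
#include <unistd.h>

// Hypothetical helper: fills a 100-byte stack buffer with zeros and
// writes it to the given file descriptor. Returns the number of bytes
// written, or -1 on error (the return value of write).
long write_zeroed_buffer(int fd) {
    char buf[100];
    std::memset(buf, 0, sizeof buf);   // every byte now has a defined value
    return static_cast<long>(write(fd, buf, sizeof buf));
}
```

With the buffer initialised, the "points to uninitialised byte(s)" report for this call should go away.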

Conditional jump or move depends on uninitialised value(s)

This is very similar to the previous case, except that it happens in user space.

int v;  // never initialised
if (v) {
    printf("OK");
} else {
    printf("Bye");
}

Result

==3898== Conditional jump or move depends on uninitialised value(s)
==3898== at 0x400539: main (uninit.cpp:7)
==3898==
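
The fix is simply to give the variable a value before it is tested. A small sketch, with a made-up function describe that returns the text instead of printing it:

```cpp
#include <string>

// Hypothetical helper: `v` is initialised at its declaration, so the
// branch below never depends on an indeterminate value.
std::string describe(bool set_flag) {
    int v = 0;            // defined starting value
    if (set_flag) {
        v = 1;
    }
    if (v) {
        return "OK";
    }
    return "Bye";
}
```

Here describe(true) yields "OK" and describe(false) yields "Bye", and the condition always reads a defined value.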

Source and destination overlap

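When src and dst overlap in a call to memcpy, strcpy or a similar function, the result may be wrong. In the example below the two arrays are separate objects, but the 50-byte length is larger than either of them, so the copied ranges overlap (and the copy overruns both buffers as well).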

char src[10];
char dst[20];
memcpy(dst, src, 50);  // 50 bytes is larger than either buffer

Result

==4108== Source and destination overlap in memcpy(0xffefffe30, 0xffefffe20, 50)
==4108== at 0x4C2F71C: memcpy@@GLIBC_2.14 (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==4108== by 0x400658: main (invalid.cpp:11)
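
When the ranges really do need to overlap, memmove is the function defined for that case. The sketch below shifts the contents of a buffer two bytes to the right inside the same array; shift_right_by_two is a hypothetical name, and the sizes are chosen so that the copy stays inside the buffer.

```cpp
#include <cstring>
#include <string>

// Hypothetical helper: moves the 8 characters "abcdefgh" two positions
// to the right within one 16-byte buffer. Source and destination
// overlap, which memmove handles and memcpy does not.
std::string shift_right_by_two() {
    char buf[16] = "abcdefgh";       // remaining bytes are zero
    std::memmove(buf + 2, buf, 8);   // writes buf[2..9], within bounds
    buf[10] = '\0';                  // keep the result terminated
    return std::string(buf);
}
```

The first two characters stay in place and the moved copy follows them, so the function returns "ababcdefgh".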

Using gdb and Valgrind together

Sometimes you want to drop into a debugger at the moment valgrind reports an error:

$ valgrind --db-attach=yes program argument(s)

When an error occurs, you are given the following choice:

==4222== ---- Attach to debugger ? --- [Return/N/n/Y/y/C/c] ----

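Press y to enter gdb and see what happened. Note that --db-attach was removed in valgrind 3.11; on newer releases the built-in gdbserver is used instead: start valgrind with --vgdb-error=0 and connect from gdb with "target remote | vgdb".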