C: Segmentation fault: GDB: <error reading variable>

Updated: 2023-11-21 14:25:22

I have a function shortestPath() that is a modified implementation of Dijkstra's algorithm for a board game AI I am working on for my comp2 class. I have trawled through this site, and with gdb and valgrind I know exactly where the segfault happens (I actually knew that a few hours ago), but I can't figure out what undefined behaviour or logic error is causing it.

The function in which the problem occurs is called around 10 times and works as expected until it segfaults, with GDB reporting "error reading variable: cannot access memory" and valgrind reporting "Invalid read of size 8".

Normally that would be enough, but I can't work this one out. Any general advice and tips are also appreciated... thanks!

GDB: https://gist.github.com/mckayryan/b8d1e9cdcc58dd1627ea
Valgrind: https://gist.github.com/mckayryan/8495963f6e62a51a734f

Here is the function in which the segfault occurs:

static void processBuffer (GameView currentView, Link pQ, int *pQLen, 
                           LocationID *buffer, int bufferLen, Link prev,
                           LocationID cur)
{
    //printLinkIndex("prev", prev, NUM_MAP_LOCATIONS);
    // adds newly retrieved buffer Locations to queue adding link types 
    appendLocationsToQueue(currentView, pQ, pQLen, buffer, bufferLen, cur);
    // calculates distance of new locations and updates prev when needed
    updatePrev(currentView, pQ, pQLen, prev, cur);  // <--- this line here

    qsort((void *) pQ, *pQLen, sizeof(link), (compfn)cmpDist);
    // qsort sanity check
    int i, qsortErr = 0;
    for (i = 0; i < *pQLen-1; i++) 
        if (pQ[i].dist > pQ[i+1].dist) qsortErr = 1;
    if (qsortErr) {
        fprintf(stderr, "loadToPQ: qsort did not sort successfully");
        abort();
    }  
}

and this is the function after whose call everything falls apart:

static void appendLocationsToQueue (GameView currentView, Link pQ, 
                                   int *pQLen, LocationID *buffer, 
                                   int bufferLen, LocationID cur)
{
    int i, c, conns;
    TransportID type[MAX_TRANSPORT] = { NONE };     

    for (i = 0; i < bufferLen; i++) { 
        // get connection information (up to 3 possible)  
        conns = connections(currentView->gameMap, cur, buffer[i], type);
        for (c = 0; c < conns; c++) {
            pQ[*pQLen].loc = buffer[i];
            pQ[(*pQLen)++].type = type[c];            
        }            
    }
}

I thought that a pointer had been overwritten with the wrong address, but after a lot of printing in GDB that doesn't seem to be the case. I also tried reading and writing each of the variables in question in turn to see which ones trigger the fault: they all do after appendLocationsToQueue(), but not before it (or at the end of that function, for that matter).

Here is the rest of the relevant code: shortestPath():

Link shortestPath (GameView currentView, LocationID from, LocationID to, PlayerID player, int road, int rail, int boat)
{
    if (!RAIL_MOVE) rail = 0;

    // index of locations that have been visited    
    int visited[NUM_MAP_LOCATIONS] = { 0 };

    // current shortest distance from the source
    // the previous node for current known shortest path
    Link prev;
    if(!(prev = malloc(NUM_MAP_LOCATIONS*sizeof(link))))
        fprintf(stderr, "GameView.c: shortestPath: malloc failure (prev)");

    int i;
    // initialise link data structure
    for (i = 0; i < NUM_MAP_LOCATIONS; i++) {
        prev[i].loc = NOWHERE;
        prev[i].type = NONE;
        if (i != from) prev[i].dist = INF; 
        else prev[i].dist = LAST; 
    }
    LocationID *buffer, cur;
    // a priority queue that dictates the order LocationID's are checked
    Link pQ;
    int bufferLen, pQLen = 0;
    if (!(pQ = malloc(MAX_QUEUE*sizeof(link))))
        fprintf(stderr, "GameView.c: shortestPath: malloc failure (pQ)");
    // load initial location into queue
    pQ[pQLen++].loc = from;

    while (!visited[to]) {
        // remove first item from queue into cur  
        shift(pQ, &pQLen, &cur);
        if (visited[cur]) continue;
        // freeing malloc from connectedLocations()
        if (cur != from) free(buffer); 
        // find all locations connected to   
        buffer = connectedLocations(currentView, &bufferLen, cur, 
                                    player, currentView->roundNum, road, 
                                    rail, boat); 
        // mark current node as visited
        visited[cur] = VISITED;
        // locations from buffer are used to update priority queue (pQ) 
        // and distance information in prev       
        processBuffer(currentView, pQ, &pQLen, buffer, bufferLen, prev,
                      cur);
    }
    free(buffer);
    free(pQ);
    return prev;
}

The fact that all your parameters look good before this line:

appendLocationsToQueue(currentView, pQ, pQLen, buffer, bufferLen, cur);

and become unavailable after it tells me that you've stepped on the $rbp register (0x7fff00000000 was written to it). When building without optimization, all local variables and parameters are addressed relative to $rbp.

You can confirm this in GDB with print $rbp before and after the call to appendLocationsToQueue ($rbp is supposed to keep the same value throughout a given function, but here it will have changed).

Assuming this is true, there are only a few ways this could happen, and the most likely way is a stack buffer overflow in appendLocationsToQueue (or something it calls).
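
As a rough illustration of where such an overflow could come from, here is a minimal sketch of a bounds-checked append. The question does not show connections(), the link struct, or the values of MAX_TRANSPORT and MAX_QUEUE, so QueueItem, append_checked and both constants below are hypothetical stand-ins rather than the asker's actual code:

```c
#define MAX_TRANSPORT 3    /* assumed: "up to 3 possible" connection types */
#define MAX_QUEUE     8    /* assumed small capacity, for illustration only */

/* Hypothetical stand-in for the question's link struct. */
typedef struct {
    int loc;
    int type;
    int dist;
} QueueItem;

/* Appends one queue entry per connection type, refusing to read or write
 * past either array. Returns the number of entries appended, or -1 if a
 * bound would have been exceeded. */
int append_checked(QueueItem *pQ, int *pQLen, int loc,
                   const int *type, int conns)
{
    if (conns < 0 || conns > MAX_TRANSPORT)
        return -1;          /* more entries reported than type[] can hold */
    if (*pQLen < 0 || *pQLen > MAX_QUEUE - conns)
        return -1;          /* pQ[] has no room for conns more entries */
    for (int c = 0; c < conns; c++) {
        pQ[*pQLen].loc = loc;
        pQ[*pQLen].type = type[c];
        pQ[*pQLen].dist = 0;
        (*pQLen)++;
    }
    return conns;
}
```

If a check like the first one ever fails in the real code, the callee has most likely already written past the local type[] array, which is exactly the kind of stack write that can clobber the saved $rbp.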

You should be able to use Address Sanitizer (gcc -fsanitize=address ...) to find this bug fairly easily.

It's also fairly easy to find the overflow in GDB: step into appendLocationsToQueue, run watch -l *(char**)$rbp, and then continue. The watchpoint should fire when your code overwrites the $rbp save location.
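
To get a feel for what Address Sanitizer reports, a small stand-alone reproduction can help. The sketch below is not the asker's code: fill_types is a hypothetical stand-in for a callee such as connections() that writes into a caller-supplied array without knowing its size, and MAX_TRANSPORT is assumed to be 3.

```c
#define MAX_TRANSPORT 3    /* assumed value, not taken from the question */

/* Hypothetical stand-in for connections(): writes n entries into out
 * without knowing how large out really is. */
static int fill_types(int *out, int n)
{
    for (int i = 0; i < n; i++)
        out[i] = i + 1;
    return n;
}

/* Returns the sum of the entries that fit in type[]. The call is safe
 * only while n <= MAX_TRANSPORT; a larger n makes fill_types write past
 * the end of the local array on the stack. */
int demo(int n)
{
    int type[MAX_TRANSPORT] = { 0 };
    int conns = fill_types(type, n);
    int sum = 0;
    for (int c = 0; c < conns && c < MAX_TRANSPORT; c++)
        sum += type[c];
    return sum;
}
```

Calling demo(3) stays within bounds. Calling it with a larger argument from a program built with gcc -g -fsanitize=address should make the sanitizer stop at the offending write inside fill_types and report a stack-buffer-overflow, instead of the program failing later with unreadable variables.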