# Boost Fibonacci Heap Access Violation during pop()

0

## Context

I'm currently implementing some form of A* algorithm. I decided to use boost's fibonacci heap as underlying priority queue.

My Graph is being built while the algorithm runs. As Vertex object I'm using:

``````class Vertex {
public:
Vertex(double, double);
double distance = std::numeric_limits<double>::max();
double heuristic = 0;
HeapData* fib;
Vertex* predecessor = nullptr;

double euclideanDistanceTo(Vertex* v);
}
``````

My Edge looks like:

``````class Edge {
public:
Edge(Vertex*, double);
Vertex* vertex = nullptr;
double weight = 1;
}
``````

In order to use boosts fibonacci heap, I've read that one should create a heap data object, which I did like that:

``````struct HeapData {
Vertex* v;
boost::heap::fibonacci_heap<HeapData>::handle_type handle;

HeapData(Vertex* u) {
v = u;
}

bool operator<(HeapData const& rhs) const {
return rhs.v->distance + rhs.v->heuristic < v->distance + v->heuristic;
}
};
``````

Note, that I included the heuristic and the actual distance in the comparator to get the A* behaviour, I want.

My actual A* implementation looks like that:

``````    boost::heap::fibonacci_heap<HeapData> heap;

HeapData fibs(startPoint);
startPoint->distance = 0;
startPoint->heuristic = getHeuristic(startPoint);
auto handles = heap.push(fibs);
(*handles).handle = handles;

while (!heap.empty()) {
HeapData u = heap.top();
heap.pop();

if (u.v->equals(endPoint)) {
return;
}

doSomeGraphCreationStuff(u.v); // this only creates vertices and edges

for (Edge* e : u.v->adj) {
double newDistance = e->weight + u.v->distance;

if (e->vertex->distance > newDistance) {

e->vertex->distance = newDistance;
e->vertex->predecessor = u.v;

if (!e->vertex->fib) {
if (!u.v->equals(endPoint)) {
e->vertex->heuristic = getHeuristic(e->vertex);
}
e->vertex->fib = new HeapData(e->vertex);
e->vertex->fib->handle = heap.push(*(e->vertex->fib));
}
else {
heap.increase(e->vertex->fib->handle);
}
}
}
}
``````

## Problem

The algorithm runs just fine, if I use a very small heuristic (which degenerates A* to Dijkstra). If I introduce some stronger heuristic, however, the program throws an exepction stating: `0xC0000005: Access violation writing location 0x0000000000000000.` in the unlink method of boosts circular_list_algorithm.hpp. For some reason, `next` and `prev` are null. This is a direct consequence of calling `heap.pop()`. Note that `heap.pop()` works fine for several times and does not crash immediately.

## Question

What causes this problem and how can I fix it?

## What I have tried

• My first thought was that I accidentally called increase() even though distance + heuristic got bigger instead of smaller (according to the documentation, this can break stuff). This is not possible in my implementation, however, because I can only change a node if the distance got smaller. The heurisitic stays constant. I tried to use update() instead of increase() anyway, without success
• I tried to set several break points to get a more detailed view, but my data set is huge and I fail to reproduce it with smaller sets.

• Boost Version: 1.76.0
• C++14
• the increase function is indeed right (instead of a decrease function) since all boost heaps are implemented as max-heaps. We get a min-heap by reversing the comparator and using increase/decrease reversed
c++
boost

2

Okay, prepare for a ride.

1. First I found a bug
2. Next, I fully reviewed, refactored and simplified the code
3. When the dust settled, I noticed a behaviour change that looked like a potential logic error in the code

## 1. The Bug

Like I commented at the question, the code complexity is high due to over-reliance on raw pointers without clear semantics.

While I was reviewing and refactoring the code, I found that this has, indeed, lead to a bug:

``````e->vertex->fib = new HeapData(e->vertex);
e->vertex->fib->handle = heap.push(*(e->vertex->fib));
``````
• In the first line you create a HeapData object. You make the `fib` member point to that object.
• The second line inserts a copy of that object (meaning, it's a new object, with a different object identity, or practically speaking: a different address).

So, now

• `e->vertex->fib` points to a (leaked) HeapData object that does not exist in the queue, and
• the actual queued `HeapData` copy has a default-constructed `handle` member, which means that the handle wraps a null pointer. (Check `boost::heap::detail::node_handle<>` in `detail/stable_heap.hpp` to verify this).

This would handsomely explain the symptom you are seeing.

## 2. Refactor, Simplify

So, after understanding the code I have come to the conclusion that

• `HeapData` and `Vertex` should to be merged: HeapData only served to link a handle to a Vertex, but you can already make the Vertex contain a Handle directly.

As a consequence of this merge

• your vertex queue now actually contains vertices, expressing intent of the code

• you reduce all of the vertex access by one level of indirection (reducing Law Of Demeter violations)

• you can write the push operation in one natural line, removing the room for your bug to crop up. Before:

`````` target->fib = new HeapData(target);
target->fib->handle = heap.push(*(target->fib));
``````

After:

`````` target->fibhandle = heap.push(target);
``````
• Your `Edge` class doesn't actually model an edge, but rather an "adjacency" - the target part of the edge, with the weight attribute.

I renamed it `OutEdge` for clarity and also changed the vector to contain values instead of dynamically allocated `OutEdge` instances.

I can't tell from the code shown, but I can almost guarantee these were being leaked.

Also, `OutEdge` is only 16 bytes on most platforms, so copying them will be fine, and adjacencies are by definition owned by their source vertex (because including/moving it to another source vertex would change the meaning of the adjacency).

In fact, if you're serious about performance, you may want to use a `boost::container::small_vector` with a suitably chosen capacity if you know that e.g. the median number of edges is "small"

• Your comparison can be "outsourced" to a function object

`````` using Node = Vertex*;
struct PrioCompare {
bool operator()(Node a, Node b) const;
};
``````

After which the heap can be typed as:

`````` namespace bh = boost::heap;
using Heap   = bh::fibonacci_heap<Node, bh::compare<PrioCompare>>;
using Handle = Heap::handle_type;
``````
• Your cost function violated more Law-Of-Demeter, which was easily fixed by adding a Literate-Code accessor:

`````` Cost cost() const { return distance + heuristic; }
``````
• From quick inspection I think it would be more accurate to use `infinite()` over `max()` as initial distance. Also, use a constant for readability:

`````` static constexpr auto INF = std::numeric_limits<Cost>::infinity();
Cost distance = INF;
``````
• You had a repeated check for `xyz->equals(endPoint)` to avoid updating the heuristic for a vertex. I suggest moving the update till after vertex dequeue, so the repetition can be gone (of both the check and the `getHeuristic(...)` call).

• Like you said, we need to tread carefully around the `increase`/`update` fixup methods. As I read your code, the priority of a node is inversely related to the "cost" (cumulative edge-weight and heuristic values).

Because Boost Heap heaps are max heaps this implies that increasing the priority should match decreasing cost. We can just assert this to detect any programmer error in debug builds:

`````` assert(target->cost() < previous_cost);
heap.increase(target->fibhandle);
``````

With these changes in place, the code can read a lot quieter:

``````Cost AStarSearch(Node start, Node destination) {
Heap heap;

start->distance  = 0;
start->fibhandle = heap.push(start);

while (!heap.empty()) {
Node u = heap.top();
heap.pop();

if (u->equals(destination)) {
return u->cost();
}
u->heuristic = getHeuristic(start);

doSomeGraphCreationStuff(u);

for (auto& [target, weight] : u->adj) {
auto curDistance = weight + u->distance;

// if cheaper route, queue or update queued
if (curDistance < target->distance) {
auto cost_prior     = target->cost();
target->distance    = curDistance;
target->predecessor = u;

if (target->fibhandle == NOHANDLE) {
target->fibhandle = heap.push(target);
} else {
assert(target->cost() < cost_prior);
heap.update(target->fibhandle);
}
}
}
}

return INF;
}
``````

## 2(b) Live Demo

Live On Coliru

``````#include <boost/heap/fibonacci_heap.hpp>
#include <iostream>

using Cost = double;
struct Vertex;

Cost getHeuristic(Vertex const*) { return 0; }
void doSomeGraphCreationStuff(Vertex const*) {
// this only creates vertices and edges
}

struct OutEdge { // adjacency from implied source vertex
Vertex* target = nullptr;
Cost    weight = 1;
};

namespace bh = boost::heap;
using Node   = Vertex*;
struct PrioCompare {
bool operator()(Node a, Node b) const;
};
using Heap   = bh::fibonacci_heap<Node, bh::compare<PrioCompare>>;
using Handle = Heap::handle_type;
static const Handle   NOHANDLE{}; // for expressive comparisons
static constexpr auto INF = std::numeric_limits<Cost>::infinity();

struct Vertex {
Vertex(Cost d = INF, Cost h = 0) : distance(d), heuristic(h) {}

Cost    distance  = INF;
Cost    heuristic = 0;
Handle  fibhandle{};
Vertex* predecessor = nullptr;

Cost cost() const { return distance + heuristic; }
Cost euclideanDistanceTo(Vertex* v);
bool equals(Vertex const* u) const { return this == u; }
};

// Now Vertex is a complete type, implement comparison
bool PrioCompare::operator()(Node a, Node b) const {
return a->cost() > b->cost();
}

Cost AStarSearch(Node start, Node destination) {
Heap heap;

start->distance  = 0;
start->fibhandle = heap.push(start);

while (!heap.empty()) {
Node u = heap.top();
heap.pop();

if (u->equals(destination)) {
return u->cost();
}
u->heuristic = getHeuristic(start);

doSomeGraphCreationStuff(u);

for (auto& [target, weight] : u->adj) {
auto curDistance = weight + u->distance;

// if cheaper route, queue or update queued
if (curDistance < target->distance) {
auto cost_prior     = target->cost();
target->distance    = curDistance;
target->predecessor = u;

if (target->fibhandle == NOHANDLE) {
target->fibhandle = heap.push(target);
} else {
assert(target->cost() < cost_prior);
heap.update(target->fibhandle);
}
}
}
}

return INF;
}

int main() {
// a very very simple graph data structure with minimal helpers:
std::vector<Vertex> graph(10);
auto node = [&graph](int id)             { return &graph.at(id);       };
auto id   = [&graph](Vertex const* node) { return node - graph.data(); };

// defining 6 edges
graph.adj = {{node(2), 1.5}, {node(3), 15}};
graph.adj = {{node(4), 2.5}, {node(1), 5}};

// do a search
Node startPoint = node(0);
Node endPoint   = node(7);

Cost cost = AStarSearch(startPoint, endPoint);

std::cout << "Overall cost: " << cost << ", reverse path: \n";
for (Node node = endPoint; node != nullptr; node = node->predecessor) {
std::cout << " - " << id(node) << " distance " << node->distance
<< "\n";
}
}
``````

Prints

``````Overall cost: 7, reverse path:
- 7 distance 7
- 1 distance 6.5
- 2 distance 1.5
- 0 distance 0
``````

## 3. The Plot Twist: Lurking Logic Errors?

I felt uneasy about moving the `getHeuristic()` update around. I wondered whether I might have changed the meaning of the code, even though the control flow seemed to check out.

And then I realized that indeed the behaviour changed. It is subtle. At first I thought the the old behaviour was just problematic. So, let's analyze:

The source of the risk is an inconsistency in node visitation vs. queue prioritization.

• When visiting nodes, the condition to see whether the target became "less distant" is expressed in terms of distance only.
• However, the queue priority will be based on cost, which is different from distance in that it includes any heuristics.

The problem lurking there is that it is possible to write code that where the fact that distance decreases, NEED NOT guarantee that cost decreases.

Going back to the code, we can see that this narrowly avoided, because the getHeuristic update is only executed in the non-update path of the code.

### In Conclusion

Understanding this made me realize that

• the `Vertex::heuristic` field is intended merely as a "cached" version of the `getHeuristic()` function call
• implying that that function is treated as if it is idempotent
• that my version did change behaviour in that `getHeuristic` was now potentially executed more than once for the same vertex (if visited again via a cheaper path)

I would suggest to fix this by

• renaming the `heuristic` field to `cachedHeuristic`
• making an `enqueue` function to encapsulate the three steps for enqueuing a vertex:
• simply omitting the endpoint check because it can at MOST eliminate a single invocation of `getHeuristic` for that node, probably not worth the added complexity
• add a comment pointing out the subtlety of that code path
• UPDATE as discovered in the comments, we also need the inverse operatione (`dequeue`) to symmtrically update `handle` so it reflects that the node is no longer in the queue...

It also drives home the usefulness of having the precondition `assert` added before invoking `Heap::increase`.

## Final Listing

With the above changes

• encapsulated into a Graph object, that

• also reads the graph from input like:

`````` 0   2   1.5
0   3    15
2   4   2.5
2   1     5
1   7   0.5
7   3   0.5
``````

Where each line contains (source, target, weight).

• A separate file can contain heuristic values for vertices index `[0, ...)`, optionally newline-separated, e.g. "7 11 99 33 44 55"

• and now returning the arrived-at node instead of its cost only

Live On Coliru

``````#include <boost/heap/fibonacci_heap.hpp>
#include <iostream>
#include <deque>
#include <fstream>

using Cost = double;
struct Vertex;

struct OutEdge { // adjacency from implied source vertex
Vertex* target = nullptr;
Cost    weight = 1;
};

namespace bh = boost::heap;
using Node   = Vertex*;
struct PrioCompare {
bool operator()(Node a, Node b) const;
};
using MutableQueue   = bh::fibonacci_heap<Node, bh::compare<PrioCompare>>;
using Handle = MutableQueue::handle_type;
static const Handle   NOHANDLE{}; // for expressive comparisons
static constexpr auto INF = std::numeric_limits<Cost>::infinity();

struct Vertex {
Vertex(Cost d = INF, Cost h = 0) : distance(d), cachedHeuristic(h) {}

Cost    distance        = INF;
Cost    cachedHeuristic = 0;
Handle  handle{};
Vertex* predecessor = nullptr;

Cost cost() const { return distance + cachedHeuristic; }
Cost euclideanDistanceTo(Vertex* v);
};

// Now Vertex is a complete type, implement comparison
bool PrioCompare::operator()(Node a, Node b) const {
return a->cost() > b->cost();
}

class Graph {
std::vector<Cost> _heuristics;

Cost getHeuristic(Vertex* v) {
size_t n = id(v);
return n < _heuristics.size() ? _heuristics[n] : 0;
}
void doSomeGraphCreationStuff(Vertex const*) {
// this only creates vertices and edges
}

public:
Graph(std::string edgeFile, std::string heurFile) {
{
std::ifstream stream(heurFile);
_heuristics.assign(std::istream_iterator<Cost>(stream), {});
if (!stream.eof())
throw std::runtime_error("Unexpected heuristics");
}
std::ifstream stream(edgeFile);

size_t src, tgt;
double weight;
while (stream >> src >> tgt >> weight) {
_nodes.resize(std::max({_nodes.size(), src + 1, tgt + 1}));
}

if (!stream.eof())
throw std::runtime_error("Unexpected input");
}

Node search(size_t from, size_t to) {
assert(from < _nodes.size());
assert(to < _nodes.size());
return AStar(node(from), node(to));
}

size_t id(Node node) const {
// ugh, this is just for "pretty output"...
for (size_t i = 0; i < _nodes.size(); ++i) {
if (node == &_nodes[i])
return i;
}
throw std::out_of_range("id");
};
Node node(int id) { return &_nodes.at(id); };

private:
// simple graph data structure with minimal helpers:
std::deque<Vertex> _nodes; // reference stable when growing at the back

// search state
MutableQueue _queue;

void enqueue(Node n) {
assert(n && (n->handle == NOHANDLE));
// get heuristic before insertion!
n->cachedHeuristic = getHeuristic(n);
n->handle = _queue.push(n);
}

Node dequeue() {
Node node = _queue.top();
node->handle = NOHANDLE;
_queue.pop();
return node;
}

Node AStar(Node start, Node destination) {
_queue.clear();

start->distance = 0;
enqueue(start);

while (!_queue.empty()) {
Node u = dequeue();

if (u == destination) {
return u;
}

doSomeGraphCreationStuff(u);

for (auto& [target, weight] : u->adj) {
auto curDistance = u->distance + weight;

// if cheaper route, queue or update queued
if (curDistance < target->distance) {
auto cost_prior     = target->cost();
target->distance    = curDistance;
target->predecessor = u;

if (target->handle == NOHANDLE) {
// also caches heuristic
enqueue(target);
} else {
// NOTE: avoid updating heuristic here, because it
// breaks the queue invariant if heuristic increased
// more than decrease in distance
assert(target->cost() < cost_prior);
_queue.increase(target->handle);
}
}
}
}

return nullptr;
}
};

int main() {
Graph graph("input.txt", "heur.txt");

Node arrival = graph.search(0, 7);

std::cout << "reverse path: \n";
for (Node n = arrival; n != nullptr; n = n->predecessor) {
std::cout << " - " << graph.id(n) << " cost " << n->cost() << "\n";
}
}
``````

Again, printing the expected

``````reverse path:
- 7 cost 7
- 1 cost 17.5
- 2 cost 100.5
- 0 cost 7
``````

Note how the heuristics changed the cost, but not optimal path in this case.

User contributions licensed under CC BY-SA 3.0