Priority queue implementation in ASH

heeheehee

Developer
Staff member
In writing one of my scripts, I found that I wanted a priority queue. A naive ASH implementation using sort resulted in n log n cost to retrieve an element, which is hideously bad (with 1 million elements, that's about 20 million comparisons, as opposed to the desired 20 or so). After a bit of whining to myself, I sucked it up and implemented a priority heap in ASH. I'm posting this here in the hope that someone else might find it useful.

I don't *think* it's buggy (I ran some number of tests on it), although I could be mistaken.
PHP:
record Heap {
    string [int] heap;
    float [string] cost;
};

void add(Heap h, string value, float cost) {
    h.cost[""] = 2.0 ** 1023 * 2; // infinity; can substitute some other large number here
    int index = count(h.heap);
    h.heap[index] = value;
    h.cost[value] = cost;
    while (cost < h.cost[h.heap[index / 2]]) {
        h.heap[index] = h.heap[index / 2];
        index /= 2;
        h.heap[index] = value;
    }
}

string poll(Heap h) {
    string ret = h.heap[0];
    int index;
    while (index < count(h.heap)) {
        h.heap[index] = "";
        if (h.heap[2*index+1] != "" && h.cost[h.heap[2*index + 1]] < h.cost[h.heap[2*index + 2]]) {
            h.heap[index] = h.heap[2*index+1];
            index = 2*index + 1;
        } else {
            h.heap[index] = h.heap[2*index+2];
            index = 2*index + 2;
        }
    }
    return ret;
}
 

xKiv

Active member
At a first sight, it looks like poll() will potentially leave the heap in an unbalanced state - looks like you are putting an empty element in the root and then bubbling it down ... but "" has 'infinite' cost, so it should bubble right to the last element?
But then h.heap will still contain indexes for all empty nodes, so it will only grow.

https://en.wikipedia.org/wiki/Binary_heap says that removing the root should start by moving the last element (which has index count(h.heap)-1) to root, and then bubbling it down (using exchanges). This moving of the last element also means removing h.heap[count(h.heap)-1]. Or having an explicit attribute (h.head ?) to remember what your last index is.
 

heeheehee

Developer
Staff member
Sure. I didn't really feel like figuring out removal of empty elements, and I don't think self-balancing heaps are a thing (I may be wrong, though). The issue with moving the last element to the root is, well, you can't necessarily know where the last index is without a lot of bookkeeping (especially if you poll multiple times in succession).

In this implementation, removal of empty elements probably should happen when bubbling a new element up -- if its parent is an empty node, then bubble it up and delete the parent.
 

chown

Member
You may be misunderstanding. The "last element of the heap" h is h.heap[count(h.heap)-1]. You should be able to do something like:
Code:
string ret = h.heap[0];
remove h.cost[ret];
h.heap[0] = h.heap[count(h.heap)-1];
remove h.heap[count(h.heap)-1];
...and then re-heap-ify, starting at index 0. You'll check against the new count(h.heap), instead of looking for "" entries.
 

chown

Member
Also, I'm a little skeptical that your implementation is correct. In particular, "index / 2" in the add function doesn't seem like it could be right. Here's what I came up with:

Code:
Heap add(Heap h, string value, float cost) {
    int index = count(h.heap);
    h.heap[index] = value;
    if (h.cost contains value) abort("Attempt to add value " + value + " to heap twice!");
    h.cost[value] = cost;
    while (cost < h.cost[h.heap[(index-1) / 2]]) {
        h.heap[index] = h.heap[(index-1) / 2];
        index = (index-1) / 2;
        h.heap[index] = value;
    }
    check(h);
    return h;
}

boolean empty(Heap h) {
    return 0 == count(h.heap);
}

string poll(Heap h) {
    string ret = h.heap[0];
    h.heap[0] = h.heap[count(h.heap)-1];
    remove h.cost[ret];
    remove h.heap[count(h.heap)-1];
    int newSize = count(h.heap);
    int index = 0;
    while (true) {
        int i_l = 2*index+1;
        int i_r = i_l+1;
        int lowest = index;
        if (i_l < newSize && h.cost[h.heap[i_l]] < h.cost[h.heap[lowest]]) {
            lowest = i_l;
        }
        if (i_r < newSize && h.cost[h.heap[i_r]] < h.cost[h.heap[lowest]]) {
            lowest = i_r;
        }
        if (index == lowest) {
            check(h);
            return ret;
        }
        string temp = h.heap[lowest];
        h.heap[lowest] = h.heap[index];
        h.heap[index] = temp;

        index = lowest;
    }
}
(In the above code, "check" is a function that checks the heap property, and aborts if it is not satisfied. In a correct implementation, it can be a no-op.)
 
Last edited:

heeheehee

Developer
Staff member
Oh, okay. Those both make sense, and admittedly I haven't actually thought about priority heaps in a few years, so it's plausible that I'd have made a few implementation errors. Thanks for the feedback!
 

xKiv

Active member
The wiki article i linked explains it reasonably well, I think (naybe your forum style doesn't show links as links?), including the self-balancing thing (all levels of the equivalent tree are full, except the last one; and the last one is filled from left. For the array, this means that all indexes from 0 to length-1 are value-carrying nodes).
 
Top