When to calculate key's hashcode, spread() method is called:
static final int spread(int h) {
return (h ^ (h >>> 16)) & HASH_BITS;
}
where HASH_BITS
equals 0x7fffffff
, so, what is the purpose of HASH_BITS
? Some one says it make the sign bit to 0, I am not sure about that.
The index
of KV Node in hash buckets is calculated by following formula:
index = (n - 1) & hash
hash
is the result of spread()
n
is the length of hash buckets which maximum is 2^30
private static final int MAXIMUM_CAPACITY = 1 << 30;
So the maximum of n - 1
is 2^30 - 1 which means the top bit of hash
will never be used in index calculation.
But i still don't understand is it necessary to clear the top bit of hash
to 0.It seems that there are more reasons to do so.
/**
* Spreads (XORs) higher bits of hash to lower and also forces top
* bit to 0. Because the table uses power-of-two masking, sets of
* hashes that vary only in bits above the current mask will
* always collide. (Among known examples are sets of Float keys
* holding consecutive whole numbers in small tables.) So we
* apply a transform that spreads the impact of higher bits
* downward. There is a tradeoff between speed, utility, and
* quality of bit-spreading. Because many common sets of hashes
* are already reasonably distributed (so don't benefit from
* spreading), and because we use trees to handle large sets of
* collisions in bins, we just XOR some shifted bits in the
* cheapest possible way to reduce systematic lossage, as well as
* to incorporate impact of the highest bits that would otherwise
* never be used in index calculations because of table bounds.
*/
static final int spread(int h) {
return (h ^ (h >>> 16)) & HASH_BITS;
}
I think it is to avoid collision with the preserved hashcodes: MOVED(-1), TREEBIN(-2) and RESERVED(-3) of which symbol bits are always 1.
User contributions licensed under CC BY-SA 3.0