Why ConcurrentHashMap calculate hashcode with 0x7fffffff in 1.8?

Question

Why ConcurrentHashMap calculate hashcode with 0x7fffffff in 1.8?

When to calculate key's hashcode, spread() method is called:

static final int spread(int h) {
    return (h ^ (h >>> 16)) & HASH_BITS;
}

where HASH_BITS equals 0x7fffffff, so, what is the purpose of HASH_BITS? Some one says it make the sign bit to 0, I am not sure about that.

java

hashcode

concurrenthashmap

asked on Stack Overflow May 30, 2018 by

lambdie

2 Answers

The index of KV Node in hash buckets is calculated by following formula:

index = (n - 1) & hash

hash is the result of spread()
n is the length of hash buckets which maximum is 2^30

private static final int MAXIMUM_CAPACITY = 1 << 30;

So the maximum of n - 1 is 2^30 - 1 which means the top bit of hash will never be used in index calculation.

But i still don't understand is it necessary to clear the top bit of hash to 0.It seems that there are more reasons to do so.

/**
 * Spreads (XORs) higher bits of hash to lower and also forces top
 * bit to 0. Because the table uses power-of-two masking, sets of
 * hashes that vary only in bits above the current mask will
 * always collide. (Among known examples are sets of Float keys
 * holding consecutive whole numbers in small tables.)  So we
 * apply a transform that spreads the impact of higher bits
 * downward. There is a tradeoff between speed, utility, and
 * quality of bit-spreading. Because many common sets of hashes
 * are already reasonably distributed (so don't benefit from
 * spreading), and because we use trees to handle large sets of
 * collisions in bins, we just XOR some shifted bits in the
 * cheapest possible way to reduce systematic lossage, as well as
 * to incorporate impact of the highest bits that would otherwise
 * never be used in index calculations because of table bounds.
 */
static final int spread(int h) {
    return (h ^ (h >>> 16)) & HASH_BITS;
}

answered on Stack Overflow May 31, 2018 by

Wangsla

I think it is to avoid collision with the preserved hashcodes: MOVED(-1), TREEBIN(-2) and RESERVED(-3) of which symbol bits are always 1.

answered on Stack Overflow Dec 15, 2018 by

Bennyhuo • edited Dec 15, 2018 by

Bennyhuo

User contributions licensed under CC BY-SA 3.0