JAVA自带的各种HashMap(基于JDK11)

  • Post author:
  • Post category:java


1、HashMap

HashMap的数据结构在JDK1.8以下是:(数组+链表)

在JDK1.8时更新为 (数组+链表+红黑树)

为什么要做这种转变呢?

原因是当链表长度过长时会查询的时间复杂度时O(n),而转换成红黑树后查询的时间复杂度为O(logn),提高了查询效率,那么什么时候由链表转化为红黑树呢?

//树化阈值
static final int TREEIFY_THRESHOLD = 8;

//树化数组容量阈值
static final int MIN_TREEIFY_CAPACITY = 64;

//取自putVal
for (int binCount = 0; ; ++binCount) {
    if ((e = p.next) == null) {
        p.next = newNode(hash, key, value, null); //插入节点
        if (binCount >= TREEIFY_THRESHOLD - 1) // 长度到达阈值
            treeifyBin(tab, hash); //树化
        break;
    }
    if (e.hash == hash &&
        ((k = e.key) == key || (key != null && key.equals(k))))
        break;
    p = e;
}

//取自treeifyBin
final void treeifyBin(Node<K,V>[] tab, int hash) {
    int n, index; Node<K,V> e;
    if (tab == null || (n = tab.length) < MIN_TREEIFY_CAPACITY) //如果数组容量小于阈值
        resize(); //扩容
    else if ((e = tab[index = (n - 1) & hash]) != null) { //树化
        TreeNode<K,V> hd = null, tl = null;
        do {
            TreeNode<K,V> p = replacementTreeNode(e, null);
            if (tl == null)
                hd = p;
            else {
                p.prev = tl;
                tl.next = p;
            }
            tl = p;
        } while ((e = e.next) != null);
        if ((tab[index] = hd) != null)
            hd.treeify(tab);
    }
}

查看源代码可知,当链表长度>=8时,且数组长度>=64时,链表转化成红黑树,这是为啥呢?

* Because TreeNodes are about twice the size of regular nodes, we
     * use them only when bins contain enough nodes to warrant use
     * (see TREEIFY_THRESHOLD). And when they become too small (due to
     * removal or resizing) they are converted back to plain bins.  In
     * usages with well-distributed user hashCodes, tree bins are
     * rarely used.  Ideally, under random hashCodes, the frequency of
     * nodes in bins follows a Poisson distribution
     * (http://en.wikipedia.org/wiki/Poisson_distribution) with a
     * parameter of about 0.5 on average for the default resizing
     * threshold of 0.75, although with a large variance because of
     * resizing granularity. Ignoring variance, the expected
     * occurrences of list size k are (exp(-0.5) * pow(0.5, k) /
     * factorial(k)). The first values are:
     *
     * 0:    0.60653066
     * 1:    0.30326533
     * 2:    0.07581633
     * 3:    0.01263606
     * 4:    0.00157952
     * 5:    0.00015795
     * 6:    0.00001316
     * 7:    0.00000094
     * 8:    0.00000006
     * more: less than 1 in ten million

翻译过来的意思就是红黑树的占用的空间大小是链表的两倍,只有当链表中的元素足够多时才做转化,而链表长度的概率则符合泊松分布,

链表长度为8的概率已经很低了,所以在这个时候转化成红黑树已优化查询效率的性价比是很高的。

那么什么时候红黑树会还原成链表呢?

//红黑树转链表阈值
static final int UNTREEIFY_THRESHOLD = 6;

//取自split
if (loHead != null) {
    if (lc <= UNTREEIFY_THRESHOLD)
        tab[index] = loHead.untreeify(map);
    else {
        tab[index] = loHead;
        if (hiHead != null) // (else is already treeified)
            loHead.treeify(tab);
    }
}

从上可知当链表长度<=6时,红黑树转化为链表。

我们知道HashMap在遍历时是无序的(即非插入顺序),那为什么会是无序的呢? 我们来看看HashMap实现的迭代器代码

abstract class HashIterator {
    Node<K,V> next;        // next entry to return
    Node<K,V> current;     // current entry
    int expectedModCount;  // for fast-fail
    int index;             // current slot

    HashIterator() {
        expectedModCount = modCount;
        Node<K,V>[] t = table;
        current = next = null;
        index = 0;
        if (t != null && size > 0) { // 定位到table中第一个不为空的Node
            do {} while (index < t.length && (next = t[index++]) == null);
        }
    }

    final Node<K,V> nextNode() {
        Node<K,V>[] t;
        Node<K,V> e = next;
        if (modCount != expectedModCount)
            throw new ConcurrentModificationException();
        if (e == null)
            throw new NoSuchElementException();
        //如果next为空且table不为空就继续定位下一个不为空的Node
        if ((next = (current = e).next) == null && (t = table) != null) {
            do {} while (index < t.length && (next = t[index++]) == null);
        }
        return e;
    }
    
    ...//其他代码省略
}

可以看到,遍历时按照数组顺序+链表的next遍历的,而插入Node时是根据hashcode算出在某个位置插入的,所以HashMap遍历时无序的。

2、LinkedHashMap

LinkedHashMap继承自HashMap并声明了3个字段

//头结点
transient LinkedHashMap.Entry<K,V> head;

//尾节点
transient LinkedHashMap.Entry<K,V> tail;

//true 为访问顺序,false为插入顺序
final boolean accessOrder;

干嘛用的?

我们知道HashMap是无须的,即插入顺序和遍历顺序不一致。

而LinkedHashMap在HashMap的基础上增加了2个指针分别指向当前这个Node的头尾节点,从而使得可以按照插入顺序或者访问顺序来进行元素遍历(默认为插入顺序)

我们来看一下其中一个构造函数,可以指定为访问顺序排序,即LRU(最近最少使用算法)

public LinkedHashMap(int initialCapacity,
        float loadFactor,
        boolean accessOrder) {
	super(initialCapacity, loadFactor);
	this.accessOrder = accessOrder;
}

当指定了为LRU方式排序时,每当访问或插入一个元素之后,便会把这个元素放到链表的最后面 ,我们来看一下代码

//取自putVal片段
if (e != null) { // existing mapping for key
    V oldValue = e.value;
    if (!onlyIfAbsent || oldValue == null)
        e.value = value;
    afterNodeAccess(e);  //放到最后
    return oldValue;
}

public V get(Object key) {
    Node<K,V> e;
    if ((e = getNode(hash(key), key)) == null)
        return null;
    if (accessOrder) //设置为true
        afterNodeAccess(e); //移动节点到最后
    return e.value;
}

void afterNodeAccess(Node<K,V> e) { // move node to last
    LinkedHashMap.Entry<K,V> last;
    if (accessOrder && (last = tail) != e) {
        LinkedHashMap.Entry<K,V> p =
            (LinkedHashMap.Entry<K,V>)e, b = p.before, a = p.after;
        p.after = null;
        if (b == null)
            head = a;
        else
            b.after = a;
        if (a != null)
            a.before = b;
        else
            last = b;
        if (last == null)
            head = p;
        else {
            p.before = last;
            last.after = p;
        }
        tail = p;
        ++modCount;
    }
}

LinkedHashMap还可以实现简单的缓存功能

void afterNodeInsertion(boolean evict) { // 可能移除最少使用的节点
    LinkedHashMap.Entry<K,V> first;
    if (evict && (first = head) != null && removeEldestEntry(first)) { //条件允许则移除头结点
        K key = first.key;
        removeNode(hash(key), key, null, false, true);
    }
}

//子类基础重写
protected boolean removeEldestEntry(Map.Entry<K,V> eldest) {
        return false;
}

栗子:

public class TestLinked extends LinkedHashMap<String,String>{

	private int size;
	
	public TestLinked(int initialCapacity,
            float loadFactor,
            boolean accessOrder) {
		super(initialCapacity, loadFactor, accessOrder);
	}

	@Override
	protected boolean removeEldestEntry(java.util.Map.Entry<String, String> eldest) {
		return size>=3;
	}
	
	@Override
	public String put(String key, String value) {
		String result = super.put(key, value);
		size++;
		return result;
	}

	public static void main(String[] args) {
		TestLinked linked = new TestLinked(16, 0.75f, true);
		linked.put("a", "a");
		linked.put("b", "b");
		linked.put("c", "c");
		linked.get("a"); //get一下不然元素会被删除
		linked.put("d", "d");
		linked.put("e", "e");
		System.out.println(linked); //输出{a=a, d=d, e=e}
	}
}

3、TreeMap

TreeMap是无序的(插入和遍历顺序不一致),可以通过使用不同的构造方法指定其遍历顺序

数据结构为:(红黑树)

值得注意的是,若使用TreeMap的默认构造方法,作为Key的类必须实现Comparable接口,若使用传入比较器的构造函数则作为Key的类不用实现Comparable接口

//默认构造
public TreeMap() {
    comparator = null; //Key的类必须实现Comparable
}

//传入比较器
public TreeMap(Comparator<? super K> comparator) {
    this.comparator = comparator;
}

普通排序栗子:

public static void main(String[] args) {
	Comparator<? super String> comparator1 = (k1,k2)->k2.compareTo(k1);
	Comparator<? super String> comparator2 = (k1,k2)->k1.compareTo(k2);
	Map<String,String> map = new TreeMap<String, String>(comparator1);
	map.put("c", "c");
	map.put("a", "a");
	map.put("b", "b");
	System.out.println(map);
	//使用comparator1输出{c=c, b=b, a=a}
	//使用comparator2输出{a=a, b=b, c=c}
}

一些有用的方法

public static void main(String[] args) {
	Comparator<? super String> comparator2 = String::compareTo;
	TreeMap<String,String> map = new TreeMap<String, String>(comparator2);
	map.put("c", "c");
	map.put("a", "a");
	map.put("b", "b");
	map.put("e", "e");
	map.put("f", "f");
	System.out.println(map);
	System.out.println(map.firstKey()); //a
	System.out.println(map.lastKey()); //f
	System.out.println(map.lowerKey("f")); //e
	System.out.println(map.higherKey("a")); //b
}

4、IdentityHashMap

IdentityHashMap的数据结构为数组

IdentityHashMap可以保存”相同”的key,这里的相同指的是O1.hashCode()==O2.hashCode(),或者O1.equals(O2),为什么呢?

因为IdentityHashMap的hash算法与上述的两个方法无关,我们看看源代码怎么说:

private static int hash(Object x, int length) {
    int h = System.identityHashCode(x); 
    // Multiply by -127, and left-shift to use least bit as part of hash
    return ((h << 1) - (h << 8)) & (length - 1);
}

/**
 * Returns the same hash code for the given object as
 * would be returned by the default method hashCode(),
 * whether or not the given object's class overrides
 * hashCode().
 * The hash code for the null reference is zero.
 *
 * @param x object for which the hashCode is to be calculated
 * @return  the hashCode
 * @since   1.1
 * @see Object#hashCode
 * @see java.util.Objects#hashCode(Object)
 */
@HotSpotIntrinsicCandidate 
public static native int identityHashCode(Object x); 
//和默认方法的hashCode有关,不管子类是否覆盖都无效

可以简单的理解为与对象的内存地址有关,和其他无关,地址一样就是相同的key反之则不相同。

举例子:

public static void main(String[] args) {
	Map<String,String> map = new IdentityHashMap<String, String>();
	String a = "a";
	String b = "a";
	String c = new String("a");
	map.put(a, "c");
	map.put(a, "a");
	map.put(b, "b");
	map.put(c, "d");
	System.out.println(map); //输出{a=d, a=b}
}

5、EnumMap

EnumMap是针对枚举的集合类,其中存储的Key必须为枚举类型。

数据结构为数组,而在数组中的顺序是依赖枚举中定义的顺序,看看它的put代码:

public V put(K key, V value) {
    typeCheck(key);

    int index = key.ordinal(); //枚举中定义的顺序
    Object oldValue = vals[index];
    vals[index] = maskNull(value);
    if (oldValue == null)
        size++;
    return unmaskNull(oldValue);
}

数组的长度是固定的,在其构造时会算出数组的固定长度:

public EnumMap(Class<K> keyType) {
    this.keyType = keyType;
    keyUniverse = getKeyUniverse(keyType); //获取所有枚举
    vals = new Object[keyUniverse.length]; //长度固定
}

使用栗子:

public static void main(String[] args) {
	EnumMap<Style,String> map = new EnumMap<>(Style.class);
	map.put(Style.GREEN, "绿");
	map.put(Style.BLUE, "蓝");
	map.put(Style.RED, "红");
	System.out.println(map);
}

6、WeakHashMap

WeakHashMap是一个特殊的HashMap,它的Entry继承了弱引用类WeakReference,并传入Key作为弱引用,这表名它的每个Entry中的key的存活时间只能到下次GC前。

数据结构为:(数组+链表)

private static class Entry<K,V> extends WeakReference<Object> implements Map.Entry<K,V> {
    V value;
    final int hash;
    Entry<K,V> next;

    Entry(Object key, V value,
        ReferenceQueue<Object> queue,
        int hash, Entry<K,V> next) {
      //调用WeakReference的构造,传入Key
      super(key, queue);
      this.value = value;
      this.hash  = hash;
      this.next  = next;
    }
}

那么有一个问题,key被回收了,那Value呢,我们来看看它是怎么做的:

/**
 * Returns the table after first expunging stale entries.
 */
private Entry<K,V>[] getTable() { //get|put|remove啥的都会调用到这个方法
    expungeStaleEntries(); //去除过期entry
    return table;
}

/**
 * Expunges stale entries from the table.
 */
private void expungeStaleEntries() {
    for (Object x; (x = queue.poll()) != null; ) {
        synchronized (queue) {
            @SuppressWarnings("unchecked")
            	//过期对象
                Entry<K,V> e = (Entry<K,V>) x;
            int i = indexFor(e.hash, table.length);
            //定位到Entry
            Entry<K,V> prev = table[i];
            Entry<K,V> p = prev;
            //遍历清空
            while (p != null) {
                Entry<K,V> next = p.next;
                if (p == e) {
                    if (prev == e)
                        table[i] = next;
                    else
                        prev.next = next;
                    // Must not null out e.next;
                    // stale entries may be in use by a HashIterator
                    e.value = null; // Help GC
                    size--;
                    break;
                }
                prev = p;
                p = next;
            }
        }
    }
}

做个小测试:

public static void main(String[] args) {
	byte[] a = new byte[1024*1024*100];
	Object b = new Object();
	Map<Object,byte[]> map = new WeakHashMap<Object, byte[]>();
	map.put(a,new byte[1024*1024*100]);
	map.put(b,new byte[1024*1024*50]);
	a = null;
	System.out.println(map.size()); //2
	System.out.println(map); //{[B@7d6f77cc=[B@606d8acf, java.lang.Object@5aaa6d82=[B@782830e}
	System.gc();
	System.out.println(map.size()); //1
	System.out.println(map); //{java.lang.Object@5aaa6d82=[B@782830e}
	map.get(b); //触发expungeStaleEntries
	System.gc(); //回收掉entry的value
	System.out.println(map.size()); //1
	System.out.println(map); //{java.lang.Object@5aaa6d82=[B@782830e}
}

看一下内存情况:

7、HashTable

HashTable是一个线程安全的集合类,内部的数据结构为(数据+链表)

所有操作都加上了锁来确保线程安全,我们看一下它的默认构造方法:

public Hashtable() {
    this(11, 0.75f); //默认容量为11,为什么设置为11?
}

public Hashtable(int initialCapacity, float loadFactor) {
    if (initialCapacity < 0)
        throw new IllegalArgumentException("Illegal Capacity: "+
                                           initialCapacity);
    if (loadFactor <= 0 || Float.isNaN(loadFactor))
        throw new IllegalArgumentException("Illegal Load: "+loadFactor);

    if (initialCapacity==0)
        initialCapacity = 1;
    this.loadFactor = loadFactor;
    table = new Entry<?,?>[initialCapacity];
    //threshold = 初始*加载因子 或 最大容量+1
    threshold = (int)Math.min(initialCapacity * loadFactor, MAX_ARRAY_SIZE + 1);
}

HashTable的put方法比HashMap来的简单,put时容量不够就扩容

public synchronized V put(K key, V value) {
    // Make sure the value is not null
    if (value == null) {
        throw new NullPointerException();
    }
    // Makes sure the key is not already in the hashtable.
    Entry<?,?> tab[] = table;
    int hash = key.hashCode();
    int index = (hash & 0x7FFFFFFF) % tab.length; //先与上Integer.MAX_VALUE,确保为正数,然后再取模
    @SuppressWarnings("unchecked")
    Entry<K,V> entry = (Entry<K,V>)tab[index];
    for(; entry != null ; entry = entry.next) {
        if ((entry.hash == hash) && entry.key.equals(key)) {
            V old = entry.value;
            entry.value = value;
            return old;
        }
    }
    addEntry(hash, key, value, index);
    return null;
}

private void addEntry(int hash, K key, V value, int index) {
    Entry<?,?> tab[] = table;
    if (count >= threshold) {
        // Rehash the table if the threshold is exceeded
        rehash(); 
        tab = table;
        hash = key.hashCode();
        index = (hash & 0x7FFFFFFF) % tab.length;
    }
    // Creates the new entry.
    @SuppressWarnings("unchecked")
    Entry<K,V> e = (Entry<K,V>) tab[index];
    tab[index] = new Entry<>(hash, key, value, e);
    count++;
    modCount++;
}

protected void rehash() {
    int oldCapacity = table.length;
    Entry<?,?>[] oldMap = table;
    // overflow-conscious code
    int newCapacity = (oldCapacity << 1) + 1; //扩容2倍+1
    if (newCapacity - MAX_ARRAY_SIZE > 0) { 
        if (oldCapacity == MAX_ARRAY_SIZE)
            // 保持在最大容量值
            return;
        newCapacity = MAX_ARRAY_SIZE;
    }
    Entry<?,?>[] newMap = new Entry<?,?>[newCapacity];
    modCount++;
    threshold = (int)Math.min(newCapacity * loadFactor, MAX_ARRAY_SIZE + 1);
    table = newMap;
    for (int i = oldCapacity ; i-- > 0 ;) { //循环重新插入元素
        for (Entry<K,V> old = (Entry<K,V>)oldMap[i] ; old != null ; ) {
            Entry<K,V> e = old;
            old = old.next;
            int index = (e.hash & 0x7FFFFFFF) % newCapacity;
            e.next = (Entry<K,V>)newMap[index];
            newMap[index] = e;
        }
    }
}

其他也没什么好看的了,现在HashTable也几乎没人用。

参考文章:


https://blog.csdn.net/liubenlong007/article/details/87937209


https://blog.csdn.net/a724888/article/details/80290276


https://blog.csdn.net/qq_39736103/article/details/105306652


https://www.cnblogs.com/lighten/p/7381905.html



版权声明:本文为u011492805原创文章,遵循 CC 4.0 BY-SA 版权协议,转载请附上原文出处链接和本声明。