jdk8中的Arrays.sort

jdk8中Arrays.sort

这里可以看到根据传入数组类型的不同，排序的算法是由区别的。

拆分解析

我们在平时引用的时候，一般只会传入一个数组，但是真正调用的时候，参数会进行补全。

    public static void sort(int[] a) {DualPivotQuicksort.sort(a, 0, a.length - 1, null, 0, 0);}/*** Sorts the specified range of the array into ascending order. The range* to be sorted extends from the index {@code fromIndex}, inclusive, to* the index {@code toIndex}, exclusive. If {@code fromIndex == toIndex},* the range to be sorted is empty.** <p>Implementation note: The sorting algorithm is a Dual-Pivot Quicksort* by Vladimir Yaroslavskiy, Jon Bentley, and Joshua Bloch. This algorithm* offers O(n log(n)) performance on many data sets that cause other* quicksorts to degrade to quadratic performance, and is typically* faster than traditional (one-pivot) Quicksort implementations.** @param a the array to be sorted* @param fromIndex the index of the first element, inclusive, to be sorted* @param toIndex the index of the last element, exclusive, to be sorted** @throws IllegalArgumentException if {@code fromIndex > toIndex}* @throws ArrayIndexOutOfBoundsException*     if {@code fromIndex < 0} or {@code toIndex > a.length}*/public static void sort(int[] a, int fromIndex, int toIndex) {rangeCheck(a.length, fromIndex, toIndex);DualPivotQuicksort.sort(a, fromIndex, toIndex - 1, null, 0, 0);}

Arrays.sort里面的排序是灵活的，选取条件一般是传入数组的类型，数组的长度，包括数组的有序度进行判断，然后来选择排序。

一般他会先对这个数组的合法进行一个检查。

这里我们以int的为例子

可以看到排序代码为：

static void sort(int[] a, int left, int right,int[] work, int workBase, int workLen) {// Use Quicksort on small arraysif (right - left < QUICKSORT_THRESHOLD) {sort(a, left, right, true);return;}/** Index run[i] is the start of i-th run* (ascending or descending sequence).*/int[] run = new int[MAX_RUN_COUNT + 1];int count = 0; run[0] = left;// Check if the array is nearly sortedfor (int k = left; k < right; run[count] = k) {if (a[k] < a[k + 1]) { // ascendingwhile (++k <= right && a[k - 1] <= a[k]);} else if (a[k] > a[k + 1]) { // descendingwhile (++k <= right && a[k - 1] >= a[k]);for (int lo = run[count] - 1, hi = k; ++lo < --hi; ) {int t = a[lo]; a[lo] = a[hi]; a[hi] = t;}} else { // equalfor (int m = MAX_RUN_LENGTH; ++k <= right && a[k - 1] == a[k]; ) {if (--m == 0) {sort(a, left, right, true);return;}}}/** The array is not highly structured,* use Quicksort instead of merge sort.*/if (++count == MAX_RUN_COUNT) {sort(a, left, right, true);return;}}// Check special cases// Implementation note: variable "right" is increased by 1.if (run[count] == right++) { // The last run contains one elementrun[++count] = right;} else if (count == 1) { // The array is already sortedreturn;}// Determine alternation base for mergebyte odd = 0;for (int n = 1; (n <<= 1) < count; odd ^= 1);// Use or create temporary array b for mergingint[] b;                 // temp array; alternates with aint ao, bo;              // array offsets from 'left'int blen = right - left; // space needed for bif (work == null || workLen < blen || workBase + blen > work.length) {work = new int[blen];workBase = 0;}if (odd == 0) {System.arraycopy(a, left, work, workBase, blen);b = a;bo = 0;a = work;ao = workBase - left;} else {b = work;ao = 0;bo = workBase - left;}// Mergingfor (int last; count > 1; count = last) {for (int k = (last = 0) + 2; k <= count; k += 2) {int hi = run[k], mi = run[k - 1];for (int i = run[k - 2], p = i, q = mi; i < hi; ++i) {if (q >= hi || p < mi && a[p + ao] <= a[q + ao]) {b[i + bo] = a[p++ + ao];} else {b[i + bo] = a[q++ + ao];}}run[++last] = hi;}if ((count & 1) != 0) {for (int i = right, lo = run[count - 1]; --i >= lo;b[i + bo] = a[i + ao]);run[++last] = right;}int[] t = a; a = b; b = t;int o = ao; ao = bo; bo = o;}}

比较疑惑的地方就在于，他是如何进行长度判断的，又是如何检测数组的有序性的，是在排序前还是排序中呢，如果在排序后才知道数组是否有序，就没有意义了。

这里看源码

逻辑从上往下看：

第一步先以数组的长度为判断条件:

计算数组的长度容易：

直接right-left就可以了

那么这里的quicksort_threshold就是一个阈值

   static void sort(int[] a, int left, int right,int[] work, int workBase, int workLen) {// Use Quicksort on small arraysif (right - left < QUICKSORT_THRESHOLD) {sort(a, left, right, true);return;}

我们可以先看一下常量值：

Max_Run_Count是衡量有序性的，MAX_RUN_LENGTH是衡量重复重复元素的。

再回头看的话可以知道这个代码是长度小于这个286就直接采用双基准点快排了。

那么如果大于这个值呢？接着往下看

   /** Index run[i] is the start of i-th run* (ascending or descending sequence).*/int[] run = new int[MAX_RUN_COUNT + 1];int count = 0; run[0] = left;

run用来记录记录升序或者降序的起点

// Check if the array is nearly sorted
for (int k = left; k < right; run[count] = k) {// 循环遍历数组中的元素，k 从左边界开始逐步向右移动if (a[k] < a[k + 1]) { // ascending// 如果当前元素小于下一个元素，说明当前区间是递增的while (++k <= right && a[k - 1] <= a[k]);// 继续移动 k 直到找到递增区间的结束位置} else if (a[k] > a[k + 1]) { // descending// 如果当前元素大于下一个元素，说明当前区间是递减的while (++k <= right && a[k - 1] >= a[k]);// 继续移动 k 直到找到递减区间的结束位置// 然后反转递减区间，将递减区间转换为递增区间for (int lo = run[count] - 1, hi = k; ++lo < --hi; ) {int t = a[lo]; a[lo] = a[hi]; a[hi] = t;// 交换递减区间中的元素，实现反转}} else { // equal// 如果当前元素等于下一个元素，说明当前区间是相等的for (int m = MAX_RUN_LENGTH; ++k <= right && a[k - 1] == a[k]; ) {// 继续移动 k 直到找到不相等的元素或者达到最大相等长度if (--m == 0) {// 如果重复元素超过阈值，算法会尝试通过不同的分区策略来处理。这些策略旨在提高排序的效率，尽可能地减少重复元素对排序性能的影响，这里的sort还需要进行源码查看，比较复杂。sort(a, left, right, true);return;}}}/** The array is not highly structured,* use Quicksort instead of merge sort.*/// 如果发现数组不是高度结构化的，即当前区间并不是递增或递减的if (++count == MAX_RUN_COUNT) {// 如果累积的 run 数达到了最大值sort(a, left, right, true);// 则认为数组无序，使用快速排序进行排序return;}
}

、

这段代码是来判断这个区间的有序度。

count值越多，说明这个越无序，如果无序的话，那么采用快排来进行排序。

否则接着往下看

if (run[count] == right++) { // The last run contains one element 这段代码在检查最后一个 run 是否只包含一个元素。run[count] 表示当前记录的最后一个 run 的起始索引位置。如果这个 run 的起始索引位置与 right 相等，即最后一个 run 只包含一个元素，那么 right++ 会使 right 增加 1，这是因为 right 之前可能已经增加过 1。然后 run[++count] = right; 将新的右边界 right 记录到 run 数组中，表示这个 run 的结束位置。这样做的目的是为了确保 run 数组中的所有 run 都是左闭右开的区间。

第二个是cout值没有发生变化，已经排好序了，直接返回即可。

   if (run[count] == right++) { // The last run contains one elementrun[++count] = right;} else if (count == 1) { // The array is already sortedreturn;}

那么最后一段代码则是归并排序了

  // Determine alternation base for mergebyte odd = 0;for (int n = 1; (n <<= 1) < count; odd ^= 1);// Use or create temporary array b for mergingint[] b;                 // temp array; alternates with aint ao, bo;              // array offsets from 'left'int blen = right - left; // space needed for bif (work == null || workLen < blen || workBase + blen > work.length) {work = new int[blen];workBase = 0;}if (odd == 0) {System.arraycopy(a, left, work, workBase, blen);b = a;bo = 0;a = work;ao = workBase - left;} else {b = work;ao = 0;bo = workBase - left;}// Mergingfor (int last; count > 1; count = last) {for (int k = (last = 0) + 2; k <= count; k += 2) {int hi = run[k], mi = run[k - 1];for (int i = run[k - 2], p = i, q = mi; i < hi; ++i) {if (q >= hi || p < mi && a[p + ao] <= a[q + ao]) {b[i + bo] = a[p++ + ao];} else {b[i + bo] = a[q++ + ao];}}run[++last] = hi;}if ((count & 1) != 0) {for (int i = right, lo = run[count - 1]; --i >= lo;b[i + bo] = a[i + ao]);run[++last] = right;}int[] t = a; a = b; b = t;int o = ao; ao = bo; bo = o;}

数量大于快排阈值并且有序度比较高的情况会用到归并。

小总结

那么这里举了int的分析例子，最后附上一个参考表格。

排序目标	条件	采用算法
int[] long[] float[] double[]	size < 47	混合插入排序 (pair)
	size < 286	双基准点快排
	有序度低	双基准点快排
	有序度高	归并排序
byte[]	size <= 29	插入排序
	size > 29	计数排序
char[] short[]	size < 47	插入排序
	size < 286	双基准点快排
	有序度低	双基准点快排
	有序度高	归并排序
	size > 3200	计数排序
Object[]	-Djava.util.Arrays.useLegacyMergeSort=true	传统归并排序
		TimSort