《编程之法》2.1寻找最小的k个数

  • Post author:
  • Post category:其他




题目描述:数组里面有n个整数,请找出其中最小的k个数,要求时间复杂度尽可能低。



解法一:全部排序:


先对这个序列快速排序,再输出前k个元素。时间复杂度为O(nlogn)+O(k);

#include <iostream>
#include <algorithm>
using namespace std;
void  SortFindK(int *nums, int k){
	sort(nums, nums+100);
	int i;
	for(i = 0; i < k; ++i)
		cout << nums[i] << " ";
	cout << endl;
}


int main(){
	int k, nums[100] = {29,-24,62,7,-30,88,75,10,24,17,-58,-40,-6,-54,69,-61,-55,
		-66,-54,-13,-38,85,-14,-63,81,96,-12,-78,-48,-18,19,-48,21,42,-56,-77,
		-41,-36,-15,2,-83,-48,60,-94,86,46,-2,16,-53,-8,93,9,4,-54,-2,25,36,
	    -21,-27,98,-92,77,83,59,-80,-48,-33,36,-73,44,-79,31,-1,56,43,81,
	    78,-33,40,-60,-94,49,0,-4,81,22,24,72,61,15,-63,-52,77,-94,-2,-66,96,43,0,-6};	
	while(cin >> k)
		SortFindK(nums, k);
	return 0;
}



解法二:部分排序:


前k个数不做处理,从下标k开始遍历数组,利用选择排序选择出前k个数的最大值kmax,每遍历一个数x,将其与kmax作比较,若x<kmax,用x替换kmax,并重新对这k个数进行一次选择排序选择出kmax’来比较下一个数;若x>=kmax,不更新数组而继续向前遍历。时间复杂度为(n-k)O(k)=O(nk),需用到指针指向每次遍历的最大值,则空间复杂度为O(1);

#include <iostream>
using namespace std;
int SelectSort(int *nums, int k){
	int i, max = 0;
	for(i = 1; i < k; ++i){
		if(nums[i] > nums[max])
			max = i;
	}
	return max;
}
void  SelectFindK(int *nums, int k){
	int i, max;
	for(i = k; i < 100; ++i){
		max = SelectSort(nums, k);
		int temp;
		if(nums[i] < nums[max]){
			temp = nums[i];
			nums[i] = nums[max];
			nums[max] = temp;
		}
	}
	for(i = 0; i < k; ++i)
		cout << nums[i] << "  ";
	cout << endl;
}


int main(){
	int k, nums[100] = {29,-24,62,7,-30,88,75,10,24,17,-58,-40,-6,-54,69,-61,-55,
		-66,-54,-13,-38,85,-14,-63,81,96,-12,-78,-48,-18,19,-48,21,42,-56,-77,
		-41,-36,-15,2,-83,-48,60,-94,86,46,-2,16,-53,-8,93,9,4,-54,-2,25,36,
	    -21,-27,98,-92,77,83,59,-80,-48,-33,36,-73,44,-79,31,-1,56,43,81,
	    78,-33,40,-60,-94,49,0,-4,81,22,24,72,61,15,-63,-52,77,-94,-2,-66,96,43,0,-6};	
	while(cin >> k)
		SelectFindK(nums, k);
	return 0;
}



解法三:用堆代替数组:




用容量为k的最大堆存储前k个数,建堆费时O(k),此时堆中元素时有序的,继续遍历剩余n-k个数,每次遍历到的新元素的值为x,将其与堆顶元素kmax作比较,若x<kmax,用x替换kmax,然后更新堆,否则不更新堆。时间复杂度为O(k)+(n-k)*O(logk)=O(nlogk)。

#include <iostream>
#include <queue>
using namespace std;
//priority_queue<int, vector<int>, greater<int> > heap; //从小往大排,为小顶堆
priority_queue<int> heap; //默认从大往小排,为大顶堆
void  HeapFindK(int *nums, int k){
	while(!heap.empty()) heap.pop();
	int i;
	for(i = 0; i < k; ++i)
		heap.push(nums[i]);
	int temp;
	for(i = k; i < 100; ++i){ //heap每次弹出和压入一个元素时,会自动调整成大顶堆
		temp = heap.top();
		if(temp > nums[i]){
			heap.pop();
			heap.push(nums[i]);
		}
	}
	for(i = 0; i < k; ++i){
		cout << heap.top() << " ";
		heap.pop();
	}
	cout << endl;
}


int main(){
	int k, nums[100] = {29,-24,62,7,-30,88,75,10,24,17,-58,-40,-6,-54,69,-61,-55,
		-66,-54,-13,-38,85,-14,-63,81,96,-12,-78,-48,-18,19,-48,21,42,-56,-77,
		-41,-36,-15,2,-83,-48,60,-94,86,46,-2,16,-53,-8,93,9,4,-54,-2,25,36,
	    -21,-27,98,-92,77,83,59,-80,-48,-33,36,-73,44,-79,31,-1,56,43,81,
	    78,-33,40,-60,-94,49,0,-4,81,22,24,72,61,15,-63,-52,77,-94,-2,-66,96,43,0,-6};	
	while(cin >> k)
		HeapFindK(nums, k);
	return 0;
}


我这里是直接使用stl值的大顶堆,具体堆的构建与实现:见网页:

http://www.cnblogs.com/mengdd/archive/2012/11/30/2796845.html


构建堆的时间复杂度为O(n)的原因:堆是一棵完全二叉树,故树高为h = O(log2(n)),由于初始的结点有可能在第h层,故根结点最多需要h-1次父子结点颠倒到达第1层,第二层的结点最多需要h-2次父子结点颠倒到达第2层,第h-1层的结点最多需要1次父子结点颠倒到达第h-1层,建堆时间复杂度为:

S = 1*(h-1) + 2*(h-2) + 2^2*(h-3) + … 2^(h-1)*1; 2S = 2*(h-1) + 2^2*(h-2) + 2^3*(h-3) + … 2^h*1;

两者错位相减可得S = -(h-1) + 2 + 2^2 + 2^h,约等于2^(h+1),将h = log2(n)带入,则时间复杂度为O(n);

至于每次更新堆,每次只需要对堆顶做处理,故时间复杂度为O(log2(n))。



解法四:线性选择算法


该算法实际上是求一个数组中第k小的数,且时间复杂度为O(n),如果要输出k个最小的数,找到数组中第k小的数后,此时k的左半段的元素都比第k个数小,即已经找到了数组中最小的k个元素,直接遍历下标0~k-1的元素输出即可,时间复杂度为O(n) + O(k) = O(n)。。


由于该算法篇幅有点长,具体实现见我的另外一篇博客:

寻找数组中第k大的数:平均情况下时间复杂度为O(n)的快速选择算法



举一反三:




三元组的数量:


给定一个数列a1,a2,a3…an和m个三元组表示的查询,对于每个查询(i, j, k),输出ai,ai+1,ai+2,…aj的升序排列中的第k个数。


最好使用线性选择算法,可使时间复杂度变为O(i-j)。使用前三种解法都会使该算法的复杂度超过O(i-j),具体思路可根据上述四个解法就行调整,就不细述了。



版权声明:本文为qibofang原创文章,遵循 CC 4.0 BY-SA 版权协议,转载请附上原文出处链接和本声明。