题目描述:数组里面有n个整数,请找出其中最小的k个数,要求时间复杂度尽可能低。
解法一:全部排序:
先对这个序列快速排序,再输出前k个元素。时间复杂度为O(nlogn)+O(k);
#include <iostream>
#include <algorithm>
using namespace std;
void SortFindK(int *nums, int k){
sort(nums, nums+100);
int i;
for(i = 0; i < k; ++i)
cout << nums[i] << " ";
cout << endl;
}
int main(){
int k, nums[100] = {29,-24,62,7,-30,88,75,10,24,17,-58,-40,-6,-54,69,-61,-55,
-66,-54,-13,-38,85,-14,-63,81,96,-12,-78,-48,-18,19,-48,21,42,-56,-77,
-41,-36,-15,2,-83,-48,60,-94,86,46,-2,16,-53,-8,93,9,4,-54,-2,25,36,
-21,-27,98,-92,77,83,59,-80,-48,-33,36,-73,44,-79,31,-1,56,43,81,
78,-33,40,-60,-94,49,0,-4,81,22,24,72,61,15,-63,-52,77,-94,-2,-66,96,43,0,-6};
while(cin >> k)
SortFindK(nums, k);
return 0;
}
解法二:部分排序:
前k个数不做处理,从下标k开始遍历数组,利用选择排序选择出前k个数的最大值kmax,每遍历一个数x,将其与kmax作比较,若x<kmax,用x替换kmax,并重新对这k个数进行一次选择排序选择出kmax’来比较下一个数;若x>=kmax,不更新数组而继续向前遍历。时间复杂度为(n-k)O(k)=O(nk),需用到指针指向每次遍历的最大值,则空间复杂度为O(1);
#include <iostream>
using namespace std;
int SelectSort(int *nums, int k){
int i, max = 0;
for(i = 1; i < k; ++i){
if(nums[i] > nums[max])
max = i;
}
return max;
}
void SelectFindK(int *nums, int k){
int i, max;
for(i = k; i < 100; ++i){
max = SelectSort(nums, k);
int temp;
if(nums[i] < nums[max]){
temp = nums[i];
nums[i] = nums[max];
nums[max] = temp;
}
}
for(i = 0; i < k; ++i)
cout << nums[i] << " ";
cout << endl;
}
int main(){
int k, nums[100] = {29,-24,62,7,-30,88,75,10,24,17,-58,-40,-6,-54,69,-61,-55,
-66,-54,-13,-38,85,-14,-63,81,96,-12,-78,-48,-18,19,-48,21,42,-56,-77,
-41,-36,-15,2,-83,-48,60,-94,86,46,-2,16,-53,-8,93,9,4,-54,-2,25,36,
-21,-27,98,-92,77,83,59,-80,-48,-33,36,-73,44,-79,31,-1,56,43,81,
78,-33,40,-60,-94,49,0,-4,81,22,24,72,61,15,-63,-52,77,-94,-2,-66,96,43,0,-6};
while(cin >> k)
SelectFindK(nums, k);
return 0;
}
解法三:用堆代替数组:
用容量为k的最大堆存储前k个数,建堆费时O(k),此时堆中元素时有序的,继续遍历剩余n-k个数,每次遍历到的新元素的值为x,将其与堆顶元素kmax作比较,若x<kmax,用x替换kmax,然后更新堆,否则不更新堆。时间复杂度为O(k)+(n-k)*O(logk)=O(nlogk)。
#include <iostream>
#include <queue>
using namespace std;
//priority_queue<int, vector<int>, greater<int> > heap; //从小往大排,为小顶堆
priority_queue<int> heap; //默认从大往小排,为大顶堆
void HeapFindK(int *nums, int k){
while(!heap.empty()) heap.pop();
int i;
for(i = 0; i < k; ++i)
heap.push(nums[i]);
int temp;
for(i = k; i < 100; ++i){ //heap每次弹出和压入一个元素时,会自动调整成大顶堆
temp = heap.top();
if(temp > nums[i]){
heap.pop();
heap.push(nums[i]);
}
}
for(i = 0; i < k; ++i){
cout << heap.top() << " ";
heap.pop();
}
cout << endl;
}
int main(){
int k, nums[100] = {29,-24,62,7,-30,88,75,10,24,17,-58,-40,-6,-54,69,-61,-55,
-66,-54,-13,-38,85,-14,-63,81,96,-12,-78,-48,-18,19,-48,21,42,-56,-77,
-41,-36,-15,2,-83,-48,60,-94,86,46,-2,16,-53,-8,93,9,4,-54,-2,25,36,
-21,-27,98,-92,77,83,59,-80,-48,-33,36,-73,44,-79,31,-1,56,43,81,
78,-33,40,-60,-94,49,0,-4,81,22,24,72,61,15,-63,-52,77,-94,-2,-66,96,43,0,-6};
while(cin >> k)
HeapFindK(nums, k);
return 0;
}
我这里是直接使用stl值的大顶堆,具体堆的构建与实现:见网页:
http://www.cnblogs.com/mengdd/archive/2012/11/30/2796845.html
构建堆的时间复杂度为O(n)的原因:堆是一棵完全二叉树,故树高为h = O(log2(n)),由于初始的结点有可能在第h层,故根结点最多需要h-1次父子结点颠倒到达第1层,第二层的结点最多需要h-2次父子结点颠倒到达第2层,第h-1层的结点最多需要1次父子结点颠倒到达第h-1层,建堆时间复杂度为:
S = 1*(h-1) + 2*(h-2) + 2^2*(h-3) + … 2^(h-1)*1; 2S = 2*(h-1) + 2^2*(h-2) + 2^3*(h-3) + … 2^h*1;
两者错位相减可得S = -(h-1) + 2 + 2^2 + 2^h,约等于2^(h+1),将h = log2(n)带入,则时间复杂度为O(n);
至于每次更新堆,每次只需要对堆顶做处理,故时间复杂度为O(log2(n))。
解法四:线性选择算法
该算法实际上是求一个数组中第k小的数,且时间复杂度为O(n),如果要输出k个最小的数,找到数组中第k小的数后,此时k的左半段的元素都比第k个数小,即已经找到了数组中最小的k个元素,直接遍历下标0~k-1的元素输出即可,时间复杂度为O(n) + O(k) = O(n)。。
由于该算法篇幅有点长,具体实现见我的另外一篇博客:
寻找数组中第k大的数:平均情况下时间复杂度为O(n)的快速选择算法
举一反三:
三元组的数量:
给定一个数列a1,a2,a3…an和m个三元组表示的查询,对于每个查询(i, j, k),输出ai,ai+1,ai+2,…aj的升序排列中的第k个数。
最好使用线性选择算法,可使时间复杂度变为O(i-j)。使用前三种解法都会使该算法的复杂度超过O(i-j),具体思路可根据上述四个解法就行调整,就不细述了。