Contact
IOI’98
The cows have developed a new interest in scanning the universe outside their farm with radiotelescopes. Recently, they noticed a very curious microwave pulsing emission sent right from the centre of the galaxy. They wish to know if the emission is transmitted by some extraterrestrial form of intelligent life or if it is nothing but the usual heartbeat of the stars.
Help the cows to find the Truth by providing a tool to analyze bit patterns in the files they record. They are seeking bit patterns of length A through B inclusive (1 <= A <= B <= 12) that repeat themselves most often in each day’s data file. An input limit tells how many of the most frequent patterns to output.
Pattern occurrences may overlap, and only patterns that occur at least once are taken into account.
PROGRAM NAME: contact
INPUT FORMAT
Line 1: Three space-separated integers: A, B, N (1 <= N < 50)
Lines 2 and beyond: A sequence of as many as 200,000 characters, all 0 or 1; the characters are presented 80 per line, except potentially the last line.
SAMPLE INPUT (file contact.in)
2 4 10
01010010010001000111101100001010011001111000010010011110010000000
In this example, pattern 100 occurs 12 times, and pattern 1000 occurs 5 times. The most frequent pattern is 00, with 23 occurrences.
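The counts quoted for the sample can be checked with a small helper that counts overlapping occurrences. This is only a sketch for verification, not part of the solution; the names `countOverlapping` and `kSampleBits` are made up for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// The sample bit string from contact.in (65 characters).
const std::string kSampleBits =
    "01010010010001000111101100001010011001111000010010011110010000000";

// Count occurrences of `pattern` in `bits`, allowing overlaps.
int countOverlapping(const std::string& bits, const std::string& pattern) {
    int count = 0;
    std::size_t pos = bits.find(pattern);
    while (pos != std::string::npos) {
        ++count;
        // Restart one character further on, so overlapping matches are found.
        pos = bits.find(pattern, pos + 1);
    }
    return count;
}
```

Calling `countOverlapping(kSampleBits, "00")` gives 23, and the patterns `100` and `1000` give 12 and 5, matching the figures in the statement.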
OUTPUT FORMAT
Lines that list the N highest frequencies (in descending order of frequency) along with the patterns that occur in those frequencies. Order those patterns by shortest-to-longest and increasing binary number for those of the same frequency. If fewer than N highest frequencies are available, print only those that are.
Print the frequency alone by itself on a line. Then print the actual patterns space separated, six to a line (unless fewer than six remain).
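The ordering and line-breaking rules above can be sketched as follows, assuming the counts have already been collected. `Entry`, `byRank` and `formatTop` are hypothetical names used only for this illustration, and the result is built as a string rather than written to contact.out.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

struct Entry {            // hypothetical record: one pattern and its count
    std::string pattern;
    int freq;
};

// Higher frequency first; ties go to the shorter pattern, then to the
// smaller binary value. For bit strings of equal length, lexicographic
// order is the same as numeric order.
bool byRank(const Entry& a, const Entry& b) {
    if (a.freq != b.freq) return a.freq > b.freq;
    if (a.pattern.size() != b.pattern.size())
        return a.pattern.size() < b.pattern.size();
    return a.pattern < b.pattern;
}

// Format the n highest frequencies, six patterns per line.
std::string formatTop(std::vector<Entry> entries, int n) {
    std::sort(entries.begin(), entries.end(), byRank);
    std::ostringstream out;
    int groups = 0;
    std::size_t i = 0;
    while (i < entries.size() && groups < n) {
        out << entries[i].freq << '\n';
        int onLine = 0;
        std::size_t j = i;
        for (; j < entries.size() && entries[j].freq == entries[i].freq; ++j) {
            if (onLine == 6) { out << '\n'; onLine = 0; }  // start a new line
            if (onLine > 0) out << ' ';
            out << entries[j].pattern;
            ++onLine;
        }
        out << '\n';
        i = j;      // continue with the next frequency group
        ++groups;
    }
    return out.str();
}
```

Grouping by frequency first and counting patterns per line inside the group keeps the "fewer than N frequencies" case simple: the outer loop just stops when the entries run out.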
SAMPLE OUTPUT (file contact.out)
23
00
15
01 10
12
100
11
11 000 001
10
010
8
0100
7
0010 1001
6
111 0000
5
011 110 1000
4
0001 0011 1100
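Problem summary:
Given a (1 to 12), b (1 to 12) and n (1 <= n < 50), followed by a 0/1 string of 1 to 200000 characters, count how often each contiguous substring of length a through b occurs, and output the n highest frequencies together with the substrings that occur that often. Substrings with the same frequency are printed from shortest to longest, and substrings of the same length in increasing binary value. Output lines are wrapped strictly at 80 characters.
Approach:
Brute force. A map records, for each substring, the index of its slot in a record array, so counting an occurrence is just an increment at that index; searching the array for the substring every time would lead to TLE. To make sorting easy, each substring is also converted to its binary value and its length is stored. At the end, sort once and print as required. While printing, check whether the current line would exceed 80 characters, and if so continue on the next line.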
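The same bookkeeping (one counter per pattern, incremented directly) can also be done without a map by indexing a table with the pattern's length and binary value. The sketch below shows that substitute technique, not the code that follows; `countPatterns`, `counts` and `kSample` are names made up for this example.

```cpp
#include <cassert>
#include <string>
#include <vector>

// The sample bit string from contact.in.
const std::string kSample =
    "01010010010001000111101100001010011001111000010010011110010000000";

// counts[len][value] = number of (possibly overlapping) occurrences of the
// len-bit pattern whose binary value is `value`, for a <= len <= b.
std::vector<std::vector<int>> countPatterns(const std::string& bits, int a, int b) {
    std::vector<std::vector<int>> counts(b + 1);
    for (int len = a; len <= b; ++len) counts[len].assign(1 << len, 0);

    const int n = static_cast<int>(bits.size());
    for (int i = 0; i < n; ++i) {
        int value = 0;
        // Extend the pattern starting at i one bit at a time.
        for (int len = 1; len <= b && i + len <= n; ++len) {
            value = value * 2 + (bits[i + len - 1] - '0');
            if (len >= a) ++counts[len][value];
        }
    }
    return counts;
}
```

With the sample input, `countPatterns(kSample, 2, 4)[2][0]` is 23 (pattern 00) and `[3][4]` is 12 (pattern 100). Since b is at most 12, the table holds at most 2 + 4 + ... + 4096 = 8190 counters.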
AC:
/*
TASK:contact
LANG:C++
ID:sum-g1
*/
#include <cstdio>
#include <cstring>
#include <algorithm>
#include <map>
#include <string>
using namespace std;
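// One record per distinct pattern seen in the input.
typedef struct {
    char bin[15];  // the pattern as a 0/1 string (at most 12 characters)
    int ans;       // number of occurrences
    int num;       // binary value of the pattern
    int len;       // pattern length
} node;
char str[300005];  // the input bit string
// Lengths 1..12 allow up to 2 + 4 + ... + 4096 = 8190 distinct patterns and
// slot 0 is unused, so 5000 records would not be enough.
node no[8200];
map<string, int> m;  // pattern -> index of its record in no[] (0 = not seen yet)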
// Convert a 0/1 string to its binary value.
int change (char *a) {
    int num = 0, len = strlen(a);
    for (int i = 0; i < len; ++i) {
        num = num * 2 + a[i] - '0';
    }
    return num;
}
// Higher frequency first, then shorter pattern, then smaller binary value.
bool cmp (node a, node b) {
    if(a.ans != b.ans) return a.ans > b.ans;
    if(a.len != b.len) return a.len < b.len;
    return a.num < b.num;
}
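int main() {
    freopen("contact.in", "r", stdin);
    freopen("contact.out", "w", stdout);
    int a, b, n, len = 0;
    char c;
    scanf("%d%d%d", &a, &b, &n);
    // " %c" skips whitespace, so the line breaks in the input are dropped.
    while(~scanf(" %c", &c)) { str[len++] = c; }
    int sum = 1;  // next free record; index 0 is reserved to mean "not seen"
    for (int i = 0; i < len; ++i) {
        char bin[15];
        // Every pattern starting at i, from the longest allowed length down to a.
        for (int to = min(b, len - i); to >= a; --to) {
            memset(bin, 0, sizeof(bin));
            strncpy(bin, str + i, to);
            if (!m[bin]) {
                // First occurrence: create a record and remember its index.
                strcpy(no[sum].bin, bin);
                ++no[sum].ans;
                no[sum].num = change(bin);
                no[sum].len = strlen(bin);
                m[bin] = sum;
                ++sum;
            } else ++no[ m[bin] ].ans;
        }
    }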
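    sort(no + 1, no + sum, cmp);
    int tt = 0, nn = 0;  // tt: frequencies printed; nn: pattern characters on the current line
    for (int i = 1; i < sum; ++i) {
        printf("%d\n", no[i].ans);
        printf("%s", no[i].bin);
        nn = strlen(no[i].bin);
        int j = i + 1;
        // Print the rest of the patterns that share this frequency.
        // Lines are wrapped by character count here, whereas the output
        // format above asks for six patterns per line.
        for (; j < sum && no[j].ans == no[i].ans; ++j) {
            if (nn + strlen(no[j].bin) + 1 >= 80) {
                printf("\n");
                printf("%s", no[j].bin);
                nn = 0;
            } else printf(" %s", no[j].bin);
            nn += strlen(no[j].bin);
        }
        // Skip the whole group, including when it runs to the end of the
        // array; otherwise its tail would be printed again as a new group.
        i = j - 1;
        printf("\n");
        ++tt;
        if(tt == n) break;
    }
    return 0;
}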