对我来说,Google Kick Start 就是一条充满荆棘的山路,无处不在的WA和TLE整得人简直没脾气。接下来一段时间,我打算在这条路上好好走上一走,四处看看,与TLE斗争到底。一般来说,TLE发生的原因即在于时间复杂度过高,需要想办法优化代码,其中一个思路是借助堆的应用,之前的也有文章提到,本文从一道更难的题目入手,一起来看看如何借助堆优化时间复杂度。
H-index
It is important for researchers to write many high quality academic papers. Jorge has recently discovered a way to measure how impactful a researcher’s papers are: the H-index.
The H-index score of a researcher is the largest integer h such that the researcher has h papers with at least h citations each.
Jorge has written N papers in his lifetime. The i-th paper has Ai citations. The number of citations that each paper has will never change after it is written. Please help Jorge determine his H-index score after each paper he wrote.
Simple Test Cases
Input
The first line of the input gives the number of test cases, T. T test cases follow. Each test case begins with a line containing N, the number of papers Jorge wrote.
The second line contains N integers. The i-th integer is Ai, the number of citations the i-th paper has.
1
2
3
4
5 2
3
5 1 2
6
1 3 3 2 2 15
Output
For each test case, output one line containing
Case #x: y, wherexis the test case number (starting from 1) andyis a space-separated list of integers. The i-th integer is the H-index score after Jorge wrote his i-th paper.
1
2 Case #1: 1 1 2
Case #2: 1 1 2 2 2 3
Limit
Time limit: 50 seconds per test set.
Memory limit: 1GB.
$1 \leq T \leq 100$
$1 \leq A_i \leq 105$
Test set 1 (Visible)
$1 \leq N \leq 1000$
Test set 2 (Hidden)
$1 \leq N \leq 10^5$
首先,需要了解H-Index的概念:H-Index又称为H指数或H因子,是一种评价学术成就的新方法。H代表“高引用次数”(high citations),一名科研人员的h指数是指他至多有H篇论文分别被引用了至少H次。如Simple Test Case #1 所示,当科研人员发了第一篇论文,引用次数为5,那么其H因子为1,解释为其至多发表了1(paper #1)篇论文且至少被引用了1次;当发表了第二篇论文,引用次数为1,那么其H因子依旧为1,因为其至多发表了1篇论文(paper #1 or paper #2)且至少被引用了1次;当发表了第三篇论文,引用次数为2,那么其H因子为2,因为其至多发表了2篇论文(paper #1 and paper #2)且至少被引用了2次。
那么,从直觉上,我们可以借助暴力枚举的方式,当科研人员发表了第k篇论文,寻找:$h-index(k) = max_k min(f(k), k)$。如此,其时间复杂度为$O(n^2)$,当然,其对于简单测试集Test set 1是够用的,但对于复杂测试集Test set 2来说便会超时,因此必须换个思路。
回顾题意,对于第k篇文章,我们需要找到m篇文章,且其引用次数>=m,同时,需要注意的是,H-Index总体上与发表文章数量呈正比,因此,我们可以设计一个小根堆来存储发表文章的引用次数,当堆顶元素小于当前H-Index时,移除该堆顶元素;当小根堆的容量依旧大于当前H-Index时,则当前H-Index++。由此,我们仅仅扫描一遍文章引用次数数组,其时间复杂度为$O(n)$。代码如下:
1 |
|
以上!