---
comments: true
difficulty: Medium
edit_url: https://github.com/doocs/leetcode/edit/main/solution/0200-0299/0274.H-Index/README.md
tags:
    - Array
    - Counting Sort
    - Sorting
---

# [274. H-Index](https://leetcode.cn/problems/h-index)

[English Version](/solution/0200-0299/0274.H-Index/README_EN.md)

## Problem Description

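Given an integer array `citations`, where `citations[i]` is the number of citations the researcher received for their i-th paper, compute and return the researcher's h-index.

According to the definition of the h-index on Wikipedia: h stands for "high citations". A researcher's h-index is h if they have published at least h papers and at least h of those papers have each been cited at least h times. If several values of h are possible, the h-index is the largest of them.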
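The definition can be read almost directly as code. Below is a minimal brute-force sketch, not one of the solutions in this document; the function name `h_index_by_definition` is a hypothetical name chosen for illustration. It tries every candidate h from largest to smallest and returns the first one for which at least h papers have at least h citations.

```python
from typing import List


def h_index_by_definition(citations: List[int]) -> int:
    """Return the largest h such that at least h papers have >= h citations."""
    n = len(citations)
    # Try candidates from largest to smallest so the first hit is the maximum.
    for h in range(n, 0, -1):
        # Count the papers cited at least h times.
        if sum(c >= h for c in citations) >= h:
            return h
    # h = 0 always satisfies the definition.
    return 0
```

This runs in $O(n^2)$ time, so it is only useful as a reference against which faster approaches can be checked.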

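Example 1:

Input: citations = [3,0,6,1,5]
Output: 3
Explanation: The array means the researcher has 5 papers in total, cited 3, 0, 6, 1, and 5 times respectively. Since the researcher has 3 papers with at least 3 citations each, and the remaining two papers have no more than 3 citations each, her h-index is 3.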
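The arithmetic behind this example can be tabulated. The sketch below is illustrative only (the helper name `papers_with_at_least` is an assumption, not part of the problem): for each candidate h it counts the papers cited at least h times, and the h-index is the largest h whose count is at least h.

```python
from typing import List


def papers_with_at_least(citations: List[int]) -> List[int]:
    """Entry h of the result is the number of papers cited at least h times."""
    n = len(citations)
    return [sum(c >= h for c in citations) for h in range(n + 1)]


counts = papers_with_at_least([3, 0, 6, 1, 5])
# A candidate h is feasible when counts[h] >= h.
feasible = [h for h, k in enumerate(counts) if k >= h]
print(counts)         # [5, 4, 3, 3, 2, 2]
print(max(feasible))  # 3
```

For h = 4 only two papers (those with 6 and 5 citations) qualify, which is why 3 is the largest feasible value.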

Example 2:

Input: citations = [1,3,1]
Output: 1

Constraints:

- `n == citations.length`
- `1 <= n <= 5000`
- `0 <= citations[i] <= 1000`

## Solutions

### Solution 1: Sorting

We first sort the array `citations` in descending order. Then we enumerate $h$ from largest to smallest. If some $h$ satisfies $citations[h-1] \geq h$, then at least $h$ papers have each been cited at least $h$ times, and we return $h$ directly. If no such $h$ is found, no paper has been cited at all, and we return $0$.

The time complexity is $O(n \times \log n)$ and the space complexity is $O(\log n)$, where $n$ is the length of the array `citations`.

#### Python3

```python
from typing import List


class Solution:
    def hIndex(self, citations: List[int]) -> int:
        # Sort in descending order so citations[h - 1] is the h-th most cited paper.
        citations.sort(reverse=True)
        for h in range(len(citations), 0, -1):
            if citations[h - 1] >= h:
                return h
        return 0
```

#### Java

```java
import java.util.Arrays;

class Solution {
    public int hIndex(int[] citations) {
        // Ascending sort: citations[n - h] is the h-th most cited paper.
        Arrays.sort(citations);
        int n = citations.length;
        for (int h = n; h > 0; --h) {
            if (citations[n - h] >= h) {
                return h;
            }
        }
        return 0;
    }
}
```

#### C++

```cpp
#include <algorithm>
#include <vector>
using namespace std;

class Solution {
public:
    int hIndex(vector<int>& citations) {
        // Reverse iterators give a descending sort.
        sort(citations.rbegin(), citations.rend());
        for (int h = citations.size(); h; --h) {
            if (citations[h - 1] >= h) {
                return h;
            }
        }
        return 0;
    }
};
```

#### Go

```go
import "sort"

func hIndex(citations []int) int {
	// Ascending sort: citations[n-h] is the h-th most cited paper.
	sort.Ints(citations)
	n := len(citations)
	for h := n; h > 0; h-- {
		if citations[n-h] >= h {
			return h
		}
	}
	return 0
}
```

#### TypeScript

```ts
function hIndex(citations: number[]): number {
    // Sort in descending order.
    citations.sort((a, b) => b - a);
    for (let h = citations.length; h; --h) {
        if (citations[h - 1] >= h) {
            return h;
        }
    }
    return 0;
}
```

#### Rust

```rust
impl Solution {
    #[allow(dead_code)]
    pub fn h_index(citations: Vec<i32>) -> i32 {
        let mut citations = citations;
        // Sort in descending order.
        citations.sort_by(|&lhs, &rhs| rhs.cmp(&lhs));

        let n = citations.len();

        for i in (1..=n).rev() {
            if citations[i - 1] >= (i as i32) {
                return i as i32;
            }
        }

        0
    }
}
```

### Solution 2: Counting + Sum

We use an array $cnt$ of length $n+1$, where $cnt[i]$ is the number of papers cited exactly $i$ times. We traverse the array `citations`, treating every paper cited more than $n$ times as if it were cited $n$ times, and increment the element of $cnt$ whose index is the paper's citation count. This gives the number of papers for each citation count.

Next, we enumerate $h$ from largest to smallest and add $cnt[h]$ to a variable $s$, so that $s$ is the number of papers cited at least $h$ times. If $s \geq h$, at least $h$ papers have each been cited at least $h$ times, and we return $h$ directly. The loop always terminates, because $s \geq 0$ holds when $h = 0$.

The time complexity is $O(n)$ and the space complexity is $O(n)$, where $n$ is the length of the array `citations`.

#### Python3

```python
from typing import List


class Solution:
    def hIndex(self, citations: List[int]) -> int:
        n = len(citations)
        cnt = [0] * (n + 1)
        for x in citations:
            # Cap citation counts at n; larger values cannot raise the answer.
            cnt[min(x, n)] += 1
        s = 0
        for h in range(n, -1, -1):
            s += cnt[h]
            if s >= h:
                return h
```

#### Java

```java
class Solution {
    public int hIndex(int[] citations) {
        int n = citations.length;
        int[] cnt = new int[n + 1];
        for (int x : citations) {
            ++cnt[Math.min(x, n)];
        }
        // s is the number of papers cited at least h times.
        for (int h = n, s = 0;; --h) {
            s += cnt[h];
            if (s >= h) {
                return h;
            }
        }
    }
}
```

#### C++

```cpp
#include <algorithm>
#include <vector>
using namespace std;

class Solution {
public:
    int hIndex(vector<int>& citations) {
        int n = citations.size();
        // A vector avoids the non-standard variable-length array.
        vector<int> cnt(n + 1);
        for (int x : citations) {
            ++cnt[min(x, n)];
        }
        for (int h = n, s = 0;; --h) {
            s += cnt[h];
            if (s >= h) {
                return h;
            }
        }
    }
};
```

#### Go

```go
func hIndex(citations []int) int {
	n := len(citations)
	cnt := make([]int, n+1)
	for _, x := range citations {
		// The built-in min requires Go 1.21 or later.
		cnt[min(x, n)]++
	}
	for h, s := n, 0; ; h-- {
		s += cnt[h]
		if s >= h {
			return h
		}
	}
}
```

#### TypeScript

```ts
function hIndex(citations: number[]): number {
    const n: number = citations.length;
    const cnt: number[] = new Array(n + 1).fill(0);
    for (const x of citations) {
        ++cnt[Math.min(x, n)];
    }
    // s is the number of papers cited at least h times.
    for (let h = n, s = 0; ; --h) {
        s += cnt[h];
        if (s >= h) {
            return h;
        }
    }
}
```

### Solution 3: Binary Search

We notice that if some $h$ satisfies "at least $h$ papers are cited at least $h$ times", then for any $h' \lt h$ there are also at least $h'$ papers cited at least $h'$ times. Therefore we can use binary search to find the largest $h$ such that at least $h$ papers are cited at least $h$ times.

We define the left boundary of the binary search as $l=0$ and the right boundary as $r=n$. Each time, we take $mid = \lfloor \frac{l + r + 1}{2} \rfloor$, where $\lfloor x \rfloor$ denotes rounding $x$ down. Then we count the elements of `citations` that are greater than or equal to $mid$ and call that count $s$. If $s \geq mid$, at least $mid$ papers are cited at least $mid$ times, so we set the left boundary $l$ to $mid$; otherwise we set the right boundary $r$ to $mid-1$. When $l$ equals $r$, we have found the largest $h$, which is $l$ (or equivalently $r$).

The time complexity is $O(n \times \log n)$, where $n$ is the length of the array `citations`. The space complexity is $O(1)$.

#### Python3

```python
from typing import List


class Solution:
    def hIndex(self, citations: List[int]) -> int:
        l, r = 0, len(citations)
        while l < r:
            # Round up so that l = mid always makes progress.
            mid = (l + r + 1) >> 1
            if sum(x >= mid for x in citations) >= mid:
                l = mid
            else:
                r = mid - 1
        return l
```

#### Java

```java
class Solution {
    public int hIndex(int[] citations) {
        int l = 0, r = citations.length;
        while (l < r) {
            int mid = (l + r + 1) >> 1;
            // Count the papers cited at least mid times.
            int s = 0;
            for (int x : citations) {
                if (x >= mid) {
                    ++s;
                }
            }
            if (s >= mid) {
                l = mid;
            } else {
                r = mid - 1;
            }
        }
        return l;
    }
}
```

#### C++

```cpp
#include <vector>
using namespace std;

class Solution {
public:
    int hIndex(vector<int>& citations) {
        int l = 0, r = citations.size();
        while (l < r) {
            int mid = (l + r + 1) >> 1;
            // Count the papers cited at least mid times.
            int s = 0;
            for (int x : citations) {
                if (x >= mid) {
                    ++s;
                }
            }
            if (s >= mid) {
                l = mid;
            } else {
                r = mid - 1;
            }
        }
        return l;
    }
};
```

#### Go

```go
func hIndex(citations []int) int {
	l, r := 0, len(citations)
	for l < r {
		mid := (l + r + 1) >> 1
		// Count the papers cited at least mid times.
		s := 0
		for _, x := range citations {
			if x >= mid {
				s++
			}
		}
		if s >= mid {
			l = mid
		} else {
			r = mid - 1
		}
	}
	return l
}
```

#### TypeScript

```ts
function hIndex(citations: number[]): number {
    let l = 0;
    let r = citations.length;
    while (l < r) {
        const mid = (l + r + 1) >> 1;
        // Count the papers cited at least mid times.
        let s = 0;
        for (const x of citations) {
            if (x >= mid) {
                ++s;
            }
        }
        if (s >= mid) {
            l = mid;
        } else {
            r = mid - 1;
        }
    }
    return l;
}
```
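
The binary search solution depends on the feasibility condition being monotone: once a candidate h fails, every larger candidate fails too. The sketch below is an empirical check of that property on small random inputs, not a proof; the names `feasible` and `is_monotone` are hypothetical helpers introduced for illustration.

```python
import random
from typing import List


def feasible(citations: List[int], h: int) -> bool:
    """True when at least h papers are cited at least h times."""
    return sum(c >= h for c in citations) >= h


def is_monotone(citations: List[int]) -> bool:
    """Check that feasibility never switches from False back to True as h grows."""
    flags = [feasible(citations, h) for h in range(len(citations) + 1)]
    # For each adjacent pair (a, b): if a is False, b must be False as well.
    return all(a or not b for a, b in zip(flags, flags[1:]))


random.seed(0)  # fixed seed so the check is reproducible
samples = [
    [random.randint(0, 10) for _ in range(random.randint(1, 8))]
    for _ in range(200)
]
all_monotone = all(is_monotone(s) for s in samples)
```

The property holds in general because the number of papers with at least h citations can only shrink as h grows, while the threshold h itself increases.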