Word Frequency - Problem
Imagine you're a text analyst tasked with understanding the most common words in a document. Your job is to create a bash script that counts how many times each word appears in a text file called words.txt.
The output should display each unique word followed by its frequency count, sorted by frequency in descending order. For words with the same frequency, sort them alphabetically.
Example: If words.txt contains:
the quick brown fox jumps over the lazy dog the fox
Your script should output:
the 3
fox 2
brown 1
dog 1
jumps 1
lazy 1
over 1
quick 1
Constraints:
- Input contains only lowercase letters and spaces
- Words are separated by one or more whitespace characters
- Output format: word frequency (space-separated)
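One possible way to meet these requirements is a pipeline of standard text utilities. The sketch below is only an illustration under stated assumptions: the function name word_freq_pipeline is hypothetical, the file defaults to words.txt, and LC_ALL=C is set so that "alphabetically" means plain byte order for lowercase letters.

```shell
#!/usr/bin/env bash
# Hypothetical helper (name not from the problem statement).
# Usage: word_freq_pipeline [file]   (defaults to words.txt)
word_freq_pipeline() {
  local file="${1:-words.txt}"
  # 1. tr:   put every word on its own line, squeezing runs of whitespace
  # 2. sed:  drop an empty first line caused by leading whitespace
  # 3. sort + uniq -c: group identical words and count them
  # 4. awk:  reorder "count word" into "word count"
  # 5. sort: count descending (numeric), then word ascending for ties
  tr -s '[:space:]' '\n' < "$file" \
    | sed '/^$/d' \
    | LC_ALL=C sort \
    | uniq -c \
    | awk '{ print $2, $1 }' \
    | LC_ALL=C sort -k2,2nr -k1,1
}
```

Calling word_freq_pipeline with no argument reads words.txt from the current directory. Swapping the columns before the final sort keeps the sort keys simple, because the fields no longer carry the padding that uniq -c adds.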
Input & Output
example_1.txt: Basic word counting
Input:
words.txt contains:
the quick brown fox jumps over the lazy dog
Output:
the 2
brown 1
dog 1
fox 1
jumps 1
lazy 1
over 1
quick 1
Note:
The word 'the' appears twice, all other words appear once. Results are sorted by frequency (descending) then alphabetically.
example_2.txt: Multiple spaces handling
Input:
words.txt contains:
hello world hello universe world
Output:
hello 2
world 2
universe 1
Note:
Multiple consecutive spaces are handled correctly. Words with the same frequency (hello, world) are sorted alphabetically.
example_3.txt: Single word file
Input:
words.txt contains:
test
Output:
test 1
Note:
Edge case: a file containing a single word should output that word with a count of 1.
Constraints
- Input file contains only lowercase letters (a-z) and space characters
- Words are separated by one or more whitespace characters
- File size can be up to 10^6 characters
- Output format: Each line should contain 'word count' separated by a single space
- Words with the same frequency should be sorted alphabetically
Understanding the Visualization
1. Initialize Counter: Create an empty associative array to track word frequencies
2. Single Pass Processing: Read each word and increment its counter in the hash table
3. Sort Results: Sort by frequency (descending), then alphabetically for ties
4. Output Report: Display each word with its count in the specified format
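The four steps above can be sketched with awk, whose arrays are associative and are created on first use. This is a minimal illustration, not a reference solution: the function name word_freq_awk is hypothetical, the file defaults to words.txt, and here the report is printed by awk and then ordered by an external sort, so the last two steps happen in the pipeline rather than inside awk.

```shell
#!/usr/bin/env bash
# Hypothetical helper (name not from the problem statement).
# Usage: word_freq_awk [file]   (defaults to words.txt)
word_freq_awk() {
  local file="${1:-words.txt}"
  awk '
    # Single pass: awk splits each line on runs of whitespace,
    # so every field is one word; bump its counter.
    { for (i = 1; i <= NF; i++) count[$i]++ }
    # Emit "word count" pairs; iteration order is unspecified here.
    END { for (w in count) print w, count[w] }
  ' "$file" | LC_ALL=C sort -k2,2nr -k1,1   # count descending, then word ascending
}
```

Because awk's default field splitting already ignores leading, trailing, and repeated whitespace, no separate cleanup step is needed before counting.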
Key Takeaway
Key Insight: Using associative arrays (hash tables) enables efficient single-pass counting and avoids rescanning the input file for every word. This turns a naive O(n²) approach into an O(n + k log k) solution, where n is the input size and k is the number of distinct words.