Saturday, 28 September 2013

Efficient Algorithm (Hashing, Sorting)

Efficient Algorithm (Hashing, Sorting)

I have the got the following problem - I am given a text file containing
huge number of words with repetition allowed. I need to write an algorithm
that will output those 1000 words that appear most with their frequencies
in decreasing order. Here is an example
**input.txt**
aa aa bb cc bb bb bb dd dd
**output.txt** (note - frequencies can be repeated)
bb 4
aa 2
dd 2
cc 1
And here is my way of solving this problem. First read all the words and
store them in HashMap with word as the key. The end result of doing this
is that I have all the words with their frequencies.
Now I iterate over the HashMap and create an object {word, frequency} for
each key value pair and then I insert it in a SortedSet (I wrote a
comparator for that).
Now I just iterate over the SortedSet for 1000 times to get the result.
I would like to know a better way of solving this problem.
Thanks

No comments:

Post a Comment