Having a hard time getting started on a project for a C++ class. I will be reading a text file and counting the number of occurrences of each word in each line. The output will report each word that was found, followed by a listing of the line it was found on and the number of occurrences on that line (shown below).

So for a single word, "open", if it only occurred twice on line three, it would print out:

open : 3:2

And overall output would look like the following:

A : 48:1
a : 9:1, 10:1, 12:2, 14:1, 17:2, 19:1, 26:1, 27:1, 28:2,
: 39:1, 41:1, 43:1, 45:2, 46:2, 49:1, 50:2, 51:1, 56:3,
: 81:1, 82:1, 94:1, 111:1, 112:1, 114:1, 117:1, 132:1, 135:1,
: 138:1, 142:2, 143:1, 144:1, 152:1, 156:1, 161:2, 163:1, 164:1,
: 167:1, 169:1, 175:1, 182:2, 190:1, 192:1
about : 16:1, 29:1, 166:1, 190:1, 191:1
above : 137:1
accompanied : 6:1
across : 26:1
.
.
.

I am thinking of using a map as the data structure. Then, after reading/parsing each line is complete, I would move these values into a larger multimap that tracks the entire text file with the key being the word, and value being a string in the format #:#.

Before I head too far down this line of thought, does it make sense to do it that way, or can you recommend a better method that I am missing?


Solution

You seem to be unclear on map. A map stores the data. It does not parse the data. You will need to:

  1. Read the words from the file. This can be done either one by one, or you can read the file one line at a time and tokenize the line. My suggestion is to read the words one at a time.

  2. Come up with the data structure to store the data. My suggestion:

    std::map<std::string, std::vector<std::pair<int, int>>>

    The key in the map is, obviously, the word. The std::pair<int, int> holds a line number and the number of occurrences of that word in that line. The std::vector<std::pair<int, int>> allows you to capture a list of those std::pairs.
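The two steps above can be put together in a rough sketch. It takes the line-at-a-time option, since that makes the line number easy to track, and it assumes words are separated by whitespace only (no punctuation stripping, no case folding). The names WordIndex, build_index and print_index are made up for illustration and are not part of the original suggestion.

```cpp
#include <cstddef>
#include <iostream>
#include <map>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// word -> list of (line number, occurrences on that line)
using WordIndex = std::map<std::string, std::vector<std::pair<int, int>>>;

// Hypothetical helper: reads the stream line by line and fills the index.
WordIndex build_index(std::istream& in) {
    WordIndex index;
    std::string line;
    int line_no = 0;
    while (std::getline(in, line)) {
        ++line_no;
        std::istringstream words(line);  // tokenize the line on whitespace
        std::string word;
        while (words >> word) {
            auto& hits = index[word];
            if (!hits.empty() && hits.back().first == line_no) {
                ++hits.back().second;           // word already seen on this line
            } else {
                hits.emplace_back(line_no, 1);  // first occurrence on this line
            }
        }
    }
    return index;
}

// Hypothetical helper: prints "word : line:count, line:count" per word.
// It does not wrap long lists onto continuation lines.
void print_index(const WordIndex& index, std::ostream& out) {
    for (const auto& entry : index) {
        out << entry.first << " : ";
        for (std::size_t i = 0; i < entry.second.size(); ++i) {
            if (i != 0) out << ", ";
            out << entry.second[i].first << ':' << entry.second[i].second;
        }
        out << '\n';
    }
}
```

To try it, open the file with std::ifstream, pass it to build_index, and pass the result together with std::cout to print_index. Because std::map keeps its keys sorted, the words come out in order without a separate sort, and because lines are read in order, only the last pair in each vector ever needs checking.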

Hope this helps you to move forward.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow