Question

I am sorry if this is a stupid question but I am new to Java linkedlists and arraylists.

What I wish to do is this: I have a text file that I run through word for word. I want to create an Arraylist of linkedlists, which each uniqye word in the text followed in the linked list by the words that it is followed by in the text.

Consider this piece of text: The cat walks to the red tree.

I want the Arraylist of LinkedLists to be like this:

The - cat - red

|

cat - walks

|

to - the

|

red - tree

What I have now is this:

while(dataFile.hasNext()){
     secondWord = dataFile.next();
     nWords++;
     if(nWords % 1000 ==0) System.out.println(nWords+" words");


   //and put words into list if not already there
   //check if this word is already in the list
   if(follows.contains(firstWord)){
    //add the next word to it's linked list 
 ((LinkedList)(firstWord)).add(secondWord);
   }
   else{
      //create new linked list for this word and then add next word

      follows.add(new LinkedList<E>().add(firstWord));


  ((LinkedList)(firstWord)).add(secondWord);
   }
   //go on to next word                     
   firstWord = secondWord;
   }

And it gives me plenty of errors. How can I do to this the best way? (With linkedlists, I know hashtables and binary trees are better but I need to use linked lists)

Was it helpful?

Solution

An ArrayList is not the best data structure for purpose of your outer list, and at least part of your difficulty stems from incorrect use of a list of lists.

In your implementation, presumably follows is an ArrayList of LinkedLists declared like this:

ArrayList<LinkedList<String>> follows = new ArrayList<>();

The result of follows.contains(firstWord) will never be true, because follows contains elements of type LinkedList, not String. firstWord is a String, and so would not be an element of follows, but would be the first element of an ArrayList which is an element of follows.

The solution offered below uses a Map, or more specifically a HashMap, for the outer list follows. A Map is preferable because when searching for the first word, the amortized look-up time will be O(1) using a map versus O(n) for a list.

String firstWord = dataFile.next().toLowerCase();
Map<String, List<String>> follows = new HashMap<>();
int nWords = 0;
while (dataFile.hasNext())
{
    String secondWord = dataFile.next().toLowerCase();
    nWords++;
    if (nWords % 1000 == 0)
    {
        System.out.println(nWords + " words");
    }

    //and put words into list if not already there
    //check if this word is already in the list
    if (follows.containsKey(firstWord))
    {
        //add the next word to it's linked list
        List list = follows.get(firstWord);
        if (!list.contains(secondWord))
        {
            list.add(secondWord);
        }
    }
    else
    {
        //create new linked list for this word and then add next word
        List list = new LinkedList<String>();
        list.add(secondWord);
        follows.put(firstWord, list);
    }

    //go on to next word
    firstWord = secondWord;
}

The map will look like this:

the: [cat, red]
cat: [walks]
to: [the]
red: [tree]
walks: [to]

I also made the following changes to your implementation:

  • Don't add duplicates to the list of following words. Note that a Set would be a more appropriate data structure for this task, but you clearly state that a requirement is to use LinkedList.

  • Use String.toLowerCase() to move all strings to lower case, so that "the" and "The" are treated equivalently. (Be sure you apply this to the initial value of firstWord as well, which doesn't appear in the code you provided.)

Note that both this solution and your original attempt assume that punctuation has already been removed.

OTHER TIPS

You should not work using direct classes implementation, instead using their interfaces to ease the development (among other reasons). So, instead do the type casting every when and now, declare your variable as List and just define the class when initializing it. Since you haven't posted the relevant code to redefine it, I could give you an example of this:

List<List<String>> listOfListOfString = new LinkedList<>(); //assuming Java 7 or later used
List<String> listOne = new ArrayList<>();
listOne.add("hello");
listOne.add("world");
listOfListOfString.add(listOne);
List<String> listTwo = new ArrayList<>();
listTwo.add("bye);
listTwo.add("world");
listOfListOfString.add(listTwo);
for (List<String> list : listOfListOfString) {
    System.out.println(list);
}

This will print:

[hello, world]
[bye, world]

Note that now you can change the implementation of any of listOne or listTwo to LinkedList:

List<String> listOne = new LinkedList<>();
//...
List<String> listTwo = new LinkedList<>();

And the code will behave the same. No need to do any typecast to make it work.

Related:

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top