Question

Assume I have a StringBuffer with values "1 \n 2 \n 3 \n...etc" where \n is a line break. How would I add these values to an existing CSV file as a column using Java? Specifically, this would be the last column.

For example, let's say I have a CSV file that looks like this:

5, 2, 5
2, 3, 1
3, 5, 2
..
etc.

The output should look like this given the StringBuffer after using the method to add the column to the csv file:

5, 2, 5, 1
2, 3, 1, 2
3, 5, 2, 3
..
etc.

I also plan to add columns with 1000s of values so I am looking for something that does not have high memory consumption.

Thanks ahead of time.

Edit: Columns may be different sizes. I see people saying to add it at the end of each line. The problem is, it will add the values to the wrong columns and I cannot have that happen. I thank you all for your suggestions though as they were very good.

Edit 2: I have received critique about my use of StringBuffer and yes, I agree: if this problem were isolated, I would also suggest StringBuilder. The context of this problem is a program that has synchronized threads (acting as scenarios) collecting response times given a range of concurrent threads. The concurrent threads execute concurrent queries against a database, and once a query has been executed, the result is appended to a StringBuffer. All the response times for each synchronized thread are appended to a StringBuffer and written to a CSV document. There can be several threads with the same response time. I could use StringBuilder, but then I would have to manually synchronize the threads appending the response times; in my case, I do not think it would make much of a difference in performance, and it would add an unnecessary amount of code. I hope this helps and, once again, I thank you all for your concerns and suggestions. If, after reading this, you are still not convinced that I should use StringBuffer, then I ask that we please take this discussion offline.

Edit 3: I have figured out how to work around the issue of adding the columns when the rows are different sizes. I simply add a comma for every missing column (also note that my rows grow with each column added). It looks like @BorisTheSpider's conceptual solution actually works with this modification. The problem is that I am not sure how to add the text at the end of each line. My code so far (I removed some code to conserve space):

//Before this code there is a statement to create a test.csv file (this file has no values before this loop occurs).

    for (int p = 0; p < (max + 1); p = p + inc) {
        threadThis2(p);
        // threadThis2 appends several comma-delimited values to the StringBuffer.
        // p represents the number of threads/queries to execute at the same time.
        comma = p / inc; // how many commas to put if there is nothing on the line.
        for (int i = 0; i < comma; i++) {
            commas.append(",");
        }
        br = new BufferedReader(new FileReader("test.csv"));
        List<String> avg = Arrays.asList(sb.toString().split(", "));
        for (int i = 0; i < avg.size(); i++) {
            if (br.readLine() == null) {
                w.write(commas.toString() + avg.get(i).toString() + ", \n");
            } else {
                w.write(avg.get(i).toString() + ", \n");
            }
        }
        br.close();
        sb.setLength(0);
        commas.setLength(0);
    }

Please note this code is in its early stages (I will of course declare all the variables outside the for loop later on). So far this code works. The problem is that the columns are not side by side, which is what I want. I understand I may be required to create temporary files but I need to approach this problem very carefully as I might need to have a lot of columns in the future.


Solution

Apparently there are two basic requirements:

  1. Append a column to an existing CSV file
  2. Allow concurrent operation

To achieve Requirement #1, the original file has to be read and rewritten as a new file that includes the new column, regardless of where the new column's values are held (in a StringBuffer or elsewhere).

The best (and only generic) way of reading a CSV file is via a mature, field-proven library such as OpenCSV, which is lightweight and commercially friendly, given its Apache 2.0 license. Otherwise, one has to either make many simplifying assumptions (e.g., always assume single-line CSV records) or reinvent the wheel by implementing a new CSV parser.

In either case, a simple algorithm is needed, e.g.:

  • Initialize a CSV reader or parser object from the library used (or from whatever custom solution is used), supplying the existing CSV file and the necessary parameters (e.g., field separator).
  • Read the input file record-by-record, via the reader or parser, as a String[] or List<String> structure.
  • Manipulate the structure returned for every record to add or delete any extra fields (columns), in memory.
  • Add blank fields (i.e., just extra separators, 1 per field), if desired or needed.
  • Use a CSV writer from the library (or manually implement a writer) to write the new record to the output file.
  • Append a newline character at the end of each record written to the output file.
  • Repeat for all the records in the original CSV file.

This approach is also scalable: the file is processed one record at a time, so it never has to be held in memory in its entirety.
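The steps above can be sketched roughly as follows. This is only a simplified illustration using plain java.io/java.nio rather than a CSV library, so it assumes single-line records with no quoted or embedded commas. The class name CsvColumnAppender, the method appendColumn, and the existingColumns parameter are hypothetical names introduced for this example. It trims fields and rejoins them with ", " to match the sample file in the question.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Hypothetical helper; assumes simple CSV (one record per line, no quoting).
class CsvColumnAppender {

    /**
     * Streams the CSV into a temporary file, adding one value per row as the
     * last column, then replaces the original file with the result.
     * existingColumns is the number of columns already in the file; it is used
     * to pad rows that are short or missing so the new value lands in the
     * right column.
     */
    static void appendColumn(Path csv, Iterable<String> values, int existingColumns)
            throws IOException {
        Path dir = csv.toAbsolutePath().getParent();
        Path tmp = Files.createTempFile(dir, "csv-", ".tmp");
        try (BufferedReader in = Files.newBufferedReader(csv, StandardCharsets.UTF_8);
             BufferedWriter out = Files.newBufferedWriter(tmp, StandardCharsets.UTF_8)) {
            Iterator<String> it = values.iterator();
            String line;
            // Continue until both the file and the new values are exhausted.
            while ((line = in.readLine()) != null || it.hasNext()) {
                List<String> fields = new ArrayList<>();
                if (line != null && !line.isEmpty()) {
                    for (String f : line.split(",", -1)) {
                        fields.add(f.trim());
                    }
                }
                // Add blank fields for any missing columns.
                while (fields.size() < existingColumns) {
                    fields.add("");
                }
                // New last column; blank if there are more rows than values.
                fields.add(it.hasNext() ? it.next().trim() : "");
                out.write(String.join(", ", fields));
                out.newLine();
            }
        }
        // Only one record is in memory at a time; swap the files at the end.
        Files.move(tmp, csv, StandardCopyOption.REPLACE_EXISTING);
    }
}
```

With the StringBuffer from the question, a call might look like appendColumn(Paths.get("test.csv"), Arrays.asList(sb.toString().split("\n")), 3). Swapping the split/join for a library reader and writer (e.g., OpenCSV) would remove the single-line, no-quoting assumption.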

For Requirement #2, there are many ways of supporting concurrency and in this scenario it is more efficient to do it in a tailored manner (i.e., "manually" in the application), as opposed to relying on a thread-safe data structure like StringBuffer.
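As one possible illustration of handling concurrency in the application itself, the sketch below has each worker return its own response time through a Future, so no shared buffer is needed. This is an assumption about how the program could be structured, not the asker's actual code. ResponseTimeCollector, collect, and the query parameter are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical helper; 'query' stands in for the database call being timed.
class ResponseTimeCollector {

    /**
     * Runs 'query' on 'threads' concurrent workers and returns one elapsed
     * time in milliseconds per worker, as strings ready to be written out
     * as a CSV column.
     */
    static List<String> collect(int threads, Runnable query)
            throws InterruptedException, ExecutionException {
        List<String> column = new ArrayList<>();
        if (threads <= 0) {
            return column; // a fixed thread pool cannot be created with 0 threads
        }
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Callable<Long>> tasks = new ArrayList<>();
            for (int i = 0; i < threads; i++) {
                tasks.add(() -> {
                    long start = System.nanoTime();
                    query.run();
                    return (System.nanoTime() - start) / 1_000_000;
                });
            }
            // invokeAll blocks until every task has completed; results are
            // gathered by this single thread, so no synchronization is needed.
            for (Future<Long> result : pool.invokeAll(tasks)) {
                column.add(Long.toString(result.get()));
            }
        } finally {
            pool.shutdown();
        }
        return column;
    }
}
```

The returned list can then be passed to whatever code rewrites the CSV file, with only one thread ever touching the file.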

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow