Question

Typical way of creating a CSV string (pseudocode):

  1. Create a CSV container object (like a StringBuilder in C#).
  2. Loop through the strings you want to add appending a comma after each one.
  3. After the loop, remove that last superfluous comma.

Code sample:

public string ReturnAsCSV(ContactList contactList)
{
    StringBuilder sb = new StringBuilder();
    foreach (Contact c in contactList)
    {
        sb.Append(c.Name + ",");
    }

    sb.Remove(sb.Length - 1, 1);
    //sb.Replace(",", "", sb.Length - 1, 1)

    return sb.ToString();
}

I like the idea of adding the comma by checking if the container is empty, but doesn't that mean more processing as it needs to check the length of the string on each occurrence?

I feel that there should be an easier/cleaner/more efficient way of removing that last comma. Any ideas?

Was it helpful?

Solution

You could use LINQ to Objects:

string [] strings = contactList.Select(c => c.Name).ToArray();
string csv = string.Join(",", strings);

Obviously that could all be done in one line, but it's a bit clearer on two.

OTHER TIPS

Your code not really compliant with full CSV format. If you are just generating CSV from data that has no commas, leading/trailing spaces, tabs, newlines or quotes, it should be fine. However, in most real-world data-exchange scenarios, you do need the full imlementation.

For generation to proper CSV, you can use this:

public static String EncodeCsvLine(params String[] fields)
{
    StringBuilder line = new StringBuilder();

    for (int i = 0; i < fields.Length; i++)
    {
        if (i > 0)
        {
            line.Append(DelimiterChar);
        }

        String csvField = EncodeCsvField(fields[i]);
        line.Append(csvField);
    }

    return line.ToString();
}

static String EncodeCsvField(String field)
{
    StringBuilder sb = new StringBuilder();
    sb.Append(field);

    // Some fields with special characters must be embedded in double quotes
    bool embedInQuotes = false;

    // Embed in quotes to preserve leading/tralining whitespace
    if (sb.Length > 0 && 
        (sb[0] == ' ' || 
         sb[0] == '\t' ||
         sb[sb.Length-1] == ' ' || 
         sb[sb.Length-1] == '\t' ))
    {
        embedInQuotes = true;
    }

    for (int i = 0; i < sb.Length; i++)
    {
        // Embed in quotes to preserve: commas, line-breaks etc.
        if (sb[i] == DelimiterChar || 
            sb[i]=='\r' || 
            sb[i]=='\n' || 
            sb[i] == '"') 
        { 
            embedInQuotes = true;
            break;
        }
    }

    // If the field itself has quotes, they must each be represented 
    // by a pair of consecutive quotes.
    sb.Replace("\"", "\"\"");

    String rv = sb.ToString();

    if (embedInQuotes)
    {
        rv = "\"" + rv + "\"";
    }

    return rv;
}

Might not be world's most efficient code, but it has been tested. Real world sucks compared to quick sample code :)

Why not use one of the open source CSV libraries out there?

I know it sounds like overkill for something that appears so simple, but as you can tell by the comments and code snippets, there's more than meets the eye. In addition to handling full CSV compliance, you'll eventually want to handle both reading and writing CSVs... and you may want file manipulation.

I've used Open CSV on one of my projects before (but there are plenty of others to choose from). It certainly made my life easier. ;)

Don't forget our old friend "for". It's not as nice-looking as foreach but it has the advantage of being able to start at the second element.

public string ReturnAsCSV(ContactList contactList)
{
    if (contactList == null || contactList.Count == 0)
        return string.Empty;

    StringBuilder sb = new StringBuilder(contactList[0].Name);

    for (int i = 1; i < contactList.Count; i++)
    {
        sb.Append(",");
        sb.Append(contactList[i].Name);
    }

    return sb.ToString();
}

You could also wrap the second Append in an "if" that tests whether the Name property contains a double-quote or a comma, and if so, escape them appropriately.

You could instead add the comma as the first thing inside your foreach.

if (sb.Length > 0) sb.Append(",");

You could also make an array of c.Name data and use String.Join method to create your line.

public string ReturnAsCSV(ContactList contactList)
{
    List<String> tmpList = new List<string>();

    foreach (Contact c in contactList)
    {
        tmpList.Add(c.Name);
    }

    return String.Join(",", tmpList.ToArray());
}

This might not be as performant as the StringBuilder approach, but it definitely looks cleaner.

Also, you might want to consider using .CurrentCulture.TextInfo.ListSeparator instead of a hard-coded comma -- If your output is going to be imported into other applications, you might have problems with it. ListSeparator may be different across different cultures, and MS Excel at the very least, honors this setting. So:

return String.Join(
    System.Globalization.CultureInfo.CurrentCulture.TextInfo.ListSeparator,
    tmpList.ToArray());

I like the idea of adding the comma by checking if the container is empty, but doesn't that mean more processing as it needs to check the length of the string on each occurrence?

You're prematurely optimizing, the performance hit would be negligible.

Just a thought, but remember to handle comma's and quotation marks (") in the field values, otherwise your CSV file may break the consumers reader.

I wrote a small class for this in case someone else finds it useful...

public class clsCSVBuilder
{
    protected int _CurrentIndex = -1;
    protected List<string> _Headers = new List<string>();
    protected List<List<string>> _Records = new List<List<string>>();
    protected const string SEPERATOR = ",";

    public clsCSVBuilder() { }

    public void CreateRow()
    {
        _Records.Add(new List<string>());
        _CurrentIndex++;
    }

    protected string _EscapeString(string str)
    {
        return string.Format("\"{0}\"", str.Replace("\"", "\"\"")
                                            .Replace("\r\n", " ")
                                            .Replace("\n", " ")
                                            .Replace("\r", " "));
    }

    protected void _AddRawString(string item)
    {
        _Records[_CurrentIndex].Add(item);
    }

    public void AddHeader(string name)
    {
        _Headers.Add(_EscapeString(name));
    }

    public void AddRowItem(string item)
    {
        _AddRawString(_EscapeString(item));
    }

    public void AddRowItem(int item)
    {
        _AddRawString(item.ToString());
    }

    public void AddRowItem(double item)
    {
        _AddRawString(item.ToString());
    }

    public void AddRowItem(DateTime date)
    {
        AddRowItem(date.ToShortDateString());
    }

    public static string GenerateTempCSVPath()
    {
        return Path.Combine(Path.GetTempPath(), Guid.NewGuid().ToString().ToLower().Replace("-", "") + ".csv");
    }

    protected string _GenerateCSV()
    {
        StringBuilder sb = new StringBuilder();

        if (_Headers.Count > 0)
        {
            sb.AppendLine(string.Join(SEPERATOR, _Headers.ToArray()));
        }

        foreach (List<string> row in _Records)
        {
            sb.AppendLine(string.Join(SEPERATOR, row.ToArray()));
        }

        return sb.ToString();
    }

    public void SaveAs(string path)
    {
        using (StreamWriter sw = new StreamWriter(path))
        {
            sw.Write(_GenerateCSV());
        }
    }
}

I've used this method before. The Length property of StringBuilder is NOT readonly so subtracting it by one means truncate the last character. But you have to make sure your length is not zero to start with (which would happen if your list is empty) because setting the length to less than zero is an error.

public string ReturnAsCSV(ContactList contactList)
{
    StringBuilder sb = new StringBuilder();

    foreach (Contact c in contactList)       
    { 
        sb.Append(c.Name + ",");       
    }

    if (sb.Length > 0)  
        sb.Length -= 1;

    return sb.ToString();  
}

I use CSVHelper - it's a great open-source library that lets you generate compliant CSV streams one element at a time or custom-map your classes:

public string ReturnAsCSV(ContactList contactList)
{
    StringBuilder sb = new StringBuilder();
    using (StringWriter stringWriter = new StringWriter(sb))
    {
        using (var csvWriter = new CsvHelper.CsvWriter(stringWriter))
        {
            csvWriter.Configuration.HasHeaderRecord = false;
            foreach (Contact c in contactList)
            {
                csvWriter.WriteField(c.Name);
            }
        }
    }
    return sb.ToString();
}

or if you map then something like this: csvWriter.WriteRecords<ContactList>(contactList);

How about some trimming?

public string ReturnAsCSV(ContactList contactList)
{
    StringBuilder sb = new StringBuilder();

    foreach (Contact c in contactList)
    {
        sb.Append(c.Name + ",");
    }

    return sb.ToString().Trim(',');
}

How about tracking whether you are on the first item, and only add a comma before the item if it is not the first one.

public string ReturnAsCSV(ContactList contactList)
{
    StringBuilder sb = new StringBuilder();
    bool isFirst = true;

    foreach (Contact c in contactList) {
        if (!isFirst) { 
          // Only add comma before item if it is not the first item
          sb.Append(","); 
        } else {
          isFirst = false;
        }

        sb.Append(c.Name);
    }

    return sb.ToString();
}
Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top