Question

I need a function that will take a string and "pascal case" it. The only indicator that a new word starts is an underscore. Here are some example strings that need to be cleaned up:

  1. price_old => Should be PriceOld
  2. rank_old => Should be RankOld

I started working on a function that makes the first character upper case:

public string FirstCharacterUpper(string value)
{
    if (value == null || value.Length == 0)
        return string.Empty;
    if (value.Length == 1)
        return value.ToUpper();
    var firstChar = value.Substring(0, 1).ToUpper();
    return firstChar + value.Substring(1, value.Length - 1);
}

The thing the above function doesn't do is remove the underscore and "ToUpper" the character to the right of the underscore.

Also, any ideas about how to pascal case a string that doesn't have any indicators (like the underscore)? For example:

  1. companysource
  2. financialtrend
  3. accountingchangetype

The major challenge here is determining where one word ends and another starts. I guess I would need some sort of lookup dictionary to determine where new words start? Are there libraries out there that do this sort of thing already?

Thanks,

Paul

Solution

You can use the TextInfo.ToTitleCase method, then remove the '_' characters.

So, using the extension methods I've got:

http://theburningmonk.com/2010/08/dotnet-tips-string-totitlecase-extension-methods

you can do something like this:

var s = "price_old";
// Strings are immutable, so keep the returned value.
var result = s.ToTitleCase().Replace("_", string.Empty);
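
In case the linked page is unavailable, here is a minimal sketch of what such an extension method might look like. StringTitleCaseExtensions and ToPascalCase are hypothetical names, and using the invariant culture is an assumption; the linked implementation may well differ.

```csharp
using System.Globalization;

public static class StringTitleCaseExtensions
{
    // Hypothetical stand-in for the linked ToTitleCase extension method.
    public static string ToTitleCase(this string value)
    {
        // Assumption: the invariant culture is fine for identifier-like input.
        return CultureInfo.InvariantCulture.TextInfo.ToTitleCase(value);
    }

    // Title-cases each underscore-separated word, then drops the underscores.
    public static string ToPascalCase(this string value)
    {
        return value.ToTitleCase().Replace("_", string.Empty);
    }
}
```

With this in scope, "price_old".ToPascalCase() should give PriceOld, since TextInfo.ToTitleCase treats the underscore as a word separator.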

Other suggestions

Well, the first thing is easy:

string.Join("", "price_old"
    .Split(new [] { '_' }, StringSplitOptions.RemoveEmptyEntries)
    .Select(s => s.Substring(0, 1).ToUpper() + s.Substring(1))
    .ToArray());

returns PriceOld

The second thing is far more difficult. Since companysource could be CompanySource or maybe CompanysOurce, it can be automated, but the results are quite error-prone. You will need an English dictionary and a lot of guessing about which combination of words is correct.
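
A minimal sketch of the dictionary idea, assuming you can supply a word list as a set of lowercase strings. WordSegmenter and its methods are hypothetical names, not an existing library, and preferring the split with the fewest words is just one simple guessing rule; a real solution would probably rank candidate splits by word frequency.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class WordSegmenter
{
    // Returns the split of input into the fewest dictionary words,
    // or null when no complete split exists.
    public static List<string> Segment(string input, ISet<string> dictionary)
    {
        // best[i] is the shortest known split of the first i characters.
        var best = new List<string>[input.Length + 1];
        best[0] = new List<string>();
        for (int end = 1; end <= input.Length; end++)
        {
            for (int start = 0; start < end; start++)
            {
                if (best[start] == null) continue;
                string word = input.Substring(start, end - start);
                if (!dictionary.Contains(word)) continue;
                if (best[end] == null || best[start].Count + 1 < best[end].Count)
                    best[end] = new List<string>(best[start]) { word };
            }
        }
        return best[input.Length];
    }

    public static string ToPascalCase(string input, ISet<string> dictionary)
    {
        var words = Segment(input.ToLowerInvariant(), dictionary);
        if (words == null) return input; // no split found; leave unchanged
        return string.Concat(
            words.Select(w => char.ToUpperInvariant(w[0]) + w.Substring(1)));
    }
}
```

Given a set containing company and source, WordSegmenter.ToPascalCase("companysource", words) yields CompanySource. Input that cannot be fully covered by the word list is returned unchanged.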

Try this:

public static string GetPascalCase(string name)
{
    return Regex.Replace(name, @"^\w|_\w", 
        (match) => match.Value.Replace("_", "").ToUpper());
}

Console.WriteLine(GetPascalCase("price_old")); // => Should be PriceOld
Console.WriteLine(GetPascalCase("rank_old" )); // => Should be RankOld

With underscores:

s = Regex.Replace(s, @"(?:^|_)([a-z])",
      m => m.Groups[1].Value.ToUpper());
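
If the input may contain leading, trailing, or repeated underscores, a variant that treats every run of non-underscore characters as a word might be safer. This is only a sketch; UnderscoreCasing is a hypothetical name, and returning an empty string for null input is an assumption.

```csharp
using System.Linq;
using System.Text.RegularExpressions;

public static class UnderscoreCasing
{
    // Hypothetical helper: each run of non-underscore characters is one word.
    public static string ToPascalCase(string s)
    {
        if (string.IsNullOrEmpty(s)) return string.Empty;
        var words = Regex.Matches(s, "[^_]+").Cast<Match>().Select(m => m.Value);
        // Upper-case the first character of each word and join the words.
        return string.Concat(
            words.Select(w => char.ToUpperInvariant(w[0]) + w.Substring(1)));
    }
}
```

For example, UnderscoreCasing.ToPascalCase("__price__old_") gives PriceOld.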

Without underscores:

You're on your own there. But go ahead and search; I'd be surprised if nobody has done this before.

For your second problem of splitting concatenated words, you could use our best friends Google & Co. If your concatenated input is made up of common English words, the search engines have a good hit rate at suggesting the separated words as an alternative search query.

If you enter your sample input, Google and Bing suggest the following:

original             | Google                | Bing
=====================================================================
companysource        | company source        | company source 
financialtrend       | financial trend       | financial trend
accountingchangetype | accounting changetype | accounting change type

See this example.

Writing a small screen scraper for that should be fairly easy.

For those who need a non-regex solution:

public static string RemoveAllSpaceAndConcertToPascalCase(string status)
{
    var textInfo = new System.Globalization.CultureInfo("en-US").TextInfo;
    var titleCaseStr = textInfo.ToTitleCase(status);
    string result = titleCaseStr.Replace("_", "").Replace(" ", "");

    return result;
}
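
One thing to watch with the ToTitleCase approach: words that are entirely uppercase are treated as acronyms and left alone, so PRICE_OLD would come back as PRICEOLD. A possible variant, assuming it is acceptable to lowercase the input first (UpperCaseInput is a hypothetical name, and the invariant culture is an assumption):

```csharp
using System.Globalization;

public static class UpperCaseInput
{
    // Hypothetical variant: lower-case first so that all-uppercase
    // words are not skipped as acronyms by ToTitleCase.
    public static string ToPascalCase(string status)
    {
        var textInfo = CultureInfo.InvariantCulture.TextInfo;
        var titleCaseStr = textInfo.ToTitleCase(status.ToLowerInvariant());
        return titleCaseStr.Replace("_", "").Replace(" ", "");
    }
}
```

With this, UpperCaseInput.ToPascalCase("PRICE_OLD") gives PriceOld.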

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow