Given any string, I'd like to generate a sensible acronym that represents it. If any of you have used JIRA, it does this pretty well.

For example, given the word Phoenix it would generate PHX, and given the phrase Privacy Event Management it would generate PEM.

I've got some code that will accomplish the latter:

 string.Join(string.Empty, model.Name
                .Where(char.IsLetter)
                .Where(char.IsUpper))

but it doesn't account for the first case. It also doesn't handle a single word that is entirely lower case.

Any ideas? I'm using C# 4.5


Solution

I was able to extract the JIRA key generator and posted it here. It's pretty interesting, and even though it's JavaScript, it could easily be converted to C#.

Other tips

For the Phoenix => PHX, I think you'll need to check the strings against a dictionary of known abbreviations. As for the multiple word/camel-case support, regex is your friend!

// requires: using System.Linq; using System.Text.RegularExpressions;
var text = "A Big copy DayEnergyFree good"; // abbreviation should be "ABCDEFG"
var pattern = @"((?<=^|\s)(\w{1})|([A-Z]))";
var acronym = string.Join(string.Empty, Regex.Matches(text, pattern).OfType<Match>().Select(x => x.Value.ToUpper()));

Let me explain what's happening here, starting with the regex pattern, which covers a few cases for matching substrings.

// must be directly after the beginning of the string or line "^" or a whitespace character "\s"
(?<=^|\s)
// match just one letter that is part of a word
(\w{1})
// if the previous requirements are not met
|
// match any upper-case letter
([A-Z])

The Regex.Matches method returns a MatchCollection, which is basically a non-generic ICollection, so to use LINQ expressions we call OfType<Match>() to convert the MatchCollection into an IEnumerable<Match>.

Regex.Matches(text, pattern).OfType<Match>()

Then we select only the value of the match (we don't need the other regex matching meta-data) and convert it to upper-case.

Select(x => x.Value.ToUpper())
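
To tie the two ideas together, here is a minimal sketch, assuming a small hand-maintained lookup table for single words such as Phoenix, with the regex as the fallback. The names AcronymSketch, KnownAbbreviations and Abbreviate are made up for illustration, and the table content is a placeholder rather than anything JIRA is known to use. The pattern is a simplified form of the one above without the capture groups.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

public static class AcronymSketch
{
    // Hypothetical lookup table; the single entry is illustrative only.
    private static readonly Dictionary<string, string> KnownAbbreviations =
        new Dictionary<string, string>(StringComparer.OrdinalIgnoreCase)
        {
            { "Phoenix", "PHX" }
        };

    // A word character at the start of the string or after whitespace,
    // or any upper-case ASCII letter anywhere.
    private const string Pattern = @"(?<=^|\s)\w|[A-Z]";

    public static string Abbreviate(string text)
    {
        if (string.IsNullOrWhiteSpace(text))
        {
            return string.Empty;
        }

        string trimmed = text.Trim();

        // Known names win over the generic rule.
        string known;
        if (KnownAbbreviations.TryGetValue(trimmed, out known))
        {
            return known;
        }

        // Fallback: concatenate every match, upper-cased.
        return string.Concat(
            Regex.Matches(trimmed, Pattern)
                 .Cast<Match>()
                 .Select(m => m.Value.ToUpperInvariant()));
    }
}
```

With this sketch, Abbreviate("Privacy Event Management") goes through the regex branch, while Abbreviate("Phoenix") is answered from the table. Note that \w also matches digits and underscores, so a word starting with a digit contributes that digit.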

Here is a simple function that generates an acronym. Basically, it puts a letter or digit into the acronym when the character before it is a space. If there are no spaces in the string, the string is returned as is (stripped of punctuation). It does not capitalize the letters in the acronym, but that is easy to amend.

You can just copy it into your code and start using it.

Some example results:

Deloitte Private Pty Ltd - DPPL
Clearwater Investment Co Pty Ltd (AC & CC Family Trust) - CICPLACFT
ASIC - ASIC


    private string Acronym(string value)
    {
        if (string.IsNullOrWhiteSpace(value))
        {
            return value;
        } else
        {
            var builder = new StringBuilder();
            foreach(char c in value)
            {
                if (char.IsWhiteSpace(c) || char.IsLetterOrDigit(c))
                {
                    builder.Append(c);
                }
            }
            string trimmedValue = builder.ToString().Trim();
            builder.Clear();
            if (trimmedValue.Contains(' '))
            {
                for(int charIndex = 0; charIndex < trimmedValue.Length; charIndex++)
                {
                    if (charIndex == 0)
                    {
                        builder.Append(trimmedValue[0]);
                    } else
                    {
                        char currentChar = trimmedValue[charIndex];
                        char previousChar = trimmedValue[charIndex - 1];
                        if (char.IsLetterOrDigit(currentChar) && char.IsWhiteSpace(previousChar))
                        {
                            builder.Append(trimmedValue[charIndex]);
                        }
                    }
                }
                return builder.ToString();
            } else
            {
                return trimmedValue;
            }
        }
    }
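
As for the capitalization mentioned above, one possible amendment is sketched below, assuming LINQ is acceptable. InitialsSketch and UpperAcronym are hypothetical names, and this is a compact rewrite rather than the original function: it splits on any whitespace (not only the space character) and upper-cases the result.

```csharp
using System;
using System.Linq;

public static class InitialsSketch
{
    public static string UpperAcronym(string value)
    {
        if (string.IsNullOrWhiteSpace(value))
        {
            return value;
        }

        // Keep only letters, digits and whitespace.
        var cleaned = new string(value
            .Where(c => char.IsLetterOrDigit(c) || char.IsWhiteSpace(c))
            .ToArray());

        // A null separator array makes Split use whitespace as the delimiter.
        string[] words = cleaned.Split((char[])null, StringSplitOptions.RemoveEmptyEntries);

        if (words.Length == 0)
        {
            return string.Empty;
        }

        // A single word is returned whole, only upper-cased.
        if (words.Length == 1)
        {
            return words[0].ToUpperInvariant();
        }

        // Otherwise take the first character of every word.
        return new string(words.Select(w => char.ToUpperInvariant(w[0])).ToArray());
    }
}
```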

I needed a code that never repeats, so I created the following method.

If you use it like this, you will get

HashSet<string> idHashSet = new HashSet<string>();
for (int i = 0; i < 100; i++)
{
    var eName = "China National Petroleum";
    Console.WriteLine($"count:{i+1},short name:{GetIdentifierCode(eName,ref idHashSet)}");
}

a code that is never repeated.

Here is the method:

/// <summary>
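/// Derives a short code from an English name. Initial letters are taken first, then further letters are taken from each word in turn; if the code still collides, it is padded with the default fill character (A).
/// TODO: when the name is Chinese, use its pinyin as the source for the code.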
/// </summary>
/// <param name="name"></param>
/// <param name="idHashSet"></param>
/// <returns></returns>
public static string GetIdentifierCode(string name, ref HashSet<string> idHashSet)
{
    var words = name;
    var fillChar = 'A';
    if (string.IsNullOrEmpty(words))
    {
        do
        {
            words += fillChar.ToString();

        } while (idHashSet.Contains(words));
    }

    //if (IsChinese)
    //{
    //    words = GetPinYin(words);
    //}

    // e.g. 中国石油天然气集团公司 (China National Petroleum)
    var sourceWord = new List<string>(words.Split(' '));
    var returnWord = sourceWord.Select(c => new List<char>()).ToList();


    int index = 0;
    do
    {
        var listAddWord = sourceWord[index];
        var addWord = returnWord[index];
        // if the code still collides once every letter is used, pad with the default fill character (A)
        if (sourceWord.All(c => string.IsNullOrEmpty(c)))
        {
            returnWord.Last().Add(fillChar);
            continue;
        }
        // skip a word once all of its characters have been taken
        else if (string.IsNullOrEmpty(listAddWord))
        {
            if (index == sourceWord.Count - 1)
                index = 0;
            else
            {
                index++;
            }
            continue;
        }

        if (addWord == null)
            addWord = new List<char>();
        string addString = string.Empty;

        // when the word is entirely upper case, take it whole instead of splitting it
        if (listAddWord.All(a => char.IsUpper(a)))
        {
            addWord = listAddWord.ToCharArray().ToList();
            returnWord[index] = addWord;
            addString = listAddWord;
        }
        else
        {
            addString = listAddWord.First().ToString();
            addWord.Add(listAddWord.First());
        }

        // drop only the characters just taken from the front of the word
        listAddWord = listAddWord.Substring(addString.Length);
        sourceWord[index] = listAddWord;
      
        if (index == sourceWord.Count - 1)
            index = 0;
        else
        {
            index++;
        }
    } while (idHashSet.Contains(string.Concat(returnWord.SelectMany(c => c))));

    words = string.Concat(returnWord.SelectMany(c => c));
    idHashSet.Add(words);

    return words;
}

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow