Challenge: elegantly LINQify this procedural code
04-07-2019
Question
string[] filesOfType1 = GetFileList1();
string[] filesOfType2 = GetFileList2();
var cookieMap = new Dictionary<string, CookieContainer>();
Action<string, Func<string, KeyValuePair<string, CookieContainer>>> addToMap = (filename, pairGetter) =>
{
    KeyValuePair<string, CookieContainer> cookiePair;
    try
    {
        cookiePair = pairGetter(filename);
    }
    catch
    {
        // Note the failure and move on to the next file.
        Console.WriteLine("An error was encountered while trying to read " + filename + ".");
        return;
    }
    if (cookieMap.ContainsKey(cookiePair.Key))
    {
        // Same key seen before: keep whichever container holds more cookies.
        if (cookiePair.Value.Count > cookieMap[cookiePair.Key].Count)
        {
            cookieMap[cookiePair.Key] = cookiePair.Value;
        }
    }
    else
    {
        cookieMap.Add(cookiePair.Key, cookiePair.Value);
    }
};
foreach (string file in filesOfType1)
{
    addToMap(file, GetType1FileCookiePair);
}
foreach (string file in filesOfType2)
{
    addToMap(file, GetType2FileCookiePair);
}
Salient features that must be preserved:
- Files of type 1 are more important than files of type 2; i.e. if a file of type 1 maps to a (key, value1) combination and a file of type 2 maps to a (key, value2) combination, then we add (key, value1) to cookieMap and not (key, value2). Edit: as pointed out by Bevan, this is not satisfied by my original procedural code.
- Secondarily, CookieContainers with a higher Count have higher priority, i.e. if there are two (key, value) combos for the same key and both from the same filetype, we choose the one with the higher value.Count.
- Per-case exception handling is a must; screwing up a single file-reading should just allow us to note that and continue.
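The first two rules above can be read as a single ordering within each key: source type first, then Count. The following is only a sketch under assumptions, not the asker's code: the class name CookieMapSketch, the method Build, and the sourceRank tag (1 for a file of type 1, 2 for a file of type 2) are hypothetical names introduced for illustration. The third rule (per-file exception handling) is assumed to have been applied before the pairs reach this method.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Net;

public static class CookieMapSketch
{
    // Hypothetical helper. Each pair is tagged with a sourceRank:
    // 1 = came from a file of type 1, 2 = came from a file of type 2.
    public static Dictionary<string, CookieContainer> Build(
        IEnumerable<(int sourceRank, KeyValuePair<string, CookieContainer> pair)> tagged)
    {
        return tagged
            .GroupBy(t => t.pair.Key)
            .Select(g => g
                .OrderBy(t => t.sourceRank)                 // rule 1: type 1 beats type 2
                .ThenByDescending(t => t.pair.Value.Count)  // rule 2: then the higher Count wins
                .First()
                .pair)
            .ToDictionary(p => p.Key, p => p.Value);
    }
}
```

Under this ordering, a type 2 container never displaces a type 1 container for the same key, however many cookies it holds; Count only breaks ties between pairs from the same file type.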
My best attempt started like this:
var cookieMap = (filesOfType1.Select(file => GetType1FileCookiePair(file))
.Concat(filesOfType2.Select(file => GetType2FileCookiePair(file))))
.GroupBy(pair => pair.Key)
.Select(/* some way of selecting per the above bullets */)
.ToDictionary(pair => pair.Key, pair => pair.Value);
But it's inelegant and filling in that comment block seems like a bitch. Right now I'm happy to stay procedural, but I thought that it might be a fun challenge to see if people can come up with something really clever.
Solution
Here's my attempt. It seemed simplest to split the task into three distinct statements.
I'm using a helper function that returns a default value if the action throws an exception; for consistency with the answer from Omer van Kloeten, I've called this Swallow().
Also, I'm not using the LINQ query syntax, just the extension methods provided by System.Linq.Enumerable.
Lastly, note that this is uncompiled, so take it as intent.
// Handle all files of type 1. Swallow returns default(T) on failure, and a
// default KeyValuePair has a null Key, so that is what we filter on.
var pairsOfType1 =
    filesOfType1
        .Select(file => Swallow(() => GetType1FileCookiePair(file)))
        .Where(pair => pair.Key != null)
        .ToList(); // materialize so the files are only read once

// Handle files of type 2 and filter out those with keys already provided by type 1
var pairsOfType2 =
    filesOfType2
        .Select(file => Swallow(() => GetType2FileCookiePair(file)))
        .Where(pair => pair.Key != null)
        .Where(pair => !pairsOfType1.Any(p => p.Key == pair.Key));

// Merge the two sets, keeping only the pairs with the highest count
var cookies =
    pairsOfType1
        .Union(pairsOfType2)
        .GroupBy(pair => pair.Key)
        .Select(group => group.OrderBy(pair => pair.Value.Count).Last())
        .ToDictionary(pair => pair.Key, pair => pair.Value);
OTHER TIPS
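// Alias directives go at the top of the file and need fully qualified type names.
using CookiePair = System.Collections.Generic.KeyValuePair<string, System.Net.CookieContainer>;
using CookieDictionary = System.Collections.Generic.Dictionary<string, System.Net.CookieContainer>;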
Func<string[], Func<string, CookiePair>, IEnumerable<CookiePair>> getCookies =
    ( files, pairGetter ) =>
        files.SelectMany( filename => {
            try { return new[] { pairGetter( filename ) }; }
            catch { Console.WriteLine( "..." ); return new CookiePair[0]; }
        } );
var type1Cookies = getCookies( filesOfType1, GetType1FileCookiePair ).ToArray( );
var type1CookieNames = type1Cookies.Select( p => p.Key ).ToArray( );
var type2Cookies = getCookies( filesOfType2, GetType2FileCookiePair )
    .Where( p => !type1CookieNames.Contains( p.Key ) );
var cookieMap = type1Cookies.Concat( type2Cookies )
    .Aggregate( new CookieDictionary( ), ( d, p ) => {
        if( !d.ContainsKey( p.Key ) || p.Value.Count > d[p.Key].Count )
            d[p.Key] = p.Value;
        return d;
    } );
Edit: Updated cookie retrieval to satisfy the "files of type 1 are more important than those of type 2" requirement.
Excuse my not actually going ahead and compiling this, but this is the way I'd go about doing it:
var cookieMap = (from pair in
                     (from f1 in filesOfType1
                      select Swallow(() => GetType1FileCookiePair(f1)))
                     .Concat(from f2 in filesOfType2
                             select Swallow(() => GetType2FileCookiePair(f2)))
                     // Swallow returns default(T) on failure; a default pair has a null Key.
                     .Where(p => p.Key != null)
                 group pair by pair.Key into g
                 select g)
                .ToDictionary(g => g.Key, g => g.Select(pair => pair.Value)
                                                .OrderByDescending(value => value.Count)
                                                .First());
Swallow follows:
private static T Swallow<T>(Func<T> getT)
{
    try { return getT(); } catch { }
    return default(T);
}
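Since the question asks that a failed read be noted before continuing, a variant of Swallow that reports the exception instead of discarding it silently might look like the sketch below. The names SwallowSketch, SwallowAndReport and the onError callback are hypothetical and not part of the original answer.

```csharp
using System;

public static class SwallowSketch
{
    // Hypothetical variant of Swallow: hands the exception to onError
    // and then returns default(T), so the caller can log and continue.
    public static T SwallowAndReport<T>(Func<T> getT, Action<Exception> onError)
    {
        try
        {
            return getT();
        }
        catch (Exception ex)
        {
            onError(ex);
            return default(T);
        }
    }
}
```

In the query above, a call could then pass something like ex => Console.WriteLine("An error was encountered while trying to read " + f1 + ".") as the second argument.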
Love me a good LINQ.
- Edit: Added a Swallow method that will swallow all exceptions.
- Edit 2: Compiled, altered, etc. Added Swallow. Now works as intended.