There are thousands of crawlers; the user agent parser included in the .NET Framework can handle only a few of them and doesn't keep an updated list.
Install this NuGet package: it provides a semantic user agent parser, and the library is actively maintained.
You can initialize the parser with this code:
public static class YauaaSingleton
{
    private static UserAgentAnalyzer.UserAgentAnalyzerBuilder Builder { get; }

    // Lazy so the (expensive) analyzer is built only on first use.
    private static readonly Lazy<UserAgentAnalyzer> analyzer =
        new Lazy<UserAgentAnalyzer>(() => Builder.Build());

    public static UserAgentAnalyzer Analyzer
    {
        get { return analyzer.Value; }
    }

    static YauaaSingleton()
    {
        Builder = UserAgentAnalyzer.NewBuilder();
        Builder.DropTests();           // don't load the built-in test cases
        Builder.DelayInitialization(); // defer matcher initialization until first parse
        Builder.WithCache(100);        // cache the 100 most recently parsed user agents
        Builder.HideMatcherLoadStats();
        Builder.WithAllFields();       // extract every available field
    }
}
Then it's very easy to use:
private bool IsValidCrawler(HttpRequestBase request)
{
    var ua = YauaaSingleton.Analyzer.Parse(request.UserAgent);
    var deviceClass = UserAgentClassifier.GetDeviceClass(ua);
    return deviceClass == DeviceClass.Robot
        || deviceClass == DeviceClass.RobotMobile
        || deviceClass == DeviceClass.RobotImitator;
}
- Robot: a normal crawler
- RobotMobile: a crawler that emulates a mobile device
- RobotImitator: not a real crawler, but something that imitates one
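As an illustrative sketch of how the classification behaves (assuming the Yauaa NuGet package is referenced and the `YauaaSingleton` class above is in scope; the `using` namespaces and the Googlebot user agent string are assumptions, not taken from the package docs):

```csharp
using System;
// Assumed namespaces from the Yauaa .NET package; adjust to the package version you install.
using OrbintSoft.Yauaa.Analyzer;
using OrbintSoft.Yauaa.Classify;

public static class CrawlerCheckExample
{
    public static void Main()
    {
        // A well-known crawler user agent string (Googlebot), used here only as an example.
        const string botUa =
            "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)";

        var ua = YauaaSingleton.Analyzer.Parse(botUa);
        var deviceClass = UserAgentClassifier.GetDeviceClass(ua);

        // For a user agent like this you would expect one of the robot classes.
        Console.WriteLine(deviceClass);
    }
}
```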
If you want, you can also use:
var isHuman = UserAgentClassifier.IsHuman(ua);
This way you also handle hacked user agents and other anomalous cases.