Online URL classifier

https://stackoverflow.com/questions/1881815

18-09-2019

Question

I want to write an online application that:

1. reads the URL from the address bar of the browser
2. extracts its lexical features (like n-grams)
3. extracts its host-based features (fetches DNS records online: the A, PTR, and TTL fields)
4. classifies the URL as malicious or benign (using machine learning)

Can anyone help me with 1 and 3?

Solution

I don't believe this application is a task you can accomplish, as you can't really determine a site's content based on its URL alone.

Look at something like the Mozilla Phishing Protection design documentation and the Google Safe Browsing spec instead.

OTHER TIPS

I don't know which language you are planning to use.

For item 1, here is a .NET library that may be helpful:

http://msdn.microsoft.com/en-us/library/system.web.httputility.aspx
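
If Python is an option, the following is a rough sketch of the parsing and feature-extraction steps, not a definitive implementation. It assumes the URL has already been obtained as a string (reading it from the address bar itself would need something like a browser extension, which is not shown). The function names `char_ngrams`, `lexical_features`, and `host_features` are made up for illustration; only `urllib.parse.urlsplit`, `socket.gethostbyname_ex`, and `socket.gethostbyaddr` are standard-library calls. The `socket` module does not expose TTL values, so a dedicated DNS library would be needed for that field.

```python
import socket
from urllib.parse import urlsplit


def char_ngrams(text, n=3):
    """Return the overlapping character n-grams of text, in order."""
    return [text[i:i + n] for i in range(len(text) - n + 1)]


def lexical_features(url, n=3):
    """Split a URL into its parts and compute a few simple lexical features."""
    parts = urlsplit(url)
    host = parts.hostname or ""
    return {
        "scheme": parts.scheme,
        "host": host,
        "path": parts.path,
        "url_length": len(url),
        "host_dots": host.count("."),
        "host_ngrams": char_ngrams(host, n),
    }


def host_features(host,
                  resolve=socket.gethostbyname_ex,
                  reverse=socket.gethostbyaddr):
    """Look up the A records for host and the PTR name of the first address.

    The resolver functions are parameters (an assumption of this sketch) so
    they can be swapped out, for example in tests. TTL is not available here.
    """
    try:
        _, _, addresses = resolve(host)
    except OSError:
        # Covers socket.gaierror and socket.herror as well.
        return {"a_records": [], "ptr": None}

    ptr = None
    if addresses:
        try:
            ptr = reverse(addresses[0])[0]
        except OSError:
            ptr = None
    return {"a_records": addresses, "ptr": ptr}
```

Calling `lexical_features("http://example.com/a/b?x=1")` gives a dictionary whose `"host"` entry is `"example.com"`, and `host_features("example.com")` performs live DNS lookups, so its result depends on the network. The resulting features would still have to be fed to whatever classifier is chosen for step 4.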

Licensed under: CC-BY-SA with attribution
