Question

I am trying to write a function that gets a user's profile ID or username from Facebook. Users enter their URL into a form, and I then try to work out whether it is a Facebook profile page or some other page. The problem is that if they enter an app page or another page on a subdomain, I would like to ignore that request.

Right now I have:

    $author_url = "http://facebook.com/profile?id=12345";
    $grav_url = "";
    if (preg_match("/facebook/i", $author_url)) {
        $parse_author_url = parse_url($author_url);
        // 'query' is absent when the URL has no query string
        $parse_author_url_q = isset($parse_author_url['query']) ? $parse_author_url['query'] : '';
        if (preg_match('/id[=]([0-9]*)/', $parse_author_url_q, $match)) {
            $fb_id = "/" . $match[1];
        } else {
            $fb_id = $parse_author_url['path'];
        }
        $grav_url = "http://graph.facebook.com" . $fb_id . "/picture?type=square";
    }
    echo $grav_url;

This works: if $author_url has "id=", that value is used as the profile ID; if not, it must be a username or page name, so the path is used instead. I need one more check: if the URL contains facebook but is on a subdomain, ignore it. I believe I can do that in the first preg_match, preg_match("/facebook/i",$author_url).

Thanks!

Solution

To ignore Facebook subdomains, you can ensure that

$parse_author_url['host']

is facebook.com.

If it's anything else, like login.facebook.com or apps.facebook.com, you need not proceed.
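A minimal sketch of that host check, assuming the submitted URL always includes a scheme (parse_url only reports a 'host' when one is present). The function name is_facebook_root_host is hypothetical, and accepting only the bare domain is an assumption; www.facebook.com could be added to the list if needed.

```php
<?php
// Hypothetical helper: true only when the URL's host is exactly facebook.com.
function is_facebook_root_host(string $url): bool
{
    // Returns null when there is no host, false when the URL is seriously malformed.
    $host = parse_url($url, PHP_URL_HOST);
    if (!is_string($host)) {
        return false;
    }
    // Assumption: only the bare domain is accepted.
    $allowed = ['facebook.com'];
    return in_array(strtolower($host), $allowed, true);
}

var_dump(is_facebook_root_host('http://facebook.com/profile?id=12345')); // bool(true)
var_dump(is_facebook_root_host('http://apps.facebook.com/someapp'));     // bool(false)
```

Comparing the parsed host against an explicit list avoids matching "facebook" anywhere else in the URL, such as in a path or query string.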

Alternatively, you can ensure that the URL begins with facebook.com, optionally preceded by http:// or https://, by anchoring the pattern at the start of the string:

if(preg_match('@^(?:https?://)?facebook\.com(?:/|$)@i',$author_url)){
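As a rough sketch of how a start-anchored pattern could slot into the asker's flow, the hypothetical function facebook_picture_url below returns the Graph picture URL, or null when the input is not on the bare facebook.com host. The function name, the null return and the handling of scheme-less input are all assumptions made for illustration.

```php
<?php
// Hypothetical helper: builds the Graph picture URL, or returns null for
// anything that is not on the bare facebook.com host.
function facebook_picture_url(string $author_url): ?string
{
    // Anchored: optional scheme, then facebook.com, then "/" or end of string.
    if (!preg_match('@^(?:https?://)?facebook\.com(?:/|$)@i', $author_url)) {
        return null;
    }
    // parse_url() needs a scheme to recognise the host, so add one if missing.
    if (!preg_match('@^https?://@i', $author_url)) {
        $author_url = 'http://' . $author_url;
    }
    $parts = parse_url($author_url);
    $query = $parts['query'] ?? '';
    if (preg_match('/(?:^|&)id=([0-9]+)/', $query, $match)) {
        $fb_id = '/' . $match[1];                   // numeric profile id
    } else {
        $fb_id = rtrim($parts['path'] ?? '', '/');  // username or page name
    }
    if ($fb_id === '') {
        return null;                                // nothing to look up
    }
    return 'http://graph.facebook.com' . $fb_id . '/picture?type=square';
}

var_dump(facebook_picture_url('http://facebook.com/profile?id=12345'));
var_dump(facebook_picture_url('http://apps.facebook.com/someapp')); // NULL
```

Because the pattern is anchored with ^, a subdomain such as apps.facebook.com no longer matches, whereas an unanchored search for "facebook" would accept it.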

OTHER TIPS

This isn't a direct solution to what you asked, but the parts needed to do it are here.

I found that a subdomain caused an issue with parse_url. Namely, it returned an array with only $result['path'] and no 'host' or 'scheme'.

My theory is that if parse_url returns no 'host' or 'scheme' and the string contains a domain suffix (.ext), it is a subdomain.
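For context, here is a small sketch of what parse_url returns in the two cases. It suggests the missing 'host' comes from the missing scheme rather than from the subdomain itself; the example URLs are made up for illustration.

```php
<?php
// With a scheme, parse_url() reports the host even for a subdomain.
$with_scheme = parse_url('http://apps.facebook.com/someapp');

// Without a scheme, the whole string is treated as a path.
$without_scheme = parse_url('apps.facebook.com/someapp');

var_dump(isset($with_scheme['host']));    // bool(true)
var_dump(isset($without_scheme['host'])); // bool(false)
echo $without_scheme['path'], PHP_EOL;    // apps.facebook.com/someapp
```

This is why a string with only a 'path' needs a second test, such as the suffix check described above, before it can be treated as a host name rather than a relative path.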

Here is the code ($src is a URL; I had to tell relative src values apart from subdomains):

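$srcA = parse_url( $src );
//..if there is no scheme or host, test whether it is a subdomain.
//..empty() avoids undefined-index notices when the keys are missing.
if( empty( $srcA['scheme'] ) && empty( $srcA['host'] ) ){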
    //..this string / array is set elsewhere but for this example I will put it here
    $tld = "AC,AD,AE,AERO,AF,AG,AI,AL,AM,AN,AO,AQ,AR,ARPA,AS,ASIA,AT,AU,AW,AX,AZ,BA,BB,BD,BE,BF,BG,BH,BI,BIZ,BJ,BM,BN,BO,BR,BS,BT,BV,BW,BY,BZ,CA,CAT,CC,CD,CF,CG,CH,CI,CK,CL,CM,CN,CO,COM,COOP,CR,CU,CV,CW,CX,CY,CZ,DE,DJ,DK,DM,DO,DZ,EC,EDU,EE,EG,ER,ES,ET,EU,FI,FJ,FK,FM,FO,FR,GA,GB,GD,GE,GF,GG,GH,GI,GL,GM,GN,GOV,GP,GQ,GR,GS,GT,GU,GW,GY,HK,HM,HN,HR,HT,HU,ID,IE,IL,IM,IN,INFO,INT,IO,IQ,IR,IS,IT,JE,JM,JO,JOBS,JP,KE,KG,KH,KI,KM,KN,KP,KR,KW,KY,KZ,LA,LB,LC,LI,LK,LR,LS,LT,LU,LV,LY,MA,MC,MD,ME,MG,MH,MIL,MK,ML,MM,MN,MO,MOBI,MP,MQ,MR,MS,MT,MU,MUSEUM,MV,MW,MX,MY,MZ,NA,NAME,NC,NE,NET,NF,NG,NI,NL,NO,NP,NR,NU,NZ,OM,ORG,PA,PE,PF,PG,PH,PK,PL,PM,PN,POST,PR,PRO,PS,PT,PW,PY,QA,RE,RO,RS,RU,RW,SA,SB,SC,SD,SE,SG,SH,SI,SJ,SK,SL,SM,SN,SO,SR,ST,SU,SV,SX,SY,SZ,TC,TD,TEL,TF,TG,TH,TJ,TK,TL,TM,TN,TO,TP,TR,TRAVEL,TT,TV,TW,TZ,UA,UG,UK,US,UY,UZ,VA,VC,VE,VG,VI,VN,VU,WF,WS,XXX,YE,YT,ZA,ZM,ZW";

    $tldA = explode( ',' , strtolower( $tld ) );

    $isSubdomain = false;
    foreach( $tldA as $suffix ){
        //..strict comparison, because strpos can legitimately return 0
        if( strpos( $src , '.'.$suffix ) !== false ){
            $isSubdomain = true;
            break;
        }
    }
    //..prefix with $host (set elsewhere) if it is not a subdomain.
    if( !$isSubdomain ){
        $src = $host . '/' . $srcA['path'];
    }

}

A further confirmation could be written by taking the part of each $isSubdomain == true string before the first '/' and testing its characters against a regex.
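One possible shape for that extra check, as a sketch only: the hypothetical function looks_like_hostname takes everything before the first '/' and tests it against a simple hostname pattern. The pattern is an assumption and is not a full validation of domain names.

```php
<?php
// Hypothetical helper: does the part of $src before the first "/" look like a hostname?
function looks_like_hostname(string $src): bool
{
    $first = explode('/', $src, 2)[0];
    // Dot-separated labels of letters, digits and hyphens, ending in an alphabetic suffix.
    $pattern = '/^(?:[a-z0-9](?:[a-z0-9-]*[a-z0-9])?\.)+[a-z]{2,}$/i';
    return (bool) preg_match($pattern, $first);
}

var_dump(looks_like_hostname('apps.facebook.com/someapp')); // bool(true)
var_dump(looks_like_hostname('images/logo.com.png'));       // bool(false)
```

Note that a bare relative file name such as logo.png would still pass this pattern, so it only makes sense in combination with the suffix list rather than as a replacement for it.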

Hope this helps some people out.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow