Question

On a custom PHP framework, I have implemented a mailing class that lets me know when a 404 occurs. It mails me the URL, referrer and UA string.

I am getting two types of unexplained 404 reports for URLs that are not linked anywhere on the site. This is happening quite often. I have tested on the exact browser versions that the reports originate from. I cannot find anything wrong, either in the HTML or in the JavaScript. These pages generally contain only a little bit of JavaScript, by the way.

Type1 examples:

source: http://www.example.com/articles/example-article
target (404): http://www.example.com/articles/undefined
User agents who have reported this:
- Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/30.0.1599.101 Safari/537.36
(Chrome 30.0.1599.101 on win7)
- Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.0; Trident/5.0; BOIE9;ENUSMSE)
(IE9 on win vista)
- Mozilla/5.0 (compatible; MSIE 10.0; Windows NT 6.2; WOW64; Trident/6.0;WUID=78780BB80C56415F887179239977F107;WTB=6581)
(IE10 on win 8)

Type2 examples:

source: http://www.example.com/articles/example-article
target (404): http://www.example.com/articles
User agents who have reported this:
- Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 6.1; Trident/4.0; SLCC2; .NET CLR 2.0.50727; .NET CLR 3.5.30729; .NET CLR 3.0.30729; Media Center PC 6.0; .NET CLR 1.1.4322; .NET4.0C; .NET4.0E)
(IE8 on win7)
- Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.2; SV1; .NET CLR 1.1.4325; .NET CLR 2.0.50727; .NET CLR 3.0.30729; .NET CLR 3.5.30707; MS-RTC LM 8)
(IE7 on windows server 2003)
- Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 5.1; Trident/4.0; GTB7.5; .NET CLR 1.1.4322 ; .NET CLR 2.0.50727; .NET CLR 3.0.4506.2152; .NET CLR 3.5.30729)
(IE8 on winXP)
- Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 6.0; Trident/4.0; GTB7.5; Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1) ; SLCC1; .NET CLR 2.0.50727; .NET C LR 1.1.4322; .NET CLR 3.5.30729; .NET CLR 3.0.30618; .NET4.0C; HYVES)
(IE8 on win vista)

Could anyone help me explain these? Is there perhaps a buggy Windows browser plugin that could be the cause? I have not seen any of these reports coming from operating systems other than Windows, although the sites do get quite a lot of visits from other OSes as well.

Cheers!

EDIT #1 I used useragentstring.com to explain the UA strings

EDIT #2 The answers of Palec, Fabio Beltramini and Artur have helped me further understand the issue, and I feel they all contributed equally. Since I can only accept/reward one answer, I have chosen to accept Palec's answer because he answered first. Thank you all very much for thinking along. If I come across anything noteworthy during debugging, I will add it here.

Solution

Possible explanations are:

  • broken JS code, which is hard to trigger
  • broken browser plugin
  • empty link target (<a href="">)
  • a bot trying weird URLs for some reason

An "undefined" in the URL is a typical sign of broken JavaScript. An unwanted reference to the (logical) folder containing the current document is often caused by IE's infamous bug: it interprets an empty path not as the current document but as . (the containing folder), so an empty link target works differently in IE than in other browsers. Bot-related errors are a story in themselves; I can only add that it is not uncommon for bots to make up both the requested path and the referer.
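To make the two URL patterns concrete, here is a minimal sketch of how they could arise, using the standard URL constructor to mimic relative URL resolution. The helper name resolveHref is hypothetical, and the base URL is the example one from the question. Note that resolving "." yields a trailing slash, which the reported Type2 target does not have, so this only approximates what the affected browsers actually request.

```javascript
// Example page URL taken from the question's reports.
const base = "http://www.example.com/articles/example-article";

// resolveHref is a hypothetical helper: it resolves an href/src value
// against the page URL the way a standards-compliant browser would.
function resolveHref(href, pageUrl) {
  // String(undefined) is "undefined", which is what ends up being
  // resolved when a script assigns an undefined variable to href or src.
  return new URL(String(href), pageUrl).href;
}

// A script bug: an undefined value used as a relative URL.
const fromUndefined = resolveHref(undefined, base);

// Standards behavior for an empty link target: the current document.
const fromEmpty = resolveHref("", base);

// What the IE behavior described above amounts to: the containing folder.
const fromDot = resolveHref(".", base);

console.log(fromUndefined); // http://www.example.com/articles/undefined
console.log(fromEmpty);     // http://www.example.com/articles/example-article
console.log(fromDot);       // http://www.example.com/articles/
```

If the Type1 reports match the first result, searching the page scripts for places where a variable is assigned to href, src or location is probably a reasonable next step.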

Two Stack Overflow questions supporting my conjecture about broken plugin:

Details on IE’s empty href bug (official resource linked there):

OTHER TIPS

I know this problem, as I have investigated it in the past.

I administer a server where this is visible. There is a small, constant group of users that generates these requests. They always originate from MSIE 7, 8 and 9 browsers. Thousands of other users may use the same browser, look at the same site and do the same things, and everything works as expected.

It is solely related to the combination of host, browser and libraries, hence there is nothing you can do about it, as it is on the user's side.

As with 99.9% of web issues, it is related to Internet Explorer, period. You must live with that.

Small update:

Although you see these .../undefined requests that suggest users go there, the users are completely unaware of it. I have asked users from whom these requests originate, and none of them was aware of it or had seen anything like .../undefined with a 404 error in their browser. So it is most probably something happening in the background.

It's hard to say without seeing an actual example page. Most likely a JavaScript function is dynamically determining the URL of some on-page resource or link, and differing JS behavior in some browsers is causing the variable to be undefined.

Note that this does not necessarily mean the user navigated to the page. Maybe the browser just tried to load an image from this URL.

Test this in a browser and see what network requests you see:

// An undefined value assigned to src is converted to the string "undefined".
var a = undefined;
var i = new Image();
i.src = a;
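If the test above does produce a request for .../undefined, one possible mitigation is to guard assignments to src or href. The following is only a sketch under that assumption; isUsableUrl and setImageSource are hypothetical helper names, not part of any library, and the plain object stands in for an Image so the snippet also runs outside a browser.

```javascript
// isUsableUrl is a hypothetical guard: it rejects values that would
// otherwise be stringified into a bogus relative URL such as "undefined".
function isUsableUrl(value) {
  return (
    typeof value === "string" &&
    value.trim() !== "" &&
    value !== "undefined" &&
    value !== "null"
  );
}

// setImageSource assigns src only when the value passes the guard and
// reports whether the assignment happened.
function setImageSource(image, value) {
  if (!isUsableUrl(value)) {
    return false; // skip the assignment, so no request is triggered
  }
  image.src = value;
  return true;
}

// Stand-in for `new Image()` so the sketch does not need a DOM.
const fakeImage = {};
const skipped = setImageSource(fakeImage, undefined);
const applied = setImageSource(fakeImage, "/img/photo.png"); // hypothetical path
console.log(skipped, applied, fakeImage.src);
```

A guard like this hides the symptom rather than fixing the script that produced the undefined value, so logging the rejected cases would probably be useful as well.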

One helpful way to narrow down the source of the problem would be to also log the "Accept" HTTP header. That way you can tell whether the request is the result of navigation (it would have a value similar to "Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8") or a request for some on-page resource such as an image ("Accept: image/webp,*/*;q=0.8").
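As a rough illustration of that idea, the logged Accept value could be classified with a small heuristic like the one below. The function name classifyAccept and its return labels are assumptions made for this sketch. Browsers differ in what they send, and older Internet Explorer versions in particular may not follow this pattern, so the result is a hint rather than proof.

```javascript
// classifyAccept is a hypothetical helper that guesses what kind of
// request produced a given Accept header value.
function classifyAccept(acceptHeader) {
  if (typeof acceptHeader !== "string" || acceptHeader.trim() === "") {
    return "unknown"; // header missing or empty
  }

  // Split into media types and drop parameters such as ";q=0.8".
  const types = acceptHeader
    .split(",")
    .map((part) => part.split(";")[0].trim().toLowerCase())
    .filter(Boolean);

  if (types.includes("text/html")) {
    return "navigation"; // the browser asked for an HTML document
  }
  if (types.some((type) => type.startsWith("image/"))) {
    return "image"; // likely an <img> or `new Image()` request
  }
  return "other"; // e.g. a bare "*/*" from scripts, XHR or some bots
}

const navigationKind = classifyAccept(
  "text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8"
);
const imageKind = classifyAccept("image/webp,*/*;q=0.8");
console.log(navigationKind, imageKind);
```

The label could be added to the 404 mail next to the URL, referrer and UA string.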

In my case, I was getting the following error thrown in IE9 and IE10:

User Agent: Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 6.1; WOW64; Trident/5.0; SLCC2; .NET CLR 2.0.50727; .NET CLR 3.5.30729; .NET CLR 3.0.30729; Media Center PC 6.0)
Timestamp: Wed, 30 Jul 2014 17:07:13 UTC
Message: Syntax error
Line: 1
Char: 1
Code: 0
URI: https://your.website.com/js/jquery/plugins/jqgrid/v452/js/i18n/grid.locale-en.js

After debugging, it boiled down to a simple path issue. The resource being referenced, "grid.locale-en.js", resided in a different path, i.e. the "js" after "v452" was not there. Correcting the reference to point to the right path resolved the issue.

Hope this helps.

Licensed under: CC-BY-SA with attribution