Question

I have a client with a website that looks as if it has been hacked. Random pages throughout the site will (seemingly at random) automatically forward to a youtube video. This happens for a while (not sure how long yet... still trying to figure that out) and then the redirect disappears. May have something to do with our site caching though. Regardless, the client isn't happy about it.

I'm searching the code base (this is a WordPress site, but this question was generic enough that I put it here instead of in the WordPress groups...) for "base64_decode" but not having any luck.

So, since I know the specific URL that the site is getting forwarded to every time, I thought I'd search for the video id that is in the YouTube URL. This method could also be pertinent when the hack-inserted base64'd string is assigned to a variable and then that variable is decoded (so a grep for "base64_decode" wouldn't necessarily come up with any answers that looked suspicious).

So, what I'm wondering is if there's a way to search for a substring of a string that has been base64'd and then inserted into the code. Like, take the substring I'm searching for, base64 it, and then search the code base for the resultant string. (Maybe after manipulating it slightly?)

Is there a way to do that? Is that method even valid? I don't really have any idea how the whole base64 algorithm works, or if this is possible, so I thought I'd quickly throw the question out here to see if anyone else did.

Solution

Nothing to it (for somebody with the chutzpah to call himself "Programmer Dan").

Well, maybe a little. You have to know the encoding for the values 0 to 63.

In general, encoding to Base64 is done by taking three 8-bit characters of plain text at a time, breaking those bits into four sets of 6-bit numbers, and creating four characters of encoded text by converting the numbers (0 to 63) to arbitrary characters. Actually, the encoded characters aren't completely arbitrary, as they must be acceptable to pretty much ANY method of transmission, since that's the original reason for using Base64 encoding. The system I usually work with uses {A..Z,a..z,0..9,+,/} in that order.
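To make the three-bytes-to-four-characters step concrete, here is a minimal sketch in Python. It assumes the standard alphabet listed above, and the helper name `encode_triplet` is made up for illustration. It only handles a full 3-byte group and leaves padding aside.

```python
import base64

# Standard Base64 alphabet, in the order given above: A..Z, a..z, 0..9, +, /
ALPHABET = (
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    "abcdefghijklmnopqrstuvwxyz"
    "0123456789+/"
)


def encode_triplet(chunk: bytes) -> str:
    """Encode exactly three bytes into four Base64 characters (hypothetical helper)."""
    if len(chunk) != 3:
        raise ValueError("this sketch only handles full 3-byte groups")
    # Pack the three 8-bit bytes into one 24-bit integer.
    bits = (chunk[0] << 16) | (chunk[1] << 8) | chunk[2]
    # Slice the 24 bits into four 6-bit numbers (0 to 63), high bits first.
    indices = [(bits >> shift) & 0x3F for shift in (18, 12, 6, 0)]
    # Look each number up in the alphabet.
    return "".join(ALPHABET[i] for i in indices)


if __name__ == "__main__":
    # Compare the hand-rolled result with the standard library's encoder.
    print(encode_triplet(b"Man"), base64.b64encode(b"Man").decode("ascii"))
```

Because every encoded character mixes bits from at most two neighbouring bytes, where the plain text starts relative to a 3-byte boundary changes what the encoded text looks like. That is why a search needs more than one pattern.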

If one wanted to be nasty (which one might expect in the case you're dealing with), one might change the order, or even the characters, during the process. Of course, if you have examples of the encoded Base64, you can see what the character set is (unless the encoding uses more than 64 characters). But you still have the possibility of things like changing the order as you encode or decode (simple rotation, for example). But, I digress. The question is about searching for encoded text, not deciphering deliberate obfuscation. I could tell you a lot about that, too.
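If you do suspect a reordered alphabet, one way to handle it is to encode normally and then map each character from the standard alphabet to the suspected one. The sketch below assumes a simple rotation by 13 positions purely as an illustration. The names `encode_with_alphabet` and `ROTATED` are hypothetical, and a real attacker's alphabet would have to be recovered from the samples you find.

```python
import base64

STANDARD = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/"

# Hypothetical obfuscated alphabet: the standard one rotated by 13 positions.
ROTATED = STANDARD[13:] + STANDARD[:13]


def encode_with_alphabet(data: bytes, alphabet: str) -> str:
    """Base64-encode data, then substitute a custom 64-character alphabet."""
    if len(alphabet) != 64 or len(set(alphabet)) != 64:
        raise ValueError("alphabet must contain 64 distinct characters")
    # Map position i of the standard alphabet to position i of the custom one.
    table = str.maketrans(STANDARD, alphabet)
    # '=' padding is not in the table, so it passes through unchanged.
    return base64.b64encode(data).decode("ascii").translate(table)


if __name__ == "__main__":
    print(encode_with_alphabet(b"Man", STANDARD))
    print(encode_with_alphabet(b"Man", ROTATED))
```

The same translation table can be applied to any search pattern built for the standard alphabet, so the search method itself does not change.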

Simple methodology:

  1. Encode the plain text you're looking for. If the encoding results in one or two equal signs (padding) at the end, eliminate them and the last encoded character that precedes them. Search for the result.

  2. Same as (1) except stick a blank on the front of your plain text. Eliminate the first two encoded characters. Search for the result.

  3. Same as (2) except with two blanks on the front, and eliminate the first three encoded characters, because the third one still carries four bits of whatever byte actually precedes your text. Search for the result.

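Together, these three searches cover the three possible positions of your plain text relative to a 3-byte boundary, so they should find every file containing its encoding, provided the standard alphabet is used and the encoded string is not split across lines or built up from concatenated pieces.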
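The three searches can be sketched in Python as follows. The function name `base64_search_terms` is made up for illustration, and the sketch assumes the standard alphabet. Note that it drops three leading characters in the two-byte case, not two: the third character still depends on the unknown byte in front of your text.

```python
import base64


def base64_search_terms(plain: bytes) -> list[str]:
    """Return one search pattern per alignment of plain within a 3-byte group."""
    # Leading characters that depend on unknown preceding bytes, per offset.
    leading_to_drop = (0, 2, 3)
    terms = []
    for offset in range(3):
        # Stand in for the unknown preceding bytes; their value does not
        # matter because every character they touch is dropped below.
        padded = b" " * offset + plain
        encoded = base64.b64encode(padded).decode("ascii")
        # If there is padding, the last real character also depends on
        # whatever follows the plain text, so drop it with the '=' signs.
        if encoded.endswith("="):
            encoded = encoded.rstrip("=")[:-1]
        terms.append(encoded[leading_to_drop[offset]:])
    return terms


if __name__ == "__main__":
    # Hypothetical search text; substitute the video id or URL fragment.
    for term in base64_search_terms(b"youtube.com/watch"):
        print(term)
```

Each printed pattern can then be fed to a fixed-string search over the code base, for example `grep -rF`. Very short plain text produces very short patterns and therefore many false positives, so search for the longest fragment you are sure of.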

This is all “air code”, meaning off the top of my head, at best. Some might suggest I pulled it out of somewhere else. I can think of three possible problems with this algorithm, excluding any issues of efficiency. But, that’s what you get at this price.

Let me know if you want the working version. Or send me yours. Good luck.

Cplusman

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow