Question

I have a CGI script (written in Bash) that is going to log some information about the people who visit my website. I get this information from $HTTP_USER_AGENT, but I want to log it in my database using separate columns for OS, browser type, browser version, etc. Here is what the string looks like in my browser:

Mozilla/5.0 (Macintosh; Intel Mac OS X 10_9_1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/32.0.1700.107 Safari/537.36

In this case I would like to log that the access was made from a Mac OS X 10_9_1 using Chrome version 32.0.1700.107.

I guess someone has already done this string processing job and I just can't find it with the right keywords here on StackOverflow. Does anyone know how to do it? I can port it from another language to Bash; I guess that won't be a problem!

Thank you all in advance!

No correct solution

Other tips

As devnull already commented, it will be tricky to automate parsing these strings. There are many, many browsers out there, and barely any of them structure the User-Agent string the same way.

If you're interested in parsing text with Bash, though, I would recommend learning regular expressions and the Linux command-line tool sed.

With sed, for example, you could pull out any instance of Mac OS X followed by its version number like this:

echo "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_9_1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/32.0.1700.107 Safari/537.36" | sed 's/.*\(Mac\ OS\ X\ [0-9]\+_[0-9]\+_[0-9]\+\).*/\1/'

The sed command above matches the string Mac OS X #_#_#. Each number is matched by [0-9], and the \+ after it means "one or more of the preceding character" (this is a GNU sed extension, so other sed implementations may not accept it). The command returns the following:

Mac OS X 10_9_1

You could run something similar to parse out Chrome and its version number:

sed 's/.*\(Chrome\/[0-9]\+\.[0-9]\+\.[0-9]\+\.[0-9]\+\).*/\1/'

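There's some more regex magic going on here, such as capture groups (wrapping the portion we want to keep in \( and \) and referring back to it with \1) and backslash-escaping characters such as the space, plus, period, and forward slash. The period is escaped so that it matches a literal dot, and the forward slash because it is also the sed delimiter; escaping the space is harmless but not strictly required.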
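The two sed commands above can be folded into one small reusable function. The following is only a sketch under some assumptions: the helper name ua_extract is made up for illustration, it uses sed -E (extended regular expressions) instead of the \+ form so that the pattern does not rely on the GNU extension, and it only recognises the Mac OS X and Chrome patterns from this example, not User-Agent strings in general.

```shell
# Hypothetical helper: print the first capture group of an extended regex ($2)
# matched against a User-Agent string ($1). Prints nothing when there is no match.
# Assumption: any "/" inside the pattern is written as "\/", because "/" is
# also the sed delimiter here.
ua_extract() {
  printf '%s\n' "$1" | sed -nE "s/.*$2.*/\1/p"
}

ua='Mozilla/5.0 (Macintosh; Intel Mac OS X 10_9_1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/32.0.1700.107 Safari/537.36'

# One call per database column we want to fill.
os=$(ua_extract "$ua" '(Mac OS X [0-9]+(_[0-9]+)*)')
browser_version=$(ua_extract "$ua" 'Chrome\/([0-9]+(\.[0-9]+)*)')

# Fall back to "unknown" when a pattern did not match.
echo "OS: ${os:-unknown}"
echo "Chrome version: ${browser_version:-unknown}"
```

In a CGI script, $ua would be replaced by "$HTTP_USER_AGENT". Since that value comes from the client, it should be treated as untrusted input before it goes anywhere near a database query.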

raptastics has the best answer in this case, but if you want, you could very well use Perl: split $HTTP_USER_AGENT on the "/" delimiter and process the pieces as key/value pairs. Again, the answer to your question really depends on what you're doing.
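The answer above suggests Perl, but since the question is about a Bash CGI script, the same split-on-"/" idea can be sketched in shell. This is a rough illustration, not a complete parser: the function name ua_products is hypothetical, parenthesised comments such as "(KHTML, like Gecko)" are simply dropped, and nested parentheses are not handled.

```shell
# Hypothetical helper: print one "product=version" line for every
# product/version token in a User-Agent string ($1).
ua_products() {
  printf '%s\n' "$1" |
    sed 's/([^)]*)//g' |   # drop parenthesised comments (no nesting support)
    tr -s ' ' '\n' |       # put each remaining token on its own line
    while IFS=/ read -r product version; do
      # Skip blank lines and tokens that have no "/version" part.
      if [ -n "$product" ] && [ -n "$version" ]; then
        printf '%s=%s\n' "$product" "$version"
      fi
    done
}

ua_products 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_9_1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/32.0.1700.107 Safari/537.36'
```

Each output line is a key/value pair, so the version of a given product can be picked out afterwards with a simple lookup on the key. Note that the operating system lives inside the parenthesised comment, so it still needs a separate pattern match.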

License: CC-BY-SA with attribution
Not affiliated with StackOverflow