The Regex
From what I can see of what you are trying to do, I came up with the following:
(\d{2}/\w{3}/\d{4})(.+)(GET|POST)\s(http://|https://)(\w+)?\.?([\w\d]+)\.(\w+).*?200
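If you want to try it quickly, here is a minimal Python sketch (Python is my assumption, since you didn't say which language you are using). It runs the pattern exactly as written against the sample entry quoted in this answer and pulls out the first four groups. The names `LOG_PATTERN` and `line` are just mine for illustration.

```python
import re

# The pattern from this answer, unchanged, split over two lines for readability.
LOG_PATTERN = re.compile(
    r'(\d{2}/\w{3}/\d{4})(.+)(GET|POST)\s(http://|https://)'
    r'(\w+)?\.?([\w\d]+)\.(\w+).*?200'
)

# Sample log entry quoted in this answer.
line = '[14/Mar/2004:02:31:06 -0500] "GET http://searchanytime.com" 200 - "-" "-"'

match = LOG_PATTERN.search(line)
if match:
    # Groups 1 to 4: date, filler, request method, URL scheme.
    date, filler, method, scheme = match.group(1, 2, 3, 4)
    print(date, method, scheme)  # 14/Mar/2004 GET http://
```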
The Breakdown
I'll break down the regex so that, if it's not 100% what you're looking for, it will hopefully put you on your way.
group1
(\d{2}/\w{3}/\d{4})
Captures the date on the log entry; the format is DD/MMM/YYYY.
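Once group1 is captured, you will probably want a real date out of it. A small sketch, again assuming Python: `datetime.strptime` with `%d/%b/%Y` parses the DD/MMM/YYYY text. Note that `%b` depends on the locale, so this assumes English month abbreviations.

```python
import re
from datetime import datetime

# Only the group1 part of the pattern, applied to the timestamp of the sample entry.
date_match = re.search(r'(\d{2}/\w{3}/\d{4})', '[14/Mar/2004:02:31:06 -0500]')

# %d = day of month, %b = abbreviated month name (locale dependent), %Y = 4-digit year.
log_date = datetime.strptime(date_match.group(1), '%d/%b/%Y').date()
print(log_date.isoformat())  # 2004-03-14
```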
group2
(.+)
Captures the filler in between this and the next group. From your first example, this will match :02:31:06 -0500] "
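Note: because .+ is greedy, if POST or GET (followed by a space and http:// or https://) appears more than once in the line, this group will run up to the last occurrence.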
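To see how the greedy `(.+)` behaves when the method appears twice, here is a sketch in Python. The log line is made up by me (hypothetical hosts `first.example` and `second.example`), and the lazy variant `(.+?)` is only an alternative you could consider, not something your logs necessarily need.

```python
import re

# Hypothetical entry in which both POST and GET are followed by a URL.
line = '[14/Mar/2004:02:31:06 -0500] "POST http://first.example/ GET http://second.example/" 200'

# Greedy filler, as in the regex above (only the first four groups are kept here).
greedy = re.search(r'(\d{2}/\w{3}/\d{4})(.+)(GET|POST)\s(http://|https://)', line)
# Lazy filler: stops at the first method followed by a URL scheme.
lazy = re.search(r'(\d{2}/\w{3}/\d{4})(.+?)(GET|POST)\s(http://|https://)', line)

print(greedy.group(3))  # GET  (the last occurrence wins)
print(lazy.group(3))    # POST (the first occurrence wins)
```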
group3
(GET|POST)
Pretty self-explanatory: matches the request method, either GET or POST.
filler
\s
matching a single white-space character that we don't care about
group4
(http://|https://)
Also pretty straightforward: matches the URL scheme, either http:// or https://.
group5
This is where your regex broke down, I think.
(\w+)?\.?
This will match the www or hpcgi1 portion of the log entry. Note the ? character making this group optional. This is for cases such as
[14/Mar/2004:02:31:06 -0500] "GET http://searchanytime.com" 200 - "-" "-"
group6
([\w\d]+)
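The middle portion (e.g. canada44, nifty), or the first portion when there is no prefix (e.g. searchanytime). Caveat: as written, backtracking can split a two-part host, leaving searchanytim in group5 and only the final e in this group. Making the prefix and its dot optional as one unit, (?:(\w+)\.)?, avoids that.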
group7
(\w+)
The end portion (e.g. com, org). It is preceded by a literal dot (\.) that is matched but not captured.
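It is worth checking how groups 5 to 7 actually split a host name, because optional groups and backtracking can surprise you. The sketch below (Python assumed) compares the host part of the regex as written with a variant of mine, `GROUPED_HOST`, where the prefix and its dot are optional together. The URL `http://www.canada44.com` is assembled by me from the portions mentioned above, so treat it as hypothetical.

```python
import re

# Host part of the regex as written in this answer.
ORIGINAL_HOST = re.compile(r'(http://|https://)(\w+)?\.?([\w\d]+)\.(\w+)')
# Variant (my assumption): the prefix and its dot are optional as one unit.
GROUPED_HOST = re.compile(r'(http://|https://)(?:(\w+)\.)?([\w\d]+)\.(\w+)')

def host_groups(pattern, url):
    """Return the (prefix, middle, end) groups for a URL."""
    return pattern.search(url).groups()[1:]

three_part = 'http://www.canada44.com'
two_part = 'http://searchanytime.com'

print(host_groups(ORIGINAL_HOST, three_part))  # ('www', 'canada44', 'com')
print(host_groups(ORIGINAL_HOST, two_part))    # ('searchanytim', 'e', 'com')
print(host_groups(GROUPED_HOST, three_part))   # ('www', 'canada44', 'com')
print(host_groups(GROUPED_HOST, two_part))     # (None, 'searchanytime', 'com')
```

With the grouped variant, a missing prefix shows up as `None` in group5 rather than stealing characters from group6.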
filler
.*?
Any character (as few as possible) between the 'com', 'org', etc. and the 200. If you want to reference any of this you should capture it.
the end
200
Matches a literal 200. Note that, because of the ? in the filler above (which makes .* lazy), this will be the first 200 the match encounters after group7.
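The effect of the lazy filler is easy to see on a string that contains 200 twice. A small Python sketch, using a made-up tail of a log entry (the second "200" is my invention for the demo), with the filler captured so it can be printed:

```python
import re

# Hypothetical end of a log entry containing "200" twice.
tail = '.com" 200 - "-" "200"'

# Lazy filler, as in the regex above: stops at the first 200.
lazy = re.search(r'\.(\w+)(.*?)200', tail)
# Greedy filler, for comparison: runs on to the last 200.
greedy = re.search(r'\.(\w+)(.*)200', tail)

print(repr(lazy.group(2)))    # '" '
print(repr(greedy.group(2)))  # '" 200 - "-" "'
```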
Disclaimer
I have not actually tested this regex on your log messages outside of an online regex tool. I am not sure of the grouping you want/need, but hopefully this helps a little.