Question

I have been searching around and found what I thought was the right solution to my problem, a non-greedy quantifier, but it is not working as expected.

I am trying to separate drop-down menus that have the same content (for a LoadRunner script). The HTML code looks like this:

<input type="hidden" name="advanceDiscount" value="0"  /><table border="0" cellspacing="5"><tr><td align="left">Departure City :</td> <td><select name="depart" >
<option selected="selected" value="Denver">Denver</option>
<option value="Frankfurt">Frankfurt</option>
<option value="London">London</option>
<option value="Los Angeles">Los Angeles</option>
<option value="Paris">Paris</option>
<option value="Portland">Portland</option>
<option value="San Francisco">San Francisco</option>
<option value="Seattle">Seattle</option>
<option value="Sydney">Sydney</option>
<option value="Zurich">Zurich</option>
</select></td> <td align="left">Departure Date :</td> <td><input type="text" name="departDate" value="05/07/2014" size="10" maxlength="10" /> 
<!-- Departure Date Applet -->
<APPLET CODEBASE="/WebTours/classes/" CODE="FormDateUpdate.class" MAYSCRIPT Width=26 Height=28 BORDER=0>
   <PARAM NAME=CalenderTitle  VALUE="Select Departure Date">
   <PARAM NAME=HtmlFormIndex  VALUE=0>
   <PARAM NAME=HtmlEditIndex  VALUE=2>
   <PARAM NAME=AutoClose      VALUE=1>
   <PARAM NAME=Label          VALUE="...">
</APPLET>
</td></tr> <tr><td align="left">Arrival City :</td> <td><select name="arrive" >
<option selected="selected" value="Denver">Denver</option>
<option value="Frankfurt">Frankfurt</option>
<option value="London">London</option>
<option value="Los Angeles">Los Angeles</option>
<option value="Paris">Paris</option>
<option value="Portland">Portland</option>
<option value="San Francisco">San Francisco</option>
<option value="Seattle">Seattle</option>
<option value="Sydney">Sydney</option>
<option value="Zurich">Zurich</option>
</select></td> <td align="left">Return Date :</td> <td><input type="text" name="returnDate" value="05/08/2014" size="10" maxlength="10" /> 
<!-- Return Date Applet -->

The content I wish to capture is from <select name="depart" > to </select></td>

The regular expression I attempted was:

\Q<td><select name=\E"(.*\r\n)*(\Q</select></td>\E?)

But unfortunately, it captures up to the last </select></td> even though I have specified a non-greedy "?" within the third argument: (\Q</select></td>\E?)

Could anyone kindly point out my mistake, and possibly point me towards a solution?

As an extension, what would be the way to say "only the second occurrence onwards"? So starting from the second <select name=".*> .

Cheers!!

The answer to my problem was to use <td><select name="(.*\r\n)*?(</select></td>) in case someone else wanted to know.

Thanks MikeH-R!


Solution

I'm reposting this as an answer since you said the comment solved your problem, but first I need to reiterate Joeytje50's comment: don't parse HTML with regexes.

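Now that we've got that out of the way, and you promise to only use this for educational purposes and never, ever in production, here's the solution. You had the ? in the wrong place: placed after the closing literal, it acts as an ordinary "optional" quantifier rather than a greediness modifier. What you wanted was to put it directly after the *, which turns that quantifier from greedy into non-greedy: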

\Q<td><select name=\E"(.*\r\n)*?(\Q</select></td>\E)
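To see the difference for yourself, here is a rough sketch in Python rather than LoadRunner, so treat it as an illustration only. Python's re module has no \Q...\E, so re.escape stands in for it, and the html string is a cut-down stand-in for your page, not the real source. The names OPEN, CLOSE, greedy and lazy are made up for this example. It also shows one possible way to handle "the second occurrence onwards": collect every non-greedy match and skip the first.

```python
import re

# Cut-down stand-in for the page source, with \r\n line endings
# because the pattern relies on them.
html = (
    '<td><select name="depart" >\r\n'
    '<option value="Denver">Denver</option>\r\n'
    '<option value="London">London</option>\r\n'
    '</select></td> <td align="left">Departure Date :</td>\r\n'
    '<td><select name="arrive" >\r\n'
    '<option value="Denver">Denver</option>\r\n'
    '<option value="Paris">Paris</option>\r\n'
    '</select></td> <td align="left">Return Date :</td>\r\n'
)

# re.escape plays the role of \Q...\E here.
OPEN = re.escape('<td><select name=') + '"'
CLOSE = re.escape('</select></td>')

# Greedy: the line-eating group takes as many lines as it can,
# so the match runs on to the LAST closing tag.
greedy = re.compile(OPEN + r'(?:.*\r\n)*' + '(' + CLOSE + ')')

# Non-greedy: *? takes as few lines as it can,
# so the match stops at the FIRST closing tag.
lazy = re.compile(OPEN + r'(?:.*\r\n)*?' + '(' + CLOSE + ')')

greedy_match = greedy.search(html).group(0)
lazy_matches = [m.group(0) for m in lazy.finditer(html)]

print(greedy_match.count('</select>'))  # number of menus swallowed by one greedy match
print(len(lazy_matches))                # number of separate non-greedy matches

# "Second occurrence onwards": drop the first non-greedy match.
second_onwards = lazy_matches[1:]
print(second_onwards[0].splitlines()[0])
```

How you pick a particular occurrence inside LoadRunner itself depends on the function you are calling, so check its documentation rather than relying on this sketch.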
Licensed under: CC-BY-SA with attribution