Question

I have sentences with a uniform structure, and I would like to use a regex to pick certain words out of them. The sentence structure is as follows:

["Take the"] + [train] + ["bound train to"] + [stop]

where the parts in quotes are hard-coded and the parts without quotes are variable. For example, the following sentences fit that structure:

- Take the L bound train to 1st street.
- Take the 1 bound train to neverland. 

I need help coming up with a regex pattern that matches this structure and lets me parse out the [train] and [stop]. My regex kung fu is weak, and I could use some help.


Solution

A very simple regexp will do: '^Take the (.*) bound train to (.*)\.$' stores [train] in the first capture group and [stop] in the second.

^               # Match the start of the string
Take the        # Match the literal string
(.*)            # Capture the [train]
bound train to  # Match the literal string
(.*)            # Capture the [stop]
\.              # Match the full stop
$               # Match the end of string
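For illustration, here is a minimal Python sketch of how the pattern above could be applied. The question does not name a language, so Python is an assumption, and the helper name parse_direction is hypothetical. The input is stripped first because one of the sample sentences carries a trailing space.

```python
import re

# Same pattern as in the answer above, compiled once for reuse.
PATTERN = re.compile(r"^Take the (.*) bound train to (.*)\.$")

def parse_direction(sentence):
    """Return (train, stop) if the sentence fits the template, else None."""
    # strip() removes stray leading/trailing whitespace before matching.
    match = PATTERN.match(sentence.strip())
    if match is None:
        return None
    return match.group(1), match.group(2)

print(parse_direction("Take the L bound train to 1st street."))
print(parse_direction("Take the 1 bound train to neverland. "))
```

A sentence that does not follow the structure, or that lacks the final full stop, yields None here rather than a partial result.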

OTHER TIPS

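preg_match('/^Take\sthe\s(\w+)\sbound\strain\sto\s(.+)\.$/', $string, $hits);

Something like this should work: $hits[1] will hold the train and $hits[2] the stop. Note that \w already includes digits, and that the stop is matched with (.+) followed by \. so that multi-word stops such as "1st street" and the trailing full stop are handled.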

From my understanding, it looks like you want to do some kind of templating, which would require reworking your sentence structure and formatting.

I see the following:

Take the %start% bound train to %stop%

Which is very easy to replace with the specific words that you need.

/%stop%/Union Station
/%stop%/East Station

I know this sidesteps your question, but it would make for a better solution than a catch-all regular expression that could become difficult to maintain in the future.
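As a rough sketch of the placeholder approach described above, the substitution could look like this in Python. The language and the helper name fill_template are assumptions for illustration; the %start% and %stop% placeholder names come from the template shown earlier.

```python
TEMPLATE = "Take the %start% bound train to %stop%"

def fill_template(template, values):
    """Replace each %name% placeholder with its value from the dict."""
    result = template
    for name, value in values.items():
        # Plain string replacement; no regex is needed for fixed placeholders.
        result = result.replace("%" + name + "%", value)
    return result

print(fill_template(TEMPLATE, {"start": "L", "stop": "Union Station"}))
print(fill_template(TEMPLATE, {"start": "1", "stop": "East Station"}))
```

This builds sentences from known values; it does not parse an existing sentence back into its parts, which is what the regex answers do.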

Licensed under: CC-BY-SA with attribution