Question

I'm trying to modify an XML file which contains elements holding opening times for branches of a business. The file is inconsistent: some branches have just an opening time and a closing time, while others have an opening time, a closing time for lunch, a post-lunch opening time and a final closing time.

Examples of both types below:

<monday>10.00,17.00</monday>
<monday>09.00,12.30,13.30,17.00</monday>

I want to reformat these strings to a better format such as the ones below:

<monday>
  <open>10.00</open>
  <lunch></lunch>
  <close>17.00</close>
</monday>

<monday>
  <open>09.00</open>
  <lunch>12.30 - 13.30</lunch>
  <close>17.00</close>
</monday>

I've been trying to use BBEdit regular expressions on my Mac to make the changes, but I'm having difficulty. I think the problem is that I don't know how to get the regular expression to replace only a subset of the text it matches. For example, in pseudocode I want the regular expression to do this:

replace <monday>time1,time2</monday>
with <monday><open>time1</open><lunch></lunch><close>time2</close></monday>

replace <monday>time1,time2,time3,time4</monday>
with <monday><open>time1</open><lunch>time2 - time3</lunch><close>time4</close></monday>

I'm not too familiar with regular expressions, so I'm sure I'm making some errors, but so far I've been trying this:

replace >#+\.#+,#+\.#+< with ><open>#+\.#+<open><lunch></lunch><close>#+.\#+<

I understand this can't work as written, because the replacement inserts the literal text '#+' rather than the numbers that #+ matched.

How can I achieve what I want, by regex or other means? And how do I tell the regular expression to use a whole expression for matching but only replace a subset of the characters it matches?

Solution

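I figured it out quicker than I expected. The trick is to wrap the parts I want to keep in parentheses in the find string, then refer back to them in the replace string as \1, \2 and so on.

I used the following find string: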

(<[a-z]+day>)([0-9]+\.[0-9]+),([0-9]+\.[0-9]+)(</[a-z]+day>)

...and the following replace string:

\1<open>\2</open><lunch></lunch><close>\3</close>\4

to match lines such as the following:

<monday>10.00,17.00</monday>

which resulted in the following output:

<monday><open>10.00</open><lunch></lunch><close>17.00</close></monday>
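
The find and replace strings above only cover the two-time case. For anyone who would rather script it, here is a rough Python sketch of the same capture-group idea that also handles the four-time lines from the question. It is not what I ran in BBEdit: the function name reformat_hours and the four-time pattern are my own assumptions, built by extending the two-time expression in the same style.

```python
import re

# A time such as 10.00 or 09.00 (same sub-pattern as the BBEdit find string).
TIME = r"[0-9]+\.[0-9]+"

# Two times: open, close. Groups: 1=opening tag, 2=open, 3=close, 4=closing tag.
TWO_TIMES = re.compile(rf"(<[a-z]+day>)({TIME}),({TIME})(</[a-z]+day>)")

# Four times: open, lunch start, lunch end, close (assumed extension of the above).
FOUR_TIMES = re.compile(
    rf"(<[a-z]+day>)({TIME}),({TIME}),({TIME}),({TIME})(</[a-z]+day>)"
)


def reformat_hours(xml_text: str) -> str:
    """Rewrite comma-separated opening times as <open>/<lunch>/<close> elements."""
    # \1 ... \6 refer back to the parenthesized groups, so the captured
    # tags and times are reused while the commas are dropped.
    xml_text = FOUR_TIMES.sub(
        r"\1<open>\2</open><lunch>\3 - \4</lunch><close>\5</close>\6", xml_text
    )
    xml_text = TWO_TIMES.sub(
        r"\1<open>\2</open><lunch></lunch><close>\3</close>\4", xml_text
    )
    return xml_text


if __name__ == "__main__":
    print(reformat_hours("<monday>10.00,17.00</monday>"))
    print(reformat_hours("<monday>09.00,12.30,13.30,17.00</monday>"))
```

The two patterns cannot match the same line, because each requires the closing tag immediately after its last time, so the order of the two substitutions should not matter. The same four-group find string and replace string ought to work in BBEdit as a second find-and-replace pass, though I have only sketched it here.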
Licensed under: CC-BY-SA with attribution