Question

I've been trying to create a regular expression to split up a variety of scripture references.

Here's what I've got so far:

$split_references = preg_split("/((, | or | and ))[^0-9(]/i", $reference);

A sample of text that this may need to deal with is:

Genesis 1:1-2:4a and Psalm 136:1-9, 23-26, Genesis 7:1-5, 11-18; 8:6-18; 9:8-13 and Psalm 46, Genesis 22:1-18 and Psalm 16, Exodus 14:10-31; 15:20-21 and Exodus 15:1b-13, 17-18, Isaiah 55:1-11 and Isaiah 12:2-6, Baruch 3:9-15, 3:32-4:4 or Proverbs 8:1-8, 19-21; 9:4b-6 and Psalm 19, Ezekiel 36:24-28 and Psalm 42, 43, Ezekiel 37:1-14 and Psalm 143, Zephaniah 3:14-20 and Psalm 98Genesis 1:1-2:4a and Psalm 136:1-9, 23-26, Genesis 7:1-5, 11-18; 8:6-18; 9:8-13 and Psalm 46, Genesis 22:1-18 and Psalm 16, Exodus 14:10-31; 15:20-21 and Exodus 15:1b-13, 17-18, Isaiah 55:1-11 and Isaiah 12:2-6, Baruch 3:9-15, 3:32-4:4 or Proverbs 8:1-8, 19-21; 9:4b-6 and Psalm 19, Ezekiel 36:24-28 and Psalm 42, 43, Ezekiel 37:1-14 and Psalm 143, Zephaniah 3:14-20 and Psalm 98

The type of result I'm currently getting is:

Array(
[0] => Genesis 1:1-2:4a
[1] => salm 136:1-9, 23-26
[2] => enesis 7:1-5, 11-18; 8:6-18; 9:8-13
[3] => salm 46
[4] => enesis 22:1-18
[5] => salm 16
[6] => xodus 14:10-31; 15:20-21
[7] => xodus 15:1b-13, 17-18
[8] => saiah 55:1-11
[9] => saiah 12:2-6
[10] => aruch 3:9-15, 3:32-4:4
[11] => roverbs 8:1-8, 19-21; 9:4b-6
[12] => salm 19
[13] => zekiel 36:24-28
[14] => salm 42, 43
[15] => zekiel 37:1-14
[16] => salm 143
[17] => ephaniah 3:14-20
[18] => salm 98
)

Notice that the first letter of each book name is being cut off? I am trying to match [comma letter] so that the array isn't split on [comma verse].

Thanks for any direction you can provide! :)

Jason Silver


Solution

Use a lookahead:

$split_references = preg_split("/((, | or | and ))(?=[^0-9(])/i", $reference);

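This says "followed by a character that is not a digit or an opening parenthesis" without "consuming" that character, so the first letter of the next book name stays in the result. Alternative (equivalent for this input, but using a negative lookahead on the character class rather than a positive lookahead on the negated character class; unlike the first version, it also matches when the separator is the very last thing in the string):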

$split_references = preg_split("/((, | or | and ))(?![0-9(])/i", $reference);

Result of the above:

Array
(
[0] => Genesis 1:1-2:4a
[1] => Psalm 136:1-9, 23-26
[2] => Genesis 7:1-5, 11-18; 8:6-18; 9:8-13
[3] => Psalm 46
[4] => Genesis 22:1-18
[5] => Psalm 16
[6] => Exodus 14:10-31; 15:20-21
[7] => Exodus 15:1b-13, 17-18
[8] => Isaiah 55:1-11
[9] => Isaiah 12:2-6
[10] => Baruch 3:9-15, 3:32-4:4
[11] => Proverbs 8:1-8, 19-21; 9:4b-6
[12] => Psalm 19
[13] => Ezekiel 36:24-28
[14] => Psalm 42, 43
[15] => Ezekiel 37:1-14
[16] => Psalm 143
[17] => Zephaniah 3:14-20
[18] => Psalm 98Genesis 1:1-2:4a
[19] => Psalm 136:1-9, 23-26
[20] => Genesis 7:1-5, 11-18; 8:6-18; 9:8-13
[21] => Psalm 46
[22] => Genesis 22:1-18
[23] => Psalm 16
[24] => Exodus 14:10-31; 15:20-21
[25] => Exodus 15:1b-13, 17-18
[26] => Isaiah 55:1-11
[27] => Isaiah 12:2-6
[28] => Baruch 3:9-15, 3:32-4:4
[29] => Proverbs 8:1-8, 19-21; 9:4b-6
[30] => Psalm 19
[31] => Ezekiel 36:24-28
[32] => Psalm 42, 43
[33] => Ezekiel 37:1-14
[34] => Psalm 143
[35] => Zephaniah 3:14-20
[36] => Psalm 98
)
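To see the difference on a smaller input, here is a minimal sketch that runs the consuming pattern and a lookahead pattern side by side. The shortened `$reference` string and the variable names (`$consuming`, `$lookahead`, `$trailing`, `$positive`, `$negative`) are my own, chosen only for illustration, and I've used a non-capturing group `(?:...)` instead of the doubled parentheses, which should not change what gets split.

```php
<?php
// Shortened sample built from the question's text (illustrative only).
$reference = "Genesis 1:1-2:4a and Psalm 136:1-9, 23-26, Genesis 7:1-5, 11-18; 8:6-18 or Proverbs 8:1-8";

// Consuming version: [^0-9(] is part of the match, so preg_split()
// throws that character away together with the separator.
$consuming = preg_split("/(?:, | or | and )[^0-9(]/i", $reference);

// Lookahead version: the next character is checked but not consumed.
$lookahead = preg_split("/(?:, | or | and )(?![0-9(])/i", $reference);

print_r($consuming); // second element is "salm 136:1-9, 23-26"
print_r($lookahead); // second element is "Psalm 136:1-9, 23-26"

// The two lookahead styles only differ when a separator ends the string:
// (?=[^0-9(]) needs a following character, (?![0-9(]) does not.
$trailing = "Psalm 46, ";
$positive = preg_split("/, (?=[^0-9(])/", $trailing); // no split happens
$negative = preg_split("/, (?![0-9(])/", $trailing);  // trailing empty element

print_r($positive);
print_r($negative);
```

If a trailing separator is possible in your data, passing the `PREG_SPLIT_NO_EMPTY` flag to `preg_split()` is one way to drop the empty last element that the negative lookahead version produces.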
Licensed under: CC-BY-SA with attribution