Question

To test a method that transforms Text nodes in an XML document, I wrote two very simple Selectors and applied map/toUpperCase to the resulting Zipper. The result should be that all Text nodes, except those excluded via the first Selector, are transformed to upper case. But it only works for the deepest Text nodes. Here's the code:

scala> import com.codecommit.antixml._
import com.codecommit.antixml._

scala> val elemSelector = Selector({case x:Elem if x.name != "note" => x})
elemSelector: com.codecommit.antixml.Selector[com.codecommit.antixml.Elem] = <function1>

scala> val textSelector = Selector({case x:Text => x})
textSelector: com.codecommit.antixml.Selector[com.codecommit.antixml.Text] = <function1>

scala> val xml = XML.fromString("<tei><div><p>this<note>not<foreign lang=\"greek\">that</foreign>not</note></p><p>those<hi>these</hi></p></div></tei>")
xml: com.codecommit.antixml.Elem = <tei><div><p>this<note>not<foreign lang="greek">that</foreign>not</note></p><p>those<hi>these</hi></p></div></tei>

scala> val zipper = xml \\ elemSelector \ textSelector
zipper: com.codecommit.antixml.Zipper[com.codecommit.antixml.Text] = thisthatthosethese

scala> val modified = zipper.map(t => new Text(t.text.toUpperCase))
modified: com.codecommit.antixml.Zipper[com.codecommit.antixml.Text] = THISTHATTHOSETHESE

scala> val result = modified.unselect.unselect
result: com.codecommit.antixml.Zipper[com.codecommit.antixml.Node] = <tei><div><p>this<note>not<foreign lang="greek">THAT</foreign>not</note></p><p>those<hi>THESE</hi></p></div></tei>

So, in the second-to-last command, the upper-casing is applied to all targeted Text nodes, but after stepping out of the zipper, only two of the four are transformed. I've tried it with <hi/> instead of <hi>these</hi>, and then "those" gets capitalized. Any idea what the problem is here?

I am using the arktekk.no fork for Scala 2.10.3.


Solution

The problem you have comes from a merge conflict in the unselection process.

Just to simplify your problem a little bit, I'll use the following data:

val textSelector = Selector { case x: Text => x }

val xml = XML.fromString("<root><a>foo<b>bar</b></a></root>")

val zipper = xml \\ * \ textSelector

val modified = zipper.map(t => t.copy(text = t.text.toUpperCase))

val result = modified.unselect.unselect
// => <root><a>foo<b>BAR</b></a></root>

When you select all the elements in the tree with the * selector, you get the a and b Elems in your result set. The second, shallow selector looks only at the direct children of a and of b and takes their Text values. So we get foo from a and bar from b.

After the modification, the result of the first unselect contains the individual Elems with their updates:

<a>FOO<b>bar</b></a>
<b>BAR</b>

Now the next unselect needs to merge b back into a to form a new version of a. The current version of b implies a new a such that:

<a>foo<b>BAR</b></a>

And there's your conflict: you can have a either with the children List(FOO, <b>bar</b>) or with the children List(foo, <b>BAR</b>). As there is no generic way to determine which list is better (both were updated at the same time), the choice is implementation-dependent. In this case, it takes the modification that came from the deeper level in the tree.
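
To make the conflict concrete, here is a minimal sketch on a hypothetical mini-model of an XML tree. The Node, Text and Elem types and the helpers upperDirectText and replaceChild are invented for illustration and are not the anti-xml API; the sketch only shows why two independent updates taken from the same original tree yield two different candidates for a.

```scala
// Hypothetical mini-model of an XML tree; NOT the anti-xml API.
sealed trait Node
final case class Text(text: String) extends Node
final case class Elem(name: String, children: List[Node]) extends Node

object MergeConflictSketch {
  // Upper-case only the Text nodes that are direct children of `e`.
  def upperDirectText(e: Elem): Elem =
    e.copy(children = e.children.map {
      case Text(t) => Text(t.toUpperCase)
      case other   => other
    })

  // Put a new version of a child element back into its parent (matched by name).
  def replaceChild(parent: Elem, updated: Elem): Elem =
    parent.copy(children = parent.children.map {
      case c: Elem if c.name == updated.name => updated
      case other                             => other
    })

  def render(n: Node): String = n match {
    case Text(t)        => t
    case Elem(name, cs) => "<" + name + ">" + cs.map(render).mkString + "</" + name + ">"
  }

  val b = Elem("b", List(Text("bar")))
  val a = Elem("a", List(Text("foo"), b))

  // Both updates start from the same original tree, independently of each other.
  val aFromOwnUpdate   = upperDirectText(a)                  // a's own text changed
  val aFromChildUpdate = replaceChild(a, upperDirectText(b)) // only b's text changed

  def main(args: Array[String]): Unit = {
    println(render(aFromOwnUpdate))             // <a>FOO<b>bar</b></a>
    println(render(aFromChildUpdate))           // <a>foo<b>BAR</b></a>
    println(aFromOwnUpdate == aFromChildUpdate) // false: two candidates for `a`
  }
}
```

Neither candidate contains the other's change, so whichever one is kept, one of the two upper-casings is lost.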

You can solve this by not selecting Elems and modifying the Text nodes directly, thus avoiding any possible conflicts (as they can only occur on Elems). So you write:

val zipper = xml \\ textSelector

val modified = zipper.map(t => t.copy(text = t.text.toUpperCase))

val result = modified.unselect
// => <root><a>FOO<b>BAR</b></a></root>

If that's not an option for your use case, it may be possible to define a custom merging strategy for unselect to use for this specific case; one which will manage to somehow disambiguate the different parts of the children lists. Even if possible, I doubt that it'll be worth the effort.
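
If the exclusion from the question matters (leave text directly under note untouched), one way to sidestep unselect altogether is a plain recursive rewrite of the tree. The following is only a sketch on a hypothetical mini-model: Node, Text, Elem and upperExcept are invented for illustration and are not the anti-xml API, and attributes are omitted. The same idea would have to be adapted to the library's actual Elem and Group types.

```scala
// Hypothetical mini-model of an XML tree; NOT the anti-xml API.
sealed trait Node
final case class Text(text: String) extends Node
final case class Elem(name: String, children: List[Node]) extends Node

object RecursiveRewriteSketch {
  // Rebuild the tree top-down. Text directly under an excluded element is
  // kept as-is; text under any other element is upper-cased. Each Elem is
  // rebuilt exactly once, so there is nothing to merge afterwards.
  def upperExcept(excluded: Set[String])(e: Elem): Elem =
    e.copy(children = e.children.map {
      case Text(t) if !excluded(e.name) => Text(t.toUpperCase)
      case t: Text                      => t
      case child: Elem                  => upperExcept(excluded)(child)
    })

  def render(n: Node): String = n match {
    case Text(t)        => t
    case Elem(name, cs) => "<" + name + ">" + cs.map(render).mkString + "</" + name + ">"
  }

  // Same shape as the document from the question (attributes omitted).
  val doc = Elem("tei", List(Elem("div", List(
    Elem("p", List(
      Text("this"),
      Elem("note", List(Text("not"), Elem("foreign", List(Text("that"))), Text("not"))))),
    Elem("p", List(Text("those"), Elem("hi", List(Text("these")))))))))

  def main(args: Array[String]): Unit =
    println(render(upperExcept(Set("note"))(doc)))
}
```

Because the parent's name is available at the point where each Text node is rewritten, the exclusion rule and the transformation live in one pass.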

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow