Question

I would like to be able to find out the position (also known as: offset, index) of the user's selection of text (highlighted) within a web page, relative to a particular parent node.

I have been using Rangy to try to help me out, but it doesn't seem to have offset properties built into the API. Here is what I have so far:

function GetSelectionIndexRelativeTo(parentNodeId) 
{
    var parentNode = document.getElementById(parentNodeId);

    var txt = rangy.getSelection().toString();
    var code = rangy.getSelection().toHtml();
    var len = code.length;

    var x1 =         ;       // THIS IS WHAT I REALLY WANT. WHAT DO I DO?
    var x2 = x1 + len;

    return {
        text: txt,
        html: code,
        start: x1,
        end: x2,
        length: len
    }
}

To give some examples:

PAGETEXT:      <poem><title>Galaxy</title>Twinkle Twinkle Little Star</poem>

1. SELECTION:  Galaxy
   RELATIVETO: title
   RESULT:     start = 0

2. SELECTION:  Star
   RELATIVETO: poem
   RESULT:     start = 43

3. SELECTION:  Twinkle (the second one)
   RELATIVETO: poem
   RESULT:     start = 28

Solution

In your answer to my comment you indicated that this is the goal:

I want to allow users to select certain words in a poem and add annotations. I want to save each annotation as a tuple .

In light of this goal, I would not use the approach you presented in your question. There are a few approaches that I've used to save ranges. Which one would be appropriate for you depends on the specifics of your project.

Use Rangy's Selection Save and Restore Module

Documented here. This would not work if any kind of persistence is required. It would also fail if the DOM is modified in any way that deletes the markers that Rangy inserts. For instance, an operation that refreshes the poems on the page by deleting and redisplaying them would delete the markers. This could happen if the poems can be filtered dynamically by keyword.

Very unlikely to be useful in the case at hand here. (It is useful in other circumstances.)
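
For reference, here is a minimal sketch of the save and restore round trip, assuming Rangy's selection save/restore module is loaded. The helper name withPreservedSelection and the injected rangyLib parameter are hypothetical, introduced only so the sketch does not depend on a global; in a page you would pass the global rangy object.

```javascript
// Hypothetical helper: keep the user's selection across an in-place DOM edit.
// `rangyLib` is expected to be the global `rangy` object with the
// selection save/restore module loaded (an assumption of this sketch).
function withPreservedSelection(rangyLib, mutateDom) {
    // saveSelection() inserts marker elements at the selection boundaries
    // and returns an object that describes them.
    var savedSelection = rangyLib.saveSelection();
    try {
        mutateDom();
    } finally {
        // restoreSelection() relies on those markers still being in the DOM.
        // If mutateDom() deleted them, the selection cannot be restored.
        rangyLib.restoreSelection(savedSelection);
    }
}
```

The saved object only lives in memory and depends on the markers, which is why this approach offers no persistence.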

Use Rangy's Serializer Module

Documented here. This allows saving selections in a persistent way, and selections saved in this way will survive a DOM refresh.

Note that the DOM tree in which the selection existed before serialization must not change in structure. If the DOM tree changes structure, then the range may not be deserializable. (This depends on the specific circumstances.)

If you have a poem in this way:

<poem><title>Galaxy</title>Twinkle Twinkle Little Star</poem>

and this structure never changes, then you are fine. Even if some dynamic operation refreshes the poem by deleting it from the page and redisplaying it, as long as its structure is the same, the serializer will be able to deserialize the selection. However, if you save a selection, then decide to add an author like this:

<poem><title>Galaxy</title><author>John Doe</author>Twinkle Twinkle Little Star</poem>

then the structure has changed and the serializer won't be able to deserialize.

Note that the serializer is able to record a serialized range relative to a specific element, just like what you want. The format of the call looks like this:

rangy.serializeSelection(sel, false, root)

You'd pass the selection as sel. The second argument is set to false to let the serializer compute a checksum that will allow it to detect if the DOM has changed between serialization and deserialization. The root argument would be the element against which you'd want to serialize.

I would advise always using the same kind of element against which to serialize and deserialize. Pick the topmost element you care about. Presumably, this would be poem if you only care about selections in poems. So all annotations in a given poem would be serialized against this poem's poem element.
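
A minimal sketch of how an annotation could be stored and restored this way, assuming the serializer module is loaded. The function names, the annotation object shape, and the injected rangyLib parameter are hypothetical; in a page rangyLib would be the global rangy object and poemElement the poem's root element.

```javascript
// Hypothetical helpers built on Rangy's serializer module.
function serializeAnnotation(rangyLib, poemElement, note) {
    var sel = rangyLib.getSelection();
    return {
        // false = keep the checksum, so a changed DOM can be detected later.
        range: rangyLib.serializeSelection(sel, false, poemElement),
        note: note
    };
}

function restoreAnnotation(rangyLib, poemElement, annotation) {
    // canDeserializeSelection() is used here as a guard; as far as I know it
    // compares the stored checksum with the current DOM under the same root.
    if (!rangyLib.canDeserializeSelection(annotation.range, poemElement)) {
        return false;
    }
    // Re-selects the saved range, always relative to the same root element.
    rangyLib.deserializeSelection(annotation.range, poemElement);
    return true;
}
```

The object returned by serializeAnnotation contains only strings, so it can be stored as JSON next to the poem's identifier.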

Make the Annotations Part of Your Markup

If you need to be able to change your data structure in the future, then you have to make your annotations part of the markup. If Bob adds an annotation, you could code it like this:

<poem><title>Galaxy</title>Twinkle Twinkle <ann-start ann="#ANN.1"/>Little Star<ann-end ann="#ANN.1"/></poem>
<ann id="ANN.1" creator="Bob">In private correspondence, the author referred to his daughter as his "Little Star"</ann>

(The above assumes an XML notation and is for the purpose of illustration only. A complete solution would do some things differently.)

Having <ann-start> to mark the start and <ann-end> to mark the end of an annotation in the text is necessary because annotations could straddle other annotations or markup constructs, but XML (and HTML) do not allow elements to overlap. The ann attribute is an ID reference indicating which annotation the ann-start or ann-end element belongs to.

By having the annotations be part of the data to which they refer, you could change the structure at will. So adding an author like this:

<poem><title>Galaxy</title><author>John Doe</author>Twinkle Twinkle <ann-start ann="#ANN.1"/>Little Star<ann-end ann="#ANN.1"/></poem>
<ann id="ANN.1" creator="Bob">In private correspondence, the author referred to his daughter as his "Little Star"</ann>

is not a problem because your application can still find the start and end of an annotation.
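
To illustrate, here is a rough sketch that recovers the annotated text from both versions of the markup. It works on the markup as a plain string and assumes the exact marker spelling used in the examples above; the function name annotatedText is made up for this illustration, and a real application would walk the DOM instead.

```javascript
// Illustration only: find the text between the two markers of one annotation.
function annotatedText(markup, annId) {
    var startTag = '<ann-start ann="#' + annId + '"/>';
    var endTag = '<ann-end ann="#' + annId + '"/>';
    var start = markup.indexOf(startTag);
    var end = markup.indexOf(endTag);
    if (start === -1 || end === -1 || end < start) {
        return null; // markers missing or out of order
    }
    // Take what lies between the markers and drop any tags inside it.
    return markup.slice(start + startTag.length, end).replace(/<[^>]*>/g, '');
}

var before = '<poem><title>Galaxy</title>Twinkle Twinkle ' +
    '<ann-start ann="#ANN.1"/>Little Star<ann-end ann="#ANN.1"/></poem>';
var after = '<poem><title>Galaxy</title><author>John Doe</author>Twinkle Twinkle ' +
    '<ann-start ann="#ANN.1"/>Little Star<ann-end ann="#ANN.1"/></poem>';

console.log(annotatedText(before, 'ANN.1')); // Little Star
console.log(annotatedText(after, 'ANN.1'));  // Little Star
```

The lookup is keyed on the annotation ID rather than on a position, so inserting the author element does not affect it.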

Why Not the Method Presented in the Question?

Because it is extremely fragile. Let's take the 3rd example given:

PAGETEXT:      <poem><title>Galaxy</title>Twinkle Twinkle Little Star</poem>

3. SELECTION:  Twinkle (the second one)
   RELATIVETO: poem
   RESULT:     start = 28

The offset 28 will become incorrect if any of the following happens:

  • A typo is discovered before the start of an annotation and the text of the poem is fixed.

  • The developer of the application decides to show more information and changes the structure of the poem. For instance, as shown above, adding an author name after the title.

  • The markup is changed to support more functionality. For instance, if it is necessary to refer to a title with a hyperlink:

    <poem><title id="...">Galaxy</title>Twinkle Twinkle Little Star</poem>
    

As soon as id is added, the offset becomes incorrect. Other examples would be footnotes, styling (for a one-off case it does not make sense to use a stylesheet), and language information (the poem is in English but the poet decided to use a French word for the title).
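
The fragility can be seen with a short sketch that treats the markup as a plain string. The helper name secondTwinkleOffset is made up for this illustration, and the offset is counted from the start of the whole string, so the exact number may differ from the 28 in the question depending on how the offset is measured.

```javascript
// Illustration only: character offset of the second "Twinkle" in the markup,
// which is the kind of number the approach in the question would store.
function secondTwinkleOffset(markup) {
    var first = markup.indexOf('Twinkle');
    return markup.indexOf('Twinkle', first + 1);
}

var original = '<poem><title>Galaxy</title>Twinkle Twinkle Little Star</poem>';
var withAuthor = '<poem><title>Galaxy</title><author>John Doe</author>' +
    'Twinkle Twinkle Little Star</poem>';
var withId = '<poem><title id="...">Galaxy</title>Twinkle Twinkle Little Star</poem>';

// Save the offset once, against the original markup.
var saved = secondTwinkleOffset(original);

// Reading 7 characters at the saved offset only works on the original.
console.log(original.slice(saved, saved + 7));   // Twinkle
console.log(withAuthor.slice(saved, saved + 7)); // John Do
console.log(withId.slice(saved, saved + 7) === 'Twinkle'); // false
```

Nothing in the stored number signals that it has gone stale, so the application would silently annotate the wrong text.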

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow