Split an XML according to a maximum number of characters using XSL

Question

I tried to tackle this with XSLT 2.0 and for-each-group, but I had difficulty finding a grouping expression: I always needed to compute the string length of the following element, and I don't know of a way to do that in XSLT 2.0. So I looked at other options, and XQuery 3.0 with its window clause allows exactly that.

I ran the following XQuery with Saxon 9.5 PE:

xquery version "3.0";

declare variable $size as xs:integer external := 200;

(: returns an element together with its immediately following sibling :)
declare function local:pair($element) {
  ($element, $element/following-sibling::*[1])
};

(: the English elements drive the windowing :)
let $start-elements := //title-en | //p-en | //li-en
let $elements := $start-elements | //title-es | //p-es | //li-es
for tumbling window $table in $start-elements
    start $start when true()
    (: close the window once including the next pair would push the summed string length past $size :)
    end $end next $enext when
      sum(
        (local:pair($start)/string-length(), 
         $elements[$start << .
                   and . << $enext]/string-length(),
         local:pair($enext)/string-length())) gt $size
return <table>
         { for $el in $table
           return <tr>
                    {
                      for $pair in local:pair($el)
                      return <td class="{local-name($pair/..)}">{$pair}</td>
                    }
                  </tr>
         }
       </table>

With your sample input, I get this result:

<?xml version="1.0" encoding="UTF-8"?>
<table>
   <tr>
      <td class="title">
         <title-en>Document title in english</title-en>
      </td>
      <td class="title">
         <title-es>Título del documento en español</title-es>
      </td>
   </tr>
   <tr>
      <td class="title">
         <title-en>Section 1 title in english</title-en>
      </td>
      <td class="title">
         <title-es>Título de la sección 1 en español</title-es>
      </td>
   </tr>
   <tr>
      <td class="p">
         <p-en>Some text 1,<br/>more text</p-en>
      </td>
      <td class="p">
         <p-es>Texto 1,<br/>más texto</p-es>
      </td>
   </tr>
</table>
<table>
   <tr>
      <td class="li">
         <li-en>List text 1. See section <a href="2">2</a>
         </li-en>
      </td>
      <td class="li">
         <li-es>Texto de lista 1. Ver sección <a href="2">2</a>
         </li-es>
      </td>
   </tr>
   <tr>
      <td class="li">
         <li-en>List text 2</li-en>
      </td>
      <td class="li">
         <li-es>Texto de lista 2</li-es>
      </td>
   </tr>
   <tr>
      <td class="li">
         <li-en>List text 3</li-en>
      </td>
      <td class="li">
         <li-es>Texto de lista 3</li-es>
      </td>
   </tr>
   <tr>
      <td class="p">
         <p-en>Some text 2.</p-en>
      </td>
      <td class="p">
         <p-es>Texto 2.</p-es>
      </td>
   </tr>
   <tr>
      <td class="p">
         <p-en>Some text 3.</p-en>
      </td>
      <td class="p">
         <p-es>Texto 3.</p-es>
      </td>
   </tr>
</table>
<table>
   <tr>
      <td class="title">
         <title-en>Section 2 title in english</title-en>
      </td>
      <td class="title">
         <title-es>Título de la sección 2 en español</title-es>
      </td>
   </tr>
   <tr>
      <td class="p">
         <p-en>Some text 4. <b>Bold text</b>
         </p-en>
      </td>
      <td class="p">
         <p-es>Texto 4. <b>Texto en negrita</b>
         </p-es>
      </td>
   </tr>
   <tr>
      <td class="p">
         <p-en>Some text 5.</p-en>
      </td>
      <td class="p">
         <p-es>Texto 5.</p-es>
      </td>
   </tr>
</table>

I think this has the structure you want. Some fine-tuning remains, for instance to get the right class attributes, but first let us know whether XQuery 3.0, as provided by Saxon PE or EE or by other XQuery processors, is an option for you.
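
To show the windowing idea in isolation, here is a minimal sketch that groups plain integers instead of elements. The numbers in $lengths and the value of $limit are made up for illustration; they stand in for the string lengths of the row pairs. The end condition looks ahead at the next item (bound by "next $n") and closes the window when including it would exceed the limit, which is the look-ahead I could not express with for-each-group.

```xquery
xquery version "3.0";

(: Hypothetical data: each number stands in for the text length of one row pair. :)
declare variable $lengths as xs:integer* := (30, 40, 50, 20, 90, 10);
declare variable $limit as xs:integer := 100;

for tumbling window $w in $lengths
    (: every item not yet consumed by a window may start a new one :)
    start at $s when true()
    (: $s and $e are positions in $lengths; $n is the item after the current end candidate.
       Close the window when the items so far plus $n would exceed $limit. :)
    end at $e next $n when
      sum((subsequence($lengths, $s, $e - $s + 1), $n)) gt $limit
return <group total="{sum($w)}">{$w}</group>
```

With these values the windows should come out as (30, 40), (50, 20) and (90, 10). For the last item $n is the empty sequence, so the end condition never becomes true and the final window simply runs to the end of the input.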