Question

I have a web.xml that I want to update through XPath. The desired elements are modified correctly, but a large amount of junk is added to the beginning of the document. I get that junk even when I don't modify any elements and just parse and print.

The code:

require Cwd;
use strict;
use warnings;
use File::Temp qw/ tempfile tempdir/;
use lib 'menu/perl-modules/lib/site_perl';
use XML::XPath;
use XML::XPath::NodeSet;

my $file = "/tmp/web.xml";
my $xp   = XML::XPath->new( filename => $file );
#$xp->setNodeText( $xpath, $newValue );

# Fetch the nodes first so the file is parsed before it is reopened for writing
my @nodes = $xp->find('/')->get_nodelist;

open( my $out, '>', $file ) or die "Cannot write $file: $!";
foreach my $node (@nodes) {
  print {$out} $node->toString;
}
close($out);

Input document:

<!DOCTYPE web-app PUBLIC
 "-//Sun Microsystems, Inc.//DTD Web Application 2.3//EN"
  "http://java.sun.com/dtd/web-app_2_3.dtd" >
<web-app>
   <filter>
      <filter-name>LocaleFilter</filter-name>
      ....
</web-app>

The output: about 700 lines of comments at the beginning of the document, which look like some sort of expansion of the referenced DTD. I'm only including the first few lines for readability:

<!--
DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS HEADER.

Copyright 2000-2007 Sun Microsystems, Inc. All rights reserved.

The contents of this file are subject to the terms of either the GNU
General Public License Version 2 only ("GPL") or the Common Development
and Distribution License("CDDL") (collectively, the "License").  You
may not use this file except in compliance with the License. You can obtain
a copy of the License at https://glassfish.dev.java.net/public/CDDL+GPL.html
or glassfish/bootstrap/legal/LICENSE.txt.  See the License for the specific
language governing permissions and limitations under the License.

When distributing the software, include this License Header Notice in each
file and include the License file at glassfish/bootstrap/legal/LICENSE.txt.
Sun designates this particular file as subject to the "Classpath" exception
as provided by Sun in the GPL Version 2 section of the License file that
accompanied this code.  If applicable, add the following below the License
Header, with the fields enclosed by brackets [] replaced by your own
identifying information: "Portions Copyrighted [year]
[name of copyright owner]"

Contributor(s):

If you wish your version of this file to be governed by only the CDDL or
only the GPL Version 2, indicate your decision by adding "[Contributor]
elects to include this software in this distribution under the [CDDL or GPL
Version 2] license."  If you don't indicate a single choice of license, a
recipient has the option to distribute your version of this file under
either the CDDL, the GPL Version 2 or to extend the choice of license to
its licensees as provided above.  However, if you add GPL Version 2 code
and therefore, elected the GPL Version 2 license, then the option applies
only if the new code is made subject to such option by the copyright
holder.
--><!--
This is the XML DTD for the Servlet 2.3 deployment descriptor.

Solution

I don't understand why this module takes any account of the linked DTD at all, since as far as I can see it does no validity checking.

In addition, while the module allows you to change and add to the nodes of a document, there is no obvious method for removing nodes.

However, the comments that you want to exclude are children of the root node, so you can effectively remove them by re-rooting the document on the root node's only element child.

This code demonstrates the approach:

use strict;
use warnings;
use autodie;
use 5.010;

use XML::XPath;

my $xp   = XML::XPath->new( ioref => *DATA );
my ($new_root) = $xp->findnodes('/*');

print $new_root->toString, "\n";

__DATA__
<!DOCTYPE web-app PUBLIC
 "-//Sun Microsystems, Inc.//DTD Web Application 2.3//EN"
  "http://java.sun.com/dtd/web-app_2_3.dtd" >
<web-app>
  <filter>
    <filter-name>LocaleFilter</filter-name>
  </filter>
</web-app>

Output:

<web-app>
  <filter>
    <filter-name>LocaleFilter</filter-name>
  </filter>
</web-app>
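
Applied to the file-based case from the question, the same re-rooting idea might look like the sketch below. It is only a sketch under stated assumptions: the helper names root_element_xml and rewrite_file are made up for illustration, the XPath expression and value in the commented-out setNodeText call are hypothetical, and because serializing only the root element drops the DOCTYPE (as the output above shows), the declaration is written back from a hand-copied literal taken from the question's input.

```perl
use strict;
use warnings;

use XML::XPath;

# Serialize only the root element, leaving out any comments that sit
# beside it under the document root.
sub root_element_xml {
    my ($xp) = @_;
    my ($root) = $xp->findnodes('/*');
    return $root->toString;
}

# Hand-copied from the question's input; adjust for your own document.
my $doctype = <<'END';
<!DOCTYPE web-app PUBLIC
 "-//Sun Microsystems, Inc.//DTD Web Application 2.3//EN"
  "http://java.sun.com/dtd/web-app_2_3.dtd" >
END

# Hypothetical helper: rewrite a file in place, keeping only the DOCTYPE
# literal above and the root element.
sub rewrite_file {
    my ($file) = @_;
    my $xp = XML::XPath->new( filename => $file );

    # Illustrative edit only; the path and value are assumptions.
    # $xp->setNodeText( '/web-app/filter/filter-name', 'NewFilterName' );

    # Serialize before the file is opened (and truncated) for writing.
    my $xml = root_element_xml($xp);

    open( my $out, '>', $file ) or die "Cannot write $file: $!";
    print {$out} $doctype, $xml, "\n";
    close($out) or die "Cannot close $file: $!";
    return;
}

# Does nothing unless a file name is given on the command line.
rewrite_file( $ARGV[0] ) if @ARGV;
```

Run it with the path as the only argument, for example /tmp/web.xml. Hard-coding the DOCTYPE is a deliberate simplification; if the declaration varies between files, it would have to be captured from the original text some other way.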
Licensed under: CC-BY-SA with attribution