سؤال

I'm attempting to create a mirror of a WordPress site with clean URLs (i.e. http://example.org/foo not http://example.org/foo.php). When Wget mirrors the site, it gives all pages and links a ".html" extension (i.e. http://example.org/foo.html).

Is it possible to set options for Wget to create a clean URL structure, so that the mirrored file corresponding to the page "http:example.org/foo" would be "/foo/index.html" and the link to that page would be "http:example.org/foo"? If so, how?

هل كانت مفيدة؟

المحلول

If I understand your question correctly, you're asking for what is the default behaviour of Wget.

Wget will only add the extension to the local copy, if the --adjust-extension option has been passed to it. Quoting the man page for Wget:

   --adjust-extension
       If a file of type application/xhtml+xml or text/html is downloaded and the URL does not end with the regexp \.[Hh][Tt][Mm][Ll]?, this option will cause the suffix .html to be appended to the
       local filename.  This is useful, for instance, when you're mirroring a remote site that uses .asp pages, but you want the mirrored pages to be viewable on your stock Apache server.  Another good
       use for this is when you're downloading CGI-generated materials.  A URL like http://example.com/article.cgi?25 will be saved as article.cgi?25.html.

However, what you seem to be asking for, that Wget saves example.org/foo as /foo/index.html is actually the default option. If you're seeing some other output, you should post the complete output of Wget with the --debug switch.

مرخصة بموجب: CC-BY-SA مع الإسناد
لا تنتمي إلى StackOverflow
scroll top