Question

I'm using the following command on a site I'm building on my local machine:

wget --page-requisites --html-extension --convert-links --restrict-file-names=windows --no-parent http://daosawan.dev

I'm using MAMP Pro to serve the pages locally. The URL http://daosawan.dev points to a directory on my local machine: /Applications/MAMP/htdocs/daosawan/

Here's the header of the resulting /index.html file:

<!DOCTYPE html>
<html lang="en-US">
<head>
<meta charset="UTF-8" />
<meta name="viewport" content="width=device-width, user-scalable=no">
<title>Daosawan</title>

    <link rel="stylesheet" type="text/css" media="all" href="wp-content/themes/daosawan_theme/style.css" />
<link rel='stylesheet' id='q-a-plus-css'  href='wp-content/plugins/q-and-a/css/q-a-plus.css@ver=1.0.6.2.css' type='text/css' media='screen' />
<script type='text/javascript' src='http://daosawan.dev/wp-includes/js/jquery/jquery.js?ver=1.10.2'></script>
<script type='text/javascript' src='http://daosawan.dev/wp-includes/js/jquery/jquery-migrate.min.js?ver=1.2.1'></script>
<script type='text/javascript' src='wp-content/themes/daosawan_theme/js/daosawan.js@ver=3.8.1'></script>
<link rel="EditURI" type="application/rsd+xml" title="RSD" href="http://daosawan.dev/xmlrpc.php?rsd" />
<link rel="wlwmanifest" type="application/wlwmanifest+xml" href="http://daosawan.dev/wp-includes/wlwmanifest.xml" /> 
<meta name="generator" content="WordPress 3.8.1" />
<!-- Q & A -->
        <noscript><link rel="stylesheet" type="text/css" href="wp-content/plugins/q-and-a/css/q-a-plus-noscript.css@ver=1.0.6.2.css" /></noscript><!-- Q & A -->
<meta http-equiv="Content-Language" content="en-US" />
<style type="text/css" media="screen">
.qtrans_flag span { display:none }
.qtrans_flag { height:12px; width:18px; display:block }
.qtrans_flag_and_text { padding-left:20px }
.qtrans_flag_en { background:url(wp-content/plugins/qtranslate/flags/gb.png) no-repeat }
.qtrans_flag_fr { background:url(wp-content/plugins/qtranslate/flags/fr.png) no-repeat }
</style>
<link hreflang="fr" href="http://daosawan.dev/fr/" rel="alternate" />
</head>

Notice that some of the <link> and <script> URLs are converted to relative paths, but others keep the absolute http://daosawan.dev/ prefix, which breaks the page when I publish the saved copy to a public location.

What am I doing wrong?


Solution

It appears that wget cannot convert certain absolute URLs. In my case, WordPress rewrites some URLs, which seems to confuse wget, so they show up as absolute URLs (http://...) in the output.

As a hacky workaround, I used WordPress filters to make the site emit relative URLs, which wget handles as expected.
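
For readers who would rather not touch the WordPress side, here is a rough sketch of a different technique: post-processing the saved HTML with sed to strip the local origin. This is not the filter-based approach described above. The file names sample.html and sample.relative.html are made up for illustration, and the sample content is a two-line stand-in for the saved index.html.

```shell
#!/bin/sh
# Hypothetical post-processing step (not the WordPress filter approach):
# strip the local origin so leftover absolute URLs become root-relative.
origin='http://daosawan.dev/'   # local development origin from the question

# Small sample standing in for the saved index.html (assumed file name)
cat > sample.html <<'EOF'
<script type='text/javascript' src='http://daosawan.dev/wp-includes/js/jquery/jquery.js?ver=1.10.2'></script>
<link rel="stylesheet" href="wp-content/themes/daosawan_theme/style.css" />
EOF

# Escape the dots so sed matches them literally
pattern=$(printf '%s' "$origin" | sed 's/\./\\./g')

# Use | as the sed delimiter because the origin contains slashes
sed "s|$pattern||g" sample.html > sample.relative.html

cat sample.relative.html
```

Note that this only rewrites the references. According to the wget manual, --convert-links leaves a link absolute when the target file was not downloaded, so the referenced files may still be missing from the saved copy and would need to be fetched separately. The stripped paths are also only correct for pages saved at the site root.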

Other tips

It's easy: pass the -e robots=off option, so that wget ignores robots.txt.
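
As a sketch, this is the command from the question with -e robots=off added. The wrapper function name mirror_page is hypothetical and exists only to make the URL a parameter. Whether this resolves the problem depends on the assumption that robots.txt rules were what stopped wget from downloading the affected files.

```shell
#!/bin/sh
# Sketch: the wget options from the question plus "-e robots=off".
# mirror_page is a hypothetical helper name, not part of wget.
mirror_page() {
    wget --page-requisites --html-extension --convert-links \
         --restrict-file-names=windows --no-parent \
         -e robots=off "$1"
}

# Usage (requires the local site to be running):
# mirror_page http://daosawan.dev
```

The -e option runs a .wgetrc-style command, and robots=off turns off wget's handling of robots.txt, so paths the site disallows for crawlers are fetched as well.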

License: CC-BY-SA with attribution
Not affiliated with StackOverflow