Question

I found some code online to help me experiment with HTTP in Java. This particular example comes from the Apache HttpCore tutorial site.

The odd thing is that when I set the hostname to www.google.com, the response is a six-line HTTP 302 saying the page has moved.

But when I use some other website, such as www.booya.com, I get the whole HTML page in the response, as I'd expect.

What's going on? Does Google have some kind of blocking mechanism against non-browsers?

Here's the code:

/*
 * ====================================================================
 * Licensed to the Apache Software Foundation (ASF) under one
 * or more contributor license agreements.  See the NOTICE file
 * distributed with this work for additional information
 * regarding copyright ownership.  The ASF licenses this file
 * to you under the Apache License, Version 2.0 (the
 * "License"); you may not use this file except in compliance
 * with the License.  You may obtain a copy of the License at
 *
 *   http://www.apache.org/licenses/LICENSE-2.0
 *
 * Unless required by applicable law or agreed to in writing,
 * software distributed under the License is distributed on an
 * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
 * KIND, either express or implied.  See the License for the
 * specific language governing permissions and limitations
 * under the License.
 * ====================================================================
 *
 * This software consists of voluntary contributions made by many
 * individuals on behalf of the Apache Software Foundation.  For more
 * information on the Apache Software Foundation, please see
 * <http://www.apache.org/>.
 *
 */



import java.net.Socket;

import org.apache.http.ConnectionReuseStrategy;
import org.apache.http.HttpHost;
import org.apache.http.HttpResponse;
import org.apache.http.impl.DefaultBHttpClientConnection;
import org.apache.http.impl.DefaultConnectionReuseStrategy;
import org.apache.http.message.BasicHttpRequest;
import org.apache.http.protocol.HttpCoreContext;
import org.apache.http.protocol.HttpProcessor;
import org.apache.http.protocol.HttpProcessorBuilder;
import org.apache.http.protocol.HttpRequestExecutor;
import org.apache.http.protocol.RequestConnControl;
import org.apache.http.protocol.RequestContent;
import org.apache.http.protocol.RequestExpectContinue;
import org.apache.http.protocol.RequestTargetHost;
import org.apache.http.protocol.RequestUserAgent;
import org.apache.http.util.EntityUtils;

/**
 * Elemental example for executing multiple GET requests sequentially.
 */
public class ElementalHttpGet {

    public static void main(String[] args) throws Exception {
        HttpProcessor httpproc = HttpProcessorBuilder.create()
            // Required protocol interceptors
            .add(new RequestContent())
            .add(new RequestTargetHost())
            // Recommended protocol interceptors
            .add(new RequestConnControl())
            .add(new RequestUserAgent("Test/1.1"))
            // Optional protocol interceptors
            .add(new RequestExpectContinue(true)).build();

        HttpRequestExecutor httpexecutor = new HttpRequestExecutor();

        HttpCoreContext coreContext = HttpCoreContext.create();
        HttpHost host = new HttpHost("www.booya.com", 80);
        coreContext.setTargetHost(host);

        DefaultBHttpClientConnection conn = new DefaultBHttpClientConnection(8 * 1024);
        ConnectionReuseStrategy connStrategy = DefaultConnectionReuseStrategy.INSTANCE;

        try {

            String[] targets = {
                    "/",
                    };

            for (int i = 0; i < targets.length; i++) {
                if (!conn.isOpen()) {
                    Socket socket = new Socket(host.getHostName(), host.getPort());
                    conn.bind(socket);
                }
                BasicHttpRequest request = new BasicHttpRequest("GET", targets[i]);
                System.out.println(">> Request URI: " + request.getRequestLine().getUri());

                httpexecutor.preProcess(request, httpproc, coreContext);
                HttpResponse response = httpexecutor.execute(request, conn, coreContext);
                httpexecutor.postProcess(response, httpproc, coreContext);

                System.out.println("<< Response: " + response.getStatusLine());
                System.out.println(EntityUtils.toString(response.getEntity()));
                System.out.println("==============");
                if (!connStrategy.keepAlive(response, coreContext)) {
                    conn.close();
                } else {
                    System.out.println("Connection kept alive...");
                }
            }
        } finally {
            conn.close();
        }
    }

}

Solution

When something works for some servers and not for others, the cause is probably how those servers are configured.

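In this case, Google is not blocking your program. It answers plain HTTP requests on port 80 with a redirect, typically to the HTTPS version of the site, which is served on a different port (443). The 302 is an HTTP status code (search for "HTTP status codes") that instructs the client (the web browser or, in this case, your program) to retry the request at the alternate address given in the response.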

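Open your browser and type in the URL http://www.google.com; you will see that you are redirected to https://www.google.com (or possibly a regional variation). The browser simply follows the redirect automatically, whereas your program prints the 302 response and stops.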

The important thing to learn from this is the meaning of the HTTP status codes, or at least the most common ones: 200, 302, 401, 404 and 500.
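As a rough illustration of how those codes are grouped, the first digit of a status code gives its class. The sketch below is not part of the original example; the class name StatusCodes and the method describe are made up for illustration.

```java
public class StatusCodes {

    /** Maps a status code to its class, based on the first digit. */
    public static String describe(int code) {
        switch (code / 100) {
            case 1: return "informational";
            case 2: return "success";
            case 3: return "redirection";
            case 4: return "client error";
            case 5: return "server error";
            default: return "unknown";
        }
    }

    public static void main(String[] args) {
        // The codes mentioned in the answer above.
        int[] codes = {200, 302, 401, 404, 500};
        for (int code : codes) {
            System.out.println(code + " -> " + describe(code));
        }
    }
}
```

In the question's code, the numeric value is available as response.getStatusLine().getStatusCode(), so a check like this could run right after the request is executed.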

Other tips

From Wikipedia:

The HTTP response status code 302 Found is a common way of performing a redirection.

An HTTP response with this status code will additionally provide a URL in the Location header field. The User Agent (e.g. a web browser, [or in this case, your Java program]) is invited by a response with this code to make a second, otherwise identical, request, to the new URL specified in the Location field.
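To act on that in a program, the client has to notice the redirect status and work out the new URL from the Location value, which may be absolute or relative to the original request. Here is a minimal sketch using only java.net.URI; the class RedirectTarget and its methods are hypothetical names, and the set of status codes treated as redirects is an assumption about what you want to follow. In the question's code, the header value itself could be read with response.getFirstHeader("Location").getValue().

```java
import java.net.URI;

public class RedirectTarget {

    /** True for the status codes that normally carry a Location header to follow. */
    public static boolean isRedirect(int statusCode) {
        return statusCode == 301 || statusCode == 302 || statusCode == 303
                || statusCode == 307 || statusCode == 308;
    }

    /** Resolves a Location value (absolute or relative) against the URI that was requested. */
    public static URI resolve(String requestUri, String location) {
        return URI.create(requestUri).resolve(location);
    }

    public static void main(String[] args) {
        // Absolute Location: the new URL replaces the old one entirely.
        System.out.println(resolve("http://www.google.com/", "https://www.google.com/"));
        // Relative Location: scheme and host are taken from the original request.
        System.out.println(resolve("http://example.com/a/b", "/moved"));
    }
}
```

The resolved URI gives the scheme, host and port for the second request. If it points to https, a plain Socket as used in the question will not be enough, because the connection then needs TLS.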

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow