Question

I am running some web crawling jobs on an AWS-hosted server. The crawler scrapes data from an e-commerce website, but recently it has started getting timeout errors from the site. The website may be rate-limiting me based on my IP address. Allocating a new Elastic IP address solves the problem, but not for long.

My question: is there any service I can use to automatically and dynamically allocate new IPs and associate them with my instance? Thanks!

Solution

To change the EIP, you can just use Python with boto.

Something like this:

#!/usr/bin/python

import boto.ec2

INSTANCE_ID = 'i-xxxxxxxx'

# Connect to the region the instance runs in
conn = boto.ec2.connect_to_region("us-east-1",
    aws_access_key_id='<key>',
    aws_secret_access_key='<secret>')

# Look up the instance to find its current public IP
reservations = conn.get_all_instances(filters={'instance-id': INSTANCE_ID})
instance = reservations[0].instances[0]

old_address = instance.ip_address
new_address = conn.allocate_address().public_ip

# Swap the old Elastic IP for the newly allocated one
conn.disassociate_address(old_address)
conn.associate_address(INSTANCE_ID, new_address)
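
If you are on boto3 rather than the older boto, the same swap could look roughly like the sketch below. It assumes a VPC instance, so addresses are referenced by allocation ID. The function name `rotate_elastic_ip` and the `release_old` flag are made up for illustration, and the client is passed in so the logic can be tried against a stub without touching AWS.

```python
def rotate_elastic_ip(ec2, instance_id, release_old=True):
    """Replace the instance's Elastic IP with a newly allocated one.

    `ec2` is assumed to behave like a boto3 EC2 client.
    Returns the new public IP as a string.
    """
    # Find the Elastic IPs currently associated with the instance.
    current = ec2.describe_addresses(
        Filters=[{"Name": "instance-id", "Values": [instance_id]}]
    )["Addresses"]

    # Allocate the replacement first, so a failure here leaves the old IP attached.
    new = ec2.allocate_address(Domain="vpc")

    # Detach the old address(es) before attaching the new one.
    for addr in current:
        if "AssociationId" in addr:
            ec2.disassociate_address(AssociationId=addr["AssociationId"])

    ec2.associate_address(InstanceId=instance_id,
                          AllocationId=new["AllocationId"])

    # Optionally hand the old addresses back so they do not sit unused in the account.
    if release_old:
        for addr in current:
            ec2.release_address(AllocationId=addr["AllocationId"])

    return new["PublicIp"]


# Example usage (needs boto3 installed and AWS credentials configured):
#   import boto3
#   ec2 = boto3.client("ec2", region_name="us-east-1")
#   print(rotate_elastic_ip(ec2, "i-xxxxxxxx"))
```

You could call this from a cron job or whenever the crawler starts seeing timeouts. Keep in mind that the instance is briefly without its Elastic IP between the disassociate and associate calls.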

Other tips

If you want to use the Tor network, just run:

sudo apt-get install tor
sudo /etc/init.d/tor start

netstat -ant | grep 9050  # check that Tor is listening on its SOCKS port

Then, in your Java project, set the proxy like this:

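public static void main(String[] args) {
    // Route the JVM's socket connections through the local Tor SOCKS proxy
    System.setProperty("socksProxyHost", "127.0.0.1");
    System.setProperty("socksProxyPort", "9050");
    // ... start the crawler here
}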

You can schedule a cron job that restarts your application and Tor every XX minutes, so that Tor builds a new circuit, which typically gives you a different exit IP.

Easy and secure.
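
If the crawler is written in Python instead of Java, the same proxy setting could be sketched as below. This assumes the `requests` library with SOCKS support (`pip install requests[socks]`) and Tor listening on its default port 9050. The helper name `tor_proxies` is made up for illustration.

```python
TOR_SOCKS_HOST = "127.0.0.1"
TOR_SOCKS_PORT = 9050  # default Tor SOCKS port, as checked with netstat above


def tor_proxies(host=TOR_SOCKS_HOST, port=TOR_SOCKS_PORT):
    """Build a proxies mapping in the format `requests` accepts."""
    # The socks5h scheme makes DNS lookups go through the proxy as well.
    url = "socks5h://{}:{}".format(host, port)
    return {"http": url, "https": url}


# Example usage (needs requests[socks] installed and Tor running):
#   import requests
#   resp = requests.get("https://example.com", proxies=tor_proxies(), timeout=30)
#   print(resp.status_code)
```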

License: CC-BY-SA with attribution
Not affiliated with StackOverflow