Question

I have the following example of a /etc/hosts file:

127.0.0.1 localhost

# The following lines are desirable for IPv6 capable hosts
::1 ip6-localhost ip6-loopback
fe00::0 ip6-localnet
ff00::0 ip6-mcastprefix
ff02::1 ip6-allnodes
ff02::2 ip6-allrouters
ff02::3 ip6-allhosts

## vagrant-hostmanager-start
<ipaddress1>   namenode
<ipaddress2> secondarynamenode
<ipaddress3>   slave1
<ipaddress4>  slave2

<there can be more slave nodes here>
## vagrant-hostmanager-end

As I'm trying to configure a Hadoop cluster, I need to split the /etc/hosts file into the following configuration:

namenode and secondarynamenode go to a file called masters
slave* entries go to a file called slaves

There will be at least the namenode, and 0 or more slaves. What's the easiest way to parse that file in bash and write each part to its respective file (masters and slaves)? I could do it easily in Python, but I need it done in bash. masters would look like:

<ipaddress1>
<ipaddress2>

while slaves would look like:

<ipaddress3>
<ipaddress4>

That's it: two files containing only the IP addresses, not the machine names. The reason is that Hadoop won't work if the machine names are present. How can I accomplish that using awk or bash? I have the following command:

awk '/namenode/{print >"masters"; next} /slave[0-9]/{print > "slaves"}' /etc/hosts

but that keeps the machine name in the output.

Solution

Here is one way of doing it with awk:

awk '
$2 ~ /namenode/ { print $1 > "masters" }   # namenode and secondarynamenode
$2 ~ /slave/    { print $1 > "slaves" }    # slave1, slave2, ...
' /etc/hosts
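
Matching on the second field ($2, the hostname) and printing only the first field ($1, the IP address) is what keeps the machine names out of masters and slaves; your original command printed the whole line with a bare print.

If you would rather avoid awk entirely, here is a minimal pure-bash sketch under the same assumptions (hostnames follow the namenode/secondarynamenode/slaveN pattern from your example, and the output files are the masters and slaves files you described):

#!/usr/bin/env bash
# Truncate the output files so reruns start from scratch
> masters
> slaves

# Read each /etc/hosts line: first field is the IP, second is the hostname
while read -r ip name _; do
    case $name in
        namenode|secondarynamenode) printf '%s\n' "$ip" >> masters ;;
        slave*)                     printf '%s\n' "$ip" >> slaves  ;;
    esac
done < /etc/hosts

Comment lines, blank lines, and the vagrant-hostmanager markers fall through the case statement untouched, so only the node entries end up in the two files. The awk version behaves the same way: its > redirection truncates each output file on the first write of the run and appends afterwards, so both approaches are safe to re-run after the host list changes.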