Question

I have the following example of a /etc/hosts file:

127.0.0.1 localhost

# The following lines are desirable for IPv6 capable hosts
::1 ip6-localhost ip6-loopback
fe00::0 ip6-localnet
ff00::0 ip6-mcastprefix
ff02::1 ip6-allnodes
ff02::2 ip6-allrouters
ff02::3 ip6-allhosts

## vagrant-hostmanager-start
<ipaddress1>   namenode
<ipaddress2> secondarynamenode
<ipaddress3>   slave1
<ipaddress4>  slave2
< it can be more slave nodes here>
## vagrant-hostmanager-end

I'm trying to configure a Hadoop cluster, so I need to split the /etc/hosts file as follows:

  1. namenode and secondarynamenode go to a file called masters
  2. slaves* go to a file called slaves
  3. There will be at least the namenode, and 0 or more slaves.

What's the easiest way to parse that file in bash and write each part to its respective file (masters and slaves)? I could do it easily in Python, but I need it done in bash. masters would look like:

    <ipaddress1>   namenode
    <ipaddress2> secondarynamenode

while slaves would look like:

    <ipaddress3>   slave1
    <ipaddress4>  slave2

Solution

Using awk you can do:

awk '/namenode/{print >"masters"; next} /slave[0-9]/{print > "slaves"}' /etc/hosts
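To see what this does without touching the real /etc/hosts, here is a small sketch that runs the same awk program on a sample file. The file name hosts.sample and the 192.168.50.x addresses are made up for illustration; only the input path differs from the command above.

```shell
# Sample input written to the current directory; the addresses are invented.
cat > hosts.sample <<'EOF'
127.0.0.1 localhost

## vagrant-hostmanager-start
192.168.50.10   namenode
192.168.50.11 secondarynamenode
192.168.50.21   slave1
192.168.50.22  slave2
## vagrant-hostmanager-end
EOF

# Same program as above, pointed at the sample file instead of /etc/hosts.
# Lines containing "namenode" go to masters; lines containing slaveN go to slaves.
awk '/namenode/{print >"masters"; next} /slave[0-9]/{print > "slaves"}' hosts.sample

# Show the two result files.
cat masters
cat slaves
```

Because /namenode/ matches anywhere on the line, secondarynamenode ends up in masters as well, which is what the question asks for. The same property means a comment line that mentions namenode or slave1 would also be copied.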

OTHER TIPS

To do it right, you also have to make sure that comments containing "slave" or "namenode" are ignored. I think this will work:

sed -n '/namenode/ s/^\([0-9][0-9]*\.[0-9][0-9]*\.[0-9][0-9]*\.[0-9][0-9]*\)[[:space:]][[:space:]]*[a-z]*namenode.*/\1/p' /etc/hosts > masters
sed -n '/slave/ s/^\([0-9][0-9]*\.[0-9][0-9]*\.[0-9][0-9]*\.[0-9][0-9]*\)[[:space:]][[:space:]]*slave.*/\1/p' /etc/hosts > slaves

These commands write only the IP addresses. To get the whole line instead, use the same commands but replace /\1/ with /&/.

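The answer provided by anubhava will also work, but it can write the wrong lines: it copies any line that mentions namenode or slaveN, including comments, and it does not check the IP address, so a malformed entry such as 127.0.0.d1 passes through.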
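If you prefer to stay with awk, the same checks can be made on fields instead of on the whole line. The following is only a sketch under a few assumptions: the function name split_hosts is hypothetical, the hostname is assumed to be the second field (aliases further along the line are not examined), and the address test only checks for four dot-separated groups of digits, not that each group is in the range 0 to 255.

```shell
# split_hosts is a hypothetical helper name; pass the hosts file as the first argument.
# Usage: split_hosts /etc/hosts
split_hosts() {
  # Create (or empty) both output files so that slaves exists even with zero slave nodes.
  : > masters
  : > slaves
  awk '
    $1 ~ /^#/ { next }                                   # skip comment lines
    $1 !~ /^[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+$/ { next }    # skip lines whose first field is not IPv4-shaped
    $2 == "namenode" || $2 == "secondarynamenode" { print > "masters"; next }
    $2 ~ /^slave[0-9]+$/ { print > "slaves" }            # slave1, slave2, ...
  ' "$1"
}
```

Matching on $2 rather than on the whole line means a comment or an unrelated hostname that merely contains the word "slave" is not picked up, and the whole original line is written, spacing included.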

Licensed under: CC-BY-SA with attribution