Question

Defining userdata for instances in AWS seems really useful for doing all kinds of bootstrap-type actions. Unfortunately, I have to use a custom CentOS AMI that didn't originate from one of the provided AMIs for PCI reasons, so cloud-init is not already installed and configured. I only really want it to set a hostname and run a small bash script. How do I get it working?

Was it helpful?

Solution

cloud-init is a very powerful, but very undocumented tool. Even once it's installed, there are lot of modules active by default that overwrite things you may have already defined on your AMI. Here are instructions for a minimal setup from scratch:

Instructions

  1. Install cloud-init from a standard repository. If you're worried about PCI, you probably don't want to use AWS's custom repositories.

    # rpm -Uvh https://download.fedoraproject.org/pub/epel/6/x86_64/epel-release-6-8.noarch.rpm
    # yum install cloud-init
    
  2. Edit /etc/cloud/cloud.cfg, a yaml file, to reflect your desired configuration. Below is a minimal configuration with documentation for each module.

    #If this is not explicitly false, cloud-init will change things so that root
    #login via ssh is disabled. If you don't want it to do anything, set it false.
    disable_root: false
    
    #Set this if you want cloud-init to manage hostname. The current
    #/etc/hosts file will be replaced with the one in /etc/cloud/templates.
    manage_etc_hosts: true
    
    #Since cloud-init runs at multiple stages of boot, this needs to be set so
    #it can log in all of them to /var/log/cloud-init.
    syslog_fix_perms: null
    
    #This is the bit that makes userdata work. You need this to have userdata
    #scripts be run by cloud-init.
    datasource_list: [Ec2]
    datasource:
      Ec2:
        metadata_urls: ['http://169.254.169.254']
    
    #modules that run early in boot
    cloud_init_modules:
     - bootcmd  #for running commands in pre-boot. Commands can be defined in cloud-config userdata.
     - set-hostname  #These 3 make hostname setting work
     - update-hostname
     - update-etc-hosts
    
    #modules that run after boot
    cloud_config_modules:
     - runcmd  #like bootcmd, but runs after boot. Use this instead of bootcmd unless you have a good reason for doing so.
    
    #modules that run at some point after config is finished
    cloud_final_modules:
     - scripts-per-once  #all of these run scripts at specific events. Like bootcmd, can be defined in cloud-config.
     - scripts-per-boot
     - scripts-per-instance
     - scripts-user
     - phone-home  #if defined, can make a post request to a specified url when done booting
     - final-message  #if defined, can write a specified message to the log
     - power-state-change  #can trigger stuff based on power state changes
    
    system_info:
      #works because amazon's linux AMI is based on CentOS
      distro: amazon
    
  3. If there is a defaults.cfg in /etc/cloud/cloud.cfg.d/, delete it.

  4. To take advantage of this configuration, define the following userdata for new instances:

    #cloud-config
    hostname: myhostname
    fqdn: myhostname.mydomain.com
    runcmd:
     - echo "I did this thing post-boot"
     - echo "I did this too"
    

    You can also simply run a bash script by replacing #cloud-config with #!/bin/bash and putting the bash script in the body, but if you do, you should remove all of the hostname-related modules from cloud_init_modules.


Additional Notes

Note that this is a minimal configuration, and cloud-init is capable of managing users, ssh keys, mount points, etc. Look at the references below for more documentation on those specific features.

In general, it seems that cloud-init does stuff based on the modules specified. Some modules, like "disable-ec2-metadata", do stuff simply by being specified. Others, like "runcmd", only do stuff if their parameters are specified, either in cloud.cfg, or in cloud-config userdata. Most of the documentation below only tell you what parameters are possible for each module, not what the module is called, but the default cloud.cfg should have a complete module list to begin with. The best way I've found to disable a module is simply to remove it from the list.

In some cases, "rhel" may work better for the "distro" tag than "amazon". I haven't really figured out when.


References

OTHER TIPS

Here is a brief tutorial on how to run scripts during startup using cloud-init on AWS EC2 (CentOS).

This tutorial explains:

  • how to set configuration file /etc/cloud/cloud.cfg
  • how the cloud path /var/lib/cloud/scripts looks like
  • the script files under the cloud path using an example, and
  • how to check if the script files are executed during startup of the instance

Configuration file

The configuration file below is on AWS CentOS6. For Amazon Linux, see here.

# cat /etc/cloud/cloud.cfg
manage_etc_hosts: localhost
user: root
disable_root: false
ssh_genkeytypes: [ rsa, dsa ]

cloud_init_modules:
 - resizefs
 - update_etc_hosts
 - ssh

cloud_final_modules:
 - scripts-per-once
 - scripts-per-boot
 - scripts-per-instance
 - scripts-user

Directory Tree

Here is what the cloud path /var/lib/cloud/scripts looks like:

# cd /var/lib/cloud/scripts
# tree `pwd`
/var/lib/cloud/scripts
├── per-boot
│     └── per-boot.sh
├── per-instance
│     └── per-instance.sh
└── per-once
       └── per-once.sh

Content of the Script Files

Here are the contents of the example script files.
The files have to be under user root. See my way on creating the boot script.

# cat /var/lib/cloud/scripts/per-boot/per-boot.sh
#!/bin/sh
echo per-boot: `date` >> /tmp/per-xxx.txt

# cat /var/lib/cloud/scripts/per-instance/per-instance.sh
#!/bin/sh
echo per-instance: `date` >> /tmp/per-xxx.txt

# cat /var/lib/cloud/scripts/per-once/per-once.sh   
#!/bin/sh
echo per-once: `date` >> /tmp/per-xxx.txt

Result of Execution

In the case of initial start-up

# cat /tmp/per-xxx.txt
per-once: 1 January 3, 2013 Thursday 17:30:16 JST 
per-boot: 1 January 3, 2013 Thursday 17:30:16 JST 
per-instance: 1 January 3, 2013 Thursday 17:30:16 JST

In the case of a reboot

# cat /tmp/per-xxx.txt
per-once: 1 January 3, 2013 Thursday 17:30:16 JST 
per-boot: 1 January 3, 2013 Thursday 17:30:16 JST 
per-instance: 1 January 3, 2013 Thursday 17:30:16 JST 
per-boot: 1 January 3, 2013 Thursday 17:32:24 JST

In the case of start from in the AMI

# cat /tmp/per-xxx.txt
per-once: 1 January 3, 2013 Thursday 17:30:16 JST 
per-boot: 1 January 3, 2013 Thursday 17:30:16 JST 
per-instance: 1 January 3, 2013 Thursday 17:30:16 JST 
per-boot: 1 January 3, 2013 Thursday 17:32:24 JST 
per-boot: 1 January 3, 2013 Thursday 17:44:08 JST

Reference
The timing at which the script is run in cloud-init (CentOS6) was examined (translated)

Expanding on the prior answer for anyone trying to create a CentOS AMI that is cloud-init enabled (and capable of actually executing your CloudFormation scripts), you might have some success by doing the following:

  1. launch a marketplace CentOS AMI w/Updates - make sure cloud-init is present or sudo yum install -y cloud-init
  2. rm -rf /var/lib/cloud/data
  3. rm -rf /var/lib/cloud/instance
  4. rm -rf /var/lib/cloud/instances/*
  5. replace /etc/cloud/cloud.cfg with the configuration in the answer above but make sure you set distro: rhel
  6. Add the CloudFormation helpers (http://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/cfn-helper-scripts-reference.html)
  7. create an AMI image from this instance

Had a heck of a time trying to figure out why my UserData was not being invoked until I realized that the images in the marketplace naturally only run your UserData once per instance AND of course they had already run. Removing the indicators that those had already been executed along with changing the distro: rhel in the cloud.cfg file did the trick.

For the curious, the distro: value should correspond to one of the python scripts in /usr/lib/python2.6/site-packages/cloudinit/distros. As it turns out the AMI I launched had no amazon.py, so you need to use rhel for CentOS. Depending on the AMI you launch and the version of cloud-init, YMMV.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top