When do certain records make it into the index, or how can I parallelize Chef deployments that depend on search?

https://stackoverflow.com/questions/22823187

26-06-2023
Question

I've got a few cookbooks that rely on the search(:node, "query") call in Chef. My question is specifically about the order in which things are added to the index and whether that behavior can be modified.

Here's a generic version of the query that I depend on. Hopefully this will give you additional insight into my question.

Query

some_global_property:cluster01 AND recipes:MyCookBook\:\:SpecificRecipe AND some_sub_property:uniqueString AND chef_environment:DEV
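For anyone assembling a query like this from variables, here is a minimal sketch in plain Ruby. The helper name build_peer_query and its keyword arguments are made up for illustration and are not part of Chef; the only point is that the "::" in a recipe name has to be backslash-escaped, because a colon separates field from value in the query syntax.

```ruby
# Hypothetical helper (not a Chef API): assembles the search query string.
def build_peer_query(cluster:, recipe:, sub_property:, environment:)
  # Escape every colon in the recipe name. The block form of gsub is used
  # so the backslash in the replacement is taken literally.
  escaped_recipe = recipe.gsub(':') { '\\:' }
  [
    "some_global_property:#{cluster}",
    "recipes:#{escaped_recipe}",
    "some_sub_property:#{sub_property}",
    "chef_environment:#{environment}"
  ].join(' AND ')
end

query = build_peer_query(
  cluster: 'cluster01',
  recipe: 'MyCookBook::SpecificRecipe',
  sub_property: 'uniqueString',
  environment: 'DEV'
)
puts query
```

Inside a recipe, the resulting string would be what gets passed as the second argument to search(:node, ...).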

Background

I am using knife ec2 to spin up multiple (three, for testing purposes) instances in a VPC. I have found that the query above returns no results until one node has successfully completed everything in its run list. Once one node completes (plus a few seconds), the search query returns that node. Once another node completes, two result records come back, and so on.

My problem is that I would really like all three nodes to come up in parallel. At the appropriate time, the search() query should return all three records so each of the three nodes can continue its setup in parallel. Essentially, I am using the search() call to make all nodes that are part of a select group aware of each other during configuration. Immediately after configuration, they are set to communicate with one another. Unfortunately, when the nodes are spun up in parallel, they all reach the configuration generation phase at the same time. The search() query returns no instances, and all three are configured as if they were singletons rather than members of a cluster. By the time the communication phase occurs, they all complain that they have no peers!

My question

When does a node report to the indexer what cookbooks / recipes the node has run? My (limited) testing indicates that this does not happen until after chef-client completes successfully. If this is in fact the behavior, can it be modified? How can I get n nodes, started in parallel, to communicate certain information to chef-solr before chef-client terminates successfully? Is sequential deployment my only option?

Solution

Buried within the Chef Client documentation is the answer to my question.

When all of the actions identified by resources in the resource collection have been done, and when the chef-client run finished successfully, the chef-client updates the node object on the Chef server with the node object that was built during this chef-client run. (This node object will be pulled down by the chef-client during the next chef-client run.) This makes the node object (and the data in the node object) available for search.

The answer to my question is that the server (and thus the Solr index) is updated only at the end of a successful run.
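To make the timing concrete, here is a toy model in plain Ruby. It is only an illustration of the ordering described above: ToyServer and peers_seen are invented names, nothing here calls Chef, and the few seconds of indexing delay mentioned in the question are ignored.

```ruby
# Toy model only: none of these names are Chef APIs.
class ToyServer
  def initialize
    @index = [] # names of nodes whose node objects have been saved
  end

  # What a search would return at this moment.
  def search
    @index.dup
  end

  # Stands in for the node object being saved at the end of a successful run.
  def save(name)
    @index << name
  end
end

# Each "run" is reduced to two steps: the search made while generating
# configuration, and the save that happens once the run has finished.
def peers_seen(server, names, parallel:)
  seen = {}
  if parallel
    # Every node reaches the search step before any node has finished.
    names.each { |name| seen[name] = server.search }
    names.each { |name| server.save(name) }
  else
    # Each node finishes (and is saved) before the next one starts.
    names.each do |name|
      seen[name] = server.search
      server.save(name)
    end
  end
  seen
end

names = %w[node1 node2 node3]
parallel_view = peers_seen(ToyServer.new, names, parallel: true)
sequential_view = peers_seen(ToyServer.new, names, parallel: false)

puts parallel_view.inspect
puts sequential_view.inspect
```

In the parallel case every node sees an empty result, which matches the "no peers" symptom from the question. In the sequential case each node sees only the nodes that finished before it.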

Licensed under: CC-BY-SA with attribution
