Terry : Chef

Chef

Concept - Infrastructure as Code

Chef Architecture

Chef Server, Nodes, and Workstations

When using Chef to manage infrastructure, you will be dealing with three types of hosts: a Chef Server, Nodes, and Chef Workstations.

Chef Server

The Chef Server is the central store of your infrastructure's configuration data. It stores the data necessary to configure your nodes and provides search, a powerful tool that lets you drive node configuration dynamically based on data. A REST API makes this data accessible to nodes and Chef Workstations. Optionally, you can use the Chef Server's WebUI to manage your infrastructure through a web interface. Hosted Chef users have a hosted Chef Server, managed as a service by Opscode, with a WebUI that is accessible at http://manage.opscode.com.
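
As a rough illustration of how search can drive configuration, the sketch below filters a hand-written list of node records by role and derives a list of backend addresses from the result. It is a plain-Ruby stand-in and not the Chef Server search API; the node names, the addresses, and the helpers nodes_with_role and backend_addresses are hypothetical.

```ruby
# In-memory stand-in for node data that a Chef Server would index.
# All names and addresses here are made up for illustration.
NODES = [
  { name: "web1", ipaddress: "10.0.0.11", run_list: ["role[web]"] },
  { name: "web2", ipaddress: "10.0.0.12", run_list: ["role[web]"] },
  { name: "db1",  ipaddress: "10.0.0.21", run_list: ["role[db]"] }
].freeze

# Hypothetical helper: return the nodes whose run list contains the given role.
def nodes_with_role(nodes, role)
  nodes.select { |node| node[:run_list].include?("role[#{role}]") }
end

# Derive configuration (a list of backend addresses) from the query result.
def backend_addresses(nodes, role)
  nodes_with_role(nodes, role).map { |node| node[:ipaddress] }
end

puts backend_addresses(NODES, "web").join(", ")
```

When a new web node is added to the list, the derived address list changes without anyone editing it by hand, which is the point of data-driven configuration.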

Nodes

A Node is any host that is configured using chef-client. Chef-client runs on your nodes and contacts the Chef Server for the information necessary to configure the node. From the point of view of the Chef Server, a node is little more than a run list (the list of recipes and roles that will be applied to the node and define its configuration) and attributes (a set of data about the node itself). Since a Node is a machine that runs the chef-client software, nodes are sometimes referred to as "clients". (This is separate from API clients, which authenticate against the Chef Server API.)

When nodes are referred to as "clients", the term points to the link between the chef-client executable and the API client object: chef-client uses the identity information in the API client object to authenticate and be authorized against the API, and then saves the node object on the server.

Chef Workstations

A Chef Workstation is the host you use to modify your cookbooks and other configuration data--typically the local workstation of a system administrator. Such a workstation has two key components:

  1. the knife executable shipped with chef and
  2. a repository containing your infrastructure's configuration documents.

As we will see, these configuration documents include cookbooks, data bags, roles, and more. Ideally, these configuration documents are managed by a version control system. Using knife, your workstation is used to upload configuration data to the Chef Server as well as communicate with individual nodes over SSH when necessary.

Types of Chef

Different Flavors of Chef

  • Chef Solo
    Chef Solo is an open source standalone version of Chef that runs locally on your node, detached from a Chef server. This means that all the information and cookbooks required to configure the node have to be present on the local disk. Cookbooks can also be fetched from a remote URL specified with a command-line or config file option.
  • Chef Client and Chef Server
    The open source Chef Server is a client/server delivery of Chef functionality and power. Chef Client is a Chef agent that runs locally on the systems ("nodes") you are managing with Chef. Chef-client connects to a Chef Server to be told what to do on the node. This allows for more dynamic and flexible configuration management. For example, roles can be applied to sets of nodes for consistency, and nodes can query information stored centrally on the server by other nodes to configure themselves dynamically according to changing conditions elsewhere in your infrastructure.
  • Hosted Chef
  • Private Chef

Chef Concepts

Chef Overview

Chef is a systems and cloud infrastructure automation framework that makes it easy to deploy servers and applications to any physical, virtual, or cloud location, no matter the size of the infrastructure. Chef relies on abstract definitions (known as cookbooks and recipes) that are written in Ruby and are managed like source code. Each definition describes how a specific part of your infrastructure should be built and managed. Chef then applies those definitions to servers and applications, as specified, resulting in a fully automated infrastructure. When a new node is brought online, the only thing that Chef needs to know is which cookbooks and recipes to apply.

The following diagram shows the relationships between the various elements of a Chef organization, including the nodes, the server, and the workstations. These elements work together to provide Chef the information and instruction that it needs so that it can do its job.

Chef comprises three main elements: a server, one (or more) nodes, and at least one workstation.

  • The Chef Server acts as a hub that is available to every node in the Chef organization. This ensures that the right cookbooks (and recipes) are available, that the right policies are being applied, that the node object used during the previous Chef run is available to the current Chef run, and that all of the nodes that will be maintained by Chef are registered and known to the Chef Server.
  • The workstation is the location from which cookbooks (and recipes) are authored, policy data (such as roles, environments, and data bags) are defined, data is synchronized with the Chef repository, and data is uploaded to the Chef Server.
  • Each node contains a chef-client that performs the various infrastructure automation tasks that each node requires.

Cookbooks are also a very important element of Chef and will be treated as a separate component (alongside the server, nodes, and the workstation) across the documentation. In general, the cookbooks are authored and managed from the workstation, moved to the Chef server, and then are pulled down to nodes by the chef-client during each Chef run.

Nodes

A node is any server or virtual server that is configured to be maintained by a chef-client. A node can be physical or cloud-based. A Chef organization comprises any combination of physical and cloud-based nodes. A chef-client runs on each node. Ohai is used to collect data about the system so that it is available to the chef-client during every Chef run.

Workstations

A workstation is a computer that is configured to run Knife, synchronize with the Chef repository, and interact with a single Chef Server. The workstation is the location from which most users of Chef will do most of their work, including:

  • Developing cookbooks and recipes (and authoring them using Ruby)
  • Keeping the Chef repository synchronized with version control
  • Using Knife to upload items from the Chef repository to the Chef Server
  • Configuring organizational policy, including defining roles and environments and ensuring that critical data is stored in data bags
  • Interacting with nodes, as (or when) required, such as performing a bootstrap operation

NOTE: Tool knife-solo

knife-solo adds a handful of commands that aim to make working with chef-solo as powerful as chef-server. It currently adds 5 subcommands to knife:

  • knife solo init is used to create a new directory structure (i.e. "kitchen") that fits with Chef's standard structure and can be used to build and store recipes.
  • knife solo prepare installs Chef on a given host. It's structured to auto-detect the target OS and change the installation process accordingly.
  • knife solo cook uploads the current kitchen (Chef repo) to the target host and runs chef-solo on that host.
  • knife solo bootstrap combines the two previous ones (prepare and cook).
  • knife solo clean removes the uploaded kitchen from the target host.

Documentation: http://matschaffer.github.io/knife-solo/

Chef Server

Node Objects

NOTE: Chef Solo uses a local JSON attribute file.

For Chef, two important aspects of nodes are groups of attributes and run-lists.

An attribute is a specific piece of data about the node, such as a network interface, a file system, the number of clients a service running on a node is capable of accepting, and so on.

A run-list is an ordered list of recipes and/or roles that are run in an exact order. The node object consists of the run-list and the node attributes; it is stored as JSON on the Chef Server (or in a local JSON file for Chef Solo). The chef-client gets a copy of the node object from the Chef Server during each Chef run and places an updated copy on the Chef Server at the end of each Chef run.
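
To make the shape of a node object more concrete, here is a minimal sketch that builds one as a Ruby hash and prints it as JSON. The field names are simplified assumptions rather than the exact schema used by the Chef Server, and the helper build_node_object, the node name, and the attribute values are hypothetical.

```ruby
require "json"

# Hypothetical helper: build a minimal node object from a name,
# an ordered run-list, and a hash of attributes.
# The key names are a simplification chosen for this sketch.
def build_node_object(name, run_list, attributes)
  { "name" => name, "run_list" => run_list, "attributes" => attributes }
end

node = build_node_object(
  "web1.example.com",
  ["role[base]", "recipe[apache2]"],
  { "apache" => { "listen_port" => 80 } }
)

# Serialize to JSON, the format in which the node object is stored.
puts JSON.pretty_generate(node)
```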

Attribute

An attribute is a specific detail about a node.

Attributes are used by Chef to understand:

  • The current state of the node
  • What the state of the node was at the end of the previous Chef run
  • What the state of the node should be at the end of the current Chef run

Attributes are defined by:

  • The state of the node itself
  • Cookbooks (in attribute files and/or recipes)
  • Roles
  • Environments

During every Chef run, the chef-client builds the attribute list using:

  • Data about the node collected by Ohai
  • The node object that was saved to the Chef Server at the end of the previous Chef run
  • The rebuilt node object from the current Chef run, after it is updated for changes to cookbooks (attribute files and/or recipes), roles, and/or environments, and updated for any changes to the state of the node itself

After the node object is rebuilt, all of the attributes are compared, and then the node is updated based on attribute precedence. At the end of every Chef run, the node object that defines the current state of the node is uploaded to the Chef Server so that it can be indexed for search.
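
The idea of resolving attributes by precedence can be sketched as a layered deep merge. The example below is a simplification: it assumes only four levels (default, normal, override, automatic), whereas Chef defines more levels and more sources, and the helpers deep_merge and resolve_attributes are hypothetical names.

```ruby
# Simplified precedence order, lowest to highest. Real Chef defines more
# levels; this four-level list is an assumption made for the sketch.
PRECEDENCE = %w[default normal override automatic].freeze

# Recursively merge two hashes; values in `high` win over values in `low`.
def deep_merge(low, high)
  low.merge(high) do |_key, low_val, high_val|
    if low_val.is_a?(Hash) && high_val.is_a?(Hash)
      deep_merge(low_val, high_val)
    else
      high_val
    end
  end
end

# Collapse attributes grouped by level into one hash, applying each
# level in precedence order so higher levels overwrite lower ones.
def resolve_attributes(by_level)
  PRECEDENCE.reduce({}) do |merged, level|
    deep_merge(merged, by_level.fetch(level, {}))
  end
end

attrs = resolve_attributes({
  "default"   => { "apache" => { "port" => 80, "workers" => 4 } },
  "override"  => { "apache" => { "port" => 8080 } },
  "automatic" => { "platform" => "ubuntu" }
})
p attrs
```

In this sketch the override value for the port replaces the default, while the default number of workers survives because no higher level sets it.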

Run-list

A run-list is an ordered list of roles and/or recipes that are run in an exact order. A run-list is always specific to the node on which it runs, though it is possible for many nodes to have run-lists that are similar or even identical. The items within a run-list are maintained using Knife and are uploaded to the Chef Server and stored as part of the node object for each node. Chef always configures a node in the exact order specified by its run-list and will never run the same recipe twice.
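
The ordering and "never twice" rules can be illustrated with a small expansion routine. This is a sketch under stated assumptions, not Chef's implementation: the role definitions and the helper expand_run_list are hypothetical, and there is no protection against roles that include each other in a cycle.

```ruby
# Hypothetical roles, each mapping to its own ordered run-list.
ROLES = {
  "base" => ["recipe[apt]", "recipe[ntp]"],
  "web"  => ["role[base]", "recipe[apache2]"]
}.freeze

# Expand a run-list into a flat, ordered list of recipes. Roles are
# expanded in place (recursively), and a recipe that has already been
# seen is skipped so that it runs only once.
def expand_run_list(run_list, roles, seen = [])
  run_list.each do |item|
    if (match = item.match(/\Arole\[(.+)\]\z/))
      expand_run_list(roles.fetch(match[1]), roles, seen)
    elsif !seen.include?(item)
      seen << item
    end
  end
  seen
end

p expand_run_list(["role[web]", "recipe[apt]", "recipe[mysql]"], ROLES)
```

Here recipe[apt] appears both inside the base role and directly in the run-list, but it is kept only at its first position.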

Policy

Policy settings can be used to map the capabilities of Chef to business and operational requirements, such as process and workflow. Roles define server types, such as "web server" or "database server". Environments define process, such as "dev", "staging", or "production". Certain types of data, such as passwords, user account data, and other sensitive items can be placed in data bags, which are located in a secure sub-area of Chef that can only be accessed by nodes that have the correct SSL certificates.

  • A role is a way to define certain patterns and processes that exist across nodes in a Chef organization as belonging to a single job function. Each role consists of zero (or more) attributes and a run list. Each node can have zero (or more) roles assigned to it. When a role is run against a node, the configuration details of that node are compared against the attributes of the role, and then the contents of that role's run list are applied to the node’s configuration details. When a chef-client runs, it merges its own attributes and run lists with those contained within each assigned role.
  • A data bag is a global variable that is stored as JSON data and is accessible from a Chef Server. A data bag is indexed for searching and can be loaded by a recipe or accessed during a search. The contents of a data bag can vary, but they often include sensitive information (such as database passwords).
  • An environment is a way to map an organization’s real-life workflow to what can be configured and managed when using Chef Server. Every Chef organization begins with a single environment called the _default environment, which cannot be modified (or deleted). Additional environments can be created, such as production, staging, testing, and development. Generally, an environment is also associated with one (or more) cookbook versions.
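
As a small illustration of the data bag idea, the sketch below parses one JSON item and looks it up by bag name and item id. It is an in-memory stand-in, not the Chef API: the bag name, the item contents, and the helpers load_data_bag and data_bag_item are hypothetical. The one convention taken from Chef is that each item carries an "id" field.

```ruby
require "json"

# A data bag item is a JSON document; each item carries an "id" field.
# The contents below are made up for illustration.
DB_ITEM_DOCUMENT = <<~ITEM
  {
    "id": "mysql",
    "user": "app",
    "password": "change-me"
  }
ITEM

# Hypothetical in-memory stand-in for data bags:
# bag name => { item id => item }.
def load_data_bag(bag_name, item_documents)
  items = item_documents.map { |doc| JSON.parse(doc) }
  { bag_name => items.each_with_object({}) { |item, bag| bag[item.fetch("id")] = item } }
end

DATA_BAGS = load_data_bag("credentials", [DB_ITEM_DOCUMENT])

# Look up one item by bag name and item id, as a recipe might.
def data_bag_item(bags, bag_name, item_id)
  bags.fetch(bag_name).fetch(item_id)
end

puts data_bag_item(DATA_BAGS, "credentials", "mysql")["user"]
```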

Cookbooks

A cookbook is the fundamental unit of configuration and policy distribution in Chef. Each cookbook defines a scenario, such as everything needed to install and configure MySQL, and then it contains all of the components that are required to support that scenario, including:

  • Attribute values that are set on nodes
  • Definitions that allow the creation of reusable collections of resources
  • File distributions
  • Libraries that extend Chef and/or provide helpers to Ruby code
  • Recipes that specify which resources to manage and the order in which those resources will be applied
  • Custom resources and providers
  • Templates
  • Metadata about recipes (including dependencies), version constraints, supported platforms, and so on

Chef uses Ruby as its reference language for creating cookbooks and defining recipes, with an extended DSL for specific resources. Chef provides a reasonable set of resources, enough to support many of the most common infrastructure automation scenarios; however, this DSL can also be extended when additional resources and capabilities are required.

Some important components of cookbooks include:

An attribute can be defined in a cookbook (or a recipe) and then used to override the default settings on a node. When a cookbook is loaded during a Chef run, these attributes are compared to the attributes that are already present on the node. When the cookbook attributes take precedence over the default attributes, Chef will apply those new settings and values during the Chef run on the node.

A recipe is the most fundamental configuration element within the Chef environment.

A recipe

  • Is authored using Ruby, which is a programming language designed to read and behave in a predictable manner
  • Is mostly a collection of resources in a Ruby syntax with some helper code around it
  • Must define everything that is required to configure part of a system
  • MUST be stored in a cookbook
  • May be included in another recipe
  • May use the results of a search query and read the contents of a data bag (including an encrypted data bag)
  • May have a dependency on one (or more) recipes
  • May be tagged to facilitate the creation of arbitrary groupings that exist outside of the normal naming conventions an organization may have
  • Must be added to a run-list before it can be used by Chef
  • Is always executed in the same order as listed in a run-list

A cookbook version represents a specific set of functionality that is different from the cookbook on which it is based. A version may exist for many reasons, such as ensuring that the correct version of a third-party component is being used appropriately or providing an update to a cookbook that fixes a bug or adds a new improvement. A cookbook version can be defined using syntax and operators, it can be associated with environments, cookbook metadata, or run-lists, and it can be frozen (to prevent unwanted updates from being made). A cookbook version is handled just like a cookbook with regard to how the repository sees a cookbook version, how cookbook versions are stored on the Chef Server, how cookbook versions are pushed out to nodes, and how cookbook versions are used during a Chef run.
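
Version constraints of this kind can be tried out with RubyGems' own version classes, which support operators such as "=", ">=", and "~>". The sketch below uses them as a stand-in for Chef's constraint handling; the list of available versions and the helper pick_version are hypothetical.

```ruby
require "rubygems" # provides Gem::Version and Gem::Requirement

# Hypothetical set of versions of one cookbook that have been uploaded.
AVAILABLE_VERSIONS = %w[1.0.0 1.2.0 1.2.5 2.0.1].freeze

# Return the highest available version that satisfies the constraint,
# or nil when none does. RubyGems' requirement syntax is used here as a
# stand-in for Chef's own constraint handling.
def pick_version(available, constraint)
  requirement = Gem::Requirement.new(constraint)
  best = available
         .map { |v| Gem::Version.new(v) }
         .select { |v| requirement.satisfied_by?(v) }
         .max
  best && best.to_s
end

puts pick_version(AVAILABLE_VERSIONS, "~> 1.2")  # pessimistic constraint
puts pick_version(AVAILABLE_VERSIONS, "= 1.0.0") # exact pin
```

Pinning an environment to an exact version trades flexibility for predictability, while a pessimistic constraint admits newer releases below the next major version.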

Chef will run a recipe ONLY when asked. When Chef runs the same recipe more than once, the results will be the same system state each time. When a recipe is run against a system, but nothing has changed on either the system or in the recipe, Chef won’t change anything.
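
This "same result every time, no change when nothing differs" behavior can be sketched with a tiny stand-in for a file resource. The helper converge_file is hypothetical and far simpler than a Chef resource; it only shows the check-before-change pattern.

```ruby
require "tmpdir"

# Hypothetical, minimal "file resource": bring a file to the desired
# content. Returns true when it had to change something, false when the
# file was already in the desired state.
def converge_file(path, desired_content)
  return false if File.exist?(path) && File.read(path) == desired_content

  File.write(path, desired_content)
  true
end

Dir.mktmpdir do |dir|
  motd = File.join(dir, "motd")
  puts converge_file(motd, "Managed by Chef\n") # first run: file is written
  puts converge_file(motd, "Managed by Chef\n") # second run: nothing to do
end
```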

In addition to attributes, recipes, and versions, the following items are also part of cookbooks:

  • Resources and providers
    A resource is a package, a service, a group of users, and so on. A resource tells Chef which provider to use during a Chef run for various tasks like installing packages, running Ruby code, or accessing directories and file systems. The resource is generic: "install program A" while the provider knows what to do with that process on Debian and Ubuntu and Microsoft Windows. A provider defines the steps that are required to bring that piece of the system into the desired state. Chef includes default providers that cover all of the most common scenarios.
  • File distributions
    A file distribution is a specific type of resource that tells a cookbook how to distribute files, including by node, by platform, or by file version.
  • Definitions
    A definition is used to create new resources by stringing together one (or more) existing resources.
  • Libraries
    A library allows the use of arbitrary Ruby code in a cookbook, either as a way to extend the Chef language or to implement a new class.
  • Templates
    A template is a file written in a markup language that uses Ruby statements to solve complex configuration scenarios.
  • Configuration files
    A metadata file is used to ensure that each cookbook is correctly deployed to each node.
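
Chef templates use Embedded Ruby (ERB) syntax, so the template idea from the list above can be tried with Ruby's standard erb library outside of Chef. The sketch below is only an illustration: the template text, the variable names, and the helper render_vhost are made up, and a real cookbook would render templates through Chef rather than by calling ERB directly.

```ruby
require "erb"

# A small template: plain text with embedded Ruby statements (<% %>)
# and expressions (<%= %>). The variable names are made up.
VHOST_TEMPLATE = <<~TEMPLATE
  Listen <%= port %>
  <% server_names.each do |name| -%>
  ServerAlias <%= name %>
  <% end -%>
TEMPLATE

# Render the template with the given values.
def render_vhost(port, server_names)
  ERB.new(VHOST_TEMPLATE, trim_mode: "-").result_with_hash(
    port: port, server_names: server_names
  )
end

puts render_vhost(8080, ["example.com", "www.example.com"])
```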

Chef Bootstrap

Chef Solo Bootstrap (Shell Script)

Chef Solo Bootstrap script for Oracle Linux / CentOS 5.x

#!/bin/bash
# --------------------------------------
#
#     Title: Chef Solo Bootstrap
#    Author: Terry Wang
#     Email: i (at) terry (dot) im
#  Homepage: http://terry.im
#      File: bootstrap-ubuntu.sh
#   Created: Feb, 2013
#
#   Purpose: Bootstrap Ruby and Chef on Ubuntu
#
# --------------------------------------
########### Setup Variables #############
RUBY_VERSION="1.9.3-p448"
########### Setup Variables #############

# Keep system up-to-date
sudo apt-get -y update

# Install dependencies
sudo apt-get -y install build-essential git libssl-dev zlib1g-dev \
libreadline6-dev libyaml-dev

# Check if rbenv is already installed
if [ -d ~/.rbenv ]
then
  echo "rbenv already installed, remove ~/.rbenv if you want to install."
  exit 1
fi

# Install rbenv and ruby-build
git clone https://github.com/sstephenson/rbenv.git ~/.rbenv
git clone https://github.com/sstephenson/ruby-build.git ~/.rbenv/plugins/ruby-build

# Set up PATH
echo 'export PATH="$HOME/.rbenv/bin:$PATH"' >> ~/.profile
# Enable shims and autocompletion
echo 'eval "$(rbenv init -)"' >> ~/.profile

# ~/.profile will NOT be read if ~/.bash_profile is present
if [ -f ~/.bash_profile ] || [ -h ~/.bash_profile ]
then
  echo "~/.bash_profile found, backing up to ~/.bash_profile.old"
  mv ~/.bash_profile{,.old}
fi

# ~/.profile will NOT be read if ~/.bash_login is present
if [ -f ~/.bash_login ] || [ -h ~/.bash_login ]
then
  echo "~/.bash_login found, backing up to ~/.bash_login.old"
  mv ~/.bash_login{,.old}
fi

# Reload ~/.profile for Ubuntu
source ~/.profile

# Install ruby from source via ruby-build
rbenv install $RUBY_VERSION

# Set system wide
rbenv global $RUBY_VERSION

# Skip to avoid issue CHEF-3933
# echo "Updating rubygems..."
# gem update --system

# Install gems
gem install rbenv-rehash bundler chef berkshelf ruby-shadow --no-ri --no-rdoc

ret=$?
if [ $ret -ne 0 ]; then
    echo "Unfortunately something went wrong..." >&2
    exit $ret
fi
echo "Ready to cook!"

# Restart shell as login shell; exec replaces this script, so it must come last
exec "$SHELL" -l

NOTE: The here document part is for Oracle Linux 5.7 base install use case ONLY, the requirement was to stay on 5.7 release, no updates should be applied.

Chef Solo Bootstrap script

Ubuntu 12.04, 13.04

Enhanced by using rbenv, Gist => https://gist.github.com/terrywang/5070354

chef_solo_bootstrap.sh
#!/bin/bash
# --------------------------------------
#
#     Title: Chef Solo Bootstrap
#    Author: Terry Wang
#     Email: i (at) terry (dot) im
#  Homepage: http://terry.im
#      File: bootstrap-ubuntu.sh
#   Created: Feb, 2013
#
#   Purpose: Bootstrap Ruby and Chef on Ubuntu
#
# --------------------------------------
########### Setup Variables #############
RUBY_VERSION="1.9.3-p448"
########### Setup Variables #############

# Keep system up-to-date
sudo apt-get -y update

# Install dependencies
sudo apt-get -y install build-essential git libssl-dev zlib1g-dev \
libreadline6-dev libyaml-dev

# Check if rbenv is already installed
if [ -d ~/.rbenv ]
then
  echo "rbenv already installed, remove ~/.rbenv if you want to install."
  exit 1
fi

# Install rbenv and ruby-build
git clone https://github.com/sstephenson/rbenv.git ~/.rbenv
git clone https://github.com/sstephenson/ruby-build.git ~/.rbenv/plugins/ruby-build

# Set up PATH
echo 'export PATH="$HOME/.rbenv/bin:$PATH"' >> ~/.profile
# Enable shims and autocompletion
echo 'eval "$(rbenv init -)"' >> ~/.profile

# ~/.profile will NOT be read if ~/.bash_profile is present
if [ -f ~/.bash_profile ] || [ -h ~/.bash_profile ]
then
  echo "~/.bash_profile found, backing up to ~/.bash_profile.old"
  mv ~/.bash_profile{,.old}
fi

# ~/.profile will NOT be read if ~/.bash_login is present
if [ -f ~/.bash_login ] || [ -h ~/.bash_login ]
then
  echo "~/.bash_login found, backing up to ~/.bash_login.old"
  mv ~/.bash_login{,.old}
fi

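# Reload ~/.profile so that rbenv is on PATH for the rest of this script.
# Do not exec a new login shell here: exec replaces the running script,
# so none of the steps below would run.
source ~/.profile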

# Install ruby from source via ruby-build
rbenv install $RUBY_VERSION -v

# Set system wide
rbenv global $RUBY_VERSION

# Skip to avoid issue CHEF-3933
# echo "Updating rubygems..."
# gem update --system

# Install gems
gem install rbenv-rehash bundler chef ruby-shadow --no-ri --no-rdoc

# Restart shell as login shell
# exec $SHELL -l

ret=$?
if [ $ret -ne 0 ]; then
	echo "Unfortunately something went wrong..." >&2
else
	echo "Ready to cook!"
fi
exit $ret

NOTE: libreadline-gplv2-dev (depends on libreadline5-dev) VS libreadline6-dev

Bootstrap a node using knife

The Chef Server acts as a hub for configuration data. The Chef Server stores cookbooks, the policies that are applied to cookbooks, and metadata that describes each registered node in the infrastructure. Nodes use the chef-client to ask the Chef Server for configuration details, such as recipes, templates, and file distributions. The chef-client then does as much of the configuration work as possible on the nodes themselves (and not on the Chef Server). This scalable approach distributes the configuration effort throughout the organization.

Steps on how to bootstrap a node

  1. Identify the FQDN or IP address for the node
  2. Run the knife bootstrap command
  3. Verify the node on the Chef Server

Identify the FQDN or IP Address

The knife bootstrap command requires the FQDN or the IP address for the node in order to complete the bootstrap operation.

Run the knife bootstrap command

Once the workstation is configured, it can be used to install Chef on one (or more) nodes across the organization using a Knife bootstrap operation. The knife bootstrap command is used to SSH into the target machine, and then do what is needed to allow the chef-client to run on the node. It will install the chef-client executable (if necessary), generate keys, and register the node with the Chef Server. The bootstrap operation requires the IP address or FQDN of the target system, the SSH credentials (username, password or identity file) for an account that has root access to the node, and (if the operating system is not Ubuntu, which is the default distribution used by knife bootstrap) the operating system running on the target system.

To install Chef on a node using knife bootstrap

# FQDN
knife bootstrap FQDN -x username -P password --sudo
# IP
knife bootstrap IP_ADDR -x username -P password --sudo

Example: bootstrap an Opscode Bento Vagrant base box using knife

knife bootstrap localhost \
  --ssh-user vagrant \
  --ssh-password vagrant \
  --ssh-port 2222 \
  --run-list "recipe[apt],recipe[aliases],recipe[apache2],recipe[networking_basic]" \
  --sudo
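
When many nodes are bootstrapped from a script, it can help to assemble the command line programmatically. The sketch below builds a knife bootstrap command string using only the options shown in the example above; the helper knife_bootstrap_command is hypothetical, and it only returns the command text without running it.

```ruby
require "shellwords"

# Assemble a `knife bootstrap` command line from its parts, using only
# options that appear in the example above. The helper itself is
# hypothetical; it builds the command string but does not run it.
def knife_bootstrap_command(host, user:, password:, port: nil, run_list: [], sudo: true)
  args = ["knife", "bootstrap", host, "--ssh-user", user, "--ssh-password", password]
  args += ["--ssh-port", port.to_s] if port
  args += ["--run-list", run_list.join(",")] unless run_list.empty?
  args << "--sudo" if sudo
  # Shellwords.join escapes characters such as [ and ] for the shell.
  Shellwords.join(args)
end

puts knife_bootstrap_command(
  "localhost",
  user: "vagrant", password: "vagrant", port: 2222,
  run_list: ["recipe[apt]", "recipe[apache2]"]
)
```

Passing a password on the command line exposes it to other users on the workstation, so an identity file is usually the safer choice where available.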

NOTE: knife bootstrap

Verify the node

After a bootstrap operation has finished, verify that the node is recognized by the Chef Server. Use one of the following Knife subcommands:

# show only the node that was just bootstrapped
knife client show NAME_OF_NODE

where NAME_OF_NODE is the name of the node that was just bootstrapped.

The Chef Server will return something like the following:

admin: false
chef_type: client
json_class: Chef::ApiClient
name: name_of_node
public_key:

To show the full list of nodes (and workstations) that are registered with the Chef Server, run the following command:

knife client list

Reference: Bootstrap a Node

Attachments:

chef-basics.png (image/png)