A common data manipulation task in R involves merging two data frames together. One of the simplest ways to do this is with the cbind function. The cbind function – short for column bind – can be used to combine two data frames with the same number of rows into a single data frame.
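To make the same-number-of-rows requirement concrete, here is a minimal sketch using two throwaway data frames (`seven_rows` and `three_rows` are hypothetical names, not part of the example that follows). With 7 and 3 rows, cbind stops with an error instead of returning a combined data frame.

```r
# Two throwaway data frames with different row counts (hypothetical names)
seven_rows <- data.frame(x = 1:7)
three_rows <- data.frame(y = 1:3)

# cbind() on these data frames fails; tryCatch() captures the error
# so the script keeps running
result <- tryCatch(
  cbind(seven_rows, three_rows),
  error = function(e) e
)

inherits(result, "error")   # TRUE: the bind was rejected
conditionMessage(result)    # the reason reported by R
```

Checking `nrow()` on both inputs before binding is a cheap way to catch this early.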
While simple, cbind addresses a fairly common issue with small datasets: missing or confusing codes. For example, suppose we want to look at our manufacturing productivity. Our basic data is very simple: a database of the number of widgets made per hour by different machine operators. Unfortunately, this database doesn’t track operators by their actual name but rather by a code such as “Op01”. That will make our analysis a little challenging. Cbind to the rescue!
This example is going to cover a couple of topics. First, we’re going to use cbind to join two sets of columns together into a single data frame. This will address the problem described above: getting information from a legacy system with weird or unreadable codes.
Next we’re going to show how you can use cbind to quickly append information to an existing data frame on the fly. In this case, we’re going to add notes on where the operators were hired. This will allow an analyst to examine performance by hiring source. Cbind is good for these sorts of exercises, where you want to quickly derive an attribute from notes or history and append it to your data.
We will start by setting up the data.
# cbind in r - data for example
activity <- data.frame(opid = c("Op01", "Op02", "Op03", "Op04", "Op05", "Op06", "Op07"),
                       units = c(23, 43, 21, 32, 13, 12, 32))
# note: this object shares its name with base R's names() function;
# calls to names() still work because R looks for a function when one is called
names <- data.frame(operator = c("Larry", "Curly", "Moe", "Jack", "Jill", "Kim", "Perry"))

> activity
  opid units
1 Op01    23
2 Op02    43
3 Op03    21
4 Op04    32
5 Op05    13
6 Op06    12
7 Op07    32

> names
  operator
1    Larry
2    Curly
3      Moe
4     Jack
5     Jill
6      Kim
7    Perry
And now to combine it…
# how to use cbind in r
blended <- cbind(activity, names)

> blended
  opid units operator
1 Op01    23    Larry
2 Op02    43    Curly
3 Op03    21      Moe
4 Op04    32     Jack
5 Op05    13     Jill
6 Op06    12      Kim
7 Op07    32    Perry
There we go… much easier to read. We can see how everyone is doing.
Cbind Examples – append data attributes
Continuing our example a little further, we likely collected this data because we want to analyze it a bit. Perhaps we want to look at productivity based on where the worker was recruited?
We will use cbind to append another column below. Since we hired our employees based on their roles in classic movies (the Three Stooges), nursery books (Jack and Jill), and cartoons (Kim Possible and Phineas and Ferb), we will note the source of the hire.
# cbind in r column names
sourceofhire <- data.frame(found = c("Movie", "Movie", "Movie", "Book", "Book", "TV", "TV"))
# append the new column to the data frame we already built
blended <- cbind(blended, sourceofhire)

> blended
  opid units operator found
1 Op01    23    Larry Movie
2 Op02    43    Curly Movie
3 Op03    21      Moe Movie
4 Op04    32     Jack  Book
5 Op05    13     Jill  Book
6 Op06    12      Kim    TV
7 Op07    32    Perry    TV
As you can see, we can use cbind to slap an additional set of attributes onto the dataset in a couple of seconds.
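For a single new column, you may not need to build a separate data frame first. As a sketch, cbind also accepts a named vector and uses the argument name as the column name; assigning with `$` is a common alternative. The block rebuilds the example data so it runs on its own, and `blended_a` and `blended_b` are names introduced here purely for illustration.

```r
# Rebuild the example data so this block is self-contained
blended <- data.frame(
  opid     = c("Op01", "Op02", "Op03", "Op04", "Op05", "Op06", "Op07"),
  units    = c(23, 43, 21, 32, 13, 12, 32),
  operator = c("Larry", "Curly", "Moe", "Jack", "Jill", "Kim", "Perry")
)

found <- c("Movie", "Movie", "Movie", "Book", "Book", "TV", "TV")

# Option A: pass a named vector to cbind(); the name becomes the column name
blended_a <- cbind(blended, found = found)

# Option B: assign the new column directly on a copy
blended_b <- blended
blended_b$found <- found

names(blended_a)
names(blended_b)
```

Both options add the same `found` column; which one reads better is largely a matter of taste.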
In fact, since cbind can join multiple sets of columns at once, we could have done this in one shot.
# r merge multiple data frames
# bind all three sets of columns in a single call
blended <- cbind(activity, names, sourceofhire)

> blended
  opid units operator found
1 Op01    23    Larry Movie
2 Op02    43    Curly Movie
3 Op03    21      Moe Movie
4 Op04    32     Jack  Book
5 Op05    13     Jill  Book
6 Op06    12      Kim    TV
7 Op07    32    Perry    TV
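With the hiring source attached, the productivity question raised earlier becomes a short query. The sketch below is one possible approach, not the only one: it uses base R's aggregate to average units per hour by source, and `by_source` is a name introduced here for illustration. The data is rebuilt so the block runs on its own.

```r
# Rebuild the combined data so this block is self-contained
blended <- data.frame(
  opid     = c("Op01", "Op02", "Op03", "Op04", "Op05", "Op06", "Op07"),
  units    = c(23, 43, 21, 32, 13, 12, 32),
  operator = c("Larry", "Curly", "Moe", "Jack", "Jill", "Kim", "Perry"),
  found    = c("Movie", "Movie", "Movie", "Book", "Book", "TV", "TV")
)

# Mean units per hour for each hiring source
by_source <- aggregate(units ~ found, data = blended, FUN = mean)
by_source
```

With the sample data, the Movie hires average 29 units per hour, against 22.5 for Book and 22 for TV.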
Up to now we have been looking at simple merges where you rely on rows being in the same order in every data frame. For more complicated joins, take a look at our article about merging data frames.
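To see why row order matters, here is a minimal sketch. It assumes a hypothetical lookup table, `roster`, that carries the operator code alongside the name and arrives in a different row order. cbind pairs rows purely by position and would mislabel the operators, while base R's merge matches on the shared `opid` column.

```r
activity <- data.frame(opid  = c("Op01", "Op02", "Op03"),
                       units = c(23, 43, 21))

# Hypothetical lookup table: same operators, different row order
roster <- data.frame(opid     = c("Op03", "Op01", "Op02"),
                     operator = c("Moe", "Larry", "Curly"))

# Positional bind: row 1 of activity (Op01) gets paired with Moe, which is wrong
by_position <- cbind(activity, operator = roster$operator)

# Key-based join: rows are matched on opid regardless of order
by_key <- merge(activity, roster, by = "opid")
by_key
```

When both tables share a key column like this, a key-based join is the safer choice, since it does not depend on how either table happens to be sorted.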