Computers have changed the world in countless ways, and anyone interested in data science can attest to how much they’ve changed our relationship with math. Computers can work with a volume of numerical data that would quickly overwhelm any human, and they can perform those calculations in the blink of an eye. But there’s still the question of how to actually implement any given procedure. Statistics-oriented languages like R provide us with a wealth of options, and we often have more than one way to reach a single end goal. For example, how would we go about calculating the dot product of two vectors in R? And how do the various methods stack up against each other?
Dot Products and Definitions
It’s important to pause and define what a dot product is before we look at practical implementations. The term can mean different things in different fields. In this case, we define it as a single number calculated from two numeric vectors of equal length. We multiply each element in one vector by its corresponding element in the other and then add up those products.
The classic example of a dot product involves looking at an object in motion. We can model the situation with two vectors: the force applied to the object and the object’s displacement. Both can be written as a sequence of numbers, one per coordinate axis, and their dot product gives the work done on the object. We can also approach the dot product from a geometric perspective, since its value depends on the angle between the two vectors. A dot product can even be used to check whether two vectors meet at a right angle, because the dot product of perpendicular vectors is zero.
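As a small sketch of that right-angle check, the base R code below multiplies two vectors element by element and sums the products. The vectors and variable names are invented for illustration. A result of zero indicates that the two vectors are perpendicular.

```r
# Two illustrative 2-D vectors (hypothetical values chosen for this sketch)
vecA <- c(2, 3)
vecB <- c(-3, 2)

# Multiply corresponding elements, then sum the products
dotAB <- sum(vecA * vecB)

# A dot product of zero means the vectors meet at a right angle
print(dotAB == 0)
```

With measured or fractional values, rounding can leave a tiny nonzero result, so a comparison with a small tolerance is usually safer than an exact test against zero.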
In short, we multiply corresponding elements in two numerical collections to derive a new collection of numbers. We then take the sum of those values to use as the dot product. Calculating a dot product can sound like a complex process. But the methodology used to calculate it becomes a lot clearer when you actually see it in action.
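The two steps described above can be written out long-hand in base R. This is only a sketch with made-up vectors, meant to show the intermediate collection of products before it gets summed.

```r
# Illustrative vectors (assumed values chosen to keep the arithmetic easy to follow)
first  <- c(1, 2, 3)
second <- c(4, 5, 6)

# Step 1: multiply corresponding elements to get a new collection of numbers
products <- first * second
print(products)    # the products are 4, 10 and 18

# Step 2: add those products together to get the dot product
dotProduct <- sum(products)
print(dotProduct)  # 4 + 10 + 18 = 32
```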
Starting Out With a Basic Solution
The previous description of dot products might make it sound like a lengthy procedure. And, to be sure, that’s often true when the calculations are being done by hand. But take a look at how efficiently the process can be handled in R.
ourVec1 <- c(1, 2, 3, 4, 5)
ourVec2 <- c(6, 7, 8, 9, 10)
ourResult <- ourVec1 %*% ourVec2
print(ourResult)
We begin by defining two vectors, ourVec1 and ourVec2. Each vector consists of five numbers. We always need to make sure that the vectors have the same number of elements. If you recall the previous description, a dot product is calculated by first multiplying corresponding numbers from each vector. This only works if each number in one vector has a corresponding element in the other. Again, think of the earlier example of an object in motion. Every entry in one sequence of measurements needs a matching entry in the other.
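To illustrate why the lengths matter, here is a short sketch of what happens when they don’t match. The tooShort vector is a hypothetical addition for this example, and try() is used only so that the script keeps running after the error.

```r
ourVec1 <- c(1, 2, 3, 4, 5)
tooShort <- c(6, 7, 8)  # hypothetical vector with only three elements

# Compare lengths before attempting the calculation
sameLength <- length(ourVec1) == length(tooShort)
print(sameLength)

# %*% stops with an error for these mismatched vectors;
# try() captures the error so the script can continue
attempt <- try(ourVec1 %*% tooShort, silent = TRUE)
print(inherits(attempt, "try-error"))
```

Checking the lengths up front, as in the first comparison, tends to give a clearer failure than waiting for the operator to complain.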
With everything properly defined we can now move on to performing the calculation. We assign the outcome to ourResult. To get it, we use R’s matrix multiplication operator, %*%, which we simply place between our two vectors. Matrix multiplication behaves differently depending on the data supplied to it. But when we supply two vectors of the same length, the operator returns their inner product, which is the dot product. This differs from the standard multiplication operator, *, which multiplies the vectors element by element and returns a vector of the individual products. The matrix multiplication operator performs that multiplication and then sums the values for us. Keep in mind that the result is returned as a matrix with one row and one column rather than as a plain number. For our two vectors that single value is 130.
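The shape of that result can be inspected directly. In the sketch below, dim() shows that %*% hands back a one-by-one matrix, and base R’s drop() is one way to turn it into an ordinary number. The plainResult name is just an illustrative choice.

```r
ourVec1 <- c(1, 2, 3, 4, 5)
ourVec2 <- c(6, 7, 8, 9, 10)

ourResult <- ourVec1 %*% ourVec2

# The operator hands back a matrix with one row and one column
print(dim(ourResult))

# drop() removes the matrix dimensions, leaving an ordinary number
plainResult <- drop(ourResult)
print(plainResult)

# The same value computed long-hand: element-wise multiply, then sum
print(sum(ourVec1 * ourVec2))
```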
This method does produce the end result we’re looking for. But we can make this already concise process even more readable through the use of a third-party package called pracma.
Making Things Easier With Pracma
R packages tend to focus on a single idea or collection of related functions. For example, the tidyr package focuses on methods to create and maintain tidy datasets. But the pracma package is more like a toolbox of useful functions. It consists of functions that implement commonly used formulas, numerical optimizations, and many other tools which streamline the coding process. And as luck would have it, we also find a dot product function within pracma. Take a look at the following code to see it in action.
library(pracma)
ourVec1 <- c(1, 2, 3, 4, 5)
ourVec2 <- c(6, 7, 8, 9, 10)
ourResult <- dot(ourVec1, ourVec2)
print(ourResult)
We begin by loading the pracma package, which needs to be installed first with install.packages("pracma") if it isn’t already available. We then define our variables in the same way as in the first example. But this time around we simply need to call pracma’s dot function to get the dot product of ourVec1 and ourVec2. We just pass those two vectors as arguments and assign the result to ourResult. And finally, we print ourResult to screen.
Using dot isn’t a massive leap forward in overall usability. But it does make the intent of the code obvious at a glance, and it returns the result as a plain number rather than the one-by-one matrix that %*% produces. As such, it’s a good choice for calculating a dot product if you’re going to perform the operation regularly within your code. However, we can also expand on both of these ideas to scale up what they’re capable of.
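For a side-by-side look at the approaches covered so far, the following sketch computes the same dot product three ways. The requireNamespace() guard is an assumption added here so the script still runs on a machine without pracma installed, and the viaOperator, viaSum, and viaDot names are illustrative.

```r
ourVec1 <- c(1, 2, 3, 4, 5)
ourVec2 <- c(6, 7, 8, 9, 10)

viaOperator <- drop(ourVec1 %*% ourVec2)  # matrix multiplication, dimensions dropped
viaSum <- sum(ourVec1 * ourVec2)          # element-wise multiply, then sum

# Only call pracma if it is installed, so the script runs either way
if (requireNamespace("pracma", quietly = TRUE)) {
  viaDot <- pracma::dot(ourVec1, ourVec2)
  print(isTRUE(all.equal(viaDot, viaSum)))
}

print(viaOperator == viaSum)
```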
Taking a Different Approach to Data
Up until this point, we’ve been using two simple vectors for our dot product calculation. But it’s important to keep in mind that these vectors are just a collection of numbers. As such, we can use other R structures which contain numbers in the same way as the current variables. Take a look at the following example to see how we could build on our first code sample to use a data frame.
df <- data.frame(ourVec1 = c(1, 2, 3, 4, 5),
                 ourVec2 = c(6, 7, 8, 9, 10))
ourResult <- df$ourVec1 %*% df$ourVec2
print(ourResult)
In this example, we begin by creating a data frame containing the same information previously assigned to ourVec1 and ourVec2. Each column is equivalent to the prior individual vector of the same name. We can calculate the dot product by referencing each column with the $ operator on either side of the matrix multiplication operator. The following code sample shows that the methodology also holds up when using pracma’s dot function.
library(pracma)
df <- data.frame(ourVec1 = c(1, 2, 3, 4, 5),
                 ourVec2 = c(6, 7, 8, 9, 10))
ourResult <- dot(df$ourVec1, df$ourVec2)
print(ourResult)
We once again create the data frame with the same information found in our initial vector assignments. And, once again, we essentially just need to replace the variable names with the appropriate data frame reference. We simply pass the appropriate data frame column to the dot function.
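If the column names are only known at run time, the same idea can be wrapped in a small function. The dotColumns helper below is hypothetical and not part of base R or pracma. It is a minimal sketch that uses base R only and assumes both columns are numeric.

```r
# Hypothetical helper: dot product of two named columns in a data frame
dotColumns <- function(data, colA, colB) {
  # [[ ]] extracts a column as a plain vector, just like $ does
  a <- data[[colA]]
  b <- data[[colB]]
  # Stop early if either column is missing or not numeric
  stopifnot(is.numeric(a), is.numeric(b))
  sum(a * b)
}

df <- data.frame(ourVec1 = c(1, 2, 3, 4, 5),
                 ourVec2 = c(6, 7, 8, 9, 10))

ourResult <- dotColumns(df, "ourVec1", "ourVec2")
print(ourResult)
```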
This concept holds for most of R’s standard ways of producing a numeric vector. Anything that evaluates to a vector of numbers can generally be supplied for a dot product calculation. For example, a sequence of numbers could be specified with the colon operator. We’d simply need to use the x:y format, such as 4:8, to provide a sequence of whole numbers for the dot function or matrix multiplication operator to use.
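As a brief sketch of the colon operator in this role, the code below uses it both to generate sequences and to pick out part of an existing vector. The variable names and the particular ranges are illustrative choices.

```r
# 4:8 expands to the whole numbers 4, 5, 6, 7, 8
firstSeq <- 4:8
secondSeq <- 1:5

seqResult <- drop(firstSeq %*% secondSeq)
print(seqResult)

# The colon operator can also select part of an existing vector
ourVec1 <- c(1, 2, 3, 4, 5)
ourVec2 <- c(6, 7, 8, 9, 10)
sliceResult <- drop(ourVec1[2:4] %*% ourVec2[2:4])
print(sliceResult)
```

Both sequences need to cover the same number of elements, just as with the vectors in the earlier examples.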