How to solve the R error - error in svd(x, nu = 0) : infinite or missing values in 'x'

The message "Error in svd(x, nu = 0) : infinite or missing values in 'x'" means that R could not run singular value decomposition (SVD) on a matrix (x) because that matrix contains infinite or missing values. Put more concisely, R is trying, and failing, to run a mathematical operation on values that are not finite numbers.

An Overview of Singular Value Decomposition

R makes it easy to reshape complex datasets from one form to another, and this makes it a natural fit for singular value decomposition (SVD). SVD is a method that factors a matrix into three components: a set of left singular vectors, a set of singular values, and a set of right singular vectors. Multiplying those components back together reproduces the original matrix, and keeping only the largest singular values gives a smaller approximation of it. The technique is incredibly valuable for anyone working in data analysis. For example, SVD is often used for dimensionality reduction and for collaborative filtering within recommendation systems. But it's also used for image compression, noise reduction, and really anything that involves reducing huge and complex sets of information into smaller forms that still preserve the original's overall structure.
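To make the idea concrete, here is a minimal sketch of a decomposition on a small matrix that contains only finite values. The matrix m and its numbers are made up for illustration; svd, diag, and outer are all base R functions.

```r
# A small, fully finite matrix (illustrative values only)
m <- matrix(c(2, 4, 6, 8, 10, 13), ncol = 2)

# svd() returns the singular values (d) and the singular vectors (u and v)
s <- svd(m)

# Multiplying the three pieces back together recovers the original matrix
rebuilt <- s$u %*% diag(s$d) %*% t(s$v)
print(all.equal(m, rebuilt))

# Keeping only the largest singular value gives a rank-1 approximation
approx1 <- s$d[1] * outer(s$u[, 1], s$v[, 1])
print(approx1)
```

The rank-1 approximation is the compression idea in miniature: it stores one singular value and two vectors instead of the full matrix.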

However, this description also highlights where the R error comes from. SVD is a purely numeric computation, so every entry in the matrix has to be a finite number. This error points out that at least one value is either missing (NA or NaN) or infinite (Inf or -Inf). A missing value signifies nothing at all, and an infinite value has no fixed magnitude, so neither one can take part in the arithmetic.
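A quick way to find out whether a matrix holds any of these values, and where they sit, is to test it with the base R function is.finite. The sketch below uses a made-up matrix named x; the variable names anyBad and badCells are just illustrative.

```r
# Illustrative matrix with one missing and one infinite value
x <- matrix(c(2, 4, NA, 8, Inf, 12), ncol = 2)

# is.finite() is FALSE for NA, NaN, Inf and -Inf
anyBad <- any(!is.finite(x))
print(anyBad)

# Row and column positions of every problem value
badCells <- which(!is.finite(x), arr.ind = TRUE)
print(badCells)
```

Knowing the exact positions usually makes it much easier to trace a bad value back to the step that produced it.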

Fixing SVD Compatibility

The previous discussion might paint a rather complex picture. But the actual cause of the error comes down to the same basic issue with missing or unusable values that you're probably more than used to debugging. Before going on to describe how to fix the error, take a look at how it can be triggered.

ourMatrix <- matrix(c(2, 4, NA, 8, Inf, 12), ncol=2)
svd(ourMatrix, nu = 0)

Note that svd is part of base R, so we're not using any special libraries here. We're simply creating a matrix called ourMatrix and sending it to svd as an argument. The problem is obvious in this example as we're explicitly creating values of Inf and NA. But even if it's not declared as explicitly in your own code, something similar has seeped into it if you're seeing this error. And, as you'd expect, running svd against ourMatrix does indeed trigger the error message. The main issue comes down to the fact that svd refuses any value that isn't a finite number. It is equally unable to work with a missing value and an infinite one, so both are reported through the same message.
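If you're wondering how such values end up in a matrix without being typed in directly, the sketch below lists a few ordinary calculations that produce them. The variable names values, m, and msg are illustrative, and the list is not exhaustive.

```r
# Each of these is a common way for a non-finite value to appear
values <- c(
  1 / 0,                                  # Inf: division by zero
  log(0),                                 # -Inf: logarithm of zero
  0 / 0,                                  # NaN: undefined result
  mean(numeric(0)),                       # NaN: mean of an empty vector
  suppressWarnings(as.numeric("n/a"))     # NA: text that can't be parsed
)
print(values)
print(is.finite(values))

# Any one of them in a matrix is enough to make svd() stop
m <- matrix(c(1, 2, 3, values[1]), ncol = 2)
msg <- tryCatch(svd(m, nu = 0), error = function(e) conditionMessage(e))
print(msg)
```

Here tryCatch is used only to capture the message as a string so that it can be inspected instead of halting the script.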

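# Build a matrix that contains both a missing (NA) and an infinite (Inf) value
ourMatrix <- matrix(c(2, 4, NA, 8, Inf, 12), ncol = 2)
cat("Original matrix:\n")
print(ourMatrix)
# try() reports the error without stopping the rest of the script
cat("\nSVD with our original matrix:\n")
try({
  svd(ourMatrix, nu = 0)
})
# Replace NA with 0, then replace Inf with the largest finite value
ourMatrix[is.na(ourMatrix)] <- 0
ourMatrix[is.infinite(ourMatrix)] <- max(ourMatrix[is.finite(ourMatrix)])
cat("\nCorrected matrix:\n")
print(ourMatrix)
# With only finite values left, svd() now succeeds
cat("\nSVD on corrected matrix:\n")
svdResult <- svd(ourMatrix, nu = 0)
print(svdResult)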

We begin in the same way as the first example with a matrix called ourMatrix. And, once again, we populate it with an NA and an Inf value. And, once again, we find that we can't process the matrix through the svd function. There are a few possible methods that could be used to correct the matrix so that it can be processed. But in this example, we'll just do a simple check for both kinds of value with is.na and is.infinite. We then replace them: NA becomes 0 and Inf becomes the largest remaining value in the matrix. Keep in mind that these replacements change the data, so pick values that make sense for your own analysis. Next, we print out the corrected ourMatrix so that we can see that the changes have fixed the initial problems. Finally, we run svd again with the corrected matrix and assign the result to svdResult before printing it to the screen.
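Two of the other possible methods are sketched below: dropping every row that contains a problem value, and filling each problem value with the mean of the finite values in its column. Both are only illustrations, and the names completeRows, dropped, and imputed are made up for this example. Which approach is appropriate depends on your data.

```r
ourMatrix <- matrix(c(2, 4, NA, 8, Inf, 12), ncol = 2)

# Option 1: keep only the rows in which every value is finite
completeRows <- apply(ourMatrix, 1, function(r) all(is.finite(r)))
dropped <- ourMatrix[completeRows, , drop = FALSE]
print(dropped)

# Option 2: replace each problem value with the mean of the finite
# values in the same column (assumes each column has at least one)
imputed <- ourMatrix
for (j in seq_len(ncol(imputed))) {
  bad <- !is.finite(imputed[, j])
  imputed[bad, j] <- mean(imputed[!bad, j])
}
print(imputed)

# Both results contain only finite values, so svd() accepts them
print(svd(dropped, nu = 0)$d)
print(svd(imputed, nu = 0)$d)
```

Dropping rows loses information but invents nothing, while filling with column means keeps the matrix shape at the cost of adding estimated values.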

In real-world conditions, you might want to wrap this concept into a full function if Inf or NA values come up a lot in your code. This is generally best performed shortly after data is first imported, and preferably as part of the process that actually creates or imports the data. It's generally a good idea to clean your data of any potential issues like this before it's exposed to the larger ecosystem of your R code. The general rule is to keep data formatted into a predictable state. Both the R interpreter and you should know how your data is formatted at all times.
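As a rough sketch of what such a function could look like, the helper below applies the same replacements used earlier. The name cleanForSvd, its naValue argument, and the replacement policy are all hypothetical choices made for illustration, not part of base R.

```r
# Hypothetical helper: the name and the replacement policy are illustrative
cleanForSvd <- function(x, naValue = 0) {
  x <- as.matrix(x)
  if (!is.numeric(x)) stop("x must be numeric")
  # is.na() is TRUE for both NA and NaN
  x[is.na(x)] <- naValue
  finiteVals <- x[is.finite(x)]
  if (length(finiteVals) == 0) stop("x has no finite values")
  # Clamp infinities to the largest and smallest finite values
  x[x == Inf] <- max(finiteVals)
  x[x == -Inf] <- min(finiteVals)
  x
}

# Usage: clean the data once, right after it is created or imported
raw <- matrix(c(2, 4, NA, 8, Inf, 12), ncol = 2)
cleaned <- cleanForSvd(raw)
print(cleaned)

result <- svd(cleaned, nu = 0)
print(result$d)
```

Keeping the policy in one function means every part of your code sees data that was cleaned in the same way, and the policy only has to be changed in one place.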