# How to Use A Log Transformation in R To Rescale Your Data

While a lot of statistics deals with linear relationships, we live in a very non-linear world. Power law distributions (80/20 relationships, the Pareto principle) appear in many areas of business, economics, and the social sciences. A handful of observations at the fringes of your distribution rise in a very non-linear fashion, making it difficult to fit a linear trend line through the data series. This can obscure patterns and distort your analysis. Fortunately, log transforms can help. By taking the logarithm of your data, you compress the range of values, which makes patterns and relationships easier to see. Log transforms can also make right-skewed data more nearly normally distributed, which many statistical methods assume. In this article, we’ll show you how to use R to perform log transforms, and explain why they are helpful and sometimes necessary for working with data.

You can apply a logarithmic transformation to the dependent variable, the independent variables, or both, to counter skewed data that would otherwise distort a linear regression. Transforming the original data can bring its distribution closer to normal, so the transformed data better satisfy the assumptions of linear models and of many statistical tests.

## Introducing the log() function in R

A log transformation in R is handled via the log() function. This function takes the format log(value, base) and returns the logarithm of the value in the specified base. If the base is omitted, it defaults to the natural logarithm (base e). There are also shortcut functions for base 2 and base 10: log2() and log10(). Log transformations can help make your data more normally distributed, reduce skewness, and create a numeric variable that better suits regression analysis and scatter plots. While a log transformation may not be the simplest data transformation, it often works well on right-skewed data compared to alternatives such as the logit, square root, arcsine, or reciprocal (inverse) transformations.

> log(9,3)
[1] 2

This is the basic logarithm function with 9 as the value and 3 as the base. The result is 2 because 9 is 3 squared.

> log(5)
[1] 1.609438

Here, the second parameter has been omitted, so the base defaults to e and the result is the natural logarithm of 5.

> log(100,10)
[1] 2
> log10(100)
[1] 2

Here, we are comparing a base 10 log of 100 with its shortcut. For both cases, the answer is 2.

> log(8,2)
[1] 3
> log2(8)
[1] 3

Here, we have a comparison of the base 2 logarithm of 8 obtained by the basic logarithm function and by its shortcut. For both cases, the answer is 3 because 8 is 2 cubed.
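The calls above can be cross-checked with the change-of-base identity. The sketch below also shows what log() returns for zero and negative input; the variable names are illustrative only.

```r
# Change of base: log(x, base) equals log(x) / log(base)
x <- 9
by_base  <- log(x, base = 3)
by_ratio <- log(x) / log(3)
isTRUE(all.equal(by_base, by_ratio))   # TRUE

# exp() undoes the natural logarithm
back <- exp(log(5))

# Edge cases worth knowing before transforming real data
log_zero <- log(0)                      # -Inf
log_neg  <- suppressWarnings(log(-1))   # NaN (R also issues a warning)
```

Because zero and negative values do not produce finite results, it is worth checking the range of your data before applying log().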

## How Does A Log Transformation Help Us Analyze Data?

There are several reasons why you might want to do a log transform of your data:

• Reduce skewness: A log transform pulls in a long right tail. This is useful for statistical analysis, since many statistical tests assume normality.
• Reduce variance: If your data has unequal variances across different groups or levels, a log transform can help stabilize the variances and make them more equal.
• Make patterns visible: Sometimes, it can be easier to see patterns in data on a log scale than on a linear scale.
• Simplify interpretation: In some cases, a log transform makes the data easier to interpret, because equal differences on a log scale correspond to equal ratios on the original scale.
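The first two points in the list above can be checked on simulated data. This is a minimal sketch, assuming a log-normal sample as a stand-in for right-skewed real data; the skewness() helper is defined here for illustration and is not part of base R.

```r
set.seed(42)                                # make the simulated sample reproducible
x <- rlnorm(1000, meanlog = 0, sdlog = 1)   # simulated right-skewed data

# Moment-based sample skewness; close to 0 for symmetric data
skewness <- function(z) {
  d <- z - mean(z)
  mean(d^3) / mean(d^2)^1.5
}

skew_raw <- skewness(x)        # strongly positive
skew_log <- skewness(log(x))   # much closer to 0

# Two groups that differ only by a factor of 10 in scale
g1 <- x
g2 <- 10 * x
sd_ratio_raw <- sd(g2) / sd(g1)             # 10, up to rounding
sd_ratio_log <- sd(log(g2)) / sd(log(g1))   # 1, up to rounding
```

On the log scale, multiplying by 10 becomes adding log(10), so the two groups end up with the same spread. That is the sense in which the transform stabilizes variance.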

## How To Apply a log transformation to an R Vector

To perform a log transformation on a vector, a common approach is to add 1 to the vector and then apply the log() function. The offset keeps zeros finite (log(0) is -Inf in R) and maps 0 to 0. The new vector will be less skewed than the original.

> v = c(100,10,5,2,1,0.5,0.1,0.05,0.01,0.001,0.0001)
> q=log(v+1)
> q
[1] 4.6151205168 2.3978952728 1.7917594692 1.0986122887 0.6931471806 0.4054651081
[7] 0.0953101798 0.0487901642 0.0099503309 0.0009995003 0.0000999950
> plot(v)
> plot(q)

A close look at the numbers above shows that v is more skewed than q. This is even more evident in the graphs produced by the two plot() calls in the code above.
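Base R also provides log1p(), which computes log(1 + x) directly. The sketch below compares it with the log(v + 1) approach used above and shows how to reverse the transformation; the names q1, q2, and v_back are illustrative.

```r
v <- c(100, 10, 5, 2, 1, 0.5, 0.1, 0.05, 0.01, 0.001, 0.0001)

q1 <- log(v + 1)   # the approach used above
q2 <- log1p(v)     # base R equivalent, more accurate for very small values

same <- isTRUE(all.equal(q1, q2))   # TRUE

# expm1() inverts log1p(), recovering the original values
v_back <- expm1(q2)

# Without the + 1 offset, a zero maps to -Inf
log(c(0, 1, 10))   # -Inf, 0, 2.302585
```

Using log1p() and expm1() as a pair makes it easy to move results back to the original scale for reporting.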

## How To Apply a log transformation to an R Data Frame

Applying a log transformation to an R data frame can be a bit trickier than applying one to a vector. You usually need to apply the log transformation to a specific column rather than to the entire data structure.

This can be addressed via R’s column operations, where you create a new column in the data frame that holds the log-transformed values. This is also good from a collaboration perspective, since this R code is relatively easy for a colleague or new analyst to understand (in the future, after you hand off a project).

> ChickWeight$logweight=log(ChickWeight$weight)
> head(ChickWeight)
  weight Time Chick Diet logweight
1     42    0     1    1  3.737670
2     51    2     1    1  3.931826
3     59    4     1    1  4.077537
4     64    6     1    1  4.158883
5     76    8     1    1  4.330733
6     93   10     1    1  4.532599
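The same column operation can be written with transform(), which avoids repeating the data frame name. This is a sketch on a copy of the built-in ChickWeight data; the copy name cw and the column log1p_time are illustrative choices, not part of the original example.

```r
# Work on a plain data frame copy so the built-in data set stays untouched
cw <- as.data.frame(ChickWeight)

# transform() adds the new column in a single call
cw <- transform(cw, logweight = log(weight))

# Time contains zeros, so log1p() is used here to avoid -Inf
cw$log1p_time <- log1p(cw$Time)

head(cw, 3)
```

Keeping the original column and adding the transformed one alongside it lets you compare both scales and switch between them in later analysis.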