How to Read a Fixed-Width Text File in R (With Examples)

Sometimes when processing data, you need to load a fixed-width text file. R has a built-in function for the job, but it takes some care to get the parameters right; otherwise, you will load a meaningless jumble of characters. Get the parameters right and you will get the data you are looking for.

Reading fixed-width text files

The function for loading a fixed-width text file is read.fwf, which is part of the utils package that ships with R. The arguments used in this article are read.fwf(file, widths, skip). "file" is the file (or connection) to load, "widths" is a vector giving the width in characters of each column, and "skip" is the number of lines to skip at the top of the file before reading data. At first glance this looks simple, but it is easy to get the skip and widths arguments wrong, and if you do, the data will not come out the way you want.
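As a minimal sketch of how the three arguments fit together, the snippet below writes a small made-up file to a temporary location and reads it back. The file contents, the column layout, and the variable names are all hypothetical and chosen only for illustration.

```r
# Hypothetical file: two header lines, then fixed-width records.
lines <- c(
  "Sample station data",
  "ID   NAME      TEMP",
  "001  Alpha     23.4",
  "002  Beta      19.8"
)
path <- tempfile(fileext = ".txt")
writeLines(lines, path)

# Assumed layout: ID takes 5 characters, NAME takes 10, TEMP takes 4.
# skip = 2 passes over the two header lines.
df <- read.fwf(path, widths = c(5, 10, 4), skip = 2)
print(df)
```

Naming the arguments, as above, avoids relying on their positional order.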

How the read.fwf function works

A fixed-width file does not carry the width information needed to split each line into columns, so read.fwf requires you to supply it. First, you use the skip parameter to tell the function where the data starts. Then you provide a vector with the width of each column; the length of that vector also determines how many columns you get. The read.fwf function gives the data frame default column names (V1, V2, and so on). To get meaningful names, you have to supply them yourself, either through the col.names argument or by renaming the columns after loading. All of this makes read.fwf tricky to use, because a mistake does not necessarily produce an error message; it may simply produce badly formatted data.
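The two ways of naming columns can be sketched as follows. The sample records, their layout, and the names "week", "sst", and "anomaly" are assumptions made for this illustration, not the layout of any particular data set.

```r
# Hypothetical records: a 9-character date, then two 6-character numeric fields.
lines <- c(
  "03JAN1990  23.4  -0.4",
  "10JAN1990  23.4  -0.8"
)
path <- tempfile(fileext = ".txt")
writeLines(lines, path)

# Option 1: name the columns while reading.
df1 <- read.fwf(path, widths = c(9, 6, 6),
                col.names = c("week", "sst", "anomaly"))

# Option 2: read with the default V1, V2, ... names, then rename.
df2 <- read.fwf(path, widths = c(9, 6, 6))
names(df2) <- c("week", "sst", "anomaly")

print(df1)
```

Both options end up with the same data frame; col.names simply saves a step.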

Examples of using read.fwf

Here are four examples of loading a fixed-width text file in R. The first loads the file from a URL. The second loads it from your computer. The remaining two show the effects of changing the widths argument of the read.fwf function.

> df = read.fwf(
+ file=url("http://www.cpc.ncep.noaa.gov/data/indices/wksst8110.for"),
+ skip=4,
+ widths=c(12, 7, 4, 9, 4, 9, 4, 9, 4))
> head(df)
V1 V2 V3 V4 V5 V6 V7 V8 V9
1 03JAN1990 23.4 -0.4 25.1 -0.3 26.6 0.0 28.6 0.3
2 10JAN1990 23.4 -0.8 25.2 -0.3 26.6 0.1 28.6 0.3
3 17JAN1990 24.2 -0.3 25.3 -0.3 26.5 -0.1 28.6 0.3
4 24JAN1990 24.4 -0.5 25.5 -0.4 26.5 -0.1 28.4 0.2
5 31JAN1990 25.1 -0.2 25.8 -0.2 26.7 0.1 28.4 0.2
6 07FEB1990 25.8 0.2 26.1 -0.1 26.8 0.1 28.4 0.3

This example shows how to load a fixed-width text file from a URL. Note the skip and widths arguments: skip=4 passes over the first four lines of the file, and the nine widths define nine columns. If either one is wrong, the data will not load properly.

> df = read.fwf(
+ file="wksst8110.txt",
+ skip=4,
+ widths=c(12, 7, 4, 9, 4, 9, 4, 9, 4))
> head(df)
V1 V2 V3 V4 V5 V6 V7 V8 V9
1 03JAN1990 23.4 -0.4 25.1 -0.3 26.6 0.0 28.6 0.3
2 10JAN1990 23.4 -0.8 25.2 -0.3 26.6 0.1 28.6 0.3
3 17JAN1990 24.2 -0.3 25.3 -0.3 26.5 -0.1 28.6 0.3
4 24JAN1990 24.4 -0.5 25.5 -0.4 26.5 -0.1 28.4 0.2
5 31JAN1990 25.1 -0.2 25.8 -0.2 26.7 0.1 28.4 0.2
6 07FEB1990 25.8 0.2 26.1 -0.1 26.8 0.1 28.4 0.3

This example shows how to load a fixed-width text file from your computer. For this example to work, the text file needs to be in R's current working directory (check it with getwd()), which is not necessarily the folder that holds your R script. To get the file, save the contents of the web address used in the previous example as a text file named wksst8110.txt and place it in that directory.
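If you would rather not depend on the working directory, one option is to build the full path to the file yourself. The sketch below uses a temporary folder and a tiny stand-in file; the folder, the file name example_fwf.txt, and its contents are hypothetical.

```r
# Stand-in for a downloaded file, saved in a temporary folder for illustration.
data_dir <- tempdir()
path <- file.path(data_dir, "example_fwf.txt")
writeLines(c("AB 1", "CD 2"), path)

# A bare file name is looked up in the working directory ...
cat("Working directory:", getwd(), "\n")

# ... so passing the full path avoids depending on it.
if (file.exists(path)) {
  df <- read.fwf(path, widths = c(3, 1))
  print(df)
} else {
  message("File not found: ", path)
}
```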

> df = read.fwf(
+ file="wksst8110.txt",
+ skip=4,
+ widths=c(1,2,3,4,5,6,7))
> head(df)
V1 V2 V3 V4 V5 V6 V7
1 NA 3 JAN 1990 NA 23.4-0 0.4
2 NA 10 JAN 1990 NA 23.4-0 0.8
3 NA 17 JAN 1990 NA 24.2-0 0.3
4 NA 24 JAN 1990 NA 24.4-0 0.5
5 NA 31 JAN 1990 NA 25.1-0 0.2
6 NA 7 FEB 1990 NA 25.8 0 0.2

In this example, we show what happens when you change only the widths argument. The skip argument is left at 4, because changing it as well would garble the results even further. The columns no longer line up with the data, but you at least still get recognizable dates.

> df = read.fwf(
+ file="wksst8110.txt",
+ skip=4,
+ widths=c(2,3,4,6,7))
> head(df)
V1 V2 V3 V4 V5
1 0 3JA N199 0 23.4-0.
2 1 0JA N199 0 23.4-0.
3 1 7JA N199 0 24.2-0.
4 2 4JA N199 0 24.4-0.
5 3 1JA N199 0 25.1-0.
6 0 7FE B199 0 25.8 0.

In this final example, we show how badly the input can be mangled when the widths argument is wrong. The skip argument is still correct, yet the dates are literally cut into pieces.
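Because wrong widths usually do not raise an error, one simple sanity check is to look at the column classes after loading: a column that should be numeric but comes back as text is a hint that a boundary is in the wrong place. The records and widths below are made up for illustration.

```r
# Hypothetical records: a 9-character date, then two 6-character numeric fields.
lines <- c(
  "03JAN1990  23.4  -0.4",
  "10JAN1990  23.4  -0.8"
)
path <- tempfile(fileext = ".txt")
writeLines(lines, path)

# Widths that match the layout.
good <- read.fwf(path, widths = c(9, 6, 6))

# Widths that are off by one character: no error, but the last
# column now mixes digits from two fields and is no longer numeric.
bad <- read.fwf(path, widths = c(9, 5, 7))

print(sapply(good, class))
print(sapply(bad, class))
```

Calling str() on the data frame gives a similar overview, along with the first few values of each column.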

Opening up a world of public data

Reading fixed-width text files matters because a lot of public data is distributed as plain text. Being able to load it greatly increases the range of data available to your program. All it requires is that the text be formatted so that the column widths are consistent from line to line. You may have to look over the file to work out the parameters, but you can still get at the data.
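One way to look over a file before choosing the parameters is to read its first few lines as plain text and cut a record by hand. This is only a sketch: the file contents, the candidate widths, and the variable names are hypothetical.

```r
# Hypothetical file with two header lines followed by fixed-width records.
lines <- c(
  "Weekly values",
  "DATE       VALUE",
  "03JAN1990   23.4",
  "10JAN1990   23.4"
)
path <- tempfile(fileext = ".txt")
writeLines(lines, path)

# Peek at the top of the file to see where the data starts.
top <- readLines(path, n = 4)
print(top)

# Line lengths hint at the total record width.
print(nchar(top))

# Candidate parameters; in this sketch the widths cover the whole record.
widths <- c(9, 7)
skip <- 2
record <- top[skip + 1]
stopifnot(sum(widths) == nchar(record))

# Cut one record by hand to preview the columns before calling read.fwf.
ends <- cumsum(widths)
starts <- ends - widths + 1
pieces <- substring(record, starts, ends)
print(pieces)

df <- read.fwf(path, widths = widths, skip = skip)
print(df)
```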

Reading a fixed-width text file can be tricky because you have to provide the format information yourself, but it still lets you get at the contents of the file. This is a helpful tool to have when the data you need does not come in a typical data file format.
