MATLAB™ is a widely used professional tool for numerical computing, used across diverse disciplines such as physics, chemistry, and mathematics. Many public data sets are published in MATLAB™ format. This article gives a brief example of one such data set and shows how to read it from R.
The data set used here is provided by NASA and contains multiple so-called Flight Data Records (FDR), which are recordings of airplane data collected before, during, and after each flight. NASA has published a sample data set of multiple flights here. You can use this download script or take this adapted script, which tracks download progress.
To read the data in R, you first have to unzip it. Each flight consists of multiple files spread across multiple directories. To read, for example, the files of the first part of flight 652, we can iterate over all the files listed in its directory:
    dir <- "./Tail_652_1/"
    files <- list.files(path = dir)
    # seq_along() avoids the 1:0 pitfall when the directory is empty
    for (i in seq_along(files)) {
      file <- paste(dir, files[i], sep = "")
      ...
    }
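The loop above builds each path by hand with paste(). As a minimal sketch of an alternative, assuming the flight files carry a .mat extension (as the file name used in the Reading section suggests), list.files() can filter by pattern and return full paths directly. The temporary directory and the file names below are placeholders so that the snippet runs on its own; they are not part of the NASA data set.

```r
# Placeholder directory standing in for "./Tail_652_1/" (hypothetical contents).
demo_dir <- file.path(tempdir(), "Tail_demo")
dir.create(demo_dir, showWarnings = FALSE)
file.create(file.path(demo_dir, c("a.mat", "b.mat", "notes.txt")))

# Keep only files ending in .mat and return their full paths.
mat_files <- list.files(path = demo_dir, pattern = "\\.mat$", full.names = TRUE)

for (f in mat_files) {
  # Each f is a complete path that could be handed to a reader function.
  cat("Would read:", basename(f), "\n")
}
```

Filtering by pattern keeps stray files, such as notes or archives, out of the loop.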
R Packages
CRAN holds at least two projects designed for working with MATLAB™ and its file format: R.matlab (PDF) and RMatlab (PDF), both of which provide functionality for R developers working with MATLAB™. R.matlab has a very simple interface for working with files, which is why it was picked for this example. RMatlab, on the other hand, seems to provide very mature interfaces for interacting with MATLAB™ directly.
Reading
Reading one of the files can simply be achieved like this:
    library(R.matlab)  # provides readMat()
    fdr <- readMat(paste(dir, "652200107281436.mat", sep=""))
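To try readMat() without downloading the flight data, a small round trip can help. The sketch below assumes the R.matlab package is installed and skips the demo otherwise. The file is a temporary placeholder and the values are made up; only the variable name LONG is borrowed from the data set.

```r
# Runs only if the R.matlab package is installed; otherwise the demo is skipped.
if (requireNamespace("R.matlab", quietly = TRUE)) {
  demo_values <- c(-0.0043, -0.0033, -0.0074)  # made-up sample values
  demo_file <- tempfile(fileext = ".mat")      # hypothetical file, not NASA data

  # Write a variable named LONG into a MAT file, then read it back.
  R.matlab::writeMat(demo_file, LONG = demo_values)
  demo <- R.matlab::readMat(demo_file)

  # readMat() returns a named list; compare the values regardless of shape.
  round_trip_ok <- isTRUE(all.equal(as.vector(demo$LONG), demo_values))
} else {
  round_trip_ok <- NA
}
```

The comparison goes through as.vector() because numeric data may come back as a matrix rather than a plain vector.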
Summary
Each file contains a collection of 86 parameters, collected over a certain amount of time at different intervals. Each parameter is a list of five values: data, Rate, Units, Description, and Alpha. A quick summary of the file reveals the contained parameters.
    > summary(fdr)
             Length Class  Mode
    VAR.1107 5      -none- list
    VAR.2670 5      -none- list
    VAR.5107 5      -none- list
    VAR.6670 5      -none- list
    FPAC     5      -none- list
    BLAC     5      -none- list
    CTAC     5      -none- list
    ....
Exploring LONG
For the LONG parameter we get the following list elements:
    > fdr$LONG
    , , 1

                [,1]
    data        Numeric,19840
    Rate        4
    Units       "G"
    Description "LONGITUDINAL ACCELERATION"
    Alpha       "LONG"
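The printed structure is a list with dimensions 5 x 1 x 1 whose first dimension carries the field names. The following sketch rebuilds a mock object of that shape in base R and shows one way to access fields by name instead of by position. The mock and its three data values are illustrative assumptions, not the actual object returned by readMat().

```r
# Mock of one parameter, shaped like the printed fdr$LONG (data values are made up).
fields <- c("data", "Rate", "Units", "Description", "Alpha")
param <- array(
  list(matrix(c(-0.0043, -0.0033, -0.0074), ncol = 1), 4, "G",
       "LONGITUDINAL ACCELERATION", "LONG"),
  dim = c(5, 1, 1),
  dimnames = list(fields, NULL, NULL)
)

# Dropping the two singleton dimensions leaves a plain named list.
p <- param[, 1, 1]

units_label <- p$Units        # unit string of the parameter
samples <- as.vector(p$data)  # flatten the data matrix to a numeric vector
```

Accessing fields by name keeps the code readable and does not rely on data always being the first element.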
Listing the data field values:
    > fdr$LONG[[1]][1:20]
     [1] -0.0043079853 -0.0032919645 -0.0073559284 -0.0114198923 -0.0032919645 -0.0124359131 -0.0068479776 -0.0109119415 -0.0119279623  0.0068680048  0.0048360825
    [12]  0.0104240179 -0.0048159361 -0.0022759438 -0.0109119415 -0.0002439022 -0.0032919645 -0.0053238869 -0.0093879700 -0.0093879700
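The Rate field can be combined with the data values to build a time axis. The sketch below assumes that Rate is a sampling rate in samples per second, which the data set output shown here does not confirm. Under that assumption, the 19,840 samples of LONG would span 4,960 seconds, or roughly 83 minutes. The values are the 20 samples listed above.

```r
# First 20 samples of LONG as listed above.
long <- c(-0.0043079853, -0.0032919645, -0.0073559284, -0.0114198923,
          -0.0032919645, -0.0124359131, -0.0068479776, -0.0109119415,
          -0.0119279623,  0.0068680048,  0.0048360825,  0.0104240179,
          -0.0048159361, -0.0022759438, -0.0109119415, -0.0002439022,
          -0.0032919645, -0.0053238869, -0.0093879700, -0.0093879700)

# Printed Rate of LONG; treated as samples per second (assumption).
rate <- 4

# Time offset of each sample in seconds, starting at zero.
time_s <- (seq_along(long) - 1) / rate

long_df <- data.frame(time_s = time_s, accel_g = long)
head(long_df)
```

A data frame of this shape can be passed straight to plot(long_df) for a first look at the signal.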