Fastest way to import CSV files in MATLAB

I've written a script that saves its output to a CSV file for later reference, but the second script, which imports the data, takes an inordinate amount of time to read it back in.

The data is in the following format:

Item1,val1,val2,val3
Item2,val4,val5,val6,val7
Item3,val8,val9

where the headers are on the left-most column, and the data values take up the remainder of the row. One major difficulty is that the arrays of data values can be different lengths for each test item. I'd save it as a structure, but I need to be able to edit it outside the MATLAB environment, since sometimes I have to delete rows of bad data on a computer that doesn't have MATLAB installed. So really, part one of my question is: Should I save the data in a different format?

Second part of the question: I've tried importdata, csvread, and dlmread, but I'm not sure which is best, or if there's a better solution. Right now I'm using my own script using a loop and fgetl, which is horribly slow for large files. Any suggestions?

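function [data,headers]=csvreader(filename) %V1_1
 % Reads rows of the form "Header,val1,val2,..." where rows may differ in length.
 fid=fopen(filename,'r');
 data={};
 headers={};
 count=1;
 while 1
      textline=fgetl(fid);
      if ~ischar(textline),   break,   end   % fgetl returns -1 at end of file
      nextchar=textline(1);
      idx=1;
      % Move characters one at a time into the header until the first comma
      while nextchar~=','
        headers{count}(idx)=textline(1);
        idx=idx+1;
        textline(1)=[];
        nextchar=textline(1);
      end
      textline(1)=[];                        % drop the comma itself
      data{count}=str2num(textline);         % the rest of the line holds the numeric values
      count=count+1;
 end
 fclose(fid);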

(I know this is probably terribly written code - I'm an engineer, not a programmer, please don't yell at me - any suggestions for improvement would be welcome, though.)

asked Jan 11 '10 17:01 by Doresoom



1 Answer

It would probably make the data easier to read if you could pad the file with NaN values when your first script creates it:

Item1,1,2,3,NaN
Item2,4,5,6,7
Item3,8,9,NaN,NaN

or you could even just print empty fields:

Item1,1,2,3,
Item2,4,5,6,7
Item3,8,9,,
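A minimal sketch of how the first script might write the NaN-padded form. The variables headers and data are hypothetical stand-ins for whatever that script actually produces, and the file name is only chosen to match the one read back with TEXTSCAN; the one extra step is finding the longest row before writing anything.

```matlab
% Hypothetical example data: one header and one value vector per item
headers = {'Item1','Item2','Item3'};
data    = {[1 2 3], [4 5 6 7], [8 9]};

% The longest row determines how many value columns every row must have
maxLen = max(cellfun(@numel, data));

fid = fopen('uneven_data.txt','wt');
for k = 1:numel(data)
    % Pad short rows with NaN so that every row has maxLen values
    padded = [data{k}, NaN(1, maxLen - numel(data{k}))];
    fprintf(fid, '%s', headers{k});
    fprintf(fid, ',%.15g', padded);   % the format is reused for each element
    fprintf(fid, '\n');
end
fclose(fid);
```

The '%.15g' format is an assumption about the precision needed; a shorter format would give a smaller file at the cost of rounding.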

Of course, in order to pad properly you would need to know the maximum number of values across all the items beforehand. With either format above, you could then use one of the standard file-reading functions, such as TEXTSCAN:

>> fid = fopen('uneven_data.txt','rt');
>> C = textscan(fid,'%s %f %f %f %f','Delimiter',',','CollectOutput',1);
>> fclose(fid);
>> C{1}

ans = 

    'Item1'
    'Item2'
    'Item3'

>> C{2}

ans =

     1     2     3   NaN  %# TEXTSCAN sets empty fields to NaN anyway
     4     5     6     7
     8     9   NaN   NaN
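If the variable-length arrays from the original csvreader output are still needed, the padding can be stripped again after reading. The following is only a sketch: headers and values are stand-ins for C{1} and C{2} above, and it assumes that trailing NaN entries are always padding rather than real measurements.

```matlab
% Stand-ins for C{1} and C{2} as returned by TEXTSCAN
headers = {'Item1'; 'Item2'; 'Item3'};
values  = [1 2 3 NaN; 4 5 6 7; 8 9 NaN NaN];

data = cell(size(values,1), 1);
for k = 1:size(values,1)
    row  = values(k,:);
    last = find(~isnan(row), 1, 'last');   % index of the last real value
    data{k} = row(1:last);                 % keep everything before the padding
end
```

Only trailing NaN values are removed, so a NaN in the middle of a row is kept as data.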

answered Oct 01 '22 15:10 by gnovice