
Performance issue with reading DICOM data into cell array

I need to read 4000 or more DICOM files. I wrote the following code to read the files and store the data in cell arrays so I can process them later. Each DICOM file contains 128 * 931 values. When I run the code, the loop takes more than 55 minutes to complete. Can someone point out the performance problem in the following code?

% read the file information from the disk to memory
readFile=dir('d:\images\*.dcm');

for i=1:4000

   % Read the information from the DICOM files into arrays

   data{i}=dicomread(readFile(i).name);
   info{i}=dicominfo(readFile(i).name);

   data_double{i}=double(data{1,i}); % convert 16 bit data into double
   first_chip{i}=data_double{1,i}(1:129,1:129); % extracting first chip data into an array

end
Manoj asked May 15 '18 03:05


1 Answer

You are reading 128*931*4000 pixels into memory (assuming 16-bit values, that's nearly 1 GB), converting them to doubles (nearly 4 GB), and extracting a region (129*129*4000*8 bytes = 0.5 GB). You are keeping all three of these copies, which is a huge amount of data! Try not keeping all that data around:

folder = 'd:\images';
readFile = dir(fullfile(folder,'*.dcm')); % dir takes a single path/pattern argument
first_chip = cell(size(readFile));
info = cell(size(readFile));
for ii = 1:numel(readFile)
   info{ii} = dicominfo(fullfile(folder,readFile(ii).name)); % read the header once
   data = dicomread(info{ii}); % re-use the header to read the pixel data
   data = data(1:129,1:129); % extracting first chip data
   first_chip{ii} = double(data); % convert 16 bit data into double
end

Here, I have pre-allocated the first_chip and info arrays. If you don't do this, the arrays will be re-allocated every time you add an element, causing expensive copies. I have also extracted the ROI first, then converted to double, as suggested by Rahul in his answer. Finally, I am re-using the DICOM info structure to read the file. I don't know if this makes a big difference in speed, but it saves the dicomread function some effort.
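
To illustrate why cropping before converting helps, here is a minimal sketch on synthetic data. The frame size (256-by-931) and the variable names are assumptions made up for the example, and no DICOM file is read. It only compares how many bytes each approach produces.

```matlab
% Synthetic stand-in for one frame returned by dicomread.
% The 256-by-931 size is hypothetical, chosen so the 129-by-129 crop fits.
frame = randi([0 65535], 256, 931, 'uint16');

% Crop first, then convert: only 129*129 values become doubles.
roi = double(frame(1:129, 1:129));

% Convert first: every value in the frame becomes a double.
full = double(frame);

roiInfo  = whos('roi');
fullInfo = whos('full');
fprintf('crop-then-convert: %d bytes, convert-all: %d bytes\n', ...
        roiInfo.bytes, fullInfo.bytes);

% Both routes give the same region of interest.
sameResult = isequal(roi, full(1:129, 1:129));
```

The two routes yield identical values for the region, so the only difference is the size of the temporary array created per file.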

But note that this process will still take a considerable amount of time. Reading DICOM files is complex, and takes time. I suggest you read them all in once, then save the first_chip and info cell arrays into a MAT-file, which will be a lot faster to read in at a later time.
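
A minimal sketch of that caching step, assuming first_chip and info have already been filled by the loop. The stand-in data and the cache file name below are hypothetical, chosen only so the example runs without any DICOM files.

```matlab
% Stand-in data (assumption): in practice these come from the reading loop.
first_chip = {rand(129), rand(129)};
info       = {struct('Filename', 'a.dcm'), struct('Filename', 'b.dcm')};

% Hypothetical cache location; use any path you can write to.
cacheFile = fullfile(tempdir, 'first_chip_cache.mat');

% One-time step: store both cell arrays in a MAT-file.
save(cacheFile, 'first_chip', 'info');

% Later sessions: load the cached variables instead of re-reading the files.
cached     = load(cacheFile);
first_chip = cached.first_chip;
info       = cached.info;
```

Loading into a struct (cached) rather than straight into the workspace makes it explicit which variables came from the file. If a single variable ever grows beyond 2 GB, save needs the '-v7.3' flag.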

Cris Luengo answered Oct 12 '22 23:10