
matlab parse file into cell array

I have a file in the following format in matlab:

user_id_a: (item_1,rating),(item_2,rating),...(item_n,rating)
user_id_b: (item_25,rating),(item_50,rating),...(item_x,rating)
....
....

Each line has two parts separated by a colon: the value to the left of the colon is a number representing the user_id, and the values to the right are tuples of an item_id (also a number) and a rating (an integer, not a float).

I would like to read this data into a MATLAB cell array or, better yet, ultimately convert it into a sparse matrix in which the user_id is the row index, the item_id is the column index, and the corresponding rating is stored at that position. (This would work because I know a priori the number of users and items in my universe, so the ids cannot be greater than that.)

Any help would be appreciated.

I have thus far tried the textscan function as follows:

c = textscan(f,'%d %s','delimiter',':')   %this creates two cells one with all the user_ids
                                          %and another with all the remaining string values.

Now if I try something like str2mat(c{2}), it works, but it also stores the '(' and ')' characters in the matrix. I would like to store a sparse matrix in the fashion that I described above.

I am fairly new to MATLAB and would appreciate any help with this.

anonuser0428 asked Feb 20 '26 12:02


1 Answer

f = fopen('data.txt','rt'); %// data file. Open as text ('t')
str = textscan(f,'%s'); %// gives a cell which contains a cell array of strings
fclose(f); %// close the file once its contents have been read
str = str{1}; %// cell array of strings; '%s' splits on whitespace, so this
%// assumes each line is 'id: tuples' with no spaces inside the tuple list
r = str(1:2:end); %// odd tokens: 'user_id:'
r = cellfun(@(s) str2num(s(1:end-1)), r); %// rows; drop the ':'; numeric vector
pairs = str(2:2:end); %// even tokens: '(item,rating),(item,rating),...'
pairs = regexprep(pairs,'[(,)]',' '); %// replace '(', ',' and ')' by spaces
pairs = cellfun(@str2num, pairs, 'uni', 0);
%// pairs; cell array of numeric vectors [item rating item rating ...]
cols = cellfun(@(x) x(1:2:end), pairs, 'uni', 0);
%// columns; cell array of numeric vectors
vals = cellfun(@(x) x(2:2:end), pairs, 'uni', 0);
%// values; cell array of numeric vectors
rows = arrayfun(@(n) repmat(r(n),1,numel(cols{n})), 1:numel(r), 'uni', 0);
%// rows repeated to match cols; cell array of numeric vectors
matrix = sparse([rows{:}], [cols{:}], [vals{:}]);
%// concat rows, cols and vals into vectors and use as inputs to sparse
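
To see what the regexprep and str2num steps do to a single right-hand-side string, here is a minimal sketch. The string s is an illustrative value typed in directly (it is not read from a file), and the variable names t and x are only for this example.

```matlab
% Illustrative input: the tuple part of one line (assumed, not read from a file)
s = '(1,3),(2,4),(3,5)';

% Replace every '(', ',' and ')' with a space so only numbers remain
t = regexprep(s, '[(,)]', ' ');

% Convert the space-separated numbers into a numeric row vector
x = str2num(t);

% Odd positions hold item ids, even positions hold ratings
cols = x(1:2:end);
vals = x(2:2:end);

disp(cols)
disp(vals)
```

The interleaved layout of x is why the code above takes every other element for the columns and the values.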

For the example file

1: (1,3),(2,4),(3,5)
10: (1,1),(2,2)

this gives the following sparse matrix:

matrix =
   (1,1)        3
  (10,1)        1
   (1,2)        4
  (10,2)        2
   (1,3)        5
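
Since the question says the number of users and items is known in advance, the matrix size can probably be fixed explicitly by passing two extra arguments to sparse, rather than letting it be inferred from the largest ids that appear in the file. The sketch below uses the triplets from the example file; numUsers and numItems are hypothetical values chosen only for illustration.

```matlab
% Hypothetical universe sizes (assumed; the question only says they are known)
numUsers = 10;
numItems = 5;

% Row, column and value triplets corresponding to the two-line example file
rows = [1 1 1 10 10];
cols = [1 2 3 1 2];
vals = [3 4 5 1 2];

% The 4th and 5th arguments set the matrix dimensions explicitly
matrix = sparse(rows, cols, vals, numUsers, numItems);

disp(size(matrix))
disp(nnz(matrix))
```

Note that sparse adds together values that share the same (row, column) subscripts, so a user who rates the same item twice in the file would end up with the sum of both ratings.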

Luis Mendo answered Feb 22 '26 03:02