Question

I get some data from an instrument that is formatted in a specific way. I need to load the data into MATLAB, manipulate some values, then save it back with the same format to load back into the instrument software for further analysis...

The issue I am having is that the file mixes value types (header text, labelled fields, and numeric columns) and they do not follow one uniform layout.

The file is tab-delimited. I have added arrows (-->) to show the location of the tabs, as Notepad++ does.

Scan-42/01
Temperature [K] :-->   295.00
Time [s]        :-->     60

"Linspace"
0.01-->   0.96
0.02-->   0.95
0.03-->   0.95

"Logspace"
0.01-->   0.96
0.02-->   0.95
0.04-->   0.94

The data continues for many more rows, but I have cut it off after 3 rows.

The data I need to manipulate will be the Temperature, and some of the values under Linspace and Logspace.

I am currently importing the data like this:

filename = 'test.asc';
delimiter = '\t';
formatSpec = '%s%s%[^\n\r]';
fileID = fopen(filename,'r');
dataArray = textscan(fileID, formatSpec, 'Delimiter', delimiter, 'ReturnOnError', false);

Data in MATLAB looks like this:

[Image: .asc file loaded into MATLAB]

Even some kind of template in MATLAB would work fine, as long as I could get the necessary values and then save them in exactly this format. The file must be saved as .asc, or the instrument will reject it.

Help is greatly appreciated.

Thanks


Solution

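I hope this works for you. Note that this first version assumes the separator is the literal text --> followed by three spaces, exactly as shown in the question. Edit 1 further down handles real tab characters.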

Code

% Note: file1 is the input .asc filename and file2 is the output .asc filename.
% Set both before running this script.

% **** Read in file data ****
% importdata reads by filename, so no separate fopen is needed here.
A = importdata(file1,'\n');

% Delimiters (mind these assumptions: a literal '-->' followed by three spaces)
linlog_delim1 = '-->   ';
temperature_delim1 = 'Temperature [K] :-->   ';

% Empty rows mark the end of each data block. A sentinel entry past the
% end of A terminates the last block.
sep1 = cellfun(@isempty,A);
sep1 = [sep1; true];
sep_ind = find(sep1);
full_data = regexp(A,linlog_delim1,'split');

% Temperature value
temp_ind = find(~cellfun(@isempty,strfind(A,'Temperature [K] :-->')));
temp_val = str2double(full_data{temp_ind}{2});

% Linspace values: rows between the "Linspace" label and the next empty row
sep_linspace = strcmp(A,'"Linspace"');
lin_start_ind = find(sep_linspace)+1;
lin_stop_ind = sep_ind(find(sep_ind>lin_start_ind,1,'first'))-1;

linspace_data = vertcat(full_data{lin_start_ind:lin_stop_ind});
linspace_valid_ind = str2double(linspace_data(:,1));
linspace_valid_val = str2double(linspace_data(:,2));

% Logspace values: rows between the "Logspace" label and the next empty row
sep_logspace = strcmp(A,'"Logspace"');
log_start_ind = find(sep_logspace)+1;
log_stop_ind = sep_ind(find(sep_ind>log_start_ind,1,'first'))-1;

logspace_data = vertcat(full_data{log_start_ind:log_stop_ind});
logspace_valid_ind = str2double(logspace_data(:,1));
logspace_valid_val = str2double(logspace_data(:,2));

% **** Let us modify some data ****
% The offsets below assume exactly three rows per block, as in the demo file.
temp_val = temp_val + 10;
linspace_valid_val_mod1 = linspace_valid_val + [1; 2; 3];
logspace_valid_val_mod1 = logspace_valid_val + [1; 20; 300];

% **** Write back file data ****

% Write back temperature data
A(temp_ind) = {[temperature_delim1,num2str(temp_val)]};

% Write back linspace data
mod_lin_val = strtrim(cellstr(num2str(linspace_valid_val_mod1)));
mod_lin_ind = cellstr(num2str(linspace_valid_ind));
sep_lin = repmat({linlog_delim1},numel(mod_lin_val),1);
A(lin_start_ind:lin_stop_ind) = cellfun(@horzcat,mod_lin_ind,sep_lin,mod_lin_val,'UniformOutput',false);

% Write back logspace data
mod_log_val = strtrim(cellstr(num2str(logspace_valid_val_mod1)));
mod_log_ind = cellstr(num2str(logspace_valid_ind));
sep_log = repmat({linlog_delim1},numel(mod_log_val),1);
A(log_start_ind:log_stop_ind) = cellfun(@horzcat,mod_log_ind,sep_log,mod_log_val,'UniformOutput',false);

% Remove leading whitespace left over from num2str padding
A = strtrim(A);

% Write the modified data, one row per line
fid2 = fopen(file2,'w');
for row = 1:numel(A)
    fprintf(fid2,'%s\n',A{row});
end
fclose(fid2);

Changes for the demo:

  • Temperature has 10 added.
  • "Linspace" has 1, 2 and 3 added to its elements respectively.
  • "Logspace" has 1, 20 and 300 added to its elements respectively.

Results

Before -

Scan-42/01
Temperature [K] :-->   295.00
Time [s]        :-->     60

"Linspace"
0.01-->   0.96
0.02-->   0.95
0.103-->   0.95

"Logspace"
0.01-->   0.96
0.02-->   0.95
0.04-->   0.94

After -

Scan-42/01
Temperature [K] :-->   305
Time [s]        :-->     60

"Linspace"
0.01-->   1.96
0.02-->   2.95
0.103-->   3.95

"Logspace"
0.01-->   1.96
0.02-->   20.95
0.04-->   300.94
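
One thing to watch in the output above: the temperature comes back as 305 rather than 305.00, because num2str drops trailing zeros. If the instrument software turns out to be strict about decimals, formatting with sprintf instead may help. This is only a sketch, and the two-decimal precision is an assumption taken from the sample file, not something the instrument documentation confirms.

```matlab
% Assumption: values should be written with two decimal places, as in the sample input.
temp_val = 295.00 + 10;
temp_str = sprintf('%.2f', temp_val);   % fixed two-decimal text for the temperature

% Same idea for a column of values: one formatted string per row.
vals = [1.96; 2.95; 3.95];
val_strs = arrayfun(@(v) sprintf('%.2f', v), vals, 'UniformOutput', false);

% Assemble a temperature line with the same literal delimiter used in the demo.
temperature_line = ['Temperature [K] :-->   ', temp_str];
disp(temperature_line)
```

The strings in val_strs can be used in place of the num2str results when the rows are put back together.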

Edit 1:

Code

% I/O filenames
input_filename = 'gistfile1.txt';
output_file = 'gistfile1_out.txt';

% Get data from the input file as two tab-separated string columns
delimiter = '\t';
formatSpec = '%s%s%[^\n\r]';
fid = fopen(input_filename,'r');
dataArray = textscan(fid, formatSpec, 'Delimiter', delimiter, 'ReturnOnError', false);
fclose(fid);

% Get data into A (column 1: labels or x values, column 2: values)
A = [dataArray{1,1}, dataArray{1,2}];

% Find separator indices: rows with an empty second column, plus a
% sentinel entry past the end of A
ind1 = find([cellfun(@isempty,A(:,2)); true]);
temperature_ind = find(~cellfun(@isempty,strfind(A(:,1),'Temperature')));
temperature_val = str2double(A{temperature_ind,2});

% Linspace values
sep_linspace = strcmp(A(:,1),'"Linspace"');
lin_start_ind = find(sep_linspace)+1;
lin_stop_ind = ind1(find(ind1>lin_start_ind,1,'first'))-1;

linspace_valid_ind = str2double(A(lin_start_ind:lin_stop_ind,1));
linspace_valid_val = str2double(A(lin_start_ind:lin_stop_ind,2));

% Logspace values
sep_logspace = strcmp(A(:,1),'"Logspace"');
log_start_ind = find(sep_logspace)+1;
log_stop_ind = ind1(find(ind1>log_start_ind,1,'first'))-1;

logspace_valid_ind = str2double(A(log_start_ind:log_stop_ind,1));
logspace_valid_val = str2double(A(log_start_ind:log_stop_ind,2));

% **** Let us modify some data ****
temp_val_mod1 = temperature_val + 10;
linspace_valid_val_mod1 = linspace_valid_val + (1:numel(linspace_valid_val))';
logspace_valid_val_mod1 = logspace_valid_val + 10.*(1:numel(logspace_valid_val))';

% **** Write back file data into A ****
A(temperature_ind,2) = cellstr(num2str(temp_val_mod1));
A(lin_start_ind:lin_stop_ind,2) = cellstr(num2str(linspace_valid_val_mod1));
A(log_start_ind:log_stop_ind,2) = cellstr(num2str(logspace_valid_val_mod1));

% Write the modified data, one tab-separated row per line
fid2 = fopen(output_file,'w');
for row = 1:size(A,1)
    fprintf(fid2,'%s\t%s\n',A{row,1},A{row,2});
end
fclose(fid2);

Results

Before -

Scan-42/01
Temperature [K] :   295.00
Time [s]        :   60

"Linspace"
0.01    0.96
0.02    0.95
0.03    0.95

"Logspace"
0.01    0.96
0.02    0.95
0.04    0.94

After -

Scan-42/01  
Temperature [K] :   305
Time [s]        :   60
"Linspace"  
0.01    1.96
0.02    2.95
0.03    3.95
"Logspace"  
0.01    10.96
0.02    20.95
0.04    30.94

Please note that the only formatting difference between the input and output files is that the output has no blank row between "Linspace" and the row before it, whereas the input did. The same applies to "Logspace".
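
If the missing blank rows turn out to matter to the instrument software, one possible workaround is to read the file line by line with fgetl, which returns an empty string for a blank line, and to write every line back in a loop. The following is a rough sketch under stated assumptions, not a drop-in replacement: the temporary file names, the sample content and the two-decimal format are all made up for illustration.

```matlab
% Build a small sample file so the sketch is self-contained (hypothetical names).
in_file  = [tempname '.asc'];
out_file = [tempname '.asc'];
sample = {'Scan-42/01', sprintf('Temperature [K] :\t   295.00'), '', ...
          '"Linspace"', sprintf('0.01\t   0.96')};
fid = fopen(in_file,'w');
for k = 1:numel(sample)
    fprintf(fid,'%s\n',sample{k});   % loop so that empty lines are written too
end
fclose(fid);

% Read every line, including empty ones; fgetl returns -1 at end of file.
fid = fopen(in_file,'r');
lines = {};
tline = fgetl(fid);
while ischar(tline)
    lines{end+1,1} = tline;
    tline = fgetl(fid);
end
fclose(fid);

% Modify the temperature line in place, keeping the text before the tab.
t_row = find(strncmp(lines,'Temperature',numel('Temperature')),1);
parts = regexp(lines{t_row},'\t','split');
parts{2} = sprintf('   %.2f', str2double(parts{2}) + 10);   % assumed format
lines{t_row} = [parts{1} sprintf('\t') parts{2}];

% Write all lines back; blank lines stay blank.
fid = fopen(out_file,'w');
for k = 1:numel(lines)
    fprintf(fid,'%s\n',lines{k});
end
fclose(fid);
```

The write loop is deliberate: passing all cells to a single fprintf call would skip the empty strings, which would lose the blank rows again.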

Other suggestions

I've solved a nearly identical problem once before. The solution goes something like this:

First, you're already splitting your data up into chunks, so that's good. Judging by your comment, it seems that the data is consistently formatted from file to file, but inconsistently formatted in each individual file. That's fine.

What you need to do is iterate through dataArray, find each unique label (such as "Linspace"), and track that label's index. What you'll end up with is a vector of indices that tells you exactly where in dataArray these labels appear. Once you have all of the label indices, look at dataArray and see how the data between each pair of labels is formatted. Then write some code to break dataArray into sub-arrays. You'll need a different sub-array parser for each format.

I know that's a little abstract, so let me try to give you an example.

 timeIndex     = find(strcmp(dataArray, 'Time'), 1);
 linspaceIndex = find(strcmp(dataArray, '"Linspace"'), 1);
 logspaceIndex = find(strcmp(dataArray, '"Logspace"'), 1);

 % This is the "sub-array" I was referring to: a small piece of dataArray
 % that contains only the linspace data values.
 linSpaceData = dataArray(linspaceIndex+3:logspaceIndex-1);

This is just an example and will probably not plug-and-play; it's only meant to be a thought-provoker. Note the +3 and -1: those were just guessed. You'll have to determine them empirically for each range, as things like tabs, colons, and spaces can get in the way. That should be enough to get you started on your problem. Let me know if you need clarification, or if this isn't helpful. Good luck!
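
To make the label-index idea a little more concrete, here is a small sketch that derives the block boundaries from the empty rows instead of guessing offsets. It assumes the first column of the file has already been read into a cell array of strings (called col1 here, a hypothetical name) and that every block ends at an empty row or at the end of the file.

```matlab
% Hypothetical first column of the parsed file, one cell per line.
col1 = {'Scan-42/01'; 'Temperature [K] :'; 'Time [s]        :'; ''; ...
        '"Linspace"'; '0.01'; '0.02'; '0.03'; ''; ...
        '"Logspace"'; '0.01'; '0.02'; '0.04'};

% Locate each label once.
linspaceIndex = find(strcmp(col1,'"Linspace"'),1);
logspaceIndex = find(strcmp(col1,'"Logspace"'),1);

% Empty rows end a block; a sentinel past the last row ends the final block.
emptyRows = [find(cellfun(@isempty,col1)); numel(col1)+1];

% Rows after a label, up to the next empty row, belong to that label.
linRows = (linspaceIndex+1):(emptyRows(find(emptyRows>linspaceIndex,1))-1);
logRows = (logspaceIndex+1):(emptyRows(find(emptyRows>logspaceIndex,1))-1);

% Convert the text of one block to numbers.
linX = str2double(col1(linRows));
```

With the row ranges in hand, each block can be converted, modified and written back without touching the header lines.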

-Fletch

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow