There are certainly see a number of ways to speed this up significantly.
First of all, read in all numeric data in as numbers. Matlab is not optimized to work with strings, and even cells should generally be avoided as much as possible. If you want to keep everything as strings, use another language (python or perl)
Once you have the state, county and site read in as numbers, then create a number instead of a string for the siteID. One approach would be to use the formula:
siteID = siteNum + 1e4*countyCode + 1e7*stateCode
That would generate unique siteIDs for all sites.
Use datenum to convert the date field into a number.
You are now in a position where the data_O3
defined on line 79 can be a purely numeric array (no cells!), as can your E
matrix. That alone will make the process many times faster.
You also might want to define the E
as something other than NaN. Maybe give it values of -1.
There may be more optimizations you can do in the comparison, but do the above first and I expect you will see a huge improvement.