Question

How do I use the function is_double_url to check the 700 URLs saved in the file "training_URL"? At the moment I can only check a single URL, either in the command window or by typing is_double_url('www.google.com') in f10.m. I want to import the data as shown below, call the function, and check all 700 entries.

is_double_url.m file

function out = is_double_url(url_path1)

if url_path1(end)~='/'
    url_path1(end+1)='/';
end

url_path1 = regexprep(url_path1,'//','//www.');
url_path1 = regexprep(url_path1,'//www.www.','//www.');

f1 = strfind(url_path1,'www.');
if numel(f1)<2
    out = false;
else
    f2 = strfind(url_path1,'/');
    f3 = bsxfun(@minus,f2,f1');

    count_dots = zeros(size(f3,1),1);
    for k = 1:size(f3,1)
        [x,y] = find(f3(k,:)>0,1);
        str2 = url_path1(f1(k):f2(y));
        if ~isempty(strfind(str2,'..'))
            continue
        end
        count_dots(k) = nnz(strfind(str2,'.'));
    end
    out = ~any(count_dots(2:end)<2);

    if any(strfind(url_path1,'://')>f2(1))
        out = true;
    end
end

return;

f10.m file

url_path1 = importdata('DATA\URL\training_URL')
out = is_double_url(url_path1)


Solution

Note the change in the numel(.) condition:

function out = is_double_url(url_path1)

if url_path1(end)~='/'
    url_path1(end+1)='/';
end

url_path1 = regexprep(url_path1,'//','//www.');

url_path1 = regexprep(url_path1,'//www.www.','//www.');

f1 = strfind(url_path1,'www.');
if numel(f1)>2                  % changed it here
    out = false;
else
    f2 = strfind(url_path1,'/');
    f3 = bsxfun(@minus,f2,f1');

    count_dots = zeros(size(f3,1),1);
    for k = 1:size(f3,1)
        [x,y] = find(f3(k,:)>0,1);
        str2 = url_path1(f1(k):f2(y));
        if ~isempty(strfind(str2,'..'))
            continue
        end
        count_dots(k) = nnz(strfind(str2,'.'));
    end
    out = ~any(count_dots(2:end)<2);

    if any(strfind(url_path1,'://')>f2(1))
        out = true;
    end
end

if ~out  % I'm not sure if this is what you want..
    out=-1;
end

return;

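Regarding the issue you mentioned in the comments: is_double_url expects a single URL string, so it cannot be called on the whole imported list at once. Let's say url_path1 is a cell array of size 700x1; then (I think) you could loop over it and call the function on one entry at a time: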

out=[];
for i=1:size(url_path1,1)
  out=cat(2,out,is_double_url(url_path1{i,1}));
end
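
Putting the pieces together, f10.m could look roughly like the sketch below. It assumes that importdata returns one URL per line as a cell array of character vectors (worth checking with class(url_list) on your file), and that is_double_url.m from above is on the path. The variable names url_list and url, the strtrim call, and the NaN marker for blank lines are my own additions for illustration, not part of the original code.

```matlab
% f10.m (sketch): run is_double_url on every imported URL
url_list = importdata('DATA\URL\training_URL');  % path taken from the question

% Assumption: a text-only file comes back as a cell array of char vectors.
% If a plain char array is returned instead, wrap its rows in a cell array.
if ischar(url_list)
    url_list = cellstr(url_list);
end

n = numel(url_list);
out = zeros(n, 1);                 % preallocate instead of growing with cat

for i = 1:n
    url = strtrim(url_list{i});    % drop stray whitespace around the URL
    if isempty(url)
        out(i) = NaN;              % hypothetical marker: blank line, nothing to check
        continue
    end
    out(i) = is_double_url(url);   % 1 or -1 with the function above
end
```

A plain loop is used here rather than cellfun because the function above returns either a logical true or a double -1, and cellfun with uniform output expects every result to have the same class.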
Licensed under: CC-BY-SA with attribution