Python: Create a list of nonmatching values

https://stackoverflow.com/questions/16236800

13-04-2022

Question

I've been working on a program that searches a folder for file names matching a list of values from an input list and then copies the matches to another folder. The program works, but now I want to add one extra layer: get a list of the non-matching samples and write it to a CSV file. The code is not efficient, but it gets the job done, though I am aware that it may not be properly set up to do what I ask.

import os, fnmatch, csv, shutil, operator

#Function created to search through a folder location using a specific keyword pattern
def locate(pattern, root=os.curdir):
    matches = []

    for path, dirs, files in os.walk(os.path.abspath(root)):
        for filename in fnmatch.filter(files, pattern):
            matches.append(os.path.join(path, filename))

    return matches

#output file created to store the pathfiles
outfile="G:\output.csv"
output=csv.writer(open(outfile,'w'), delimiter=',',quoting=csv.QUOTE_NONE)

#Opens the file and stores the values in each row
path="G:\GIS\Parsons Stuff\samples.csv"
pathfile=open(path,'rb')
openfile=csv.reader((pathfile), delimiter = ',')
samplelist=[]
samplelist.extend(openfile)

#for loop used to return the list of tuples
for checklist in zip(*samplelist):
    print checklist

#an empty list used to store the filepaths of sample locations of interest 
files=[]

#for loop to search for sample id's in a folder and copies the filepath
for x in checklist:
    LocatedFiles=locate(x, "G:\\GIS\\Parsons Stuff\\boring logs\\boring logs\\")
    print LocatedFiles
    files.append(LocatedFiles)

# flattens the list called files into a manageable list
flattenedpath=reduce(operator.add, files)

#filters out files that match the filter .pdf
filteredpath=[]
filteredpath.append(fnmatch.filter(flattenedpath,"*.pdf*"))

#outputs the file paths to a .csv file called output
output.writerows(files)

pathfile.close()

#location of where files are going to be copied
dst='C:\\TestFolder\\'

#filters out files that match the filter .pdf
filtered=[]
filtered.append(fnmatch.filter(flattenedpath,"*.pdf*"))
filteredpath=reduce(operator.add,filtered)

#the function set() goes through the list of interest to store a collection of unique values.
delete_dup=set(filteredpath)
delete_dup=reduce(operator.add,zip(delete_dup))

#for loop to copy files in the list delete_dup
for x in delete_dup:
    shutil.copy(x,dst)

My idea is that since the lists "samplelist" and "files" are the same length:

len(samplelist)
36
len(files)
36

I should be able to pull out the index of each empty list in "files" and store those indices in a list, which can then be used to pull the corresponding elements out of "samplelist".

I've tried using the following links for ideas to do this but have had no luck:

In Python, how can I find the index of the first item in a list that is NOT some value?

Finding matching and nonmatching items in lists

Finding the index of an item given a list containing it in Python

Pythonic way to compare two lists and print out the differences

The following is the output from the list called "samplelist":

('*S42TPZ2*', '*S3138*', '*S2415*', '*S2378*', '*S2310*', '*S2299*', '*S1778*', '*S1777*', '*S1776*', '*S1408*', '*S1340*', '*S1327*', '*RW-61*', '*MW-247*', '*MW-229*', '*MW-228*', '*MW-209*', '*MW-208*', '*MW-193*', '*M51TPZ6*', '*M51TP21*', '*H1013*', '*H1001*', '*H0858*', '*H0843*', '*H0834*', '*H0514*', '*H0451*', '*H0450*', '*EY1TP9*', '*EY1TP7*', '*EY1TP6*', '*EY1TP5*', '*EY1TP4*', '*EY1TP2*', '*EY1TP1*')

The following is the output from the list called "files" (I am not going to list all of the output since it is unnecessary; I just wanted to give an idea of what the list looks like):

[[], [], ['G:\\GIS\\Parsons Stuff\\boring logs\\boring logs\\S2415.pdf'], ['G:\\GIS\\Parsons Stuff\\boring logs\\boring logs\\S2378.pdf'], ['G:\\GIS\\Parsons Stuff\\boring logs\\boring logs\\MW-247.S2310.pdf', 'G:\\GIS\\Parsons Stuff\\boring logs\\boring logs\\S2310.MW-247.pdf', 'G:\\GIS\\Parsons Stuff\\boring logs\\boring logs\\S2310.pdf'], ['G:\\GIS\\Parsons Stuff\\boring logs\\boring logs\\S2299.pdf'], ['G:\\GIS\\Parsons Stuff\\boring logs\\boring logs\\S1778.pdf'], ['G:\\GIS\\Parsons Stuff\\boring logs\\boring logs\\S1777.pdf'], ['G:\\GIS\\Parsons Stuff\\boring logs\\boring logs\\S1776.pdf'], ['G:\\GIS\\Parsons Stuff\\boring logs\\boring logs\\S1408.pdf']

Solution

I'm not quite sure this is what you are asking for, but couldn't you do this:

# "files" is the list of per-sample result lists from the question
index_list = []
for n, item in enumerate(files):
    if len(item) == 0:
        index_list.append(n)

That little piece of code iterates over your list and, whenever it finds an empty list, adds the index of that empty list to another list!
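To go from those indices to the non-matching sample IDs and a CSV, something along these lines might work. It is only a sketch under a few assumptions: it targets Python 3 (the question's code is Python 2), the short `checklist` and `files` values are made-up stand-ins for the real data, and an in-memory buffer is used in place of the actual output file.

```python
import csv
import io

# Hypothetical stand-ins for the question's data: `checklist` holds the
# search patterns and `files` holds one list of located paths per pattern.
checklist = ('*S42TPZ2*', '*S3138*', '*S2415*', '*S2378*')
files = [[], [], ['S2415.pdf'], ['S2378.pdf']]

# Indices of the empty result lists, i.e. the patterns that matched nothing.
index_list = [n for n, item in enumerate(files) if len(item) == 0]

# Look up the pattern at each index and strip the fnmatch wildcards.
nonmatching = [checklist[n].strip('*') for n in index_list]

# Write one sample ID per row. The buffer is an assumption made so the
# example is self-contained; for a file on disk under Python 3,
# open(path, 'w', newline='') would take its place.
buffer = io.StringIO()
writer = csv.writer(buffer)
writer.writerows([sample] for sample in nonmatching)

print(index_list)   # [0, 1]
print(nonmatching)  # ['S42TPZ2', 'S3138']
```

Because the two lists line up position by position, `zip(checklist, files)` would let you skip the intermediate index list entirely; the index version is shown here because it follows the approach above.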

Licensed under: CC-BY-SA with attribution

Not affiliated with StackOverflow