I don't know what order of magnitude that should take, but to improve the speed you could start by removing the redundant compare in your foreach loop:
$d = [string]::Compare($file.FullName, $username, $True)
String comparison is costly, and you're not using $d. You're also comparing against both $username and "AMS\" + $username, which again is costly; I can't see why you need to compare against both. I've modified your script to add timing to it. I would recommend trying it on a subset of the files to get some empirical data and work out how long the full set would take. Bear in mind that the total size of the files is irrelevant here, as you aren't processing the files themselves, just their properties.
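For example, you could time a small sample first and extrapolate a rate. A minimal sketch, assuming a 1000-file sample taken with Select-Object -First (both are illustrative choices, not part of the script below):

$sample = Get-ChildItem $path -Recurse | Select-Object -First 1000
$sw = [System.Diagnostics.Stopwatch]::StartNew()
foreach ($file in $sample) {
    # We only care about the cost of the ACL lookup here, so discard the result
    $null = (Get-Acl $file.FullName).Owner
}
$sw.Stop()
# Files per second over the sample; divide the full file count by this to estimate total time
Write-Host ("{0:N1} files/second" -f ($sample.Count / $sw.Elapsed.TotalSeconds))

Here's the modified script with timing added: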
<#
.SYNOPSIS
C:\ams\psscripts\list-files.ps1
.DESCRIPTION
List all the files that a given user owns
.PARAMETER username
The user to search for.
.PARAMETER logfile
Path to the log file. This is optional; if omitted, the log file is created as "e:\scratch\<username>-files.txt".
.EXAMPLE
C:\ams\psscripts\list-files.ps1 plo
Example: C:\ams\psscripts\list-files.ps1 plo u:\scratch\log.txt
#>
param (
[string]$username,
[string]$logfile
)
# Load modules
Set-ExecutionPolicy Unrestricted
#Import-Module ActiveDirectory
#Add-PSSnapin Quest.ActiveRoles.ADManagement
function printHelp {
Write-Host "This script will find all the files owned by a user. It scans \\dfs\groups"
Write-Host "C:\ams\psscripts\list-files.ps1 user logfile (optional)"
Write-Host "Example: C:\ams\psscripts\list-files.ps1 plo"
Write-Host "Example: C:\ams\psscripts\list-files.ps1 plo u:\scratch\log.txt"
}
#StopWatch
$stopWatch = New-Object System.Diagnostics.Stopwatch
$stopWatch.Start()
if ($logfile -eq "") {
$logfile = "e:\scratch\" + $username + "-files.txt"
Write-Host "Setting log file to $logfile"
}
# you must use a UNC path
[String]$path = "\\test-server\testfolder\subfolder"
[String]$AD_username = "AMS\" + $username
# check that we have a valid AD user
if (!(Get-QADUser $AD_username)){
Write-Host "ERROR: Not a valid AD User: $AD_username"
Exit 1 # non-zero exit so callers can detect the failure
}
Write-Output "Listing all files owned by $username from $path" | Out-File -FilePath $logfile
Write-Host "Listing all files owned by $username from $path"
$d = Get-Date
Write-Output $d | Out-File -FilePath $logfile -Append
$stopWatch.Stop()
Write-Output ("Setup time: {0}." -f $stopWatch.Elapsed) | Out-File -FilePath $logfile -Append
$stopWatch.Reset()
$stopWatch.Start()
$files = Get-ChildItem $path -Recurse
$stopWatch.Stop()
Write-Output ("Got {0} files to process, took {1}" -f $files.Count, $stopWatch.Elapsed) | Out-File -FilePath $logfile -Append
$stopWatch.Reset()
$stopWatch.Start()
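# Time the main scan: check each file's ACL owner against the AD account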
Foreach ($file in $files)
{
$f = Get-Acl $file.FullName
#$d = [string]::Compare($file.FullName, $username, $True)
#if (($f.Owner -eq $username) -or ($f.Owner -eq $AD_username))
if ($f.Owner -eq $AD_username)
{
Write-Host ("{0}" -f $file.FullName)
Write-Output $file.FullName | Out-File -FilePath $logfile -Append
}
}
$stopWatch.Stop()
Write-Output ("Processed {0} files, took {1}" -f $files.Count, $stopWatch.Elapsed) | Out-File -FilePath $logfile -Append
Write-Host "Completed"
exit 0
Running this yielded the following results on our infrastructure:
Got 37803 files to process, took 00:00:57.5834897
Processed 37803 files, took 00:10:42.2988004
Your original code took 15 minutes to process the same number of files:
Processed 37803 files, took 00:15:04.1024350
I added @GeorgeR.Jenkins's in-memory string construction, but it didn't significantly reduce the processing time:
Processed 37803 files, took 00:10:26.7815446
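For reference, a minimal sketch of what that change might look like, assuming the suggestion was to build the output in memory and write the log once, instead of calling Out-File -Append for every match ($owned is my name, not from the original):

# Collect matching paths in memory, then write them in a single operation
$owned = New-Object System.Collections.Generic.List[string]
foreach ($file in $files) {
    if ((Get-Acl $file.FullName).Owner -eq $AD_username) {
        $owned.Add($file.FullName)
    }
}
$owned | Out-File -FilePath $logfile -Append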
Interestingly, piping the Get-ChildItem output to a Where-Object clause didn't improve performance. Using
$files = Get-ChildItem $path -Recurse | where {(Get-Acl $_.FullName).Owner -eq $AD_username}
which returns only the files with the correct owner (so no further processing is needed), yielded:
Got 46 files to process, took 00:13:51.4940596
That all said, if you were running on infrastructure similar to mine, then at the fastest rate I've seen, 59 files per second, your 619,238 files would take around 175 minutes. The rate I got with your original code was 42 files per second, which would have taken 246 minutes. Again, I would advise running against a small subset of the files on your system to estimate how long the whole set will take before running it in full.
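The estimate is just file count divided by rate. To reproduce the arithmetic (the rates are the measurements above):

# Projected runtime = files / (files per second), converted to minutes
$fileCount = 619238
Write-Host ("Fastest rate (59/s): {0:N0} minutes" -f ($fileCount / 59 / 60))   # ~175
Write-Host ("Original rate (42/s): {0:N0} minutes" -f ($fileCount / 42 / 60))  # ~246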