Question

Does anyone know of a way (say PowerShell, or a tool) in Windows to recurse over a directory and convert any Unix files to Windows files?

I'd be perfectly happy with a way in PowerShell to at least detect a Unix file.

It's easy to do this for a single file, but I'm after something a bit more scalable (hence leaning towards a PowerShell-ish solution).

Solution

Here is the pure PowerShell way if you are interested.

Finding files with at least one Unix line ending (PowerShell v1):

dir * -inc *.txt | %{ if (gc $_.FullName -delim "`0" | Select-String "[^`r]`n") {$_} }
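
If a count is more useful than a yes/no answer, the same check can be done by reading each file as one string and counting matches. This is only a sketch: the function name Get-LineEndingCount is made up for illustration, and it assumes PowerShell 3.0 or later (for Get-Content -Raw).

```powershell
# Hypothetical helper (illustrative name, not a built-in cmdlet).
# Assumes PowerShell 3.0+ for Get-Content -Raw.
function Get-LineEndingCount {
    param([Parameter(Mandatory = $true)][string]$Path)

    # Read the file as a single string so the line endings survive.
    $text = Get-Content -LiteralPath $Path -Raw
    if ($null -eq $text) { $text = '' }   # an empty file yields $null

    # Count CRLF pairs, and LF characters that are not preceded by CR.
    $crlf = [regex]::Matches($text, "`r`n").Count
    $lf   = [regex]::Matches($text, "(?<!`r)`n").Count

    New-Object PSObject -Property @{ Path = $Path; CRLF = $crlf; LF = $lf }
}

# List .txt files under the current directory that contain at least one bare LF.
Get-ChildItem -Recurse -Include *.txt |
    Where-Object { -not $_.PSIsContainer } |
    ForEach-Object { Get-LineEndingCount $_.FullName } |
    Where-Object { $_.LF -gt 0 }
```

A file with mixed endings shows up with both counts above zero, which the one-liner cannot tell you.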

Here is how you find and convert Unix line endings to Windows line endings. One important thing to note is that an extra line ending (\r\n) will be added to the end of the file if there isn't one there already. If you really don't want that, I'll post an example of how you can avoid it (it is a bit more complex).

Get-ChildItem * -Include *.txt | ForEach-Object {
    ## If contains UNIX line endings, replace with Windows line endings
    if (Get-Content $_.FullName -Delimiter "`0" | Select-String "[^`r]`n")
    {
        $content = Get-Content $_.FullName
        $content | Set-Content $_.FullName
    }
}

The above works because PowerShell will automatically split the contents on \n (dropping \r if they exist) and then add \r\n when it writes each thing (in this case a line) to the file. That is why you always end up with a line ending at the end of the file.
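
To see that behaviour in isolation, here is a small experiment (a sketch, assuming it is run on Windows, where the default newline is \r\n; the temp file name is arbitrary):

```powershell
# Demo file with mixed endings: LF after "one", CRLF after "two", nothing after "three".
$path = Join-Path ([System.IO.Path]::GetTempPath()) 'line-ending-demo.txt'   # arbitrary name
[System.IO.File]::WriteAllText($path, "one`ntwo`r`nthree")

# Get-Content returns one string per line, with the terminators already removed.
$lines = @(Get-Content $path)
$lines.Count

# Set-Content appends the platform newline after every string it writes.
$lines | Set-Content $path

# Read the result back as one string and make the terminators visible.
$result = [System.IO.File]::ReadAllText($path)
$result -replace "`r", '\r' -replace "`n", '\n'
```

On Windows this should report 3 lines and then print one\r\ntwo\r\nthree\r\n, including the trailing \r\n that the original file did not have.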

Also, I wrote the above code so that it only modifies files that it needs to. If you don't care about that you can remove the if statement. Oh, make sure that only files get to the ForEach-Object. Other than that you can do whatever filtering you want at the start of that pipeline.
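
For example, a recursive variant that lets only files through might look like the sketch below. The function name Convert-TreeToCrlf and its -Root parameter are invented for illustration; the body is the same check-then-rewrite logic as above.

```powershell
# Hypothetical wrapper (illustrative name and parameter).
function Convert-TreeToCrlf {
    param([string]$Root = '.')

    Get-ChildItem -Path $Root -Recurse -Include *.txt |
        # Directories have PSIsContainer set; drop them so only files go on.
        Where-Object { -not $_.PSIsContainer } |
        ForEach-Object {
            # Rewrite only files that contain an LF not preceded by CR.
            if (Get-Content $_.FullName -Delimiter "`0" | Select-String "[^`r]`n") {
                $content = Get-Content $_.FullName
                $content | Set-Content $_.FullName
            }
        }
}
```

Calling Convert-TreeToCrlf C:\tmp would then process every .txt file under C:\tmp. Try it on a copy first, since the files are rewritten in place.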

OTHER TIPS

There are dos2unix and unix2dos in Cygwin.

This seems to work for me.

Get-Content Unix.txt | Out-File Dos.txt

Download Vim, open your file, and issue:

:se fileformat=dos|up

Batch for multiple files (all *.txt files in C:\tmp - recursive):

:args C:\tmp\**\*.txt
:argdo se fileformat=dos|up

You can use Visual Studio. File -> Advanced Save Options...

If Cygwin isn't for you, there are numerous standalone unix2dos executables for Windows if you Google around. Or you could write one yourself; see my similar question (for the opposite direction of conversion) here.

I spent 6 hours yesterday and today testing the code given above in a loop with 10,000 files, many of them larger than 50 KB. Bottom line: the PowerShell code is too slow to be usable for large files or large numbers of files. It also does not preserve BOM bytes. I found unix2dos 7.2.3 to be the fastest and most practical solution. Hope this helps others and saves them time.
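
If you have to stay in PowerShell and the BOM matters, one option is to work on raw bytes so that nothing gets decoded or re-encoded. The sketch below is untested against large file sets and makes no claim about speed. The function name Convert-LfToCrlfBytes is made up, and it assumes an encoding in which LF and CR are the single bytes 10 and 13 (ASCII, UTF-8, most ANSI code pages). It is not suitable for UTF-16 or UTF-32.

```powershell
# Hypothetical helper (illustrative name). Assumes a single-byte-compatible
# encoding such as ASCII or UTF-8; do not use on UTF-16/UTF-32 files.
function Convert-LfToCrlfBytes {
    param([Parameter(Mandatory = $true)][string]$Path)

    # .NET file APIs resolve relative paths against the process directory,
    # not the PowerShell location, so resolve to a full path first.
    $fullPath = (Resolve-Path -LiteralPath $Path).ProviderPath

    $bytes = [System.IO.File]::ReadAllBytes($fullPath)
    $out = New-Object System.IO.MemoryStream
    $prev = 0
    foreach ($b in $bytes) {
        # Insert CR (13) before any LF (10) that is not already preceded by CR.
        if ($b -eq 10 -and $prev -ne 13) { $out.WriteByte(13) }
        $out.WriteByte($b)
        $prev = $b
    }

    # Rewrite the file only when at least one CR was inserted.
    if ($out.Length -ne $bytes.Length) {
        [System.IO.File]::WriteAllBytes($fullPath, $out.ToArray())
    }
}
```

Every byte other than the inserted CRs is copied through unchanged, so a leading BOM stays where it was and no line ending is added at the end of the file.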

Opening a file with Unix line endings in WordPad and saving it will rewrite all the line endings as DOS. A bit laborious for large numbers of files, but it works well enough for a few files every once in a while.

It works for me:

Get-ChildItem -Recurse -File | % { $tmp = Get-Content $_.FullName; $tmp | Out-File $_.FullName -Encoding UTF8 }

How about this (with a negative lookbehind)? Without -nonewline, set-content puts an extra `r`n at the bottom. With the parentheses, you can modify the same file.

function unix2dos {
    (Get-Content -raw $args[0]) -replace "(?<!`r)`n","`r`n" | 
    set-content -nonewline $args[0]
}

The reverse, Windows to Unix text, would be this.

function dos2unix {
    (Get-Content -raw $args[0]) -replace "`r`n","`n" | 
    set-content -nonewline $args[0]
}

Examples:

unix2dos file.txt
dos2unix file.txt

If you have Emacs, you can check it with esc-x hexl-mode. Older versions of Notepad won't display Unix text correctly. I have to specify the path for set-content, because -replace erases the pspath.

Licensed under: CC-BY-SA with attribution