Question

I've been looking for several hours now with no success!

I have an XML file (which a program creates) and I would like to get information out of it. My problem is that the header of the file says UTF-8, but the file is actually encoded in Unicode! VB.NET's XmlTextReader won't read that file. As soon as it gets to "Load", it throws exceptions. Then I opened one of those thousands of XML files in Notepad++ and saved it as UTF-8, and guess what: that file worked!

But I don't think I want to change all the files on our server (new ones are added daily!) and I don't think I can get the developer to change the way he saves those XML-files.

Any ideas on how to "cheat" VB.NET into reading those files anyway?

Thanks!

Solution

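You can specify the encoding yourself when you read the file into memory. XmlTextReader assumes that a TextReader passed to it is already set to the correct encoding, so the encoding named in the XML header no longer matters:

' Encoding.Unicode is UTF-16 little-endian, which is what Windows tools usually call "Unicode".
' If the file starts with a byte order mark, StreamReader uses that instead.
Dim Stream As New IO.StreamReader("File.xml", System.Text.Encoding.Unicode)
Dim Reader As New Xml.XmlTextReader(Stream)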

For a more advanced approach, you can first detect the encoding of the file and then read it with the matching encoding.
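One way to do that detection is to look at the byte order mark (BOM) at the start of the file. The following is only a minimal sketch: `EncodingSniffer` and `DetectEncoding` are hypothetical names, it checks only the UTF-8 and UTF-16 marks, and it assumes you supply a fallback for files that have no BOM at all.

```vbnet
Imports System.Text

Module EncodingSniffer
    ' Hypothetical helper: guesses the encoding from the first bytes of a file.
    ' Only UTF-8 and UTF-16 byte order marks are recognised; UTF-32 is not handled.
    ' When no BOM is found, the caller-supplied fallback is returned.
    Function DetectEncoding(head As Byte(), fallback As Encoding) As Encoding
        If head.Length >= 3 AndAlso head(0) = &HEF AndAlso head(1) = &HBB AndAlso head(2) = &HBF Then
            Return Encoding.UTF8                ' EF BB BF
        End If
        If head.Length >= 2 AndAlso head(0) = &HFF AndAlso head(1) = &HFE Then
            Return Encoding.Unicode             ' FF FE = UTF-16 little-endian
        End If
        If head.Length >= 2 AndAlso head(0) = &HFE AndAlso head(1) = &HFF Then
            Return Encoding.BigEndianUnicode    ' FE FF = UTF-16 big-endian
        End If
        Return fallback
    End Function
End Module
```

The result can be passed as the second argument of the StreamReader constructor. If the files have no BOM, this sketch cannot tell the encodings apart and you have to rely on the fallback.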

Other tips

First, read the dodgy XML into a byte array. Then convert that into a string, specifying the character encoding, like so:

    ' Requires Imports System.IO; pathSource holds the path of the XML file.
    Dim strText As String
    Using fsSource As New FileStream(pathSource, FileMode.Open, FileAccess.Read)
        ' Read the source file into a byte array.
        Dim bytes(CInt(fsSource.Length) - 1) As Byte
        Dim numBytesToRead As Integer = bytes.Length
        Dim numBytesRead As Integer = 0

        While numBytesToRead > 0
            ' Read may return anything from 0 to numBytesToRead.
            Dim n As Integer = fsSource.Read(bytes, numBytesRead, numBytesToRead)
            ' Stop when the end of the file is reached.
            If n = 0 Then Exit While
            numBytesRead += n
            numBytesToRead -= n
        End While

        ' Decode only the bytes that were actually read.
        strText = System.Text.Encoding.GetEncoding(1252).GetString(bytes, 0, numBytesRead)
    End Using

I am using windows-1252 here, but you will need to change that to whatever encoding those files actually use.
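Once the text is decoded, it can be handed to the XML parser as a string. Here is a shorter sketch of the same idea, assuming the whole file fits comfortably in memory; `XmlLoader` and `LoadXmlWithEncoding` are hypothetical names, not part of the framework. It uses File.ReadAllText instead of a manual read loop because that call also drops a leading byte order mark, which would otherwise end up as a stray character at the start of the string.

```vbnet
Imports System.IO
Imports System.Text
Imports System.Xml

Module XmlLoader
    ' Hypothetical helper: decodes the file with the given encoding and parses it.
    Function LoadXmlWithEncoding(path As String, enc As Encoding) As XmlDocument
        ' File.ReadAllText decodes the bytes and removes a leading byte order mark.
        Dim text As String = File.ReadAllText(path, enc)
        Dim doc As New XmlDocument()
        ' The text is already decoded, so the encoding named in the XML header
        ' is not used to decode it again.
        doc.LoadXml(text)
        Return doc
    End Function
End Module
```

For the files in the question, the call would look like `LoadXmlWithEncoding("File.xml", Encoding.Unicode)`.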

License: CC-BY-SA with attribution
Not affiliated with StackOverflow