Question

I've been looking for several hours now with no success!

I have an XML file (which a program creates) and I'd like to get information out of it. My problem is that the header of the file says UTF-8, but the file is actually encoded in Unicode! VB.NET's XmlTextReader won't read that file. As soon as it gets to "Load", it throws exceptions. Then I opened one of those thousands of XML files in Notepad++ and saved it as UTF-8, and guess what: that file worked!

But I don't want to change all the files on our server (new ones are added daily!), and I don't think I can get the developer to change the way he saves those XML files.

Any ideas on how to "cheat" VB.net into reading those files anyways?

Thanks!


Solution

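You can choose the encoding yourself when you read the file into memory. When XmlTextReader is given a TextReader, it works on the characters that reader has already decoded, so the misleading declaration in the header no longer decides how the bytes are interpreted: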

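    ' StreamReader checks for a byte order mark by default and switches encoding if it finds one.
    ' If the files are UTF-16 without a byte order mark, pass System.Text.Encoding.Unicode instead.
    Dim Stream As New IO.StreamReader("File.xml", System.Text.Encoding.UTF8)
    Dim Reader As New Xml.XmlTextReader(Stream)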

For a more advanced approach, you can first detect the encoding of the file and then open it with that encoding.
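
One way to do that detection, as a minimal sketch: look at the first bytes of the file for a byte order mark (BOM) and fall back to a caller-supplied default when none is present. The names `EncodingSniffer` and `DetectEncoding` are made up for illustration and are not part of the original answer. A file without a BOM cannot be identified this way, so the fallback is only a guess.

```vbnet
Imports System.IO
Imports System.Text

Module EncodingSniffer
    ' Hypothetical helper: guesses the encoding from the byte order mark (BOM).
    ' Returns the caller-supplied fallback when no known BOM is found.
    Function DetectEncoding(path As String, fallback As Encoding) As Encoding
        Dim bom(3) As Byte   ' four bytes, indices 0 to 3
        Dim count As Integer
        Using fs As New FileStream(path, FileMode.Open, FileAccess.Read)
            count = fs.Read(bom, 0, bom.Length)
        End Using

        If count >= 3 AndAlso bom(0) = &HEF AndAlso bom(1) = &HBB AndAlso bom(2) = &HBF Then
            Return Encoding.UTF8
        End If
        ' Check UTF-32 LE before UTF-16 LE: both start with FF FE.
        If count >= 4 AndAlso bom(0) = &HFF AndAlso bom(1) = &HFE AndAlso bom(2) = 0 AndAlso bom(3) = 0 Then
            Return Encoding.UTF32
        End If
        If count >= 2 AndAlso bom(0) = &HFF AndAlso bom(1) = &HFE Then
            Return Encoding.Unicode           ' UTF-16 little endian
        End If
        If count >= 2 AndAlso bom(0) = &HFE AndAlso bom(1) = &HFF Then
            Return Encoding.BigEndianUnicode  ' UTF-16 big endian
        End If
        Return fallback
    End Function
End Module
```

The result can be passed as the second argument to the `IO.StreamReader` constructor. Since `StreamReader` performs a similar BOM check on its own by default, a helper like this is mainly useful when you need to know the encoding up front, for example to log which files are mislabeled.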

Other suggestions

First you need to read the dodgy XML into a byte array, and then convert that into a string, specifying the character encoding.

Like so:

    ' Requires: Imports System.IO
    ' Declared outside the Using block so the text is still available afterwards.
    Dim strText As String
    Using fsSource As New FileStream(pathSource, FileMode.Open, FileAccess.Read)
        ' Read the source file into a byte array.
        Dim bytes(CInt(fsSource.Length) - 1) As Byte
        Dim numBytesToRead As Integer = bytes.Length
        Dim numBytesRead As Integer = 0

        While numBytesToRead > 0
            ' Read may return anything from 0 to numBytesToRead.
            Dim n As Integer = fsSource.Read(bytes, numBytesRead, numBytesToRead)
            ' Stop when the end of the file is reached.
            If n = 0 Then
                Exit While
            End If
            numBytesRead += n
            numBytesToRead -= n
        End While

        ' Decode the bytes using the encoding the file is really in.
        strText = System.Text.Encoding.GetEncoding(1252).GetString(bytes)
    End Using

I am using windows-1252 here, but you will need to change that to whatever encoding those files actually use (for UTF-16 little endian, that is System.Text.Encoding.Unicode).
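
Once the text is decoded, it can be handed to the XML parser as a string, so the parser no longer decodes the bytes itself. The following is a minimal sketch rather than a definitive implementation: the names `XmlLoader` and `LoadXmlWithEncoding` are hypothetical, and `File.ReadAllBytes` stands in for the manual read loop for brevity.

```vbnet
Imports System.IO
Imports System.Text
Imports System.Xml

Module XmlLoader
    ' Hypothetical helper: decodes the file with an explicit encoding and parses the result.
    Function LoadXmlWithEncoding(path As String, enc As Encoding) As XmlDocument
        Dim bytes() As Byte = File.ReadAllBytes(path)
        Dim text As String = enc.GetString(bytes)

        ' GetString keeps a leading byte order mark as the character U+FEFF.
        ' Strip it so the XML declaration is the very first thing in the string.
        If text.Length > 0 AndAlso text(0) = ChrW(&HFEFF) Then
            text = text.Substring(1)
        End If

        Dim doc As New XmlDocument()
        doc.LoadXml(text)
        Return doc
    End Function
End Module
```

Assuming the files really are UTF-16 little endian, as the question suggests, a call would look like `LoadXmlWithEncoding("File.xml", Encoding.Unicode)`.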

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow