Question

I have a tab-delimited text file which was imported into PB using a DataStore with the ImportFile() method. There is no error during the import, but when I checked the table, the dash character had turned into an invalid character (â€). The table's column is of the varchar(300) data type.

Any help / advice is appreciated.

[Screenshot: the imported table data showing the invalid character]

And when I check the database, the result set is:

[Screenshot: the database result set showing the corrupted character]

Below is the import script I've currently implemented.

//Import File Script
IF (ids_edihdr.ImportFile(ls_SourcePath,1,1) = 1 ) AND (ids_edidtl.ImportFile(ls_SourcePath,2) > 0 ) THEN 
    //HEADER
    IF ids_edihdr.RowCount() = 1 THEN 

        ids_edihdr.SetItem(1,'FNAME',Upper(as_file))
        ids_edihdr.SetItem(1,'CREATEDBY',Upper(SQLCA.LogID))    
        ids_edihdr.SetItem(1,'CREATEDDATE',idt_TranDate)    

    END IF

    //DETAIL
    IF ids_edidtl.RowCount() >= 1 THEN
        FOR ll_edidtl = 1 TO ids_edidtl.RowCount()
            ids_edidtl.SetItem(ll_edidtl,'Fname',Upper(as_file))
            ids_edidtl.SetItem(ll_edidtl,'CREATEDBY',Upper(SQLCA.LogID))    
            ids_edidtl.SetItem(ll_edidtl,'CREATEDDATE',idt_TranDate)
        NEXT        
    END IF
END IF
Was it helpful?

Solution

Any chance the file being imported contains data entered in Word or Excel? Have you looked at the data file with a hex editor? Odds are the dash character was "intelligently" substituted with an extended character, and you have a character-set clash going on. My bet is that the fix is in the data file, not the code.
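If you don't have a hex editor handy, a rough sketch like the following (assuming PB 10.5 or later for the byte datatype and GetByte(), and that ls_sourcepath holds your file path) will dump the raw byte values so a substituted dash stands out; a UTF-8 en dash shows up as the sequence 226 128 147:

long ll_file, ll_i
blob lbl_raw
byte lb_byte
string ls_dump

ll_file = FileOpen(ls_sourcepath, StreamMode!, Read!, LockRead!)
FileReadEx(ll_file, lbl_raw)
FileClose(ll_file)

FOR ll_i = 1 TO Len(lbl_raw)
    GetByte(lbl_raw, ll_i, lb_byte)              // read one raw byte from the blob
    ls_dump = ls_dump + String(lb_byte) + ' '    // collect decimal byte values
NEXT
MessageBox('Raw bytes', ls_dump)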

Good luck,

Terry

Other suggestions

I did a little bit of research and learned that the character in question is a Unicode dash (most likely an en dash, U+2013, rather than the plain ASCII hyphen U+002D, which could not be corrupted this way). If the data looked fine in your input file and was corrupt after importing, the problem may be due to PB not handling the data as Unicode, so you can fix the situation using functions in PB.

It could be that the database interface you are using doesn't support the conversion between ANSI and Unicode (see page 7). I'm not sure if you were using a pipeline object or anything else where database drivers come into play.

Either way, knowing it is a character-encoding issue, fixing this should be pretty simple: just use the appropriate encoding enumerated argument (EncodingANSI!, EncodingUTF8!, and so on) on the String and Blob methods prior to importing the text into the DataWindow. If that isn't possible, then you could write a quick routine to read through the file, convert, and save it before importing.
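A rough convert-and-save sketch, assuming the source file is really UTF-8 without a BOM and small enough for a single FileWrite call (under roughly 32 KB); ls_targetpath is illustrative:

long ll_in, ll_out
blob lbl_raw
string ls_text

ll_in = FileOpen(ls_sourcepath, StreamMode!, Read!, LockRead!)
FileReadEx(ll_in, lbl_raw)
FileClose(ll_in)

ls_text = String(lbl_raw, EncodingUTF8!)   // decode the UTF-8 bytes into a PB string

// write a converted copy; EncodingUTF16LE! matches the PB 10+ native encoding
ll_out = FileOpen(ls_targetpath, TextMode!, Write!, LockWrite!, Replace!, EncodingUTF16LE!)
FileWrite(ll_out, ls_text)
FileClose(ll_out)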

If you don't want to convert before importing, you can instead loop through the DataWindow/DataStore and fix the data before actually performing the update to the database.
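For instance, a rough post-import cleanup sketch; the column name DESCR is made up for illustration, and note that this patches the symptom in the data rather than fixing the encoding itself:

long ll_row, ll_pos
string ls_val

FOR ll_row = 1 TO ids_edidtl.RowCount()
    ls_val = ids_edidtl.GetItemString(ll_row, 'DESCR')
    ll_pos = Pos(ls_val, 'â€“')                   // the 3-character artifact left by a misread UTF-8 en dash
    DO WHILE ll_pos > 0
        ls_val = Replace(ls_val, ll_pos, 3, '-')  // swap the artifact for a plain dash
        ll_pos = Pos(ls_val, 'â€“')
    LOOP
    ids_edidtl.SetItem(ll_row, 'DESCR', ls_val)
NEXT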

You can find examples of code on my blog on converting between ANSI and Unicode, but basically you just use one of these encoding parameters on String and Blob functions (a short sketch follows the list):

  • EncodingANSI!
  • EncodingUTF8!
  • EncodingUTF16LE! – UTF-16 Little Endian encoding (PowerBuilder 10 default)
  • EncodingUTF16BE! – UTF-16 Big Endian encoding
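As a minimal sketch of the basic conversion (variable names are illustrative):

blob lbl_raw
string ls_text

ls_text = String(lbl_raw, EncodingUTF8!)     // decode UTF-8 bytes into a PB (UTF-16LE) string
lbl_raw = Blob(ls_text, EncodingUTF16LE!)    // re-encode to a blob if one is needed downstream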

Appreciate all the comments. I'd like to share how I addressed the question: I added a script to handle file-encoding validation and conversion as well.

// Validate the file's encoding and convert to Unicode before importing
long ll_FileNum, ll_FileLength
integer li_bytes
encoding eRet
blob lbl_data
string ls_unicode

ll_FileLength = FileLength(ls_sourcepath)
eRet = FileEncoding(ls_sourcepath)

// A UTF-8 file without a BOM is reported as EncodingANSI!, so decode the
// raw bytes as UTF-8 to recover the dash characters correctly
IF eRet = EncodingANSI! AND ll_FileLength <= 32765 THEN
    ll_FileNum = FileOpen(ls_sourcepath, StreamMode!, Read!, LockWrite!)
    li_bytes = FileReadEx(ll_FileNum, lbl_data)
    ls_unicode = String(lbl_data, EncodingUTF8!)
    FileClose(ll_FileNum)
END IF

IF (ids_edihdr.ImportString(ls_unicode,1,1) = 1 ) AND (ids_edidtl.ImportString(ls_unicode,2) > 0 ) THEN
    <some conditions here....>
END IF
Licensed under: CC-BY-SA with attribution
Not affiliated with Stack Overflow