Question

I need to write a 'simple' util to convert from ASCII to EBCDIC.

The ASCII is coming from Java on the web and going to an AS400. I've had a Google around and can't seem to find an easy solution (maybe because there isn't one :( ). I was hoping for an open source or paid-for util that has already been written.

Like this maybe?

Converter.convertToAscii(String textFromAS400)
Converter.convertToEBCDIC(String textFromJava)

Thanks,

Scott

Solution

JTOpen, IBM's open source version of their Java Toolbox, has a collection of classes for accessing AS/400 objects, including a FileReader and FileWriter for accessing native AS400 text files (see the sketch after the feature list below). That may be easier to use than writing your own conversion classes.

From the JTOpen homepage:

Here are just a few of the many i5/OS and OS/400 resources you can access using JTOpen:

  • Database -- JDBC (SQL) and record-level access (DDM)
  • Integrated File System
  • Program calls
  • Commands
  • Data queues
  • Data areas
  • Print/spool resources
  • Product and PTF information
  • Jobs and job logs
  • Messages, message queues, message files
  • Users and groups
  • User spaces
  • System values
  • System status
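
As a rough illustration, here is a minimal sketch of reading an AS400 text file through JTOpen's IFSFileReader, which decodes the file's EBCDIC bytes into Java characters based on the file's CCSID. The host name, credentials and path are placeholders, so check the details against the jt400 release you are using:

import java.io.BufferedReader;

import com.ibm.as400.access.AS400;
import com.ibm.as400.access.IFSFile;
import com.ibm.as400.access.IFSFileReader;

public class ReadAs400TextFile {

    public static void main(String[] args) throws Exception {
        // Placeholder connection details -- replace with your own system, user and password.
        AS400 system = new AS400("my.as400.example", "MYUSER", "MYPASSWORD");

        try {
            // IFSFileReader handles the EBCDIC-to-Unicode conversion for you.
            IFSFile file = new IFSFile(system, "/home/myuser/sample.txt");
            try (BufferedReader reader = new BufferedReader(new IFSFileReader(file))) {
                String line;
                while ((line = reader.readLine()) != null) {
                    System.out.println(line);
                }
            }
        } finally {
            system.disconnectAllServices();
        }
    }
}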

OTHER TIPS

Please note that a String in Java always holds text in Java's own internal (Unicode) representation. An ASCII or EBCDIC "string" held in memory, before it is decoded into a String, is really a byte[].

ASCII -> Java:   new String(bytes, "ASCII")
EBCDIC -> Java:  new String(bytes, "Cp1047")
Java -> ASCII:   string.getBytes("ASCII")
Java -> EBCDIC:  string.getBytes("Cp1047")
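
A minimal round-trip sketch using those calls (the sample text and the choice of Cp1047 are only assumptions for illustration):

import java.nio.charset.Charset;

public class RoundTrip {

    public static void main(String[] args) {
        String text = "HELLO AS400"; // held internally as Unicode

        byte[] ebcdicBytes = text.getBytes(Charset.forName("Cp1047")); // Java -> EBCDIC
        byte[] asciiBytes = text.getBytes(Charset.forName("ASCII"));   // Java -> ASCII

        String fromEbcdic = new String(ebcdicBytes, Charset.forName("Cp1047")); // EBCDIC -> Java
        String fromAscii = new String(asciiBytes, Charset.forName("ASCII"));    // ASCII -> Java

        System.out.println(fromEbcdic.equals(text) && fromAscii.equals(text)); // prints true
    }
}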
package javaapplication1;

import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CharsetEncoder;

public class ConvertBetweenCharacterSetEncodingsWithCharBuffer {

    public static void main(String[] args) {
        // "ñâæÃÈÄóöó" is EBCDIC data that was mis-decoded as ISO-8859-1; re-read it as CP1047.
        String cadena = "ñâæÃÈÄóöó";
        System.out.println(convert(cadena, "CP1047", "ISO-8859-1"));

        // The reverse direction: readable text back to its mis-decoded EBCDIC form.
        cadena = "1SWCHD363";
        System.out.println(convert(cadena, "ISO-8859-1", "CP1047"));
    }

    // Re-encodes strToConvert to bytes using the "out" charset, then decodes those
    // bytes using the "in" charset.
    public static String convert(String strToConvert, String in, String out) {
        try {
            CharsetEncoder encoder = Charset.forName(out).newEncoder();
            CharsetDecoder decoder = Charset.forName(in).newDecoder();

            CharBuffer uCharBuffer = CharBuffer.wrap(strToConvert);
            ByteBuffer bbuf = encoder.encode(uCharBuffer);  // String -> bytes in "out"
            CharBuffer cbuf = decoder.decode(bbuf);         // bytes -> String in "in"

            return cbuf.toString();
        } catch (CharacterCodingException e) {
            return "";
        }
    }
}

You should use either the Java character set Cp1047 (Java 5) or Cp500 (JDK 1.3+).

Use the String constructor: String(byte[] bytes, [int offset, int length,] String enc)
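
For example, a small sketch decoding only part of a fixed-width EBCDIC record with the offset/length variant; the record layout here is made up for illustration:

import java.io.UnsupportedEncodingException;

public class DecodeSlice {

    public static void main(String[] args) throws UnsupportedEncodingException {
        // Pretend this is a fixed-width EBCDIC record read from the AS400;
        // here it is simply built by encoding a known string.
        byte[] record = "CUST000123NAMEFIELD".getBytes("Cp1047");

        // Decode only the 6 bytes starting at offset 4 (a hypothetical customer number field).
        String custNo = new String(record, 4, 6, "Cp1047");
        System.out.println(custNo); // 000123
    }
}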

You can create one yourself with this translation table.

But here is a site that has a link to a Java example.

I wrote some code that converts between the two encodings easily.

import java.nio.charset.Charset;

public class Converter {

    public static void main(String[] args) {

        Charset charsetEBCDIC = Charset.forName("CP037");
        Charset charsetASCII = Charset.forName("US-ASCII");

        // "(" is 0x4D in CP037, which is "M" in ASCII, so these two samples mirror each other.
        String ebcdic = "(((((((";
        System.out.println("String EBCDIC: " + ebcdic);
        System.out.println("String converted to ASCII: " + convertTo(ebcdic, charsetEBCDIC, charsetASCII));

        String ascii = "MMMMMM";
        System.out.println("String ASCII: " + ascii);
        System.out.println("String converted to EBCDIC: " + convertTo(ascii, charsetASCII, charsetEBCDIC));
    }

    // Encodes the text to bytes with encodingFrom, then decodes those bytes with encodingTo.
    public static String convertTo(String dados, Charset encodingFrom, Charset encodingTo) {
        return new String(dados.getBytes(encodingFrom), encodingTo);
    }
}

It should be fairly simple to write a map for the EBCDIC character set, and one for the ASCII character set, and in each return the character representation of the other. Then just loop over the string to translate, and look up each character in the map and append it to an output string.

I don't know if there are any converters publicly available, but it shouldn't take more than an hour or so to write one.

This is what I've been using.

import java.io.UnsupportedEncodingException;

public class EbcdicAsciiTables
{
  public static final int[] ebc2asc = new int[256];
  public static final int[] asc2ebc = new int[256];

  static
  {
    // Build both lookup tables once by decoding every possible EBCDIC byte value.
    byte[] values = new byte[256];
    for (int i = 0; i < 256; i++)
      values[i] = (byte) i;

    try
    {
      String s = new String (values, "CP1047");
      char[] chars = s.toCharArray ();
      for (int i = 0; i < 256; i++)
      {
        int val = chars[i];
        ebc2asc[i] = val;
        asc2ebc[val] = i;
      }
    }
    catch (UnsupportedEncodingException e)
    {
      e.printStackTrace ();
    }
  }
}
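
A possible way to use those tables for the character-by-character translation described above, assuming the tables live in an EbcdicAsciiTables class as in the previous snippet:

public class TableDemo
{
  // Translate raw EBCDIC bytes to ASCII bytes, one byte at a time, via the lookup table.
  public static byte[] ebcdicToAscii(byte[] ebcdic)
  {
    byte[] ascii = new byte[ebcdic.length];
    for (int i = 0; i < ebcdic.length; i++)
      ascii[i] = (byte) EbcdicAsciiTables.ebc2asc[ebcdic[i] & 0xFF];
    return ascii;
  }

  public static void main(String[] args) throws Exception
  {
    byte[] ebcdic = "HELLO".getBytes("CP1047"); // sample EBCDIC input built from a literal
    System.out.println(new String(ebcdicToAscii(ebcdic), "ASCII")); // prints HELLO
  }
}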

Perhaps, like me, you were not strictly using a JDBC feature (writing to a data queue, in my instance), so the auto-magical encoding didn't apply to you because you're communicating through multiple APIs.

My issue was similar to @scottyab's, with certain characters not mapping. In my case, the example code I was referencing worked perfectly, but writing an XML string to a data queue resulted in [ being replaced with £.

As a web developer working with a pre-existing database backend with decades of information, I didn't simply have the ability to "right" the "mis-configuration" as one other commenter suggests.

However, I was able to see which Coded Character Set Identifier the IBM i was likely using by issuing a command on the 400 to display the file field information of a known good file: DSPFFD *LIB*/*FILE*.

Doing so gave me good information, including the specific CCSID in use.

After seeking some information on CCSIDs, I ran into an IBM page on EBCDIC with key information, quoted here (since such pages have a habit of disappearing):

Extended Binary Coded Decimal Interchange Code (EBCDIC) is an encoding scheme that is typically used on zSeries (z/OS®) and iSeries (System i®).

And most helpful:

Some example EBCDIC CCSIDs are 37, 500, and 1047.

Since I had already learned from this question that Cp1047 is another good character set to try (this time, the £ turned into an accented "Y"), I tried Cp37, found that no such charset existed, and then attempted Cp037 and got the right encoding.

It looks like the key is finding which Coded Character Set Identifier (CCSID) your system uses and ensuring that your jt400 code - which may otherwise be working perfectly - matches the encoding set on the AS400, which in my case was chosen long before my time and decades of business logic ago.
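
As an illustration only, here is a rough sketch of what that ends up looking like with jt400 when you pin the conversion to CCSID 37 explicitly via AS400Text. The host, library, queue and XML content are placeholders, so verify the constructors and object paths against your own jt400 version and system:

import com.ibm.as400.access.AS400;
import com.ibm.as400.access.AS400Text;
import com.ibm.as400.access.DataQueue;

public class DataQueueWrite {

    public static void main(String[] args) throws Exception {
        // Placeholder system and object names.
        AS400 system = new AS400("my.as400.example", "MYUSER", "MYPASSWORD");
        DataQueue queue = new DataQueue(system, "/QSYS.LIB/MYLIB.LIB/MYQUEUE.DTAQ");

        String xml = "<order><id>123</id></order>";

        // AS400Text converts Java text to EBCDIC bytes; forcing CCSID 37 here means
        // characters such as '[' get the byte values the 400 side actually expects.
        AS400Text converter = new AS400Text(xml.length(), 37);
        byte[] ebcdicBytes = converter.toBytes(xml);

        queue.write(ebcdicBytes);
        system.disconnectAllServices();
    }
}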

I want to add to what Kwebble and Shawn S have said. I was able to use JTOpen to do this.

I needed to write to a field which was 6 0P (6 bytes, nothing behind the decimal, packed). That's a decimal(11,0) for those of you who don't grok DDM.

    // 11 digits with 0 decimal places packs into 6 bytes.
    AS400PackedDecimal convertedCustId = new AS400PackedDecimal(11, 0);
    byte[] packedCust = convertedCustId.toBytes((int) custId);

    // Decode the packed bytes as Cp037 so they survive the later re-encode to EBCDIC.
    String packedCustStr = new String(packedCust, "Cp037");

    StringBuilder jcommData = new StringBuilder();
    jcommData.append(String.format("%6s", packedCustStr));

Yes, I used the library KWebble mentioned. Looking at DSPFFD as Shawn S mentioned, I discovered that the table was using CCSID 37. This worked.

I originally tried using Cp1047, as per Alan Krueger's suggestion. It seemed to work. Unfortunately, if my custId ended with a 5, the data rendered into the file was B0 instead of 5F. Changing it to Cp037 fixed that.

Licensed under: CC-BY-SA with attribution