Versatile string manipulation procedure in RPG

Question 1

Most character variables in RPG are indeed fixed length, which means a variable always occupies its full declared length. A character variable defined as 50a will always contain exactly 50 characters. Eval myChar = 'A'; will result in myChar containing 50 characters: the letter A followed by 49 blanks. This is boring but important.

The second boring but important bit is to understand that the caller allocates memory, not the callee. If the caller declares myChar 50a and the callee declares myParm 65535a, the caller has initialised only 50 bytes of storage. If the callee tries to work with myParm past the 50th byte, it is working with storage whose condition is unknown. As they say, unpredictable results may occur.

This then is the background to your question about a subprocedure handling character variables whose size is not known to the subprocedure in advance. The classic way to handle this is to pass not only the character variable, but also its length: myProcedure(myChar: %len(myChar)); That's kind of ugly, and it forces every caller to calculate the length of myChar. It sure would be nice if the subprocedure could interrogate the incoming parameter to find out how the caller defined it.
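The classic length-passing approach can be sketched as follows. This is only an illustration: countBlanks, str, strLen and blankCount are hypothetical names, and the sketch assumes a compiler level that accepts fully free-form source (**free) and unprototyped local procedures.

```rpgle
**free
ctl-opt dftactgrp(*no) actgrp(*new);

dcl-s myChar char(50) inz('a b c');
dcl-s blankCount int(10);

// The caller supplies the declared length alongside the variable.
blankCount = countBlanks(myChar: %len(myChar));

*inlr = *on;
return;

// Hypothetical helper: looks only at the first strLen bytes of str,
// because anything past that is storage the caller never allocated.
dcl-proc countBlanks;
  dcl-pi *n int(10);
    str char(65535) const options(*varsize);
    strLen int(10) value;
  end-pi;
  dcl-s i int(10);
  dcl-s found int(10) inz(0);

  for i = 1 to strLen;
    if %subst(str: i: 1) = ' ';
      found += 1;
    endif;
  endfor;
  return found;
end-proc;
```

Every reference to str is bounded by strLen, which is the whole point of passing the length.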

IBM have provided just such a facility through something they call Operational Descriptors. With operational descriptors, the caller passes metadata about the character parameter to the callee. One retrieves that via the CEEDOD API. There's an example of using CEEDOD here.

Basically, the subprocedure needs to declare that it wants operational descriptors:

 dddeCheck         pr              n   opdesc
 d test                          20a   const options(*varsize)

The caller then makes a normal looking call out to the subprocedure:

if ddeCheck(mtel) = *on;               // 10 bytes
...
endif;
if ddeCheck(mdate: *on) = *on;         // 6 bytes
...
endif;

Note that the caller passes different sized fixed length variables to the subprocedure.

The subprocedure needs to use CEEDOD to interrogate the incoming parameter's length:

     dddeCheck         pi              n   opdesc
     d test                          20a   const options(*varsize)
...
     dCEEDOD           pr
     d parmNum                       10i 0 const
     d descType                      10i 0
     d dataType                      10i 0
     d descInfo1                     10i 0
     d descInfo2                     10i 0
     d parmLen                       10i 0
     d ec                            12a   options(*omit)

     d parmNum         s             10i 0
     d descType        s             10i 0
     d dataType        s             10i 0
     d descInfo1       s             10i 0
     d descInfo2       s             10i 0
     d parmLen         s             10i 0
     d ec              s             12a
...
       CEEDOD (1: descType: dataType: descinfo1: descinfo2: parmlen: *omit);

At this point, parmLen contains the length with which the caller defined the incoming variable. Now it's up to us to do something with that information. If we're processing character by character, we need to do something like this:

for i = 1 to parmLen;
  char_test = %subst(test: i: 1);
  ...
endfor;

If we're processing as a single string, we need to do something like this:

returnVar = %xlate(str_lc_letters_c: str_uc_letters_c: %subst(s: 1: parmLen));

The important thing is to never, ever refer to the input parameter unless that reference is somehow bounded by the actual variable length as defined by the caller. These precautions are only necessary for fixed length variables. The compiler already knows the length of variable length character variables.
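Putting the pieces above together, a complete procedure might look roughly like the sketch below. It is not the original author's code: upperFixed, str and result are hypothetical names, the constants are borrowed from the question, and free-form (**free) syntax is assumed to be available. The CEEDOD parameter list is the same one shown in the fixed-form prototype above.

```rpgle
**free
ctl-opt dftactgrp(*no) actgrp(*new);

// Retrieve operational descriptor information for a parameter.
dcl-pr CEEDOD extproc('CEEDOD');
  parmNum   int(10) const;
  descType  int(10);
  dataType  int(10);
  descInfo1 int(10);
  descInfo2 int(10);
  parmLen   int(10);
  ec        char(12) options(*omit);
end-pr;

dcl-c str_uc_letters_c 'ABCDEFGHIJKLMNOPQRSTUVWXYZ';
dcl-c str_lc_letters_c 'abcdefghijklmnopqrstuvwxyz';

dcl-s mtel   char(10) inz('abc');
dcl-s result varchar(65535);

result = upperFixed(mtel);   // the caller passes a 10-byte variable

*inlr = *on;
return;

// Hypothetical procedure: upper-cases exactly as many bytes as the
// caller declared, using the operational descriptor to find that length.
dcl-proc upperFixed;
  dcl-pi *n varchar(65535) opdesc;
    str char(65535) const options(*varsize);
  end-pi;
  dcl-s descType  int(10);
  dcl-s dataType  int(10);
  dcl-s descInfo1 int(10);
  dcl-s descInfo2 int(10);
  dcl-s parmLen   int(10);

  // Parameter 1 is str; parmLen receives the caller's declared length.
  CEEDOD(1: descType: dataType: descInfo1: descInfo2: parmLen: *omit);

  // Never touch str beyond parmLen bytes.
  return %xlate(str_lc_letters_c: str_uc_letters_c:
                %subst(str: 1: parmLen));
end-proc;
```

The only reference to str sits inside %subst bounded by parmLen, so the procedure never reads past the storage the caller actually allocated.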

On the subject of the way the compiler maps myFixed to myVarying via CONST, understand how that works. The compiler will copy all of the bytes from myFixed into myVarying, every one of them. If myFixed is 10a, myVarying will become 10 bytes long. If myFixed is 50a, myVarying will become 50 bytes long. Trailing blanks are always included because they are a part of every fixed length character variable. Those blanks don't matter to a procedure that ignores blanks, such as a translate procedure, but they might matter to a procedure that centers a string. In that case, you'd need to resort to operational descriptors or trim in the caller, for example upperVary = str_uc(%trimr(myFixed));
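The effect of those trailing blanks can be seen in a small sketch like this one. lengthSeen, fullLen and trimmedLen are hypothetical names introduced only for illustration, and free-form (**free) syntax is assumed.

```rpgle
**free
ctl-opt dftactgrp(*no) actgrp(*new);

dcl-s myFixed    char(10) inz('abc');
dcl-s fullLen    int(10);
dcl-s trimmedLen int(10);

// The fixed-length field is copied, blanks and all, into a varying temporary.
fullLen = lengthSeen(myFixed);              // 10

// Trimming in the caller passes only the significant characters.
trimmedLen = lengthSeen(%trimr(myFixed));   // 3

*inlr = *on;
return;

// Hypothetical helper: reports the current length of the varying parameter.
dcl-proc lengthSeen;
  dcl-pi *n int(10);
    s varchar(65535) const;
  end-pi;
  return %len(s);
end-proc;
```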

Question 2

The most flexible way of passing strings in RPG that I have found is to use 64k variable-length strings and pass them with *varsize. (It's supposed to send only the number of bytes in the string actually passed, so the 64k should not be a problem. I think I found that suggested somewhere by Scott Klement.) Here is how I would write an A-Z only upcase function that way (as it is a most basic example):

 * typedefs:
Dstr_string_t     S          65535A   VARYING TEMPLATE

 * constants:
Dstr_uc_letters_c C                   'ABCDEFGHIJKLMNOPQRSTUVWXYZ'
Dstr_lc_letters_c C                   'abcdefghijklmnopqrstuvwxyz'

 * prototype:
Dstr_uc           PR                  like(str_string_t)       
D s                                   like(str_string_t)       
D                                       options(*varsize) const

 * implementation:
Pstr_uc           B                   export                   
D                 PI                  like(str_string_t)       
D s                                   like(str_string_t)       
D                                       options(*varsize) const
 /free                                                         
  return %xlate(str_lc_letters_c:str_uc_letters_c:s);          
 /end-free                                                     
Pstr_uc           E

Now there are multiple things that concern me here:

1. Could there be some problems with fixed length strings that I pass to this?
2. Does this "only as many bytes as needed are passed" work for the return value as well? I would hate to have thousands of bytes reserved and passed around every time I want to upcase a 3-character string.
3. It's only flexible up to 64k bytes. But I think that's more of a theoretical issue with our programs, at least for now...