4

I want to get the UTF-16 code unit at a given index in ABAP.

The same can be done in JavaScript with charCodeAt().

For example, "d".charCodeAt() returns 100.

Is there a similar functionality in ABAP?

1
  • 2
    When you say the "UTF-16 code unit", do you mean the Unicode code point? For example, "d" is always U+0064 (the official "name" of the Unicode character, 0x0064 being the hex representation of 100). The encoded bytes are a different matter: UTF-16 little endian (SAP code page 4103) and big endian (SAP code page 4102) encode "d" differently, as the 2 bytes 0x6400 and 0x0064 respectively. Commented Mar 23, 2021 at 18:19

2 Answers

5

This can be done with the class CL_ABAP_CONV_OUT_CE:

DATA(lo_converter) = cl_abap_conv_out_ce=>create( encoding = '4103' ). "Little endian

TRY.
    CALL METHOD lo_converter->convert
      EXPORTING
        data   = 'a'
        n      = 1
      IMPORTING
        buffer = DATA(lv_buffer). "lv_buffer will be 6100 (little endian bytes of 'a')
CATCH ...

ENDTRY.

Code page 4102 is for UTF-16 big endian.

It is possible to encode not just a single character, but a string as well:

      EXPORTING
        data   = 'abc'
        n      = 3

"n" is the number of characters of the string that you want to encode. It can be less than the actual length of the string.
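To make the byte order difference between the two code pages visible, here is a minimal sketch that encodes the same character with each of them. It assumes a release that supports chained method calls and the CONV operator (7.40 or later), is meant to be pasted into a report or method, and catches cx_root only to keep the example short. The variable names are illustrative, and "n" is left out so that the length is taken from data.

```abap
" Sketch: encode 'd' (U+0064) in both UTF-16 byte orders.
DATA lv_le TYPE xstring.
DATA lv_be TYPE xstring.

TRY.
    " Code page 4103: UTF-16 little endian
    cl_abap_conv_out_ce=>create( encoding = '4103' )->convert(
      EXPORTING data   = 'd'
      IMPORTING buffer = lv_le ).

    " Code page 4102: UTF-16 big endian
    cl_abap_conv_out_ce=>create( encoding = '4102' )->convert(
      EXPORTING data   = 'd'
      IMPORTING buffer = lv_be ).
  CATCH cx_root INTO DATA(lx_error).
    " Broad catch for brevity only; handle the specific exceptions in real code.
    DATA(lv_message) = lx_error->get_text( ).
    MESSAGE lv_message TYPE 'E'.
ENDTRY.

" Low-order byte first for little endian, high-order byte first for big endian.
ASSERT lv_le = CONV xstring( '6400' ).
ASSERT lv_be = CONV xstring( '0064' ).
```

The same two bytes come out in opposite order, which is why the result of convert depends on the code page you pass to create.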

3
  • 3
    I think 4103 is little endian and 4102 is big endian. Commented Mar 23, 2021 at 18:13
  • 1
    You are right, like always, I edited my answer. Commented Mar 23, 2021 at 19:40
  • Note that n is optional and the method deduces it automatically from the length of data. Commented Mar 26, 2021 at 9:10
3

When you say you "want to get the UTF-16 code unit",

  • either you mean the Unicode code point, e.g. the character d is always U+0064 (official "name" of Unicode character, the two bytes 0x0064 being the hexadecimal representation of decimal 100),
  • or you mean you want to encode d to UTF-16 little endian (SAP code page 4103) or big endian (SAP code page 4102), which gives the 2 bytes 0x6400 or the 2 bytes 0x0064 respectively.

For the second case, see József's answer.

For the first case, you may get it using the method UCCP (UniCode Code Point) or UCCPI (UniCode Code Point Integer) of class CL_ABAP_CONV_OUT_CE:

DATA: l_unicode_point_hex TYPE x LENGTH 2,
      l_unicode_point_int TYPE i.


l_unicode_point_hex = cl_abap_conv_out_ce=>UCCP( 'd' ).

ASSERT l_unicode_point_hex = '0064'.


l_unicode_point_int = cl_abap_conv_out_ce=>UCCPI( 'd' ).

ASSERT l_unicode_point_int = 100.

EDIT: Note that the two methods always return the same values, whatever the SAP system code page is (4102, 4103 or any other).
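To get closer to what charCodeAt( index ) does in JavaScript, the character at a given offset can be extracted first and then passed to UCCPI. The following is only a sketch: the variable names are illustrative, the offset is zero-based, and an offset outside the string is not handled here (in ABAP it raises an exception instead of returning NaN as JavaScript does).

```abap
" Sketch: code point of the character at a zero-based offset.
DATA lv_text  TYPE string VALUE `abcd`.
DATA lv_index TYPE i VALUE 3.          " same position as "abcd".charCodeAt( 3 )
DATA lv_char  TYPE c LENGTH 1.

" Offset/length access: one character starting at lv_index
lv_char = lv_text+lv_index(1).

DATA(lv_code) = cl_abap_conv_out_ce=>uccpi( lv_char ).

ASSERT lv_code = 100.                  " 'd' is U+0064
```

Copying the character into a field of type c with length 1 keeps the call in the same form as the examples above, which pass a single-character literal.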
