How to Change the Character Encoding for SMPP

Comments

4 comments

  • Toby Phipps

    This needs a little more in the way of clarification. After a little experimentation, it appears that:

    1. By "gsm8", Nexmo really means "GSM 03.38" which is a 7-bit encoding. There's no such character set as "gsm8", at least not from a standards point of view.

    2. Changes to this setting only take effect after an SMPP rebind. Until then, the value that was set when the SMPP bind was established still takes effect.

    3. This setting only takes effect when the PDU value "data_coding" is set to 0 (default). You can still override it on a per-message basis by changing the value of "data_coding". This is useful if you want to use the most efficient encoding based on the message content - i.e. using GSM 03.38 when all characters fall inside its repertoire and Unicode when not.
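The per-message choice described in point 3 could be sketched roughly as follows. This is a minimal illustration, not Nexmo's implementation: the function name `choose_data_coding` is hypothetical, and it assumes the account default (what `data_coding` 0 resolves to) is set to the GSM 03.38 alphabet.

```python
# GSM 03.38 default alphabet (the ESC code point is omitted; it only introduces the extension table).
GSM_BASIC = (
    "@£$¥èéùìòÇ\nØø\rÅåΔ_ΦΓΛΩΠΨΣΘΞÆæßÉ"
    " !\"#¤%&'()*+,-./0123456789:;<=>?"
    "¡ABCDEFGHIJKLMNOPQRSTUVWXYZÄÖÑÜ§"
    "¿abcdefghijklmnopqrstuvwxyzäöñüà"
)
# Characters reachable through the GSM 03.38 extension table.
GSM_EXTENSION = "^{}\\[~]|€\f"
GSM_REPERTOIRE = frozenset(GSM_BASIC + GSM_EXTENSION)


def choose_data_coding(text: str) -> int:
    """Return 0 (account default, assumed here to be GSM 03.38) when every
    character is in the GSM repertoire, otherwise 8 (UCS-2)."""
    if all(ch in GSM_REPERTOIRE for ch in text):
        return 0
    return 8
```

A client would call this once per outgoing message and put the result in the `data_coding` field of the submit PDU, encoding the text to match.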

  • Nexmo Support

    Further clarification on character encoding in the Nexmo SMPP API:

    A single encoding is applied to all messages on an account, and the client cannot override that encoding, so that's a slight break with 'pure' SMPP according to the specification.

    However, the bits in data_coding that specify binary or Unicode are taken into account. We will therefore respect (data_coding & 0x08) != 0 for Unicode messages, even when the default charset in the dashboard is Latin9 or GSM8.

    On the other hand, if the charset is set to Latin9 on the Dashboard and the user sets data_coding=1, we will still expect Latin9 instead of GSM.
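The rule in this comment can be modelled in a few lines. This is only a sketch of the behaviour as described here, not the server's actual code: `effective_charset` is a hypothetical name, big-endian UTF-16 is an assumption, and the binary case mentioned above is not modelled because its exact bit test is not given.

```python
def effective_charset(data_coding: int, dashboard_charset: str) -> str:
    """Return the charset the message body is expected to use.

    dashboard_charset is the account-wide default chosen in the dashboard,
    given here as a Python codec name such as "iso-8859-15" for Latin9.
    """
    # The Unicode bit wins regardless of the dashboard setting.
    if (data_coding & 0x08) != 0:
        return "utf-16-be"  # assumption: big-endian UTF-16/UCS-2
    # Any other value, including data_coding=1, falls back to the account default.
    return dashboard_charset
```

Note the parentheses around the mask: in Python, as in C, `data_coding & 0x08 != 0` would be parsed as `data_coding & (0x08 != 0)`.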

     

  • RichardKesiar

    In the API settings form, utf8 is listed as an encoding option. I assume that this is actually UTF-16/UCS-2, which seems to be the only type of Unicode supported in SMS messages and by the SMPP specification (http://docs.nimta.com/SMPP_v3_4_Issue1_2.pdf page 126 data_coding).

     

  • Nexmo Support

    The encoding options that we list under Encoding declare the possible 'SMSC default' encodings that we can support, i.e. the encoding that is used when a message submit request specifies a data_coding (dcs) of '0'. In this case we are still dealing with 8-bit messaging, but we also have a limited number of encoding/charset options available to accommodate a range of client platforms.

    Unicode messaging, on the other hand, where the dcs is 0x08, uses a 16-bit encoding, and for this we expect the message to be encoded in UTF-16/UCS-2. In the case of Unicode messaging, the default encoding that you declare in the dashboard does not apply.

    A data_coding value of 0 is the one case where the SMPP specification leaves a certain amount of room for interpretation: it is considered 'vendor specific' and left to the SMSC vendor to specify. Whilst this allows a certain amount of flexibility in a number of international scenarios, it has led to there being no clearly defined 'standard' encoding, hence the need for a selectable list of encodings to match the vast range of SMPP client implementations that we encounter.
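On the client side, the two cases above might translate into something like the following. It is a rough sketch under stated assumptions: `encode_short_message` is a hypothetical helper, the account default is passed as a Python codec name, UTF-16 is assumed to be big-endian, and a GSM 03.38 default is not covered because the Python standard library has no codec for it.

```python
def encode_short_message(text: str, data_coding: int,
                         account_charset: str = "iso-8859-15") -> bytes:
    """Encode message text to the bytes sent in the short_message field."""
    if data_coding == 0x08:
        # Unicode messaging: the dashboard default does not apply.
        return text.encode("utf-16-be")
    if data_coding == 0:
        # 'SMSC default': use whatever charset the account declares.
        return text.encode(account_charset)
    raise ValueError(f"data_coding {data_coding:#04x} is not handled in this sketch")
```

For example, the euro sign becomes the single byte 0xA4 when the account default is Latin9, but three bytes when it is UTF-8, which is why client and dashboard settings have to agree for dcs 0.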

    I hope that helps.


Article is closed for comments.