Document ID:        15095.1
Subject:            EXPORT/IMPORT AND NLS CONSIDERATIONS
Last Revision Date: 15 August 1995
Author:             ETAFEEN


If a user has exported/imported a database or table(s) and is now encountering
character set conversion problems, use the following information to
confirm whether the export/import procedure was performed correctly:
    --Export the database with NLS_LANG set to identify the source
      database's character set. Export stores the character set ID (not
      the text string) in the dump file.
    --Set the NLS_LANG environment variable for the Import session.
    --Import reads the Export character set ID from the dump file and
      compares it with the session's character set as defined in NLS_LANG.
    --No conversion occurs if the Export's character set and the Import's
      session character set are the same. If they are not the same,
      conversion is performed from the Export character set to the Import's
      session character set prior to the data being inserted into the
      database.
    --The Import's session character set should be a superset of the Export's
      character set; otherwise, special characters will not be correctly
      converted.
    --Include the CHARSET parameter when defining the Import parameter set.
      CHARSET identifies the character set of the Export file. Currently
      in V7, the code expects the value in CHARSET to match the Export
      file's character set. If they do not match, IMP-42 will result.
      The CHARSET option was developed to import older export files
      that did not have stored character set ID information.
    --After the data has been converted to the Import's session character
      set, it is then converted to the database character set if the two
      differ. The database character set should be a superset of the
      Import's session character set; otherwise, special characters will
      not be correctly converted.
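
The Import decision logic described above can be sketched in a few lines
of Python. This is only an illustration, not Oracle code: the function
name plan_import_conversions is hypothetical, and character set names
are compared as plain strings, whereas Import compares the character set
ID stored in the dump file.

```python
def plan_import_conversions(dump_cs, session_cs, db_cs):
    """Return the (from, to) conversions Import would perform, in order.

    dump_cs    -- character set recorded in the export dump file
    session_cs -- Import session character set (from NLS_LANG)
    db_cs      -- character set of the destination database
    """
    steps = []
    # First conversion: dump file character set -> Import session character set.
    if dump_cs != session_cs:
        steps.append((dump_cs, session_cs))
    # Second conversion: Import session character set -> database character set.
    if session_cs != db_cs:
        steps.append((session_cs, db_cs))
    return steps


# All three match: no conversion at all.
print(plan_import_conversions("WE8ISO8859P1", "WE8ISO8859P1", "WE8ISO8859P1"))
# Dump file differs from the session: one conversion.
print(plan_import_conversions("US7ASCII", "WE8ISO8859P1", "WE8ISO8859P1"))
# All three differ: two conversions.
print(plan_import_conversions("WE8DEC", "US7ASCII", "WE8ISO8859P1"))
```

An empty result means the data is inserted without conversion; two
entries correspond to the worst case, where every hop is a chance to
lose characters if the target is not a superset of the source.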

    -------------------
    | db in character |     export            exp session is in character
    | set A           |---------------------> set B as defined by NLS_LANG.
    -------------------                       Therefore the dump file is in
         source                               character set B. Character set
                                              conversion may occur.

                                                        |
                                                        |
                                                        | move file over to
                                                        | another machine
                                                        |
                                                        V
         destination
    -------------------
    | db in character |     import            imp session is in character
    | set D           | <-------------------  set C as defined by NLS_LANG.
    -------------------                       The dump file is still in
                                              character set B. Character set
                                              conversion may occur.
    During the import process
    character set conversion
    may occur between character
    set C and the db's character
    set D if they differ.
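
The effect of the A -> B -> C -> D path in the diagram can be simulated
with Python's built-in codecs. This is a rough sketch under stated
assumptions: the Python codecs "latin-1", "ascii" and "utf-8" merely
stand in for Oracle character sets, the helper names convert and
transfer are hypothetical, and Python substitutes "?" for unconvertible
characters, which may not be the replacement character Oracle uses.

```python
def convert(data: bytes, src: str, dst: str) -> bytes:
    """Re-encode bytes from codec src to codec dst.

    Characters that dst cannot represent are replaced with '?'.
    """
    return data.decode(src).encode(dst, errors="replace")


def transfer(text: str, cs_a: str, cs_b: str, cs_c: str, cs_d: str) -> str:
    """Follow text from source database (A) to destination database (D)."""
    stored = text.encode(cs_a)              # data as held in the source db
    dump = convert(stored, cs_a, cs_b)      # export: db A -> session B
    session = convert(dump, cs_b, cs_c)     # import: dump file B -> session C
    final = convert(session, cs_c, cs_d)    # import: session C -> db D
    return final.decode(cs_d)


# Every hop is the same character set or a superset: data survives.
print(transfer("café", "latin-1", "latin-1", "latin-1", "utf-8"))
# Export session is 7-bit ASCII, not a superset of A: the accent is lost
# in the dump file and cannot be recovered by later conversions.
print(transfer("café", "latin-1", "ascii", "latin-1", "latin-1"))
```

The second call shows why each character set along the path should be
a superset of the one before it: once a character has been replaced in
the dump file, no later conversion can restore it.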

The user needs to identify the following:
--what is the database character set specified when issuing the
  CREATE DATABASE for the source database (character set A in the above).
--what is the client character set specified with NLS_LANG when the
  data was inserted.
--what is the client character set when the data was exported (character
  set B in the above).
--what is the database character set specified when issuing the CREATE
  DATABASE for the destination database (character set D in the above).
--what is the client character set when the data was imported (character
  set C in the above).
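
Once these five values have been collected, they can be recorded and
checked together. The following is a minimal sketch, assuming a
hypothetical TransferCharsets record and plain string comparison of
character set names; it only reports where conversion can occur and
does not decide whether a given conversion is lossy.

```python
from dataclasses import dataclass


@dataclass
class TransferCharsets:
    insert_client: str  # client character set when the data was inserted
    source_db: str      # character set A
    export_client: str  # character set B
    import_client: str  # character set C
    dest_db: str        # character set D

    def conversion_points(self):
        """Return (stage, from, to) for each hop whose character sets differ."""
        hops = [
            ("insert", self.insert_client, self.source_db),
            ("export", self.source_db, self.export_client),
            ("import: dump file to session", self.export_client, self.import_client),
            ("import: session to database", self.import_client, self.dest_db),
        ]
        return [hop for hop in hops if hop[1] != hop[2]]


# Example values only; substitute the character sets found at the site.
site = TransferCharsets(
    insert_client="WE8ISO8859P1",
    source_db="WE8ISO8859P1",
    export_client="US7ASCII",
    import_client="WE8ISO8859P1",
    dest_db="WE8ISO8859P1",
)
for stage, src, dst in site.conversion_points():
    print(f"{stage}: {src} -> {dst}")
```

Each reported hop is a place to check whether the target character set
is a superset of the source character set.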

Note that Import will perform up to two character set conversions,
depending on:
(a) the character set of the export file, (b) the NLS_LANG of the import
session, and (c) the character set of the database.
Refer to bugs 220349 and 224161 and enhancement requests 181388 and 181389.