Using trim in ctl file




















The following sections explain the possible scenarios. In a conventional path load, data is committed after all data in the bind array is loaded into all tables. If the load is discontinued, only the rows that were processed up to the time of the last commit operation are loaded. There is no partial commit of data. In a direct path load, the behavior of a discontinued load varies depending on the reason the load was discontinued. This means that when you continue the load, the value you specify for the SKIP parameter may be different for different tables.

If a fatal error is encountered, the load is stopped and no data is saved unless ROWS was specified at the beginning of the load. In that case, all data that was previously committed is saved.

This means that the value of the SKIP parameter will be the same for all tables. When a load is discontinued, any data already loaded remains in the tables, and the tables are left in a valid state.

If the conventional path is used, all indexes are left in a valid state. If the direct path load method is used, any indexes that run out of space are left in an unusable state. You must drop these indexes before the load can continue. You can re-create the indexes either before continuing or after the load completes. Other indexes are valid if no other errors occurred. See Indexes Left in an Unusable State for other reasons why an index might be left in an unusable state.

Use this information to resume the load where it left off. To continue the discontinued load, use the SKIP parameter to specify the number of logical records that have already been processed by the previous load. At the time the load is discontinued, the value for SKIP is written to the log file in a message similar to the following:
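The log message itself is not reproduced here, but as a sketch (file, table, and field names are illustrative, and the skip count is hypothetical), the recorded SKIP value can be supplied in the next run's OPTIONS clause:

```sql
-- Hypothetical example: the previous run processed 1000 logical
-- records before being discontinued, so skip them on the next run.
OPTIONS (SKIP=1000)
LOAD DATA
INFILE 'emp.dat'
APPEND
INTO TABLE emp
(empno POSITION(1:4) INTEGER EXTERNAL,
 ename POSITION(6:15) CHAR)
```

SKIP can also be given on the sqlldr command line; the control-file OPTIONS clause simply records it with the rest of the load specification.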

This message specifying the value of the SKIP parameter is preceded by a message indicating why the load was discontinued. Note that for multiple-table loads, the value of the SKIP parameter is displayed only if it is the same for all tables. Sometimes, multiple physical records must be combined into one logical record. To combine them, you can use one of the following clauses, depending on your data:

In the CONCATENATE clause, integer specifies the number of physical records to combine. With CONTINUEIF THIS, a continuation field in the current record is tested. For example, two records might be combined if a pound sign were in byte position 80 of the first record; if any other character were there, the second record would not be added to the first. If the condition is true in the current record, then the next physical record is read and concatenated to the current physical record, continuing until the condition is false.

If the condition is false, then the current physical record becomes the last physical record of the current logical record. THIS is the default. With CONTINUEIF NEXT, the continuation field is tested in the next record: if the condition is true in the next record, then the current physical record is concatenated to the current logical record, continuing until the condition is false.
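The pound-sign scenario described above could be written as follows; the table and field definitions are illustrative, not from the original example:

```sql
-- Sketch: while byte 80 of the current physical record is '#',
-- read and concatenate the next physical record (CONTINUEIF THIS).
LOAD DATA
INFILE 'data.dat'
CONTINUEIF THIS (80:80) = '#'
INTO TABLE dept
(deptno POSITION(1:2) CHAR,
 dname  POSITION(4:17) CHAR)
```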

For the equal operator, the field and comparison string must match exactly for the condition to be true. For the not equal operator, they may differ in any character. CONTINUEIF LAST is similar to THIS, but the test is always against the last nonblank character.

If the last nonblank character in the current physical record meets the test, then the next physical record is read and concatenated to the current physical record, continuing until the condition is false.

If the condition is false in the current record, then the current physical record is the last physical record of the current logical record. The position specification gives the starting and ending column numbers in the physical record. Column numbers start with 1. Either a hyphen or a colon is acceptable (start-end or start:end). If you omit end, the length of the continuation field is the length of the byte string or character string.

If you use end, and the length of the resulting continuation field is not the same as that of the byte string or the character string, the shorter one is padded. Character strings are padded with blanks, hexadecimal strings with zeros. The comparison string (str) is a string of characters to be compared to the continuation field defined by start and end, according to the operator.

The string must be enclosed in double or single quotation marks. The comparison is made character by character, blank-padding on the right if necessary. A hexadecimal string (X'hex-str') is a string of bytes in hexadecimal format, used in the same way as str.

X'1FB033' would represent the three bytes with values 1F, B0, and 33 hexadecimal. By default, continuation fields are excluded from the logical record. This is the only time you refer to positions in physical records; all other references are to logical records. That is, data values are allowed to span the records with no extra characters (continuation characters) in the middle. Assume that you have physical records 14 bytes long and that a period represents a space. Now assume that you have the same physical records as in the previous example, but note that columns 1 and 2 are not removed from the physical records when the logical records are assembled.

Therefore, the logical records are assembled as follows (the same results as for the previous example). The INTO TABLE clause defines the relationship between records in the datafile and tables in the database. The specification of fields and datatypes is described in later sections. The table must already exist. If the table is not in the user's schema, then the user must either use a synonym to reference the table or include the schema name as part of the table name (for example, scott.emp).
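A minimal sketch of an INTO TABLE clause that qualifies the table with its schema (the scott schema and the field layout are illustrative assumptions):

```sql
LOAD DATA
INFILE 'emp.dat'
INSERT
INTO TABLE scott.emp
(empno POSITION(1:4) INTEGER EXTERNAL,
 ename POSITION(6:15) CHAR)
```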

A loading method specified for an individual table overrides the global table-loading method. The following sections discuss using these options to load data into empty and nonempty tables.

The INSERT option requires the table to be empty before loading. With the REPLACE option, the existing rows are deleted first; after the rows are successfully deleted, a commit is issued. You cannot recover the data that was in the table before the load, unless it was saved with Export or a comparable utility. If data does not already exist, the new rows are simply loaded. The row deletes cause any delete triggers defined on the table to fire. For more information on cascaded deletes, see the information about data integrity in Oracle9i Database Concepts.

To update existing rows, use the following procedure. That option is valid only for a parallel load; see Parameters for Parallel Direct Path Loads. You can choose to load or discard a logical record by using the WHEN clause to test a condition in the record. The WHEN clause appears after the table name and is followed by one or more field conditions. For example, the following clause indicates that any record with the value "q" in the fifth column position should be loaded:
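The clause described above might look like this in context (file, table, and field names are illustrative):

```sql
-- Load only records whose fifth column position contains 'q'.
LOAD DATA
INFILE 'data.dat'
INTO TABLE mytable
WHEN (5) = 'q'
(col1 POSITION(1:4) CHAR,
 col2 POSITION(6:10) CHAR)
```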

Parentheses are optional, but should be used for clarity with multiple comparisons joined by AND. If all data fields are terminated similarly in the datafile, you can use the FIELDS clause to indicate the default delimiters. Terminator strings can contain one or more characters. You can override the delimiter for any given column by specifying it after the column name.
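A sketch of a FIELDS clause with a per-column delimiter override (the names and delimiters here are illustrative assumptions):

```sql
LOAD DATA
INFILE 'dept.dat'
INTO TABLE dept
FIELDS TERMINATED BY ','
(deptno,
 dname CHAR TERMINATED BY ';',  -- overrides the default comma
 loc)
```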

Assume that the preceding data is read with the following control file and that the record ends after dname. In this case, the remaining loc field is set to null. The SINGLEROW option inserts each index entry directly into the index, one record at a time.

By default (without SINGLEROW), index entries are put into a separate, temporary storage area and merged with the original index at the end of the load. This method achieves better performance and produces an optimal index, but it requires extra storage space. During the merge, the original index, the new index, and the space for new entries all simultaneously occupy storage space. With SINGLEROW, the resulting index may not be as optimal as a freshly sorted one, but it takes less space to produce. It also takes more time, because additional UNDO information is generated for each index insert.

This option is suggested for use when either of the following situations exists. The remainder of this section details important ways to make use of that behavior. Some data storage and transfer media have fixed-length physical records. When the data records are short, more than one can be stored in a single physical record to use the storage space efficiently.

For example, assume the data is as follows. The same record could be loaded with a different specification: the following control file uses relative positioning instead of fixed positioning. With relative positioning, field scanning does not start over at column 1 for each field; instead, scanning continues where it left off. A single datafile might contain records in a variety of formats.

Consider the following data, in which emp and dept records are intermixed:. A record ID field distinguishes between the two formats. Department records have a 1 in the first column, while employee records have a 2.

The following control file uses exact positioning to load this data. The records in the previous example could also be loaded as delimited data, using the control file that follows. The POSITION(1) specification causes field scanning to start over at column 1 when checking for data that matches the second format. A single datafile may contain records made up of row objects inherited from the same base row object type.
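The exact-positioning approach for the intermixed emp and dept records above might be sketched as follows; the positions and field names are illustrative, not the original example's:

```sql
LOAD DATA
INFILE 'mix.dat'
INTO TABLE dept
  WHEN (1) = '1'
  (deptno POSITION(3:4)  CHAR,
   dname  POSITION(8:21) CHAR)
INTO TABLE emp
  WHEN (1) = '2'
  (empno  POSITION(3:6)  CHAR,
   ename  POSITION(8:17) CHAR)
```

In a delimited variant, the second format's first field would specify POSITION(1) so that scanning restarts at column 1 for each candidate record.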

For example, consider the following simple object type and object table definitions, in which a nonfinal base object type is defined along with two object subtypes that inherit from the base type:

The following input datafile contains a mixture of these row object subtypes. A type ID field distinguishes between the three subtypes.

See Loading Column Objects for more information on loading object types.

X'hex-string': a string of bytes in hexadecimal format, used in the same way as the character string described above. X'1FB033' would represent the three bytes with values 1F, B0, and 33 hexadecimal.

This is the only time you refer to character positions in physical records. All other references are to logical records. This allows data values to span the records with no extra characters (continuation characters) in the middle.

Trailing blanks in the physical records are part of the logical records. You cannot fragment records in secondary datafiles (SDFs) into multiple physical records. In the first example, you specify that if the current physical record (record1) has an asterisk in column 1, then the next physical record (record2) should be appended to it. If record2 also has an asterisk in column 1, then record3 is appended also. If record2 does not have an asterisk in column 1, then it is still appended to record1, but record3 begins a new logical record.

In the next example, you specify that if the current physical record (record1) has a comma in the last nonblank data column, then the next physical record should be appended to it. If a record does not have a comma in the last column, it is the last physical record of the current logical record. In the last example, you specify that if the next physical record (record2) has a "10" in columns 7 and 8, then it should be appended to the preceding physical record (record1). If a record does not have a "10" in columns 7 and 8, then it begins a new logical record. The INTO TABLE clause defines the relationship between records in the datafile and tables in the database.
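As a sketch, the three continuation tests discussed in this section could be written as follows (a control file would use only one of them at a time):

```sql
CONTINUEIF THIS (1:1) = '*'    -- asterisk in column 1 of the current record
CONTINUEIF LAST = ','          -- comma in the last nonblank data column
CONTINUEIF NEXT (7:8) = '10'   -- "10" in columns 7-8 of the next record
```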

The specification of fields and datatypes is described in later sections. The table must already exist. Otherwise, the table name should be prefixed by the username of the owner. That option is valid only for a parallel load; for more information, see Parallel Data Loading Models. You can choose to load or discard a logical record by using the WHEN clause to test a condition in the record.

The WHEN clause appears after the table name and is followed by one or more field conditions. For example, the following clause indicates that any record with the value "q" in the fifth column position should be loaded:

Parentheses are optional, but should be used for clarity with multiple comparisons joined by AND. The WHEN clause is then evaluated; a row is inserted into the table only if the WHEN clause is true. Field conditions are discussed in detail in Specifying Field Conditions.

If a WHEN directive fails on a record, that record is discarded (skipped). Note also that the skipped record is assumed to be contained completely in the main datafile; therefore, a secondary datafile will not be affected, if one is present. If all data fields are terminated similarly in the datafile, you can use the FIELDS clause to indicate the default delimiters.

The syntax is given in the high-level syntax diagrams. Note that terminator strings are not limited to a single character; likewise, enclosure strings do not have to be a single character. You can override the delimiter for any given column by specifying it after the column name.

See Specifying Delimiters for more information on delimiter specification. The remaining LOC field is set to null. Syntax for this feature is given in High-Level Syntax Diagrams. The SINGLEROW option inserts each index entry directly into the index, one row at a time.

Instead, index entries are put into a separate, temporary storage area and merged with the original index at the end of the load. This method achieves better performance and produces an optimal index, but it requires extra storage space. During the merge, the original index, the new index, and the space for new entries all simultaneously occupy storage space. The resulting index may not be as optimal as a freshly sorted one, but it takes less space to produce.

It also takes more time, since additional UNDO information is generated for each index insert. This option is suggested for use when either of the following situations exists.

Specifying Field Conditions

A field condition is a statement about a field in a logical record that evaluates as true or false.

First, positions in the field condition refer to the logical record, not to the physical record. Second, you may specify either a position in the logical record or the name of a field that is being loaded. Either start-end or start:end is acceptable. If you omit end, the length of the field is determined by the length of the comparison string.

If the lengths are different, the shorter field is padded. If the field col2 is an attribute of a column object col1, then when referring to col2 in one of the directives, you must use the notation col1.col2. If the comparison is true, the current row is inserted into the table. The BLANKS keyword can be used in place of a literal string in any field comparison. The condition is TRUE whenever the column is entirely blank. Using it is the same as specifying an appropriately sized literal string of blanks.

For example, the following specifications are equivalent. Note that there can be more than one "blank" in a multibyte character set, so it is a good idea to use the BLANKS keyword with these character sets instead of specifying a string of blank characters. The character string will match only a specific sequence of blank characters, while the BLANKS keyword will match combinations of different blank characters.
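For instance, assuming a 5-character field starting at column 1, the following two conditions behave the same way in a single-byte character set:

```sql
WHEN (1:5) = BLANKS
WHEN (1:5) = '     '   -- literal string of five blanks
```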

When a data field is compared to a shorter literal string, the string is padded for the comparison; character strings are padded with blanks. For example, if the specified position contains 4 blanks, then the clause evaluates as true. You may load any number of a table's columns. Columns defined in the database, but not specified in the control file, are assigned null values (this is the proper way to insert null values).

A column specification is the name of the column, followed by a specification for the value to be put in that column. The list of columns is enclosed in parentheses and separated with commas. See Generating Data. If the column's value is read from the datafile, the data field that contains the column's value is specified.

In this case, the column specification includes a column name that identifies a column in the database table, and a field specification that describes a field in a data record. The field specification includes position, datatype, null restrictions, and defaults. It is not necessary to specify all attributes when loading column objects.

Any missing attributes will be set to NULL. Filler fields have names, but they are not loaded into the table. Filler fields can occur anywhere in the datafile. A CHAR field, however, can contain any character data. You may specify one datatype for each field; if unspecified, CHAR is assumed.

The position may either be stated explicitly or relative to the preceding field. The first character position in a logical record is 1. Either start-end or start:end is acceptable. If you omit end, the length of the field is derived from the datatype in the datafile. Note that CHAR data specified without start or end is assumed to be length 1. If it is impossible to derive a length from the datatype, an error message is issued. A number of characters as specified by n are skipped before reading the value for the current field.
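A sketch of the positioning forms described above (the field names are illustrative):

```sql
(ename POSITION(1:10)  CHAR,      -- explicit start and end columns
 job   POSITION(*)     CHAR(8),   -- starts where the previous field ended
 sal   POSITION(*+2)   CHAR(6))   -- skips 2 characters before reading
```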

In the example, the field starts in column 29 and continues until a slash is encountered. When you are determining field positions, be alert for TABs in the datafile. If field positions are based on the printed appearance of the data, the load can fail with multiple "invalid number" and "missing field" errors. These kinds of errors occur when the data contains TABs. When printed, each TAB expands to consume several columns on the paper.

In the datafile, however, each TAB is still only one character. The use of delimiters to specify relative positioning of fields is discussed in detail in Specifying Delimiters. For an example, see the second example in Extracting Multiple Logical Records. A logical record may contain data for one of two tables, but not both. The remainder of this section details important ways to make use of that behavior.

Some data storage and transfer media have fixed-length physical records. When the data records are short, more than one can be stored in a single physical record to use the storage space efficiently. For example, suppose the data looks like the following. The same record could be loaded with a different specification: the following control file uses relative positioning instead of fixed positioning.

Instead, scanning continues where it left off. That mechanism is described next. A single datafile might contain records in a variety of formats.

A record ID field distinguishes between the two formats. Department records have a "1" in the first column, while employee records have a "2". The following control file uses exact positioning to load this data:.

Again, the records in the previous example could also be loaded as delimited data; the following control file could be used. The POSITION(1) keyword causes field scanning to start over at column 1 when checking for data that matches the second format. The following functions are described below. The LOAD keyword is required in this situation. The SKIP keyword is not permitted.

In addition, no memory is required for a bind array. A CONSTANT value is the simplest form of generated data. It does not vary during the load, and it does not vary between loads. It is converted, as necessary, to the database column type. You may enclose the value within quotation marks, and must do so if it contains whitespace or reserved words.

Be sure to specify a legal value for the target column. If the value is bad, every row is rejected. To set a column to null, do not specify that column at all. Oracle automatically sets that column to null when loading the row. Use the RECNUM keyword after a column name to set that column to the number of the logical record from which that row was loaded. Records are counted sequentially from the beginning of the first datafile, starting with record 1.

Thus, RECNUM increments for records that are discarded, skipped, rejected, or loaded. If the column is of type CHAR, then the date is loaded in the form 'dd-mon-yy'. If the system date is loaded into a DATE column, then it can be accessed in a variety of forms that include the time and the date. A SEQUENCE, by contrast, does not increment for records that are discarded or skipped. With MAX, the sequence starts with the current maximum value for the column plus the increment. If a row is rejected (that is, it has a format error or causes an Oracle error), the generated sequence numbers are not reshuffled to mask this.

If four rows are assigned sequence numbers 10, 12, 14, and 16 in a particular column, and the row with 12 is rejected, the three rows inserted are numbered 10, 14, and 16, not 10, 12, and 14. This allows the sequence of inserts to be preserved despite data errors.

When you correct the rejected data and reinsert it, you can manually set the columns to agree with the sequence. Because a unique sequence number is generated for each logical input record, rather than for each table insert, the same sequence number can be used when inserting data into multiple tables.

This is frequently useful behavior. For example, your data format might define three logical records in every input record. To generate sequence numbers for these records, you must generate unique numbers for each of the three inserts. There is a simple technique to do so. Use the number of table-inserts per record as the sequence increment and start the sequence numbers for each insert with successive numbers.

Suppose you want to load the following department names into the DEPT table. Each input record contains three department names, and you want to generate the department numbers automatically. You could use the following control file entries to generate unique department numbers. All three entries use 3 as the sequence increment (the number of department names in each record).
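The technique could be sketched like this, with each of the three inserts starting at a successive number and incrementing by 3 (the field positions are illustrative):

```sql
INTO TABLE dept
(deptno SEQUENCE(1,3), dname POSITION(1:14)  CHAR)
INTO TABLE dept
(deptno SEQUENCE(2,3), dname POSITION(16:29) CHAR)
INTO TABLE dept
(deptno SEQUENCE(3,3), dname POSITION(31:44) CHAR)
```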

This control file loads Accounting as department number 1, Personnel as 2, and Manufacturing as 3. The sequence numbers are then incremented for the next record, so Shipping loads as 4, Purchasing as 5, and so on. These datatypes are grouped into portable and non-portable datatypes. Within each of these two groups, the datatypes are subgrouped into length-value datatypes and value datatypes.

The main grouping, portable vs. non-portable, reflects whether the data can be moved between platforms. This issue arises due to a number of platform-specific characteristics, such as differences in the byte-ordering schemes of different platforms (big-endian vs. little-endian).

Note that not all of these problems apply to all of the non-portable datatypes. The subgrouping, value vs. length-value, reflects how a field's length is determined. While value datatypes assume a single part to a data field, length-value datatypes require that the data field consist of two subfields: a length subfield, which specifies how long the value subfield is, followed by the value subfield itself.

The length of the field is the length of a full-word integer on your system. This length cannot be overridden in the control file. The data is a half-word binary integer (unsigned). The length of the field is the length of a half-word integer on your system.

One way to determine its length is to make a small control file with no data and look at the resulting log file. See your Oracle operating system-specific documentation for details.

The data is a single-precision, floating-point, binary number. The length of the field is the length of a single-precision, floating-point binary number on your system. The data is a double-precision, floating-point binary number. The length of the field is the length of a double-precision, floating-point binary number on your system. The decimal value of the binary representation of the byte is loaded.

For example, the input character x"1C" is loaded as 28. ZONED data is in zoned decimal format: a string of decimal digits, one per byte, with the sign included in the last byte. The length of this field is equal to the precision (number of digits) that you specify. DECIMAL data is in packed decimal format: two digits per byte, except for the last byte, which contains a digit and the sign.

The default scale is zero, indicating an integer. In the example, this field would take up 4 bytes in the data record. The data is a varying-length, double-byte character string.

It consists of a length subfield followed by a string of double-byte characters (DBCS). The length of the current field is given in the first two bytes. This length is a count of graphic (double-byte) characters, so it is multiplied by two to determine the number of bytes to read. The maximum length specifies the number of graphic double-byte characters.

So it is also multiplied by two to determine the maximum length of the field in bytes. It is a good idea to specify a maximum length for such fields whenever possible, to minimize memory requirements. See Determining the Size of the Bind Array for more details. Both start and end identify single-character byte positions in the file.

It consists of a binary length subfield followed by a character string of the specified length. A maximum length specified in the control file does not include the size of the length subfield. The default buffer size is 4 KB. These fields can be delimited and can have lengths or maximum lengths specified in the control file. The data field contains character data.

If no length is given, CHAR data is assumed to have a length of 1. A field of datatype CHAR may also be variable-length delimited or enclosed. See Specifying Delimiters. This guarantees that a large enough buffer is allocated for the value and is necessary even if the data is delimited or enclosed. The data field contains character data that should be converted to an Oracle date using the specified date mask. Attention: Whitespace is ignored and dates are parsed from left to right unless delimiters are present.

The length specification is optional, unless a varying-length date mask is specified. In the example above, the date mask specifies a fixed-length date format of 11 characters.

But with a varying-length specification, a length must be given explicitly. Similarly, a length is required for any Julian dates (date mask "J"): a field length is required any time the length of the date string could exceed the length of the mask (that is, the count of characters in the mask). It is a good idea to specify the length whenever you use a mask, unless you are absolutely sure that the length of the data is less than, or equal to, the length of the mask.

Either of these overrides the length derived from the mask. The mask may be any valid Oracle date mask. If you omit the mask, the default Oracle date mask of "dd-mon-yy" is used. The length must be enclosed in parentheses and the mask in quotation marks. A field of datatype DATE may also be specified with delimiters.
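A sketch of DATE fields with masks; the second field gives an explicit length because its mask is varying-length (the field names, positions, and masks are illustrative):

```sql
(hiredate  POSITION(1:11)  DATE "DD-Mon-YYYY",
 startdate POSITION(13)    DATE(20) "Month dd, YYYY")
```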

For more information, see Loading All-Blank Fields. The data is a string of double-byte characters (DBCS). That value is multiplied by 2 to find the length of the field in bytes. The syntax for this datatype is given in the syntax diagrams. For example, let [ ] represent shift-in and shift-out characters enclosing a string of double-byte characters. These datatypes are the human-readable, character form of numeric data. Note that the data is a number in character form, not a binary representation.

Both "5. The data is raw, binary data loaded "as is". It does not undergo character set conversion. If loaded into a RAW database column, it is not converted by Oracle. It cannot be loaded into a DATE or number column. The length of this field is the number of bytes specified in the control file. This length is limited only by the length of the target column in the database and by memory resources.

RAW data fields cannot be delimited. If multiple lengths are specified and they conflict, then one of the lengths takes precedence. A warning is issued when a conflict exists. The following rules determine which field length is used:

In this case, the log file shows the actual length used under the heading "Len" in the column table:. The server defines the datatypes for the columns in the database. The link between these two is the column name specified in the control file. The server does any necessary data conversion to store the data in the proper internal format. It does not do datatype conversion when loading nested tables as a separate table from the parent.

The datatype of the data in the file does not necessarily need to be the same as the datatype of the column in the Oracle table. Oracle automatically performs conversions, but you need to ensure that the conversion makes sense and does not generate errors. For instance, when a datafile field with datatype CHAR is loaded into a database column with datatype NUMBER, you must make sure that the contents of the character field represent a valid number.

You indicate how the field is delimited by using a delimiter specification after specifying the datatype. If the terminator delimiter is found in the first column position, the field is null.

Then the current position is advanced until no more adjacent whitespace characters are found. This allows field values to be delimited by varying amounts of whitespace.

Enclosed fields are read by skipping whitespace until a non-whitespace character is encountered. If that character is the delimiter, then data is read up to the second delimiter. Any other character causes an error. If two delimiter characters are encountered next to each other, a single occurrence of the delimiter character is used in the data value.

However, if the field consists of just two delimiter characters, its value is null. The syntax for delimiter specifications is given in the syntax diagrams. BY is an optional keyword for readability.

If the data is not enclosed, the data is read as a terminated field. AND specifies a trailing enclosure delimiter, which may be different from the initial enclosure delimiter. If the AND clause is not present, then the initial and trailing delimiters are assumed to be the same.

Sometimes the same punctuation mark that is a delimiter also needs to be included in the data. To make that possible, two adjacent delimiter characters are interpreted as a single occurrence of the character, and this character is included in the data.

For example, consider the following data. For this reason, problems can arise when adjacent fields use the same delimiters, as in the following specification. But if field1 and field2 were adjacent, then the results would be incorrect. The default maximum length of delimited data is 255 bytes, so delimited fields can require significant amounts of storage for the bind array. A good policy is to specify the smallest possible maximum value; see Determining the Size of the Bind Array. Trailing blanks can only be loaded with delimited datatypes.
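For example, with comma-terminated, quote-enclosed fields, a doubled enclosure character inside a field is read as a single occurrence of that character (the input value here is illustrative):

```sql
FIELDS TERMINATED BY ',' ENCLOSED BY '"'
-- Input field:   "BRANCH ""B"" OFFICE"
-- Loaded value:  BRANCH "B" OFFICE
```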

For more discussion on whitespace in fields, see Trimming Blanks and Tabs. If conflicting lengths are specified, one of the lengths takes precedence. A warning is also issued when a conflict exists.


