The discard file is created in the same record and file format as the data file. For data files in stream record format, the same record terminator that is found in the data file is also used in the discard file.
If no discard clauses are included in the control file or on the command line, then a discard file is not created even if there are discarded records (that is, records that fail to satisfy all of the WHEN clauses specified in the control file). To create a discard file, you must specify at least one discard clause.
The directory parameter specifies a directory to which the discard file will be written. The default file name is the name of the data file, and the default file extension or file type is .dsc. A discard file name specified on the command line overrides one specified in the control file. If a discard file with that name already exists, then it is either overwritten or a new version is created, depending on your operating system.
You can specify a different number of discards for each data file. Or, if you specify the number of discards only once, then the maximum number of discards specified applies to all files. When the discard limit is reached, processing of the data file terminates and continues with the next data file, if one exists. The following list shows different ways that you can specify a name for the discard file from within the control file.
To specify a discard file with file name circular and the default file extension or file type of .dsc. To specify a discard file named notappl with the file extension or file type of .may. To specify a full path to the discard file forget.me. These three forms are sketched below. If a table is loaded without a WHEN clause, then an attempt is made to insert every record into that table; therefore, records may be rejected, but none are discarded. Case study 7, Extracting Data from a Formatted Report, provides an example of using a discard file. A file name specified on the command line overrides any discard file that you may have specified in the control file.
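Sketches of the three forms above in control-file syntax (the /discard_dir path is hypothetical; treat these as illustrative rather than verbatim from this guide):

DISCARDFILE circular
DISCARDFILE notappl.may
DISCARDFILE '/discard_dir/forget.me'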
If there is a match using the equal or not equal specification, then the field is set to NULL for that row. Any field that has a length of 0 after blank trimming is also set to NULL. This specification is used for every date or timestamp field unless a different mask is specified at the field level. A mask specified at the field level overrides a mask specified at the table level. See Datetime and Interval Data Types for information about specifying datetime data types at the field level.
See Oracle Database Globalization Support Guide. The following sections provide a brief introduction to some of the supported character encoding schemes. Data can be loaded in multibyte format, and database object names (fields, tables, and so on) can be specified with multibyte characters. In the control file, comments and object names can also use multibyte characters. Unicode is a universal encoded character set that supports storage of information from most languages in a single character set.
Unicode provides a unique code value for every character, regardless of the platform, program, or language. A character in UTF-8 can be 1 byte, 2 bytes, or 3 bytes long. The AL32UTF8 and UTF8 character sets are not compatible with each other because they have different maximum character widths (four versus three bytes per character).
Multibyte fixed-width character sets (for example, AL16UTF16) are not supported as the database character set. This alternative character set is called the database national character set. Only Unicode character sets are supported as the database national character set.
However, the Oracle database supports only UTF-16 encoding with big-endian byte ordering (AL16UTF16), and only as a database national character set, not as a database character set. When data character set conversion is required, the target character set should be a superset of the source data file character set. Otherwise, characters that have no equivalent in the target character set are converted to replacement characters, often a default character such as a question mark (?).
This causes loss of data. If field lengths are specified in bytes and data character set conversion is required, then the converted values may take more bytes than the source values if the target character set uses more bytes than the source character set for any character that is converted. This results in an error message being reported if the larger target value exceeds the size of the database column.
You can avoid this problem by specifying the database column size in characters and also by using character sizes in the control file to describe the data. Another way to avoid this problem is to ensure that the maximum column size is large enough, in bytes, to hold the converted value. See Character-Length Semantics. Rows might be rejected because a field is too large for the database column, but in reality the field is not too large.
A load might be abnormally terminated without any rows being loaded, when only the field that really was too large should have been rejected. Normally, the specified name must be the name of an Oracle-supported character set. However, because you are allowed to set up data using the byte order of the system where you create the data file, the data in the data file can be either big-endian or little-endian.
Therefore, a different character set name (UTF16) is used. All primary data files are assumed to be in the same character set. Byte Ordering. See Oracle Database Globalization Support Guide for more information about the names of the supported character sets. Control File Character Set. If the control file character set is different from the data file character set, then keep the following issue in mind.
To ensure that the specifications are correct, you may prefer to specify hexadecimal strings rather than character string values. If hexadecimal strings are used with a data file in the UTF-16 Unicode encoding, then the byte order is different on a big-endian versus a little-endian system.
For example, a comma (",") in UTF-16 on a big-endian system is X'002c'; on a little-endian system it is X'2c00'. SQL*Loader requires that hexadecimal strings be specified in big-endian format and, if necessary, swaps the bytes before making comparisons. This allows the same syntax to be used in the control file on both a big-endian and a little-endian system. For example, the specification CHAR(10) in the control file can mean 10 bytes or 10 characters. These are equivalent if the data file uses a single-byte character set.
However, they are often different if the data file uses a multibyte character set. To avoid insertion errors caused by expansion of character strings during character set conversion, use character-length semantics in both the data file and the target database columns. Byte-length semantics are the default for all data files except those that use the UTF16 character set (which uses character-length semantics by default).
The following data types use byte-length semantics even if character-length semantics are being used for the data file, because the data is binary, or is in a special binary-encoded form (in the case of ZONED and DECIMAL). This is necessary to handle data files that have a mix of data of different data types, some of which use character-length semantics, and some of which use byte-length semantics. The SMALLINT length field takes up a certain number of bytes depending on the system (usually 2 bytes), but its value indicates the length of the character string in characters.
Character-length semantics in the data file can be used independent of whether character-length semantics are used for the database columns. Therefore, the data file and the database columns can use either the same or different length semantics. The fastest way to load shift-sensitive character data is to use fixed-position fields without delimiters.
To improve performance, remember the following points. If blanks are not preserved and multibyte-blank-checking is required, then a slower path is used.
This can happen when the shift-in byte is the last byte of a field after single-byte blank stripping is performed. Additionally, when an interrupted load is continued, the use and value of the SKIP parameter can vary depending on the particular case. The following sections explain the possible scenarios.
In a conventional path load, data is committed after all data in the bind array is loaded into all tables. If the load is discontinued, then only the rows that were processed up to the time of the last commit operation are loaded.
There is no partial commit of data. In a direct path load, the behavior of a discontinued load varies depending on the reason the load was discontinued. Space errors when loading data into multiple subpartitions (that is, loading into a partitioned table, a composite partitioned table, or one partition of a composite partitioned table):
If space errors occur when loading into multiple subpartitions, then the load is discontinued and no data is saved unless ROWS has been specified (in which case, all data that was previously committed will be saved).
The reason for this behavior is that rows might be loaded out of order. This is because each row is assigned (not necessarily in order) to a partition, and each partition is loaded separately. If the load discontinues before all rows assigned to partitions are loaded, then the row for record "n" may have been loaded, but not the row for record "n-1".
Space errors when loading data into an unpartitioned table, one partition of a partitioned table, or one subpartition of a composite partitioned table: In either case, this behavior is independent of whether the ROWS parameter was specified. When you continue the load, you can use the SKIP parameter to skip rows that have already been loaded. This means that when you continue the load, the value you specify for the SKIP parameter may be different for different tables.
If a fatal error is encountered, then the load is stopped and no data is saved unless ROWS was specified at the beginning of the load. In that case, all data that was previously committed is saved.
This means that the value of the SKIP parameter will be the same for all tables. When a load is discontinued, any data already loaded remains in the tables, and the tables are left in a valid state. If the direct path load method is used, then any indexes on the table are left in an unusable state.
You can either rebuild or re-create the indexes before continuing, or after the load is restarted and completes. Other indexes are valid if no other errors occurred. See Indexes Left in an Unusable State for other reasons why an index might be left in an unusable state. To continue the discontinued load, use the SKIP parameter to specify the number of logical records that have already been processed by the previous load.
At the time the load is discontinued, the value for SKIP is written to the log file in a message similar to the one sketched below. This message specifying the value of the SKIP parameter is preceded by a message indicating why the load was discontinued. Note that for multiple-table loads, the value of the SKIP parameter is displayed only if it is the same for all tables.
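A sketch of the log message (the record count shown is hypothetical):

SPECIFY SKIP=1001 WHEN CONTINUING THE LOAD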
To combine multiple physical records into one logical record, you can use one of the following clauses, depending on your data. In the following example, integer specifies the number of physical records to combine. For example, two records might be combined if a pound sign (#) were in byte position 80 of the first record.
If any other character were there, then the second record would not be added to the first; a plausible clause for that layout is sketched below. When you convert to a different operating system, you will probably need to modify these strings.
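A minimal sketch of the pound-sign test described above, assuming a CONTINUEIF THIS clause (the position and character are from the example; the exact clause form is an assumption):

CONTINUEIF THIS (80:80) = '#'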
If your operating system uses the backslash character to separate directories in a path name, and if the version of the Oracle database server running on your operating system implements the backslash escape character for filenames and other nonportable strings, then you must specify double backslashes in your path names and use single quotation marks.
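For example, on such a system a hypothetical path would be written with doubled backslashes inside single quotation marks:

INFILE 'c:\\topdir\\mydata.dat'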
See your Oracle operating system-specific documentation for information about which escape characters are required or allowed. The version of the Oracle database server running on your operating system may not implement the escape character for nonportable strings.
When the escape character is disallowed, a backslash is treated as a normal character, rather than as an escape character (although it is still usable in all other strings). Path names containing backslashes can then be specified normally. Because the backslash is not recognized as an escape character, strings within single quotation marks cannot be embedded inside another string delimited by single quotation marks.
This rule also holds for double quotation marks. A string within double quotation marks cannot be embedded inside another string delimited by double quotation marks. To specify a datafile that contains the data to be loaded, use the INFILE clause, followed by the filename and an optional file processing options string. If no filename is specified, the filename defaults to the control filename with an extension or file type of .dat.
The information in this section applies only to primary datafiles. Any spaces or punctuation marks in the filename must be enclosed in single quotation marks.
See Specifying Filenames and Object Names. If your data is in the control file itself, use an asterisk instead of the filename. If you have data in the control file as well as datafiles, you must specify the asterisk first in order for the data to be read. This is the file-processing options string.
It specifies the datafile format. It also optimizes datafile reads. The syntax used for this string is specific to your operating system. See Specifying Datafile Format and Buffering. Filenames that include spaces or punctuation marks must be enclosed in single quotation marks.
For more details on filename specification, see Specifying Filenames and Object Names. Datafiles need not have the same file processing options, although the layout of the records must be identical.
For example, two files could be specified with completely different file processing options strings, and a third could consist of data in the control file. You can also specify a separate discard file and bad file for each datafile. In such a case, the separate bad files and discard files must be declared immediately after each datafile name. For example, the following excerpt from a control file specifies four datafiles with separate bad and discard files:
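A sketch of such an excerpt (the file names, and the use of DISCARDMAX on the last file, are illustrative):

INFILE  mydat1.dat  BADFILE  mydat1.bad  DISCARDFILE  mydat1.dis
INFILE  mydat2.dat
INFILE  mydat3.dat  DISCARDFILE  mydat3.dis
INFILE  mydat4.dat  DISCARDMAX  10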
If the data is included in the control file itself, then the INFILE clause is followed by an asterisk rather than a filename. The actual data is placed in the control file after the load configuration specifications. For example, suppose that your operating system has an option-string syntax that accepts a record size and a buffer count. To declare a file named mydata.dat using such a string, you could use the clause sketched below. For details on the syntax of the file processing options string, see your Oracle operating system-specific documentation.
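A sketch, assuming a hypothetical operating-system option string with RECSIZE (record size) and BUFFERS (buffer count) keywords:

INFILE 'mydata.dat' "RECSIZE 80 BUFFERS 8"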
This example uses the recommended convention of single quotation marks for filenames and double quotation marks for everything else. If you have specified that a bad file is to be created, the following applies: On some systems, a new version of the file is created if a file with the same name already exists. See your Oracle operating system-specific documentation to find out if this is the case on your system. If you do not specify a name for the bad file, the name defaults to the name of the datafile with an extension or file type of .bad.
The bad file is created in the same record and file format as the datafile so that the data can be reloaded after making corrections. For datafiles in stream record format, the record terminator that is found in the datafile is also used in the bad file. To specify a bad file with filename foo and the default file extension or file type of .bad. To specify a bad file with filename bad and an explicit file extension or file type. If there is an error loading a LOB, the row is not rejected.
Rather, the LOB column is left empty (not null) with a length of zero (0) bytes. If the data can be evaluated according to the WHEN clause criteria (even with unbalanced delimiters), then it is either inserted or rejected. Neither a conventional path nor a direct path load will write a row to any table if it is rejected because of reason number 2 in the previous list.
Additionally, a conventional path load will not write a row to any tables if reason number 1 or 3 in the previous list is violated for any one table. The row is rejected for that table and written to the reject file. The log file indicates the Oracle error for each rejected record. The records contained in this file are called discarded records. Discarded records do not satisfy any of the WHEN clauses specified in the control file. These records differ from rejected records. Discarded records do not necessarily have any bad data.
No insert is attempted on a discarded record. You can specify the discard file directly by specifying its name, or indirectly by specifying the maximum number of discards. The discard file is created in the same record and file format as the datafile. For datafiles in stream record format, the same record terminator that is found in the datafile is also used in the discard file. The default filename is the name of the datafile, and the default file extension or file type is .dsc.
A discard filename specified on the command line overrides one specified in the control file. If a discard file with that name already exists, it is either overwritten or a new version is created, depending on your operating system. A filename specified on the command line overrides any discard file that you may have specified in the control file.
The following list shows different ways you can specify a name for the discard file from within the control file. If a table is loaded without a WHEN clause, then an attempt is made to insert every record into that table.
Therefore, records may be rejected, but none are discarded. You can limit the number of records to be discarded for each datafile by specifying an integer, as sketched below. When the discard limit specified with integer is reached, processing of the datafile terminates and continues with the next datafile, if one exists. You can specify a different number of discards for each datafile. Or, if you specify the number of discards only once, then the maximum number of discards specified applies to all files.
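A sketch of a per-datafile discard limit (the file names and the limit of 25 are illustrative):

INFILE mydat1.dat DISCARDFILE mydat1.dis DISCARDMAX 25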
See Oracle9i Database Globalization Support Guide. The following sections provide a brief introduction to some of the supported character encoding schemes. Multibyte character sets support Asian languages. Data can be loaded in multibyte format, and database object names (fields, tables, and so on) can be specified with multibyte characters.
Column objects are conceptually stored in their entirety in a single column position in a row. These objects do not have object identifiers and cannot be referenced. Row objects, in contrast, are stored in tables, known as object tables, that have columns corresponding to the attributes of the object.
Columns in other tables can refer to these objects by using the OIDs. A nested table is a table that appears as a column in another table. All operations that can be performed on other tables can also be performed on nested tables. An array is an ordered set of built-in types or objects, called elements. Each array element is of the same type and has an index, which is a number corresponding to the element's position in the VARRAY.
LOBs can have an actual value, they can be null, or they can be "empty." A partitioned object in an Oracle database is a table or index consisting of partitions (pieces) that have been grouped, typically by common logical attributes.
For example, sales data for the year might be partitioned by month. The data for each month is stored in a separate partition of the sales table. Each partition is stored in a separate segment of the database and can have different physical attributes.
Oracle provides a direct path load API for application developers. In some case studies, additional columns have been added. The case studies are numbered 1 through 11, starting with the simplest scenario and progressing in complexity. Case Study 1: Loading Variable-Length Data - Loads stream format records in which the fields are terminated by commas and may be enclosed by quotation marks.
The data is found at the end of the control file. Case Study 3: Loading a Delimited, Free-Format File - Loads data from stream format records with delimited fields and sequence numbers. Case Study 4: Loading Combined Physical Records - Combines multiple physical records into one logical record corresponding to one database row.
This case study uses character-length semantics. These files are installed when you install Oracle Database. If the sample data for the case study is contained within the control file, then there will be no .dat file for that case. Case study 2 does not require any special set up, so there is no .sql script for that case. Case study 7 requires that you run both a starting (setup) script and an ending (cleanup) script. For example, to execute the SQL script for case study 1, enter the following:
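Assuming the demonstration scripts follow the ulcaseN.sql naming convention, you would enter this in SQL*Plus:

SQL> @ulcase1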
Be sure to read the control file for any notes that are specific to the particular case study you are executing. This is because the log file for each case study is produced when you execute the case study, provided that you use the LOG parameter. If you do not wish to produce a log file, omit the LOG parameter from the command line. This is done as follows. For example, if the table emp was loaded, enter:
Load data from multiple datafiles during the same load session. Load data into multiple tables during the same load session. Specify the character set of the data. Selectively load data you can load records based on the records' values.
Manipulate the data before loading it, using SQL functions. Generate unique sequential key values in specified columns. Use the operating system's file system to access the datafiles. Load data from disk, tape, or named pipe. If you omit end, the length of the continuation field is the length of the byte string or character string.
If you use end, and the length of the resulting continuation field is not the same as that of the byte string or the character string, the shorter one is padded. Character strings are padded with blanks, hexadecimal strings with zeros. This is the only time you refer to positions in physical records. All other references are to logical records. That is, data values are allowed to span the records with no extra characters (continuation characters) in the middle.
This means that the continuation characters are removed if they are in positions 3 through 5 of the record. It also means that the characters in positions 3 through 5 are removed from the record even if the continuation characters are not in positions 3 through 5.
Note that columns 1 and 2 are not removed from the physical records when the logical records are assembled. Therefore, the logical records are assembled as follows (with the same results as in the previous example). The specification of fields and datatypes is described in later sections. The table must already exist. If the table is not in the user's schema, then the user must either use a synonym to reference the table or include the schema name as part of the table name (for example, scott.emp).
That method overrides the global table-loading method. The following sections discuss using these options to load data into empty and nonempty tables.
The INSERT option requires the table to be empty before loading. Case study 1, Loading Variable-Length Data, provides an example. With the APPEND option, if data does not already exist in the table, the new rows are simply loaded. Case study 4, Loading Combined Physical Records, provides an example. With the REPLACE option, the row deletes cause any delete triggers defined on the table to fire. For more information about cascaded deletes, see the information about data integrity in Oracle Database Concepts.
To update existing rows, use the following procedure: load your data into a work table, use the SQL UPDATE statement with correlated subqueries, and then drop the work table. It is valid only for a parallel load. You can choose to load or discard a logical record by using the WHEN clause to test a condition in the record.
The WHEN clause appears after the table name and is followed by one or more field conditions. For example, the first sketch below indicates that any record with the value "q" in the fifth column position should be loaded. Parentheses are optional, but should be used for clarity with multiple comparisons joined by AND, as in the second sketch below. If all data fields are terminated similarly in the datafile, you can use the FIELDS clause to indicate the default delimiters.
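Sketches of the two WHEN forms mentioned above (the position, field names, and values are illustrative):

WHEN (5) = 'q'

WHEN (deptno = '10') AND (job = 'SALES')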
You can override the delimiter for any given column by specifying it after the column name. See Specifying Delimiters for a complete description of the syntax. Assume that the preceding data is read with the following control file and the record ends after dname:
In this case, the remaining loc field is set to null. The SINGLEROW option inserts each index entry directly into the index, one record at a time. Without this option, index entries are not inserted one at a time; instead, they are put into a separate, temporary storage area and merged with the original index at the end of the load.
This method achieves better performance and produces an optimal index, but it requires extra storage space. During the merge operation, the original index, the new index, and the space for new entries all simultaneously occupy storage space. The resulting index may not be as optimal as a freshly sorted one, but it takes less space to produce. It also takes more time because additional UNDO information is generated for each index insert.
This option is suggested for use when either of the following situations exists: The number of records to be loaded is small compared to the size of the table (a ratio of 1:20 or less is recommended). Some data storage and transfer media have fixed-length physical records.
When the data records are short, more than one can be stored in a single, physical record to use the storage space efficiently. For example, assume the data is as follows. The same record could be loaded with a different specification. The following control file, sketched below, uses relative positioning instead of fixed positioning.
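A sketch of relative positioning (the table and field names, and the delimiters, are illustrative):

INTO TABLE emp
(empno INTEGER EXTERNAL TERMINATED BY ' ',
 ename CHAR TERMINATED BY WHITESPACE)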
Instead, scanning continues where it left off. A single datafile might contain records in a variety of formats. Consider the following data, in which emp and dept records are intermixed. A record ID field distinguishes between the two formats. Department records have a 1 in the first column, while employee records have a 2.
The following control file uses exact positioning to load this data. The records in the previous example could also be loaded as delimited data; a sketch of that form follows. The POSITION(1) clause in the second INTO TABLE clause causes field scanning to start over at column 1 when checking for data that matches the second format.
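A sketch of the delimited form (the field names, delimiters, and WHEN tests are illustrative; note the POSITION(1) restart):

INTO TABLE dept
   WHEN recid = 1
   (recid  FILLER POSITION(1) INTEGER EXTERNAL TERMINATED BY ',',
    deptno INTEGER EXTERNAL TERMINATED BY ',',
    dname  CHAR TERMINATED BY WHITESPACE)
INTO TABLE emp
   WHEN recid <> 1
   (recid  FILLER POSITION(1) INTEGER EXTERNAL TERMINATED BY ',',
    empno  INTEGER EXTERNAL TERMINATED BY ',',
    ename  CHAR TERMINATED BY WHITESPACE)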
A single datafile may contain records made up of row objects inherited from the same base row object type. For example, consider the following simple object type and object table definitions, in which a nonfinal base object type is defined along with two object subtypes that inherit their row objects from the base type:
The following input datafile contains a mixture of these row object subtypes. A type ID field distinguishes between the three subtypes. See case study 5, Loading Data into Multiple Tables, for an example. Multiple rows are read at one time and stored in the bind array. The bind array does not apply to the direct path load method, because a direct path load uses the direct path API rather than Oracle's SQL interface.
The bind array must be large enough to contain a single row. Otherwise, the bind array contains as many rows as can fit within it, up to the limit set by the value of the ROWS parameter. Although the entire bind array need not be in contiguous memory, the buffer for each field in the bind array must occupy contiguous memory. Large bind arrays minimize the number of calls to the Oracle database and maximize performance.
In general, you gain large improvements in performance with each increase in the bind array size up to 100 rows. Increasing the bind array size to be greater than 100 rows generally delivers more modest improvements in performance. The size in bytes of 100 rows is typically a good value to use. It is not usually necessary to perform the detailed calculations described in this section. Read this section when you need maximum performance or an explanation of memory usage. The bind array never exceeds that maximum.
If that size is too large to fit within the specified maximum, the load terminates with an error. The bind array's size is equivalent to the number of rows it contains times the maximum length of each row. The maximum length of a row is equal to the sum of the maximum field lengths, plus overhead. Many fields do not vary in size. These fixed-length fields are the same for each loaded row.
There is no overhead for these fields. The maximum lengths describe the number of bytes that the fields can occupy in the input data record. That length also describes the amount of storage that each field occupies in the bind array, but the bind array includes additional overhead for fields that can vary in size.
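As a worked example of the size formula above (the numbers are illustrative): with ROWS=64 and a maximum row length of 200 bytes, including any per-field overhead, the bind array requires 64 * 200 = 12,800 bytes.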
When specified without delimiters, the size in the record is fixed, but the size of the inserted field may still vary, due to whitespace trimming. So internally, these datatypes are always treated as varying-length fields—even when they are fixed-length fields. A length indicator is included for each of these fields in the bind array. The space reserved for the field in the bind array is large enough to hold the longest possible value of the field. The length indicator gives the actual length of the field for each row.
On most systems, the size of the length indicator is 2 bytes. On a few systems, it is 3 bytes. To determine its size, use a control file like the one sketched below. This control file loads a 1-byte CHAR using a 1-row bind array. In this example, no data is actually loaded because a conversion error occurs when the character a is loaded into a numeric column (deptno).
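A sketch of such a control file (assuming the demonstration dept table, whose deptno column is numeric):

OPTIONS (ROWS=1)
LOAD DATA
INFILE *
APPEND
INTO TABLE dept
(deptno POSITION(1:1) CHAR(1))
BEGINDATA
a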
The bind array size shown in the log file, minus one (the length of the character field), is the value of the length indicator.