10g OCP – Loading and Unloading Data

In Oracle 10g, there are several different ways of loading and unloading data. Prior to Oracle 10g the two methods outside of running queries were RMAN and import and export. In 10g RMAN still exists and is exanded upon. It is also the backup tool of choice for many DBAs as it actually simplifies things and gives you other benefits such a block corruption checking. Import and export are still there as well. Import will always be there but export will be going away. Why would they keep the import utility? Because you can export an Oracle 8i or 9i database and import it into Oracle 10g or the new 11g that will be coming out. Datapump is the new utility that replaces import and export.

So what has changed?

First of all, the command is different. To invoke export or import you would just use exp or imp. With datapump it is similar but follows with ‘dp’ for data pump: expdp and impdp. Also, if you are using new features such as Transparent Data Encryption, the old exp utility will not export encrypted columns. Your only choice is to use datapump in place of it. Datapump is also much faster than export and import as well as more flexible. You can run multiple jobs as well as run single jobs in parallel. You can also attach from jobs and detach from jobs.  You cannot use datapump to import an export from the traditional utilities. You can still use imp in Oracle 10g.

What does the datapump architecture look like?

For starters when you invoke a datapump job it starts a process on the server. If you use export and import you would attach across the network to the database server but you would work with files that were local to your computer meaning that if you imported a file you did it from a local drive and if you exported a file you would do it to a local drive. This makes it more efficient from not going over the network. The one drawback is you do lose some flexibility but nothing that FTP or SCP can’t solve.

Datapump has an interactive mode just like the old utilities it replaces. The drawback of using interactive mode is you have a limited set of choices and you lose many options (but most people use interactive mode for simplicity anyways). With Datapump you can only backup to disk. In the Veritas Net Backup Agent for Oracle, there is some interesting information for taking datapump backup jobs and sending them across the network, but I believe that is external to Oracle, meaning Oracle is not aware of this, unlike RMAN being aware of tape backups because of the MML that links the libobk.so file.

For help on the command structure and parameters of datapump export and import you can issue: expdp help=y and impdp help=y. There are many options with datapump and to get the exact specifics I would recomment reading Sam Alapati’s Oracle 10g study guide and / or refer to the Oracle documentation. What I intend to go over are some of the highlights and strong points for the exam. (What I think you need to know).

Benefits of Datapump technology:

  • Ability to restart data pump jobs
  • Parallel execution capabilities
  • Ability to attach to running jobs
  • Database to Database direct transfer using DB Links and SQL Net
  • Finegrained data import capability
  • Remap schemas, tablespaces, and filenames
  • Ability to estimate space requirements.

What does the datapump technology consist of?

  1. The DBMS_DATA_PUMP package
  2. The DBMS_METADATA package
  3. expdp and impdp command line utilities.

There are two data access methods.

  1. Direct Path
    • Part of the direct path API. Much faster load and dump times.
  2. External Tables
    • Read and write data to OS files. This gives a lot of flexibility and I have seen many people use this in place of SQL Loader for getting data from those files inside the database.

With the Direct path option, you want to be aware of clustered tables, active triggers on tables, single partitions with global indexes, referential integrity constraints, domain indexes on LOB columns, fine grained access control enabled in the insert mode, and tables with BFILE or opaque column types.

WHEW, now I can breath. : )

Data pump creates a dump file which is a proprietary format file holds your data. You also have the option of log files and SQL Files which will give you all the DDL statements it will use for importing. You specify this by setting the SQLFILE parameter. You will also have to setup Directory objects. Once you identify this, you have to grant a user read and write access to the directory. The oracle user or whoever the DB runs as also needs permissions to this directory on the operating system.

A simple datapump export command will look something like this:

expdp / directory=data_pump_dir dumpfile=dumpfilename.dmp

You also have the option of using a parameter file much in the same way you did in the original imp/exp commands. It would look something like this for datapump:

expdp parfile=myparfile.par

You can decide to do interactive data pump if you would like but the thing to remember is it will not give you as much control or as many option as explicitly defining everything in the command line.

There are 4 types of exports that you can do.

  1. Full export mode
  2. Schema mode
  3. Tablespace mode
  4. Table mode

In data pump there are so many options that I am not going to go through them here. I would recommend Sam Alapati’s book or going directly to the Oracle documentation. I would also go ahead and play with it and try some of the things out in order to get comfortable with it. I will also talk about some of the import functionality but also refer to the documentation to find all the answer’s and to get some experience with it.

Data pump import lets you manipulate your data in many way.

  • Filtering (Include or Exclude items even based on a Query)
  • Remap Schema
  • Remap Tablespace
  • Remap Datafiles
  • Transform (ability to exclude certain attributes like storage and tablespace clauses, modify the DDL)
  • Flashback Time (import data consistent to a flashback time – some restrictions)

If you want to view data pump jobs in the data dictionary to be able to attach to them using either expdp or impdp, it is DBA_DATAPUMP_JOBS.

External Tables is another thing that has been enhanced in 10g. External Tables used to have the restriction of being read only. You now have the ability to read and write to external tables. There are two types. The Oracle_loader type and the Oracle_datapump type. The Oracle_loader is the traditional and default. It will not write to an external table but the new Oracle_datapump will. The reason is because they use 2 different access drivers. Datapump does write it’s external tables using a proprietary format. Another restriction on external tables is that you cannot index them or use DML against them. You can however use them on another database if you’d like. You can also populate external tables utilizing parallel operations.

Transportable tablespaces is the last item I will cover. You use datapump to export the metadata and import it into the new database. If you want to transport the tablespace across platforms then you will have to make sure the endianess matches or modify the file. You can query the v$transportable_platform view to see which endian formats are used on which platform. If you have to modify the tablespace, then you can use RMAN on the source or destination machine to do so. Some other restrictions are the tablespace must be in read only mode prior to exporting metadata and copying the datafiles and the compatibaility level on both databases must be at 10.0.0 or higher.

1 Comment »

  1. Chaitra said,

    July 7, 2008 @ 12:08 am

    Good document for datapump.

RSS feed for comments on this post · TrackBack URI

Leave a Comment