PureData – Table Effective Practices

Here a few tips, which can make a significant difference in the efficiency and effectiveness of developers and users, making information available to them when developing and creating analytic objects.  This information can, also, be very help to data modelers.  While some of these recommendations are not enforced by Netezza/PureData, this fact makes them no less helpful to your community.

Alter table to Identify Primary Keys (PK)

  • Visually helps developers and users to know what the keys primary keys of the table are
  • Primary key information can, also, be imported as metadata by other IBM tools (e.g. InfoSphere, Datastage, Data Architect, Governance Catalog, Aginity, etc.)
  • The query optimizer will use these definitions to define efficient query execution plans

Alter table to Identify Foreign Keys (FK)

  • Illustrate table relationships for developers and users
  • Foreign key information can, also, be imported as metadata by other IBM tools (e.g. InfoSphere, Datastage, Data Architect, Governance Catalog, Aginity, etc.)
  • The query optimizer will use these definitions to define efficient query execution plans

Limit Distribution Key to Non-Updatable Fields

  • This one seems obvious, but this problem occurs regularly if tables and optimizations are not properly planned; Causing an error will be generated, if an update is attempted against a field contained in the distribution of a table.

Use Null on Fields

  • Using ‘Not Null’ whenever the field data and ETL transformation rules can enforce it, helps improve performance by reducing the number of null condition checks performed and reduces storage.

Use Consistent Field Properties

  • Use the same data type and field length in all tables with the same field name reduces the amount of interpretation/conversion required by the system, developers, and report SQL.

Schedule Table Optimizations

  • Work with your DBA’s to determine the best scheduling time, system user, and priority of groom and generate statistics operations. Keep in mind the relationship to when the optimizations occur in relation to when users need to consume the data. All too often, this operation is not performed before users need the performance and/or is driven by DBA choice, without proper consideration to other processing performance needs.  This has proven, especially true, in data warehousing when the DBA does not have Data warehousing experience and/or does not understand the load patterns of the ETL/ELT process.

Related Links

One thought on “PureData – Table Effective Practices

Comments are closed.