In its simplest form, the Convert function in InfoSphere DataStage is a string replacement operation. Convert can be used to replace a specific character, a list of characters, or a Unicode character (e.g. Thumbs Up Sign or Grinning Face).
Convert('<<Value to be replaced>>', '<<Replacement value>>', <<Input field>>)
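To make the character-for-character behavior concrete, here is a minimal Python sketch of how I understand Convert to work. It is not DataStage code: the name convert_chars and the sample values are my own, and two rules in it are assumptions rather than confirmed behavior. The first is that a character with no counterpart in the replacement list is removed. The second is that the first occurrence of a repeated character wins.

```python
def convert_chars(from_list: str, to_list: str, expression: str) -> str:
    """Illustrative analogue of a per-character Convert (hypothetical helper)."""
    table = {}
    for i, ch in enumerate(from_list):
        if ord(ch) in table:
            continue  # assumption: the first occurrence of a repeated character wins
        # Assumption: characters with no replacement at the same position are removed.
        table[ord(ch)] = to_list[i] if i < len(to_list) else None
    return expression.translate(table)


# Replace each listed character with the one at the same position.
swapped = convert_chars("abc", "xyz", "aabbcc")

# Remove a list of characters by supplying an empty replacement list.
cleaned = convert_chars("$,", "", "$1,234")

# Replace a Unicode character (Thumbs Up Sign) with a plain character.
plain = convert_chars("\U0001F44D", "+", "ok \U0001F44D")
```

The point to take away is that the two lists are matched position by position, so this is not a substring search and replace.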
Using the Convert Function to remove a list of Characters
Special Characters in DataStage
Handling and converting special characters in a Transformer stage, since they can cause issues in XML processing and in certain databases.
Beware of this message when working with Boolean fields in DataStage: it displays as informational (at least it did for me), not as a warning or an error. Even though it seems innocuous, what it meant for my job was that the Boolean (‘true’ / ‘false’) was not being interpreted and every value posted as ‘false’.
In DataStage, the Netezza ‘Boolean’ data type maps to the ‘Bit’ SQL type, which expects a numeric input of zero (0) or one (1). So my solution (once I detected the problem during unit testing) was to put Transformer stage logic in place to convert the Boolean input to the expected numeric value.
Netezza to DataStage Data Type Mapping
Netezza data type: Boolean
DataStage data type (SQL type): Bit
Expected input value: 0 or 1 (1 = true, 0 = false)
Transformer Stage Boolean Handling Logic
A Netezza Boolean field can store true values, false values, and null. So some thought should be given to your desired data outcome for nulls.
This first example sets nulls to a specific value, which can support a business rule for null handling and also provides null handling for non-nullable fields. Here we set nulls to the numeric value for ‘true’ (1) and all other non-true inputs to the numeric value for ‘false’ (0).
If IsNull(Lnk_Src_In.USER_ACTIVE) Then 1 Else If Lnk_Src_In.USER_ACTIVE = 'true' Then 1 Else 0
In these second examples, nulls are set by the Else value, so make sure the direction of your test gives nulls the value you want. This approach still provides null handling for non-nullable fields.
If Lnk_Src_In.USER_ACTIVE = 'true' Then 1 Else 0
If Lnk_Src_In.USER_ACTIVE = 'false' Then 0 Else 1
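The three derivations differ only in where a null ends up. Here is a small Python sketch of the same decision logic so you can compare the outcomes side by side. It is not Transformer code: the function names are hypothetical, None stands in for a null, and I am assuming the source delivers the lowercase strings 'true' and 'false'.

```python
from typing import Optional


def null_as_true(value: Optional[str]) -> int:
    """First example: nulls are explicitly mapped to 1 (true)."""
    if value is None:
        return 1
    return 1 if value == "true" else 0


def else_is_false(value: Optional[str]) -> int:
    """Second example, first form: anything that is not 'true' (nulls included) becomes 0."""
    return 1 if value == "true" else 0


def else_is_true(value: Optional[str]) -> int:
    """Second example, second form: anything that is not 'false' (nulls included) becomes 1."""
    return 0 if value == "false" else 1


# Compare how each variant treats true, false, and null inputs.
outcomes = {
    name: [fn(v) for v in ("true", "false", None)]
    for name, fn in (
        ("null_as_true", null_as_true),
        ("else_is_false", else_is_false),
        ("else_is_true", else_is_true),
    )
}
```

Only the null column changes between the variants, which is why the business rule for nulls should drive the choice.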
Director Log Message
<<Link Name Where Message Occurred>>: Numeric string expected. Use default value.
Or something like this:
<<Link Name Where Message Occurred>>: Numeric string expected for input column '<<Field Name Here>>'. Use default value.
PureData System for Analytics, PureData System for Analytics 7.2.1, IBM Netezza user-defined functions, UDX data types reference information, Supported data types, Boolean
InfoSphere Information Server, InfoSphere Information Server 11.5.0, Connecting to data sources, Databases, Netezza Performance Server, Netezza connector, Designing jobs by using the Netezza connector, Defining a Netezza connector job, Data type conversions, Data type conversions from Netezza to DataStage
Some recent research to eliminate a few ETL data conversion issues made me want to enhance the documentation provided by IBM, so that the research does not have to be repeated.
Netezza data types and their equivalent InfoSphere DataStage data types
Netezza data types
InfoSphere DataStage data types (SQL types)
Numeric, decimal, double
Decimal and double are aliases of Numeric
TIME WITH TIME ZONE
The value of time with time zone will be returned without the time zone information.
You cannot load the interval data type from an external table.
Data type conversions from Netezza to DataStage
InfoSphere Information Server, InfoSphere Information Server 11.5.0, Connecting to data sources, Databases, Netezza Performance Server, Netezza connector, Designing jobs by using the Netezza connector, Defining a Netezza connector job, Data type conversions, Data type conversions from Netezza to DataStage
During a recent project, I had reason to investigate the maximum character field lengths for Netezza, to make sure the data ETL’ed from the source would fit. Netezza character fields cannot be as long as those in some other databases, so I thought I would document the limits in a quick post for future reference. Knowing these limits will help with your ETL design and construction, because you will know where to truncate your field input data.
InfoSphere DataStage data types (SQL type)
CHARACTER, CHAR(n), CHARACTER(n)
Fixed-length character string, blank padded to length n. If you do not specify n, the default is an unsized CHAR value.
CHARACTER VARYING, VARCHAR(n), CHARACTER VARYING(n), CHAR VARYING(n)
Variable length character string to a maximum length of n. If you do not specify n, the default is an unsized VARCHAR value. There is no blank padding, and the value is stored as entered.
NATIONAL CHARACTER, NATIONAL CHAR(n), NCHAR(size)
Fixed-length character string, blank padded to length n. If you do not specify n, the default is an unsized NCHAR value.
NATIONAL CHARACTER VARYING, NATIONAL CHAR VARYING, NATIONAL CHAR VARYING(n), NATIONAL CHARACTER VARYING(n), and NVARCHAR(n)
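Once you know the declared length n of the target column, the truncation step itself is simple. Here is a minimal Python sketch of the idea, not DataStage code. The function name fit_to_column is hypothetical, the limit is passed in by the caller rather than assumed, and the length is counted in characters (not bytes).

```python
from typing import Optional, Tuple


def fit_to_column(value: Optional[str], max_len: int) -> Tuple[Optional[str], bool]:
    """Trim value to max_len characters and report whether anything was cut."""
    if max_len < 0:
        raise ValueError("max_len must not be negative")
    if value is None:
        return None, False  # nulls pass through untouched
    truncated = len(value) > max_len
    return value[:max_len], truncated


# A value longer than the column is cut and flagged, so it can be logged.
fitted, was_cut = fit_to_column("abcdef", 4)
```

Returning the flag alongside the value makes it easy to count or log truncated rows instead of losing data silently.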
InfoSphere Information Server, InfoSphere Information Server 11.5.0, Connecting to data sources, Databases, Netezza Performance Server, Netezza connector, Designing jobs by using the Netezza connector, Defining a Netezza connector job, Data type conversions, Data type conversions from DataStage to Netezza
IMAM is an acronym for ‘InfoSphere Metadata Asset Manager’.
IMAM (InfoSphere Metadata Asset Manager) is a thin-client application used to import, export, and manage common metadata assets in the metadata repository of the InfoSphere Information Server (IIS) family of applications.
While helping a customer, we encountered the [SQLCODE=HY000][Native=46] ERROR, which was a new one for me. So here are a few notes to help the next unlucky soul who may run into the error.
Netezza Error Reason:
[SQLCODE=HY008][Native=51] Operation canceled; [SQLCODE=HY000][Native=46] ERROR: External Table : count of bad input rows reached maxerrors limit
What Does the Error Mean
In a nutshell, it means invalid data was submitted and could not be inserted.
What To Do
Basically, you need to go to the Netezza logs to see why the rows were rejected, resolve the input data errors, and then resubmit your transactions. The logs are temporary and reused, so you need to get to them before they are overwritten.
One of the changes between the older versions of IBM InfoSphere Information Server (IIS) and versions 11.3 and 11.5, which may not be obvious, is the improvement in Secure Sockets Layer (SSL) support. Beginning with 11.3, all communication between the client and the services tier is done over HTTPS (SSL). This includes all clients that access the services tier, whether a rich desktop client, a browser-based client, or a command-line client.