Over the years I have occasionally used the action column feature; however, over the last month or so I have found myself using it quite a lot. This is especially true in relation to the target stage and not just in relation to the change capture stage.
The first thing you need to know is that, if you want to avoid the ‘no action column found’ error on the target stage, you need to ensure that the action column is defined as a single-character field, Char(1). Otherwise, the Netezza connector stage will not recognize your field as an action column.
While most developers will commonly work with the action column feature in relation to the change capture stage, it can also be very useful if you have derived a field from one or more inputs to tell you what behavior the row requires. I have found this approach to be efficient under the right circumstances.
Action column configuration example
Change Code Values Mapping To Action Column
Here’s a quick reference table showing how each change type code maps to the one-character action column value it needs to be translated into.
Change Code Type | Change Type Code | Action Column Value
Copy (data without changes) | 0 | No value for this change type (copy rows are typically dropped)
Insert | 1 | I
Delete | 2 | D
Update | 3 | U
Example Transformer Stage Derivation
Here is a quick transformer stage derivation example that takes advantage of the action column capabilities. If you haven’t already handled the removal of the copy rows, you may also want to add a constraint (a minimal sketch follows the derivation); the final Else below simply provides a fallback value for any rows, such as copies, that slip through.
The combination I most frequently find myself using is the insert and update combination.
If Lnk_Out_To_Tfm.change_code = 1 Then 'I'
Else If Lnk_Out_To_Tfm.change_code = 2 Then 'D'
Else If Lnk_Out_To_Tfm.change_code = 3 Then 'U'
Else ''
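For completeness, here is a minimal constraint sketch for dropping the copy rows, assuming the default change capture copy code of 0 and the same link name used in the derivation above:

Lnk_Out_To_Tfm.change_code <> 0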
IBM Knowledge Center, InfoSphere Information Server 11.7.0, InfoSphere DataStage and QualityStage, Developing parallel jobs, Introduction to InfoSphere DataStage Balanced Optimization, Job design considerations, Specific considerations for the Netezza connector
During the course of the week, a discussion came up about the different places where a person might read the InfoSphere DataStage and QualityStage logs. I hadn’t really thought about it, but here are a few places that come to mind:
IBM InfoSphere DataStage and QualityStage Operations Console
IBM InfoSphere DataStage and QualityStage Director client
IBM InfoSphere DataStage and QualityStage Designer client, by pressing Ctrl+L
While investigating a recent InfoSphere Information Server (IIS) DataStage Essbase connector error, I found the explanations of the probable causes of the error not terribly meaningful. So, now that I have run the error to ground, I thought it might be nice to jot down a quick note on the potential causes of the ‘Client Commands are Currently Not Being Accepted’ error, which I gleaned from the process.
Error Message Id
An error occurred while processing the request on the server. The error information is 1051544 (message on contacting or from application:[<<DateTimeStamp>>]Local////3544/Error(1013204) Client Commands are Currently Not Being Accepted.
Possible Causes of The Error
This error indicates a problem with access to the Essbase object or with the security within the Essbase object. It can be the result of multiple issues, such as:
Object doesn’t exist – the Essbase object doesn’t exist in the location specified.
Communications – the location is unavailable or cannot be reached.
Path security – security prevents access to the Essbase object location.
Essbase security – security within the Essbase object does not support the user or filter being submitted; the Essbase object security may also be corrupted or incomplete.
Essbase object structure – the Essbase object is not structured to support the filter, or the Essbase filter is malformed for the current structure.
IBM Knowledge Center, InfoSphere Information Server 11.7.0, Connecting to data sources, Enterprise applications, IBM InfoSphere Information Server Pack for Hyperion Essbase
While chasing an error that applied only to join-type stages, I thought it might be nice to identify what the InfoSphere Information Server DataStage/QualityStage join-type stages are. There are three of them:
Join Stage
Lookup Stage
Merge Stage
All three stages join data based on the values of identified key columns.
IBM Knowledge Center, InfoSphere Information Server 11.7.0, InfoSphere DataStage and QualityStage, Developing parallel jobs, Processing Data, Lookup Stage
When you are controlling a chain of sequences in the job stream and taking advantage of reusable (multiple-instance) jobs, it is useful to be able to pass the invocation ID from the master controlling sequence and have it passed down and assigned to the job run. This can easily be done without needing to manually enter the values in each of the sequences, by leveraging the DSJobInvocationId variable. For this to work:
The job must have ‘Allow Multiple Instance’ enabled
The parent sequence must provide the invocation ID (the invocation name entered on the Job Activity stage that calls the child)
The receiving child sequence must have the DSJobInvocationId variable entered as its invocation ID expression
At runtime, each invocation of the multi-instance job is generated as its own instance, with its own logs.
This approach allows for the reuse of jobs and the assignment of meaningful instance extension names, which are managed from a single point of entry in the object tree.
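As an illustration, here is a minimal sketch of the two expressions involved. The ‘NIGHTLY’ literal is a hypothetical invocation name; DSJobInvocationId is the built-in variable that carries it downward.

Invocation Id Expression, parent sequence Job Activity (calls the child sequence): 'NIGHTLY'
Invocation Id Expression, child sequence Job Activity (calls the multi-instance job): DSJobInvocationId

With this in place, the job runs as JobName.NIGHTLY and logs under that instance.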
While working with a client’s DataStage 9.1 environment, I ran into a situation where they wanted to parameterize SQL where-clause lists in an Oracle Connector stage, which honestly was not very straightforward to figure out. First, if APT_OSL_PARAM_ESC_SQUOTE is not set and single quotes are used in the parameter, the job creates unquoted, invalid SQL when the parameter is populated. Second, I found much of the available information confusing and/or incomplete in its explanation. After some research and some trial and error, here is how I resolved the issue. I’ll endeavor to be concise, but holistic, in my explanation.
When this Variable applies
These are the circumstances where I know this process applies; there may be other circumstances to which it is applicable, but I’m listing the ones with which I have recent experience.
InfoSphere Information Server DataStage
Versions 9.1, 11.3, and 11.5
Oracle Database
Versions 11g and 12c
Here is a brief explanation of the steps I used to implement the where clause as a parameter. Please note that in this example I am using a job parameter to populate only a portion of the where clause; you can certainly pass the entire where clause as a parameter, if it is not too long.
Configure Project Variable in Administrator
Add APT_OSL_PARAM_ESC_SQUOTE to project in Administrator
Populate the APT_OSL_PARAM_ESC_SQUOTE variable with the escape character, a backslash (\)
Create job parameter
Following your project naming convention (or standard practice, if your customer and/or project does not have an established naming convention), create the job parameter in the job. See the jp_ItemSource parameter used in the sketch below.
Add the Job Parameter to the Custom SQL in the Oracle Connector Stage
Once the job parameter has been created, add it to the SQL statement of the job.
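To make the moving parts concrete, here is a minimal sketch. The items table and its columns are hypothetical; jp_ItemSource and the #parameter# reference syntax come from the steps above.

SELECT item_id,
item_name,
item_source
FROM items
WHERE item_source IN (#jp_ItemSource#)

With APT_OSL_PARAM_ESC_SQUOTE set and jp_ItemSource populated with a value such as 'RETAIL','WHOLESALE', the connector resolves the statement to WHERE item_source IN ('RETAIL','WHOLESALE') instead of stripping the quotes.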
IBM Knowledge Center, InfoSphere Information Server 11.5.0, Connecting to data sources, Databases, Oracle databases, Oracle connector
I’ve tried to explain the difference between OLTP systems and a Data Warehouse to my managers many times, as I’ve worked at a hospital as a Data Warehouse manager/data analyst for many years. Why was the list that came from the operational applications different from the one that came from the Data Warehouse? Why couldn’t I just get a list of the patients lying in the hospital right now from the Data Warehouse? So I explained, and explained again, and explained to another manager, and another. You get the picture. In this article I will explain this very same thing to you, so you know how to explain it to your manager. Or, if you are a manager, you might understand what your data analyst can and cannot give you.
OLTP stands for On Line Transactional Processing. In other words: getting your data directly from the operational systems to make reports. An operational system is a system that is used for the day-to-day processes. For example: when a patient checks in, his or her information gets entered into a Patient Information System. The doctor puts scheduled tests, a diagnosis, and a treatment plan in there as well. Doctors, nurses, and other people working with patients use this system on a daily basis to enter and retrieve detailed information on their patients. The data is stored within operational systems in a way that lets it be used efficiently by the people working directly on the product, or with the patient in this case.
A Data Warehouse is a big database that fills itself with data from operational systems. It is used solely for reporting and analytical purposes. No one uses this data for day-to-day operations. The beauty of a Data Warehouse is, among other things, that you can combine the data from the different operational systems. You can actually combine the number of patients in a department with the number of nurses, for example. You can see how far a doctor is behind schedule and find the cause of that by looking at the patients. Does he run late with elderly patients? Is there a particular diagnosis that takes more time? Or does he just oversleep a lot? You can use this information to look at the past and see trends, so you can plan for the future.
The difference between OLTP and Data Warehousing
This is how a Data Warehouse works:
The data gets entered into the operational systems. Then the ETL processes Extract this data from those systems, Transform the data so it fits neatly into the Data Warehouse, and then Load it into the Data Warehouse. After that, reports are built with a reporting tool from the data that sits in the Data Warehouse.
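As a minimal sketch of what one ETL step can look like, assuming hypothetical staging and warehouse tables (stg_visits, dw_dim_patient, dw_fact_visits) and plain SQL in place of a dedicated ETL tool:

-- Load: insert the transformed rows into the warehouse fact table
INSERT INTO dw_fact_visits (visit_id, patient_key, visit_date, department)
SELECT v.visit_id,
p.patient_key, -- Transform: swap the operational patient id for a surrogate key
CAST(v.admitted_at AS DATE), -- Transform: reduce the timestamp to the warehouse grain
UPPER(v.department_code) -- Transform: standardize the department code
FROM stg_visits v -- Extract: rows staged from the operational system
JOIN dw_dim_patient p ON p.patient_id = v.patient_id;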
This is how OLTP works:
Reports are made directly from the data inside the database of the operational systems. Some operational systems come with their own reporting tool, but you can always use a standalone reporting tool to make reports from the operational databases.
Pros and Cons
Data Warehouse Pros
There is no strain on the operational systems during business hours
Because you can schedule the ETL processes to run during the hours when the fewest people are using the operational system, you won’t disturb the operational processes. And when you need to run a large query, the operational systems won’t be affected, as you are working directly on the Data Warehouse database.
Data from different systems can be combined
It is possible to combine finance and productivity data, for example, because the ETL process transforms the data so it can be combined.
Data is optimized for making queries and reports
You use different data in reports than you use on a day-to-day basis. A Data Warehouse is built for this. For instance: most Data Warehouses have a separate date table where the weekday, day, month, and year are saved. You can make a query to derive the weekday from a date, but that takes processing time. By using a separate table like this you’ll save time and decrease the strain on the database.
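A minimal sketch of the difference, assuming a hypothetical dim_date table keyed on the calendar date and a visits table:

-- Deriving the weekday on the fly costs processing time on every run:
SELECT TO_CHAR(v.visit_date, 'Day') AS weekday, COUNT(*) AS visit_count
FROM visits v
GROUP BY TO_CHAR(v.visit_date, 'Day');

-- Reading it from the date dimension is a simple join:
SELECT d.weekday, COUNT(*) AS visit_count
FROM visits v
JOIN dim_date d ON d.date_key = v.visit_date
GROUP BY d.weekday;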
Data is saved longer than in the source systems
The source systems need to have their old records deleted when they are no longer used in the day-to-day operations; deleting them preserves performance. The Data Warehouse, on the other hand, can keep that history long after the source has purged it.
Data Warehouse Cons
You always look at the past
A Data Warehouse is updated once a night, or even just once a week. That means that you never have the latest data. Staying with the hospital example: you never know how many patients are in the hospital right now, or which surgeon didn’t show up on time this morning.
You don’t have all the data
A Data Warehouse is built for discovering trends, showing the big picture. The little details, the ones not used in trends, get discarded during the ETL process.
Data isn’t the same as the data in the source systems
Because the data is older than the data in the source systems, it will always be a little different. But also because of the Transformation step in the ETL process, the data will be a little different. That doesn’t mean one or the other is wrong; it’s just a different way of looking at the data. For example: the Data Warehouse at the hospital excluded all transactions that were marked as cancelled. If you try to get the same report from both systems and don’t exclude the cancelled transactions in the source system, you’ll get different results.
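A minimal sketch of that reconciliation point, with a hypothetical transactions table and status column:

-- Source system report (includes cancelled rows):
SELECT COUNT(*) AS transaction_count FROM transactions;

-- To match the Data Warehouse load rule, apply the same exclusion:
SELECT COUNT(*) AS transaction_count FROM transactions WHERE status <> 'CANCELLED';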
Online Transactional Processing (OLTP)
OLTP Pros
You get real time data
If someone is entering a new record now, you’ll see it right away in your report. No delays.
You’ve got all the details
You have access to all the details that the employees have entered into the system. No grouping, no skipping records, just all the raw data that’s available.
OLTP Cons
You are putting strain on an application during business hours
When you are making a large query, you can consume processing capacity that would otherwise be available to the people who need the system for their day-to-day operations. And if you make an error, for instance by forgetting to put a date filter on your query, you could even bring the system down so no one can use it anymore.
You can’t compare the data with data from other sources.
Even when the systems are similar, like an HR system and a payroll system that work together, the data is always going to be different, because it is granulated at a different level or because not all data is relevant to both systems.
You don’t have access to old data
To keep the applications at peak performance, old data that’s irrelevant to the day-to-day operations is deleted.
Data is optimized to suit day to day operations
And not for report making. This means you’ll have to get creative with your queries to get the data you need.
So what method should you use?
That all depends on what you need at that moment. If you need detailed information about things that are happening now, you should use OLTP. If you are looking for trends, or insights on a higher level, you should use a Data Warehouse.