Denodo Data Consolidation & Denormalization


As the data modeling process in Denodo moves through the conceptual layers of the data warehouse, the data structures and their associated metadata evolve.

The Base Layer

As the modeling process begins, the base layer serves as the ingestion layer, where the source system data structures are recreated in Denodo and fields are converted to Denodo Virtual Query Language (VQL) data types. The base layer is what folks with a traditional data warehousing background would think of as staging or landing. These base layer views should most closely mirror the technical structure and data characteristics of the input data source and will be the least business-friendly in their organization, naming, and metadata.

The Semantics Layer

The semantics layer is where the major data reorganization, data transformation, and the application of business-friendly field names and metadata begin. The semantics layer is what folks with a traditional data warehousing background would think of as the Data Warehouse (DW) or Enterprise Data Warehouse (EDW). The semantics layer of the logical data warehouse (LDW) performs several tasks:

  • Data from multiple input sources are consolidated
  • The model becomes multi-dimensional (Fact and Dimension oriented)
  • Field names and descriptive metadata are changed to meaningful, domain normalized, business-friendly names and descriptions.
  • Domain normalizing business rules and transformations are applied.
  • Serves as a data source for the business layer and reporting layer.

The Business Layer

The business layer, which is considered optional by Denodo, is modeled along a narrower business subject orientation, and more specialized business rules are applied. This is what folks with a traditional data warehousing background would think of as a data mart (DM).

The business layer of the logical data warehouse (LDW) performs several tasks:

  • Limits and optimizes the data to facilitate business intelligence and reporting activities concerning a specific line of business or business topic (e.g., Financials, Human Resources, Inventory, Asset Management)
  • Business-specific/customized rules and metadata are applied
  • Supplements the semantic layer and serves as a data source for the reporting layer.
  • Additional data consolidation and data structure denormalization (flattening) may occur in the business layer

The Reporting Layer

The reporting layer, which is considered optional by Denodo, is the most customized layer and sees the most specialization by reporting topic and the most need-specific transformation. In a traditional data warehouse, this is where customized reporting views, system interface views, and interface ETLs that produce interface files are provided, and where reporting teams do more of their own development.

The reporting layer of the logical data warehouse (LDW) performs several tasks:

  • Provides consumer-specific customized rules and metadata
  • Provides consumer-specific data organization/layouts
  • Data is optimized for consumer purposes and may be highly or entirely denormalized to meet consumer needs.

Denodo Best Practices For Base Views

The Denodo “Base Layer” in the Logical Data Warehouse (LDW) can be thought of as the data staging layer in a more traditional data warehouse (DW) development pattern.  The base layer is the level at which the source system data structures are converted to Denodo field types and rendered as base views (bv).

Base views (bv) are the Denodo structures that reflect the source system structure. They are the first step in virtualizing data, coming directly after the data source connection, and are therefore essential elements for the other layers of the Logical Data Warehouse (LDW).  To improve the usefulness and performance of base views, here are some best practices:

  • Use consistent object naming conventions.  It is strongly recommended that the Denodo standard naming conventions be used.
  • Import or create the Primary Keys (PK), Foreign Keys (FK), and Associations.
  • Confirm that statistics collection has been set up and includes all critical fields.
  • Base views, as a rule, should not be cached unless absolutely necessary for reasons of performance.
  • Create indexes on Primary Keys (PK), surrogate keys, and Foreign Keys (FK).
  • Create performance indexes that mirror the source system's indexes.
  • Populate view metadata properties describing the type and the nature of the data that the view contains. Note:  if you have a governance team, they may want to manage the metadata in the Denodo data catalog.
  • Retain the original table name (with the naming convention prefix applied) and the original field names to facilitate data lineage traceability.
  • Use Denodo tools against tables, where possible, rather than (manual) SQL views, database views, or stored procedures.  Denodo cannot rewrite or optimize these objects.
  • Field metadata should be annotated with “Not Used” if the field is always null, blank, or empty.  This saves time and labor when working with levels and researching data issues.


How to Check Linux Version?

While researching an old install to check compliance with the system requirements for an upgrade, I discovered that I needed to validate which Linux version was installed.  So, here is a quick note on the command I used to check which version of Linux was installed.

Command

  • cat /etc/os-release

Example output of the “cat /etc/os-release” command
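When only the distribution name and version are needed, the same file can be read from a script. The following is a minimal sketch, assuming a POSIX shell; the function name print_os_version is made up for illustration, while NAME and VERSION_ID are standard fields of the os-release file.

```shell
#!/bin/sh
# Print the distribution name and version from an os-release style file.
# The file path argument is optional; /etc/os-release is the default.
print_os_version() {
    release_file="${1:-/etc/os-release}"
    if [ ! -r "$release_file" ]; then
        echo "cannot read $release_file" >&2
        return 1
    fi
    # Source the file in a subshell so its variables do not leak out.
    (
        . "$release_file"
        echo "${NAME:-unknown} ${VERSION_ID:-unknown}"
    )
}

# Report on the current system; ignore the error if the file is missing.
print_os_version || true
```

The file is a list of shell-style KEY=value assignments, which is why it can simply be sourced rather than parsed by hand.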

Windows – Host File Location

Occasionally, I need to update the Windows hosts file, but I seem to have a permanent memory block about where the file is located. I have written the location into numerous documents; however, every time I need to verify or update the hosts file, I need to look up the path. Today, when I went to look it up, I discovered that I had not actually posted it to this blog site. So, for future reference, I am adding it now.

Here is the path of the Windows hosts file. The drive letter may change depending on the drive on which Windows was installed.

C:\WINDOWS\system32\drivers\etc\hosts

New CentOS 8 Linux Release

The new CentOS 8 rebuild is out. Christened version 8.0-1905, this release provides a secure, stable, and more reliable foundation for CentOS users, such as organizations running high-performance websites and businesses with Linux experts who use CentOS daily for their workloads but do not need strong commercial support.

The new OS arrives after Red Hat released RHEL 8 (Red Hat Enterprise Linux) in May of this year. According to the CentOS 8 release notes, this rebuild is 100% compliant with Red Hat’s redistribution policy. This Linux distro allows users to run their operations on the robust power of an enterprise-class OS, but without the cost of support and certification. Below are some of the updates outlined in the CentOS 8 release notes that you can expect with this new release, along with some of the deprecated features.

What’s New in the Just Released CentOS 8?

  • BaseOS and Appstream
  • New container tools
  • Systemwide crypto policies
  • TCP stack improvements
  • DNF

· BaseOS and Appstream

The main repository, BaseOS (Base Operating System), offers the core components of the distribution, which provide the running user space on hardware, virtual machines, or even a container. The Application Stream, or Appstream, repository offers the applications you might want to run in a particular user space. The Supplemental repository offers software that comes with special licensing.

· New Container Tools

With the aid of Podman, CentOS 8 supports Linux containers. This replaces Docker and Moby, which depend on a daemon and run as root. Unlike the container tooling in the previous release, Podman does not depend on a daemon. Podman allows users to create images from scratch using Buildah.

· Systemwide Crypto Policies

The “update-crypto-policies” command can be used to update the system-wide cryptographic policy on the new OS. The policies have settings for the following applications and libraries: NSS TLS library, Kerberos 5 library, OpenSSH SSH2 protocol implementation, Libreswan IPsec and IKE protocol implementation, OpenSSL TLS library, and GnuTLS TLS library.
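As a rough illustration of the crypto policy tool, the sketch below reports the active policy when the tool is present. It assumes a POSIX shell; the function name show_crypto_policy is hypothetical, and FUTURE is one of the predefined policy names (alongside DEFAULT, LEGACY, and FIPS).

```shell
#!/bin/sh
# Report the active system-wide crypto policy, if the tool is present.
show_crypto_policy() {
    if command -v update-crypto-policies >/dev/null 2>&1; then
        update-crypto-policies --show
    else
        echo "update-crypto-policies is not installed on this system"
    fi
}

show_crypto_policy

# Switching policy requires root, so it is left commented out on purpose:
#   sudo update-crypto-policies --set FUTURE
```

Applications generally pick up a changed policy only after they are restarted, so a reboot is the simplest way to make a new policy take effect everywhere.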

· TCP Stack Improvements

CentOS 8 also brings TCP stack version 4.16, with an improved ingress connection rate. The Linux kernel now supports the new BBR and NV congestion control algorithms, which can help improve Linux server internet speed.

· DNF – Dandified Yum

The new operating system retains the basic foundations of the Yum package manager but upgrades it to DNF (Dandified Yum). Although DNF maintains a command-line interface and API similar to its predecessor, it promises to be faster and more efficient.
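Because the command-line interface is so similar, scripts that must run on both older and newer releases can choose the tool at run time. This is a minimal sketch, assuming a POSIX shell; the function name pkg_manager and the example package httpd are illustrative only.

```shell
#!/bin/sh
# Pick dnf when available, fall back to yum, otherwise report "none".
pkg_manager() {
    if command -v dnf >/dev/null 2>&1; then
        echo dnf
    elif command -v yum >/dev/null 2>&1; then
        echo yum
    else
        echo none
    fi
}

PM=$(pkg_manager)
echo "package manager: $PM"

# Typical invocations look the same with either tool (run as root):
#   $PM check-update      # list available updates
#   $PM install httpd     # install a package (httpd is only an example)
#   $PM remove httpd      # remove it again
```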

· Other Improvements

CentOS 8 also ships a compiler based on GCC version 8.2, which includes support for more recent C++ language standard versions, improved optimizations, more code hardening techniques, new hardware support, and better warnings.

In addition to those features, the new CentOS 8 also supports secure guests, which use cryptographically signed images to ensure that the program retains its integrity. It also boasts improved memory management and support. The CentOS 8 release notes state that the new OS allows the crash dump to capture a kernel crash during all booting phases, which was not possible before.

CentOS 8 uses LUKS2 as the default format for encrypted storage. It also includes enhancements to the process scheduler, such as the new deadline process scheduler. This Linux distro also enables installation on and booting from non-volatile dual in-line memory modules (NVDIMM).

A great bonus feature is that you can manage the new software with Cockpit via a web browser. This feature is very user-friendly, making it great for system administrators and new users alike.

Deprecated Features and Functionalities

If you are upgrading from previous CentOS versions, the most significant change is the nftables framework, which has replaced iptables. nftables allows users to perform network address translation (NAT), mangling, packet classification, and packet filtering. Compared with iptables, nftables provides secure firewall support with enhanced performance, increased scalability, and easier code maintenance.

These changes, though not major, may cause problems with firewall functionality. Although upgrades using RHEL may be supported, it is not advisable to upgrade directly from much older versions of CentOS, such as CentOS 6 and below, as they may not be compatible.
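For readers used to iptables rule chains, it may help to see what a small nftables ruleset looks like. The sketch below only prints an example ruleset and does not change the running firewall. The policy shown (drop by default, allow loopback, established traffic, and SSH on port 22) is an assumption for illustration, not a recommendation, and minimal_ruleset is a made-up function name.

```shell
#!/bin/sh
# Print a minimal nftables ruleset (illustrative policy only).
minimal_ruleset() {
    cat <<'EOF'
table inet filter {
    chain input {
        type filter hook input priority 0; policy drop;
        ct state established,related accept
        iif "lo" accept
        tcp dport 22 accept
    }
}
EOF
}

minimal_ruleset

# To syntax-check without applying (needs the nft tool, usually as root):
#   minimal_ruleset > /tmp/example.nft && nft -c -f /tmp/example.nft
```

The single "inet" table covers both IPv4 and IPv6, where iptables would need separate iptables and ip6tables rules.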

Users of CentOS as a desktop will see the default GNOME Shell interface updated to version 3.28, with Wayland remaining the default display server.

Final Thoughts

If you are looking to upgrade from previous versions, a direct upgrade path is yet to be released. As such, your best option is to back up your data, install the newly released CentOS 8, and then move the data to the new system once it is up and running.

Nonetheless, the new CentOS 8 Linux release is an exciting feat. This OS provides a manageable and consistent platform that suits a wide variety of deployments. It comes with well-thought-out and ingenious software updates that will help avid users to build more robust container workloads and web apps.

denodo Virtualization – Useful Links

Here are some denodo Virtualization references, which may be useful.

  • Denodo Home Page: https://www.denodo.com/en/about-us/our-company
  • Denodo Platform 7.0 Documentation: https://community.denodo.com/docs/html/browse/7.0/
  • Denodo Knowledge Base and Best Practices: https://community.denodo.com/kb/
  • Denodo Tutorials: https://community.denodo.com/tutorials/
  • Denodo Express 7.0 Download: https://community.denodo.com/express/download
  • Denodo Virtual Data Port (VDP) Naming Conventions: https://community.denodo.com/kb/download/pdf/VDP%20Naming%20Conventions?category=Operation
  • JDBC / ODBC drivers for Denodo: https://community.denodo.com/drivers/
  • Denodo Governance Bridge – User Manual: https://community.denodo.com/docs/html/document/denodoconnects/7.0/Denodo%20Governance%20Bridge%20-%20User%20Manual


Linux man (manual) Command

Despite the ‘man’ command’s relative simplicity and apparent unimportance, it is, perhaps, one of the most important commands to learn in Linux.

Why is the ‘man’ command important?

The true value of the ‘man’ command is that it provides access to the online manuals (documentation), which will be consulted often until Linux commands and functions have been learned and internalized.  Even after learning the more familiar and commonly used Linux commands and functions, one will still need to look up less commonly used capabilities or confirm something that has not been used in a while.

Some of the more arrogant Linux users will tell folks with questions to “read the frickin’ manual” (RTFM); the ‘man’ command is usually what they are talking about.  Although there are other perfectly useful reference materials online (e.g., git documentation project) and in commercial books, the ‘man’ command should be the go-to place for documentation.  The reason is simple: if a command or function is installed in your version or environment instance of Linux, its man pages will normally be available.  Therefore, there is usually no need to search the internet for answers or carry books around.

The ‘man’ command syntax

The syntax of the ‘man’ command is simple and easy to learn.  In fact, it is so easy to use that people frequently skip the options altogether and simply enter ‘man’ followed by the keyword.

‘man’ command syntax

man [options] (keywords)

Simple examples to illustrate how to use the ‘man’ command.

Example to pull up the ‘man’ command documentation

[blog-server ~]$ man man

In this example, ‘man man’ pulls up the man command’s own online documentation.

Example to pull up the ‘ls’ command documentation

[blog-server ~]$ man ls

In this example, ‘man ls’ pulls up the online documentation for ‘ls’ (list directory contents).

Example to pull up the ‘cp’ command documentation

[blog-server ~]$ man cp

In this example, ‘man cp’ pulls up the online documentation for ‘cp’ (copy files and directories).

Example screenshot of the ‘cp’ (copy files and directories) file command online documentation
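Beyond the bare ‘man keyword’ form, a few options are worth knowing. The sketch below uses ‘man -f’, which prints the one-line summary for a command (the same information as ‘whatis’), and lists two other common forms as comments. It assumes a POSIX shell, and man_summary is a hypothetical helper name.

```shell
#!/bin/sh
# Print the one-line summary for a command, as "man -f" (whatis) would.
man_summary() {
    if ! command -v man >/dev/null 2>&1; then
        echo "man is not installed"
        return 0
    fi
    # Fall back to a short message when no summary is found.
    man -f "$1" 2>/dev/null || echo "$1: no manual entry found"
}

man_summary ls
man_summary cp

# Other common forms, shown as comments because they open a pager:
#   man -k copy     # search summaries for a keyword (same as apropos)
#   man 5 passwd    # open a page from a specific section of the manual
```

The section number matters when the same name is documented in more than one place; for example, ‘passwd’ has both a command page (section 1) and a file format page (section 5).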