Denodo Data Consolidation & Denormalization

As the data modeling process in Denodo moves through the conceptual layers of the data warehouse, there is an evolution of the data structures and their associated metadata.

The Base Layer

As the modeling process begins, the base layer is the ingestion layer, where the source system data structures are recreated in Denodo and fields are transformed into Denodo Virtual Query Language (VQL) data types. The base layer is what folks with a traditional data warehousing background would think of as staging or landing. These base layer views should most closely mirror the technical structure and data characteristics of the input data source and will be the least business-friendly in their organization, naming, and metadata.

The Semantics Layer

The semantics layer is where the major data reorganization, data transformation, and the application of business-friendly field names and metadata begin. The semantics layer is what folks with a traditional data warehousing background would think of as the Data Warehouse (DW) or Enterprise Data Warehouse (EDW). The semantics layer of the logical data warehouse (LDW) performs several tasks:

  • Data from multiple input sources is consolidated (a simplified sketch follows this list).
  • The model becomes multi-dimensional (fact- and dimension-oriented).
  • Field names and descriptive metadata are changed to meaningful, domain-normalized, business-friendly names and descriptions.
  • Domain-normalizing business rules and transformations are applied.
  • The layer serves as a data source for the business layer and reporting layer.
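
As a simplified illustration of the consolidation and renaming tasks above (written as a generic SQL-style sketch rather than exact Denodo VQL, with hypothetical view and field names), a semantics-layer view might join base views from two source systems and expose business-friendly field names:

    -- Hypothetical sketch: consolidate two source-system base views into one
    -- semantics-layer customer view with business-friendly field names.
    CREATE OR REPLACE VIEW sv_customer AS
    SELECT
        crm.cust_id        AS customer_id,      -- key carried through from the source
        crm.cust_nm        AS customer_name,    -- renamed from the cryptic source field
        erp.acct_status_cd AS account_status    -- consolidated from a second source system
    FROM bv_crm_customer crm
    INNER JOIN bv_erp_account erp
        ON crm.cust_id = erp.cust_id;

The same pattern repeats for facts and other dimensions, which is what gradually turns the base layer's source-shaped views into the multi-dimensional semantics layer.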

The Business Layer

The business layer, which is considered optional by Denodo, is modeled along a narrower business subject orientation, and more specialized business rules are applied. This is what folks with a traditional data warehousing background would think of as a data mart (DM).

The business layer of the logical data warehouse (LDW) performs several tasks:

  • Limits and optimizes the data to facilitate business intelligence and reporting activities concerning a specific line of business or business topic (e.g., financials, human resources, inventory, asset management, etc.)
  • Business-specific/customized rules and metadata are applied
  • Supplements the semantics layer and serves as a data source for the reporting layer.
  • Additional data consolidation and data structure denormalization (flattening) may occur in the business layer (see the sketch following this list).
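
As a simplified illustration of that flattening (again a generic SQL-style sketch with hypothetical names, not exact Denodo VQL), a business-layer view might join a semantics-layer fact view to its dimensions and expose a denormalized, finance-oriented result:

    -- Hypothetical sketch: denormalize (flatten) a fact and its dimensions
    -- into a single finance-oriented business-layer view.
    CREATE OR REPLACE VIEW bl_invoice_detail AS
    SELECT
        f.invoice_id,
        f.invoice_amount,
        c.customer_name,
        d.fiscal_year,
        d.fiscal_period
    FROM sv_fact_invoice f
    INNER JOIN sv_dim_customer c ON f.customer_key = c.customer_key
    INNER JOIN sv_dim_date     d ON f.invoice_date_key = d.date_key;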

The Reporting Layer

The reporting layer, which is considered optional by Denodo, is the most customized layer and sees the most reporting-topic specialization and specific-need transformation. The reporting layer is where a traditional data warehouse team might provide customized reporting or system interface views, build interface ETLs to produce interface files, and where reporting teams do more of their own development.

The reporting layer of the logical data warehouse (LDW) performs several tasks:

  • Provides consumer-specific customized rules and metadata
  • Provides consumer-specific data organization/layouts
  • Data is optimized for consumer purposes and may be highly or entirely denormalized to meet consumer needs.

Denodo Best Practices For Base Views

The Denodo “Base Layer” in the Logical Data Warehouse (LDW) can be thought of as the data staging layer in a more traditional data warehouse (DW) development pattern.  The base layer is the level at which the source system data structures are transformed into Denodo field types and the source data structures are rendered as base views (bv).

Base views (bv) are the first step in virtualizing data: they are the Denodo structures that reflect the source system structure, created immediately after the data source connection, and they are therefore essential elements for the other layers of the Logical Data Warehouse (LDW).  To help make base views useful and performant, here are some best practices:

  • Use consistent object naming conventions.  It is strongly recommended that the Denodo standard naming conventions be used.
  • Create indexes on primary keys (PK), surrogate keys, and foreign keys (FK) (see the sketch following this list).
  • Import and/or create the primary keys (PK), foreign keys (FK), and associations.
  • Set up statistics collection and make sure it includes all critical fields.
  • Base views, as a rule, should not be cached unless absolutely necessary for reasons of performance.
  • Create performance indexes that mirror the source system indexes to improve performance.
  • Populate view metadata properties describing the type and nature of the data the view contains.
  • Populate field metadata properties describing each field. This is important for a few reasons:
    • The description can be inherited by views built from the base view.
    • This populates the Denodo Metadata Catalog, within which data stewards can maintain and improve the description.
    • It informs other developers and users of what the field is/means.
    • Field metadata should be annotated with “Not Used” if the field is always null, blank, or empty.  This saves time and labor when researching data issues.
  • Retain the original table name (applying the naming convention prefix) and the original field names to facilitate data lineage traceability.
  • Use Denodo tools against tables, where possible, rather than hand-written SQL views, database views, or stored procedures; Denodo cannot rewrite or optimize these objects.
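
As a simplified illustration of the index and key recommendations above (hypothetical table and index names, shown as generic SQL run against the underlying source database rather than Denodo itself):

    -- Hypothetical sketch: supporting indexes on the primary and foreign key
    -- columns of a source table that a base view sits over.
    CREATE UNIQUE INDEX ix_customer_pk         ON customer (customer_id);
    CREATE INDEX        ix_customer_account_fk ON customer (account_id);

Within Denodo itself, the corresponding primary keys, foreign keys, associations, and index information are normally imported from the source or declared as metadata through the administration tool rather than created as physical structures.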


Denodo Data Catalog Roles

The Denodo Data Catalog provides the data governance and self-service capabilities that supplement the Denodo Virtual DataPort (VDP) core capabilities. Six roles provide the ability to assign or deny capabilities within the Denodo Data Catalog and supplement the database, row, and column security and permissions of Denodo Virtual DataPort (VDP).

  • data_catalog_classifier: Assign categories, tags, and custom properties groups to views and web services.
  • data_catalog_editor: Edit views, web services, and databases. Create, edit, and delete tags, categories, custom properties groups, and custom properties.
  • data_catalog_manager: Can do the same as a user with the roles “data_catalog_editor” and “data_catalog_classifier”.
  • data_catalog_content_admin: Configure personalization options and content search.
  • data_catalog_admin: Can perform any action of all the other data catalog roles.
  • data_catalog_exporter: Can export the results of a query from the Denodo Data Catalog.

Related References

Denodo > User Manuals > Denodo Platform New Features Guide

Denodo > User Manuals > Data Catalog Guide > Administration

Why Business Intelligence (BI) needs a Semantic Data Model

A semantic data model is a method of organizing and representing corporate data that reflects the meaning and relationships among data items. This method of organizing data helps end-users access data autonomously using familiar business terms such as revenue, product, or customer via BI (business intelligence) and other analytics tools. The use of a semantic model offers a consolidated, unified view of data across the business, allowing end-users to obtain valuable insights quickly from large, complex, and diverse data sets.

What is the purpose of semantic data modeling in BI and data virtualization?

A semantic data model sits between a reporting tool and the original database in order to assist end-users with reporting. It is the main entry point for accessing data for most organizations when they are running ad hoc queries or creating reports and dashboards. It facilitates reporting and improvements in various areas, such as:

  • No relationships or joins for end-users to worry about because they’ve already been handled in the semantic data model
  • Data such as invoice data, Salesforce data, and inventory data have all been pre-integrated for end-users to consume.
  • Columns have been renamed into user-friendly names such as Invoice Amount as opposed to INVAMT (illustrated in the sketch below).
  • The model includes powerful time-oriented calculations such as percentage change in sales since last quarter, sales year-to-date, and sales increase year over year.
  • Business logic and calculations are centralized in the semantic data model in order to reduce the risk of incorrect or inconsistent calculations.
  • Data security can be incorporated. This might include exposing certain measurements to only authorized end-users and/or standard row-level security.

A well-designed semantic data model with agile tooling allows end-users to learn and understand how altering their queries results in different outcomes. It also gives them independence from IT while having confidence that their results are correct.
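
A minimal, hypothetical sketch in generic SQL (the table, column, and view names here are illustrative, not from any specific system) of how a semantic view can hide the join, rename a cryptic column such as INVAMT, and centralize a year-to-date calculation:

    -- Hypothetical sketch: a semantic view that pre-joins invoice and customer
    -- data, renames cryptic columns, and centralizes a year-to-date measure.
    CREATE OR REPLACE VIEW semantic_invoice AS
    SELECT
        c.CUSTNM AS customer_name,   -- friendly name instead of the source column
        i.INVAMT AS invoice_amount,  -- renamed from INVAMT
        SUM(i.INVAMT) OVER (
            PARTITION BY EXTRACT(YEAR FROM i.INVDT)
            ORDER BY i.INVDT
        )        AS sales_year_to_date
    FROM invoice i
    INNER JOIN customer c ON i.CUSTID = c.CUSTID;

Because the year-to-date logic lives in the view, every report that selects sales_year_to_date uses the same calculation instead of re-deriving it.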

Analytics Model Types

Every day, businesses are creating around 2.5 quintillion bytes of data, making it increasingly difficult to make sense of and extract valuable information from this data. And while this data can reveal a lot about customer bases, users, and market patterns and trends, if it is not tamed and analyzed, this data is just useless. Therefore, for organizations to realize the full value of this big data, it has to be processed. This way, businesses can pull powerful insights from this stockpile of bits.

And thanks to artificial intelligence and machine learning, we can now do away with mundane spreadsheets as a tool to process data. Through the various AI and ML-enabled data analytics models, we can now transform the vast volumes of data into actionable insights that businesses can use to scale operational goals, increase savings, drive efficiency and comply with industry-specific requirements.

We can broadly classify data analytics into three distinct models:

  • Descriptive
  • Predictive
  • Prescriptive

Let’s examine each of these analytics models and their applications.

Descriptive Analytics: A Look Into What Happened

How can an organization or an industry understand what happened in the past to make decisions for the future? Well, through descriptive analytics.

Descriptive analytics is the gateway to the past. It helps us gain insights into what has happened. Descriptive analytics allows organizations to look at historical data and gain actionable insights that can be used to make decisions for “the now” and the future, upon further analysis.

For many businesses, descriptive analytics is at the core of their everyday processes. It is the basis for setting goals. For instance, descriptive analytics can be used to set goals for better customer experience. By looking at the number of tickets raised in the past and their resolutions, businesses can use ticketing trends to plan for the future.

Some everyday applications of descriptive analytics include:

  • Reporting of new trends and disruptive market changes
  • Tabulation of social metrics such as the number of tweets, followers gained over a period of time, or Facebook likes garnered on a post.
  • Summarizing past events such as customer retention, regional sales, or marketing campaign success (see the query sketch below).
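
A minimal, hypothetical query sketch of this descriptive pattern, written in generic SQL against an assumed support_ticket table, summarizing past tickets by month and resolution status:

    -- Hypothetical sketch: a descriptive summary of historical support tickets,
    -- grouped by month and resolution status.
    SELECT
        EXTRACT(YEAR  FROM opened_date) AS ticket_year,
        EXTRACT(MONTH FROM opened_date) AS ticket_month,
        resolution_status,
        COUNT(*)                        AS ticket_count
    FROM support_ticket
    GROUP BY
        EXTRACT(YEAR  FROM opened_date),
        EXTRACT(MONTH FROM opened_date),
        resolution_status
    ORDER BY ticket_year, ticket_month;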

To enhance their decision-making capabilities, businesses have to take the data further and use it to make better predictions about the future. That’s where predictive analytics comes in.

Predictive Analytics takes Descriptive Data One Step Further

Using both new and historical data sets, predictive analytics helps businesses model and forecast what might happen in the future. Using various data mining and statistical algorithms, we can leverage the power of AI and machine learning to analyze currently available data and model it to make predictions about future behaviors, trends, risks, and opportunities. The goal is to go beyond the data surface of “what has happened and why it has happened” and identify what will happen.

Predictive data analytics allows organizations to be prepared and become more proactive, and therefore make decisions based on data and not assumptions. It is a robust model that is being used by businesses to increase their competitiveness and protect their bottom line.

The predictive analytics process is a step-by-step process that requires analysts to:

  • Define project deliverables and business objectives
  • Collect historical and new transactional data
  • Analyze the data to identify useful information. This analysis can be through inspection, data cleaning, data transformation, and data modeling.
  • Use various statistical models to test and validate the assumptions.
  • Create accurate predictive models about the future.
  • Deploy the data to guide your day-to-day actions and decision-making processes.
  • Manage and monitor the model performance to ensure that you’re getting the expected results.

Instances Where Predictive Analytics Can be Used

  • Propel marketing campaigns and reach customer service objectives.
  • Improve operations by forecasting inventory and managing resources optimally.
  • Fraud detection such as false insurance claims or inaccurate credit applications
  • Risk management and assessment
  • Determine the best direct marketing strategies and identify the most appropriate channels.
  • Help in underwriting by predicting the chances of bankruptcy, default, or illness.
  • Health care: Use predictive analytics to determine health-related risk and make informed clinical support decisions.

Prescriptive Analytics: Developing Actionable Insights from Descriptive Data

Prescriptive analytics helps us to find the best course of action for a given situation. By studying interactions between the past, the present, and the possible future scenarios, prescriptive analytics can provide businesses with the decision-making power to take advantage of future opportunities while minimizing risks.

Using Artificial Intelligence (AI) and Machine Learning (ML), we can use prescriptive analytics to automatically process new data sets as they are available and provide the most viable decision options in a manner beyond any human capabilities.

When effectively used, it can help businesses avoid the immediate uncertainties resulting from changing conditions by providing them with fact-based best and worst-case scenarios. It can help organizations limit their risks, prevent fraud, fast-track business goals, increase operational efficiencies, and create more loyal customers.

Bringing It All Together

As you can see, different big data analytics models can help you add more sense to raw, complex data by leveraging AI and machine learning. When effectively done, descriptive, predictive, and prescriptive analytics can help businesses realize better efficiencies, allocate resources more wisely, and deliver superior customer success most cost-effectively. But ideally, if you wish to gain meaningful insights from predictive or even prescriptive analytics, you must start with descriptive analytics and then build up from there.

A Quick Guide to Creating Competition Battle Cards


Why Use Competition Battle Cards?

During the B2B sales and marketing processes, if there’s one question that the buyer is sure to ask, it will be some variation of “how does your product stack up against X competition?” Maybe the prospect is interested in a certain feature, price, or benefit – regardless of the specifics, your reps need to speak intelligently about how your product or service compares.

The struggle is that in any B2B sales role, there’s a lot of information to remember. The ability to retain every nuance of their own product or service is no small feat, let alone the details of every competitor.

That’s where competition battle cards come in. They’re essentially a cheat sheet for your sales reps. When a prospect brings up the competition, the rep can open the battle card and have instant access to that company’s product information…


Using Logical Data Lakes

Today, data-driven decision making is at the center of all things. The emergence of data science and machine learning has further reinforced the importance of data as the most critical commodity in today’s world. From FAAMG (the biggest five tech companies: Facebook, Amazon, Apple, Microsoft, and Google) to governments and non-profits, everyone is busy leveraging the power of data to achieve their goals. Unfortunately, this growing demand for data has exposed the inefficiency of the current systems to support the ever-growing data needs. This inefficiency is what led to the evolution of what we today know as Logical Data Lakes.

What Is a Logical Data Lake?

In simple words, a data lake is a data repository that is capable of storing any data in its original format. As opposed to traditional data sources that use the ETL (Extract, Transform, and Load) strategy, data lakes work on the ELT (Extract, Load, and Transform) strategy. This means data does not have to be transformed before it is loaded, which essentially translates into reduced time and effort. Logical data lakes have captured the attention of millions because they do away with the need to physically integrate data from different data repositories. Thus, with this open access to data, companies can now begin to draw correlations between separate data entities and use this exercise to their advantage.
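
A minimal, hypothetical SQL-style sketch of the ELT idea (table and column names are illustrative): the raw records are landed unchanged, and the transformation is applied later through a view at read time:

    -- Hypothetical sketch of ELT: land the raw data first (Extract + Load),
    -- then transform on read with a view instead of a pre-load ETL job.
    CREATE TABLE raw_sensor_events (
        device_id      VARCHAR(100),
        reading_value  DECIMAL(18,4),
        event_payload  VARCHAR(4000),   -- original record kept in its raw form
        loaded_at      TIMESTAMP
    );

    -- Transformation happens afterward, when the data is queried.
    CREATE VIEW clean_sensor_events AS
    SELECT
        UPPER(TRIM(device_id)) AS device_id,
        reading_value,
        loaded_at
    FROM raw_sensor_events
    WHERE reading_value IS NOT NULL;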

Primary Use Case Scenarios of Data Lakes

Logical data lakes are a relatively new concept, and thus, readers can benefit from some knowledge of how logical data lakes can be used in real-life scenarios.

To conduct Experimental Analysis of Data:

  • Logical data lakes can play an essential role in the experimental analysis of data to establish its value. Since data lakes work on the ELT strategy, they lend agility and speed to such experiments.

To store and analyze IoT Data:

  • Logical data lakes can efficiently store Internet of Things (IoT) data. Data lakes are capable of storing both relational as well as non-relational data. Under logical data lakes, it is not mandatory to define the structure or schema of the data stored. Moreover, logical data lakes can run analytics on IoT data and come up with ways to enhance quality and reduce operational cost.

To improve Customer Interaction:

  • Logical data lakes can methodically combine CRM data with social media analytics to give businesses an understanding of customer behavior as well as customer churn and its various causes.

To create a Data Warehouse:

  • Logical data lakes contain raw data. Data warehouses, on the other hand, store structured and filtered data. Creating a data lake is the first step in the process of data warehouse creation. A data lake may also be used to augment a data warehouse.

To support reporting and analytical function:

  • Data lakes can also be used to support the reporting and analytical function in organizations. By storing maximum data in a single repository, logical data lakes make it easier to analyze all data to come up with relevant and valuable findings.

A logical data lake is a comparatively new area of study. However, it can be said with certainty that logical data lakes will revolutionize the traditional data theories.
