Row Store Vs Column Store

Download as docx, pdf, or txt
Download as docx, pdf, or txt
You are on page 1of 20

Row Store Vs Column Store

Overview of Row Data Storage and Column Data Storage:

Relational databases typically use row-based data storage. However Column-


based storage is more suitable for many business applications. SAP HANA
supports both row-based and column-based storage, and is particularly
optimized for column-based storage.

As shown in the figure below, a database table is conceptually a two-


dimensional structure composed of cells arranged in rows and columns.

Because computer memory is structured linearly, there are two options for
the sequences of cell values stored in contiguous memory locations:

Row Storage – It stores table records in a sequence of rows.

Column Storage – It stores table records in a sequence of columns i.e. the


entries of a column is stored in contiguous memory locations.
Traditional databases store data simply in rows. The HANA in-memory
database stores data in both rows and columns. It is this combination of both
storage approaches that produces the speed, flexibility and performance of
the HANA database.

Advantages of column-based tables:

Faster Data Access:


Only affected columns have to be read during the selection process of a
query. Any of the columns can serve as an index.

Better Compression:
Columnar data storage allows highly efficient compression because the
majority of the columns contain only few distinct values (compared to
number of rows).

Better parallel Processing:


In a column store, data is already vertically partitioned. This means that
operations on different columns can easily be processed in parallel. If
multiple columns need to be searched or aggregated, each of these
operations can be assigned to a different processor core.

Advantages and disadvantages of row-based tables:


Row based tables have advantages in the following circumstances:
 The application needs to only process a single record at one time
(many selects and/or updates of single records).
 The application typically needs to access a complete record (or row).
 Neither aggregations nor fast searching are required.
 The table has a small number of rows (e. g. configuration tables,
system tables).

Row based tables have disadvantages in case of analytic applications where


aggregation are used and faster search & processing are required. In row
based tables all data in a row has to be read even though the requirement
may be there to access data from a few columns.

Which type of tables should be preferred – Row-based or Column-


based?
In case of analytic applications, where aggregations are used and faster
search & processing are required, row-based storage are not good. In row
based tables all data stored in a row has to be read even though the
requirement may be there to access data from a few columns. Hence, these
queries on huge amounts of data would take lots of times.

In columnar tables, this information is stored physically next to each other,


that significantly increases the speed of certain data queries.
The following example shows the difference between the usage of column
and row storage, and positions them relative to row and column queries.
Column storage is most useful for OLAP queries (queries using any SQL
aggregate functions). Because, these queries get just a few attributes from
every data entry. But for traditional OLTP queries (queries not using any SQL
aggregate functions), it is more advantageous to store all attributes side-by-
side in row tables. HANA combines the benefits of both row- and column-
storage tables.

Conclusion:
To enable fast on-the-fly aggregations, ad-hoc reporting, and to benefit from
compression mechanisms it is recommended that transaction data is stored
in a column-based table.

The SAP HANA data-base allows joining row-based tables with column-based
tables. However, it is more efficient to join tables that are located in the
same row or column store. For example, master data that is frequently
joined with transaction data should also be stored in column-based tables.
Scheduling the SAP HANA Backups: 1651055
SAP HANA database backup and recovery: 1642148

Core Data Services in ABAP


Core Data services (CDS) are domain specific languages (DSL) and services for defining and
consuming semantically rich data models in SAP HANA. They are integral part of SAP HANA,
and can be leveraged in the ABAP stack.

Core Data Services are enhanced “view entities” in ABAP provides several advantages:.

 Pushes data intensive Calculations into database layer.


 Queries can be combined, ex: UNION, UNION ALL
 Nested Views are supported
 Association between view entities can be defined.
 Case expression supported
 Arithmetic, String and cast expression supported
 ABAP CDS supports INNER joins, OUTER joins, and RIGHT OUTER joins.
 Annotations can be used.
 Aggregations, Grouping, Filtering groups
 Support for AnyDB But no access to specific HANA features
 Supports about 90% of SQL features
 Supports ABAP and SAP HANA deployment.
Support for CDS in ABAP begins with ABAP 7.4 SP2. Especially with SAP NetWeaver ABAP
7.40 SP5 additional capabilities like CDS Associations are supported. CDS Associations can be
used to replace joins with simple path expression in queries.

The goal of ABAP CDS is to fully support the SQL-92 standard and beyond that to support
additional features likes associations, entities, create, read, update, and delete (CRUD) and OData
Services.
ABAP 7.4 SP05 support code to data paradigm. ABAP stack manages all entities and deploys all
artifacts from ABAP layer into the database layer. ABAP server act as master for developing,
managing and calling HANA views. On Activation of CDS views two objects are generated one
CDS Database View and other CDS entity. CDS views are defined for existing database tables,
any other views or CDS views in ABAP dictionary. Syntax for defining CDS view is similar to
SQL.

Simple CDS view:

The annotation @AbapCatalog.sqlViewName is mandatory, and specifies the name of DDL


source in the ABAP Dictionary and the database. On Activation of DDL source two objects are
created CDS database view and CDS entity. CDS database view can be accessed using SE11
ABAP dictionary. These representation are generated, updated, and deleted automatically when
activating DDL source.

In the above example simple CDS view, “ZCDS_DDL_SAMPLE2” is CDS ABAP Database
View and “ZCDS_SAMPLE2” is CDS entity. CDS entity can be identified using “DEFINE
VIEW” syntax and CDS ABAP Database view can be identified using
“@AbapCatalog.sqlViewName” annotation which on activation accessible from SE11- ABAP
Dictionary View. Same applies to all the examples shown below.

CDS View with column list and using column alias:


Here CDS view with column list with curly brackets and using column alias to rename a column
is shown. Column list are comma separated and alias names are given with keyword ‘AS’.

Case Expressions in CDS View:

In the new open SQL, ABAP CDS supports the CASE expression. It is also possible to nest cast
expression. An Alias name is required for the resulting column except for nested CASE. The
resulting column type is derived from expressions after THEN and ELSE clause.

String, Arithmetic and Cast expressions in CDS Views:

Arithmetic expressions using operators +, - , *, & or the unary -, or using built in functions such
as CEIL( ) and MOD( ) are used. String expressions, including string concatenation operator &&,
or built in functions such as SUBSTRING( ) are used. Cast expressions are also supported.
Joins in CDS Views:

ABAP CDS support INNER joins, LEFT OUTER joins and RIGHT outer joins. Join conditions
can be inserted before or after the select list. Brackets are used to implement complex join
operators. When joins are used asterix is not supported in SELECT statements.

Annotations in CDS Views:

Annotations specify additional information such as how records of the view should be buffered
by ABAP table buffer, whether view is client dependent or not. By default CDS views are client
dependent and there is no need to include client fields explicitly in the field list or in the join
conditions.

Annotations are marked with the @sign on top of DDL source code in the view. In above
example @AbapCatlog.Buffering.status states whether buffering is switched off or active,
@AbapCatlog.Buffering.type states type of buffering for CDS view, @Semantics is used for
reference currency and quantity fields.

Aggregation, Grouping, Filtering Groups in CDS Views:.


Through ABAP CDS calculation of aggregations, grouping records and filtering groups are
supported. GROUP BY and HAVING are used.

Simple and complex filtering conditions are supported. It is also possible to include WHERE
conditions in CDS view definitions.

Unions in CDS Views:

Using the UNION and UNION ALL keywords views are merged from two or more queries. Here
UNION implies DISTINCT, the combined result contains no duplicate rows, UNION ALL
implies the combined result has duplicate rows.

Selecting Data Using CDS Views:


The open SQL SELECT statement is used in ABAP to select data from a CDS view. ABAP
Dictionary View name, respectively SQL view name of the CDS view is used in select from
clause.

When using the new Open SQL syntax you can used CDS View name which is identified using
DEFINE VIEW keyword in the select from clause. Note that CDS view name cannot have the
same name as the SQL view/ ABAP Dictionary view and the CDS view name also does not have
same name of the ABAP Dictionary.
Attribute Views in SAP HANA Modeling are created on the top of Dimension
tables. They are used to join Dimension tables or other Attribute Views. You
can also copy a new Attribute View from already existing Attribute Views
inside other Packages but that doesn’t let you change the View Attributes.

Characteristics of Attribute View

 Attribute Views in HANA are used to join Dimension tables or other


Attribute Views.
 Attribute Views are used in Analytical and Calculation Views for
analysis to pass master data.
 They are similar to Characteristics in BM and contain master data.
 Attribute Views are used for performance optimization in large size
Dimension tables, you can limit the number of attributes in an Attribute View
which are further used for Reporting and analysis purpose.
 Attribute Views are used to model master data to give some context.

How to Create an Attribute View?

Choose the Package name under which you want to create an Attribute View.
Right Click on Package → Go to New → Attribute View

When you click on Attribute View, New Window will open. Enter Attribute
View name and description. From the drop down list, choose View Type and
sub type. In sub type, there are three types of Attribute views − Standard,
Time, and Derived.

Time subtype Attribute View is a special type of Attribute view that adds a
Time Dimension to Data Foundation. When you enter the Attribute name,
Type and Subtype and click on Finish, it will open three work panes −
 Scenario pane that has Data Foundation and Semantic Layer.
 Details Pane shows attribute of all tables added to Data Foundation
and joining between them.
 Output pane where we can add attributes from Detail pane to filter in
the report.
You can add Objects to Data Foundation, by clicking on ‘+’ sign written next
to Data Foundation. You can add multiple Dimension tables and Attribute
Views in the Scenario Pane and join them using a Primary Key.

When you click on Add Object in Data Foundation, you will get a search bar
from where you can add Dimension tables and Attribute views to Scenario
Pane. Once Tables or Attribute Views are added to Data Foundation, they can
be joined using a Primary Key in Details Pane as shown below.
Once joining is done, choose multiple attributes in details pane, right click
and Add to Output. All columns will be added to Output pane. Now Click on
Activate option and you will get a confirmation message in job log.

Now you can right click on the Attribute View and go for Data Preview.

Note − When a View is not activated, it has diamond mark on it. However,
once you activate it, that diamond disappears that confirms that View has
been activated successfully.

Once you click on Data Preview, it will show all the attributes that has been
added to Output pane under Available Objects.

These Objects can be added to Labels and Value axis by right click and
adding or by dragging the objects as shown below −
Analytic View is in the form of Star schema, wherein we join one Fact table to
multiple Dimension tables. Analytic views use real power of SAP HANA to
perform complex calculations and aggregate functions by joining tables in
form of star schema and by executing Star schema queries.

Characteristics of Analytic View

Following are the properties of SAP HANA Analytic View −

 Analytic Views are used to perform complex calculations and Aggregate


functions like Sum, Count, Min, Max, Etc.
 Analytic Views are designed to run Start schema queries.
 Each Analytic View has one Fact table surrounded by multiple
dimension tables. Fact table contains primary key for each Dim table and
measures.
 Analytic Views are similar to Info Objects and Info sets of SAP BW.

How to Create an Analytic View?

Choose the Package name under which you want to create an Analytic View.
Right Click on Package → Go to New → Analytic View. When you click on an
Analytic View, New Window will open. Enter View name and Description and
from drop down list choose View Type and Finish.
When you click Finish, you can see an Analytic View with Data Foundation
and Star Join option.

Click on Data Foundation to add Dimension and Fact tables. Click on Star Join
to add Attribute Views.

Add Dim and Fact tables to Data Foundation using “+” sign. In the example
given below, 3 dim tables have been added: DIM_CUSTOMER,
DIM_PRODUCT, DIM_REGION and 1 Fact table FCT_SALES to Details Pane.
Joining Dim table to Fact table using Primary Keys stored in Fact table.

Select Attributes from Dim and Fact table to add to Output pane as shown in
snapshot shown above. Now change the data type of Facts, from fact table to
measures.

Click on Semantic layer, choose facts and click on measures sign as shown
below to change datatype to measures and Activate the View.
Once you activate view and click on Data Preview, all attributes and
measures will be added under the list of Available objects. Add Attributes to
Labels Axis and Measure to Value axis for analysis purpose.

There is an option to choose different types of chart and graphs.

Calculation Views are used to consume other Analytic, Attribute and other Calculation views
and base column tables. These are used to perform complex calculations, which are not
possible with other type of Views.

Characteristics of Calculation View

Below given are few characteristics of Calculation Views −


 Calculation Views are used to consume Analytic, Attribute and other Calculation
Views.
 They are used to perform complex calculations, which are not possible with other
Views.
 There are two ways to create Calculation Views- SQL Editor or Graphical Editor.
 Built-in Union, Join, Projection & Aggregation nodes.

How to create a Calculation View?

Choose the Package name under which you want to create a Calculation View. Right Click
on Package → Go to New → Calculation View. When you click on Calculation View, New
Window will open.
Enter View name, Description and choose view type as Calculation View, Subtype Standard
or Time (this is special kind of View which adds time dimension). You can use two types of
Calculation View − Graphical and SQL Script.

Graphical Calculation Views

It has default nodes like aggregation, Projection, Join and Union. It is used to consume other
Attribute, Analytic and other Calculation views.

SQL Script based Calculation Views

It is written in SQL scripts that are built on SQL commands or HANA defined functions.

Data Category

Cube, in this default node, is Aggregation. You can choose Star join with Cube dimension.

Dimension, in this default node is Projection.


Calculation View with Star Join

It does not allow base column tables, Attribute Views or Analytic views to add at data
foundation. All Dimension tables must be changed to Dimension Calculation views to use in
Star Join. All Fact tables can be added and can use default nodes in Calculation View.

Example

The following example shows how we can use Calculation View with Star join −

You have four tables, two Dim tables, and two Fact tables. You have to find list of all
employees with their Joining date, Emp Name, empId, Salary and Bonus.

Copy and paste the below script in SQL editor and execute.

Dim Tables − Empdim and Empdate

Create column table Empdim (empId nvarchar(3),Empname nvarchar(100));


Insert into Empdim values('AA1','John');
Insert into Empdim values('BB1','Anand');
Insert into Empdim values('CC1','Jason');

Create column table Empdate (caldate date, CALMONTH nvarchar(4) ,CALYEAR


nvarchar(4));
Insert into Empdate values('20100101','04','2010');
Insert into Empdate values('20110101','05','2011');
Insert into Empdate values('20120101','06','2012');

Fact Tables − Empfact1, Empfact2


Create column table Empfact1 (empId nvarchar(3), Empdate date, Sal integer );
Insert into Empfact1 values('AA1','20100101',5000);
Insert into Empfact1 values('BB1','20110101',10000);
Insert into Empfact1 values('CC1','20120101',12000);

Create column table Empfact2 (empId nvarchar(3), deptName nvarchar(20), Bonus integer );
Insert into Empfact2 values ('AA1','SAP', 2000);
Insert into Empfact2 values ('BB1','Oracle', 2500);
Insert into Empfact2 values ('CC1','JAVA', 1500);

Now we have to implement Calculation View with Star Join. First change both Dim tables to
Dimension Calculation View.

Create a Calculation View with Star Join. In Graphical pane, add 2 Projections for 2 Fact
tables. Add both fact tables to both Projections and add attributes of these Projections to
Output pane.

Add a join from default node and join both the fact tables. Add parameters of Fact Join to
output pane.
In Star Join, add both- Dimension Calculation views and add Fact Join to Star Join as shown
below. Choose parameters in Output pane and active the View.

SAP HANA Calculation View − Star Join

Once view is activated successfully, right click on view name and click on Data Preview.
Add attributes and measures to values and labels axis and do the analysis.

Benefits of using Star Join

It simplifies the design process. You need not to create Analytical views and Attribute Views
and directly Fact tables can be used as Projections.

3NF is possible with Star Join.

Calculation View without Star Join

Create 2 Attribute Views on 2 Dim tables-Add output and activate both the views.
Create 2 Analytical Views on Fact Tables → Add both Attribute views and Fact1/Fact2 at
Data Foundation in Analytic view.

Now Create a Calculation View → Dimension (Projection). Create Projections of both


Analytical Views and Join them. Add attributes of this Join to output pane. Now Join to
Projection and add output again.

Activate the view successful and go to Data preview for analysis.

You might also like