
PUBLIC

openSAP
Introduction to SAP Datasphere

Week 1 Unit 1

00:00:06 Hello, and welcome to our openSAP course about SAP Datasphere.
00:00:11 My name is Klaus-Peter Sauer and I work as a Senior Director in Product Management for
SAP Datasphere.
00:00:17 I'm really excited to guide you through the first few units of this course.
00:00:26 So what is the course about and what can you expect? This course is an introduction course to
SAP Datasphere.
00:00:34 You will learn how to leverage the features and functions of Datasphere,
00:00:38 starting with a simple analytical requirement. During the course, we will gradually extend the
complexity
00:00:45 of the practical example, as we add additional tasks
00:00:49 and the requirements that come in. You can also use hands-on exercises
00:00:56 to get your own experience. So the focus of this course is on data modeling,
00:01:02 but we will also touch many other aspects of the solution over the three weeks of the course.
00:01:09 At the end, you will have a very good understanding of how to use the solution for real use
cases
00:01:15 to get your job done. So as mentioned, in this three-week course,
00:01:22 we introduce you to the system in the first week so you can start with your first data models.
00:01:29 We will show you how to extend them as the requirements change and grow.

00:01:37 You'll also learn how to integrate data from remote sources, how to use the data flow and other
data integration features
00:01:45 like the data integration monitor, also how to share data and apply data access controls.
00:01:54 In week two, you'll learn more about advanced modeling topics,
00:01:57 so using the analytic model and the Business Builder. We'll also introduce you to the Data
Marketplace
00:02:05 and the intelligent lookup functionality. More topics in week two include the repository,
00:02:12 the data impact and lineage analysis, as well as the new catalog component.
00:02:18 We are rounding it off with the command line interface, where you get to know how you can
use it
00:02:23 and what kind of features and functions are there for you. In week three, you will learn about
administration
00:02:32 and configuration topics, but also more options on the data integration side.
00:02:39 We will show you also integration into analytics and planning, as well as the SAP BW bridge,
00:02:46 and how you can leverage your existing BW system in hybrid deployments.
00:02:54 So let's have a look at your task during the course. Your task is simple - you start building the
analytics
00:03:01 for the Best Run Bikes company. So you have tenants for SAP Datasphere,
00:03:07 as well as for SAP Analytics Cloud to create your data models, to load and manage the data,
00:03:14 as well as building analytical reports on top. The focus of this course is, of course, on SAP
Datasphere.
00:03:23 There are different courses available for SAP Analytics Cloud,
00:03:27 in case you're interested to learn more about that product. So in this unit, you will get an
overview of the scenario,
00:03:35 the system setup, about your own dedicated space
00:03:39 where you actually work with the data. So before we dive into the tasks,
00:03:47 you need to get access to a system first. This is mandatory if you want to do the hands-on
exercises
00:03:54 of this course. So if you do not want to do any hands-on exercises,
00:04:01 you can basically skip this task as well. So let me show you this in the browser, how you get
there.
00:04:11 So in your browser, you navigate to sap.com/datasphere, then you get to this page,
00:04:20 and here you simply click the Experience SAP Datasphere button. Now, a registration form
comes up
00:04:30 where you basically fill in your details and take it from here.
00:04:36 After that, you will get an email to activate your account after successful registration,
00:04:43 and your login details to the system will be emailed to you. Then you have 30-day access
00:04:49 to a guided experience trial system. If you already have an existing SAP account,
00:04:56 you can also use the button for logging in here on the right-hand side,
00:05:00 and then you don't need to fill in the form and just proceed using your existing account.
00:05:08 In case you already have access to a guided experience trial system
00:05:12 and your system access has expired, you can simply get a new one with the same process.
00:05:21 So when you actually log in to the system, you get to the Home screen.
00:05:25 So the main menu on the left provides you direct access to all essential functions of
Datasphere,
00:05:32 but let me show you that in the system. So in the system, the first thing you find
00:05:40 on the left-hand side is the menu bar and the menu entries. At the bottom, you find the more
administrative topics,
00:05:51 such as Space Management, where you define and manage the different spaces in the
system.
00:05:58 So these entries in the menu may differ, depending on the authorizations you might have in
the system,
00:06:05 so I have administrator rights, which is why I can see everything.
00:06:09 That's why I also see the System Monitor, where you actually monitor the overall
system.
00:06:17 The Content Network, the next item here, is where you can deploy business content from SAP,

00:06:24 but also from partners. And also, we offer sample packages,
00:06:29 which can be deployed. In the Security area, you can manage your users,
00:06:36 so create new users, delete users, assign roles, and also have access to the activities log
00:06:43 of the different users. The Transport allows you to export models,
00:06:52 which can be imported then in other Datasphere tenants. The Data Sharing Cockpit is part of
the Data Marketplace
00:07:03 and the place for data providers to define their profiles, as well as the data products, licenses,
contexts, and so on.
00:07:17 The System entry brings you to the Administration and Configuration areas.
00:07:27 Above the Administration section, you find the different applications.

00:07:32 So the applications for data modeling with the Data Builder, as well as for business modeling
for the Business Builder.
00:07:42 The Data Marketplace in this part of the menu is on the consumption side.
00:07:48 So this is where you find the landing page, where you can browse the different data products,

00:07:55 manage the licenses you might have acquired, and so on. Data Access Controls basically
define
00:08:09 row-level security, giving your users access only to certain rows of the data, which can be
managed and set up
00:08:18 with the data access controls. The Data Integration Monitor can be used
00:08:23 to monitor your data loads, as well as your schedules. And finally, here, the Connections area
that allows you
00:08:32 to set up connections, also manage these source system connections
00:08:36 of your different spaces. On top of the applications,
00:08:42 you find the area for the metadata and that's where you find the repository,
00:08:49 as well as our new cataloging solution. The middle of the Home screen
00:08:57 contains the most recent news and blogs, which you also find on the Communities page,
00:09:03 as I already mentioned. You also have some quick links here in the middle,
00:09:09 and the recent files are shown here. On the top bar, there are also a few menu items.
00:09:20 The first one you see here is the notifications. This is empty as we haven't done a lot here
00:09:26 in this tenant so far, but you will see during the course
00:09:30 that the Successful or Failed notices for your deployments will be shown here
00:09:36 in the notification area. You can also give us feedback about the solution,
00:09:42 and the next icon here is more towards our support colleagues,
00:09:47 where an administrator of the system can create a support user for our colleagues
00:09:52 or download a certain log. An important one is the question mark icon here
00:09:59 because this is getting you to the in-app help, where you find some question marks in the area,

00:10:07 giving you more information about the different sections here.
00:10:11 And we also have sometimes a video embedded in the help, which gets you more information
00:10:17 about the different features and functions. And we also have a What's New section available
here,
00:10:25 where you basically get information about the latest and greatest additions we have
00:10:31 for the different versions of the system. Since we are on a biweekly release cycle,
00:10:36 this will frequently change, so every two weeks you will find a new entry
00:10:41 in this area here. So let's close the help bar.
00:10:47 The next button, with this icon here, is basically getting you to your profile settings,
00:10:55 where you can have the settings about language and those kind of things,
00:10:59 and also, the Sign Out button if you want to log out of the system.
00:11:04 And last but not least, the rightmost icon here on top gets you to the so-called product
switch.
00:11:13 So we are very tightly integrated with our solution, with SAP Datasphere and SAP Analytics
Cloud,
00:11:20 so that's why we have this kind of embedded application switch in here,
00:11:26 and if you click on the Analytics side of this, you will get to the connected SAP Analytics Cloud
tenant,
00:11:33 but we will also show you that later in one of the first exercises.

00:11:40 So in SAP Datasphere, everything starts with a space. Spaces are virtual work environments
for your artifacts.
00:11:49 It means all your data models live in a space, and depending on your access rights
00:11:54 to one or multiple spaces, you can access the models or not.
00:12:00 Of course, you can also share objects using Cross Space Sharing, and you will also learn
more about that in a later unit.
00:12:10 So spaces help you to isolate objects and assign resources, like space quota and workload
settings and others.
00:12:20 Spaces also hold system connections, as well as space-wide time dimension settings
00:12:26 or settings for currency conversion. So a system can hold many spaces,
00:12:34 for example, one for Finance, for Sales, or for HR data. Project-specific or line-of-business-specific
spaces are also possible.
00:12:44 This really depends on your needs. So each can have different connections,
00:12:50 but let's have a look at the system, how to do that. So when you get to the Space overview,
00:12:59 the different spaces will be shown to you, like in this example here.
00:13:03 So let's dig into the details of a particular space and take our openSAP example.
00:13:13 So when you get into the Space Management section on top, the first thing you find is
the description
00:13:21 and the technical name of the space, whether it was deployed or not, and who actually set it up.
00:13:28 You can also, if you're an administrator, enable the space quota - so that's basically
00:13:34 where you assign disk storage to a particular space, and also the compute part.
00:13:44 And let's disable this for this space. And in the next area, you find some workload settings.
00:13:53 You can use custom or default settings; we are using the default settings here.
00:14:00 Then in the Members section, you can basically add and remove colleagues
00:14:06 you want to have access to this particular space, so I already added a few colleagues here for
my area.
00:14:15 Then in the Database Access area, you can create database users
00:14:21 and create a so-called Open SQL schema, but we will let you know about the Open SQL
schema
00:14:28 in one of the later units. The Connections area, that's where you get
00:14:35 to the Connections screen. We will also show you those details
00:14:40 in one of the later units, where you can basically define or manage your connections to
different source systems.
00:14:49 The Time Data section, that's where you create the time dimension.
00:14:54 Time dimension is often used for analytics and reporting purposes because you want to have
the data structured
00:15:01 in years, months, quarters, or weeks. And with this time dimension,
00:15:07 we generate a generic time dimension, but we will also show you that
00:15:12 in one of the next units already. At the bottom here, you find the Auditing settings,
00:15:18 so you can enable or disable audit settings for particular spaces, and then the different audit
logs will be recorded
00:15:28 for read or change operations, depending on your settings. As the course goes on, you might
wonder
00:15:37 where to find more information outside of the course material.
00:15:42 So this link collection here gets you to the SAP Community page for Datasphere,
00:15:46 where you find more information about getting started, best practices, business content, the
BW Bridge, and more.

00:15:56 You will also find there the latest blogs about solutions. Or compose your own blogs there, if
you're interested.
00:16:04 The online documentation is also very helpful to get more details about specific features and
functions.
00:16:13 There are also developer tutorials available, as well as the learning journey for SAP
Datasphere.
00:16:23 So to sum up, now you have a good overview of the course and its structure.
00:16:29 I explained the scenario briefly, and during the other units, you will expand the tasks.
00:16:36 I have shown you how to register to get access to a guided experience trial system
00:16:41 to get really hands-on for the exercises of the course. I also explained what you find on the
Home screen,
00:16:49 what the Space Management is all about. Finally, I showed you some good resources for
information,
00:16:57 where you'll find more helpful information during the course. So that's it for unit one.
00:17:03 Thank you, and good luck with the quiz.

Week 1 Unit 2

00:00:05 Hello and welcome to unit two of our openSAP course about SAP Datasphere.
00:00:12 My name is Klaus-Peter Sauer, and I work as a senior director
00:00:15 in product management for SAP Datasphere. In this unit,
00:00:19 you will learn how to create your first data model and a simple report.
00:00:27 So now that you have access to a system and while IT is setting up the system connectivity
00:00:33 to the backend systems, you can already start to create the first report
00:00:38 based on the sales order sample file, which you were provided with by a colleague.
00:00:44 So the report should show the revenue by sales organization. To achieve that, you need to
upload the file to the system,
00:00:54 build a simple model, and the data model is then consumed
00:01:00 by SAP Analytics Cloud in a story. I will show you how to do that.
00:01:09 First, let me give you an overview of the Data Builder before we actually upload a file.
00:01:15 So the Data Builder is the central place for data modeling in SAP Datasphere.
00:01:21 You will find various editors there. The Table Editor to create and maintain local tables,
00:01:29 edit remote tables, define semantic usage, manage columns, fields, attributes, measures, and
hierarchies, and more.
00:01:40 The Graphical View Editor allows you to combine local and remote datasets into one data
model,
00:01:46 including filters, associations, joins, unions, and other operators you might want to use.
00:01:55 In the SQL View Editor, you can pretty much do the same using SQL language.
00:02:02 The Entity Relationship Editor allows you to create entity relationship models,
00:02:07 so arranging your local tables, and views, and their relations to each other.
00:02:15 The Analytic Model Editor is used to define multi-dimensional models
00:02:20 for analytical purposes. So you can basically model cube-like structures
00:02:25 with facts and dimensions there. The Data Flow Editor
00:02:32 lets you model data flows into SAP Datasphere, including complex transformations.
00:02:41 The Intelligent Lookup Editor allows you to merge data from two entities, even if there is no
joining column,
00:02:50 which actually brings them together. The Replication Flow Editor is relevant
00:02:57 if you want to copy multiple datasets from the same source to the same target
00:03:02 in a fast and easy way and do not require complex transformations.
00:03:08 You can group multiple tasks into a task chain and then run them manually or periodically
00:03:15 through a schedule which you can create. So you will learn more about all of these different
editors
00:03:21 as the course goes on. So in addition to the editors,
00:03:29 there are more functions available in the Data Builder overview.
00:03:33 You can import CSV files, which we will do in a minute, but also import objects
00:03:39 based on CSN or JSON metadata files, import entities from systems like S/4HANA Cloud
00:03:46 or the SAP BW bridge. The remote table import allows the creation of remote tables
00:03:53 from connected systems. The overview is also the place
00:03:58 where you can execute mass operations, meaning sharing, deploying, or
00:04:04 deleting multiple models. You can also look at the impact
00:04:08 and lineage analysis for artifacts from the Data Builder overview.
00:04:16 So to import your file and to build your first data model,

00:04:22 you need to follow a few steps, starting from the Data Builder overview.
00:04:28 But let me show you that in the system, how you actually get there.
00:04:33 In the system, you navigate to the Data Builder and use the CSV upload function.
00:04:42 So you select the source file from your computer. In this case, the SalesOrders.csv.
00:04:51 You use the default settings as you see them on the screen and click Upload.
00:04:58 We provide the file with the course materials, of course. This takes a while and gets you to the
screen
00:05:06 where you could do some further transformations on the dataset.
00:05:10 We keep everything as is and click Deploy. So now you give it a name,
00:05:18 like SalesOrders CSV, and deploy it.
00:05:29 So deployment means we are physically creating the table on the database.
00:05:34 After successful deployment of your table, you see the table in the Data Builder overview.
00:05:42 Let's click on it. And so we get to the Table Editor.
00:05:50 There, we need to set SALESORDERID as a key field,
00:05:58 and then deploy the table. So we can now also look at the data in the preview
00:06:08 after the table is deployed. So that shows up on the bottom of your page,
00:06:14 takes a second, and then you can directly preview the data we have uploaded.
00:06:24 So the Table Editor allows you to define tables and their semantics in SAP Datasphere.
00:06:31 There are different usage types available, for example, for relational datasets, meaning flat
tables,
00:06:37 which is also the default setting we just used. There are other specific types
00:06:43 for multi-language text descriptions or hierarchy datasets. Analytical datasets offer fact data
00:06:52 where you can associate dimensional data to create multi-dimensional models.
00:07:00 In the Table Editor, you can define or modify primary keys. We just did that in the example.
00:07:07 Define compound keys, set default values, decide on the column visibility, associations, field
names,
00:07:16 data types, descriptions, time dependency, and also the semantic settings.
00:07:24 You can also delete and upload new data from files, as well as preview the data and share it
with other spaces.
00:07:33 You will learn more about it as the course goes on. So now that the file was imported
00:07:42 and the local table is deployed, you can use it for further modeling.
00:07:48 Let's create a simple view to be consumed in SAP Analytics Cloud for a simple report.
00:07:55 And also here, let me show you how you do that. In the Data Builder overview,
00:08:01 we click on the create New Graphical View button. An empty canvas now shows up,
00:08:09 with a pane on the left for your repository and source artifacts.
00:08:15 We open the tables and see our just uploaded SalesOrders CSV table.
00:08:24 Let's drag it to the canvas. And now a Properties section shows up
00:08:31 to the right of your screen. Let's click on the output node, the view.
00:08:40 Give it a name, like Sales Orders View. As we want to use it for reporting,
00:08:49 let's select the semantic usage for an analytical dataset. You also need to expose the view for
consumption
00:08:58 so that it's visible to external tooling. Now we get an error message
00:09:03 that we haven't defined any measures yet. So let's take them here, the GROSSAMOUNT,
00:09:11 the NETAMOUNT, the TAXAMOUNT, we highlight them
00:09:15 and just simply drag and drop them to the Measures section, and the error is gone.
00:09:23 So now let's save and deploy this view. So now it's deployed and we can continue.

00:09:46 So looking at the Graphical View Editor in general, it offers the same semantic usage types as
for the tables.
00:09:54 We have just defined an analytical dataset, where the distinction
00:09:59 between measures and attributes is required. So the editor also allows for complex modeling
00:10:07 using different operators like joins, unions, aggregations, projections, currency conversions,
and others.
00:10:17 You can also rename and remove columns, add calculations, add filters, associations, and
more.
00:10:27 You can also model hierarchies with parent-child or level-based relationships.
00:10:34 Modeling multi-language text descriptions is another option that may be useful with
international businesses
00:10:41 that require analytics in local languages, like German, Spanish, or others.
00:10:48 Depending on the login language of a user, the text fields will then show
00:10:53 the respective local language translations if, of course, maintained in the data.
00:11:01 You can apply data access controls, input parameters, persisted views,
00:11:07 and also share these models with other spaces. The data preview is also possible here at
each node,
00:11:16 so you can check if your operator is working correctly. So let's create a simple report to check
on your data model.
00:11:29 We use the application switch on the top right-hand side to switch to SAP Analytics Cloud,
00:11:35 but let me show you that in the system. Okay, now you can use the application switcher
00:11:42 to get to SAP Analytics Cloud. In Analytics Cloud, we go to Stories,
00:11:52 create a new story using the optimized language. Let's take a chart.
00:12:04 So now, we are asked to connect to our data model in Datasphere.
00:12:09 So let's use the SAPDWC connection, select the space and the Sales_Order_View
00:12:17 which we have just created. So now in the chart, we have to select a measure.
00:12:29 We just defined three, the GROSSAMOUNT, NETAMOUNT, and TAXAMOUNT.
00:12:33 So let's take the NETAMOUNT. And we want a report based on the sales organizations.
00:12:41 So we select that. And now we basically created our first little report.
00:12:49 Let's save it. Give it the name, like My First Story.
00:13:06 Save it. And we're actually done for the first part of our exercise.
00:13:13 So in general, there are different options to consume data from SAP Datasphere.
00:13:19 SAP Analytics Cloud is offering a direct live connection. That is what we just saw in the demo.

00:13:26 You can connect multiple Analytics Cloud systems to multiple Datasphere tenants.
00:13:35 Another option is the Microsoft Office Integration, with the add-in for Office 365
00:13:41 for online or desktop versions. The older SAP Analysis for Office is also supported.
00:13:51 Other tools can use SQL or OData interfaces to connect to data models which are
exposed for consumption.
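As a rough sketch of the SQL consumption option, a query from an external SQL client could look like the following. The space schema "OPENSAP", the view name, and the column names are only placeholders for illustration and assume a database user with read access to the exposed view; the actual names depend on your space and model.

    -- Hypothetical example of querying a view exposed for consumption via the SQL interface
    SELECT "SALESORGANIZATION",
           SUM("NETAMOUNT") AS "NET_REVENUE"
      FROM "OPENSAP"."Sales_Orders_View"
     GROUP BY "SALESORGANIZATION"
     ORDER BY "NET_REVENUE" DESC;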
00:14:04 So in this unit, you learned about the Data Builder, about the different editors and options
00:14:11 on the overview screen. You also learned to upload a flat file,
00:14:17 use the Table Editor with some of its features. We also introduced the Graphical View Editor
00:14:23 and some of its basic features. And then you learned how to create a simple story
00:14:29 in SAP Analytics Cloud, and also how to get there using the application switcher.
00:14:36 So that's it for unit two. Thank you and good luck with the quiz.

Week 1 Unit 3

00:00:06 Hello and welcome to unit three of our openSAP course about SAP Datasphere.
00:00:12 My name is Klaus-Peter Sauer and I work as a senior director in product management for SAP
Datasphere.
00:00:18 In this unit, you will learn how to connect to a remote source
00:00:22 and also how to add a time dimension. So in unit two you created your first data model and a
simple story.
00:00:33 In the meantime, your IT colleague has set up the system connection to the backend system.
00:00:39 So you don't need to start from scratch. You can simply replace your local table with a remote
table,
00:00:45 and I'll show you how to do that. After the quick success with your first story,
00:00:53 you notice that the data is summed up over multiple years. If you simply add the date field
00:01:00 it will show the data for each day with booked values. What you really want to achieve
00:01:06 is a more generic time dimension so that the users can view the data on yearly, monthly, or
quarterly levels,
00:01:15 and this unit will show you how to achieve that. So replacing a source table or view
00:01:26 with another one is a very useful function, especially in cases where you start with sample data
first
00:01:33 to get your model, your calculations right, before you use the actual production data.
00:01:40 But first, let me show you in the system how you actually do that.
00:01:45 So let's go to the system connections and check if the HANA connection provided by IT
00:01:54 is actually working, by using the Validate button here. So this now checks
00:01:59 if the different connection options are there and the toaster message on the bottom just
showed us,
00:02:05 okay, this connection is ready for all three different data integration options.
00:02:13 Now move back to the data builder and go to our existing view.
00:02:23 So now, clicking on the view, the system is doing some simple checks
00:02:27 and checking if the underlying structures have changed or were updated by someone else.
00:02:34 That has not happened here so we directly have everything ready.
00:02:39 So instead of the Repository tab, we now move to the Source tab, where we find the different
source system connections
00:02:48 we have in this space and we see the HANA connection,
00:02:52 which we just checked in the connections overview. And now we need to drill down to our area
for the demo
00:03:05 where we find the different tables we have accessible in that remote system. So we basically
take now the sales orders,
00:03:19 drag them onto the canvas, and you now see three options to the right,
00:03:25 union, join, or replace. In our case, we want to replace the flat file with this remote source
00:03:32 so we just hit it here on the Replace button. We click on Import and Deploy.
00:03:43 So same process here, that the table is actually finally deployed on the database.
00:03:50 It asks us about the mapping, since the structures of both of these tables are identical
00:03:56 we can simply hit the Replace button, and now this remote table is actually deployed.
00:04:07 So for the view, also no change here. We can simply just deploy the simple update we had
00:04:20 and then we are good to go and we could refresh our report. So now let's have a look at the
connections in general.

00:04:31 The system offers many options for system connections. We have just used an SAP HANA
connection in our example.
00:04:41 In the Connections area you can set up new connections, edit, delete, and validate existing
connections.
00:04:49 You can also pause and restart connections that are used for real-time replications.
00:04:56 You can also use the open connectors and connect to supported third-party data sources.
00:05:05 So here you see the currently supported connection tiles. You see that we offer a lot of SAP
sources
00:05:13 but also generic connectors to databases, specific hyperscaler systems, cloud storages,
00:05:20 partner solutions, and more. So replacing the local table
00:05:28 with a remote table means that we virtually access data that is not persisted in SAP
Datasphere.
00:05:36 This is called virtual access or data federation. So virtual tables behave similarly to local tables

00:05:44 but the data is only accessed when you query the dataset, like in a preview or in your SAP
Analytics Cloud story.
00:05:54 The data is transferred through the network each time a query is executed. So this also affects
the source system,
00:06:02 where the data is pulled on every access. That is why the data transfer can be restricted
00:06:11 using central filters and selected columns only. So in addition, you can switch seamlessly
00:06:21 between remote access and data replication or snapshots,
00:06:27 without the need to actually change your data model. You can partition these data loads
00:06:33 and schedule the snapshots regularly. As mentioned before, the real-time replication can be
paused and resumed,
00:06:42 and, of course, stopped or canceled. So going into the space management, as I showed you
earlier,
00:06:53 there is an area about connections, and if we click on that button here,
00:06:59 we get to the same place where we've just been earlier with directly using the menu entry to
check
00:07:07 if our HANA connection is valid. So looking at the screen, I can also create new connections.
00:07:18 And on each of those tiles you always find an information icon,
00:07:25 which shows us what kind of data integration options are supported with this particular
connection tile.
00:07:34 So let's have a look at the HANA connection. This supports data flows, remote tables,
00:07:40 and replication flows. And if we want to create a new one
00:07:44 to set a new connection up, we have basically the option to select
00:07:50 whether it's a HANA Cloud or an on-premise-based system. Depending on what kind of source
system it is,
00:07:58 I need to give different connection details, usually host and port name and user credentials,
00:08:06 and then, if I have an on-premise system as in this case, I need a middleware component to pick
from,
00:08:14 so DP Agent or Cloud Connector. And if I pick the DP Agent, I have to pick one.
00:08:23 And then all of these different connection options are actually possible. I don't want to create a
new one
00:08:31 since IT has already done that for us. This was just to show you how simple it is
00:08:37 to set up a new connection if, of course, you have the connection details to the source systems,

00:08:43 and there are different ones that you can pick from. So we exchanged the local table with a
remote table earlier.

00:08:53 You can also refresh your story in SAP Analytics Cloud and see that it still works,
00:09:00 and now shows the data based on the remote connection. The story shows the data for all
years
00:09:08 but you want to show it split into the different years. So as mentioned in the introduction,

00:09:15 simply adding the date field to the graphic would not do the job,
00:09:20 as it would show all dates where you have booked values. So this is where the time dimension
now comes in.
00:09:29 Let me also show you that in the system, how to set up the time dimension
00:09:33 and how to use it in your data model. So let's go into the space management again
00:09:39 and navigate down to the time data area. Let's create these time tables,
00:09:47 and you can select from which year to which year you want to create them. So let's use year 2000
until 2050.
00:09:57 Hit the Create button. And now the actual time tables are generated for you.
00:10:07 So from those year values you can pick whatever fits your needs here,
00:10:14 and now the time tables are created. If we go to the Data Builder,
00:10:20 we actually see a lot more tables and views being created for you, which you can now use in
your different data models.
00:10:30 So let's open our sales order view. Again, the system is checking if there are any changes.
00:10:43 And on the right-hand side, in the Properties pane, we scroll down to the Associations area.
00:10:51 We pick Association. Now all the possible objects
00:10:55 for associations are being shown here. Let's limit this to dimensions,
00:11:02 and we see, okay, I could associate the day, month, quarter or year dimension.
00:11:08 Let's use the day so we have the most granular level. Now an error message is thrown
00:11:19 because we have an association but there's no mapping between the different tables.
00:11:25 So that's what we need to do here. So we basically take the created at date,
00:11:34 drag it to the date field so that we have created a join criteria between the two.
00:11:42 And that's it, let's move back. Deploy the table,
00:11:54 or the view, sorry. And on top, here on the right-hand side, you can always check
00:12:04 what kind of status your view has at the moment. It's not deployed.
00:12:11 Now the deployment process is done, shown by a little toaster message,
00:12:16 shown by the Deployed status here, and also you can look it up in the notifications area
00:12:24 that this view was successfully deployed a few seconds ago. So now we updated the
model, but we also need to adjust
00:12:34 the report, where we want to use the time dimension.
00:12:38 So let's move over to SAP Analytics Cloud, where we open our first simple story we have
created.
00:12:50 We want to edit the story, select our chart here,
00:13:04 and now we can simply add a time dimension, the created at date, which we used.
00:13:10 You also see that there's a little hierarchy icon, which means the time hierarchy is already
active.
00:13:17 So let's select it and select here on the hierarchy,
00:13:23 level two, which gives us the year. And since the horizontal view is not so nice,
00:13:32 let's use the vertical view and move around the date and the sales organization
00:13:42 so now we have everything grouped by year and sales organization, and we can save our
story,
00:13:51 and we have achieved our requirement. So the time dimension in general provides a common
setup

00:14:00 of time data for your space. So you don't need to create time data
00:14:07 for each and every data model that you are building. You can use it in multiple models,
00:14:13 use the predefined hierarchies, and also SAP Analytics Cloud understands these
00:14:19 to drill down into the time dimension. Associations, which we also used with the time
dimension here,
00:14:30 are also a powerful feature of SAP Datasphere. You can create them in various places like the
view editor,
00:14:38 which we just used, but also in the table or the entity relationship editor.
00:14:46 An association basically creates a join that is only executed at runtime,
00:14:52 so, when the association is actually queried. You can associate tables and views as dimensions,

00:15:01 texts, hierarchies, and others. So if we look at this unit, what we have learned
00:15:11 is about the connections and remote tables. I showed you how to replace a local table
00:15:17 with a remote table while your report and the model stay stable.
00:15:24 You also learned about the time dimension and how you can use it in your models.
00:15:30 And last but not least, you also learned about the concept of associations
00:15:35 and how to use them in your data models. So that's it for unit three.
00:15:41 So thank you, and good luck with the quiz.

Week 1 Unit 4

00:00:06 Hello and welcome to unit four of our openSAP course about SAP Datasphere.
00:00:12 My name is Klaus-Peter Sauer and I work as a senior director in product management for SAP
Datasphere.
00:00:18 In this unit, you will learn how to enhance your model with joins and dimensions.
00:00:27 So your task in this unit is to enhance your simple story with the top five sales partners per
region and year.
00:00:36 Looking at the data model, the sales partners show only the ID,
00:00:42 but you want the partner company name to be visible in the story.
00:00:47 Also, you want to show the information about the best-selling products.
00:00:52 To achieve this, you need to learn about dimension tables, joins, and other important features.

00:01:03 So first you need to add the table for the business partners to the system,
00:01:08 but let me show you in the system how that works. So in the Data Builder,
00:01:15 navigate to the Import Remote Tables function. So that's what you get here.
00:01:22 Select the existing HANA connection, and we can click on the next step.
00:01:31 Open our schema here and select the BusinessPartners table.
00:01:39 Click on the next step. Here we could correct like the name and the technical name,
00:01:46 but we leave it as is and directly hit the Import and Deploy button.
00:01:55 Then we close the dialog, and now you see the BusinessPartners remote table imported
00:02:03 in your Data Builder overview. Now that the table is deployed,
00:02:11 let's go into the table definition. It takes a second here.
00:02:18 And what you see in addition to your local tables is this section about remote,
00:02:26 where you find the connection name, the remote table name, and the access method,
00:02:32 which is in this case, of course, Remote. So what we need to do in order to prepare the
dataset
00:02:40 is change the semantic usage here to Dimension. Now we get different settings
in the attributes table.
00:02:52 So we can now say that the company name, for example, is a text field.
00:02:57 There are multiple semantic types available, and for the Partner ID,
we give it the label column "Company name". So then we deploy the change.
00:03:20 The save process is complete, but the deployment is still pending.
00:03:27 We see that in a second here when the changes are deployed. What we can also do here with
remote tables
00:03:37 is, as I mentioned earlier, load snapshots. So I can simply show you how to do that
00:03:44 using the Load New Snapshot command here. Simply execute it.
00:03:51 We also see that the process is completed. If we refresh this section,
00:03:58 we see that the update happened just now, and the refresh frequency is none
00:04:07 because we didn't schedule anything here in this area. So now we've imported the table,
00:04:13 but we need to bring it together with our already existing data model. So let's open our Sales
Orders View.
00:04:32 Scrolling down here to the Associations area, we've seen that before, with the time dimension.

00:04:38 Let's add another association. And we see here the BusinessPartners table,
00:04:45 the remote table we just imported. We select this one.

00:04:52 And now, different from when we assigned the time dimension, we don't need to manually
map
00:05:02 the identifying columns. This has automatically been done by the system
00:05:07 because the IDs and the field names are identical with Partner ID on the BusinessPartners
table
00:05:14 and Partner ID on the SalesOrders table. So we can go one step back and directly deploy the
changes.
00:05:30 So let's check if the deploy part has been executed, still on its way.
00:05:44 So now it is deployed. So we can switch over to SAP Analytics Cloud,
00:05:50 open the story we created earlier. Let's edit the story,
00:06:08 and now we can add a different dimension with the business partner.
00:06:14 We see that as Partner ID here. Let's bring it into the picture.
00:06:23 And oops, make it bigger.
00:06:31 So we see now all the partner names automatically identified here.
00:06:36 Sometimes it can happen that it displays the ID only. So that's what you have here as the
Display As setting
00:06:44 with Description, ID, or both. You can select that.
00:06:47 Since we want to show the names only, it's perfect as it is. And now as we want to have the
top five,
00:06:55 let's go to the Rank function, select Partner ID, Top 5, and now we have the top five sales partners
by year and region.
00:07:12 Let's save the report to keep our changes. And that's it for this part of the demo.
00:07:24 So we have used a lot of functions in the last demo, so let's take a step back and look at them.

00:07:32 We used remote tables a few times to achieve virtual access. That leaves the data in the
source system
00:07:39 and it's accessed only when needed. So there's no upfront data movement
00:07:44 and various sources are supported for remote access. The data can also be persisted using
snapshots
00:07:52 or real-time replication. We use the snapshot feature with the BusinessPartners table.
00:07:59 And next to remote tables, you can also take snapshots of views
00:08:03 to materialize your transformations, which you have built into the view.
00:08:11 You could also schedule these snapshots regularly and refresh them, for example on a daily
basis.
00:08:20 Multiple of these runs could also be orchestrated using task chains.
00:08:29 So we have used the semantic usage already a few times. So let's have a closer look at what
those options are about.
00:08:39 The analytical dataset is used for multidimensional analysis. That is where you need at least
one
00:08:46 or more measures that can be analyzed. Relational datasets are just flat representations
00:08:54 and contain columns with basically no specific analytical purpose. Dimensions indicate that
your entity contains attributes
00:09:05 that are used to analyze and categorize measures defined in other entities, like Product Master
Data.
00:09:15 Hierarchies are used to show your data with parent-child relationships for members in a
dimension.
00:09:22 And finally, texts indicate that your entities contain strings
00:09:27 with language identifiers to translate those text attributes.
00:09:34 Let's have a closer look at dimensions. So dimensions are needed

00:09:39 for multidimensional analysis of data. You typically have a table with measures
00:09:45 that you want to analyze in different dimensions, like geography, for example regions,
countries, states, cities.
00:09:57 Or you might have products in there with categories and product details,
00:10:02 or business partners, which we just used. And the time dimension is another example
00:10:07 of what we already used for multidimensional analysis. As mentioned before, you are also able

00:10:18 to schedule specific data loads like snapshots. It allows you to load the data on a regular basis

00:10:26 and get the latest updates. The screen here shows the settings of a schedule.
00:10:32 So successful loads have the status Available, as we have seen in the demo before.
00:10:42 So now let's look at the second part of the task, where you want to get to the best-selling
products.
00:10:50 The SalesOrders table does not contain any product information. So to achieve that,
00:10:56 we now need to join the SalesOrderItems table. So let me also show in a demo how to achieve
that.
00:11:07 So go to the Sales Orders View and join the SalesOrderItems table with the SalesOrders.
00:11:14 So we select our HANA connection, our schema,
00:11:25 and scrolling down, we find the SalesOrderItems table. And similar to what we did with the
replace function,
00:11:33 we simply drag and drop it onto the SalesOrders table that is already there, because joining is the
default setting.
00:11:40 We don't need to specifically select that. The table is new, so let's import and deploy it directly.

00:11:53 We also get to the join operator here directly since SALESORDERID is available in both
tables.
00:12:05 We can also expand this if you want to have a better view of these settings, right?
00:12:12 So inner join, that's what we want, that's a default setting.
00:12:15 We could select cardinality, but we leave that for the moment, and everything is here as the
standard setting.
00:12:27 One thing we can also adjust in the View Properties: the SalesOrderItems table
contains an additional measure. We could also bring that to the Measures area.
00:12:45 Oops, in this case it's the QUANTITY. You can also see depicted here in the picture
00:12:51 which table this field comes from. So QUANTITY, we can simply say Change to Measure.
00:12:58 So instead of dragging and dropping the information, we can also bring it in there with this
function
00:13:06 and deploy the table. So the deployment is still in process.
00:13:30 The deployment's done, so let's move over to SAP Analytics Cloud.
00:13:36 Let's open our story again. Let's edit the story
00:13:52 and let's bring a new graphic in there. Select the measure, for example "Gross amount".
00:14:13 As a dimension, we select the PRODUCTID and we also add the "Created at date".
00:14:31 Let's use the Horizontal setting, move this around,
00:14:38 and use the year as we did before, Level 2, that is,
00:14:44 and make this a little bigger. And as we want to have the top five products per year,
00:14:53 let's use the Rank function again, Product ID, Top 5. And here we go, top five products per
year.
00:15:06 So that's basically it about the best-selling products story and the end of this second part of the
demo.

00:15:17 So now coming back to the theory part, there are multiple options for joins and unions.
00:15:24 So Datasphere supports several types of joins, like cross joins, full joins, left joins,
00:15:31 right joins, or inner joins. The inner join is the default value.
00:15:36 So that's what we have used also in the example. You can also define the cardinality on both
tables
00:15:43 to improve the performance of the join execution. Using the Distinct Values checkbox would
mean
00:15:51 that you return only unique values. So there are some datasets where you need unique values,
and that's a very valuable feature which can be used in those scenarios.
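To make this more concrete, the inner join built graphically in the demo corresponds roughly to a statement like the following sketch; the column list is illustrative, with table and field names taken from the demo.

    -- Conceptual sketch of the graphical inner join (inner join is the default type)
    SELECT o."SALESORDERID",
           o."PARTNERID",
           i."PRODUCTID",
           i."QUANTITY"
      FROM "SalesOrders" AS o
     INNER JOIN "SalesOrderItems" AS i
        ON o."SALESORDERID" = i."SALESORDERID";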
00:16:08 So in this unit, you have learned a lot about remote tables and how to import them
00:16:14 from the Data Builder overview, as well as from the graphical view editor.
00:16:23 You also better understand now the semantic usage types of tables and views, as well as the
semantics for fields.
00:16:35 We used dimensions to show you how to display texts instead of ID values in your
story,
00:16:43 by defining texts and IDs in your data model. You also learned about snapshots and
scheduling,
00:16:52 as well as joins and unions. So that's it for unit four.
00:17:01 Thank you and good luck with the quiz.

Week 1 Unit 5

00:00:05 Hello, and welcome to week one, unit five of our openSAP course, Introduction to SAP
Datasphere.
00:00:13 My name is Tim Huse from the SAP Analytics and Insight Consulting.
00:00:18 In this unit, you will learn how to create data flows, task chains, and SQL views with SAP
Datasphere.
00:00:25 Let's dive in. We start by looking at your task for this unit.
00:00:30 The story that has already been created needs to be extended so that the top sales person per
region
00:00:36 can also be displayed. The required data is stored in two remote tables in other systems,
00:00:42 which can be imported into SAP Datasphere using the data flow. For this purpose, the data is
joined
00:00:49 and persisted in a local table in your space. In order to load the data in a systematic process,

00:00:55 you will learn how to create a task chain to build a sequence of loading tasks.
00:01:01 The data flow is an artifact in SAP Datasphere that can be used to integrate
00:01:06 and transform data from a plethora of data sources. The data flow provides an intuitive
00:01:11 graphical modeling experience to meet extract, transform, and load requirements.
00:01:16 In a data flow, data from SAP and non-SAP data sources can be loaded and combined,
00:01:22 such as tables and ABAP CDS views. Standard transformations,
00:01:26 such as aggregation and filtering, can be used, as well as scripting for advanced requirements.

00:01:33 Data can be replicated via a filter-based delta, and also only specific columns
00:01:37 from these source tables can be replicated to reduce the data transfer.
00:01:42 Data flows can be started automatically via a planned schedule, memory can be allocated
dynamically,
00:01:48 and data flows can be restarted automatically via an auto restart option in case of errors.
00:01:55 Now let's take a closer look at the script operator, as opposed to standard transformations.
00:02:01 As already mentioned, the data flow offers standard transformations.
00:02:05 Thereby data can be combined in a no-code environment with aggregations, joins, filters,
00:02:11 unions, as well as the addition of new tables. Tables, ABAP CDS views, OData services, and
remote files,
such as JSON or Parquet, can be selected as data sources. For the target
table
00:02:26 in which the entries are persisted, you can specify whether data is appended at the end,
00:02:30 whether the table is truncated before the run, or whether existing entries are deleted
00:02:34 based on match conditions during the data flow run. In contrast to the standard
transformations,
00:02:41 the script operator can be used if there are more advanced requirements.
00:02:45 For example, the extraction of text fragments. The script editor is integrated in the Data Flow
Modeler
00:02:52 and supports the standard Python 3 scripting language. The script operator can provide data
manipulation
00:02:59 and vector operations, and supports the well-known modules NumPy and Pandas
00:03:04 without the need to import them explicitly. Now let's create your own data flow in a demo.
00:03:10 So in this demo, we want to build a data flow to retrieve data about employees

00:03:16 and the addresses of these employees that are stored within a HANA table.
00:03:19 Some columns from these sources are not needed and we also have some sensitive data,
00:03:23 phone numbers and email addresses, that we want to mask in our data model.
00:03:29 Okay, we start within Data Builder and go to the tab Flows.
00:03:38 There, we click on create a New Data Flow. We are going to rename the data flow
00:03:43 to DataFlow_EmployeesWithAddresses. Now we have the possibility to insert tables and views

00:03:53 from your repository or from other sources. As we want to use tables from a HANA source
system,
00:03:58 we click on Sources, and then we choose our SAP HANA connection.
00:04:06 Now we have to open the correct schema, which is DWC_DEMO.
00:04:16 And there you can see the Employees table, as well as the Addresses table.
00:04:21 You can just drag it to the panel. And then we can click on the join operator
00:04:34 and connect both source tables with the join operator. And if you click on the join operator,
00:04:46 you can already see that the system will propose you the join condition.
00:04:49 In our case, that's already fine. Nothing to do from our side.
00:04:56 But there are some columns that we don't need, like Street and Postal Code,
00:04:59 so we need to add a projection node and remove all the unnecessary columns.
00:05:07 Those columns are PHONENUMBER, EMAILADDRESS,
00:05:21 the STREET, POSTALCODE,
00:05:31 BUILDING, as well as ADDRESSTYPE.
00:05:39 For the masked columns, we are creating so-called calculated columns,
00:05:43 therefore we click on the plus button. The first one is PHONENUMBER.
00:05:49 We will rename the column to PHONENUMBER and set it as a string with 14 characters.
00:06:00 We want to apply a simple masking function that displays a placeholder
00:06:04 for the first digits of the phone number and then displays the last four digits
00:06:08 of the original column PHONENUMBER. So we are building our expression.
00:06:15 We can search for all available functions here as well, for all columns and operators.
00:06:19 We need CONCAT so I'm searching for that. Now I will insert a placeholder string,
00:06:25 which will be the first part of the masked number. Next, I'm searching for RIGHT,
00:06:37 which is a function to return the rightmost characters of a string.
00:06:48 I could also use the autocompletion to find the function.
00:06:55 And here, I'm using the name of the original column, which is also PHONENUMBER,
00:07:02 and a four as the second argument as we want to show the last four digits.
00:07:13 Now we can validate the expression, and that's it.
00:07:19 One more to go as we also want to mask the EMAILADDRESS.
00:07:26 This time, we need a more complex expression as we want to display the first four
00:07:30 as well as the last four characters of the email address and mask the middle part.
00:07:35 Therefore, we create another calculated column, we rename it to EMAILADDRESS,
00:07:39 and set it as a string with 12 characters. I just paste the statement,
00:07:45 which is using the CONCAT and RIGHT function again, as well as the LEFT function.
00:07:50 And let's see if the validation is successful. Yes, it is.
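For reference, the two masking expressions described here follow this pattern; the exact placeholder characters are an assumption, while the CONCAT, RIGHT, and LEFT functions and the four visible characters come from the demo.

    -- Phone number: a placeholder prefix plus the last four digits of the original column
    CONCAT('**********', RIGHT(PHONENUMBER, 4))

    -- Email address: first four characters, a placeholder, and the last four characters
    CONCAT(LEFT(EMAILADDRESS, 4), CONCAT('****', RIGHT(EMAILADDRESS, 4)))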
00:07:57 So let's continue. Finally, we need the target table,
00:08:01 where we want to persist the transformed data records. So we click on the last node in our flow

00:08:06 and press the table icon, and it will automatically add another node

00:08:09 for the target table. It will also already add all necessary columns.
00:08:14 We will rename this target table to EmployeesWithAddresses.
00:08:18 Now this local table is not deployed yet, so we will go to the top right
00:08:21 in the Properties panel and click Create and Deploy Table.
00:08:29 And now we also can define the mode of the target table. We will use TRUNCATE.
00:08:39 So each time the data flow runs, all the existing data in the target table
00:08:43 is truncated before the run. Now we are good to go
00:08:47 and can deploy the data flow by pressing the cloud icon, which will automatically save it
as well.
00:08:57 Now that the deployment is finalized, we can click the run button,
00:09:02 which looks like a play button, to start the data flow once. You can check the run status on the
right side
00:09:11 and you could also refresh it. Okay, it is finished,
00:09:18 so we can open the target table and look at the data preview.
00:09:40 Okay, cool. So the data is here.
00:09:42 We cannot see the masked columns. This is because by default,
00:09:46 only 20 columns are displayed in the data preview, so we need to activate the
PHONENUMBER
00:09:50 and EMAILADDRESS as well. Great, the sensitive data is masked now.
00:10:10 Now that's it for this demo. Now let's have a look at an example
00:10:16 on how to utilize the script operator in data flows. No worries, this part is not included in your
exercises.
00:10:22 In this data flow, we have a table with first names and last names
00:10:26 and we want to generate email addresses for that. You can see we have three records on the
source table.
00:10:37 So as a second node, we have the script operator. You can see that we have added another
column,
00:10:45 the column Email as a string 100, which will be filled during this operation.
00:10:51 And in the Properties panel, you can also see the Python script.
00:11:02 Okay, so we can click here to maximize the script for you. So we have a function transform
here,
00:11:08 and the data of the input node will arrive as a Python dataframe here.
00:11:12 So we are just creating a new column for the dataframe, which we call Email,
00:11:16 as we did it in the configuration of the data flow. And we assign data to the column.
00:11:21 The data you will return within this transform function is sent to the next node of the data flow.
00:11:28 In our case, this is the target table. And as we already deployed the artifacts
00:11:33 and ran the data flow previously, you can see in the data preview
00:11:36 that the emails have been generated. So that's it for data flows now.
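To illustrate the script operator described above, a minimal sketch of such a transform function could look like the following. The column names and the generated email pattern are assumptions for illustration, not the exact script from the demo.

    # Minimal sketch of a script operator body. 'data' arrives from the input node
    # as a Pandas DataFrame; whatever is returned is passed on to the next node.
    # Pandas and NumPy are available in the script operator without explicit imports.
    def transform(data):
        # Fill the EMAIL column defined in the operator's output structure,
        # built here from assumed FIRSTNAME and LASTNAME columns.
        data["EMAIL"] = (
            data["FIRSTNAME"].str.lower() + "." + data["LASTNAME"].str.lower() + "@example.com"
        )
        return data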
00:11:42 Next, we will take a look at task chains. Task chains offer the possibility to group multiple tasks

00:11:49 and to start them manually or to start them periodically via a schedule.
00:11:54 A task chain can include flow runs, replication of remote tables, and persistence of views.
00:12:01 The tasks of a task chain are always processed serially. In the settings,
00:12:06 you can specify that a customizable email notification is sent to a specified group of people
00:12:11 in the event of success and/or error of a task run. On the right-hand side, a sample task chain
is shown.
00:12:19 Here the remote table Master Data Attributes is replicated first,

00:12:23 then the data flow number seven is processed, and finally, the view Sales Orders is persisted.

00:12:31 Task chains can be started manually once or periodically via schedule.
00:12:36 On the left-hand side, you can see how a task chain can be started directly
00:12:39 in the task chain modeler by clicking on the play button.
00:12:42 On the right-hand side, you can see the task chains
00:12:45 can be scheduled and started manually in the so-called Data Integration Monitor.
00:12:50 The previous runs and the current statistics, such as the duration, and the start and end of the
last run,
00:12:55 are also displayed here. In the Data Flow Monitor tab,
00:12:59 data flows can also be started and scheduled analogously. Now let's continue with a short
demo
00:13:04 on creating a task chain. So we start in the Data Builder again
00:13:09 and we navigate to the Task Chains tab. We will create a new task chain,
00:13:13 which we rename simply as TaskChain. Now you can see on the left side
00:13:18 that we can insert remote tables, views, as well as data flows.
00:13:23 We want to start by replicating the BusinessPartners table, so we will just drag it to the middle.

00:13:36 And as a second step, we want to execute the previously created data flow.
00:13:40 So we also drag and drop it to the editor area. You can see it is already assigned
00:13:51 as a second step of the task chain. You can see in the Properties panel
00:13:58 that there are two objects in the task chain, and you can set an email notification
00:14:02 in case of a successful or failed task chain. But we don't need this now.
00:14:05 We are good to go. We can deploy the task chain,
00:14:08 which will also save the artifact. And it's finished now.
00:14:11 However, we don't need to start it now, but feel free to try it out on your own.
00:14:16 So that's it for this demo. Okay, we start again in the Data Builder,
00:14:22 and let's say we have identified a Time Dimension - Day view as a view with bad
performance,
00:14:27 which we want to materialize. So we click on that view.
00:14:31 And you can see here that this is an SQL view with SQL coding.
00:14:36 In the Properties panel on the right, you can see that there is an area for persistence.
00:14:41 Currently, there is no persistency and the access is virtual.
00:14:46 Now if you click on the database icon, you can manually create a snapshot of the data in the
view.
00:14:52 And then you can see the last updated time. However, if you click on the calendar icon,
00:15:01 you can schedule the view persistency periodically and utilize either simple schedules or cron
expressions.
00:15:19 We will not do this now. That's it for this demo.
00:15:22 Besides graphical views with drag and drop support, there is also the possibility to use SQL
views
00:15:28 in SAP Datasphere to develop views with SQL scripting capabilities.
00:15:34 Here, a distinction is made between two languages that can be utilized.
00:15:38 For standard SQL queries, which are based on a select statement,
00:15:42 SQL can be selected. For more complex structures,
00:15:45 for example, if statements and loops, SQLScript can be selected to develop a table function.
00:15:51 SQLScript is an SAP-specific extension of SQL. SAP Datasphere supports a subset of the SQL syntax
00:15:59 supported by SAP HANA Cloud. This includes operators,
00:16:03 predicates, expressions, and functions. Details are described in the documentation.
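To give a feeling for the difference, a plain SQL view is essentially a single SELECT statement, while an SQLScript table function can additionally use variables, IF statements, and loops before returning a result set. A minimal sketch (the table and column names are made up for illustration, and the exact wrapper that the editor generates around the function body may differ):

    -- SQL view: a single SELECT statement
    SELECT SALESORDERID, GROSSAMOUNT, CURRENCY
      FROM SalesOrders
     WHERE GROSSAMOUNT > 1000;

    -- SQLScript table function body: control logic plus a final RETURN SELECT
    DECLARE threshold DECIMAL(15,2) := 1000;
    IF CURRENT_DATE > '2024-12-31' THEN
        threshold := 2000;
    END IF;
    RETURN SELECT SALESORDERID, GROSSAMOUNT, CURRENCY
             FROM SalesOrders
            WHERE GROSSAMOUNT > :threshold;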
00:16:11 By persisting the view data, you can improve the performance while working with views.
00:16:16 You can enable view persistency for graphical and SQL views
00:16:19 to materialize the output results. By default, a view is run every time it is accessed.
00:16:25 And if the data is complex or a large amount of data is processed,
00:16:29 or the remote source is slow, this may impact the performance of other views
00:16:33 or dashboards built on top of it. You can improve performance by persisting the view data
00:16:39 and you can schedule regular updates to keep the data fresh. Similar to remote snapshots, the
result set is persisted.
00:16:47 Only the required data needs to be persisted, instead of a one-to-one replication of the remote
source.
00:16:53 The Data Integration Monitor can be used to monitor the view persistence,
00:16:57 to automatically trigger it via schedule, and to analyze the view.
00:17:01 View persistence supports partitioning and partition-wise refresh of data.
00:17:06 Furthermore, the view persistence can be included in the task chain
00:17:09 and started via that task chain. Now let's have a demo
00:17:13 on how to enable this view persistence. The next demo is about associating
00:17:17 the employees data that we just incorporated within the previous exercises.
00:17:21 Therefore, we create a new graphical view in the Data Builder.
00:17:24 We drag the local table EmployeesWithAddresses to the canvas,
00:17:31 and then we rename the view to Employees. We set semantic usage to Dimension
00:17:51 as Employees are one dimension that we want to associate to our dataset.
00:17:58 We set the EMPLOYEEID as the primary key of the view. Now we can deploy the Employees
view
00:18:05 by pressing the cloud icon. The next thing we can already start while it's deploying
00:18:19 is to open the Sales Order View because we want to associate the Employees dimension to it.

00:18:44 So we go to the Associations area and click on the plus to create a new association.
00:19:05 We want to connect the EMPLOYEEID field of our Employees dimension
00:19:09 to the field "Created by" in our Sales Order View. We do so by dragging and dropping the
"Created by" field
00:19:23 on the EMPLOYEEID field. Now we can hit the deploy button
00:19:35 to redeploy the analytic dataset and switch to the SAP Analytics Cloud.
00:19:41 And there, we can open our previously created story. We will insert a new chart
00:19:54 where we want to display the sales per person. Therefore, we will use the measure "Gross
amount",
00:20:08 and we will display the last name of the "Created by" association as Dimension.
00:20:25 In order to rank the results, we can restrict the results to the top 10 employees
00:20:29 with the highest gross amount. Et voilà - we have new data in our dashboard.
00:20:38 That's it for now. Let's summarize what you've learned in this unit.
00:20:43 You've learned what a data flow in SAP Datasphere is and how to use the Data Flow Editor.
00:20:49 You learned how to fulfill more complex transformation requirements
00:20:52 using the script operator in data flows. You learned how to create and schedule a task chain.
00:20:59 And finally, you've learned how to persist a view. That's it for this unit on data flows,
00:21:04 task chains, and SQL views. In the next unit, unit six,
00:21:08 I will show you how to share data, use access controls,

00:21:12 and create entity relationship models in SAP Datasphere. Thank you, and good luck with the
upcoming quiz.

Week 1 Unit 6

00:00:05 Hello and welcome to week one, unit six of our openSAP course, Introduction to SAP
Datasphere.
00:00:12 My name is Tim Huse, from SAP Analytics and Insight Consulting.
00:00:17 In this unit, you will learn how to share data, create hierarchies, create access controls,
00:00:22 and create entity relationship models with SAP Datasphere. Let's dive in.
00:00:29 We start by looking at your task for this unit. You have created a story that shows a good view
of sales,
00:00:35 business partners, products, and employees. You now need to add the product
information to the sales order view
00:00:42 and then share it with the sales organization for their reporting. You will use the products view
from the master data space
00:00:48 and share the resulting sales order view to the sales org space for the consumption of the
view.
00:00:54 But before releasing this report, we need to ensure that users see data based on their
authorization,
00:00:59 which is region specific. Therefore, you will create and apply a data access control.
00:01:07 Let's start with hierarchies. A hierarchy is a systematic way of organizing members
00:01:12 of a dimension into a logical tree structure in order to support drill down and drill up
00:01:17 on the dimension in business intelligence clients, such as SAP Analytics Cloud.
00:01:23 An example would be to display the sales per country and then drill down to display the sales
00:01:28 for the respective regions of the selected country, and then to drill down to the respective city
00:01:33 of the selected region. This will be a so-called drill down
00:01:36 to the dimension location with the levels, country, region, and city.
00:01:42 There are three ways to create a hierarchy for your model. You can create level-based
hierarchies,
00:01:48 parent-child hierarchies, or you may use external hierarchies for your dimension.
00:01:53 A level-based hierarchy is non-recursive and has a fixed number of levels.
00:01:58 This can be, for example, a time hierarchy like year, quarter, month, day. The example I just
gave with country and region is also a level-based hierarchy.
00:02:09 A parent-child hierarchy is recursive, can have different numbers of levels
00:02:13 and is defined by specifying a column for parents and a column for children in the dimension.

00:02:20 For example, a departmental hierarchy could be modeled with the parent department ID and
department ID columns.
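As a made-up illustration, the data of such a parent-child dimension could look like this, where each row points to its parent row and the root has no parent:

    DEPARTMENTID   PARENTDEPARTMENTID   DEPARTMENTNAME
    D100           (empty)              Corporate
    D110           D100                 Sales
    D111           D110                 Sales EMEA
    D112           D110                 Sales APJ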
00:02:27 An external hierarchy is a parent-child hierarchy, where the information of the hierarchy is
contained in a separate view,
00:02:35 which needs to be associated with the dimension. At this point, let's take a look at how to
create such a hierarchy in a demo.
00:02:43 In this demo, we are creating a dimension products for product master data that has been
shared with us from the master data space.
00:02:51 We want to create a hierarchy for this dimension. So, we create a new view.
00:02:56 On the left side, we can see a folder called Shared Objects. Here we can find the MD_Products view
00:03:01 that has been shared with us. We drag it to the canvas and rename the view to Products.
00:03:14 And we set the semantic usage type to Dimension. Now, let's preview the data.
00:03:27 Looks good so far. You can see for every product we have a product category
00:03:31 and they are non-recursive. So, we can create a level-based hierarchy
00:03:35 for product category and product. In the Properties panel on the right,
00:03:42 we click on the Hierarchy icon and then we click plus to add a new level-based hierarchy.
00:03:47 We can keep the default name here. Level one will be product category ID
00:03:52 and level two will be the product ID. We have added a hierarchy.
00:03:59 Now, we can deploy the new dimension view. The next thing we have to do is to associate
00:04:04 this dimension with our sales order view. We therefore go back to the Data Builder
00:04:10 and select our sales order view. We go to the Associations area
00:04:31 and add an association to the products view. And now we can deploy the change by pressing
the cloud icon.
00:04:51 Now, let's switch to the SAP Analytics Cloud story. We can change the top selling products
chart
00:05:01 to display the hierarchy or to display product names instead of the product ID now.
00:05:07 So, if we click on the dimension product ID, we can already see that a hierarchy has been
identified
00:05:11 and you can choose a level here. So, one would be root level,
00:05:15 two is the product category level, and three is the product level.
00:05:19 By clicking on one product category in the chart, you can just drill down to see all the products

00:05:24 within this category. We can also display the product name
00:05:33 instead of the product ID now. And we can rank the output to only show the top 10 best-selling
products.
00:05:47 Also, have a look at the sales organization in your story. You should see data for EMEA, APJ,
as well as America.
00:05:55 This is important for the next exercise, as we want to restrict the data that you see here.
00:06:00 Now, let's take a look at data access controls. These allow a more granular view on data at the
row level.
00:06:07 What does that mean? A user may only see the rows of a data set
00:06:11 for which he's authorized to see the data. Whether he's authorized depends on the previously
defined criteria,
00:06:18 which are defined in a data access control. Data access controls are applied to artifacts
00:06:24 in the data layer and cannot be overruled. For example, if a view is built on top of a view
00:06:29 that is restricted by a data access control, this restriction also applies to the view above it.
00:06:36 Data access controls, once created, can be applied to various artifacts within the data layer.
00:06:42 In the following demo, we will look at data access controls in action.
00:06:45 We will restrict the sales order view so that a user can only see the sales organizations that he
is authorized for.
00:06:52 So, let's start. We start in the data builder and go to the tables
00:06:58 as we want to define the privilege for our data access control within a new local table.
00:07:04 We will call this table DACDefinition, and it will have two columns,
00:07:09 username as well as allowed values. With this table, we can define which user
00:07:17 will see which sales organization data. We can keep the default data types here
00:07:22 and deploy the table. Now, let's utilize the data editor
00:07:40 within the table to insert new data to the table. The data editor is on the top right side.
00:07:51 Now, you can add two rows with your username and EMEA, as well as APJ as allowed values.

00:08:41 Now, we can save these data entries and jump to the data access controls.

00:08:47 The data access controls have their own panel that you can find in the left navigation bar,
below the data builder.
00:08:52 We rename the new DAC to Region_DAC. We add a permission entity,
00:09:00 which will be our newly created DACDefinition table. We will only use the allowed values
column as criteria
00:09:07 for the access control. The username column will be used
00:09:11 as the identifier column. Now, we can deploy the access control
00:09:16 by pressing the cloud icon, which will automatically save the artifact as well.
00:09:27 The data access control is created now, but we need to associate it to our sales order view
00:09:32 in order to apply it to the view. Therefore, we go back to the data builder
00:09:38 and jump into the sales order view. There is a special area for data access controls
00:09:44 and we can link it here. We will map the allowed values column
00:09:53 to the sales organization column of our sales order view. Afterwards, we will deploy the
change to the view.
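Conceptually, the effect of this data access control is similar to the following filter on the protected view (a sketch only, not the SQL that SAP Datasphere actually generates; the technical view and column names are assumptions based on this exercise):

    SELECT s.*
      FROM "Sales_Order_View" AS s
     WHERE EXISTS (
             SELECT 1
               FROM "DACDefinition" AS d
              WHERE d."USERNAME" = SESSION_CONTEXT('APPLICATIONUSER')  -- the logged-on user
                AND d."ALLOWED_VALUES" = s."SALESORGANIZATION" );      -- the mapped criterion column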
00:10:07 Now finally, we can open the data preview in order to check whether we now only see the regions
EMEA and APJ.
00:10:15 You can check this in your Analytics Cloud story afterwards as well. Yes, so we can see EMEA
and also APJ.
00:10:39 Yes, looks good. So, that's it for this demo.
00:10:44 It is possible to share a table or view of the data layer to another space to allow members in
that space to use it as a source for their objects.
00:10:52 A space is a secure area, and its artifacts are not seen in other spaces unless you choose to
share them
00:10:58 with the other space members. When you share an entity to another space,
00:11:03 users in that space can use it as a source for their own views and other objects.
00:11:08 In the example on the slide, you can see how the products and sales tables originally reside in
a space
00:11:14 that is assigned to the IT department. This is because the data comes
00:11:18 from an external source system that is governed by IT. The products table is shared with the
sales space
00:11:25 and the view with sales data is also shared with the sales space,
00:11:29 so that a sales by product view can be built on top in the sales space. The artifacts from the IT
space are marked as shared.
00:11:38 We will now take a look at this concept in a demo. We start in the data builder in our openSAP
space.
00:11:44 As an example, we want to share our sales order view to a sales space. So we click on the
Sales Order View and hit the share icon.
00:11:53 Then we can search for any space in the tenant and add it with read privileges.
00:12:02 The view is now shared. You can see it because there is a small share icon
00:12:07 behind the name of the view. If you click on that icon, you can see all spaces the view is
shared with.
00:12:19 Now, I want to show you how to consume a view that is shared with your space.
00:12:23 We have seen this in a previous exercise already. If you create a new view,
00:12:27 you can see a folder in the left panel, Shared Objects. Here, you can access all tables and
views that have been shared with you.
00:12:34 In the example, the view MD_Products has been shared with my space
00:12:38 from the space OPENSAP_MASTERD_REF. With this, we can close this demo on cross-
space sharing

00:12:45 and go to the entity relationship models. Entity relationship models can be developed in SAP
Datasphere.
00:12:52 These so-called ER models provide a diagram that shows the data entities, tables, as well as
views of an environment
00:13:00 in relation to one another. You can use an ER model to better understand the subset of the
entities in your space
00:13:06 and to communicate this information to other stakeholders. With the ER model, physical or
remote database models
00:13:13 can be designed and afterwards also deployed. Furthermore, existing tables and views from
the data layer
00:13:19 can be reused in the ER model, which means that the modeler supports reverse
engineering.
00:13:25 New entities can be added on the fly in the entity relationship modeler.
00:13:30 Data can be previewed in real time in the editor. The integrated impact and lineage analysis
can be employed
00:13:38 to visually analyze how the artifacts of the data model depend on each other. Conveniently,
the source file of the ER model
00:13:45 can be exported and imported from SAP Datasphere. And now, let's take a look at the ER
model in a short demo.
00:13:53 In this short demo, I want to show you what an entity relationship model can look like in SAP
Datasphere.
00:13:59 You can see there are several tables as well as views in the model.
00:14:03 You can visualize the data types and the entities and also see the relation between these
entities.
00:14:09 You can also see that entities can have relations to themselves.
00:14:14 In this example, each employee has a manager, who is an employee himself.
00:14:23 You can click on the arrows to see more information on the relationship. By clicking on an
entity,
00:14:34 you can choose the create table icon in order to create a relation to a new table.
00:14:40 Then you could just define the columns of the new table, rename it, and even deploy the table
00:14:46 within the entity relationship modeler. We will not do this now.
00:14:52 The ER model can be easily imported and exported as a CSN file.
00:14:56 This means you can easily share your modeling thoughts with people that work in other
spaces or tenants.
00:15:03 That's it for this demo. Let's summarize what you've learned in this unit.
00:15:08 You've learned how to create hierarchies within dimensions. You've learned how to use data
access controls.
00:15:14 Furthermore, you've learned how to use cross- space sharing in SAP Datasphere. And finally,
you got to know
00:15:21 the entity relationship modeler. That's it for this unit on sharing data, access controls, and ER
models.
00:15:29 In the next unit, unit seven, you will learn how to utilize the data integration monitor in SAP
Datasphere.
00:15:35 Thank you, and good luck with the upcoming quiz.

Week 1 Unit 7

00:00:05 Hello, and welcome to week one, unit seven of our openSAP course, Introduction to SAP
Datasphere.
00:00:13 I'm Amogh Kulkarni, from the SAP Datasphere Product Management.
00:00:18 In this unit, you will learn how to use the different monitoring tools in the tenant
00:00:24 to understand your tenant health, as well as monitor your data integration tasks.
00:00:29 So let's start. This is a section called Know Your SAP Datasphere,
00:00:35 and the theme for this unit is integration and monitoring. The different topics covered in this
theme are -
00:00:41 the Data Integration Monitor, the System Monitor, configuring your tenant for optimal
monitoring,
00:00:47 working with database analysis users, and finally, the navigation to the SAP HANA Cloud
cockpit.
00:00:59 Data Integration Monitor is a central place in SAP Datasphere where you would monitor data
replication for remote tables,
00:01:06 monitor data flows and task chain executions as well. You would add and monitor view
persistency,
00:01:14 as well as monitor queries that are sent from your Datasphere tenant
00:01:18 to your remote connected sources. Data Integration Monitor consists of five different monitors

00:01:25 separated by tabs. Let's go through the specific task


00:01:29 that is achieved by each of these monitors. The first one is the Remote Table Monitor.
00:01:38 The Remote Table Monitor lets you replicate data for the remote tables that have been
deployed
00:01:43 in the context of your space. So whenever you're modeling,
00:01:48 and if you deploy remote tables you would be able to control the data replication tasks
00:01:53 for your remote tables through the Remote Table Monitor
00:01:57 in the Data Integration Monitor. This replication can either be a snapshot-based replication
00:02:04 or you can also set up a real-time replication via change data capture, that is, CDC.
00:02:11 That's the first one. The second one is the View Persistency Monitor.
00:02:16 So whenever you are creating views within your SAP Datasphere, you can also decide
whether you want to persist the data
00:02:23 of these views in your space. You would do such an activity to ensure
00:02:27 that they perform better when you are modeling views in your SAP Datasphere and you're
consuming them.
00:02:35 The next one is the Flow Monitor. So whether you have data flows,
00:02:40 or whether you have replication flows in your SAP Datasphere that you have deployed in your
space,
00:02:46 Flow Monitor would be the place where you would monitor the execution of the flows,
00:02:50 not just the present ones but also the past ones. Additionally, you would also be able to run
00:02:58 and schedule your flows directly from the Flow Monitor. So, all the flows that are deployed in
your space
00:03:05 would then be available here for monitoring, and you would come here
00:03:08 to look at the execution of all your flows. The next one is the Task Chain Monitor.
00:03:17 Oftentimes you want to string together multiple tasks and run them, maybe as a chain, either
sequentially or in parallel.
00:03:25 You would find such task chains for your monitoring on the screen.

00:03:30 And the last, the fifth monitor, that is the Remote Query Monitor.
00:03:34 It is a bit different from all the monitors that we have seen so far.
00:03:38 The Remote Query Monitor has two additional monitoring possibilities.
00:03:42 The first one is the remote table statistics. This lets you create statistics that help optimize query execution plans
00:03:48 to improve the performance when the data is read from remote tables.
00:03:53 One thing to note here is that if the data access is set to Replicated,
00:03:58 which means that you've already persisted the data of your remote tables,
00:04:03 then the statistics creation is disabled. This is only applicable to federated datasets.
00:04:10 The three types of statistics that you can create are record count, simple,
00:04:15 which creates basic statistics for columns such as min, max, null count, and distinct count,
00:04:23 and the last type is the histogram, which shows the distribution per column.
00:04:29 Remote table statistics also lets you delete existing statistics that you have created.
00:04:34 So whenever you want to create or delete, this is the place where you come
00:04:38 for remote table statistics. The second capability within the Remote Query Monitor
00:04:44 is tracking remote queries. It lets you track the queries that are executed
00:04:49 towards your remote connected sources from your space. So every time that you're querying a
remote source,
00:04:56 you would see that your statements are recorded in the Remote Query Monitor.
00:05:00 Additionally, this is the place where you would find statistics for your queries,
00:05:05 like the runtime, the number of rows that you have fetched, and the status of the query,
00:05:09 whether it is running or whether it has been closed. This is the place to monitor your remote
queries
00:05:16 between your SAP Datasphere and your remote sources. Additionally, what you would also
see here
00:05:23 is the actual statement that has been executed towards your remote source.
00:05:27 So you can also find out what statement is sent to the remote source.
00:05:32 So in a nutshell, every time that you want to track the activity between your SAP Datasphere
tenant
00:05:39 and your remote source, you would use the Remote Query Monitor
00:05:43 under the Data Integration Monitor. These are the five different monitors
00:05:48 that are available within the Data Integration Monitor, that let you effectively monitor the
execution
00:05:55 of all the different activities in your tenant. The different monitors
00:06:02 within the Data Integration Monitor need different privileges for you to be able
00:06:07 to effectively monitor and execute different tasks, whether it is the remote connection privilege,

00:06:14 or it is the data integration privilege. A user that is monitoring and executing tasks
00:06:21 needs to have the correct privileges to be able to perform these actions.
00:06:27 So, sort the privileges out before using the Data Integration Monitor.
00:06:33 In a nutshell, the Data Integration Monitor is a group of monitors that work in the context of a
space,
00:06:40 meaning one has to select a space before accessing any of the monitors,
00:06:46 and this is really important. For this demo, I'm logged in as a data integrator
00:06:53 and a space administrator, so I have both the roles for this user.
00:06:58 From the left navigation pane, we move to the Data Integration Monitor.
00:07:04 Now because I am part of only one space, it did not ask me for space selection,

00:07:09 but ideally, it will first ask you to select the space, and once you do, you can look at all the
different monitors
00:07:16 in the context of the space. The first one in the list is the Remote Table Monitor.
00:07:21 Here you can either replicate using snapshots or you could enable real-time access for a
remote table,
00:07:30 using the buttons here. Or you can also navigate to the additional details page,
00:07:38 and there you can not only see the current execution or the latest execution, but the previous
ones as well.
00:07:46 Also, it is possible then to load a new snapshot, remove existing snapshots,
00:07:52 or enable or disable real-time access. Additionally, you can also schedule the execution of a
remote table,
00:08:03 using simple schedule, where you specify the recurrence, the time,
00:08:08 the start and the end date, or in a cron expression manner,
00:08:14 where you specify the cron expression, and then the start and the end date,
00:08:18 which then tells you when the next runs will be according to the cron expression that you
have provided.
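For illustration, a cron-style expression encodes the recurrence as a compact pattern, for example:

    0 6 * * 1    (minute, hour, day of month, month, day of week: run at 06:00 every Monday)

This is a generic example; the exact fields and value ranges accepted by the scheduler are described in the SAP Datasphere documentation.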
00:08:25 Every time that you have to work with your remote tables and their replication,
00:08:30 you will come directly to the Data Integration Monitor, and then use either the details screen
here
00:08:38 or the Remote Table Monitor to select a remote table
00:08:42 and specify the replication strategy. The next one is the view persistency.
00:08:50 Here you can perform a similar process for view persistencies. If you don't see a view in the
list,
00:08:59 you can select a view from the add view button here, and then perform the exact same
process
00:09:08 that we did for remote tables. The third one is the data flow
00:09:13 and the replication Flow Monitor, where you have the list of all the data flows in your space,
00:09:21 either to run them immediately or similarly schedule them for a future execution.
00:09:29 The last one on this page, but not the last in the list,
00:09:34 is the Task Chain Monitor, where you find all the information that you need
00:09:39 about a task chain, whether you want to execute it,
00:09:42 whether you want to schedule it, or whether you want to see
00:09:45 where the previous runs failed or completed, how did they go -
00:09:49 this is visible in a three-pane display where you select the execution,
00:09:53 then the step in the execution that you want to view more details about.
00:09:59 And on the rightmost side, then you find the execution details
00:10:02 for that particular task in that particular execution run. This is how you will then monitor
00:10:11 all your data integration tasks. Finally, we have the Remote Query Monitor.
00:10:17 I'm treating it differently because it also works a bit differently from the others.
00:10:21 In the remote queries part, you can see all the remote queries that have been fired
00:10:28 from your SAP Datasphere towards your remote datasets. You can find the most basic information about
their execution,
00:10:35 but also look at the SQL statement that was executed. And then the second part is the remote
table statistics.
00:10:45 Here you can select one of your remote tables and create or delete statistics.
00:10:51 Creation of statistics is a two-step process, where it'll first ask you what is the type of statistic
00:10:59 that you want to create, and then the relevant statistics are created.

00:11:03 Make sure that you create statistics for your remote table because they improve the
performance
00:11:08 of your query execution plan. Now we move to the System Monitor.
00:11:16 This is the main monitoring tool in SAP Datasphere. And we will look at what is the difference
00:11:24 between a Data Integration Monitor and a System Monitor. Administrators in SAP Datasphere

00:11:30 would use the System Monitor to take a look at the performance of the tenant
00:11:34 and monitor the storage, the tasks, out-of-memory issues, and all the other problems across all
these spaces.
00:11:41 This is the central monitoring in SAP Datasphere. It is a tenant-wide monitor,
00:11:48 which means that only administrators at this point can have access to the System Monitor.
00:11:55 The role required to access the System Monitor is thus the SAP Datasphere administrator.
00:12:02 This is the role that you would need to use the System Monitor.
00:12:06 The System Monitor provides you with a top-level view of all the activities happening across
the tenant,
00:12:12 and it also has a navigation path to the Data Integration Monitor that we've seen previously.
00:12:18 The System Monitor is divided into two parts. The first one is the dashboard,
00:12:23 so that's the landing page that gives you all the information
00:12:27 about the storage distribution, the disk, the memory utilization by spaces,
00:12:32 but also an aggregated count for failed tasks and out-of-memory issues in the tenant.
00:12:37 So this is the dashboard that you would like to visit if there are any issues in your tenant.
00:12:43 The available information is further delineated into the top five spaces that are contributing
00:12:48 to the out-of-memory errors, or the number of failed tasks
00:12:51 in the last seven days, in the last 24 hours, as well as in the last 48 hours.
00:12:55 So this is really useful. The dashboard is thus a culmination of all the information
00:13:02 that is available for your monitoring, and this is the place
00:13:05 where you would start your monitoring journey. The second tab in the System Monitor
00:13:10 provides you with information about tasks that have been executing in the tenant.
00:13:16 Just note that only an administrator has access to the System Monitor.
00:13:20 I'm stating it again. Here you will find out about all the executions
00:13:26 of all the tasks with their execution statistics, like duration, memory consumption,
00:13:31 the status of the tasks, the number of records that they have fetched.
00:13:36 Further, it is also possible to look at the statements that have been executed
00:13:41 owing to these tasks and are classified as expensive statements.
00:13:45 So one task can have one or multiple statements. The System Monitor lets you drill down
00:13:52 from a top-level KPI, to actual task, and the resulting individual expensive statements as well.
00:14:02 For this demo, I'm logged in as an administrator. From the left navigation pane,
00:14:08 because I'm an administrator, I now see the System Monitor. I navigate to the System Monitor,

00:14:14 where it shows me the dashboard with all the most important KPIs
00:14:20 that tell you the health and status of your SAP Datasphere tenant,
00:14:25 be it the space consumption or the memory consumption. Moving on, it'll also show you all the
number
00:14:32 of failed tasks in your tenant, in the last seven days, or out-of-memory errors, run duration,
memory consumption,
00:14:42 as well as some statistics on the MDS requests. This is the page where you will come
00:14:49 if you want to find out more about the current status of your tenant.

00:14:55 Here you get it in an aggregated manner, but if you move to the logs,
00:15:00 you can dive into individual executions and the different tasks that have been executed.
00:15:07 Now, here you will find all the different executions that have run in the tenant across all the
spaces.
00:15:17 Remember, I'm an administrator, and that's why I see all the spaces
00:15:21 and all the tasks that have been executed. I can directly move to the Data Integration Monitor
00:15:27 if I click on the activity, or I can also go to the modeling screen
00:15:31 if I click on the object name. I scroll to the right.
00:15:36 It also provides me with the statements that have been executed,
00:15:41 but because this statement was not classified as an expensive statement,
00:15:46 to which we'll come later on, you won't see any statements here.
00:15:51 But if you clear the filter out, you'll see all the other statements
00:15:54 that have been executed in the tenant. And if there are any tasks that are related to it,
00:16:01 so scheduled tasks or manually executed tasks, you will see them with a task log ID.
00:16:11 This is the page where you would then come when you want to look at all the different
executions
00:16:17 that have run, either in a sorted manner or filtered by time.
00:16:22 So you can also select the start time and the end time as a time window,
00:16:28 and then find out what has run in the tenant in that given time duration,
00:16:36 so that you can filter all the tasks that have executed
00:16:41 and only look at the ones that you want to look at. We will now look at the monitoring
configuration,
00:16:51 which is available under system, followed by configuration. So system, configuration.
00:16:58 There are two configurations that are interesting here. The first one is the monitoring view
enablement,
00:17:05 which lets you select up to two spaces that get access to the monitoring views
00:17:10 in the Data Builder. By virtue of these monitoring views,
00:17:14 you'll be able to consume System Monitoring views from the underlying SAP HANA Cloud
instance.
00:17:20 So whenever you want to look at how your instance is behaving, you can use these monitoring
views.
00:17:26 One of the spaces is set as the SAP monitoring content space, where you would deploy the
delivered monitoring content from SAP,
00:17:34 so that is the standard business content for monitoring. Whereas the second space is a user-
defined space
00:17:42 that gets access to the underlying monitoring views. So the first one is the technical content
space from SAP,
00:17:48 and the second one is the custom user space. When enabled, a modeler in the space can
consume and model
00:17:57 on top of the System Monitoring views to derive additional insights.
00:18:01 You will do that if you want to fetch more insights than what the System Monitor shows you.
00:18:07 The second configuration is expensive statements tracing, and this is the configuration that
controls the statements
00:18:14 that are classified as expensive statements and are available for your monitoring
00:18:18 in the System Monitor under the Statements tab. This is what I was referring to as the
expensive statements.
00:18:25 So to classify a query or a statement as an expensive statement,
00:18:30 you will have to specify thresholds that let the system filter statements

00:18:34 that are costlier than the configured values. A threshold and a statement act together there.
00:18:41 As soon as a statement consumes either more CPU time, or memory, or duration
00:18:46 than the threshold value that you have configured, it is automatically classified as an
expensive statement,
00:18:53 and this statement is available for monitoring in the System Monitoring dashboard.
00:18:58 So you see how these two things work together. Now, a database analysis user
00:19:07 is an SAP HANA Cloud database user with wide-ranging read privileges,
00:19:12 over the underlying runtime HANA Cloud instance. Optionally, this user can also be granted
space schema privileges,
00:19:21 which means that this user can also read the data that is stored in the space schema,
00:19:27 but this is clearly optional. You can decide not to grant these privileges.
00:19:33 You would create such a user to support monitoring, analyzing, tracing, and debugging
00:19:39 of your SAP Datasphere runtime HANA Cloud instance. These database analysis users can
be configured
00:19:46 with an expiration date, or without one. You can choose when you're creating a user.
00:19:52 This user could either be leveraged to consume the underlying HANA Cloud monitoring views
00:19:57 in the database explorer, similar to what you will do in the graphical modeler,
00:20:02 or to access the SAP HANA Cloud cockpit. Now, the database analysis users that we have
looked at
00:20:13 on the previous slide, these could be used to access the HANA Cloud cockpit.
00:20:19 This is a tool for administering and monitoring the underlying cloud runtime database.
00:20:25 So whenever you want to look at how your underlying instance is behaving, use the HANA
Cloud cockpit
00:20:32 as a complementary tool to the System Monitor. The HANA Cloud cockpit lets you monitor
alerts,
00:20:38 resource usages, and performance of your SAP HANA Cloud instance.
00:20:43 This could be done with the help of the performance monitor,
00:20:46 as well as the workload analyzer. You will use the tools that fit your requirements
00:20:51 when you are monitoring your HANA Cloud instance. This will also be the place where you will
analyze
00:20:57 trace and diagnosis files, as well as query the system
00:21:00 using SAP HANA database explorer. Now, because the database analysis user
00:21:07 has only read privileges over the underlying instance, with this user, we will not be able to make
00:21:12 any configuration changes in the underlying SAP HANA Cloud instance.
00:21:16 This still remains outside of the borders or limitations of the database analysis user.
00:21:23 The database analysis user, as the name states, is merely for analyzing the SAP HANA Cloud
instance
00:21:29 and not for making any configurational changes. In this demo, now I am going
00:21:37 to use the Configuration, under System. And here I'm going to go to the Monitoring view.
00:21:47 As mentioned in the slides, you can select one of your spaces in the tenant
00:21:54 to be a candidate for the monitoring view, and this space now gets access,
00:21:59 and all the modelers in the space also get access to consume the underlying monitoring
views.
00:22:05 The second part which is of interest is the expensive statements tracing,
00:22:10 where you can find out how to configure, or what are the current configurations.
00:22:15 At the moment, the thresholds are set only for memory and duration, which means that every
statement
00:22:23 that passes the threshold either for its memory consumption or takes more time to complete

00:22:29 than the duration mentioned here, will be classified as an expensive statement,
00:22:33 and it will be visible under the System Monitor in the statement step.
00:22:37 Make sure that you set this optimally, because if you set it too low,
00:22:41 then you are actually tracing everything and then everything is treated as an expensive
statement.
00:22:47 If you set the thresholds too high, then you are not capturing anything
00:22:51 and then you're losing out on any insights that you get out of expensive statements.
00:22:56 So make sure that you always set this optimally, and the optimal threshold values can be
found out
00:23:03 by some trial and error when you're trying to filter some out,
00:23:08 check the query execution, and then come back and configure it.
00:23:14 For this demo, now we are going to go again into the System and Configuration,
00:23:21 where we find a tab that says Database Access, and under Database Access, we have
Database Analysis Users.
00:23:29 Here you can find out all the users that have been created, but also create new users.
00:23:36 Creation of a user lets you specify the name, and then whether the user has an access to the
space schema.
00:23:44 Additionally, you can also specify the expiration date for this user, so if you want the user to
expire in a certain number of days,
00:23:51 up to five, or you never want it to expire,
00:23:55 you can specify that and create. Once you create,
00:23:59 and we will quickly create one for our demo... Once you do so, you have all the details that you
need
00:24:08 to either connect it using an external database tool, or you can use this user to open the SAP
HANA Cloud cockpit.
00:24:20 Opening this would then jump from your SAP Datasphere onto your SAP HANA Cloud cockpit,

00:24:28 to let you monitor, analyze, and troubleshoot everything that you need to do
00:24:31 with your underlying SAP HANA Cloud instance. Let's summarize what we have learned in this
unit.
00:24:40 We started with the Data Integration Monitor, to look at all the different monitoring capabilities
00:24:44 that it offers. Then we went to the System Monitor,
00:24:48 which offers a dashboard with KPIs, and also lets you drill down
00:24:52 the individual task execution and statements. We also learned about expensive statements
tracing,
00:24:59 the thresholds and the configuration, the database analysis user,
00:25:03 and the navigation to SAP HANA Cloud cockpit. I hope you can now use all the tools that you
need
00:25:10 to monitor and understand the health of your SAP Datasphere tenant.
00:25:15 That's it for this unit on Data Integration Monitor. Thank you, and good luck with the upcoming
quiz.

© 2023 SAP SE or an SAP affiliate company. All rights reserved.
See Legal Notice on www.sap.com/legal-notice for use terms,
disclaimers, disclosures, or restrictions related to SAP Materials
for general audiences.
