The Data Cleanse transform is used to perform parsing, standardizing, and enhancement of customer and operational data.
Parsing identifies individual data elements and breaks them down into their component parts. It rearranges data elements in a single field or moves multiple data elements from a single data field to multiple discrete fields. Data Cleanse parses name, title, firm, phone, US social security number, date, e-mail, and user-defined patterns.
Standardization includes business rules around formats, abbreviations, acronyms, punctuation, greetings, casing, order, and pattern matching – all examples of elements you can control to meet your business requirements and prepare the data for validation, correction, and accurate record matching.
We need to create two data stores, one data store is created to MSSQL and another is created to HANA .
Steps
Start -> Microsoft SQL server 2012 -> SQL server management studio.
Right click on tables -> Click on new table.
Now, we create a table with column names ZBUKRS1, ZKNDNR1, ZVKORG1, ZARTNR1, ZVV010 and ZVV070.
Click on save icon.
Go to Data Services Workbench.
In order to create a project: File -> New -> Click on Project.
Enter Project name and select the Data Services Repository.
Click on Finish.
Now, we can view the project that we have created.
In order to create Data Store: Right click on Project -> new -> select Data Store.
Provide name and click on next.
Select Data store type as :- Database.
Select Database Type as :- Microsoft SQL server.
Enter Database name , User Name and Password.
Click on test connection.
Pop-up appear "Connection test succeeded ".
Click on close.
In order to create a data store in HANA: Click on Project -> New -> Select Data Store.
Provide the name and description and click on next
Provide the credentials and click on finish.
Click on the Data Store.
Click on Tables -> click on “Import object by name” icon.
Provide the Table name( MSSQL) and click on finish.
Table has been imported.
Right click on object and click on view data .
There is no data in imported table.
Click on save .
We view the fields in the table.
Right click on Project -> New -> Select Data Flow.
Enter the Data flow and Description.
Click on finish.
Import the table as Source onto the Data Flow editor.
looks like,
Drag and drop Basic Cleanse onto the data flow editor.
Drag and drop Template Table onto the Data Flow editor.
Provide the Table Owner Name and Table Name.
Click on OK.
Drag and drop the Query Transform onto the data flow editor.
Join the links between the tables as shown below
Click on Query Transform.
Drag and drop the fields from source table to Query transform.
Click on save all icon .
Go back to MSSQL server.
Right click on Table name -> click on edit top 200 rows.
Now, enter the data into the table as shown below, we can enter data with special chars and spaces etc ( Just to understand the Cleanse concept ).
Right click -> Click on execute SQL.
Go back to Data Services workbench .
Right click on Source Table -> select on view data .
Now, we can view the data that we have created in MSSQL server.
Click on close.
Click on Basic Cleanse.
Now, we can eliminate the special char values by using basic cleanse: If we have to eliminate the special chars in ZBUKRS1 column, click on that object -> Enter the special char under searched value -> Leave replacement value as empty or blank.
Click on template table.
We can see that the objects are mapped.
Click on validate icon.
Validation successful.
Click on close.
Click on execute icon .
Click on Finish.
Job execution successful.
Go to HANA studio.
Go to table,
Right click on table and click on refresh.
Right on Table -> Click on Data Preview.
Now, we can view the data in clear form without special chars, spaces etc
No comments:
Post a Comment