DronaBlog

Sunday, June 27, 2021

What is Common Table Expression (CTE) in Snowflake?

    Are you working on Snowflake technology and would like to understand Common Table Expressions, also known as CTEs? If so, then you have reached the right place. In this article, we will understand what a Common Table Expression is. We will also explore how to use CTEs in queries.

A) What is Common Table Expression i.e CTE?

      A Common Table Expression (CTE) is a named subquery. It is defined in a WITH clause and is equivalent to a temporary view.

      A CTE contains an optional list of columns and a SELECT statement. The output of a CTE is a table with the defined columns.





B) What is the syntax for CTE?

     The syntax for CTE is as below:

      WITH
          TEMP_CTE (COL_1, COL_2) AS
          (
              SELECT col_nm1, col_nm2 FROM SOURCE_TABLE
          )
      SELECT COL_1, COL_2 FROM TEMP_CTE;
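
     For instance, here is a minimal runnable sketch of the same pattern, assuming a hypothetical ORDERS table with CUSTOMER_ID and ORDER_AMOUNT columns:

      WITH
          ORDER_TOTALS (CUSTOMER_ID, TOTAL_AMOUNT) AS
          (
              -- Aggregate order amounts per customer inside the CTE
              SELECT CUSTOMER_ID, SUM(ORDER_AMOUNT)
              FROM ORDERS
              GROUP BY CUSTOMER_ID
          )
      -- The outer query reads from the CTE as if it were a temporary view
      SELECT CUSTOMER_ID, TOTAL_AMOUNT
      FROM ORDER_TOTALS
      WHERE TOTAL_AMOUNT > 1000;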

C) What is recursive CTE?

      A recursive CTE is a CTE that references itself. In a recursive CTE, we can join a table to itself as many times as needed, as shown in the sketch below.
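
      Below is a minimal sketch of a recursive CTE, assuming a hypothetical EMPLOYEES table with EMP_ID and MANAGER_ID columns:

      WITH RECURSIVE EMP_HIERARCHY (EMP_ID, MANAGER_ID, HIER_LEVEL) AS
      (
          -- Anchor clause: employees at the top of the hierarchy (no manager)
          SELECT EMP_ID, MANAGER_ID, 1
          FROM EMPLOYEES
          WHERE MANAGER_ID IS NULL
          UNION ALL
          -- Recursive clause: the CTE references itself to walk one level down
          SELECT E.EMP_ID, E.MANAGER_ID, H.HIER_LEVEL + 1
          FROM EMPLOYEES E
          JOIN EMP_HIERARCHY H
            ON E.MANAGER_ID = H.EMP_ID
      )
      SELECT EMP_ID, MANAGER_ID, HIER_LEVEL FROM EMP_HIERARCHY;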


D) Benefits of CTE?

      1) CTE increases the modularity of SQL programs.

       2) CTE helps simplify maintenance of complex queries.

       3) Recursive CTEs can be used to process hierarchical data in a table.





E) Naming conventions for CTE?

      We need to avoid CTE names that match existing database views or tables. The reason is that if a query uses a CTE whose name matches a table or view, the CTE takes precedence over that table or view, which can produce unexpected results.
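
         As an illustration (the CUSTOMERS name below is hypothetical), a CTE with the same name as an existing table shadows that table for the duration of the query:

          -- Even if a table named CUSTOMERS already exists, this query reads
          -- from the CTE, not from the table, which may not be the intent.
          WITH CUSTOMERS AS
          (
              SELECT 1 AS CUSTOMER_ID
          )
          SELECT * FROM CUSTOMERS;   -- returns the CTE's single row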




Tuesday, June 22, 2021

When does the Tokenization job get executed in Informatica MDM?

Are you looking for details about when the tokenization job gets executed in Informatica MDM? Would you be interested in knowing the relationship between a match job and a tokenization job? If so, then you have reached the right place. In this article, we will explore the instances during which the tokenization job gets triggered.

A. Why do we need a tokenization process in MDM?
          Tokenization is a process that generates token-based fuzzy match keys according to the match column configuration. The generated tokens are stored in the <BO>_STRP table. This STRP table is used as input to the match process, which matches records based on these tokens.
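
          As a rough illustration (the C_PARTY base object name below is hypothetical; strip tables typically follow the C_<BO>_STRP naming convention), the generated tokens can be inspected directly in the database:

          -- Count the match tokens generated for a hypothetical C_PARTY base object
          SELECT COUNT(*) FROM C_PARTY_STRP;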





B. When does the tokenization process get executed?
           The tokenization process gets triggered at the following points:
          1)  Manual tokenization execution
          2)  During the match process
          3)  During the load process

1) Manual tokenization job execution
       We can execute the tokenization job in Informatica MDM manually whenever it is needed. It is recommended to execute tokenization separately rather than in conjunction with a load or match job.





2)  Tokenization process as part of match job
       If the match tokens need to be updated, the match process in Informatica MDM automatically starts the tokenization process as part of the match job. This scenario may happen when new records have been added or existing records have been updated.


3) Tokenization process during load job
         If we enable the 'Generate Match Tokens on Load' property on the base object, the tokenization job is triggered whenever records are loaded into that base object. It is not recommended to enable this property, as it adversely impacts load job performance.








Friday, June 18, 2021

How to Use the LIKE Operator in Snowflake?

   Are you looking for details of what collate is in Snowflake? Are you also interested in knowing the purpose of collate and how to use it? If so, then you have reached the right place. We will understand how to use collate to achieve LIKE functionality in Snowflake. In this article, we will learn more about collate, or collation, in Snowflake.




A) What is collate?

          The collate function in Snowflake allows specifying alternative rules for comparing strings.

B) What is the purpose of collate in Snowflake?

        The collate function in Snowflake is used to compare and sort data. The comparison and sorting can be based on a particular language or on other user-specified rules.

         Text strings in Snowflake are stored using the UTF-8 character set. Comparing based on Unicode code points alone will not always provide the desired output, for the following reasons:

          1. Special characters in a language do not sort according to that language's standards.

          2. We may want to sort based on special rules, e.g. a case-insensitive sort.
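
          For example, here is a small sketch of a case-insensitive comparison using the 'en-ci' collation specification (the literals are illustrative):

          -- Default UTF-8 comparison is case-sensitive, so this returns FALSE
          SELECT 'John' = 'JOHN';

          -- With a case-insensitive English collation, the same comparison returns TRUE
          SELECT 'John' = 'JOHN' COLLATE 'en-ci';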


C) What types of rules can be used with collate in Snowflake?

     Here is a list of rules that can be used with collate (see the sketch after this list):

     1. Different character sets for different languages

     2. Case-insensitive comparisons

     3. Accent sensitivity, e.g. a, á, ä

     4. Punctuation sensitivity, e.g. P-Q-R vs PQR

     5. Sorting based on the first letter in the strings

     6. Trimming leading and trailing spaces before sorting

     7. Other options can be implemented based on business needs.
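
     A few hedged examples of these rules, using Snowflake collation specification strings (specifiers such as 'en-ci', 'en-ai', and 'trim' combine a locale with modifiers):

      -- Case-insensitive English comparison
      SELECT COLLATE('ABC', 'en-ci') = 'abc';

      -- Accent-insensitive comparison
      SELECT COLLATE('á', 'en-ai') = 'a';

      -- Trim leading and trailing spaces before comparing
      SELECT COLLATE('  ABC  ', 'trim') = 'ABC';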





D) Where to use collate in Snowflake SQL?

       1. Simple comparison in a WHERE clause

              WHERE FIELD1 = FIELD2

        2. Join condition

              ON EMP.EMP_NM = MANAGER.MNGR_NM

         3. Sorting condition

               ORDER BY FIELD1

         4. Aggregation condition

                GROUP BY FIELD1

          5. Aggregate functions

                  MAX(FIELD1)

          6. Scalar functions

                 LEAST(FIELD1, FIELD2, FIELD3)

         7. Data clustering conditions

                CLUSTER BY (FIELD1)

     There are several other usages of collate in SQL; however, the ones mentioned above are the most common.
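
      For instance, here is a sorting sketch (the EMP table and the 'de' German locale specifier are illustrative):

       -- Sort employee names using German-language collation rules
       SELECT EMP_NM
       FROM EMP
       ORDER BY EMP_NM COLLATE 'de';

       -- A collation can also be declared once on a column definition
       CREATE TABLE EMP_CI (EMP_NM VARCHAR COLLATE 'en-ci');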


E) How to use collate with the LIKE operator

      Here is an example of collate used with the LIKE operator:

     SELECT * FROM EMP WHERE COLLATE(NAME, '') LIKE '%ABC%';




