DronaBlog

Saturday, September 16, 2023

What is SSA Name3 Fuzzy Match Engine in Informatica MDM?

 


SSA Name3 Fuzzy Match Engine in Informatica MDM

SSA Name3 is a fuzzy match engine that is used in Informatica Master Data Management (MDM) to match records that contain names, addresses, and other identification data. SSA Name3 is a powerful engine that can match records even when there are errors or inconsistencies in the data.





SSA Name3 uses a variety of techniques to match records, including:

  • Phonetic matching: SSA Name3 can match records based on the phonetic similarity of the data. This is useful for matching records that contain different spellings of the same name or that are in different languages. Some of the phonetic matching algorithms used in SSA Name3 include:
    • Soundex
    • Double Metaphone
    • Cologne Phonetic
    • Metaphone 3
    • NYSIIS
    • Refined Soundex
  • Exact matching: SSA Name3 can also match records based on the exact match of the data. This is useful for matching records that contain the same data, such as the same name and address.
  • Fuzzy matching: SSA Name3 can also match records based on a fuzzy match of the data. This is useful for matching records that contain similar, but not identical, data. For example, SSA Name3 can match records that contain the names "John Smith" and "Jon Smith." Fuzzy match algorithms include:
    • Jaro-Winkler
    • Levenshtein distance
    • Dice coefficient
    • Needleman-Wunsch algorithm
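
To make the idea of a fuzzy match concrete, here is a minimal Python sketch of Levenshtein distance, one of the algorithms listed above. It is a textbook implementation for illustration only and does not reflect how SSA Name3 works internally; the function names levenshtein and similarity are made up for this example.

```python
def levenshtein(a: str, b: str) -> int:
    """Number of single-character inserts, deletes, or substitutions that turn a into b."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(previous[j] + 1,          # delete from a
                               current[j - 1] + 1,       # insert into a
                               previous[j - 1] + cost))  # substitute (or keep)
        previous = current
    return previous[-1]


def similarity(a: str, b: str) -> float:
    """Scale the distance to a score between 0.0 and 1.0 (1.0 means identical)."""
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1 - levenshtein(a, b) / longest


print(levenshtein("John Smith", "Jon Smith"))  # 1
```

A score like the one returned by similarity is what a match threshold would typically be compared against: the two names above differ by a single character, so they score close to 1.0.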

SSA Name3 is a very flexible engine, and it can be configured to meet the specific needs of your organization. You can configure SSA Name3 to match records based on different criteria, such as the type of data, the match thresholds, and the match weights.

To use SSA Name3 in Informatica MDM, you need to create a fuzzy match rule. A fuzzy match rule specifies the criteria that SSA Name3 will use to match records. You can create a fuzzy match rule to match any type of data, such as names, addresses, or product numbers.

Once you have created a fuzzy match rule, you can use it to match records in Informatica MDM. You can match records in a variety of ways, such as matching records in a batch or matching records in real time.

SSA Name3 is a powerful and flexible fuzzy match engine that can be used to improve the accuracy and efficiency of data matching in Informatica MDM.

Here are some examples of how SSA Name3 can be used in Informatica MDM:

  • Matching customer records: SSA Name3 can be used to match customer records from different sources, such as CRM systems and ERP systems. This can help to create a single, unified view of each customer.
  • Matching product records: SSA Name3 can be used to match product records from different sources, such as e-commerce systems and supply chain management systems. This can help to improve the accuracy of product data and reduce the risk of errors.
  • Matching employee records: SSA Name3 can be used to match employee records from different sources, such as HR systems and payroll systems. This can help to create a single, unified view of each employee.

SSA Name3 is a valuable tool for any organization that needs to match data from different sources. It can help to improve the accuracy and efficiency of data matching and reduce the risk of errors.


Phonetic matching in Informatica MDM:

  • Matching records with different spellings of the same name, such as "John Smith" and "Jon Smith."
  • Matching records with names in different languages, such as "Juan Pérez" and "John Perez."
  • Matching records with names that contain common abbreviations or nicknames, such as "Bill" and "William."
  • Matching records with names that contain typos or other errors, such as "Michale" and "Michael."

For phonetic matching, SSA Name3 uses a variety of techniques, including:

  • Soundex: Soundex is a phonetic algorithm that converts a word into a four-character code, made up of the word's first letter followed by three digits, based on the pronunciation of the word. For example, the names "John" and "Jon" both convert to the Soundex code "J500," so "John Smith" and "Jon Smith" are treated as phonetically equal.
  • Double Metaphone: Double Metaphone is a phonetic algorithm that converts a word into a primary and a secondary alphabetic key based on the pronunciation of the word. For example, the names "John" and "Jon" both produce the primary key "JN."
  • Cologne Phonetic: Cologne Phonetic is a phonetic algorithm that converts a word into a string of digits based on the pronunciation of the word in German. For example, the names "John" and "Jon" both convert to the Cologne Phonetic code "06."
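
As an illustration of how a phonetic code is built, here is a minimal Python sketch of the standard American Soundex rules. It is a generic implementation written for this post, not code taken from SSA Name3, and the function name soundex is only an example.

```python
def soundex(name: str) -> str:
    """Return the four-character American Soundex code for a single word."""
    codes = {}
    for letters, digit in (("BFPV", "1"), ("CGJKQSXZ", "2"), ("DT", "3"),
                           ("L", "4"), ("MN", "5"), ("R", "6")):
        for letter in letters:
            codes[letter] = digit

    letters = [ch for ch in name.upper() if ch.isalpha()]
    if not letters:
        return ""

    result = letters[0]                 # the first letter is kept as-is
    previous = codes.get(letters[0], "")
    for ch in letters[1:]:
        if ch in "HW":
            continue                    # H and W do not separate equal codes
        code = codes.get(ch, "")        # vowels and Y have no code
        if code and code != previous:
            result += code
        previous = code                 # a vowel resets the "same code" check
    return (result + "000")[:4]         # pad with zeros or cut to four characters


print(soundex("John"), soundex("Jon"))  # J500 J500
```

Because both spellings produce the same code, a phonetic comparison treats them as equal even though an exact comparison would not.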





SSA Name3 also supports a number of other phonetic algorithms, such as Metaphone 3, NYSIIS, and Refined Soundex. The algorithm that is best for you will depend on the specific type of data that you are trying to match.

To use SSA Name3 for phonetic matching in Informatica MDM, you need to create a fuzzy match rule. A fuzzy match rule specifies the criteria that SSA Name3 will use to match records. You can configure a fuzzy match rule to use phonetic matching by selecting the appropriate phonetic algorithm in the match rule settings.

Phonetic matching can be a very effective way to improve the accuracy and efficiency of data matching in Informatica MDM. It can help to match records that would not be matched using other methods, such as exact matching.


Learn more about Informatica MDM here



Tuesday, September 12, 2023

What are STRP and MTCH tables in Informatica MDM?

The STRP and MTCH tables are two important tables in Informatica MDM. They are used to store data related to the matching process.





STRP Table

The STRP table stores the SSA_KEYS generated by SSA Name3 for a given record. The keys are used for finding like records from similar keys.

In Oracle, the STRP table is an index-organized table (IOT). This means that the table's rows are stored in the primary key index itself, which makes lookups by key very efficient.

The STRP table contains the following columns:

  • SSA_KEY: This is the primary key of the table. It is a unique identifier for each record.
  • ROWID_OBJECT: This is the ROWID of the base object record that the SSA_KEY belongs to.
  • DATA_ROW: This is the row number of the SSA_DATA column in the STRP record.
  • DATA_COUNT: This is the number of rows in the SSA_DATA column.
  • SSA_DATA: This is the compressed data for the match columns.

MTCH Table

The MTCH table stores the match results for a given record. The results include the match score, the match path, and the match rules that were used.

The MTCH table is an ordinary relational table, not an index-organized table like STRP.

The MTCH table contains the following columns:

  • SSA_KEY: This is the primary key of the table. It is a foreign key to the STRP table.
  • MATCH_SCORE: This is the score for the match. It is a number that indicates how similar the two records are.
  • MATCH_PATH: This is the path that was used to match the two records.
  • MATCH_RULES: This is the list of match rules that were used.





How the STRP and MTCH Tables Work Together

The STRP and MTCH tables work together to provide the matching functionality in Informatica MDM. The STRP table is used to find similar records, and the MTCH table is used to store the match results.

When a new record is loaded into Informatica MDM, the STRP table is updated with the SSA_KEY values for the new record. These keys are then used to search the STRP table for existing records with similar keys, which become the match candidates.

If there are any matches, the match results are stored in the MTCH table. The match results can then be used to consolidate the two records.
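
The flow described above can be pictured with a small Python sketch. This is a conceptual toy only: the class ToyMatcher, the function toy_keys, and the scoring with Python's difflib are all assumptions made for illustration, and real SSA Name3 key generation and the actual STRP and MTCH table layouts are far more elaborate.

```python
from collections import defaultdict
from difflib import SequenceMatcher


def toy_keys(name: str) -> set:
    """Hypothetical stand-in for key generation: the first three letters of each word."""
    return {word[:3].upper() for word in name.split()}


class ToyMatcher:
    def __init__(self):
        self.strp = defaultdict(set)  # key -> row ids (plays the role of STRP)
        self.names = {}               # row id -> name (plays the role of the base object)
        self.mtch = []                # (row id, matched row id, score) (plays the role of MTCH)

    def load(self, rowid: int, name: str, threshold: float = 0.8) -> None:
        keys = toy_keys(name)
        # 1. Use the keys to find candidate records that share at least one key.
        candidates = set()
        for key in keys:
            candidates |= self.strp[key]
        # 2. Score each candidate and store results that pass the threshold.
        for other in sorted(candidates):
            score = SequenceMatcher(None, name.lower(), self.names[other].lower()).ratio()
            if score >= threshold:
                self.mtch.append((rowid, other, round(score, 2)))
        # 3. Register the new record's keys so later records can find it.
        for key in keys:
            self.strp[key].add(rowid)
        self.names[rowid] = name


matcher = ToyMatcher()
matcher.load(1, "John Smith")
matcher.load(2, "Jon Smith")
matcher.load(3, "Mary Jones")
print(matcher.mtch)
```

The point of the key lookup is that only records sharing a key are ever scored, so the expensive comparison runs on a handful of candidates instead of the whole table.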


Conclusion

The STRP and MTCH tables are two important tables in Informatica MDM. They are used to store data related to the matching process. By understanding how these tables work together, you can better understand how the matching functionality in Informatica MDM works.






Sunday, September 3, 2023

Org, Secure Agent, and Chicklets in Informatica IDMC

Org

In Informatica IDMC, an org is a logical grouping of users, resources, and data. It is used to manage access to data and applications, and to ensure that data is secure. Each org has its own set of permissions, which define who can access what data and applications.





Secure Agent

The Secure Agent is a software component that runs on a physical or virtual machine in the customer's environment. It is responsible for connecting to the Informatica Cloud and running tasks. The Secure Agent also provides security features, such as encryption and authentication, to protect data in transit and at rest.

Chicklets

Chicklets are small, rectangular icons that represent tasks or applications in Informatica IDMC. They are displayed in the IDMC user interface, and can be used to launch tasks, view data, and manage resources.

How do Orgs, Secure Agents, and Chicklets work together?

Orgs, Secure Agents, and Chicklets work together to provide a secure and scalable environment for data integration.

  • Orgs are used to manage access to data and applications. Each org has its own set of permissions, which define who can access what data and applications.
  • Secure Agents are used to connect to the Informatica Cloud and run tasks. The Secure Agent also provides security features, such as encryption and authentication, to protect data in transit and at rest.
  • Chicklets are used to launch tasks, view data, and manage resources. They are displayed in the IDMC user interface, and can be easily customized to meet the needs of the user.





Benefits of using Orgs, Secure Agents, and Chicklets

There are many benefits to using Orgs, Secure Agents, and Chicklets in Informatica IDMC. These include:

  • Improved security: Orgs and Secure Agents help to protect data by providing a secure environment for data integration.
  • Increased scalability: Chicklets can be used to launch tasks and view data, which can help to improve the scalability of data integration workflows.
  • Improved usability: The IDMC user interface is easy to use and navigate, and chicklets can be customized to meet the needs of the user.

Orgs, Secure Agents, and Chicklets are essential components of Informatica IDMC. They work together to provide a secure and scalable environment for data integration. By using these components, organizations can improve the security, scalability, and usability of their data integration workflows.



Learn more about Informatica MDM Cloud (SaaS) here



Wednesday, August 30, 2023

How to fix error - ERROR: 'Specified alias is not a private key: com.linoma.commons.crypto.CryptoException', while setting up an AS2 Server in IDMC

Understanding the Problem:

When setting up an AS2 Server in IDMC, you may encounter the error message "ERROR: 'Specified alias is not a private key: com.linoma.commons.crypto.CryptoException: Specified alias is not a private key'". 


Root cause:

The error message "Specified alias is not a private key" may occur if the Decryption Certificate Alias is left empty.






Error message:

2023-05-07 11:22:32,388 IST ERROR [AS2MessageWorker] {https-jsse-nio-1.1.1.1-15400-exec-8} Specified alias is not a private key: com.linoma.commons.crypto.CryptoException: Specified alias is not a private key:
        at com.linoma.dpa.crypto.x509.KeyStoreManager.getPrivateKey(KeyStoreManager.java:750)
        at com.linoma.dpa.ghttps.as2.AS2MessageWorker.decryptData(AS2MessageWorker.java:553)
        at com.linoma.dpa.ghttps.as2.AS2MessageWorker.processMessage(AS2MessageWorker.java:158)
        at com.linoma.dpa.ghttps.servlets.AS2Servlet.doPost(AS2Servlet.java:161)
        at javax.servlet.http.HttpServlet.service(HttpServlet.java:681)
        at javax.servlet.http.HttpServlet.service(HttpServlet.java:764)
        at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:227)
        at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:162)
        at org.apache.tomcat.websocket.server.WsFilter.doFilter(WsFilter.java:53)
        at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:189)
        at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:162)






Solution: 

To resolve the issue, provide the Decryption Certificate Alias, that is, the alias of the private key, when you set up the AS2 Server.




Learn more about Informatica MDM Cloud here





Thursday, August 24, 2023

What is a Secure Agent in Informatica IDMC? Which one is better?

 

What is a Secure Agent in Informatica IDMC?





A Secure Agent is a lightweight program that runs all tasks and enables secure communication across the firewall between Informatica IDMC and the machines where the data resides. It is responsible for managing the connections to the data sources, transferring data, and executing tasks.

Different Types of Secure Agents in Informatica IDMC

There are two types of Secure Agents in Informatica IDMC:

  • Local Secure Agent: This type of agent is installed on a machine within the corporate network. It is the most common type of agent and is used for most data integration tasks.
  • Cloud Secure Agent: This type of agent is hosted in the cloud. It is used for tasks that require access to cloud-based data sources.

Which Secure Agent is Better?

The best type of Secure Agent for a particular task depends on the following factors:

  • The location of the data sources: If the data sources are located in the corporate network, then a Local Secure Agent is the best option. If the data sources are located in the cloud, then a Cloud Secure Agent is the best option.
  • The security requirements: If the data is sensitive, then a Local Secure Agent is the best option because it provides more security. If the data is not sensitive, then a Cloud Secure Agent is a good option because it is more cost-effective.




  • The performance requirements: If the task requires high performance, then a Local Secure Agent is the best option. If the task does not require high performance, then a Cloud Secure Agent is a good option.


Conclusion

The choice of Secure Agent depends on the specific requirements of the task. In general, a Local Secure Agent is the best option for tasks that require access to data sources in the corporate network and have high security requirements. A Cloud Secure Agent is a good option for tasks that require access to cloud-based data sources and are cost-sensitive.

Here are some additional things to consider when choosing a Secure Agent:

  • The number of tasks that need to be run: If you need to run a large number of tasks, then a Local Secure Agent may be the best option because it can handle more concurrent connections.
  • The size of the data sets: If you need to work with large data sets, then a Local Secure Agent may be the best option because it can handle more data throughput.
  • The level of technical expertise: If you have a lot of technical expertise, then you may be able to manage a Cloud Secure Agent yourself. If you do not have a lot of technical expertise, then a Local Secure Agent may be the best option because it is easier to manage.

What is Undermatching in Informatica MDM?


Undermatching is a situation in which two or more records in a master data management (MDM) system do not match, even though they should.



This can happen for a variety of reasons, such as:

  • The records have different values for some of the key attributes.
  • The records have been created by different systems or applications.
  • The records have been corrupted or incorrectly entered.

Undermatching can lead to a number of problems, such as:

  • Inaccurate data analysis.
  • Duplicate data.
  • Poor decision-making.

How to Identify Undermatching

There are a number of ways to identify undermatching in an MDM system. One common approach is to use SQL queries to compare the records in different tables. For example, if the match rule contains fuzzy match columns from both the parent (Party) table and the child (Address) table, write a SQL statement that joins the two tables, compares all the match columns, and confirm that it returns no duplicate records.


The SQL below assumes that the match rule columns are First_Name and Last_Name from the Customer (party) table and Address_Line_1 and Country from the Address table.

SQL
-- Flatten each customer with its addresses, then pair rows that share all match-column values.
SELECT sub1.*, sub2.*
FROM
  (SELECT c.Rowid_object, c.First_Name, c.Last_Name, c.Display_Name, a.Address_Line_1, a.Country, a.State
   FROM Customer c
   LEFT JOIN Address a
     ON c.Rowid_object = a.Party_Rowid) sub1,
  (SELECT c.Rowid_object, c.First_Name, c.Last_Name, c.Display_Name, a.Address_Line_1, a.Country, a.State
   FROM Customer c
   LEFT JOIN Address a
     ON c.Rowid_object = a.Party_Rowid) sub2
-- "<" rather than "<>" lists each pair once instead of as both (A, B) and (B, A).
WHERE sub1.Rowid_object < sub2.Rowid_object
  AND sub1.First_Name = sub2.First_Name
  AND sub1.Last_Name = sub2.Last_Name
  AND sub1.Address_Line_1 = sub2.Address_Line_1
  AND sub1.Country = sub2.Country





This query returns pairs of Customer records that have different ROWID_OBJECT values but identical values in all the match columns. Because these records were not merged, they are likely to be undermatched.
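
To try the duplicate check on a small scale, the following Python sketch runs an equivalent query against an in-memory SQLite database. The table and column names follow the SQL in this post, but the sample rows are invented purely for illustration, and a real Informatica MDM schema would live in Oracle, SQL Server, or Db2 rather than SQLite. The sketch compares row IDs with "<" so that each pair is listed only once.

```python
import sqlite3

QUERY = """
WITH flat AS (
    SELECT c.Rowid_object, c.First_Name, c.Last_Name, a.Address_Line_1, a.Country
    FROM Customer c
    LEFT JOIN Address a ON c.Rowid_object = a.Party_Rowid
)
SELECT s1.Rowid_object, s2.Rowid_object
FROM flat s1
JOIN flat s2 ON s1.Rowid_object < s2.Rowid_object
WHERE s1.First_Name = s2.First_Name
  AND s1.Last_Name = s2.Last_Name
  AND s1.Address_Line_1 = s2.Address_Line_1
  AND s1.Country = s2.Country
ORDER BY s1.Rowid_object, s2.Rowid_object
"""


def find_unmerged_duplicates(conn: sqlite3.Connection) -> list:
    """Return (rowid, rowid) pairs whose match columns are identical."""
    return conn.execute(QUERY).fetchall()


conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Customer (Rowid_object TEXT, First_Name TEXT, Last_Name TEXT);
CREATE TABLE Address (Party_Rowid TEXT, Address_Line_1 TEXT, Country TEXT);
-- Invented sample rows: customers 1 and 2 are the same person loaded twice.
INSERT INTO Customer VALUES ('1', 'John', 'Smith'), ('2', 'John', 'Smith'), ('3', 'Mary', 'Jones');
INSERT INTO Address VALUES ('1', '10 Main St', 'US'), ('2', '10 Main St', 'US'), ('3', '5 Oak Ave', 'US');
""")

print(find_unmerged_duplicates(conn))  # [('1', '2')]
```

Note that this check only finds exact duplicates on the match columns. Records that differ slightly, such as "John" and "Jon", need a fuzzy comparison to be detected.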

Another way to identify undermatching is to use a data profiling tool. Data profiling tools can analyze the data in an MDM system and identify a variety of problems, including undermatching.

How to Fix Undermatching

Once undermatching has been identified, it can be fixed in a number of ways. One common approach is to manually merge the unmatched records. This can be a time-consuming and error-prone process, but it is often the only option when the undermatching is caused by human error.

Another approach is to use automated matching algorithms. These algorithms can compare the records in different tables and identify the ones that are most likely to be matches. Once the matches have been identified, they can be merged automatically.

The best approach to fixing undermatching will depend on the specific situation. However, it is important to fix undermatching as soon as possible to avoid the problems that it can cause.


Learn more about Match process in Informatica MDM here



How to fix error : SDKC_37015 Plug-in Error: Invalid index for transformation type [0]: Should range from [0] to [-1]

The error message "SDKC_37015 Plug-in Error: Invalid index for transformation type [0]: Should range from [0] to [-1]" normally occurs when we try to run an ingress or egress job in Informatica IDMC (Intelligent Data Management Cloud).





a) Error Message:

To investigate this issue, check the session log. It will contain entries like these:

WRITER_1_*_1> SDKC_37015 [2023-08-17 12:17:24.732] Plug-in Error: Invalid index for transformation type [0]: Should range from [0] to [-1]
WRITER_1_*_1> MDM_10000 [2023-08-17 12:17:25.030] [INFO] JobInstanceId recieved from taskflow is $$jobInstanceId
WRITER_1_*_1> CMN_1761 [2023-08-17 12:17:25.031] Timestamp Event: [Mon Aug 17 12:17:25 2022]
WRITER_1_*_1> MDM_10000 [2023-08-17 12:17:25.031] [ERROR] jobInstanceId is empty or has default value, hence quitting
WRITER_1_*_1> MDM_10000 [2023-08-17 12:17:25.031] [WARNING] Failed to establish connection with the Data Souce.
WRITER_1_*_1> CMN_1761 [2023-08-17 12:17:25.031] Timestamp Event: [Mon Aug 17 12:17:25 2022]
WRITER_1_*_1> MDM_10000 [2023-08-17 12:17:25.031] [ERROR] Connection failed for CCI Client


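Notice the message related to jobInstanceId: "JobInstanceId recieved from taskflow is $$jobInstanceId". The taskflow passed the literal placeholder $$jobInstanceId instead of an actual value, which is why the job quits with "jobInstanceId is empty or has default value".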


b) How to fix it?

As the issue is related to jobInstanceId, we need to make sure the job instance ID is available at each of the following levels:

1. Mapping Level - Make sure the in-out parameter jobInstanceId is defined.

2. Mapping Task Level - Make sure the in-out parameter jobInstanceId is picked up automatically from the mapping.

3. Taskflow Level - Verify that jobInstanceId has been added to both the Start step and the Data Task step.


Once jobInstanceId is set up correctly at all three levels, the issue will be resolved.
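
When a session log is long, a small script can help spot this kind of problem. The Python sketch below scans log text for parameter placeholders that were never replaced with a value. It assumes that an unresolved parameter shows up as a literal $$name token, as it does in the log excerpt in this post; the function name unresolved_parameters is only an example and is not part of any Informatica tool.

```python
import re

# A placeholder is "$$" followed by an identifier, for example $$jobInstanceId.
PLACEHOLDER = re.compile(r"\$\$[A-Za-z_]\w*")


def unresolved_parameters(log_text: str) -> dict:
    """Map each unresolved parameter name to the log line numbers where it appears."""
    found = {}
    for number, line in enumerate(log_text.splitlines(), start=1):
        for token in PLACEHOLDER.findall(line):
            found.setdefault(token[2:], []).append(number)
    return found


sample_log = (
    "WRITER_1_*_1> SDKC_37015 [2023-08-17 12:17:24.732] Plug-in Error: "
    "Invalid index for transformation type [0]: Should range from [0] to [-1]\n"
    "WRITER_1_*_1> MDM_10000 [2023-08-17 12:17:25.030] [INFO] "
    "JobInstanceId recieved from taskflow is $$jobInstanceId\n"
    "WRITER_1_*_1> MDM_10000 [2023-08-17 12:17:25.031] [ERROR] "
    "jobInstanceId is empty or has default value, hence quitting\n"
)

print(unresolved_parameters(sample_log))  # {'jobInstanceId': [2]}
```

If the result is empty after the fix, the parameter is being passed down from the taskflow as expected.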


Learn more about Informatica MDM Cloud (SaaS) here





Understanding Survivorship in Informatica IDMC - Customer 360 SaaS

  In Informatica IDMC - Customer 360 SaaS, survivorship is a critical concept that determines which data from multiple sources should be ret...