Technology World

Monday, October 23, 2023

What is Checksum - understanding with an example

What is a checksum?

A checksum is a small-sized block of data derived from another block of digital data for the purpose of detecting errors that may have been introduced during its transmission or storage. Checksums are often used to verify data integrity, but they are not relied upon to verify data authenticity.

How does a checksum work?

A checksum is generated by running an algorithm over a piece of data. The algorithm produces a short, fixed-size value, called the checksum, that depends on the content of the data. If the data changes, the checksum will almost certainly change as well; two different inputs can share a checksum, but a good algorithm makes that very unlikely to happen by accident.

Example of a checksum:

Suppose we have a file called myfile.txt with the following contents:

This is a test file.

We can generate a checksum for this file using the following command:

md5sum myfile.txt

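This prints a 32-character hexadecimal MD5 digest followed by the file name. The exact digest depends on every byte in the file, including the trailing newline. As a reference point, an empty myfile.txt always produces the following output:

d41d8cd98f00b204e9800998ecf8427e  myfile.txt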

If we now change the contents of the file to be:

This is a test file with some changes.

And then generate a checksum again, we will get the following output:

ba948517d011032327d7224464325882 myfile.txt

As you can see, the checksum has changed because the contents of the file have changed.
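The same experiment can be reproduced without md5sum. The following is a minimal Python sketch using the standard hashlib module; the helper name md5_hex is made up for this illustration, and the two strings are hashed in memory rather than read from myfile.txt.

```python
import hashlib


def md5_hex(data: bytes) -> str:
    """Return the MD5 digest of data as 32 hexadecimal characters."""
    return hashlib.md5(data).hexdigest()


# The two versions of the file contents from the example, each with a trailing newline.
original = md5_hex(b"This is a test file.\n")
changed = md5_hex(b"This is a test file with some changes.\n")

print(original)
print(changed)
print("digests differ:", original != changed)
```

Even a one-character edit to the input produces a completely different digest, which is what makes the comparison useful.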

Uses of checksums

Checksums are used in a variety of ways, including:

To verify the integrity of downloaded files. Many software developers provide checksums for their downloads so that users can verify that the files have not been corrupted during the download process.
To verify the integrity of data transmitted over a network. For example, checksums can be used to detect errors in TCP/IP packets.
To verify the integrity of data stored on disk. For example, checksums can be used to detect errors in file systems.
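As an illustration of the network case, IPv4, TCP, and UDP headers carry a 16-bit ones' complement checksum of the kind described in RFC 1071. The sketch below is a simplified Python version of that calculation; the function name internet_checksum is chosen for this example, and real protocol stacks also cover details such as the TCP pseudo-header that are left out here.

```python
def internet_checksum(data: bytes) -> int:
    """Compute a 16-bit ones' complement checksum in the style of RFC 1071."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input with a zero byte

    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]  # add each big-endian 16-bit word

    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)  # fold carries back into the low 16 bits

    return ~total & 0xFFFF


payload = bytes.fromhex("0001f203f4f5f6f7")
checksum = internet_checksum(payload)

# A receiver recomputes the checksum over the data plus the transmitted checksum;
# a result of zero means no error was detected.
received = payload + checksum.to_bytes(2, "big")
print(hex(checksum), internet_checksum(received) == 0)  # prints: 0x220d True
```

This kind of checksum is cheap to compute but only guards against accidental errors; it is not meant to resist deliberate modification.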

Checksums: A simple way to protect your data

Checksums are a simple but effective way to protect your data from errors. By generating a checksum for a piece of data and then comparing it to the checksum later on, you can verify that the data has not been corrupted.

How to generate a checksum

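There are many different ways to generate a checksum. Simple error-detecting codes such as CRC32 are common in network protocols and archive formats, while file downloads are usually published with the output of a cryptographic hash function such as MD5 or SHA-256. These functions produce a fixed-size value, called the checksum or digest, which is based on the content of the data.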

To generate a checksum using a cryptographic hash function, you can use the following command:

md5sum myfile.txt

This prints the 32-character hexadecimal digest followed by the file name. For an empty file, for example, the output is:

d41d8cd98f00b204e9800998ecf8427e  myfile.txt

How to verify a checksum

To verify a checksum, recompute it for the data you have and compare the result with the checksum that was originally published or stored. If the two match, the data is almost certainly unchanged. If they do not match, the data has been corrupted or modified.
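The comparison can also be scripted. The following is a minimal Python sketch, assuming SHA-256 as the algorithm; the helper names file_sha256 and matches_checksum are invented for this example, and a temporary file stands in for a real download.

```python
import hashlib
import hmac
import tempfile
from pathlib import Path


def file_sha256(path: Path, chunk_size: int = 65536) -> str:
    """Hash a file in chunks so large files do not need to fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(path: Path, expected_hex: str) -> bool:
    """Compare the file's digest with an expected value, ignoring case and whitespace."""
    return hmac.compare_digest(file_sha256(path), expected_hex.strip().lower())


with tempfile.TemporaryDirectory() as tmp:
    sample = Path(tmp) / "myfile.txt"
    sample.write_bytes(b"This is a test file.\n")
    published = file_sha256(sample)  # stands in for a checksum published by the file's author

    intact = matches_checksum(sample, published.upper())
    sample.write_bytes(b"This is a test file with some changes.\n")
    still_matches = matches_checksum(sample, published)

print(intact, still_matches)
```

Reading in chunks is a design choice for large downloads; for small files, hashing the whole content in one call works just as well.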

Additional tips

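It is important to choose an algorithm that matches the threat. MD5 is adequate for detecting accidental corruption, but it is no longer collision-resistant, so use SHA-256 or another modern hash when you need to detect deliberate tampering. Weaker algorithms are more likely to miss a change because different inputs can produce the same checksum.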
It is also important to store the checksums in a safe place. If the checksums are lost or corrupted, then you will not be able to verify the integrity of your data.
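If you are verifying the integrity of downloaded files, be sure to get the checksums from a trusted source, such as the project's official site over HTTPS or a signed checksum file. A checksum hosted next to the file on the same server still catches download corruption, but it offers no protection if that server is compromised, because an attacker can replace both.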

Checksums are a valuable tool for protecting your data from errors. By following the tips above, you can use checksums to detect corruption before it causes problems.

Wednesday, October 18, 2023

What is MD5 hashing?

MD5 is a cryptographic hash function that converts data of any length into a fixed-length digest of 128 bits, usually written as 32 hexadecimal characters. It is designed as a one-way function, meaning that it is computationally infeasible to recover the original data from the hash value alone.

MD5 hashing is used in a variety of applications, including:

File integrity verification: MD5 hashes can be used to verify the integrity of a file by comparing the hash of the downloaded file to the hash of the original file. This reliably detects accidental corruption; because MD5 collisions can be constructed, it should not be the only defense against deliberate tampering.
Password storage: MD5 has historically been used to store passwords. When a user logs in, their password is hashed and compared to the hash stored on the server, and the user is authenticated if the hashes match. MD5 is no longer recommended for this purpose because it is fast to compute and easy to brute-force; a salted, deliberately slow algorithm such as bcrypt, scrypt, or Argon2 is preferred.
Digital signatures: MD5 hashes have been used in digital signatures, where a signature scheme signs the hash of a message so that its authenticity can be verified. Because of its collision weaknesses, MD5 should not be used in new signature schemes.
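For the password storage case, a salted and deliberately slow key derivation function is generally a better fit than a single fast hash such as MD5. The following is a minimal Python sketch using PBKDF2 from the standard hashlib module as a stand-in; the function names and the iteration count are assumptions made for this example, not a recommendation from any particular product.

```python
import hashlib
import hmac
import os
from typing import Tuple


def hash_password(password: str, iterations: int = 100_000) -> Tuple[bytes, int, bytes]:
    """Derive a key from the password with a fresh random salt (assumed parameters)."""
    salt = os.urandom(16)
    key = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, iterations)
    return salt, iterations, key


def verify_password(password: str, salt: bytes, iterations: int, expected_key: bytes) -> bool:
    """Re-derive the key with the stored salt and compare in constant time."""
    key = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, iterations)
    return hmac.compare_digest(key, expected_key)


salt, rounds, stored_key = hash_password("correct horse battery staple")
print(verify_password("correct horse battery staple", salt, rounds, stored_key))
print(verify_password("wrong guess", salt, rounds, stored_key))
```

The server stores the salt, the iteration count, and the derived key; the salt ensures that two users with the same password do not end up with the same stored value.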

Example of MD5 hashing

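To generate an MD5 hash, you can use a variety of online or offline tools. The md5sum command expects file names as arguments, so to hash the string "Hello, world!" itself, pipe it to md5sum on standard input in a terminal window (printf is used because echo would add a trailing newline and change the hash):

printf '%s' 'Hello, world!' | md5sum

This will generate the following output, where the trailing "-" indicates that the input came from standard input:

6cd3556deb0da54bca060b4c39479839  -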

The first 32 characters of the output are the MD5 hash of the string "Hello, world!".
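The fixed length of the digest is easy to check in code. The following short Python sketch uses the standard hashlib module; the variable names are chosen only for this illustration.

```python
import hashlib

short_digest = hashlib.md5(b"Hello, world!").hexdigest()
long_digest = hashlib.md5(b"Hello, world!" * 10_000).hexdigest()

print(short_digest)
print(len(short_digest), len(long_digest))  # prints: 32 32
print(hashlib.md5().digest_size * 8)        # prints: 128 (digest size in bits)
```

Whether the input is 13 bytes or 130,000 bytes, the digest is always 128 bits, shown as 32 hexadecimal characters.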

MD5 hashing is a powerful tool that can be used to protect data and ensure its integrity. However, it is important to note that MD5 is not considered to be a secure cryptographic hash function anymore. This is because it is possible to create two different files with the same MD5 hash, which is known as a collision.

Despite its security weaknesses, MD5 is still widely used in a variety of applications. This is because it is a relatively fast and easy-to-use hash function.

Here are some of the pros and cons of using MD5 hashing:

Pros:

Fast and easy to use
Widely supported
Can be used to detect accidental data corruption

Cons:

Not considered to be a secure cryptographic hash function anymore
Possible to create collisions

If you are looking for a secure cryptographic hash function to protect your data, you should consider using a newer algorithm such as SHA-2 or SHA-3. However, MD5 may still be a suitable option for some applications, such as detecting accidental file corruption, where collision attacks are not a concern.

Monday, October 16, 2023

What is the difference between Cloud Data Integration and Cloud API Integration in Informatica IDMC?

The main difference between Cloud Data Integration and Cloud API Integration in Informatica IDMC is the focus of each platform. Cloud Data Integration is designed to help organizations integrate data from multiple sources, including cloud and on-premises systems. Cloud API Integration is designed to help organizations integrate applications and data using APIs.

Cloud Data Integration

Informatica Cloud Data Integration (CDI) is a cloud-native data integration platform that enables organizations to automate, scale, and govern their data integration processes. CDI supports a wide range of data sources and targets, including cloud and on-premises databases, files, and streaming data sources. CDI also provides a variety of features to help organizations improve their data quality, including data profiling, cleansing, and transformation capabilities.

Cloud API Integration

Informatica Cloud Application Integration (CAI), called Cloud API Integration in this post, is a cloud-native API integration platform that enables organizations to connect applications and data using APIs. CAI provides a variety of features to help organizations design, develop, manage, and deploy APIs, including:

API design and development tools
API management and lifecycle management capabilities
API security and governance features
API monitoring and analytics capabilities

Key Differences

The following table summarizes the key differences between Cloud Data Integration and Cloud API Integration in Informatica IDMC:

Feature	Cloud Data Integration	Cloud API Integration
Focus	Data integration	API integration
Supported data sources and targets	Databases, files, and streaming data sources (cloud and on-premises)	APIs (cloud and on-premises)
Key features	Data profiling, cleansing, and transformation	API design, development, management, deployment, security, governance, monitoring, and analytics
Use cases	Data warehousing, data lakes, data analytics, and business intelligence	API-driven applications, B2B integration, and microservices architectures

Which Platform to Choose?

The best platform for your organization will depend on your specific needs and requirements. If you need to integrate data from multiple sources, including cloud and on-premises systems, then Cloud Data Integration is a good choice. If you need to integrate applications and data using APIs, then Cloud API Integration is a good choice.

Many organizations use both Cloud Data Integration and Cloud API Integration together to create a comprehensive data integration and API management solution. For example, an organization might use Cloud Data Integration to integrate data from their on-premises CRM system and their cloud-based marketing automation system into a data warehouse. They might then use Cloud API Integration to expose the data in the data warehouse to their sales and marketing teams through APIs.

Cloud Data Integration and Cloud API Integration are both powerful platforms that can help organizations integrate data and applications. If you are unsure which one is right for you, Informatica offers a variety of resources to help you decide, including free trials, demos, and consultations.

Sunday, October 15, 2023

How to import CSV data in Reference 360 using a REST call

To import CSV data in Reference 360 using REST Call in Informatica IDMC, you can follow these steps:

Prepare your CSV file. The CSV file must start with two header rows, followed by the data rows. The first header row must contain the names of the columns in the CSV file. The second header row must contain the following values:
- System Reference Data Value - The key value for the system reference data value that you want to assign.
- Code Value - The code value for the system reference data value.
Upload the CSV file to a cloud storage location, such as Amazon S3 or Google Cloud Storage.

Send a POST request to the following endpoint:

https://XXX-mdm.dm-us.informaticacloud.com/rdm-service/external/v2/import

In the request body, include the following information:
- file - The name of the CSV file that you uploaded to cloud storage.
- importSettings - A JSON object that specifies the import settings. The following import settings are required:
  - delimiter - The delimiter that is used in the CSV file.
  - textQualifier - The text qualifier that is used in the CSV file.
  - codepage - The codepage that is used in the CSV file.
  - dateFormat - The date format that is used in the CSV file.
  - containerType - The type of container to which you want to import the data. For example, to import code values, you would specify CODELIST.
  - containerId - The ID of the container to which you want to import the data.
In the request headers, include the following information:
- Authorization - Your Informatica IDMC API token.
- IDS-SESSION-ID - Your Informatica IDMC session ID.
Send the request.

If the request is successful, Informatica IDMC will start importing the data from the CSV file. You can check the status of the import job by sending a GET request to the following endpoint:

https://XXX-mdm.dm-us.informaticacloud.com/rdm-service/external/v2/import/{jobId}

Where {jobId} is the ID of the import job.

Once the import job is complete, you can view the imported data in Reference 360.

Here is an example of a POST request to import code values from a CSV file:

POST https://XXX-mdm.dm-us.informaticacloud.com/rdm-service/external/v2/import HTTP/1.1
Authorization: Bearer YOUR_API_TOKEN
IDS-SESSION-ID: YOUR_SESSION_ID
Content-Type: multipart/form-data; boundary=YUWTQUEYJADH673476Ix1zInP11uCfbm

--YUWTQUEYJADH673476Ix1zInP11uCfbm
Content-Disposition: form-data; name=file; filename=import-code-values.csv

--YUWTQUEYJADH673476Ix1zInP11uCfbm
Content-Disposition: form-data; name=importSettings
Content-Type: application/json;charset=UTF-8

{
  "delimiter": ",",
  "textQualifier": "\"",
  "codepage": "UTF8",
  "dateFormat": "ISO",
  "containerType": "CODELIST",
  "containerId": "676SJ1990a54dcdc86f54cf",
  "startingRow": null
}

--YUWTQUEYJADH673476Ix1zInP11uCfbm--

Replace YOUR_API_TOKEN with your Informatica IDMC API token and YOUR_SESSION_ID with your Informatica IDMC session ID. Replace import-code-values.csv with the name of your CSV file and 676SJ1990a54dcdc86f54cf with the ID of the code list to which you want to import the data.
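The raw request above can also be assembled programmatically. The following is a minimal Python sketch using only the standard library; it builds the request object without sending it. The endpoint, header names, and settings are copied from this post, while the function name build_import_request, the text/csv part type, the inline file contents, and the placeholder CSV bytes are assumptions made for illustration.

```python
import json
import urllib.request
import uuid

# Endpoint as written in this post; XXX is left as the post's own placeholder.
IMPORT_URL = "https://XXX-mdm.dm-us.informaticacloud.com/rdm-service/external/v2/import"


def build_import_request(api_token, session_id, csv_name, csv_bytes, import_settings):
    """Build (but do not send) a multipart/form-data POST mirroring the example request."""
    boundary = uuid.uuid4().hex
    crlf = b"\r\n"
    delimiter = b"--" + boundary.encode("ascii")

    body = b"".join([
        delimiter + crlf,
        f'Content-Disposition: form-data; name="file"; filename="{csv_name}"'.encode("utf-8") + crlf,
        b"Content-Type: text/csv" + crlf + crlf,  # assumed part type
        csv_bytes + crlf,
        delimiter + crlf,
        b'Content-Disposition: form-data; name="importSettings"' + crlf,
        b"Content-Type: application/json;charset=UTF-8" + crlf + crlf,
        json.dumps(import_settings).encode("utf-8") + crlf,
        delimiter + b"--" + crlf,
    ])

    request = urllib.request.Request(IMPORT_URL, data=body, method="POST")
    request.add_header("Authorization", f"Bearer {api_token}")
    request.add_header("IDS-SESSION-ID", session_id)
    request.add_header("Content-Type", f"multipart/form-data; boundary={boundary}")
    return request


settings = {
    "delimiter": ",",
    "textQualifier": "\"",
    "codepage": "UTF8",
    "dateFormat": "ISO",
    "containerType": "CODELIST",
    "containerId": "676SJ1990a54dcdc86f54cf",
    "startingRow": None,
}

request = build_import_request(
    "YOUR_API_TOKEN",
    "YOUR_SESSION_ID",
    "import-code-values.csv",
    b"placeholder,not,real,data\n",  # hypothetical contents, not a valid Reference 360 file
    settings,
)
print(request.get_method(), request.full_url)
```

Passing the returned object to urllib.request.urlopen would send it; that step is left out here because it needs a real host name, token, and session ID.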

Sunday, September 24, 2023

What is consolidation process in Informatica MDM?

In Informatica MDM (Master Data Management), the consolidation process is a fundamental and crucial step in managing and maintaining master data. The consolidation process aims to identify and merge duplicate or redundant records within a master data domain, such as customer, product, or supplier data. This process is essential for ensuring data accuracy, consistency, and reliability across an organization's various systems and applications.

Here are the key aspects and steps involved in the consolidation process in Informatica MDM:

Data Source Integration: The consolidation process begins with the integration of data from various source systems into the MDM hub. These source systems might have their own data structures and formats.
Data Matching: Once data is integrated into the MDM hub, the system performs data matching to identify potential duplicate records. Data matching algorithms and rules are used to compare and evaluate data attributes to determine if records are similar enough to be considered duplicates.
Data Survivorship Rules: Data survivorship rules are defined to specify which data values should be retained or prioritized during the consolidation process. These rules help determine which data elements from duplicate records should be merged into the final, consolidated record.
Record Linking: The consolidation process creates links between duplicate or related records, essentially establishing relationships between them. This linkage allows the system to group similar records together for consolidation.
Conflict Resolution: In cases where conflicting data exists between duplicate records, conflict resolution rules come into play. These rules specify how conflicts should be resolved. For example, a conflict resolution rule might prioritize data from a certain source system or use predefined business rules.
Data Merge: Once the system identifies duplicate records, resolves conflicts, and determines the survivorship rules, it consolidates the data from duplicate records into a single, golden record. This golden record represents the best and most accurate version of the data.
Data Enrichment: During consolidation, the system may also enrich the data by incorporating additional information or attributes from related records, ensuring that the consolidated record is as complete as possible.
Data Validation: After consolidation, the data is subject to validation to ensure it adheres to data quality and business rules. This step helps maintain the integrity of the consolidated data.
History and Audit Trail: It is essential to keep a history of consolidation activities and changes made to the data. An audit trail is maintained to track who made changes and when.
Data Distribution: Once consolidation is complete, the cleansed and consolidated master data is made available for distribution to downstream systems and applications through the use of provisioning tools or integration processes.
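The matching, survivorship, and merge steps above can be sketched in a few lines of code. The following Python example is a conceptual illustration only and is not how Informatica MDM is implemented or configured; the source names, the priority ranking, the exact-match rule on email, and the sample records are all hypothetical.

```python
from collections import defaultdict

# Hypothetical survivorship rule: a lower number means a more trusted source.
SOURCE_PRIORITY = {"CRM": 0, "ERP": 1, "WEB": 2}

records = [
    {"source": "WEB", "name": "J. Smith", "email": "John.Smith@example.com ", "phone": ""},
    {"source": "CRM", "name": "John Smith", "email": "john.smith@example.com", "phone": ""},
    {"source": "ERP", "name": "John Smith", "email": "john.smith@example.com", "phone": "555-0100"},
    {"source": "CRM", "name": "Ana Lopez", "email": "ana.lopez@example.com", "phone": "555-0199"},
]


def match_key(record):
    """Hypothetical match rule: records with the same normalized email are duplicates."""
    return record["email"].strip().lower()


def consolidate(source_records):
    """Group matching records, then build one golden record per group."""
    groups = defaultdict(list)
    for record in source_records:
        groups[match_key(record)].append(record)

    golden_records = []
    for key, group in groups.items():
        ranked = sorted(group, key=lambda r: SOURCE_PRIORITY[r["source"]])
        golden = {"email": key, "sources": [r["source"] for r in ranked]}
        for field in ("name", "phone"):
            # Survivorship: take the first non-empty value from the most trusted source.
            golden[field] = next((r[field] for r in ranked if r[field]), "")
        golden_records.append(golden)
    return golden_records


golden_records = consolidate(records)
for golden in golden_records:
    print(golden)
```

In this sketch the golden record for John Smith takes its name from the most trusted source and its phone number from the next source that has one, which mirrors the idea of survivorship rules choosing values field by field.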

The consolidation process is a continuous and iterative process in Informatica MDM because new data is constantly being added and existing data may change. Regularly scheduled consolidation activities help ensure that the master data remains accurate and up-to-date, providing a single source of truth for the organization's critical data.

By implementing a robust consolidation process, organizations can reduce data duplication, improve data quality, and enhance their ability to make informed decisions based on accurate and consistent master data.

Tuesday, September 19, 2023

Troubleshooting the "No Supported Authentication Methods Available" Error in SSH

Introduction:

Encountering the "No supported authentication methods available (server sent: publickey)" error when connecting to an EC2 instance via SSH is a common frustration. PuTTY shows this message; OpenSSH clients report the same failure as "Permission denied (publickey)". The error prevents you from accessing your remote server. In this article, we'll explore its causes and provide solutions to resolve it.

Understanding the Error: The "No supported authentication methods available (server sent: publickey)" message means that the server accepts only public key authentication and your SSH client did not offer a key that it accepts. Several factors can contribute to this issue:

Incorrect Login Credentials: Connecting with the wrong username (or the wrong password, where password login is enabled) will result in failed authentication. On EC2, the public key is installed only for the AMI's default user.
Incorrect SSH Key: If you're using SSH keys for authentication, an invalid or incorrect key can prevent successful connection to the remote server.
Server Configuration: If the remote server is not properly configured to allow the chosen authentication method, you won't be able to authenticate.

Solving the Problem: Let's explore steps to resolve the "No supported authentication methods available (server sent: publickey)" error:

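Address SSH Public Key Issues:
- If you still have access to the instance by another route (for example, EC2 Instance Connect or an existing session), edit the /etc/ssh/sshd_config file.
- Set PasswordAuthentication and ChallengeResponseAuthentication to 'yes' (newer OpenSSH releases name the second option KbdInteractiveAuthentication). Enabling password logins weakens security, so treat this as a temporary workaround while you fix the key.
- Restart SSH:
  - Option 1: sudo /etc/init.d/ssh restart
  - Option 2: sudo service sshd restart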
Refer to AWS Documentation:
- AWS provides comprehensive documentation on connecting to EC2 instances using various SSH clients. You can find detailed instructions here: AWS EC2 SSH Documentation.
Verify Correct Logins for Specific AMIs:
- Depending on the Amazon Machine Image (AMI) you're using, the login usernames may vary. Use the following logins based on your AMI:

  - ec2-user for Amazon Linux AMIs.
  - ubuntu for Ubuntu AMIs.
  - ec2-user or root for RHEL AMIs, SUSE AMIs, and others.

Using SSH on Different Operating Systems:
- For Windows:
  - Download the PEM private key when you create the key pair in the AWS console and convert it to a PPK file using PuTTYgen. Then, use PuTTY for SSH, selecting the PPK file under "Connection -> SSH -> Auth" for authentication.
- For Linux:
  - Run the following command: ssh -i your-ssh-key.pem login@IP-or-DNS.
Accessing Your EC2 Instance:
- Open an SSH client (Windows users can use PuTTY, as described above).
- Locate your private key file (e.g., test_key.pem), ensuring it has the appropriate permissions (use chmod 400 if needed).
- Connect to your EC2 instance using its Public DNS, e.g., ssh -i "test_key.pem" ubuntu@xxx-yyy-100-10-100.us-east-2.compute.amazonaws.com.
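The permission requirement on the private key can be checked before connecting, since OpenSSH refuses to use a key file that other users can read. The following is a minimal Python sketch for POSIX systems; the function name key_permissions_ok is made up for this example, and a temporary placeholder file stands in for test_key.pem.

```python
import os
import stat
import tempfile


def key_permissions_ok(path: str) -> bool:
    """Return True if the file grants no permissions to group or others."""
    mode = stat.S_IMODE(os.stat(path).st_mode)
    return (mode & 0o077) == 0


with tempfile.TemporaryDirectory() as tmp:
    key_path = os.path.join(tmp, "test_key.pem")
    with open(key_path, "w") as handle:
        handle.write("placeholder, not a real key\n")

    os.chmod(key_path, 0o644)  # readable by everyone: too open for a private key
    too_open = key_permissions_ok(key_path)

    os.chmod(key_path, 0o400)  # equivalent to: chmod 400 test_key.pem
    locked_down = key_permissions_ok(key_path)

print(too_open, locked_down)
```

If the check returns False for your real key file, running chmod 400 on it as described above should resolve the permissions problem.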

Conclusion: While the "No supported authentication methods available (server sent: publickey)" error can be frustrating, it is often resolvable. By double-checking your login credentials and SSH key, and by trying a different authentication method where appropriate, you can usually overcome this issue. If problems persist, investigate the server's configuration, consult the server logs, or contact the server administrator for assistance.
