Deduplication Rules in Data Platform Normalize

Software evidence (packages such as Add/Remove data, file evidence such as exe files, and operating system data) goes through several deduplication rules during the normalization process in Data Platform. Deduplication eliminates evidence that is considered redundant or inaccurate, or that needs to be consolidated with other evidence to make sense of the software installation.

 

Deduplication Rules

By default, deduplication assigns each piece of evidence a “hidden” flag value. Each value indicates why the evidence was kept or deduplicated:



Hidden value: 0

Hidden label: (none)

Meaning: None of the other scenarios below are found. In other words:

  • No duplicates are identified

  • No duplicate discovery evidence sources are excluded

  • No duplicate products and/or versions are excluded

  • No lower versions of the same product are excluded

  • No duplicate products and/or versions with different editions are excluded

Hidden value: 1

Hidden label: add/remove and exe file resulting in the same product

Meaning: Discovery evidence identifies the same product and/or version more than once. When both Add/Remove and exe file evidence are discovered on the same machine, Add/Remove evidence takes precedence and the exe file is deduplicated.

Hidden value: 2

Hidden label: match result shows duplicated products on the same host

Meaning: Normalize identifies multiple instances of the same product and/or version. This is similar to hidden value 1, except that the pieces of evidence involved are either both Add/Remove or both exe files.

Hidden value: 3

Hidden label: de-duping products by PRI

Meaning: Normalize identifies different versions of the same product. The latest version (that is, the version with the higher version group order) takes precedence while the older versions are deduplicated, except when the ‘coexist version’ flag = 1.

This deduplication logic is intended to remove any traces of evidence from installations of old versions, because they are considered invalid, unsupported, or unsanctioned by the manufacturer.

Hidden value: 5

Hidden label: de-duping edition_rid=-1 in same host,same product

Meaning: Normalize identifies multiple instances of the same product and/or version but for different editions. The edition with the higher edition order takes precedence while editions with a lower edition order are deduplicated, except when the ‘coexist edition’ flag = 1.

This deduplication logic is intended to remove any traces of evidence from installations of other editions, because they are considered invalid, unsupported, or unsanctioned by the manufacturer.
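As a reading aid, the hidden values listed above can be written down as a small lookup. This is only a sketch: the member names and the `is_visible` helper are hypothetical and not part of Data Platform, and the rule that evidence stays in the deduplicated result only when its hidden value is 0 is inferred from this article.

```python
from enum import IntEnum


class Hidden(IntEnum):
    """Hidden values from this article; member names are illustrative only."""
    NONE = 0                 # no deduplication rule applied
    ADD_REMOVE_OVER_EXE = 1  # exe evidence deduplicated in favor of Add/Remove
    SAME_PRODUCT_DUP = 2     # duplicate evidence of the same type on one host
    LOWER_VERSION = 3        # lower version group order
    LOWER_EDITION = 5        # lower edition order


def is_visible(hidden_value: int) -> bool:
    """Assumption: evidence is kept only when its hidden value is 0."""
    return Hidden(hidden_value) is Hidden.NONE
```

For example, `is_visible(0)` is true, while `is_visible(3)` is false because the evidence belongs to a lower version.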

 

Notes on terminology:

The coexist version flag marks products for which customers report that multiple versions coexist on the same machine at the same time, and for which the vendor sanctions or supports that setup. When the Technopedia content team validates such a report, the product carries coexist version flag = ‘yes’, indicating that the deduplication logic in Normalize will not suppress evidence with lower version group orders (that is, the default deduplication behavior is bypassed, and all version groups installed on the same machine are shown).

Unless the coexist version flag for a product is set to 'yes', hidden value '3' applies by default to evidence with lower version orders. Such evidence is assumed to be invalid, unsupported, or unsanctioned by the vendor. Even though the installation might be functional because of workarounds the user has applied, there is no indication that the vendor supports this scenario (hence invalid, unsupported, or unsanctioned). The majority of Technopedia products do not carry this flag.

Version order (or version group order) is a sequential order of all versions (or version groups) under the same product, where the highest order indicates the latest version group (or version) and the lowest order indicates the earliest version group (or version).

The coexist edition (aka “coedition”) flag marks products for which customers report that multiple editions coexist on the same machine at the same time, and for which the vendor sanctions or supports that setup. When the Technopedia content team validates such a report, the product carries coexist edition flag = ‘yes’, indicating that the deduplication logic in Normalize will not suppress evidence with lower edition orders (that is, the default deduplication behavior is bypassed, and all editions installed on the same machine are shown).

Unless the coexist edition flag for a product is set to 'yes', hidden value '5' applies by default to evidence with lower edition orders. Such evidence is assumed to be invalid, unsupported, or unsanctioned by the vendor. Even though the installation might be functional because of workarounds the user has applied, there is no indication that the vendor supports this scenario (hence invalid, unsupported, or unsanctioned). The majority of Technopedia products do not carry this flag.

Edition order ranks all editions under the same product by importance or significance, where the highest order indicates the most significant edition (for example, the most expensive or the most feature-packed) and the lowest order indicates the least significant edition.

Curation of the coexist version and coexist edition flags in Technopedia is triggered by customers who report this behavior as being supported. Flexera performs due diligence to validate these reports (for example, by reproducing the installations of the application according to the vendor's supported methods).

Examples

Scenario 1:

The following evidence is mapped to Microsoft Word 2021 in Normalize:

  • Add/Remove: Microsoft Word 2021

  • Exe: winword.exe 2021

Normalize deduplicates the second piece of evidence (that is, winword.exe is flagged with hidden = 1), and the final normalization result is represented by the first piece of evidence (that is, the Add/Remove evidence is flagged with hidden = 0).

 

Scenario 2:

The following evidence is mapped to Microsoft Word 2021 in Normalize:

  • Add/Remove: Microsoft Word 2021

  • Add/Remove: MS Word 2021

Normalize deduplicates one piece of evidence (hidden flag = 2) while keeping the other (hidden flag = 0).

 

Scenario 3:

The following evidence is mapped to Microsoft Word 2021 in Normalize:

  • Exe: winword.exe 2021

  • Exe: word.exe 2021

Normalize deduplicates one piece of evidence (hidden flag = 2) while keeping the other (hidden flag = 0).
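Scenarios 1 to 3 can be sketched in a few lines of Python. This is an illustration under stated assumptions, not how Normalize is implemented: the function name and the `"add_remove"` / `"exe"` source labels are hypothetical, and because the article does not say which of two same-type duplicates is kept, the sketch simply keeps the first one.

```python
def dedupe_same_product(evidence):
    """Assign hidden values to evidence mapped to one product/version on one host.

    evidence: list of (source, name) tuples, where source is "add_remove" or "exe"
    (hypothetical labels). Returns (source, name, hidden) tuples in input order.
    """
    if not evidence:
        return []
    sources = [source for source, _ in evidence]
    # Add/Remove takes precedence; otherwise keep the first entry (assumption).
    keep_index = sources.index("add_remove") if "add_remove" in sources else 0
    add_remove_kept = sources[keep_index] == "add_remove"
    result = []
    for i, (source, name) in enumerate(evidence):
        if i == keep_index:
            hidden = 0   # the evidence that represents the product
        elif source == "exe" and add_remove_kept:
            hidden = 1   # exe file deduplicated in favor of Add/Remove
        else:
            hidden = 2   # duplicate of the same evidence type
        result.append((source, name, hidden))
    return result
```

Calling it with the Scenario 1 evidence, `[("add_remove", "Microsoft Word 2021"), ("exe", "winword.exe 2021")]`, leaves the Add/Remove entry at hidden = 0 and flags the exe entry with hidden = 1.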

 

Scenario 4:

The following evidence is identified on the same machine:

  • Add/Remove: Microsoft Internet Explorer 11

  • Add/Remove: Microsoft Internet Explorer 10

Normalize deduplicates the second evidence due to its lower version group order (hidden flag = 3) while keeping the first evidence with the higher version group order (hidden flag = 0). Note that Microsoft Internet Explorer doesn’t carry the ‘coexist version’ flag in Technopedia.

 

Scenario 5:

The following evidence is identified in the same machine:

  • Add/Remove: Microsoft Money 2008 Premium Edition

  • Add/Remove: Microsoft Money 2008 Essentials Edition

Normalize deduplicates the second evidence due to its lower edition order (hidden flag = 5) while keeping the first evidence with the higher edition order (hidden flag = 0). Note that Microsoft Money doesn’t carry the ‘coexist edition’ flag in Technopedia.

 

Scenario 6:

The following evidence is identified in the same machine:

  • Add/Remove: Microsoft .NET Framework 4.5

  • Add/Remove: Microsoft .NET Framework 4.0

Normalize will not deduplicate either piece of evidence (hidden flag = 0 for both). This is because Microsoft .NET Framework carries the ‘coexist version’ flag in Technopedia.

 

Scenario 7:

The following evidence is identified in the same machine:

  • Add/Remove: Microsoft Visual Studio 2022 Enterprise Edition

  • Add/Remove: Microsoft Visual Studio 2022 Community Edition

Normalize will not deduplicate either piece of evidence (hidden flag = 0 for both). This is because Microsoft Visual Studio carries the ‘coexist edition’ flag in Technopedia.

 

Scenario 8:

The following evidence is identified in the same machine:

  • Add/Remove: Microsoft Office 2019 Personal Edition

  • Add/Remove: Microsoft Office 2016 Professional Edition

Normalize will not deduplicate either evidence (hidden flag = 0 for both) even though the second evidence has the lower version group order while the first evidence has the lower edition order. This is because Microsoft Office carries both the ‘coexist version’ flag and the ‘coexist edition’ flag in Technopedia.
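Scenarios 4 to 8 can be approximated with the sketch below. The `Evidence` class, the function name, and the numeric order values are hypothetical. The sketch also assumes that the version rule runs before the edition rule and that editions are only compared within the same version group; the article does not state either point.

```python
from dataclasses import dataclass


@dataclass
class Evidence:
    name: str
    version_order: int   # higher = later version group
    edition_order: int   # higher = more significant edition
    hidden: int = 0


def dedupe_versions_and_editions(items, coexist_version=False, coexist_edition=False):
    """Flag lower versions (3) and lower editions (5) unless a coexist flag is set."""
    if not items:
        return items
    if not coexist_version:
        top_version = max(e.version_order for e in items)
        for e in items:
            if e.version_order < top_version:
                e.hidden = 3
    if not coexist_edition:
        # Highest edition order among still-visible evidence, per version group.
        best = {}
        for e in items:
            if e.hidden == 0:
                best[e.version_order] = max(best.get(e.version_order, e.edition_order),
                                            e.edition_order)
        for e in items:
            if e.hidden == 0 and e.edition_order < best[e.version_order]:
                e.hidden = 5
    return items
```

With Scenario 4 modeled as `Evidence("Internet Explorer 11", 2, 1)` and `Evidence("Internet Explorer 10", 1, 1)`, the second entry ends up with hidden = 3. Passing `coexist_version=True`, as in Scenario 6, leaves both entries at hidden = 0.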

 

In Data Platform, when the user wants to view the normalization result in the BDNA Publish database, the following table contains the expected Normalize result with all deduplication rules applied: MATCH_HOST_SW_PROD_<inventory_id>. By default, this table is consistent with what the user sees in the Admin Console as well as in the ServiceNow integration.

If the user wants to view the pre-deduplication Normalize result (that is, no deduplication rules are applied and all evidence is shown), the following table should be used: MATCH_HOST_SW_PROD_ALL_<inventory_id>. Using this table for final consumption of the Normalize result is recommended only for specific purposes (for example, troubleshooting mappings, creating private catalog entries, or other use cases requested by the customer).
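The two table names differ only in the _ALL infix, so a script that reads the Publish database could build them as sketched below. The helper name is hypothetical, and treating the inventory ID as a plain value appended to the name is an assumption based on the <inventory_id> placeholder.

```python
def publish_table(inventory_id, deduplicated=True):
    """Return the Publish table name for an inventory (hypothetical helper).

    deduplicated=True  -> table with all deduplication rules applied
    deduplicated=False -> pre-deduplication table with all evidence
    """
    infix = "" if deduplicated else "_ALL"
    return f"MATCH_HOST_SW_PROD{infix}_{inventory_id}"
```

For an inventory ID of 12, this yields MATCH_HOST_SW_PROD_12 for the deduplicated result and MATCH_HOST_SW_PROD_ALL_12 for the full evidence set.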

Last update: May 20, 2022 10:27 AM