Scanning Guidelines and Best Practices (v6)

FlexNet Code Insight v6

Overview

Many variables can affect the success of a source code scan in FlexNet Code Insight v6, including hardware and software limitations, the size and type of your codebase, and your scan settings. This document outlines best practices and recommendations that increase the likelihood of a scan running successfully to completion.

Workspace Maximum Recommended Limits

Breaking up your codebase into workspaces allows you to scan smaller logical groups with independent scan settings, while providing a solution for creating a single analysis across all files scanned within the project. Please keep the following limits in mind when creating workspaces.

Source Code:
- Recommended limit:
  - 3 million SLOC and 150 MB of source files
- With SCF scanning enabled:
  - 2 to 3 million source lines of code (SLOC)
  - 150 MB of source
  - If you have more than 150 MB of source or more than 2 to 3 million SLOC, create another workspace.
- With SCF scanning disabled:
  - No source limitations

Total Files per Workspace:
- Recommended limit:
  - 100,000 files per workspace
- If you have more than 80,000 to 100,000 files, we recommend creating another workspace. We have seen up to 150,000 files scan successfully, but in practice try to keep the file count of each workspace below 100,000.

Jar Files:
- Recommended limit:
  - 1,000 jars per workspace
- If you have more than 1,000 jars, we recommend creating another workspace. We have seen up to 2,500 jars scan successfully, but we do not recommend it. Other factors, such as jars bundled inside other jars, can also affect the result. You can increase the number of jars per workspace by disabling namespace matching.

All limits should count both the files on disk and, if scanning of files within archives is enabled, the files contained in archives.
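The limits above can be checked with a short script before you create a workspace. The following is a minimal sketch, not a Flexera tool: the function names, the list of source extensions, and the approximation of SLOC as non-blank lines are all assumptions made for illustration. It also counts only files on disk and does not look inside archives.

```python
import os
from dataclasses import dataclass

# Limits taken from the recommendations in this document.
MAX_FILES = 100_000
MAX_JARS = 1_000
MAX_SOURCE_BYTES = 150 * 1024 * 1024
MAX_SLOC = 3_000_000

# Assumption: which extensions count as "source" is a guess; adjust for your codebase.
SOURCE_EXTENSIONS = {".c", ".h", ".cpp", ".java", ".js", ".py", ".go", ".rb", ".cs"}


@dataclass
class Metrics:
    files: int = 0
    jars: int = 0
    source_bytes: int = 0
    sloc: int = 0


def collect_metrics(root):
    """Walk root and gather rough metrics for files on disk (archives are not opened)."""
    m = Metrics()
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            m.files += 1
            ext = os.path.splitext(name)[1].lower()
            if ext == ".jar":
                m.jars += 1
            elif ext in SOURCE_EXTENSIONS:
                try:
                    with open(os.path.join(dirpath, name), "rb") as fh:
                        data = fh.read()
                except OSError:
                    continue  # unreadable file: counted as a file, not as source
                m.source_bytes += len(data)
                # Approximation: SLOC is taken to be the number of non-blank lines.
                m.sloc += sum(1 for line in data.splitlines() if line.strip())
    return m


def exceeded_limits(m):
    """Return the names of the metrics that exceed the recommended limits."""
    checks = [
        ("files", m.files, MAX_FILES),
        ("jars", m.jars, MAX_JARS),
        ("source_bytes", m.source_bytes, MAX_SOURCE_BYTES),
        ("sloc", m.sloc, MAX_SLOC),
    ]
    return [name for name, value, limit in checks if value > limit]
```

Calling `exceeded_limits(collect_metrics("/path/to/codebase"))` returns an empty list when the directory fits within every limit; any name in the list suggests splitting the codebase into another workspace.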

Scanning Options

The primary features that affect the time it takes to complete a scan are found on the Detection tab in Workspace Settings (see the Configuring Workspaces tab in the Audit and Analysis Guide for details).

Figure 1: Detection Tab in Workspace Settings

The following two scan options can significantly affect scan times:

- Source Code Fingerprint Scanning (SCF)
- Archive Scanning options

Source Code Fingerprint Scanning (SCF)

SCF scanning allows you to detect and review snippet matches within source files. Use this option when you want to detect modifications of open source software (OSS) source code, or when developers incorporate fragments of OSS source code into their own code. Because SCF scanning is rigorous and time-consuming, we recommend turning it off if you neither modify OSS nor use OSS code fragments.

If you do require SCF scanning, we recommend that you configure the Min Match setting on the Source Code Options tab. Min Match sets the minimum number of snippets that must match between a codebase file and a data library file before the result is considered a match. The default value is 3: three snippets must match between the source file and the library file for the file to be considered a source match and to appear in the results. Our in-house testing shows that you can typically increase Min Match to 10 to improve scan times without adversely affecting the scan results. With a Min Match value of 10, only files with at least 10 snippet matches between the scanned codebase and a data library file appear in the scan results. Files with fewer matches do not.
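The effect of raising the threshold can be illustrated with a small sketch. It models only the reporting rule described above (a file appears when its snippet match count meets Min Match), not how Code Insight computes matches. The function name, file names, and match counts are hypothetical.

```python
def files_reported(match_counts, min_match=3):
    """Return the files whose snippet match count meets the Min Match threshold."""
    return sorted(path for path, count in match_counts.items() if count >= min_match)


# Hypothetical snippet match counts per scanned file.
counts = {"src/util.c": 2, "src/parser.c": 4, "src/zlib_copy.c": 37}

print(files_reported(counts))                # default threshold of 3
print(files_reported(counts, min_match=10))  # raised threshold of 10
```

With these sample counts, the default threshold reports two files, while a threshold of 10 reports only the file with 37 matches.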

Figure 2: Source Code Options in Workspace Settings

Archive Scanning

The default setting is to not scan files in archives. We generally recommend that you leave this option disabled, because it can quickly expand the scan set beyond the recommended maximums and cause scan failures. Many archives contain only test files or sample files, not third-party components, and consequently do not require scanning. We recommend that, prior to scanning, you expand only the archive types that may contain third-party components.

If you prefer not to scan inside archives but would still like to see the contents of archive files in your codebase tree, set the displayContentsOfUnscannedArchives=true property in <FNCI_INSTALLATION_DIR>/config/core.properties and restart the server. Although this does not save as much scan time as keeping the archives setting off completely, it still reduces scan time compared with scanning inside archives.
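The property can be added by hand in any text editor. For scripted setups, the sketch below shows one way to set a key in properties-style text. The helper name is hypothetical, and the sketch assumes simple key=value lines without line continuations.

```python
def set_property(text, key, value):
    """Return properties text with key set to value, replacing an existing entry or appending one."""
    lines = text.splitlines()
    new_line = f"{key}={value}"
    for i, line in enumerate(lines):
        stripped = line.strip()
        # Skip comments and lines that are not key=value pairs.
        if stripped.startswith("#") or "=" not in stripped:
            continue
        if stripped.split("=", 1)[0].strip() == key:
            lines[i] = new_line
            break
    else:
        # Key not present: append it at the end.
        lines.append(new_line)
    return "\n".join(lines) + "\n"
```

Read the contents of core.properties, pass them through `set_property(contents, "displayContentsOfUnscannedArchives", "true")`, write the result back, and restart the server.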

Namespace Matching

For workspaces that contain many jar files, scanning with Namespace Matching turned on may significantly increase scan time. For this reason, we recommend keeping Namespace Matching turned off (it is off by default) and using other techniques, such as Analyzer Group Builder, to identify jar packages.

Other Considerations

Code Metrics

Flexera has a code estimator tool you can use to determine code metrics, including: estimated SLOC, megabytes of source, number of jar files, and total number of files. Please contact your Flexera Account Representative to obtain a copy of the estimator tool.

Logical Workspace Groupings

WinDirStat on Windows, and KDirStat or QDirStat on Linux, are disk usage statistics viewers that show both the directory tree and a visual representation of your codebase. We recommend using these utilities to help you determine the best approach to breaking your code into smaller logical groups. For example, separating node modules, gems, or jars into their own groups is one effective way to divide your code for scanning by workspace, and tools like WinDirStat make identifying these groups much easier.
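Where a graphical viewer is not available, a rough text summary can serve the same purpose. The sketch below is an illustration only: the grouping rule and the set of package directory names (including "vendor") are assumptions, not part of Code Insight. It counts files and bytes per group, where a group is "jars", a known package directory found on the path, or otherwise the top-level directory.

```python
import os
from collections import defaultdict

# Assumption: directory names that often hold third-party packages.
PACKAGE_DIR_NAMES = {"node_modules", "gems", "vendor"}


def summarize_groups(root):
    """Return {group: (file_count, total_bytes)} for the files under root."""
    totals = defaultdict(lambda: [0, 0])  # group -> [file count, bytes]
    for dirpath, _dirnames, filenames in os.walk(root):
        rel = os.path.relpath(dirpath, root)
        parts = [] if rel == "." else rel.split(os.sep)
        # First path component that matches a known package directory, if any.
        package = next((p for p in parts if p in PACKAGE_DIR_NAMES), None)
        for name in filenames:
            if name.lower().endswith(".jar"):
                group = "jars"
            elif package:
                group = package
            else:
                group = parts[0] if parts else "(root)"
            totals[group][0] += 1
            try:
                totals[group][1] += os.path.getsize(os.path.join(dirpath, name))
            except OSError:
                pass  # e.g. a broken symlink: count the file but not its size
    return {group: tuple(v) for group, v in totals.items()}
```

Sorting the result by file count shows which groups are large enough to justify a workspace of their own.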