
1.5 Vulnerability Repository

A Vulnerability Repository is a collection of Vulnerability Sources (VulSource).
It periodically scans public vulnerability databases and builds the Supported Open-Source Projects from source. Records from the public vulnerability databases are first normalized into a baseline Vulnerability structure:

| Field | Type | Description | Example |
|---|---|---|---|
| ID | string | CNVD/CNNVD/CVE ID | CNVD-2025-03269 |
| Title | string | Short description of the vulnerability | SAP NetWeaver Application Server Java 跨站脚本漏洞 (cross-site scripting vulnerability) |
| Products | []string | List of products affected by the vulnerability | SAP SAP NetWeaver Application Server Java null |
| RiskLevel | int | Risk level (-1: Unknown, 0: Critical, 1: High, 2: Medium, 3: Low) | 2 |
| CVEID | string | Related CVE ID (if applicable) | CVE-2023-12345 |
| SubmittedDate | string | Date of submission (optional) | 2025-02-19 |
| PublishedDate | string | Date of publication | 2025-02-20 |
| UpdatedDate | string | Date of update (optional) | 2025-02-21 |
| Description | string | Description of the vulnerability (HTML escaped) | SAP NetWeaver Application Server Java... |
| Type | string | Type or category of the vulnerability | 通用型漏洞 (general vulnerability) |
| Patch | string | Patch information (if applicable) | 目前厂商已发布升级补丁以修复漏洞... (the vendor has released an upgrade patch to fix the vulnerability...) |
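
As an illustration of the table above, the baseline structure could be modelled as follows. This is a minimal sketch, not the repository's actual type: the class and attribute names are assumptions, and only the field list and the RiskLevel codes come from the table.

```python
from dataclasses import dataclass, field
from enum import IntEnum
from typing import List, Optional


class RiskLevel(IntEnum):
    """Integer codes taken from the RiskLevel row of the table."""
    UNKNOWN = -1
    CRITICAL = 0
    HIGH = 1
    MEDIUM = 2
    LOW = 3


@dataclass
class Vulnerability:
    """Hypothetical Python mirror of the baseline Vulnerability structure."""
    id: str                                   # CNVD/CNNVD/CVE ID
    title: str
    published_date: str                       # e.g. "2025-02-20"
    products: List[str] = field(default_factory=list)
    risk_level: RiskLevel = RiskLevel.UNKNOWN
    cve_id: Optional[str] = None              # related CVE ID, if any
    submitted_date: Optional[str] = None      # optional
    updated_date: Optional[str] = None        # optional
    description: str = ""                     # HTML escaped
    type: str = ""
    patch: Optional[str] = None


# Built from the Example column of the table.
example = Vulnerability(
    id="CNVD-2025-03269",
    title="SAP NetWeaver Application Server Java 跨站脚本漏洞",
    published_date="2025-02-20",
    products=["SAP SAP NetWeaver Application Server Java null"],
    risk_level=RiskLevel(2),
    cve_id="CVE-2023-12345",
)
```

Using an enumeration for RiskLevel keeps the unusual ordering (0 is the most severe) explicit at every call site.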

1.5.1 National Vulnerability Database (NVD)

CVE (Common Vulnerabilities and Exposures) records are hosted by NIST in the National Vulnerability Database (see NVD - Vulnerabilities).
The records are also available in the GitHub repo CVEProject/cvelistV5, a cache of the official CVE List in CVE JSON 5 format. The repo contains CVEs from the start of the project (1999) and is updated regularly.

Each CVE consists of its CVE ID, publication date, description, associated reference links, vulnerable product configuration, CWE weakness categorization, Common Vulnerability Scoring System (CVSS) score, and other metadata.

1.5.1.1 Archival Data

cvelistV5 is cloned and processed to generate VulSource.
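
The processing step can be sketched as below. This is an illustration only: the function names are hypothetical, the key paths (`cveMetadata`, `containers.cna`) reflect our reading of the CVE JSON 5 record format, and the `cves/` directory layout of the clone is an assumption about the repository.

```python
import json
from pathlib import Path
from typing import Any, Dict, Iterator


def parse_cve_record(record: Dict[str, Any]) -> Dict[str, Any]:
    """Map one CVE JSON 5 record onto baseline Vulnerability fields."""
    meta = record.get("cveMetadata", {})
    cna = record.get("containers", {}).get("cna", {})
    descriptions = cna.get("descriptions", [])
    # Prefer an English description, otherwise fall back to the first one.
    english = [d for d in descriptions if d.get("lang", "").lower().startswith("en")]
    chosen = (english or descriptions or [{}])[0]
    # Combine vendor and product into one string per affected product.
    products = [
        f"{a.get('vendor', '')} {a.get('product', '')}".strip()
        for a in cna.get("affected", [])
    ]
    return {
        "ID": meta.get("cveId", ""),
        "CVEID": meta.get("cveId", ""),
        "Title": cna.get("title", ""),
        "Products": products,
        "PublishedDate": meta.get("datePublished", ""),
        "UpdatedDate": meta.get("dateUpdated", ""),
        "Description": chosen.get("value", ""),
    }


def iter_cve_records(clone_dir: str) -> Iterator[Dict[str, Any]]:
    """Yield parsed records for every CVE-*.json file under <clone_dir>/cves."""
    for path in sorted(Path(clone_dir, "cves").rglob("CVE-*.json")):
        with path.open(encoding="utf-8") as fh:
            yield parse_cve_record(json.load(fh))
```

Missing keys fall back to empty values, so records without a description or an affected-product list do not abort the run.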

1.5.1.2 Live Data

Vulnerability Repository synchronizes its cvelistV5 clone with upstream to get the latest CVE listing.
Then the updated CVEs are processed to generate VulSource.

This task is scheduled to run once a week.

1.5.2 China National Vulnerability Repository of Information Security (CNNVD)

info

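The CNNVD Technical Support Unit Program is aimed mainly at information security vendors, software and hardware vendors, and internet companies. It works with these organizations through signed cooperation agreements, on an equal and voluntary basis.

By pooling industry resources and working with technical support units, the program improves the ability to discover, analyze, and handle major vulnerabilities and significant security incidents. It also supports vulnerability research and incident interpretation, and establishes a sound mechanism for collecting, analyzing, handling, and disclosing vulnerabilities and incidents, thereby raising China's capability to research and report them.

Operated and managed by the China Information Technology Security Evaluation Center.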

There is no consistent way to associate open-source projects directly with entries in the CNNVD listing. Because most CNNVD entries reference a CVE, no project filter is applied during scraping; instead, the CNNVD information is used to enrich the corresponding CVE vulnerability.

1.5.2.1 Archival Data

Archival data is available as daily, monthly, or yearly zip files from 国家信息安全漏洞库 (the CNNVD website).
Downloading it requires login.

Findings

1.5.2.2 Live Data

An automated browser is used to visit the public listing at https://www.cnnvd.org.cn/home/loophole and fetch CNNVD vulnerabilities newer than a "CNNVD last fetch date" stored in Vulnerability Repository's database.

This task is scheduled to run once a week.
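
The "last fetch date" logic can be sketched as follows. This is an illustration under stated assumptions: `fetch_page` is a hypothetical callable that returns one page of listing entries, each entry is assumed to carry a `published` date in `YYYY-MM-DD` form, and the listing is assumed to be sorted newest first.

```python
from datetime import date
from typing import Callable, Dict, Iterator, List

Entry = Dict[str, str]  # assumed shape: {"id": ..., "published": "YYYY-MM-DD"}


def fetch_new_entries(
    fetch_page: Callable[[int], List[Entry]],
    last_fetch: date,
    max_pages: int = 1000,
) -> Iterator[Entry]:
    """Yield listing entries published after last_fetch, walking pages in order."""
    for page_no in range(1, max_pages + 1):
        entries = fetch_page(page_no)
        if not entries:
            return  # ran past the last page
        for entry in entries:
            if date.fromisoformat(entry["published"]) <= last_fetch:
                # Newest-first ordering: everything from here on is already known.
                return
            yield entry
```

Stopping at the first known entry keeps a weekly run to a few pages. With a date-only comparison, entries published later on the same day as the last fetch would be skipped, so a real implementation may want to store a timestamp or re-fetch the boundary day.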

1.5.3 China National Vulnerability Database (CNVD)

info

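The China National Vulnerability Database (CNVD, 国家信息安全漏洞共享平台) is a national network security vulnerability database established by the National Computer Network Emergency Response Technical Team/Coordination Center of China (CNCERT) together with major domestic information system operators, basic telecom operators, network security vendors, software vendors, and internet companies.

The main goal of CNVD is to build, together with government departments, users of important information systems, operators, major security vendors, software vendors, research institutions, and public internet users, a unified system for collecting and verifying software vulnerabilities, issuing early warnings, and handling emergencies. This is intended to raise China's overall research level and timely prevention capability for security vulnerabilities, improve the security of its information systems and domestic software, and drive the development of related domestic security products.

Operated and managed by CNCERT.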

There is no consistent way to associate open-source projects directly with entries in the CNVD listing. Because most CNVD entries reference a CVE, we do not apply a project filter during scraping; instead, we use the CNVD information to enrich the corresponding CVE vulnerability.

1.5.3.1 Archival Data

Archival data is available as weekly XML files from 国家信息安全漏洞共享平台 (the CNVD website). Downloading it requires login and solving a captcha.

Findings

1.5.3.2 Live Data

A customized automated browser (see 1.5.3.3 Anti-bot measures) is used to visit the public listing at https://www.cnvd.org.cn/flaw/list and fetch CNVD vulnerabilities newer than a "CNVD last fetch date" stored in Vulnerability Repository's database.

This task is scheduled to run once a week.

1.5.3.3 Anti-bot measures

The CNVD website employs anti-bot protections, including a script that checks for the presence of the webdriver property in the window object. To enable automated browsing, we customize the browser to remove that property.
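
One common way to hide this property is to inject a script that runs before any page script. The sketch below assumes Playwright for Python as the automation library, which the source does not name, and it assumes the check reads the standard `navigator.webdriver` flag. The function name is hypothetical, and the site may apply other checks that this does not address.

```python
# JavaScript evaluated before any page script: navigator.webdriver reads as undefined.
HIDE_WEBDRIVER_JS = (
    "Object.defineProperty(navigator, 'webdriver', { get: () => undefined });"
)


def fetch_listing_html(url: str) -> str:
    """Load `url` in headless Chromium with the webdriver flag hidden."""
    # Imported lazily so this module still loads where Playwright is not installed.
    from playwright.sync_api import sync_playwright

    with sync_playwright() as p:
        browser = p.chromium.launch(headless=True)
        context = browser.new_context()
        # The init script is applied to every page opened in this context.
        context.add_init_script(HIDE_WEBDRIVER_JS)
        page = context.new_page()
        page.goto(url)
        html = page.content()
        browser.close()
        return html
```

Registering the script on the context rather than on a single page means it also covers pages opened later during pagination.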

1.5.4 VulSource

For each vulnerability in the CVE list, a VulSource for a Supported Open-Source Project consists of:

  • the corresponding pseudo assembly code before and after the vulnerability fix
  • syntactic features for the code blocks related to the vulnerability
  • metadata for reporting:
    • path of the source code in the project tree
    • mapping of source code to pseudo assembly code (in line numbers)
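
To make the line-number mapping concrete, one code block of a VulSource could be represented as below. This is a hypothetical layout, not the repository's schema: all names are assumptions, and the pseudo assembly lines in the usage example are placeholders because the source does not show the pseudo assembly syntax.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class CodeBlockSource:
    """Hypothetical per-code-block entry of a VulSource."""
    source_path: str                        # path of the source file in the project tree
    pseudo_asm_before: List[str]            # pseudo assembly lines before the fix
    pseudo_asm_after: Optional[List[str]]   # None when no fix is available
    features: Dict[str, object] = field(default_factory=dict)
    # source line number -> pseudo assembly line numbers (both 1-based)
    line_map: Dict[int, List[int]] = field(default_factory=dict)

    def asm_lines_for(self, source_line: int) -> List[str]:
        """Return the before-fix pseudo assembly lines generated from a source line."""
        return [self.pseudo_asm_before[i - 1] for i in self.line_map.get(source_line, [])]


block = CodeBlockSource(
    source_path="kernel/bpf/verifier.c",
    pseudo_asm_before=["<asm 1>", "<asm 2>", "<asm 3>"],  # placeholder lines
    pseudo_asm_after=None,
    line_map={120: [2, 3]},                               # placeholder mapping
)
print(block.asm_lines_for(120))
```

A report can then show the source path and line alongside the pseudo assembly lines that matched.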

VulSources are deployed manually to SourceGuard, either from an offline archive or by download from a remote Vulnerability Repository. A VulSource can be customized according to the customer's license agreement or use case.

SourceGuard's /vulsource/ API queries the local VulSource snapshot.

1.5.4.1 Vulnerability Source Preparation

  1. Generate baseline Vulnerability records for NVD.
  2. Use the Products field to select the vulnerabilities that belong to a Supported Open-Source Project.
  3. We developed a Vulnerability Enrichment Tool that enriches a vulnerability with data gathered from GitHub through both the GitHub API and web scraping.
  4. For each Supported Open-Source Project, we:
    1. enrich the vulnerabilities with associated reference links, vulnerable product configurations, affected versions, commit IDs, and version-to-date mappings using the Vulnerability Enrichment Tool
    2. enrich the vulnerabilities with related vulnerabilities in CNNVD and CNVD
    3. run Vulnerability Feature Extraction (see below) for the vulnerabilities
    4. attach to each vulnerability the pseudo assembly code diff and source code diff of the code blocks related to its fix
  5. Expose the database and Vulnerability Features through a FastAPI-based web service.
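
Steps 2 and 4.2 can be sketched as follows. This is a simplified illustration with hypothetical function names: records are plain dictionaries keyed by the baseline field names, product matching is a naive substring test, and copying Title into `note` is an assumption based on the sample record rather than a documented rule.

```python
from typing import Dict, Iterable, List


def filter_by_products(vulns: Iterable[dict], keywords: Iterable[str]) -> List[dict]:
    """Keep vulnerabilities whose Products field mentions a supported project."""
    wanted = [k.lower() for k in keywords]
    return [
        v for v in vulns
        if any(k in p.lower() for p in v.get("Products", []) for k in wanted)
    ]


def attach_related(cve_vulns: List[dict], related: Iterable[dict], repo_label: str) -> None:
    """Append CNNVD/CNVD entries to the CVE record they reference via CVEID."""
    by_cve: Dict[str, dict] = {v["ID"]: v for v in cve_vulns}
    for entry in related:
        target = by_cve.get(entry.get("CVEID") or "")
        if target is None:
            continue  # no CVE reference, or the CVE is outside the supported set
        target.setdefault("repositories", []).append({
            "repo_label": repo_label,
            "id": entry["ID"],
            "note": entry.get("Title", ""),
            "description": entry.get("Description", ""),
        })


nvd = [
    {"ID": "CVE-2022-23222", "Products": ["Linux Linux kernel"]},
    {"ID": "CVE-0000-0000", "Products": ["Example Vendor Example Product"]},  # placeholder
]
supported = filter_by_products(nvd, ["linux kernel"])
attach_related(
    supported,
    [{"ID": "CNNVD-202201-1165", "CVEID": "CVE-2022-23222", "Title": "Linux kernel 代码问题漏洞"}],
    "CNNVD",
)
```

Joining on the CVE ID is what lets CNNVD and CNVD be scraped without a project filter: entries that reference no supported CVE are simply dropped at this stage.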

Sample of an enriched vulnerability for Linux kernel:

{
  "id": "CVE-2022-23222",
  "note": "NULL Pointer Dereference",
  "poc_collected": true,
  "references": {
    "Ubuntu": "https://ubuntu.com/security/CVE-2022-23222",
    "Debian": "https://security-tracker.debian.org/tracker/CVE-2022-23222",
    "SUSE": "https://www.suse.com/security/cve/CVE-2022-23222",
    "Red Hat": "",
    "ExploitDB": "https://www.exploit-db.com/search?cve=2022-23222",
    "NVD": "https://nvd.nist.gov/vuln/detail/CVE-2022-23222"
  },
  "format": "sourceguard-collected",
  "description": "kernel/bpf/verifier.c in the Linux kernel through 5.15.14 allows local users to gain privileges because of the availability of pointer arithmetic via certain *_OR_NULL pointer types.",
  "poc_location": "https://github.com/JlSakuya/Linux-Privilege-Escalation-Exploits/tree/main/2022/CVE-2022-23222",
  "fix_info": {
    "breaks": "",
    "guidance": "v5.17",
    "message": "",
    "fixes": "c25b2ae136039ffa820c26138ed4a5e5f3ab3841"
  },
  "repositories": [
    {
      "repo_label": "CNNVD",
      "note": "Linux kernel 代码问题漏洞",
      "description": "Linux kernel是美国Linux基金会的开源操作系统Linux所使用的内核。 Linux kernel 5.15.14及之前版本存在代码问题漏洞,攻击者可利用该漏洞获得特权。",
      "id": "CNNVD-202201-1165"
    },
    {
      "repo_label": "CNVD",
      "note": "Linux kernel代码问题漏洞(CNVD-2022-06892)",
      "description": "Linux kernel是美国Linux基金会的开源操作系统Linux所使用的内核。\nLinux kernel 5.15.14及之前版本存在安全漏洞,攻击者可利用该漏洞获得特权。",
      "id": "CNVD-2022-06892"
    },
    {
      "repo_label": "NVD",
      "note": "NULL Pointer Dereference",
      "description": "kernel/bpf/verifier.c in the Linux kernel through 5.15.14 allows local users to gain privileges because of the availability of pointer arithmetic via certain *_OR_NULL pointer types.",
      "id": "CVE-2022-23222"
    }
  ],
  "affected_versions": {
    "date_mappings": {
      "v5.17-rc1": "2022-01-22T16:00:00Z",
      "v2.6.12-rc2": "2005-04-15T16:00:00Z"
    },
    "last_affected": "[UNKNOWN]",
    "from": "v2.6.12-rc2",
    "to": "v5.17-rc1"
  },
  "cvss": {
    "risk_level": 1,
    "score": 7.800000190734863,
    "v2": 7.199999809265137,
    "v3": 7.800000190734863
  }
}
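
In the sample, `cvss.risk_level` is 1 (High) for a score of about 7.8, which matches the CVSS v3 qualitative severity bands. The helper below shows one way such a mapping could be written. It is an assumption that the repository derives `risk_level` this way, and the function name is hypothetical.

```python
from typing import Optional


def risk_level_from_cvss(score: Optional[float]) -> int:
    """Map a CVSS v3 base score to the RiskLevel codes of the baseline structure.

    Bands follow the CVSS v3 qualitative scale: 9.0-10.0 Critical,
    7.0-8.9 High, 4.0-6.9 Medium, 0.1-3.9 Low. A missing or zero
    score is treated as Unknown (-1) here, which is an assumption.
    """
    if score is None or score <= 0:
        return -1  # Unknown
    if score >= 9.0:
        return 0   # Critical
    if score >= 7.0:
        return 1   # High
    if score >= 4.0:
        return 2   # Medium
    return 3       # Low


# The sample stores scores with single-precision noise; round for display.
sample_score = 7.800000190734863
print(round(sample_score, 1), risk_level_from_cvss(sample_score))
```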

Pseudo assembly code diff for a kernel vulnerability:
[Figure: Pseudo assembly code diff]

Source code diff for a kernel vulnerability:
[Figure: Source code diff]

1.5.4.2 Vulnerability Feature Extraction

To enhance vulnerability identification, we further process Supported Open-Source Projects listed in CVE records to extract detailed vulnerability features. For each specific vulnerability, the following steps are performed:

1. Source Code Retrieval

2. Binary Compilation

  • Build the project's binaries before and after the vulnerability fix, using the commit IDs provided in the patch information.

  • If no fix is available:

    • Use the latest commit as the before-fix version.
    • Leave the after-fix version empty.
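
The revision selection described above can be sketched as a small helper. It assumes that the before-fix build uses the first parent of the fix commit (written `<commit>^` in git revision syntax); the source only says that commit IDs from the patch information are used, so this choice and the function name are assumptions.

```python
from typing import Optional, Tuple


def select_build_revisions(
    fix_commit: Optional[str], latest_commit: str
) -> Tuple[str, Optional[str]]:
    """Return (before_fix, after_fix) git revisions to build.

    With a known fix commit, build its first parent and the fix itself.
    Without a fix, build the latest commit and leave after_fix empty.
    """
    if fix_commit:
        return f"{fix_commit}^", fix_commit
    return latest_commit, None


# Fix commit taken from the fix_info.fixes field of the sample record.
before, after = select_build_revisions("c25b2ae136039ffa820c26138ed4a5e5f3ab3841", "HEAD")
```

Each returned revision can be passed to `git checkout` before building. For a fix that spans several commits, the before-fix revision would need to be the parent of the earliest one.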

3. Disassembly

  • Disassemble the binaries into architecture-independent pseudo assembly code.
  • This removes variations caused by different CPU architectures and compiler settings.

4. Syntactic Feature Extraction

For each code block, analyze the pseudo assembly code to extract the following features:

  • Unmangled Function Name
    Since the binaries are built from source, we can reconstruct the unmangled function name using its scope and symbol information.
    Example: std::ostream& std::operator<< <std::char_traits<char>>(std::ostream&, char const*)
  • Function Parameter Lists
    The types and order of the input and output parameters are extracted.
  • Constant Values
    The declaration site, name, and value of each constant are extracted.
  • Global Variable Usage
    The access site and the global variable accessed are extracted.
  • External Function Calls
    The calling site and the function called are extracted.
  • Local Variable Count
    The calling site and the number of local variables are extracted.

[Figure: Vulnerability Feature Extraction]

5. Integration with Hybrid Vulnerability Identification Engine

  • The extracted syntactic features are used in the Static Analysis phase to quickly identify code blocks related to the vulnerability.
  • The pseudo assembly code from both before and after the vulnerability fix is also used in the Dynamic Analysis phase to determine whether the vulnerability is present in the binary.
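
To illustrate how syntactic features could narrow down candidate code blocks quickly, the sketch below scores blocks by Jaccard similarity between feature sets. This is not the engine's documented method: the feature encoding, the threshold, and all names are assumptions chosen for illustration.

```python
from typing import Dict, FrozenSet, List, Tuple

Feature = Tuple[str, str]  # (kind, value), e.g. ("call", "memcpy")


def jaccard(a: FrozenSet[Feature], b: FrozenSet[Feature]) -> float:
    """Size of the intersection divided by size of the union (0.0 if both empty)."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def candidate_blocks(
    reference: FrozenSet[Feature],
    blocks: Dict[str, FrozenSet[Feature]],
    threshold: float = 0.5,
) -> List[str]:
    """Return names of blocks whose features overlap the reference, best first."""
    scored = [(jaccard(reference, feats), name) for name, feats in blocks.items()]
    return [name for score, name in sorted(scored, reverse=True) if score >= threshold]


# Hypothetical features of a vulnerable code block and of two blocks in a target binary.
reference = frozenset({("call", "memcpy"), ("const", "16"), ("global", "g_table")})
blocks = {
    "block_a": frozenset({("call", "memcpy"), ("const", "16")}),
    "block_b": frozenset({("call", "printf")}),
}
print(candidate_blocks(reference, blocks))
```

Set comparison of this kind is cheap, so only the blocks that pass it would need the more expensive comparison of pseudo assembly code.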

For more details, refer to Hybrid Vulnerability Identification Engine.