Reduce memory footprint by 0intro · Pull Request #2314 · OpenSCAP/openscap

0intro · 2026-02-18T14:11:24Z

Avoid keeping full libxml2 DOM trees in memory when only the parsed object model is needed. The Source DataStream XML (typically 20+ MB) was previously held as a DOM for the entire evaluation lifetime. Now it is parsed via a streaming xmlTextReader and freed early.

It includes the following changes:

- Add oscap_source_get_streaming_xmlTextReader(), which parses directly from a file or memory without building a persistent DOM.
- Use the streaming reader in all OVAL model importers.
- Serialize extracted SDS components to memory buffers instead of cloning DOM subtrees, so the full SDS DOM can be freed after extraction.
- Free XCCDF and OVAL source DOMs immediately after their object models are built.

Add oscap_source_get_streaming_xmlTextReader(), which creates an xmlTextReader directly from file contents or a memory buffer without first loading the full XML DOM. For file-based sources, the file is read into a memory buffer and parsed with xmlReaderForMemory(). For memory-based sources, the buffer is parsed directly. BZ2-compressed sources fall back to the existing DOM-based path. Also switch oscap_source_get_scap_type() and oscap_source_get_schema_version() to the streaming reader, which avoids constructing a DOM just to detect the document type and extract the schema version.

Switch oval_definition_model, oval_syschar_model, oval_variable_model, oval_directives_model, and oval_results_model import functions to use oscap_source_get_streaming_xmlTextReader() instead of oscap_source_get_xmlTextReader(). This avoids loading the full XML DOM into memory when importing OVAL documents, since the OVAL parsers only use streaming-compatible xmlTextReader API calls.

Instead of keeping cloned DOM trees for extracted DataStream components, serialize them to compact XML text buffers via xmlDocDumpMemory() and immediately free the cloned DOM. The component oscap_source is then created from the memory buffer using oscap_source_new_take_memory(). This reduces peak memory during SDS decomposition because serialized XML text is typically 3-5x smaller than its libxml2 DOM representation. The streaming xmlTextReader can also parse directly from these buffers without constructing an intermediate DOM.

Release the xmlDoc held by OVAL and XCCDF sources as soon as the corresponding object models have been built from them. In xccdf_session_load_oval(), call oscap_source_free_xmlDoc() on each OVAL source right after oval_definition_model_import_source(). In _xccdf_session_load_xccdf_benchmark(), free the XCCDF source DOM right after xccdf_benchmark_import_source(). This eliminates the window where both the XML DOM and the parsed object model coexist in memory during the loading phase.

sonarqubecloud · 2026-02-18T14:12:36Z

Quality Gate passed

Issues
0 New issues
0 Accepted issues

Measures
0 Security Hotspots
0.0% Coverage on New Code
0.0% Duplication on New Code

src/source/oscap_source.c

+	}
+
+	if (source->origin.filepath != NULL) {
+		int fd = open(source->origin.filepath, O_RDONLY);


0intro added 4 commits February 18, 2026 15:00

github-advanced-security bot found potential problems Feb 18, 2026

0intro wants to merge 4 commits into OpenSCAP:main from 0intro:djc/xml-memory
