[Repo Assist] Fix #748: HTML-encode XML doc text nodes and unresolved cref values by github-actions[bot] · Pull Request #994 · fsprojects/FSharp.Formatting

github-actions · 2026-02-23T17:23:34Z

🤖 This is an automated pull request from Repo Assist.

Closes #748

Summary

This PR fixes two related HTML-encoding gaps in GenerateModel.fs that could cause broken output or HTML injection when XML documentation comments contain special characters.

Root Cause

In readXmlElementAsHtml, two code paths were appending content to the HTML output without proper HTML encoding:

Text nodes (line 1901): html.Append(text). If XML doc text contains <, >, or & characters (e.g. in LaTeX math like \[ 1 < 2 < 3 > 0 \]), these were emitted as raw HTML characters, breaking the document structure.
Unresolved <see cref> values (line 1945): html.Append(cref.Value). Unresolved cross-references fall back to emitting the raw cref string (e.g. T:TheNamespace.GenericClass2`1). The code already noted this gap with a commented-out HtmlEncode call.

Fix

HTML-encode text nodes: html.Append(HttpUtility.HtmlEncode text)
Enable the already-commented-out cref encoding: let crefAsHtml = HttpUtility.HtmlEncode cref.Value
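As a rough illustration of the two changes, the sketch below applies HttpUtility.HtmlEncode to a text node and to a fallback cref string. The helper names appendText and crefFallback and the sample cref value are hypothetical and do not appear in GenerateModel.fs; only the HtmlEncode calls mirror the fix described above.

```fsharp
open System.Text
open System.Web

// Hypothetical stand-in for the text-node branch of readXmlElementAsHtml.
let appendText (html: StringBuilder) (text: string) =
    html.Append(HttpUtility.HtmlEncode text) |> ignore

// Hypothetical stand-in for the unresolved <see cref> fallback.
let crefFallback (crefValue: string) =
    let crefAsHtml = HttpUtility.HtmlEncode crefValue
    crefAsHtml

let html = StringBuilder()
appendText html @"\[ 1 < 2 < 3 > 0 \]"
printfn "%s" (html.ToString())
// prints: \[ 1 &lt; 2 &lt; 3 &gt; 0 \]

// Illustrative cref value only, chosen because it contains angle brackets.
printfn "%s" (crefFallback "M:Demo.op_LessThan<T>")
// prints: M:Demo.op_LessThan&lt;T&gt;
```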

Trade-offs

HTML entities in the source XML doc (like &lt;) are decoded by the XML parser before being stored as text node values. The added encoding step re-encodes them correctly for HTML output. Browsers decode HTML entities before passing text to MathJax (which reads from the DOM), so LaTeX math with < and > operators continues to work correctly.
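The round trip described above can be sketched as follows. This is a minimal model, not the project's code: XElement.Parse stands in for the XML doc reader, and HttpUtility.HtmlDecode is assumed here as a stand-in for the entity decoding a browser performs before MathJax reads the DOM.

```fsharp
open System.Web
open System.Xml.Linq

// XML doc fragment as written in source: '<' must be escaped to keep the XML valid.
let summary = XElement.Parse(@"<summary>\[ 1 &lt; 2 &lt; 3 &gt; 0 \]</summary>")

// 1. The XML parser decodes entities when exposing the text value.
let text = summary.Value

// 2. Encoding before appending restores valid HTML.
let encoded = HttpUtility.HtmlEncode text

// 3. Stand-in for the browser: entities are decoded again in the DOM text.
let seenByMathJax = HttpUtility.HtmlDecode(encoded)

printfn "%s" text           // \[ 1 < 2 < 3 > 0 \]
printfn "%s" encoded        // \[ 1 &lt; 2 &lt; 3 &gt; 0 \]
printfn "%b" (seenByMathJax = text)   // true
```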

The existing test for LaTeX math content was updated to expect the correctly HTML-encoded output (1 &lt; 2 &lt; 3 &gt; 0 instead of 1 < 2 < 3 > 0).

Test Status

✅ dotnet build src/FSharp.Formatting.ApiDocs/ — succeeded (0 errors)
✅ dotnet test tests/FSharp.ApiDocs.Tests/ — 68/68 passed (4 skipped, 0 failed)

Generated by Repo Assist

To install this workflow, run gh aw add githubnext/agentics/workflows/repo-assist.md@828ac109efb43990f59475cbfce90ede5546586c. View source at https://github.com/githubnext/agentics/tree/828ac109efb43990f59475cbfce90ede5546586c/workflows/repo-assist.md.

- HTML-encode text nodes in readXmlElementAsHtml to prevent HTML injection when XML doc text contains characters like '<', '>', '&'
- HTML-encode unresolved <see cref> values (the commented-out crefAsHtml code was already there, just never enabled)
- Update test to expect HTML-encoded output for math expressions with '<' and '>'

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

github-actions · 2026-02-23T17:23:36Z

✅ Pull request created: #994

github-actions bot added automation repo-assist labels Feb 23, 2026

This was referenced Feb 23, 2026

When using XML docs, backtick characters break the output #748

Open

[Repo Assist] Monthly Activity 2026-02 #973

Open

github-actions[bot] wants to merge 1 commit into main from repo-assist/fix-748-xml-doc-html-encoding-v2-6c850b1f53e8d053
