Vulnerability Assessment Report: PDF Generation Service
Executive Summary
During the assessment of the target application, a critical exploit chain was discovered leading to Arbitrary Local File Read (LFI). By chaining a cryptographic flaw (Modulo Bias) with an exposed debug endpoint and an insecure PDF rendering engine, an attacker can extract sensitive system files (e.g., /flag.txt, /etc/passwd).
Initial Reconnaissance & The Sanitizer Trap
Initial analysis focused on the /documents endpoint, which utilized markdown-pdf and sanitize-html.
While a mutation XSS bypass was identified in the sanitizer configuration (allowing an attacker to smuggle an <iframe> inside an <a> tag's style attribute), this path proved unnecessarily complex due to the presence of a completely unsanitized debug endpoint.
Register as Admin
The /register page does not reserve the "admin" username. An attacker gains admin access simply by registering a new user with that name.
The Vulnerable Endpoint (/document/debug/export)
The application exposes a debug route intended for administrative use.
Critical Finding: Unlike the standard document route, the content parameter here is passed directly to generatePDF() without any sanitization or Markdown translation. If the access_pass check can be bypassed, arbitrary HTML/JavaScript can be executed by the underlying PhantomJS engine.
3. The Cryptographic Flaw (Modulo Bias)
The verifyPass function checks the user's input against a rotating 4-digit PIN stored in a local file. If the guess is incorrect, the application immediately generates a new PIN by reducing a random 16-bit integer modulo $10,000$.
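The Flaw: Mapping a larger range (16-bit integer: $0$ to $65,535$, i.e. $65,536$ values) onto a smaller range ($10,000$ values) with a modulo operation introduces Modulo Bias whenever the larger range is not a multiple of the smaller one. Here $65,536 = 6 \times 10,000 + 5,536$, so the first $5,536$ PINs receive one extra source value each.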
- Numbers $0000$ to $5535$ are each produced by $7$ of the $65,536$ possible source values.
- Numbers $5536$ to $9999$ are each produced by only $6$.
Statistical Impact:
- $P(0000 \le x \le 5535) \approx 59.1\%$
- $P(5536 \le x \le 9999) \approx 40.9\%$
This skews the distribution toward the lower range: each PIN from $0000$ to $5535$ is generated with probability $7/65,536$, roughly $16.7\%$ more often than each PIN from $5536$ to $9999$ ($6/65,536$).
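The figures above can be checked by brute-force enumeration. The following is a minimal Python sketch, assuming only what this section states (a 16-bit source value reduced modulo 10,000). The constant and variable names are illustrative and are not taken from the application.

```python
from collections import Counter
from fractions import Fraction

SOURCE_RANGE = 2 ** 16   # 16-bit integer: 0..65535 (from the report)
PIN_SPACE = 10_000       # 4-digit PIN: 0000..9999

# Count how many 16-bit source values map to each PIN under `value % PIN_SPACE`.
counts = Counter(value % PIN_SPACE for value in range(SOURCE_RANGE))

# Split the PIN space by how often each PIN is produced.
low = sorted(pin for pin, n in counts.items() if n == 7)
high = sorted(pin for pin, n in counts.items() if n == 6)

# Exact probability mass of each group.
p_low = Fraction(7 * len(low), SOURCE_RANGE)
p_high = Fraction(6 * len(high), SOURCE_RANGE)

print(f"7 hits: {low[0]:04d}-{low[-1]:04d} ({len(low)} PINs), P = {float(p_low):.1%}")
print(f"6 hits: {high[0]:04d}-{high[-1]:04d} ({len(high)} PINs), P = {float(p_high):.1%}")
```

Using exact fractions avoids floating-point rounding when confirming that the two groups together cover the whole distribution.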
4. The Exploit Chain: "Collision Attack"
Because the PIN rotates on every failed attempt, a sequential brute-force attack (guessing 0000, 0001, 0002...) cannot narrow the search: every guess faces a freshly generated PIN, so candidates that were already tried can reappear.
Instead, the vulnerability is exploited using a Collision Attack:
- Static Target: The attacker continuously sends the exact same guess (e.g., 0000).
- Probability Leverage: Because 0000 falls within the biased range, it has a $1$ in $\approx 9,362$ chance of being generated.
- Middleware Bypass: The application's Express server lacks express.json() middleware but accepts application/x-www-form-urlencoded payloads. The attacker sends the static PIN and the LFI payload via URL-encoded form data.
- Execution: Once the server randomly generates 0000, the validation passes, and the malicious <iframe src='file:///flag.txt'> is rendered into the PDF, exfiltrating the local file.
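The advantage of a fixed guess in the biased range can be quantified offline. The sketch below is an assumption-laden model, not the application's code: it assumes each rotation draws a uniform 16-bit value and reduces it modulo 10,000, and it uses Python's random.Random as a stand-in for the server's generator. All function names are hypothetical.

```python
import random
from fractions import Fraction

SOURCE_RANGE = 2 ** 16   # 16-bit source value (from the report)
PIN_SPACE = 10_000       # 4-digit PIN space


def hit_probability(guess: int) -> Fraction:
    """Exact per-attempt chance that a freshly rotated PIN equals `guess`."""
    full_cycles, remainder = divmod(SOURCE_RANGE, PIN_SPACE)  # 6 and 5536
    hits = full_cycles + (1 if guess < remainder else 0)
    return Fraction(hits, SOURCE_RANGE)


def attempts_until_hit(guess: int, rng: random.Random) -> int:
    """Simulate repeating one guess against a PIN that is redrawn every attempt."""
    attempts = 1
    while rng.randrange(SOURCE_RANGE) % PIN_SPACE != guess:
        attempts += 1
    return attempts


# Expected number of attempts is the reciprocal of the per-attempt probability.
for guess in (0, 9999):
    expected = 1 / hit_probability(guess)
    print(f"guess {guess:04d}: expected attempts = {float(expected):.1f}")
```

Under this model, a guess in the range 0000 to 5535 needs about 9,362 attempts on average, while a guess in 5536 to 9999 needs about 10,923, so choosing the static target from the lower range saves roughly 14% of the requests.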
5. Remediation Recommendations
- Fix the RNG: Replace the modulo arithmetic with a secure integer generation method, such as crypto.randomInt(0, 10000).
- Sanitize Debug Inputs: Apply the same sanitize-html logic to the debug route as the standard application routes.
- Modernize Dependencies: Deprecate markdown-pdf (which relies on the vulnerable and unmaintained PhantomJS) in favor of a sandboxed, modern library like Puppeteer with local file access strictly disabled.
- Implement Rate Limiting: Add strict rate limiting (e.g., express-rate-limit) to the debug route to prevent rapid, automated guessing.
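To illustrate what an unbiased generator does differently, the following Python sketch shows rejection sampling over the same 16-bit source: values at or above 60,000 are discarded, so every PIN is backed by exactly 6 source values. This is an illustration of the technique only. The application itself is a Node.js service, where the crypto.randomInt call recommended above is the appropriate fix, and the function names here are hypothetical.

```python
import secrets

SOURCE_RANGE = 2 ** 16   # 16-bit source value
PIN_SPACE = 10_000       # 4-digit PIN space

# Largest multiple of PIN_SPACE that fits in the source range (60000).
LIMIT = SOURCE_RANGE - (SOURCE_RANGE % PIN_SPACE)


def unbiased_pin() -> str:
    """Draw a 4-digit PIN by rejection sampling over 16-bit random values."""
    while True:
        value = int.from_bytes(secrets.token_bytes(2), "big")
        if value < LIMIT:  # reject 60000..65535, which would cause the bias
            return f"{value % PIN_SPACE:04d}"


def unbiased_pin_stdlib() -> str:
    """Same result using the standard library's uniform helper."""
    return f"{secrets.randbelow(PIN_SPACE):04d}"


print(unbiased_pin(), unbiased_pin_stdlib())
```

On average fewer than 9% of draws are rejected (5,536 of 65,536), so the loop rarely runs more than once or twice.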
Appendix: Exploit Script
Python