How to Convert XML to JSON: Complete Guide with Examples
XML (eXtensible Markup Language) powered the web services revolution of the early 2000s. Two decades later, JSON has become the dominant data interchange format for APIs, configuration files, and real-time communication. Yet XML hasn't disappeared — it remains entrenched in enterprise systems, government data feeds, healthcare (HL7 CDA, FHIR resources wrapped in XML), financial services (ISO 20022, FpML), publishing (DITA, DocBook), and legacy SOAP services that still process millions of transactions daily.
If you work anywhere along this boundary between legacy XML and modern JSON, you need reliable conversion strategies. This guide covers the mapping rules, edge cases, and practical patterns you'll encounter when transforming XML into JSON, whether you're doing it manually with our converter or programmatically in your codebase.
1. Why XML-to-JSON Conversion Matters
The shift from XML to JSON isn't just a syntax preference — it reflects a broader architectural change. REST APIs replaced SOAP, NoSQL databases like MongoDB and CouchDB store JSON natively, and frontend frameworks (React, Vue, Angular) consume JSON directly. Converting XML to JSON bridges the gap between systems that produce XML and consumers that expect JSON.
Common scenarios where you'll need this conversion include:
- API gateway transformation: Accept XML from a legacy SOAP backend and return JSON to mobile/web clients
- Data lake ingestion: Parse XML feeds (RSS, Atom, government open data) into JSON for storage in cloud-native data platforms
- Configuration modernization: Convert Maven pom.xml, Spring applicationContext.xml, or .NET web.config into JSON equivalents
- Testing and debugging: View complex XML payloads in a more compact, readable JSON format to spot structural issues quickly
- Healthcare interoperability: Transform CDA documents or HL7v3 messages into FHIR-compatible JSON bundles
2. Core Mapping Rules
Every XML-to-JSON converter must address fundamental differences between the two formats. XML is a tree of named elements with optional attributes and text content. JSON has objects (key-value maps), arrays (ordered lists), and primitive types (string, number, boolean, null). Here are the foundational mapping rules:
Elements become object properties
Each XML child element maps to a key in a JSON object. The tag name becomes the key, and the element's content becomes the value.
<!-- XML -->
<person>
  <name>Lee</name>
  <age>29</age>
  <email>lee@example.com</email>
</person>
// JSON
{
  "person": {
    "name": "Lee",
    "age": "29",
    "email": "lee@example.com"
  }
}
Hierarchy is preserved
Nested XML elements become nested JSON objects. The tree structure translates directly, and element names and text survive the conversion as long as the mapping rules are applied consistently.
<!-- XML -->
<order>
  <id>1001</id>
  <customer>
    <name>Jordan</name>
    <address>
      <city>Portland</city>
      <state>OR</state>
    </address>
  </customer>
</order>
// JSON
{
  "order": {
    "id": "1001",
    "customer": {
      "name": "Jordan",
      "address": {
        "city": "Portland",
        "state": "OR"
      }
    }
  }
}
Empty elements
Self-closing or empty XML elements (like <notes/>) can be represented as empty strings, null, or empty objects in JSON depending on your convention. Choose one approach and apply it consistently.
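The rules in this section can be sketched with Python's standard library. This is a minimal illustration, not a full converter: the function names are made up for the example, attributes are ignored, and child tag names are assumed to be unique within a parent. The `empty` parameter is where you pick the convention for empty elements.

```python
import json
import xml.etree.ElementTree as ET


def element_to_value(elem, empty=None):
    """Map an element to a dict of its children, or to its text if it is a leaf."""
    children = list(elem)
    if not children:
        text = (elem.text or "").strip()
        # Empty and self-closing elements all map to one fixed convention.
        return text if text else empty
    # Assumption: child tags are unique; repeated tags would overwrite each other.
    return {child.tag: element_to_value(child, empty) for child in children}


def xml_to_dict(xml_text, empty=None):
    """Parse XML text and wrap the converted tree under the root tag name."""
    root = ET.fromstring(xml_text)
    return {root.tag: element_to_value(root, empty)}


xml_text = """
<order>
  <id>1001</id>
  <customer>
    <name>Jordan</name>
  </customer>
  <notes/>
</order>
"""
result = xml_to_dict(xml_text)
print(json.dumps(result, indent=2))
```

Passing `empty=""` instead of the default `None` switches every empty element to an empty string, which keeps the choice in one place.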
3. Handling Attributes
XML attributes are metadata attached to elements. JSON has no native concept of attributes — everything is a property. The most common solution is to prefix attribute keys with a special character to distinguish them from child elements.
The @ prefix convention
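This is the most widely adopted approach. xmltodict (Python) and Newtonsoft.Json (C#) apply it by default. xml2js (Node.js) groups attributes under a $ key unless you configure attrkey or attribute-name processors, and Jackson's XmlMapper (Java) emits attributes as unprefixed properties when reading into a tree. With the convention applied, attributes get an @ prefix and sit alongside child element properties.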
<!-- XML with attributes -->
<user id="42" role="admin">
  <name>Sam</name>
  <email>sam@company.com</email>
</user>
// JSON with @ prefix
{
  "user": {
    "@id": "42",
    "@role": "admin",
    "name": "Sam",
    "email": "sam@company.com"
  }
}
Alternative conventions
Some systems use different conventions: a - prefix, a $ key (xml2js groups attributes under $ by default), or a nested _attributes object that holds all attributes. BadgerFish also prefixes attributes with @ and reserves $ for text content. Our converter uses the @ prefix because it's the most recognized and interoperable option.
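A rough sketch of the prefix convention using only Python's standard library is shown here. The `convert` function and its `attr_prefix` and `text_key` parameters are illustrative names, not any library's API, and the sketch assumes child tags are unique. When an element carries attributes, its text is stored under the text key so the two do not collide.

```python
import xml.etree.ElementTree as ET


def convert(elem, attr_prefix="@", text_key="#text"):
    """Convert an element, prefixing attribute names so they stay distinct."""
    node = {attr_prefix + name: value for name, value in elem.attrib.items()}
    for child in elem:
        # Assumption: unique child tags; repeated siblings need array handling.
        node[child.tag] = convert(child, attr_prefix, text_key)
    text = (elem.text or "").strip()
    if not node:
        return text  # plain leaf: no attributes and no children
    if text:
        node[text_key] = text
    return node


user = ET.fromstring('<user id="42" role="admin"><name>Sam</name></user>')
price = ET.fromstring('<price currency="USD">29.99</price>')
user_json = {user.tag: convert(user)}
price_json = {price.tag: convert(price)}
print(user_json)
print(price_json)
```

Switching conventions is then a matter of changing one argument, for example `convert(user, attr_prefix="-")`.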
Elements with attributes and text
When an element has attributes and text content but no child elements, the text is typically stored under a #text key:
<!-- XML -->
<price currency="USD">29.99</price>
// JSON
{
  "price": {
    "@currency": "USD",
    "#text": "29.99"
  }
}
4. Arrays and Repeated Elements
This is one of the trickiest aspects of XML-to-JSON conversion. In XML, repeated sibling elements with the same tag name naturally represent a list. In JSON, they map to an array. The problem arises when sometimes there's one item and sometimes there are many.
The single-item vs. array ambiguity
Consider an XML response from an API. When there's one result, naive conversion produces an object. When there are two results, it produces an array. This inconsistency breaks client code that always expects an array.
<!-- One item: naive conversion gives an object -->
<results>
  <item>Widget</item>
</results>
→ { "results": { "item": "Widget" } }
<!-- Two items: naive conversion gives an array -->
<results>
  <item>Widget</item>
  <item>Gadget</item>
</results>
→ { "results": { "item": ["Widget", "Gadget"] } }
Best practice: force arrays
Most production systems configure their converter to always wrap certain elements in arrays, regardless of count. xml2js does this for every child element by default (explicitArray: true), and xmltodict accepts a force_list argument naming the tags to wrap. Our online converter preserves the natural mapping, so you may need to wrap single items in arrays in your client code.
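To make the ambiguity and the fix concrete, here is a small standard-library sketch. The `always_list` parameter is a hypothetical name for the set of tags that should always become arrays, and the sketch ignores attributes to stay short.

```python
import xml.etree.ElementTree as ET


def convert(elem, always_list=frozenset()):
    """Convert an element; tags named in always_list become lists even when single."""
    children = list(elem)
    if not children:
        return (elem.text or "").strip()
    node = {}
    for child in children:
        value = convert(child, always_list)
        if child.tag in node:
            # Second or later sibling with this tag: promote to a list.
            if not isinstance(node[child.tag], list):
                node[child.tag] = [node[child.tag]]
            node[child.tag].append(value)
        elif child.tag in always_list:
            node[child.tag] = [value]
        else:
            node[child.tag] = value
    return node


one = ET.fromstring("<results><item>Widget</item></results>")
two = ET.fromstring("<results><item>Widget</item><item>Gadget</item></results>")

natural_one = convert(one)           # shape depends on the item count
forced_one = convert(one, {"item"})  # shape is stable
forced_two = convert(two, {"item"})
print(natural_one, forced_one, forced_two)
```

With the forced variant, client code can iterate over `item` without checking its type first.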
5. Mixed Content and Text Nodes
Mixed content is when an element contains both text and child elements. This is common in document-oriented XML (like HTML fragments embedded in data) but awkward to represent in JSON because JSON objects don't natively mix free text with structured properties.
<!-- Mixed content XML -->
<description>
  This product is <b>highly rated</b> and <i>affordable</i>.
</description>
// One possible JSON representation
{
  "description": {
    "#text": ["This product is ", " and ", "."],
    "b": "highly rated",
    "i": "affordable"
  }
}
Mixed content is inherently lossy when converting to JSON. If preserving the exact reading order matters, consider keeping the XML fragment as a raw string in your JSON (e.g., store the inner XML as an escaped string) or use a more sophisticated tree representation with ordered nodes.
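Both order-preserving options can be sketched with the standard library. The node shape used here (`type`, `value`, `name`, `children`) is an assumption made for illustration rather than an established format, and attributes are left out for brevity.

```python
import xml.etree.ElementTree as ET


def ordered_nodes(elem):
    """Return the element's content as text and element nodes in reading order."""
    nodes = []
    if elem.text:
        nodes.append({"type": "text", "value": elem.text})
    for child in elem:
        nodes.append(
            {"type": "element", "name": child.tag, "children": ordered_nodes(child)}
        )
        # ElementTree stores the text that follows a child in child.tail.
        if child.tail:
            nodes.append({"type": "text", "value": child.tail})
    return nodes


def inner_xml(elem):
    """Return the element's content as one raw XML string."""
    # ET.tostring includes each child's tail text by default.
    return (elem.text or "") + "".join(
        ET.tostring(child, encoding="unicode") for child in elem
    )


root = ET.fromstring(
    "<description>This product is <b>highly rated</b> and <i>affordable</i>.</description>"
)
nodes = ordered_nodes(root)
raw = inner_xml(root)
print(nodes)
print(raw)
```

The ordered list is easier to traverse in code, while the raw string is simpler when the consumer only needs to render the fragment.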
6. Namespaces in Conversion
XML namespaces prevent tag-name collisions when combining vocabularies from different sources. For example, an invoice document might include elements from both a billing namespace and a shipping namespace. An <address> in the billing context and an <address> in shipping have different semantics.
<invoice xmlns:bill="http://example.com/billing"
         xmlns:ship="http://example.com/shipping">
  <bill:address>
    <bill:street>123 Main</bill:street>
  </bill:address>
  <ship:address>
    <ship:street>456 Warehouse Rd</ship:street>
  </ship:address>
</invoice>
When converting to JSON, you have several options:
- Strip namespaces: Use just the local tag name. Simplest but risks collisions.
- Preserve prefixes: Use keys like "bill:street". Readable but depends on prefix stability.
- Use Clark notation: Fully qualified namespace URIs in keys, such as "{http://example.com/billing}street". Unambiguous but verbose.
For most use cases, stripping namespaces and using flat keys works fine if your XML doesn't have colliding tag names. Our converter preserves namespace prefixes in the output so you don't lose information.
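The three key strategies can be compared side by side in a short sketch. Python's ElementTree already reports tags in Clark notation, so the other two forms are derived from it. The `PREFIXES` table and the `mode` argument are assumptions introduced for this example.

```python
import xml.etree.ElementTree as ET

XML = """
<invoice xmlns:bill="http://example.com/billing"
         xmlns:ship="http://example.com/shipping">
  <bill:address><bill:street>123 Main</bill:street></bill:address>
  <ship:address><ship:street>456 Warehouse Rd</ship:street></ship:address>
</invoice>
"""

# Hypothetical prefix table; in practice it would come from your mapping spec.
PREFIXES = {
    "http://example.com/billing": "bill",
    "http://example.com/shipping": "ship",
}


def split_tag(tag):
    """Split a Clark-notation tag '{uri}local' into (uri, local)."""
    if tag.startswith("{"):
        uri, local = tag[1:].split("}", 1)
        return uri, local
    return "", tag


def key_for(tag, mode):
    """Build the JSON key for a tag under the chosen namespace strategy."""
    uri, local = split_tag(tag)
    if mode == "strip" or not uri:
        return local
    if mode == "prefix":
        return f"{PREFIXES[uri]}:{local}"
    return tag  # "clark": keep '{uri}local' unchanged


def convert(elem, mode):
    children = list(elem)
    if not children:
        return (elem.text or "").strip()
    return {key_for(child.tag, mode): convert(child, mode) for child in children}


root = ET.fromstring(XML)
stripped = convert(root, "strip")
prefixed = convert(root, "prefix")
clark = convert(root, "clark")
print(stripped)
print(prefixed)
```

In the stripped result the shipping address silently overwrites the billing address, which is the collision risk described in the list.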
7. Type Coercion: Strings, Numbers, and Booleans
A critical difference: XML element text is always a string. JSON has distinct types for numbers, booleans, and null. When converting XML to JSON, a value like <count>42</count> becomes the string "42", not the number 42.
This matters because downstream code may fail silently. JavaScript's "42" + 1 gives "421" (string concatenation), not 43.
Strategies for type-safe conversion:
- Define an XML Schema (XSD) that specifies types, then use a converter that reads the schema
- Apply a post-processing step that coerces known fields to their expected types
- Use libraries like fast-xml-parser (Node.js) that offer built-in number/boolean parsing options
- Validate the final JSON against a JSON Schema to catch type mismatches
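The post-processing strategy might look like the following sketch. The `CASTS` table is a hypothetical per-field mapping invented for this example. Casting only named fields, rather than anything that looks numeric, keeps values such as ZIP codes with leading zeros intact.

```python
def to_bool(text):
    """Parse common boolean spellings; reject anything else instead of guessing."""
    lowered = text.strip().lower()
    if lowered in ("true", "1"):
        return True
    if lowered in ("false", "0"):
        return False
    raise ValueError(f"not a boolean: {text!r}")


# Hypothetical field-to-cast table for one message shape (an assumption).
CASTS = {"count": int, "price": float, "active": to_bool}


def coerce_fields(node, casts):
    """Return a copy of node with string values cast for the keys listed in casts."""
    if isinstance(node, list):
        return [coerce_fields(item, casts) for item in node]
    if not isinstance(node, dict):
        return node
    out = {}
    for key, value in node.items():
        if key in casts and isinstance(value, str):
            out[key] = casts[key](value)
        else:
            out[key] = coerce_fields(value, casts)
    return out


raw = {"item": {"count": "42", "price": "29.99", "active": "false", "zip": "02134"}}
typed = coerce_fields(raw, CASTS)
print(typed)
```

A value that does not match its declared type raises an error at conversion time, which is usually easier to debug than a silent mismatch downstream.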
8. CDATA Sections
CDATA sections let you embed raw text (including characters like <, &, and >) without escaping them. They're commonly used for embedding HTML, SQL, or code snippets inside XML.
<template>
  <html><![CDATA[
    <div class="header">
      <h1>Welcome & hello</h1>
    </div>
  ]]></html>
</template>
When converting to JSON, the CDATA wrapper is stripped and the inner text becomes a regular string value. The raw content (including HTML tags) is preserved as a string, so your JSON consumer receives the exact text that was inside the CDATA section.
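This behavior is easy to check with Python's standard library, which exposes CDATA content as ordinary element text. The snippet is a small demonstration under that assumption and uses a one-line variant of the template for clarity.

```python
import json
import xml.etree.ElementTree as ET

xml_text = """<template>
  <html><![CDATA[<div class="header"><h1>Welcome & hello</h1></div>]]></html>
</template>"""

root = ET.fromstring(xml_text)
# The parser drops the CDATA wrapper and returns the raw characters.
html = root.find("html").text

# json.dumps escapes the double quotes; the markup itself is untouched.
payload = json.dumps({"template": {"html": html}})
print(payload)
```

Parsing the payload back with `json.loads` returns the same HTML string, unescaped ampersand included.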
9. Real-World Migration Patterns
Converting a few elements manually is straightforward. Migrating entire systems from XML to JSON at scale requires a disciplined approach. Here are patterns used by engineering teams in production:
Pattern 1: API Gateway Translation
Place a reverse proxy or API gateway (like AWS API Gateway, Kong, or NGINX with XSLT) between your SOAP backend and REST clients. The gateway converts XML responses to JSON on the fly. This lets you modernize the client experience without touching the backend.
Pattern 2: ETL with Schema Mapping
In data engineering, define a formal mapping specification (which XML elements map to which JSON fields, type coercion rules, array handling). Implement this in your ETL tool (Apache NiFi, Talend, custom scripts) and validate output against a JSON Schema. This is how healthcare systems convert CDA documents to FHIR bundles.
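A mapping specification of this kind can be as simple as a table that a small function applies. The sketch below uses Python's standard library. The `MAPPING` layout (JSON field, element path, cast, list flag) and the field names are assumptions made for illustration, not a format used by any particular ETL tool.

```python
import xml.etree.ElementTree as ET

# Hypothetical mapping spec: JSON field -> (element path, cast, is_list).
MAPPING = {
    "orderId": ("id", int, False),
    "customerName": ("customer/name", str, False),
    "skus": ("items/item", str, True),
}


def apply_mapping(root, mapping):
    """Build a flat JSON-ready record from an XML tree using the mapping spec."""
    record = {}
    for field, (path, cast, is_list) in mapping.items():
        values = [cast((elem.text or "").strip()) for elem in root.findall(path)]
        if is_list:
            record[field] = values  # always a list, even for zero or one match
        else:
            record[field] = values[0] if values else None
    return record


xml_text = (
    "<order><id>1001</id><customer><name>Jordan</name></customer>"
    "<items><item>A-1</item></items></order>"
)
record = apply_mapping(ET.fromstring(xml_text), MAPPING)
print(record)
```

Keeping the spec as data means type coercion and array handling are reviewed in one place, and the output can then be validated against a JSON Schema as a separate step.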
Pattern 3: Dual-Format Support Period
During migration, support both XML and JSON simultaneously. Serve both formats from the same API using content negotiation (Accept: application/json vs. Accept: application/xml). This gives consumers time to migrate without a hard cutover.
Pattern 4: Event Stream Transformation
For real-time systems, use a message broker (Kafka, RabbitMQ) with a transformation layer. XML messages published by legacy producers are consumed by a converter service that republishes as JSON to a separate topic. Downstream services subscribe to the JSON topic.
10. Code Examples
Here's how to convert XML to JSON programmatically in the three most common languages:
JavaScript (Node.js with xml2js)
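const xml2js = require('xml2js');
const xml = `
<book id="978">
  <title>Clean Code</title>
  <author>Robert Martin</author>
  <year>2008</year>
</book>`;
xml2js.parseString(xml, {
  // Merge attributes into the element object and prefix their names with "@".
  // (attrkey alone would group them under a single nested key instead.)
  mergeAttrs: true,
  attrNameProcessors: [(name) => `@${name}`],
  charkey: '#text',
  explicitArray: false, // single children stay scalars; repeated tags still become arrays
}, (err, result) => {
  if (err) throw err;
  console.log(JSON.stringify(result, null, 2));
});
Python (with xmltodict)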
import json
import xmltodict
xml = """
<book id="978">
  <title>Clean Code</title>
  <author>Robert Martin</author>
  <year>2008</year>
</book>
"""
# Attributes get an "@" prefix by default, e.g. "@id": "978".
result = xmltodict.parse(xml)
print(json.dumps(result, indent=2))
Java (with Jackson)
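import com.fasterxml.jackson.databind.JsonNode;
import com.fasterxml.jackson.databind.ObjectMapper;
import com.fasterxml.jackson.dataformat.xml.XmlMapper;
import java.nio.charset.StandardCharsets;
public class XmlToJson {
  public static void main(String[] args) throws Exception {
    String xml = "<book><title>Clean Code</title></book>";
    // Parse the XML into Jackson's generic tree model.
    // Note: the root element name (book) is not part of the resulting tree.
    XmlMapper xmlMapper = new XmlMapper();
    JsonNode node = xmlMapper.readTree(xml.getBytes(StandardCharsets.UTF_8));
    // Serialize the same tree as pretty-printed JSON.
    ObjectMapper jsonMapper = new ObjectMapper();
    String json = jsonMapper.writerWithDefaultPrettyPrinter()
        .writeValueAsString(node);
    System.out.println(json);
  }
}
11. Common Mistakes and How to Fix Them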
Mistake: Not handling the single-item array case
Your code expects data.items to always be an array, but when there's only one item, the converter returns an object. Your .map() call throws.
Fix: Use Array.isArray(x) ? x : [x] to normalize, or configure your parser to force arrays (explicitArray in xml2js, force_list in xmltodict).
Mistake: Assuming values are typed
A converter turns <active>true</active> into the string "true", not the boolean true. Checks like if (data.active) always pass because both "true" and "false" are non-empty strings, and non-empty strings are truthy.
Fix: Explicitly compare strings or add a type coercion layer.
Mistake: Losing attribute data
If your converter strips attributes, metadata like IDs, currencies, and language codes disappear silently.
Fix: Use a converter that preserves attributes (like this one) and verify attribute-heavy elements appear in the output.
Mistake: Ignoring XML declaration encoding
XML documents may declare encoding="ISO-8859-1" or other charsets. If you parse as UTF-8 without re-encoding, special characters (accents, symbols) will be corrupted.
Fix: Check the XML declaration's encoding and transcode to UTF-8 before parsing, or hand the raw bytes to a parser that honors the declaration.
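The difference shows up in a few lines of Python. This sketch assumes the document arrives as raw bytes, for example from a file or an HTTP response body. Passing those bytes straight to the parser lets it read the declaration, whereas decoding them as UTF-8 first fails on the accented character.

```python
import xml.etree.ElementTree as ET

# "Jos\u00e9" is "José"; the bytes are encoded as the declaration says.
raw = '<?xml version="1.0" encoding="ISO-8859-1"?><name>Jos\u00e9</name>'.encode(
    "iso-8859-1"
)

# Bytes in: the parser honors the declared encoding.
name = ET.fromstring(raw).text

# Decoding as UTF-8 first is the mistake described above.
try:
    raw.decode("utf-8")
    utf8_ok = True
except UnicodeDecodeError:
    utf8_ok = False

print(name, utf8_ok)
```

Once parsed, the text is an ordinary Unicode string, so serializing it to JSON produces valid UTF-8 output without a separate transcoding step.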
12. Choosing a Conversion Library
If you're building an automated pipeline, choose a library that handles your specific edge cases. Here's a comparison of popular options:
| Library | Language | Strengths | Watch Out For |
|---|---|---|---|
| xml2js | Node.js | Mature, highly configurable (explicitArray, attrkey, charkey) | Callback-first API (use parseStringPromise or promisify for async/await) |
| fast-xml-parser | Node.js | Fast, supports number/boolean parsing natively | Different default conventions than xml2js |
| xmltodict | Python | Simple one-liner API, handles attributes with @ | Limited streaming support for large files |
| Jackson XML | Java | Enterprise-grade, annotation-based mapping | Complex configuration for advanced cases |
| Newtonsoft.Json | C# / .NET | Native .NET support, JsonConvert.SerializeXNode | Namespace handling requires careful config |
Try It Yourself
Understanding these mapping rules and pitfalls lets you convert XML to JSON reliably, whether you're doing a one-off conversion or building an automated pipeline. For quick conversions, use our XML to JSON converter, which runs entirely in your browser and handles attributes, arrays, and nested structures automatically.
For the reverse direction, see our guide on converting JSON to XML, or explore the XML vs JSON comparison to decide which format is right for your project.