07-24-2023, 06:04 AM
I'm having a lot of trouble figuring out how to select contents from specific HTML elements (which are in fact nodes) from an HTML file.
I'll admit first of all that this isn't "well-formed XML", but unless that really is my problem, I doubt it matters. Take this file:
<html>
<body id="464">
<div id="fullname"> Use Cases </div>
<div id="intro"> <b><font color="#000033">Use Cases </font></b> </div>
</body>
</html>
And this very barebone code I extracted from my full script:
xmlSourceTranslation = new ActiveXObject("Msxml2.DOMDocument.6.0");
xmlSourceTranslation.async = false; // boolean, not the string "false"
xmlSourceTranslation.load("file.html"); // the file name has to be a string
xmlSourceTranslation = xmlSourceTranslation.documentElement;
var sourceNode = xmlSourceTranslation.selectSingleNode("//*[@id = 'fullname']");
if (sourceNode !== null) { // typeof null is also 'object', so test for null directly
sourceText = sourceNode.firstChild.nodeValue;
}
The problem is that, depending on whether I select the `fullname` or the `intro` div and which accessor I use (`.firstChild.nodeValue`, `.innerHTML`, `.firstChild.innerHTML`, `.childNodes`), I either get `null` or `undefined`, or I get an `Object Required` error when trying to access it. The only reliable accessor is `sourceNode.text`, which works every time, but for the `intro` div it returns only "Use Cases " instead of the HTML markup, which is what I need.
I have been hitting my head on my desk for almost 2 days trying to figure it out.
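One direction that may be worth trying: MSXML nodes have no `innerHTML`, but every node exposes an `xml` property holding its serialized markup, and an element node's `nodeValue` is always `null`, which could explain some of the results above. Below is a minimal sketch, assuming the file loads as XML at all. `innerXml` is a hypothetical helper name, not part of MSXML, and the commented usage assumes the same `file.html` and `intro` id as above.

```javascript
// Hypothetical helper (not part of MSXML): concatenates the serialized
// markup of every child node, roughly what innerHTML would give in a browser.
function innerXml(node) {
    var markup = "";
    var children = node.childNodes;
    for (var i = 0; i < children.length; i++) {
        // Each MSXML node exposes its own serialized form through .xml
        markup += children.item(i).xml;
    }
    return markup;
}

// Assumed usage under Windows Script Host (JScript):
// var doc = new ActiveXObject("Msxml2.DOMDocument.6.0");
// doc.async = false;
// doc.load("file.html");
// var introNode = doc.selectSingleNode("//*[@id = 'intro']");
// if (introNode !== null) { var introMarkup = innerXml(introNode); }
```

If the wrapping `<div>` is wanted as well, `introNode.xml` alone should be enough. Whether the whitespace around the tags survives probably depends on the document's `preserveWhiteSpace` setting.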