<div class="field_81">
<div class="kn-detail-label">
<span>ICD C</span>
</div>
</div>
<div class="field_81">
<div class="kn-detail-body">
<span>H25.13</span>
</div>
</div>
</div>
</div>
</div>
</div>
<div class="kn-details-column column is-horizontal " style="flex-basis: 50%;">
<div class="kn-details-group column-3 columns">
<div class="field_82">
<div class="kn-detail-label">
<span>ICD D</span>
</div>
</div>
<div class="field_82">
<div class="kn-detail-body">
<span>H52.4</span>
</div>
</div>
Help to extract value from HTML
-
- Posts: 61
- Joined: 30 Aug 2017, 08:43
I want to extract some values from the HTML data. Each value sits inside an element whose class is field_ followed by a number (for example field_81), and I want to capture every value associated with such a class. There are almost 160 of these field_N classes, and I have no idea how to capture them. I have pasted an HTML example; the values I want to capture are the ones shown in red.
Re: Help to extract value from HTML
// removed
Last edited by Qysh on 21 Jun 2018, 12:01, edited 1 time in total.
Re: Help to extract value from HTML
Code: Select all
#SingleInstance, Force
HTML =
(LTrim
<div class="field_81">
<div class="kn-detail-label">
<span>ICD C</span>
</div>
</div>
<div class="field_81">
<div class="kn-detail-body">
<span>H25.13</span>
</div>
</div>
</div>
</div>
</div>
</div>
<div class="kn-details-column column is-horizontal " style="flex-basis: 50`%;">
<div class="kn-details-group column-3 columns">
<div class="field_82">
<div class="kn-detail-label">
<span>ICD D</span>
</div>
</div>
<div class="field_82">
<div class="kn-detail-body">
<span>H52.4</span>
</div>
</div>
)
Pos := 1
While (Pos := RegExMatch(HTML, "<div class=""field_\d+"">(.*?)<\/div>", MatchDIV, Pos+StrLen(MatchDIV))) {
RegExMatch(MatchDIV, "<span>(.*?)<\/span>", MatchSpan)
MsgBox, % MatchSpan1
}
Re: Help to extract value from HTML
Thanks for the help, but the HTML format can vary, and this only works with the example data. I have another idea for capturing the values: put each field_N element on its own line, like this.
Code: Select all
<div class="field_81"><div class="kn-detail-label"><span>ICD C</span></div></div>
<div class="field_81"><div class="kn-detail-body"><span>H25.13</span></div></div></div></div></div></div><div class="kn-details-column column is-horizontal " style="flex-basis: 50`%;"><div class="kn-details-group column-3 columns">
<div class="field_82"><div class="kn-detail-label"><span>ICD D</span></div></div>
<div class="field_82"><div class="kn-detail-body"><span>H52.4</span></div></div>
Re: Help to extract value from HTML
I don't understand how separate lines would make any difference.
My previous solution will work exactly the same.
Can you be more specific about the formatting of the HTML?
Code: Select all
#SingleInstance, Force
HTML =
(LTrim
<div class="field_81"><div class="kn-detail-label"><span>ICD C</span></div></div>
<div class="field_81"><div class="kn-detail-body"><span>H25.13</span></div></div></div></div></div></div><div class="kn-details-column column is-horizontal " style="flex-basis: 50`%;"><div class="kn-details-group column-3 columns">
<div class="field_82"><div class="kn-detail-label"><span>ICD D</span></div></div>
<div class="field_82"><div class="kn-detail-body"><span>H52.4</span></div></div>
)
Pos := 1
While (Pos := RegExMatch(HTML, "<div class=""field_\d+"">(.*?)<\/div>", MatchDIV, Pos+StrLen(MatchDIV))) {
RegExMatch(MatchDIV, "<span>(.*?)<\/span>", MatchSpan)
MsgBox, % MatchSpan1
}
Re: Help to extract value from HTML
Alternatively, create an object with ComObjCreate("HTMLfile"), write your HTML to it, and traverse the DOM as you normally would.
Re: Help to extract value from HTML
Yes, the script works with the example data, but it does not capture the data pasted below because the format is slightly different.
Code: Select all
<div class="kn-detail field_227">
<div class="kn-detail-label" style="min-width: 149px; max-width: 149px;">
<span>Query Date</span>
</div>
<div class="kn-detail field_227">
<span>06/22/2018</span>
</div>
</div>
Re: Help to extract value from HTML
The solution can't account for unknown unknowns. Either post the whole thing and say which parts need to be picked out, or modify it yourself, following the example provided.
How to get content from elements:
Code: Select all
html =
(LTrim %
<div class="kn-detail field_227">
<div class="kn-detail-label" style="min-width: 149px; max-width: 149px;">
<span>Query Date</span>
</div>
<div class="kn-detail field_227">
<span>06/22/2018</span>
</div>
</div>
<div class="field_81">
<div class="kn-detail-label">
<span>ICD C</span>
</div>
</div>
<div class="field_81">
<div class="kn-detail-body">
<span>H25.13</span>
</div>
</div>
</div>
</div>
</div>
</div>
<div class="kn-details-column column is-horizontal " style="flex-basis: 50%;">
<div class="kn-details-group column-3 columns">
<div class="field_82">
<div class="kn-detail-label">
<span>ICD D</span>
</div>
</div>
<div class="field_82">
<div class="kn-detail-body">
<span>H52.4</span>
</div>
</div>
)
document := ComObjCreate("HTMLfile")
document.write(html)
Spans := document.getElementsByTagName("span")
Loop % Spans.length
MsgBox % Spans[A_Index - 1].innerHTML
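If the field number is needed alongside each value, the same DOM approach can be extended. The following is only a sketch under a few assumptions: the sample HTML is cut down from the snippets posted in this thread, the variable names (Result, Field, Node, Found) are made up for illustration, and it assumes every span of interest sits somewhere below a div whose class contains field_N. It walks up from each span to the nearest ancestor carrying such a class, so extra class names like "kn-detail" and extra nesting should not matter.

```autohotkey
#NoEnv
; Sample data only: two of the layouts posted in this thread.
html =
(LTrim %
<div class="kn-detail field_227">
<div class="kn-detail-label" style="min-width: 149px; max-width: 149px;">
<span>Query Date</span>
</div>
<div class="kn-detail field_227">
<span>06/22/2018</span>
</div>
</div>
<div class="field_81">
<div class="kn-detail-label">
<span>ICD C</span>
</div>
</div>
<div class="field_81">
<div class="kn-detail-body">
<span>H25.13</span>
</div>
</div>
)

document := ComObjCreate("HTMLfile")
document.write(html)

Result := ""
Spans := document.getElementsByTagName("span")
Loop % Spans.length
{
    Span := Spans[A_Index - 1]
    Field := ""
    Node := Span.parentNode
    ; Walk up from the span until an ancestor carries a field_N class
    ; or the <body> element is reached.
    while (Node.nodeName != "BODY")
    {
        if (RegExMatch(Node.className, "\bfield_\d+\b", Found))
        {
            Field := Found
            break
        }
        Node := Node.parentNode
    }
    ; Spans without a field_N ancestor are skipped.
    if (Field != "")
        Result .= Field ": " Span.innerText "`n"
}
MsgBox % Result
```

Collecting everything into one string and showing a single MsgBox is just a convenience for 160 fields; appending to a file or an array would work the same way.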
Re: Help to extract value from HTML
RegExMatch stops working when I replace the sample data with a variable:
HTML = %Clipboard%
Re: Help to extract value from HTML
In that case the regex has to be amended to account for data in variable formats, which may or may not be possible at all. Otherwise, you need to make sure the data stays in the same format every time, or reformat it yourself as needed before passing it to the regex.
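As a rough illustration of what amending the regex could look like, here is a sketch based on the loop posted earlier in the thread, with two changes. Both rest on assumptions the thread does not confirm: that the clipboard text uses `r`n line endings (with AutoHotkey's default newline setting a dot does not match across a `r`n pair, while the lone `n of a continuation section is matched), and that the class attribute may contain other names before or after field_N. The s) option and the looser class pattern address those two points; the Needle and Result names and the StrReplace line that simulates clipboard data are only for the example.

```autohotkey
#NoEnv
HTML =
(LTrim
<div class="kn-detail field_227">
<div class="kn-detail-label" style="min-width: 149px; max-width: 149px;">
<span>Query Date</span>
</div>
<div class="kn-detail field_227">
<span>06/22/2018</span>
</div>
</div>
<div class="field_81">
<div class="kn-detail-body">
<span>H25.13</span>
</div>
</div>
)
; Simulate Windows line endings, as text from the clipboard or a file often has.
HTML := StrReplace(HTML, "`n", "`r`n")

; s) lets "." match line breaks; [^""]* tolerates other class names
; around field_N inside the class attribute.
Needle := "s)<div class=""[^""]*\bfield_\d+\b[^""]*"">(.*?)</div>"

Result := "", MatchDIV := "", Pos := 1
While (Pos := RegExMatch(HTML, Needle, MatchDIV, Pos + StrLen(MatchDIV))) {
    ; Take the first span inside the matched chunk, if there is one.
    if (RegExMatch(MatchDIV, "s)<span>(.*?)</span>", MatchSpan))
        Result .= MatchSpan1 "`n"
}
MsgBox, % Result
```

This still only looks as far as the first closing div after each field_N div, so a span nested more deeply, or placed after an inner closing div, would be missed. For markup that varies that much, the DOM route is probably the safer choice.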
Re: Help to extract value from HTML
TheDewd wrote:
Code: Select all
#SingleInstance, Force
HTML =
(LTrim
<div class="field_81">
<div class="kn-detail-label">
<span>ICD C</span>
</div>
</div>
<div class="field_81">
<div class="kn-detail-body">
<span>H25.13</span>
</div>
</div>
</div>
</div>
</div>
</div>
<div class="kn-details-column column is-horizontal " style="flex-basis: 50`%;">
<div class="kn-details-group column-3 columns">
<div class="field_82">
<div class="kn-detail-label">
<span>ICD D</span>
</div>
</div>
<div class="field_82">
<div class="kn-detail-body">
<span>H52.4</span>
</div>
</div>
)
Pos := 1
While (Pos := RegExMatch(HTML, "<div class=""field_\d+"">(.*?)<\/div>", MatchDIV, Pos+StrLen(MatchDIV))) {
RegExMatch(MatchDIV, "<span>(.*?)<\/span>", MatchSpan)
MsgBox, % MatchSpan1
}
This code stops working once I add the FileRead command. I have saved the same data in a txt file, but I am not sure what I am missing.
Code: Select all
#SingleInstance
F4::
Fileread, HTML, %A_ScriptDir%\HTML_File.txt
Pos := 1
While (Pos := RegExMatch(HTML, "<div class=""field_\d+"">(.*?)<\/div>", MatchDIV, Pos+StrLen(MatchDIV))) {
RegExMatch(MatchDIV, "<span>(.*?)<\/span>", MatchSpan)
MsgBox, % MatchSpan1
}
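One plausible cause, though nothing in the thread confirms it, is line endings: a text file saved on Windows usually separates lines with `r`n, whereas the continuation section in the working script joins lines with a lone `n. With AutoHotkey's default newline setting, the dot in (.*?) matches a lone `n but not a `r`n pair, so the same pattern can fail on file contents. The sketch below keeps the original pattern and instead uses FileRead's *t option, which converts `r`n to `n while reading. The file name is the one from the post; the ErrorLevel check and the Result variable are additions for illustration.

```autohotkey
#NoEnv
#SingleInstance, Force

F4::
    ; *t replaces each `r`n in the file with a plain `n while reading.
    FileRead, HTML, *t %A_ScriptDir%\HTML_File.txt
    if ErrorLevel
    {
        MsgBox, Could not read %A_ScriptDir%\HTML_File.txt
        return
    }
    ; Reset state so that repeated presses of F4 start from the top.
    Result := "", MatchDIV := "", Pos := 1
    While (Pos := RegExMatch(HTML, "<div class=""field_\d+"">(.*?)</div>", MatchDIV, Pos + StrLen(MatchDIV))) {
        if (RegExMatch(MatchDIV, "<span>(.*?)</span>", MatchSpan))
            Result .= MatchSpan1 "`n"
    }
    MsgBox, % Result
return
```

If the file turns out to use `n already, the *t option does no harm, and the remaining suspect would be a class attribute that is not exactly class="field_N".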
Re: Help to extract value from HTML
You might wanna take a look at this. It will help you parse HTML more easily.
https://autohotkey.com/boards/viewtopic.php?p=398#p398