inspect element.innertext

nil blivion
Posts: 7
Joined: 27 Nov 2017, 21:26

27 Nov 2017, 21:36

I am having difficulty scraping data from this site: https://market24hclock.com/
I have inspected the element for the EUR/USD price, but I am unable to extract the number 1.19295 from this element: <td class ="symbol-last" style="curser: pointer;">1.19295</td>

I am having difficulty reading Mickers' COM tutorial as I am partially sighted. I have tried inspectelementsbyid and inspectelementsbyname, but I get nothing from these.

I am not requesting somebody to code this for me but that would be very useful. I just want to be directed to the appropriate tutorial/documentation.

Thanks in advance

0blivion.
bloody captchas, they are so hard to read
nil blivion
Posts: 7
Joined: 27 Nov 2017, 21:26

Re: inspect element.innertext

28 Nov 2017, 09:37

Hello, my name is nil Blivion.

I have reviewed my original request for help and noticed that I come across as an idiot attempting to get someone to write code for me for free, so here I will tell you about myself and publish my code.

I have been using AHK for 7 years and have become rather good at writing scripts that search for pixel colours on a horizontal or vertical line selected from inside a GUI. I have also frequently used OutputVar to extract text from inside an HTML page.
Unfortunately, I have never learned HTML, so I am unfamiliar with elements, hence my struggling here.
I cannot get iWB2 to work on my computers (Win10, Win7 and Vista). I am partially sighted, and I find it difficult to communicate as I have Asperger's.

I am currently in the process of converting Mickers' COM tutorial to speech, which can be viewed here:
https://www.youtube.com/watch?v=Gj7QhlQQmRw

My code so far:
;#############################################################################
;#### ####
;#### Scraping Tool for https://market24hclock.com/ ####
;#### ####
;#############################################################################

;#############################################################################
;## ###
;## Top of code ###
### ###
;#############################################################################


;******************************************************************
;** 1. Create IE Object ***
;** 2. Navigate to https://market24hclock.com/ ***
; 3. Make IE Visible ***
*******************************************************************

pwb := ComObjCreate("InternetExplorer.Application")
pwb.navigate("; https://market24hclock.com/")
pwb.visible true

;*********************************************************
;** Wait while page loads ***
;*********************************************************

while pwb.readyState!=4 || pwb.document.readyState != "complete" || pwb.busy
continue

;********************************************************************************
;** InspecElement and extract innertext to a variable ***
;** ***
;** Attempting to resd from following live element ***
;** <td class="symbol-last" style="curser: pointer;">1.18851</td> ***
;**********************************************************************************

Price := pwb.getElementsByIdTagName ("td" [2].Innertext

;*********************************************************
;** Output Value To Screen ***
;*********************************************************

Msgbox Current EUR/USD is %Price%

exitapp
;#############################################################################
;## ###
;## Bottom of code ###
### ###
;#############################################################################
Xeo786
Posts: 760
Joined: 09 Nov 2015, 02:43
Location: Karachi, Pakistan

Re: inspect element.innertext

29 Nov 2017, 00:51

https://autohotkey.com/board/topic/5698 ... otkey-v11/

There is a missing ")" and there is a space before the opening bracket, both of which will stop it from working:
Price := pwb.getElementsByIdTagName ("td" [2].Innertext

Code: Select all

Price := pwb.getElementsByIdTagName("td")[2].Innertext
gregster
Posts: 9054
Joined: 30 Sep 2013, 06:48

Re: inspect element.innertext

29 Nov 2017, 02:49

Well, it is a tricky one, not only because of the

Code: Select all

<td class="symbol-last" style="curser: pointer;">
part: The class name is constantly changing from "symbol-last" to "symbol-last growing" and "symbol-last falling" and back.

I am really no expert with dynamic webpages, so I don't even know if I can solve this, but a few remarks that might help to get you further:

1. You are missing ; in some of your comment lines, which causes errors
2. Replace

Code: Select all

pwb.navigate("; https://market24hclock.com/")
pwb.visible true
with

Code: Select all

pwb.navigate("https://market24hclock.com/")
pwb.visible := true
3. I don't think that there is a DOM method called "getElementsByIdTagName".
It's either "getElementById", which gives you a single element with a unique ID (but there are no IDs in this part of the HTML source code), or "getElementsByTagName", which gives you a collection of elements (not sure how to handle a collection). <td> would be a tag, but there are many of these...
There is also "getElementsByClassName", which would give you a collection, but since the class name is constantly changing, that might get complicated, too, even if you can handle a collection.
4. Xeo786 is right, you are missing a closing bracket and you are not allowed to have a space between ...TagName and the opening bracket, but like I said, the whole DOM identifier seems to be wrong in this case.
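
On handling a collection: a minimal sketch, assuming the cell can be recognised by its class name starting with "symbol-last" (which would also cover "symbol-last growing" and "symbol-last falling"). The function name GetTextByClassPrefix and the sample markup (including the "symbol-name" class) are made up for illustration, and the "HTMLFile" object only stands in for whichever live document actually contains the table.

```autohotkey
; Hypothetical sample markup; only the "symbol-last" cell comes from the thread.
sampleHtml := "<table><tr><td class=""symbol-name"">EUR/USD</td>"
sampleHtml .= "<td class=""symbol-last growing"">1.19295</td></tr></table>"

; Build an in-memory HTML document so the sketch runs without a browser window.
doc := ComObjCreate("HTMLFile")
doc.write(sampleHtml)
doc.close()

; Walk every element with the given tag and return the text of the first
; one whose class attribute starts with classPrefix.
GetTextByClassPrefix(doc, tagName, classPrefix) {
    cells := doc.getElementsByTagName(tagName)
    Loop, % cells.length
    {
        cell := cells[A_Index - 1]    ; DOM collections are zero-based
        if (InStr(cell.className, classPrefix) = 1)
            return cell.innerText
    }
    return ""                         ; nothing matched
}

Price := GetTextByClassPrefix(doc, "td", "symbol-last")
```

With a live page, the same function could be called with the browser's document object in place of doc, and Price shown with MsgBox as in the original script.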

There are most probably ways to do that somehow with DOM, but I would consider making a "snapshot" of the whole dynamic source code instead, with the AHK command UrlDownloadToFile (or even better the function UrlDownloadtoVar which you can find somewhere in the forum) and then applying some "dirty" string operations (or RegEx magic) on the html source.
If I have time in the next day or two, I will try to look into that (also in the DOM option), but perhaps someone with more experience with dynamic html can already help you in the meantime.

EDIT: You should also be able to get the HTML via:

Code: Select all

html := pwb.document.body.innerhtml
(perhaps you can even narrow it down further by adding more elements of the DOM tree - and then apply string operations)
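
The string-operation idea could look roughly like this: a sketch only, assuming the HTML text you end up with really contains the price cell and that the cell holds nothing but digits and a dot. The sample string is a stand-in for the real page source, and the pattern tolerates extra class words such as "growing" or "falling" as well as missing quotes around the attribute value.

```autohotkey
; Stand-in for the downloaded source or pwb.document.body.innerhtml.
html := "<tr><td class=""symbol-last falling"" style=""curser: pointer;"">1.18851</td></tr>"

; i)          case-insensitive, since tag names may come back in upper case
; class=""?   the attribute value may or may not be quoted
; ([\d.]+)    capture the number itself into m1
pattern := "i)<td[^>]*class=""?symbol-last[^>]*>\s*([\d.]+)\s*</td>"

if (RegExMatch(html, pattern, m))
    Price := m1
else
    Price := ""    ; pattern not found, e.g. the table lives in another frame
```

RegExMatch stops at the first match, so this returns the first "symbol-last" cell in the text; picking a specific currency pair would need a pattern that also anchors on the pair's name.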
Joe Glines
Posts: 770
Joined: 30 Sep 2013, 20:49
Location: Dallas
Contact:

Re: inspect element.innertext

29 Nov 2017, 06:02

You have frames on the page. Here's what I wrote.

Code: Select all

pwb := WBGet()
frame:= ComObjActive(9,ComObjQuery(pwb.document.parentwindow.frames[2],"{332C4427-26CB-11D0-B483-00C04FD90119}","{332C4427-26CB-11D0-B483-00C04FD90119}"),1).document.documentElement ;Get pointer to pointer similar to pwb.document.  ;querying the Comobject of the iframe's contentWindow one gets a pointer to its interface. This pointer needs to be wrapped with ComObjectActive()
;***********now extract data*******************
Data:=[]
loop, % frame.All.Tags("TABLE")[0].Rows.Length-1 {
	Row:=frame.All.Tags("TABLE")[0].Rows[A_Index-1]
	rows:="" ;clear out rows
	loop, % row.cells.length{
		rows.= row.cells[A_Index-1].innerTEXT a_tab
	}
	if(A_Index=1)
		Headers:=RegExReplace(rows,"\t","|")
	else
		Data.Push(StrSplit(rows,"	")) ;add rows to data object
}

Gui,Add,ListView,h900 w1200,%Headers%
for a,b in Data
	LV_Add("",b*) ;use variadic function to add columns
Loop,% LV_GetCount("Column")
	LV_ModifyCol(A_Index,"AutoHDR") ;adjust column width based on data
gui, show
Table_List:=""
return

;~ http://www.autohotkey.com/board/topic/47052-basic-webpage-controls-with-javascript-com-tutorial/
;~ wb := WBGet()
WBGet(WinTitle="ahk_class IEFrame", Svr#=1) {               ;// based on ComObjQuery docs
	static msg := DllCall("RegisterWindowMessage", "str", "WM_HTML_GETOBJECT")
	, IID := "{0002DF05-0000-0000-C000-000000000046}"   ;// IID_IWebBrowserApp
	;//     , IID := "{332C4427-26CB-11D0-B483-00C04FD90119}"   ;// IID_IHTMLWindow2
	SendMessage msg, 0, 0, Internet Explorer_Server%Svr#%, %WinTitle%
	if (ErrorLevel != "FAIL") {
		lResult:=ErrorLevel, VarSetCapacity(GUID,16,0)
		if DllCall("ole32\CLSIDFromString", "wstr","{332C4425-26CB-11D0-B483-00C04FD90119}", "ptr",&GUID) >= 0 {
			DllCall("oleacc\ObjectFromLresult", "ptr",lResult, "ptr",&GUID, "ptr",0, "ptr*",pdoc)
			return ComObj(9,ComObjQuery(pdoc,IID,IID),1), ObjRelease(pdoc)
		}
	}
}
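
For anyone puzzled by the b* in LV_Add("", b*) above: the asterisk spreads an array out into separate parameters. A small sketch of the same mechanism without a GUI follows; the Join function is a made-up helper for illustration, not part of AutoHotkey.

```autohotkey
; Variadic function: parts* collects all remaining parameters into an array.
Join(sep, parts*) {
    out := ""
    for index, part in parts
        out .= (index > 1 ? sep : "") . part
    return out
}

; One table row as tab-separated text, split into an array of cells.
row := StrSplit("EUR/USD" . A_Tab . "1.19295", A_Tab)

; row* passes each array element as its own parameter,
; the same way LV_Add("", b*) passes one value per column.
line := Join(" | ", row*)
```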
nil blivion
Posts: 7
Joined: 27 Nov 2017, 21:26

Re: inspect element.innertext

29 Nov 2017, 10:23

Hi Xeo786, Thank you for your quick response.
Hi gregster Thank you for your quick response.
Hi Joe Thank you for your quick response,

The code you provided is a bit too advanced for me, so I have attempted a new approach as suggested by you in your

"4) Advanced Dealing with Frames" Video tutorial


In this tutorial you explain that a frame is a page within a page and in order to use general scraping methods we should navigate to said page.


Please review my latest attempt

....................................................


; Nill Blivion Web Scraping Code For EUR/USD Live Price.


; Resources.

; Mickers COM Tutorial.
; https://autohotkey.com/board/topic/6456 ... -webpages/

; Joe Glines AutoHotkey Web Scraping Video Tutorials.
; http://the-automator.com/web-scraping-with-autohotkey/

; aboutscript AutoHotkey: COM with Internet Explorer Tutorial. Part 1.
; https://www.youtube.com/watch?v=lr4AVCS6y0g

; aboutscript AutoHotkey: COM with Internet Explorer Tutorial. Part 2.
; https://www.youtube.com/watch?v=9uoh6cPyJi8&t=23s


; aboutscript AutoHotkey: COM with Internet Explorer Tutorial. Part 3.
; https://www.youtube.com/watch?v=daU4SPgZ404

; EJ Media aScript Tutorial for Beginners - 01 - Introduction Part 1 of 43.
; These Tutorials Go Up To 43.
; https://www.youtube.com/watch?v=xpZLS6R91rQ

; The Web Page I Am Attempting To scrape EUR/USD Live Price.
; https://market24hclock.com/Currencies/M ... ies-Quotes


; My Code:

; Create IE Object.
pwb := ComObjCreate("InternetExplorer.Application")


; Navigate To Page.
pwb.navigate("https://market24hclock.com/Currencies/M ... ies-Quotes")


; Wait Until Page Has Fully Loaded.
while pwb.readyState!=4 || pwb.document.readyState != "complete" || pwb.busy
continue


; Make Browser Window Visible.
pwb.Visible := True


; The Element I Am Attempting To Scrape.
; <td class="symbol-last">1.18464</td>.

; Get Value From Element.
var := pwb.document.getElementsByClassName("symbol-last")[0] .Value

;Output The Value To Screen.
msgbox %var%



; Exit From Script.
exitApp
....................................................
; Error Message When Run.
; Error Unknown Name.
....................................................


Thank You All for your support.
Mark.
gregster
Posts: 9054
Joined: 30 Sep 2013, 06:48

Re: inspect element.innertext

29 Nov 2017, 11:00

Hey, Joe, trying out your code, I get this error message:

Code: Select all

Error:  No valid COM object!

	Line#
	029: msg := DllCall("RegisterWindowMessage", "str", "WM_HTML_GETOBJECT")
	029: IID := "{0002DF05-0000-0000-C000-000000000046}"
	001: pwb := WBGet()
--->	002: frame := ComObjActive(9,ComObjQuery(pwb.document.parentwindow.frames[2],"{332C4427-26CB-11D0-B483-00C04FD90119}","{332C4427-26CB-11D0-B483-00C04FD90119}"),1).document.documentElement
	004: Data := []
	005: Loop,frame.All.Tags("TABLE")[0].Rows.Length-1
	005: {
	006: Row := frame.All.Tags("TABLE")[0].Rows[A_Index-1]
	007: rows := ""
	008: Loop,row.cells.length
	008: {
Not sure what to look for to solve this. I'm using Win 7 (64-bit) and AHK 1.1.26.01 (64-bit).

Edit: OK, on closer inspection, I guess you are connecting to an existing IE instance. Let me try that out...
gregster
Posts: 9054
Joined: 30 Sep 2013, 06:48

Re: inspect element.innertext

29 Nov 2017, 11:10

gregster wrote: Edit: OK, on closer inspection, I guess you are connecting to an existing IE instance. Let me try that out...
Oh yeah, that was the reason. After the site loaded completely in a regular IE, I got a nice data table. Thanks for that interesting piece of code.

@nil blivion: you should really try this. Just open the site in Internet Explorer and then run Joe's code.
nil blivion
Posts: 7
Joined: 27 Nov 2017, 21:26

Re: inspect element.innertext

29 Nov 2017, 12:52

Success !!!

Thank you very much, Joe.
I just pasted these lines above your code and it all worked fantastically:


pwb := ComObjCreate("InternetExplorer.Application")

pwb.navigate("https://market24hclock.com/Currencies/M ... ies-Quotes")

; Wait Until Page Has Fully Loaded.
while pwb.readyState!=4 || pwb.document.readyState != "complete" || pwb.busy
continue

pwb.Visible := true

You are the man,
I can't thank you enough.
I have been searching for the answer to this problem for 4 years, and earlier this week I installed Python with the Beautiful Soup 4 and lxml add-ons, but I did not understand how to use them. From watching YouTube videos about using Python for web scraping, I discovered that inspecting elements was the way to go. So, after a Google search for AutoHotkey and inspect elements, I found your web scraping tutorial site. This in turn put me in touch with you.
I would like to thank you again and wish you a very merry Xmas.
Stay Happy.
Mark (Nill Blivion)
Joe Glines
Posts: 770
Joined: 30 Sep 2013, 20:49
Location: Dallas
Contact:

Re: inspect element.innertext

29 Nov 2017, 13:34

You should also watch our webinar on Web Scraping as we talk through the DOM and other things...