Done EQ vs ==

Aug 16, 2016
11
2
#1
I am writing a batch script that reads a file of HTML code line-by-line and looks for a certain string in a certain position. After reading each line with:
Code:
set t="%@fileread[%h]"
I extract the portion of the line that I'm looking for with:
Code:
set p=%@instr[19,15,%t]
If I then compare the extracted string to the string I desire using ==, I get the correct result:
Code:
iff "%p" == "SmallDataHeader" then ...
That is, the above code returns only the lines in which the desired string appears at the desired position. However, if I use EQ (or even EQC), I get spurious, erroneous results:
Code:
iff "%p" EQ "SmallDataHeader" then ...
That is, the above code returns lines that do not contain the desired string in the desired position.

The help file documentation (Take Command / TCC Help v.22) says that == and EQ are equivalent, but in my code this does not appear to be the case. I would appreciate it if someone would please explain to me what's going on here. Might EQ and EQC not be working properly because I have to use the SETDOS command with parameter /x-567 to read the HTML code file? Or is this a bug? Or is the documentation simply wrong? Or am I doing something wrong?

Thank you.
 
#4
Code:
Do not use parentheses or quotes if you are comparing numeric values with an IF command.
For example
IF (2) GEQ (15) echo "bigger"
or
IF "2" GEQ "15" echo "bigger"
will perform a character comparison and will echo "bigger"

however the commands:
IF 2 GEQ 15 echo "bigger"
or
IF (2 GEQ 15) echo "bigger"
will perform a numeric comparison and return the correct result.
Joe

Ref: SS64
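
The character-versus-numeric distinction quoted above is not specific to batch files. A minimal Python sketch (Python is used here only for illustration; it is not part of the thread) shows the same effect: text operands are compared character by character, numeric operands by value.

```python
# Text comparison: '2' sorts after '1', so "2" is "greater" than "15".
as_text = "2" >= "15"

# Numeric comparison: 2 is smaller than 15.
as_numbers = 2 >= 15

print("text comparison:   ", as_text)
print("numeric comparison:", as_numbers)
```

The quoted operands correspond to the first case, the bare operands to the second.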
 
#5
Can you give us a value for %p so that these differ in behavior?

Code:
iff "%p" == "SmallDataHeader" then ...

iff "%p" EQ "SmallDataHeader" then ...
Sure. Here are some of the incorrect results. The numbers on the left are line numbers in the file being read. The string to the right is the value of %p extracted using the %@instr function. The lines that are followed by asterisks and then the TCC error message are those that incorrectly matched the Boolean in the iff ... then statement.
Code:
1000
1100 <scr'+'ipt type
1200 " tms="navLink"
********************
TCC: C:\E-Drive\Finance\g.btm [51]  Unknown command "endiff"
1300 align="left" he
********************
TCC: C:\E-Drive\Finance\g.btm [51]  Unknown command "endiff"
1400
1500 lText" valign="
********************
TCC: C:\E-Drive\Finance\g.btm [51]  Unknown command "endiff"
1600 Content-Type" c
1700
1800
1900 oad bs');"
2000 " tms="navLink"
********************
TCC: C:\E-Drive\Finance\g.btm [51]  Unknown command "endiff"
2100
2200
2300 <!-- $HTMLid: /
2400 css">
The correct behavior (using ==) produces correct output, flagging only the line that contains "SmallDataHeader" (a class name) and no TCC error message:
Code:
5700 l) {"
5800
5900 !(IE)]><!-->"
6000 " tms="navLink"
6100 /td>"
6200
6300
6400 {font: 700 12px
6500
6600 "
6700 n" role="banner
6800 " tms="navLink"
6900
7000 SmallDataHeader
********************
7100 ><input type="s
7200    {font: 700 12p
7300 open (theURL, '
7400
7500 class="pnld"><a
7600 pnlogout" tms="
Here's the actual code:
Code:
 1  set h=%@fileopen["%@drive[%0]\e-drive\finance\quotes1.txt",r]
 2  do until %t eq "**EOF**"
 3    set k=%@eval[%k+1]
 4    set t="%@fileread[%h]"
 5    set p=%@instr[19,15,%t]
 6    iff "%p" eq "SmallDataHeader" then
 7      echo %@format[4,%k] : %p
 8      echo **********************
 9    endiff
10  enddo
11  set h=%@fileclose[%h]
To get the correct result, change eq to == in line 6.

Thank you for your time in looking at this.
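
As a cross-check that does not depend on TCC's parser, the extraction and comparison in lines 4 to 6 of the batch file can be sketched in Python. This is only a sketch under assumptions: it assumes @instr counts its start offset from 0, the function names are hypothetical, and the sample lines (including their indentation) are invented so that the class name begins at offset 19 of the quoted line.

```python
def extract_field(line, start=19, length=15):
    """Return `length` characters of `line` beginning at 0-based `start`."""
    return line[start:start + length]


def matching_lines(lines, wanted="SmallDataHeader"):
    """Yield (line_number, field) for lines whose field equals `wanted`."""
    for number, raw in enumerate(lines, start=1):
        # The batch file wraps each line in double quotes before slicing,
        # so the same wrapping is applied here to keep the offsets aligned.
        quoted = '"' + raw.rstrip("\n") + '"'
        field = extract_field(quoted)
        if field == wanted:
            yield number, field


# Hypothetical sample lines, not taken from the real quotes1.txt.
sample = [
    '       <td class="SmallDataHeader" align=right>55.79</td>\n',
    '<a href="#" tms="navLink">Home</a>\n',
]
hits = list(matching_lines(sample))
print(hits)
```

Running something like this against the real file would give a list of the lines that ought to match, which can then be compared with what the EQ and == versions of the batch file report.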
 
#7
I can't figure out what you're doing, Jesse. But one thing might make things a lot easier, namely, get rid of the HTML/XML before you process the file. For example ...
Code:
TPIPE /input="%@drive[%0]\e-drive\finance\quotes1.txt" /output=notags.txt /simple=16
... now process notags.txt.
 
#8
Thank you for your reply, "vefatica," but I'm afraid that doing as you suggest would not help me. You said that you "can't figure out what [I'm] trying to do," so let me explain.

As you may know, Yahoo! Finance recently discontinued its service of providing Excel integration with its investment database. Literally thousands -- perhaps even tens or hundreds of thousands -- of amateur investors like me relied on that service to update our investment portfolios each day and track our progress. There are countless posts on the Web about this, along with suggestions for alternative solutions. I have tried several, but have yet to find one that is truly satisfactory and up to the job.

This past week I discovered that I can use the TakeCommand webform command to query Fidelity for mutual fund prices with statements such as this:
Code:
webform /v /w"http://quotes.fidelity.com/webxpress/get_quote", "QUOTE_TYPE", "R", "SID_VALUE_ID", "FCNTX", "submit", "Quote" >> quotes1.txt
This downloads the Fidelity quote page with all the information I'm looking for. I created a 2D array of mutual fund ticker symbols and their names and call the above statement in a loop to create a file (quotes1.txt) containing all the funds in my portfolio.
Code:
set k=0
do %@eval[%@arrayinfo[funds,1]-1]
  set k=%@eval[%k+1]
  echos %funds[%k,0]` `
  webform /v /w"http://quotes.fidelity.com/webxpress/get_quote", "QUOTE_TYPE", "R", "SID_VALUE_ID", "%funds[%k,0]", "submit", "Quote" >> quotes1.txt
enddo
It is relatively easy to extract the last closing price because it is on a line with the class name "SmallDataHeader."
Code:
<td class="SmallDataHeader" align=right>55.79</td>
This class is not used for any other data on the page, so finding the closing prices is easy. This is not true for finding other data, as the classes and other HTML attributes on those lines are not unique. The other data that I'm interested in includes the mutual fund net change for the day and net change percentage. These data are in the file, but they're a little harder to identify by the surrounding tags due to the lack of uniqueness.

Thus I began using other techniques to read my generated quotes1.txt file. I found that I needed to use the SETDOS command to avoid getting errors caused by the special symbols in HTML code. No problem there. But then I ran into this EQ vs. == issue. As I wrote in my original post, the documentation says that they are equivalent, but clearly this is not the case. I can program around the issue, but that's not the point. I would like to understand the difference between EQ and == in TakeCommand batch files.

I hope that this provides sufficient explanation of the background of my question. If I strip all the HTML as you suggest, I won't have the tags I need to parse the generated file and pull out the data I want. Again, thank you for your reply.
 
#9
Try this for one investment (using only the file for FCNTX).
Code:
v:\> webform /v /w"http://quotes.fidelity.com/webxpress/get_quote", "QUOTE_TYPE", "R", "SID_VALUE_ID", "FCNTX", "submit", "Quote" > quotes1.txt

v:\> tpipe /input=quotes1.txt /grep=3,0,0,0,0,0,0,0,"class.*SmallDataHeader" /simple=16 /replace=4,0,0,0,0,0,0,0,0,"\t",""
122.92
Note: after locating the line ("grep") "simple=16" removes the tags leaving only a bunch of tabs and the price, and "replace" gets rid of the tabs.
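
For readers who want to see the three steps in the note (find the line, remove the tags, remove the tabs) spelled out, here is a rough Python equivalent. It is a sketch, not a description of what TPIPE does internally; the function name and the sample page are hypothetical.

```python
import re


def closing_prices(html_text):
    """Return the text of every line that mentions the SmallDataHeader class."""
    prices = []
    for line in html_text.splitlines():
        # Step 1 ("grep"): keep only lines matching the pattern.
        if not re.search(r"class.*SmallDataHeader", line):
            continue
        # Step 2 (strip tags): drop anything between < and >.
        no_tags = re.sub(r"<[^>]*>", "", line)
        # Step 3 (strip tabs); stray spaces are trimmed as well.
        prices.append(no_tags.replace("\t", "").strip())
    return prices


# Hypothetical page fragment with tab-indented markup.
page = (
    "<tr>\n"
    '\t\t<td class="SmallDataHeader" align=right>122.92</td>\n'
    "</tr>\n"
    '<td class="other">1.00</td>\n'
)
prices = closing_prices(page)
print(prices)
```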

You could probably wrap that up to handle several investments. Untested:

Code:
do inv in /L investment1 investment2 ...
    echos %inv^t >> allinvestments.txt
    webform ..... "%inv" ... > quote.tmp
    tpipe /input=quote.tmp /output=allinvestments.txt /outputappend=1 /grep=3,0,0,0,0,0,0,0,"class.*SmallDataHeader" /simple=16 /replace=4,0,0,0,0,0,0,0,0,"\t",""
enddo
(or something like that). Allinvestments.txt should wind up looking like this:

name<tab>price
name<tab>price
...

Here's another way you might handle several investments. I used FCNTX three times.

[Attached image: 1514002685389.png]
 
#10
Thank you, Vince. I'm a long-time TakeCommand user and I have written literally hundreds of BTM files -- I've been in love with JPSoft products since I started using 4DOS when it was first released in 1989 -- but I have never explored the tpipe command. I see that I've seriously been missing something in my programming repertoire and that I of course still have a lot to learn! :) Thanks for pointing me in that direction. I'll have to explore this command thoroughly, and that'll take me some time.
 
#11
TPIPE is a monster! If you know about UNIX text utilities (grep, sed, tr, ...) see if TPIPE can do them for you. In any event, "regular expressions" will be indispensable. Here are a couple of non-TPIPE solutions. The first one will only pick out the first such line.
Code:
v:\> echo %@rereplace[".*right>|</td>",,%@execstr[ffind /k /m /v /e"class.*SmallDataHeader" quotes1.txt]]
122.92
To get more than one (or just one), you could use a do loop, picking out the lines with @REGEX.
Code:
v:\> do l in @quotes1.txt ( if "%@regex["class.*SmallDataHeader","%l"]" == "1" echo %@rereplace[".*right>|</td>",,%l] )
122.92
122.92
122.92
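
The substitution pattern used in both commands, ".*right>|</td>", deletes everything up to and including "right>" and then deletes the closing "</td>", leaving only the price. A small Python sketch of the same idea follows; the helper name and the sample lines are hypothetical, and Python's regex engine is assumed to treat this simple pattern the same way.

```python
import re


def price_from_line(line):
    """Strip the leading markup through 'right>' and the trailing '</td>'."""
    return re.sub(r".*right>|</td>", "", line)


# Hypothetical lines standing in for quotes1.txt.
lines = [
    '<td class="SmallDataHeader" align=right>122.92</td>',
    '<td class="other">ignored</td>',
    '<td class="SmallDataHeader" align=right>55.79</td>',
]

# Keep only the lines that match, then reduce each one to its price.
found = [
    price_from_line(line)
    for line in lines
    if re.search(r"class.*SmallDataHeader", line)
]
print(found)
```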
 
#12
Vince:

I am ecstatic! Yes, "TPIPE is a monster" to be sure, but what power! I spent a good part of today playing with it and was able to produce the precise results that I want. Yay! What a nice Christmas present! :)

And just BTW, yes, I am quite familiar with regular expressions, but not an expert with the more complex structures. Those can be a bear, too.

I am now able to produce the output you see below (even with color! :), by extracting the data using grep and tpipe. (These are closing prices and changes as of yesterday, Friday, December 22nd.) This is exactly what I wanted. The price data also gets copied to the Windows clipboard so that I can paste it into Excel. Fantastic.

I couldn't attach the actual, final BTM file to this message board (probably for security reasons), so I printed it to a PDF and attached that. Very, very cool. My sincere thanks to you for pointing me toward the techniques I needed to create the solution. Merry Christmas!

Jesse


[Attached image: 1514062136660.png]
 


Aug 23, 2010
191
2
#14
You can't reliably use generic text parsing tools to parse structural formats like HTML.
Get a normal XSL/XPath parser/interpreter and just extract what you want instead of wasting your time like that.
 
#15
I'm afraid that I strongly disagree with you here. You can't use an XSL parser on HTML5 because HTML5 documents are not "well-formed." They have opening tags that are not closed, such as <br> and <img src="...">.

TPIPE, as recommended by Vince in previous responses, solved my problem beautifully. Take a look at that powerful tool.
 
#16
They MAY contain tags that are not well-formed. But the recommendation is to adhere to XML standards, and many page designers do, as it is easier to find an automated tool for XML processing than a specially crafted tool for HTML processing. And there's always HTML Tidy.
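
For anyone weighing the two positions in this exchange, a middle path is a parser that tolerates unclosed tags. The sketch below uses Python's standard-library html.parser, which nobody in this thread mentioned; it is offered only as an illustration. The class name CellCollector and the HTML fragment are hypothetical.

```python
from html.parser import HTMLParser


class CellCollector(HTMLParser):
    """Collect the text of <td> cells that carry a given class attribute."""

    def __init__(self, wanted_class):
        super().__init__()
        self.wanted_class = wanted_class
        self.inside = False
        self.values = []

    def handle_starttag(self, tag, attrs):
        if tag == "td" and dict(attrs).get("class") == self.wanted_class:
            self.inside = True
            self.values.append("")

    def handle_endtag(self, tag):
        if tag == "td":
            self.inside = False

    def handle_data(self, data):
        if self.inside:
            self.values[-1] += data


# The fragment deliberately contains unclosed <br> and <img> tags.
fragment = (
    '<p>Quote<br><img src="logo.png">'
    '<table><tr><td class="SmallDataHeader" align=right>55.79</td></tr></table>'
)
collector = CellCollector("SmallDataHeader")
collector.feed(fragment)
collector.close()
values = [value.strip() for value in collector.values]
print(values)
```

Because the lookup keys on the tag and its class attribute rather than on character positions, it does not depend on the indentation of the line.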