Done EQ vs ==

Jesse Heines · Dec 22, 2017

I am writing a batch script that reads a file of HTML code line-by-line and looks for a certain string in a certain position. After reading each line with:

Code:

set t="%@fileread[%h]"

I extract the portion of the line that I'm looking for with:

Code:

set p=%@instr[19,15,%t]

If I then compare the extracted string to the string I desire using ==, I get the correct result:

Code:

iff "%p" == "SmallDataHeader" then ...

That is, the above code returns only the lines in which the desired string appears at the desired position. However, if I use EQ (or even EQC), I get spurious, erroneous results:

Code:

iff "%p" EQ "SmallDataHeader" then ...

That is, the above code returns lines that do not contain the desired string in the desired position.

The help file (Take Command / TCC Help v.22) says that == and EQ are equivalent, but in my code this does not appear to be the case. I would appreciate it if someone could explain what's going on here. Might EQ and EQC not be working properly because I have to use the SETDOS command with parameter /x-567 to read the HTML code file? Or is this a bug? Or is the documentation simply wrong? Or am I doing something wrong?

Thank you.

vefatica · Dec 22, 2017

Can you give us a value for %p so that these differ in behavior?

Code:

iff "%p" == "SmallDataHeader" then ...

iff "%p" EQ "SmallDataHeader" then ...

Joe Caverly · Dec 22, 2017

Not sure if this is related, but:

Code:

IF will only parse numbers when one of (EQU,  NEQ, LSS, LEQ, GTR, GEQ) is used. 
The == comparison operator always results in a string comparison.

I realize that this is for CMD, but might give an idea.

Joe

Ref: SS64

Joe Caverly · Dec 22, 2017

Code:

Do not use parenthesis or quotes if you are comparing numeric values with an IF command.
For example
IF (2) GEQ (15) echo "bigger"
or
IF "2" GEQ "15" echo "bigger"
Will perform a character comparison and will echo "bigger"

however the commands:
IF 2 GEQ 15 echo "bigger"
or  
IF (2 GEQ 15) echo "bigger"
  Will perform a numeric comparison and return the correct result.

Joe

Ref: SS64
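
The character-versus-numeric distinction quoted above can be illustrated outside CMD as well. The following is only a rough Python sketch, not TCC or CMD code, and the function names `geq_as_text` and `geq_as_number` are made up for the illustration. Compared as text, "2" sorts after "15" because the first characters decide the outcome; compared as integers, it does not.

```python
def geq_as_text(a: str, b: str) -> bool:
    """Compare the operands character by character, as a string comparison does."""
    return a >= b


def geq_as_number(a: str, b: str) -> bool:
    """Convert both operands to integers first, then compare numerically."""
    return int(a) >= int(b)


print(geq_as_text("2", "15"))    # True: '2' sorts after '1'
print(geq_as_number("2", "15"))  # False: 2 is less than 15
```

Whether a given shell operator behaves like the first or the second function depends on the shell and on how the operands are quoted, so this does not by itself explain the EQ result reported in this thread.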

Jesse Heines · Dec 22, 2017

vefatica said:
Can you give us a value for %p so that these differ in behavior?

Code:

iff "%p" == "SmallDataHeader" then ...

iff "%p" EQ "SmallDataHeader" then ...

Sure. Here are some of the incorrect results. The numbers on the left are line numbers in the file being read. The string to the right is the value of %p extracted using the %@instr function. The lines that are followed by asterisks and then the TCC error message are those that incorrectly matched the Boolean in the iff ... then statement.

Code:

1000
1100 <scr'+'ipt type
1200 " tms="navLink"
********************
TCC: C:\E-Drive\Finance\g.btm [51]  Unknown command "endiff"
1300 align="left" he
********************
TCC: C:\E-Drive\Finance\g.btm [51]  Unknown command "endiff"
1400
1500 lText" valign="
********************
TCC: C:\E-Drive\Finance\g.btm [51]  Unknown command "endiff"
1600 Content-Type" c
1700
1800
1900 oad bs');"
2000 " tms="navLink"
********************
TCC: C:\E-Drive\Finance\g.btm [51]  Unknown command "endiff"
2100
2200
2300 <!-- $HTMLid: /
2400 css">

The correct behavior (using ==) produces correct output, flagging only the line that contains "SmallDataHeader" (a class name) and no TCC error message:

Code:

5700 l) {"
5800
5900 !(IE)]><!-->"
6000 " tms="navLink"
6100 /td>"
6200
6300
6400 {font: 700 12px
6500
6600 "
6700 n" role="banner
6800 " tms="navLink"
6900
7000 SmallDataHeader
********************
7100 ><input type="s
7200    {font: 700 12p
7300 open (theURL, '
7400
7500 class="pnld"><a
7600 pnlogout" tms="

Here's the actual code:

Code:

 1  set h=%@fileopen["%@drive[%0]\e-drive\finance\quotes1.txt",r]
 2  do until %t eq "**EOF**"
 3    set k=%@eval[%k+1]
 4    set t="%@fileread[%h]"
 5    set p=%@instr[19,15,%t]
 6    iff "%p" eq "SmallDataHeader" then
 7      echo %@format[4,%k] : %p
 8      echo **********************
 9    endiff
10  enddo
11  set h=%@fileclose[%h]

To get the correct result, change eq to == in line 6.

Thank you for your time in looking at this.

Jesse Heines · Dec 22, 2017

Thank you, Joe, but I'm not comparing numbers, I'm comparing strings.

vefatica · Dec 22, 2017

I can't figure out what you're doing, Jesse. But one thing might make things a lot easier, namely, get rid of the HTML/XML before you process the file. For example ...

Code:

TPIPE /input="%@drive[%0]\e-drive\finance\quotes1.txt" /output=notags.txt /simple=16

... now process notags.txt.

Jesse Heines · Dec 22, 2017

vefatica said:
I can't figure out what you're doing, Jesse. But one thing might make things a lot easier, namely, get rid of the HTML/XML before you process the file. For example ...

Code:

TPIPE /input="%@drive[%0]\e-drive\finance\quotes1.txt" /output=notags.txt /simple=16

... now process notags.txt.

Thank you for your reply, "vefatica," but I'm afraid that doing as you suggest would not help me. You said that you "can't figure out what [I'm] trying to do," so let me explain.

As you may know, Yahoo! Finance recently discontinued its service of providing Excel integration with its investment database. Literally thousands -- perhaps even tens or hundreds of thousands -- of amateur investors like me relied on that service to update our investment portfolios each day and track our progress. There are an uncountable number of posts on the Web about this and suggestions for alternative solutions. I have tried several, but have yet to find one that is truly satisfactory and lives up to the job.

This past week I discovered that I can use the TakeCommand webform command to query Fidelity for mutual fund prices with statements such as this:

Code:

webform /v /w"http://quotes.fidelity.com/webxpress/get_quote", "QUOTE_TYPE", "R", "SID_VALUE_ID", "FCNTX", "submit", "Quote" >> quotes1.txt

This downloads the Fidelity quote page with all the information I'm looking for. I created a 2D array of mutual fund ticker symbols and their names and call the above statement in a loop to create a file (quotes1.txt) containing all the funds in my portfolio.

Code:

set k=0
do %@eval[%@arrayinfo[funds,1]-1]
  set k=%@eval[%k+1]
  echos %funds[%k,0]` `
  webform /v /w"http://quotes.fidelity.com/webxpress/get_quote", "QUOTE_TYPE", "R", "SID_VALUE_ID", "%funds[%k,0]", "submit", "Quote" >> quotes1.txt
enddo

It is relatively easy to extract the last closing price because it is on a line with the class name "SmallDataHeader."

Code:

<td class="SmallDataHeader" align=right>55.79</td>

This class is not used for any other data on the page, so finding the closing prices is easy. This is not true for finding other data, as the classes and other HTML attributes on those lines are not unique. The other data that I'm interested in includes the mutual fund net change for the day and net change percentage. These data are in the file, but they're a little harder to identify by the surrounding tags due to the lack of uniqueness.
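
For illustration only, the closing-price case can be sketched in Python rather than TCC. The regular expression and the name `closing_prices` are assumptions made for this sketch, based solely on the one sample line shown above; the real Fidelity page may contain variations that this pattern does not cover.

```python
import re

# Assumed pattern: a <td> whose class is SmallDataHeader, any further
# attributes, then a decimal number as the cell's only content.
PRICE_CELL = re.compile(
    r'<td\s+class="SmallDataHeader"[^>]*>\s*([0-9]+(?:\.[0-9]+)?)\s*</td>'
)


def closing_prices(html: str) -> list:
    """Return every price found in a SmallDataHeader cell, in document order."""
    return PRICE_CELL.findall(html)


sample = '<td class="SmallDataHeader" align=right>55.79</td>'
print(closing_prices(sample))  # ['55.79']
```

Because the class name is unique to the price cell, a single pattern is enough here. The net change and percentage fields would need a different anchor, which is the difficulty described above.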

Thus I began using other techniques to read my generated quotes1.txt file. I found that I needed to use the SETDOS command to avoid getting errors caused by the special symbols in HTML code. No problem there. But then I ran into this EQ vs. == issue. As I wrote in my original post, the documentation says that they are equivalent, but clearly this is not the case. I can program around the issue, but again, that's not the point. I would like to understand the difference between EQ and == in TakeCommand batch files.

I hope that this provides sufficient explanation of the background of my question. If I strip all the HTML as you suggest, I won't have the tags I need to parse the generated file and pull out the data I want. Again, thank you for your reply.

vefatica · Dec 22, 2017

Try this for one investment (using only the file for FCNTX).

Code:

v:\> webform /v /w"http://quotes.fidelity.com/webxpress/get_quote", "QUOTE_TYPE", "R", "SID_VALUE_ID", "FCNTX", "submit", "Quote" > quotes1.txt

v:\> tpipe /input=quotes1.txt /grep=3,0,0,0,0,0,0,0,"class.*SmallDataHeader" /simple=16 /replace=4,0,0,0,0,0,0,0,0,"\t",""
122.92

Note: after "grep" locates the line, "simple=16" removes the tags, leaving only a bunch of tabs and the price, and "replace" gets rid of the tabs.

You could probably wrap that up to handle several investments. Untested:

Code:

do inv in /L investment1 investment2 ...
    echos %inv^t >> allinvestments.txt
    webform ..... "%inv" ... > quote.tmp
    tpipe /input=quote.tmp /output=allinvestments.txt /outputappend=1 /grep=3,0,0,0,0,0,0,0,"class.*SmallDataHeader" /simple=16 /replace=4,0,0,0,0,0,0,0,0,"\t",""
enddo

(or something like that). Allinvestments.txt should wind up looking like this:

name<tab>price
name<tab>price
...

Here's another way you might handle several investments. I used FCNTX three times.

Jesse Heines · Dec 23, 2017

Thank you, Vince. I'm a long-time TakeCommand user and I have written literally hundreds of BTM files -- I've been in love with JPSoft products since I started using 4DOS when it was first released in 1989 -- but I have never explored the tpipe command. I see that I've seriously been missing something in my programming repertoire and that I of course still have a lot to learn! :) Thanks for pointing me in that direction. I'll have to explore this command thoroughly, and that'll take me some time.

vefatica · Dec 23, 2017

TPIPE is a monster! If you know about UNIX text utilities (grep, sed, tr, ...) see if TPIPE can do them for you. In any event, "regular expressions" will be indispensable. Here are a couple of non-TPIPE solutions. This will only pick out the first such line.

Code:

v:\> echo %@rereplace[".*right>|</td>",,%@execstr[ffind /k /m /v /e"class.*SmallDataHeader" quotes1.txt]]
122.92

To get more than one (or just one), you could use a do loop, picking out the lines with @REGEX.

Code:

v:\> do l in @quotes1.txt ( if "%@regex["class.*SmallDataHeader","%l"]" == "1" echo %@rereplace[".*right>|</td>",,%l] )
122.92
122.92
122.92

Jesse Heines · Dec 23, 2017

Vince:

I am ecstatic! Yes, "TPIPE is a monster" to be sure, but what power! I spent a good part of today playing with it and was able to produce the precise results that I want. Yay! What a nice Christmas present! :)

And just BTW, yes, I am quite familiar with regular expressions, but not an expert with the more complex structures. Those can be a bear, too.

I am now able to produce the output you see below (even with color! :), by extracting the data using grep and tpipe. (These are closing prices and changes as of yesterday, Friday, December 22nd.) This is exactly what I wanted. The price data also gets copied to the Windows clipboard so that I can paste it into Excel. Fantastic.

I couldn't attach the actual, final BTM file to this message board (probably for security reasons), so I printed it to a PDF and attached that. Very, very cool. My sincere thanks to you for pointing me toward the techniques I needed to create the solution. Merry Christmas!

Jesse

vefatica · Dec 23, 2017

You are quite welcome. I'm glad it worked well. All the best holiday wishes to you too.

AnrDaemon · Jan 15, 2018

You can't reliably use generic text-parsing tools to parse structured formats like HTML.
Get a proper XSL/XPath parser/interpreter and just get what you want instead of wasting your time like that.

Jesse Heines · Jan 15, 2018

AnrDaemon said:
You can't reliably use generic text-parsing tools to parse structured formats like HTML.
Get a proper XSL/XPath parser/interpreter and just get what you want instead of wasting your time like that.

I'm afraid that I strongly disagree with you here. You can't use an XSL parser on HTML5 because HTML5 documents are not "well-formed." They have opening tags that are not closed, such as <br> and <img src="...">.

TPIPE, as recommended by Vince in previous responses, solved my problem beautifully. Take a look at that powerful tool.

AnrDaemon · Jan 17, 2018

They MAY have tags that are not well-formed. But the recommendation is to adhere to XML standards, and many page designers do, as it is easier to find an automated tool for XML processing than a specially crafted tool for HTML processing. And there's always HTML Tidy.
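
Both positions in this exchange can be illustrated with a short Python sketch (Python rather than TCC; the sample markup and the class name `PriceFinder` are invented for the illustration). A strict XML parser rejects markup with an unquoted attribute or an unclosed `<br>`, while a tolerant HTML parser such as the standard library's `html.parser` still locates the cell by its class.

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser

# Invented sample: an unquoted attribute and an unclosed <br>, as HTML allows.
SAMPLE = (
    '<table><tr><td class="SmallDataHeader" align=right>55.79</td>'
    "<td>a<br>b</td></tr></table>"
)


class PriceFinder(HTMLParser):
    """Collect the text of every <td class="SmallDataHeader"> cell."""

    def __init__(self):
        super().__init__()
        self.in_cell = False
        self.prices = []

    def handle_starttag(self, tag, attrs):
        if tag == "td" and dict(attrs).get("class") == "SmallDataHeader":
            self.in_cell = True

    def handle_endtag(self, tag):
        if tag == "td":
            self.in_cell = False

    def handle_data(self, data):
        if self.in_cell and data.strip():
            self.prices.append(data.strip())


def xml_parses(text: str) -> bool:
    """Report whether a strict XML parser accepts the text."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


finder = PriceFinder()
finder.feed(SAMPLE)
finder.close()
print(xml_parses(SAMPLE))  # False: the sample is not well-formed XML
print(finder.prices)       # ['55.79']
```

If a page has first been cleaned up into well-formed XML, for example with HTML Tidy as suggested above, the strict parser becomes usable as well.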
