Finding the offset of a string in a file

May 20, 2008
12,178
133
Syracuse, NY, USA
I have a BTM which will start DEVENV with the most recently used .SLN file as an argument (avoiding the pick_a_project dialog). AFAIK, that info is only recorded in a file named "ApplicationPrivateSettings.xml", buried deep in the AppData folder. I can get the file name with FFIND (or WHERE.EXE). After that I must find the first occurrence of the string "FullPath" in the file. I can do that with @FILEREAD (et al.), but searching about 10,000 bytes into the file is rather slow (about 3 seconds). For now I use SED.EXE to change "FullPath" to "NullPath" and send the output to a temp file, then use CMP.EXE on the original file and the temp file and pick the offset out of CMP's output. That's pretty fast. But I'd like a more direct way to get the offset. Can anyone suggest another way, built-in or otherwise?
I've tried the @XML* functions without success. That may be because of my almost non-existent knowledge of XML. Looking at the file is difficult because its 44K bytes are all on one line. Is there a Windows (or free 3rd party) tool to make looking at the file easier? If anyone wants to try the @XML functions, I can post the file.

Thanks!
 
Jan 11, 2022
20
2
If you're willing to install Python (3.8 or higher) on your machine, you can use the attached script, which finds the offset in a 1 MB source-code file in about 0.1 s. Most of that is interpreter startup time (a 100 KB file takes 0.09 s), so I assume whatever you throw at it will be fast.

Code:
Usage: python find-offset.py <filename> "string to find"

Prints the character offset of the string, or -1 if not found.
 
Jan 11, 2022
20
2
Huh, apparently the forum doesn't like attaching Python files. Here's the entirety of find-offset.py inline (it's very short).

Code:
import sys

argv = sys.argv

if len(argv) != 3:
    print("""\
Usage: python find-offset.py <filename> "string to find"

Prints the character offset of the string, or -1 if not found.""")
    sys.exit(1)

filename = argv[1]
needle = argv[2]

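# newline="" keeps CRLF line endings untranslated, so the offset matches the file on disk
with open(filename, 'r', encoding="utf-8", newline="") as f:
    contents = f.read()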

idx = contents.find(needle)
print(idx)
sys.exit(0)
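
If the offset is going to be handed to byte-oriented functions such as @FILEREAD, a byte offset may be more useful than a character offset (the two differ as soon as the file contains non-ASCII text). Here is a minimal sketch of that variant. The function name find_byte_offset and the UTF-8 default are my own assumptions, not part of the script above.

```python
import os
import tempfile


def find_byte_offset(path, needle, encoding="utf-8"):
    """Return the byte offset of the first occurrence of needle, or -1."""
    pattern = needle.encode(encoding)
    # Binary mode: nothing is decoded and line endings are not translated.
    with open(path, "rb") as f:
        return f.read().find(pattern)


if __name__ == "__main__":
    # Self-check on a throwaway file; the content is made up for illustration.
    fd, tmp_path = tempfile.mkstemp(suffix=".xml")
    os.write(fd, b'<a>\r\n{"FullPath":"x"}</a>')
    os.close(fd)
    try:
        print(find_byte_offset(tmp_path, "FullPath"))
    finally:
        os.remove(tmp_path)
```

Reading the whole file at once is fine for a file of 44K bytes; for very large files something like mmap would probably be a better fit.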
 
Jan 11, 2022
20
2
HTML Tidy can reformat and pretty-print XML too. It's a CLI program, easy to use from scripts.


Code:
tidy --quiet yes -xml -i input-file.xml > pretty-file.xml
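
If Python is already installed, its standard library can do a rough version of the same thing. This is only a sketch: the helper name pretty_xml is made up, and the output layout is whatever xml.dom.minidom produces, which is not identical to Tidy's.

```python
from xml.dom import minidom


def pretty_xml(xml_text, indent="  "):
    """Return xml_text re-serialized with one element per line, indented."""
    return minidom.parseString(xml_text).toprettyxml(indent=indent)


if __name__ == "__main__":
    # Tiny made-up document standing in for the one-line settings file.
    print(pretty_xml("<a><b>x</b><c/></a>"))
```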
 
Jan 19, 2011
614
15
Norman, OK
If you have Microsoft Edge (the new Chromium edition) or even the older IE11, you can drag the XML file into it. It has a built-in XML viewer.
 
May 20, 2008
12,178
133
Syracuse, NY, USA
If you have Microsoft Edge (the new Chromium edition) or even the older IE11, you can drag the XML file into it. It has a built-in XML viewer.
I do have Edge, but I never use it. I tried it. The file looks better in Edge than it does in MS's "XML Notepad".

But the bottom line is I want to get the offset of the first "FullPath" (or the actual value of it) in a script. Edge won't help with that.
 

samintz

Scott Mintz
May 20, 2008
1,557
26
Solon, OH, USA
It looks like that file is a mixture of XML and JSON. I was able to extract the JSON fairly easily.
Code:
echo %@xmlopen[C:\Users\mintz\AppData\Local\Microsoft\VisualStudio\16.0_66903569\ApplicationPrivateSettings.xml]
echo %@xmlxpath[/content/indexed/collection[@name="CodeContainers.Offline"]/value]
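
For anyone following along outside TCC, the same XPath lookup can be sketched with Python's standard library. The sample XML below is a made-up miniature, and the sketch assumes the real file has the same content/indexed/collection/value nesting with no XML namespaces. The function name is hypothetical.

```python
import xml.etree.ElementTree as ET

# Made-up miniature of the settings file; the real one has many more collections.
SAMPLE_XML = (
    '<content><indexed>'
    '<collection name="Other"><value>ignored</value></collection>'
    '<collection name="CodeContainers.Offline">'
    '<value>[{"Key":"D:\\\\a.sln"}]</value>'
    '</collection>'
    '</indexed></content>'
)


def offline_containers_text(xml_text):
    """Return the text of the CodeContainers.Offline <value> element, or None."""
    root = ET.fromstring(xml_text)  # root is the <content> element
    node = root.find("indexed/collection[@name='CodeContainers.Offline']/value")
    return None if node is None else node.text


if __name__ == "__main__":
    print(offline_containers_text(SAMPLE_XML))
```

Selecting the collection by its name attribute rather than by position should keep working if the order of the collections changes.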
 
May 20, 2008
12,178
133
Syracuse, NY, USA
It looks like that file is a mixture of XML and JSON. I was able to extract the JSON fairly easily.
Code:
echo %@xmlopen[C:\Users\mintz\AppData\Local\Microsoft\VisualStudio\16.0_66903569\ApplicationPrivateSettings.xml]
echo %@xmlxpath[/content/indexed/collection[4]/value]
Yes, but that shows a list of used projects (and properties) that has no newlines. I have 53 entries (recently whittled down from 102, manually and tediously). Here's such a list, truncated after the first three entries. Do you know how to go inside that list and pick one of the properties of one entry in the list?

Code:
[{"Key":"D:\\Projects2019\\filestr\\filestr.sln","Value":{"LocalProperties":{"FullPath":"D:\\Projects2019\\filestr\\filestr.sln","Type":0,"SourceControl":null},"Remote":null,"IsFavorite":false,"LastAccessed":"2022-05-24T21:50:03.5798427+00:00","IsLocal":true,"HasRemote":false,"IsSourceControlled":false}},{"Key":"P:\\4Utils\\4Utils.sln","Value":{"LocalProperties":{"FullPath":"P:\\4Utils\\4Utils.sln","Type":0,"SourceControl":null},"Remote":null,"IsFavorite":false,"LastAccessed":"2022-05-24T21:48:20.5397892+00:00","IsLocal":true,"HasRemote":false,"IsSourceControlled":false}},{"Key":"P:\\lastproject\\lastproject.sln","Value":{"LocalProperties":{"FullPath":"P:\\lastproject\\lastproject.sln","Type":0,"SourceControl":null},"Remote":null,"IsFavorite":false,"LastAccessed":"2022-05-24T15:48:24.3668938+00:00","IsLocal":true,"HasRemote":false,"IsSourceControlled":false}}, ...]

I'm after the first FullPath.
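
Since the value is plain JSON, one way to go inside the list is to let a JSON parser do the work. A minimal Python sketch follows. The sample is two entries trimmed from the list above, and first_full_path is a made-up helper name.

```python
import json

# Two entries trimmed from the list shown above (only the fields used here).
SAMPLE = r'''[{"Key":"D:\\Projects2019\\filestr\\filestr.sln","Value":{"LocalProperties":{"FullPath":"D:\\Projects2019\\filestr\\filestr.sln","Type":0,"SourceControl":null}}},{"Key":"P:\\4Utils\\4Utils.sln","Value":{"LocalProperties":{"FullPath":"P:\\4Utils\\4Utils.sln","Type":0,"SourceControl":null}}}]'''


def first_full_path(json_text):
    """Return FullPath of the first entry in the list, or None if it is empty."""
    entries = json.loads(json_text)
    if not entries:
        return None
    return entries[0]["Value"]["LocalProperties"]["FullPath"]


if __name__ == "__main__":
    print(first_full_path(SAMPLE))  # D:\Projects2019\filestr\filestr.sln
```

The parser also undoes the doubled backslashes, so no separate cleanup step is needed.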
 

samintz

Scott Mintz
May 20, 2008
1,557
26
Solon, OH, USA
I pasted the JSON into jsonpath.com and it displays it nicely. But I can't figure out the secret to parse it.
 
May 20, 2008
12,178
133
Syracuse, NY, USA
I pasted the JSON into jsonpath.com and it displays it nicely. But I can't figure out the secret to parse it.
If I did that right, it doesn't look all that good.

[Screenshot: the JSON as displayed by jsonpath.com]


Here it is in Firefox. Each record in the list starts on a new line. Edge is similar.

[Screenshot: the JSON as displayed in Firefox]
 
Jan 19, 2011
614
15
Norman, OK
But the bottom line is I want to get the offset of the first "FullPath" (or the actual value of it) in a script.
This returns the "actual value of it".
Code:
perl -p -e 's/.*?FullPath":"(.*?)".*/\1/;' -e 's/\\\\/\\/g;' ApplicationPrivateSettings.xml

Edit:
Here it is again if you're running it from the Windows command line (CMD/TCC) instead of from Cygwin.
Code:
perl -p -e "s/.*?FullPath\":\"(.*?)\".*/\1/;" -e "s/\\\\/\\/g;" ApplicationPrivateSettings.xml
 
May 20, 2008
12,178
133
Syracuse, NY, USA
Thanks @JohnQSmith. I don't have perl but I'll try making a GnuWin32 sed.exe version of that (which may be a chore). This is what I'm using now to get the offset (then I use the @FILE* functions to get the string).

Code:
sed -e "s/FullPath/NullPath/g" %filename > %tmpfile

set offset=%@word[" ,",4,%@execstr[cmp %filename %tmpfile]]
 
May 20, 2008
12,178
133
Syracuse, NY, USA
Whew! Almost ...

Code:
v:\> sed -e "s/.*\"FullPath\":\"\([^^\"]*\).*/\1/;" -e "s/\\\\/\\/g;" %file
P:\Linux2\Linux2.sln

But that's the last one in the file. That's because the leading .* in SED is greedy, and SED doesn't have a non-greedy operator. But if I change only the first "FullPath" to "NullPath" and look for "NullPath" I get what I want.

Code:
v:\> sed -e "s/FullPath/NullPath/" -e "s/.*\"NullPath\":\"\([^^\"]*\).*/\1/;" -e "s/\\\\/\\/g;" %file
D:\Projects2019\filestr\filestr.sln
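
For comparison, here is the greedy versus non-greedy difference as a small Python sketch. The input line is made up, and the helper names are hypothetical. It uses Python's re module, which does have a non-greedy operator, so it illustrates the behaviour rather than anything SED itself can do.

```python
import re

# Made-up one-line input with two FullPath entries.
LINE = r'[{"FullPath":"D:\\first.sln"},{"FullPath":"P:\\last.sln"}]'


def last_full_path(line):
    # Greedy .* swallows as much as possible, so the LAST FullPath is captured.
    value = re.search(r'.*"FullPath":"([^"]*)', line).group(1)
    return value.replace("\\\\", "\\")  # collapse doubled backslashes


def first_full_path_regex(line):
    # Non-greedy .*? stops at the FIRST FullPath.
    value = re.search(r'.*?"FullPath":"([^"]*)', line).group(1)
    return value.replace("\\\\", "\\")


if __name__ == "__main__":
    print(last_full_path(LINE))         # P:\last.sln
    print(first_full_path_regex(LINE))  # D:\first.sln
```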

Thanks again @JohnQSmith.
 
May 20, 2008
12,178
133
Syracuse, NY, USA
Interesting! This fails because the opening '(' is outside quotes (from TCC's point of view) and the ')' is inside quotes (from TCC's point of view) ... thus not considered the closing ')'.

[Screenshot: the failing command in TCC]


That's fixed by escaping (a la TCC) the '('.

Code:
v:\> echo %@execstr[sed -e "s/FullPath/NullPath/" -e "s/.*\"NullPath\":\"\^([^^\"]*\).*/\1/" -e "s/\\\\/\\/g" %file]
D:\Projects2019\filestr\filestr.sln

Whew! (again)
 
May 20, 2008
12,178
133
Syracuse, NY, USA
I've installed perl many times, usually ActiveState. But I need it so seldom that I never wind up learning even the basics. I always manage to survive on BTMs, C, and VBScript (and I know darn little about VBScript).
 
Jan 19, 2011
614
15
Norman, OK
I use it to run other people's scripts (awstats, sendEmail, ack) and as a SED replacement.
Edit: I don't know if that's a good enough reason to keep 620MB of files on my hard drive, but it comes in handy when I need it.
 
May 20, 2008
12,178
133
Syracuse, NY, USA
It looks like that file is a mixture of XML and JSON. I was able to extract the JSON fairly easily.
Code:
echo %@xmlopen[C:\Users\mintz\AppData\Local\Microsoft\VisualStudio\16.0_66903569\ApplicationPrivateSettings.xml]
echo %@xmlxpath[/content/indexed/collection[@name="CodeContainers.Offline"]/value]
I got that one to work also, @samintz. Thanks!

Code:
v:\> echo %@execstr[echo %@xmlxpath[/content/indexed/collection[4]/value] | cut -c9- | cut -d "," -f1 | sed -e "s/\\\\/\\/g"]
"D:\Projects2019\filestr\filestr.sln"
 
Aug 23, 2010
688
9
Code:
jq -r .[0].Value.LocalProperties.FullPath < xx.json
You can get a native version of the jq JSON query tool from the download page on the project homepage, or install it with the Cygwin setup launcher.
 
