Done Polling for *MONITOR

Best: Stop the polling for monitored events. That causes TCC to use about 30% more cycles than without it. Assuming you have something analogous to queue(s) and a thread to (poll and) process the queue(s), can you not let the queueing of an event set an EVENT (or other synchronization device) for which the queue-servicing thread could wait?
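To illustrate the idea (this is not TCC's code), here is a minimal Python sketch of a queue-servicing thread that blocks until something is queued instead of waking up on a timer. The names `worker`, `run_demo` and the event strings are hypothetical. It only helps where whatever detects the event is able to signal at all.

```python
import queue
import threading


def worker(events: queue.Queue, handled: list) -> None:
    """Service the queue, blocking while it is empty (no periodic wake-ups)."""
    while True:
        item = events.get()   # blocks until a producer queues something
        if item is None:      # sentinel value: shut the worker down
            break
        handled.append(item)  # stand-in for "process the monitored event"


def run_demo() -> list:
    """Queue two hypothetical events and return them in the order handled."""
    events: queue.Queue = queue.Queue()
    handled: list = []
    thread = threading.Thread(target=worker, args=(events, handled))
    thread.start()
    for name in ("folder-changed", "process-started"):
        events.put(name)      # queueing the item is what wakes the worker
    events.put(None)
    thread.join()
    return handled
```

Here `queue.Queue.get()` plays the role of waiting on an EVENT: the worker is parked until a producer calls `put()`.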

Second best: Put that polling back in its own thread (as in v20 and before), so those who never use *MONITOR can suspend the thread.
 
I don't know what you're referring to -- the only difference between v20 and v21 is that v21 does NOT have a separate thread and does not do any polling unless you have one or more monitors running. If you do, v21 creates a thread, and v20 and v21 are identical.

And no, many (most?) of the Windows functions used for monitoring are not asynchronous, so there's no way to set an event.

In my benchmarks, the polling thread (when active) uses around 0.007% of one core.
 
Whether or not you call it "polling", it's checking something regularly and that increases an idling TCC's cycle count by about 30%. I've tested this ad nauseam, not counting any cycles used in the first 3 seconds. The results are quite consistent. If A is v20 with the thread suspended, B is v20 with the thread running, and C is v21, then the cycle count ratios A:B:C are consistently about 40:51:53
 
Which makes no sense, as v20 ALWAYS has the thread running, and v21 only has the thread running if you're monitoring something.

Can you give me an explicit test case of what you're measuring? (And why you think it's connected to the xxxMonitor functions?)
 
I have supposed the v20 thread in question concerns *MONITOR because in Sept 2016 you said:
There are three permanent threads in TCC:

1) The main processing thread
2) A thread to handle messages posted to the TCC console window (required because Windows will not pass messages to console sessions)
3) A thread to poll the monitoring functions
The v20 thread is the one selected (and suspended) below.
upload_2017-6-17_11-38-2.png

I can suspend/resume that thread with a plugin. Since last fall, my v20 has run with that thread suspended and I have not noticed any ill effects. TCC works; the "TCC.EXE" class window gets messages; I don't know about messages to the console window. Today I see that, even so, FOLDERMONITOR works. So now I have no clue what that thread is doing other than increasing TCC's cycle count by 25-30%.
What does that thread do, and why?
 
But you're saying that V21 (which does not have that thread) runs slower.

And there's no way that a (nonexistent) polling thread is increasing TCC's cycles by 25 - 30%. I strongly suspect you're measuring something entirely unrelated -- exactly how are you determining that figure?

FOLDERMONITOR has its own monitoring thread, since it's one of the two or three monitoring functions that is truly asynchronous.
 
I'm saying it uses more cycles. Apparently the function of that thread is now in TCC's main (input) thread.

I start the three instances (v20 no thread (i=0), v20 with thread (i=1), v21 (i=2)). After they settle down (I wait 5 seconds) I read their "initial" cycle counts (%ic[%i]). Then I start monitoring. At regular intervals, I read their cycle counts (%c[%i]) and compute the percent used since the settle-down time, like this:

percent = %@eval[(%c[%i] - %ic[%i]) / 2600000000 / elapsed_seconds]
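For anyone who wants to repeat the measurement outside TCC, the same arithmetic can be sketched in Python. The function name `cycle_share` is mine, and I am assuming the constant 2600000000 is the CPU clock in Hz (2.6 GHz). As written, the expression gives the share of one core's cycles as a fraction, where 1.0 means a fully busy core. Multiply by 100 if a true percentage is wanted.

```python
CLOCK_HZ = 2_600_000_000  # constant from the expression above; assumed to be the CPU clock in Hz


def cycle_share(current: int, initial: int, elapsed_seconds: float,
                clock_hz: int = CLOCK_HZ) -> float:
    """Cycles used since the settle-down reading, as a fraction of one core.

    current         -- latest cycle count of the process (%c[%i] above)
    initial         -- cycle count read after settling down (%ic[%i] above)
    elapsed_seconds -- time between the two readings
    """
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    return (current - initial) / clock_hz / elapsed_seconds
```

A process that burned 5.2 billion cycles in 2 seconds would give `cycle_share(5_200_000_100, 100, 2) == 1.0`, i.e. one whole core at the assumed clock rate.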

You don't have to get that fancy to see the difference. ProcessExplorer shows cycle-delta for each thread. If I look at the three processes I mentioned, and add the cycle-deltas (third column) of the threads which show a cycle-delta I see ratios like the ones I mentioned.

V20 - no thread
upload_2017-6-17_19-36-10.png


V20 - with thread
upload_2017-6-17_19-36-45.png


V21
upload_2017-6-17_19-37-21.png


The totals compare roughly like this:

0:1:2 ... 122:178:183

(a little worse, actually, than the 40:51:53 ratios I mentioned earlier). What does that thread in v20 do?
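To put both sets of ratios on the same footing, they can be converted to percent increases over the v20-no-thread baseline. A small Python sketch (the function name is made up for this example):

```python
def percent_over_baseline(ratios):
    """Percent increase of each later entry over the first entry (the baseline)."""
    baseline = ratios[0]
    return [round((r - baseline) / baseline * 100, 1) for r in ratios[1:]]


# Ratios quoted in this thread, in the order v20 no thread : v20 with thread : v21
print(percent_over_baseline((40, 51, 53)))     # [27.5, 32.5]
print(percent_over_baseline((122, 178, 183)))  # [45.9, 50.0]
```

So the Process Explorer totals work out to roughly 46% and 50% over the baseline, against roughly 28% and 33% for the earlier ratios.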

Here's my most recent test after more than 2 hours.
Code:
Elapsed: 141.05 min

V20-nothread
Cycles  0.00036%

V20-thread
Cycles  0.00047%

V21
Cycles  0.00056%
 
I'm saying it uses more cycles. Apparently the function of that thread is now in TCC's main (input) thread.

Nope. As I said, v21 has no monitoring thread (period) if you're not monitoring anything. There is no functionality in the main thread for the monitoring functions.

You don't have to get that fancy to see the difference. ProcessExplorer shows cycle-delta for each thread. If I look at the three processes I mentioned, and add the cycle-deltas (third column) of the threads which show a cycle-delta I see ratios like the ones I mentioned.

Not reproducible here (v21 x64, Windows 10). Using the same test, v21 is substantially faster than v20, and uses fewer cycles.

When I have some free time, I'll fire up an x86 Win 7 VM and try it there (but if it's an x86 issue I will not follow it any further).

I am curious though -- you've said (many times) that you don't care about performance, but you seem agitated here about a few microseconds??? If you're comparing idle v20 sessions with & without the monitoring thread, then yes, the monitoring thread will add a few milliseconds over the course of several hours. (Were you saving those cycles for something?) If your TCC session is actually doing something, you will never be able to distinguish the monitoring thread time.

But you'd have to be running an idle TCC session for several decades for it to use a fraction of the time you've spent in this thread ...
 
You have said twice now that I'm not concerned with performance. Where did you get that impression? I'm obsessed with performance, even when it has no practical implications. Besides that, investigating performance issues is just plain fun.

You haven't said ... what does that v20 thread (starting at AllocDup+0x2C0) do?
 
You have said twice now that I'm not concerned with performance. Where did you get that impression? I'm obsessed with performance, even when it has no practical implications. Besides that, investigating performance issues is just plain fun.

Well, for instance, in this thread you said you didn't care about anything except startup time:

psoft.com/forums/threads/tcc-takes-time-to-show-registered-to.8111/#post-46168

And again:

jpsoft.com/forums/threads/tcmd-v19-0-7-x86-uploaded.6972/#post-40189

If you actually *did* care about performance, you wouldn't be running Windows 7 x86 (which is the surest known way to cripple a computer -- short of running Vista x86).

You haven't said ... what does that v20 thread (starting at AllocDup+0x2C0) do?

I did say - in v20, there are three (permanent) threads, one for the main TCC thread, one for IPC (mostly with TCMD), and one for xxxMONITOR polling. Only the main thread consumes any significant amount of CPU. In v21, the only permanent threads are the main thread and the IPC thread.
 
It is still not clear to me. When I stop the v20 thread (at AllocDup+0x2C0) I get a significant reduction in cycle count. What does that thread do, and what is not being done when it's suspended?
 
It wakes up, scans the 100 possible monitor slots, checks the status if any of them are active (and processes the results if they have triggered), and then goes back to sleep. You might see it use as much as a couple of microseconds every few minutes if you don't have any monitors active.
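A rough Python sketch of that wake, scan, sleep cycle, for readers who want to picture it. This is not TCC's source: the `Monitor` type, the callbacks, and the sleep interval are all hypothetical. Only the count of 100 slots comes from the description above.

```python
import threading
from dataclasses import dataclass
from typing import Callable, List, Optional

MAX_MONITORS = 100  # slot count taken from the description above


@dataclass
class Monitor:
    check: Callable[[], bool]   # hypothetical: True once the monitored condition has triggered
    action: Callable[[], None]  # hypothetical: what to run when it triggers


def scan_once(slots: List[Optional[Monitor]]) -> int:
    """One pass over all slots; returns how many monitors triggered."""
    fired = 0
    for slot in slots:
        if slot is None:        # inactive slot: nothing to check
            continue
        if slot.check():        # synchronous status check (this is the polling)
            slot.action()
            fired += 1
    return fired


def poll_forever(slots: List[Optional[Monitor]], interval_seconds: float,
                 stop: threading.Event) -> None:
    """Sleep, scan, repeat until `stop` is set. The interval is an assumption."""
    while not stop.wait(interval_seconds):
        scan_once(slots)
```

With no monitors active, each pass is just 100 "is this slot empty?" tests, which fits the "couple of microseconds" estimate above.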

And it's irrelevant now because v21 doesn't do that.
 
Thanks for that info. With WinDbg I saw it looping for a count of 100 and couldn't even guess what it might be doing.

v21's cycle usage is the largest of the three mentioned earlier. Don't you see that?
 