Peculiarities of how START launches processes

Mar 14, 2019
10
0
Hello :-)

This is a bit of an odd request. I am on a bit of a detective case right now, and the problem I am trying to figure out isn't with TCC itself, but the evidence suggests that TCC, or something TCC is doing, plays a key role. Let me explain. :-)

I am a software developer and one of my projects involves a local Named Pipe server to which multiple clients can connect. Recently, I discovered a bizarre problem, where if many clients started in quick succession, some of them wouldn't be able to connect to the server at all (Windows behaves as if the server doesn't exist). Naturally, being a TCC user, the way I launched my barrage of clients was:

for /L %i in (1,1,50) do start Client.exe

When this is run, 50 windows pop up, but some very significant subset of them (10+ typically) at random get an error from the OS saying that no such named pipe server exists, even when the processes immediately before and after might succeed in connecting.

This immediately made me wonder if my pipe server isn't implemented properly, but after a good deal of piece-by-piece deconstruction and consultation with documentation and reference implementations, I can find nothing wrong with my server. Microsoft's documentation also specifically states that a client attempting to connect to a pipe will wait for the pipe server to come into existence if necessary.

So, I set about creating a minimal reproduction. As part of this reproduction, I created a project to launch Client.exe in a controlled manner (and a manner repeatable by people who are not blessed to know the wonder of TCC :-). My project launches Client.exe in the most straightforward manner possible. I'm using .NET, so the OS API is thinly wrapped, but I am in essence doing nothing other than a straightforward CreateProcessEx (Process.Start). I could not reproduce the problem. Even if my driver application launches many hundreds of client instances as fast as it can, every single one gets its own connection to the server. I gradually reintroduced pieces so that my "minimal" reproduction became closer and closer to the real server, but nothing would reproduce the problem.

Finally, tearing my hair out trying to figure out how it was different, I happened to use a for loop in TCC to start my minimal reproduction's Client.exe using the START built-in. Immediately, the problem recurred. I pared my test app back down, and the problem continues to occur, but only if I am using TCC's START to launch the client processes.

Therefore I am wondering whether anyone from JP Software might be able to shed some light on what exactly TCC's START is doing beyond just calling CreateProcessEx to launch a child process. It seems that it must be doing something, and whatever it is, it is causing some bizarre interference with Windows' named pipe infrastructure. Could it be related to jobs? I'm trying to figure this out to see whether there's anything I need to change in my server to improve its reliability. I currently only know how to reproduce this problem using TCC to START the client processes, but without knowing why it is happening, I can't help but worry that whatever the underlying cause is, it could end up being accidentally triggered some other way and cause issues for our software down the road.

If necessary, I can probably share my minimal reproduction project (three .NET console applications, Server, Client and ClientRunner) if there's some chance that it'll help isolate the cause.

Hoping you can help me :-)

Thanks very much,

Jonathan Gilbert
 
Aug 23, 2010
636
9
START does not use CreateProcess, it uses ShellExecute(Ex).
Did you strace your clients to see how and why they are failing?
 
Mar 14, 2019
10
0
I haven't done a trace on them, but the .NET wrapper around named pipes is relatively thin. Based on a review of the reference source, the exception that arises reveals that the code is getting system error 121 from either WaitNamedPipe or CreateFile. I'm not entirely sure how to catch exactly which call it is, since Windows doesn't have a straight-up API monitor (I am aware of Rohitab's tool by exactly that name, but it doesn't seem to work on my Windows 10 64-bit installation), and in order to reproduce this problem, I am constrained to launching processes in bulk outside of the debugger. I'll see if I can't figure something out, though.
 
May 20, 2008
11,378
98
Syracuse, NY, USA
START does not use CreateProcess, it uses ShellExecute(Ex).
Did you strace your clients to see how and why they are failing?
That's not, in general, true. Using WinDBG with breakpoints on CreateProcess, ShellExecute, and ShellExecuteEx ...
Code:
START
uses CreateProcess (and not SE or SEEx) to start a new instance of TCC.
Code:
START notepad
uses CreateProcess (and not SE or SEEx) to start notepad.
Code:
START https://www.google.com
uses both SE and SEEx (with SEEx ultimately calling CreateProcess if a new instance of the browser is needed).
Code:
START c:\
uses SE and SEEx
Code:
START file.pdf
uses CreateProcess (and not SE or SEEx, a bit of a surprise).
 
Mar 14, 2019
10
0
Okay, I duplicated the reference source for connecting to a named pipe into my project and can confirm that the ERROR_SEM_TIMEOUT error is coming back from WaitNamedPipe. This is in fact documented as being the expected result if the timeout period elapses, which suggests to me that this is essentially a bug in the .NET Framework: the code clearly tries to detect timeouts and turn them into TimeoutException objects instead, but this obvious case, where WaitNamedPipe's timeout elapses, doesn't get handled properly and a generic exception object is thrown. :-P

So, this suggests that my pipe server isn't keeping up with the requests coming into it -- but only when TCC is doing a FOR loop. The next thing I'm going to check, then, is whether TCC is perhaps burning 100% CPU during the execution of the loop -- the hypothesis being that perhaps my server is being starved for CPU and isn't able to accept connections fast enough. For reference, remember that a minimal application that just calls CreateProcess in a loop, as fast as it can, can create hundreds of processes without triggering this problem, but for /L %i in (1,1,50) do start Client.exe reliably produces several handfuls of failed connections every single time.
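
A minimal sketch of the timeout handling discussed in this post, written in Python rather than .NET so that it runs anywhere. `PipeConnectError`, `connect_with_retry` and the `connect` callable are hypothetical names, not part of the .NET Framework or of the test project. The only detail taken from the thread is that system error 121 (ERROR_SEM_TIMEOUT) means the wait for a free pipe instance elapsed.

```python
import time

ERROR_SEM_TIMEOUT = 121  # Win32 error code reported in this thread


class PipeConnectError(Exception):
    """Hypothetical stand-in for a generic I/O error that carries a Win32 code."""

    def __init__(self, code):
        super().__init__(f"pipe connect failed with system error {code}")
        self.code = code


def connect_with_retry(connect, attempts=5, pause_s=0.1):
    """Call connect() until it succeeds or the attempts run out.

    Error 121 is treated as "timed out, try again". Any other code is
    re-raised at once. After the last failed attempt a TimeoutError is
    raised in place of the generic error.
    """
    for attempt in range(1, attempts + 1):
        try:
            return connect()
        except PipeConnectError as exc:
            if exc.code != ERROR_SEM_TIMEOUT:
                raise  # not a timeout, so retrying is unlikely to help
            if attempt == attempts:
                raise TimeoutError(
                    f"no pipe instance became free after {attempts} attempts"
                ) from exc
            time.sleep(pause_s)  # give the server time to accept again
```

A client that wraps its connect call this way would report a clear timeout rather than a generic I/O error, and would ride out short periods in which the server has no free pipe instance. Whether retrying is appropriate for the real server is a design decision this sketch does not settle.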
 
Mar 14, 2019
10
0
Well, there goes that theory. This graph shows a for loop creating not 50 but 100 child processes:

[Attached graph (attachment 2270), image not preserved]
 
Mar 14, 2019
10
0
I also did a check earlier using ShellExecute instead of CreateProcess from my driver, and it did not trigger the problem. So, whether START is using CreateProcess or ShellExecute does not seem to be a determining factor.
 

rconn

Administrator
Staff member
May 14, 2008
12,340
149
TCC doesn't do anything after the START (CreateProcess, not CreateProcessEx), unless you're running TCC inside a Take Command tab window (which requires a little communication between TCC & TCMD), or you've passed additional options to START to tell it to do something like wait for the START'd process to exit.

Is your client app a console or GUI app?
 
Mar 14, 2019
10
0
I have created a Git repository with a solution containing three projects that reproduce the issue on my end.

Git repository: logiclrd/TestPipeConnection

Projects:
- Server: Sets up a named pipe server, listens for connections, performs a basic protocol that does some unique communication with each client. Press Enter to exit.
- Client: Connects to an instance of Server, processes the data sent to it, and then sets its exit code based on whether it successfully connected and the data it received followed the expected protocol (0 == success).
- ClientRunner: Runs 100 instances of Client as quickly as possible and monitors their exit codes, reporting on the status of the batch to its console output.
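
For readers without the .NET projects at hand, the ClientRunner idea can be approximated with a short Python sketch. This is not the code from the repository: `run_batch`, `summarize` and the placeholder child command are assumptions made for illustration, and the status text only loosely imitates ClientRunner's output.

```python
import subprocess
import sys


def run_batch(command, count):
    """Start `count` copies of `command` back to back, then collect exit codes."""
    # Start everything first so the launches are as close together as possible.
    children = [subprocess.Popen(command) for _ in range(count)]
    # Only after all children exist do we wait for each one to finish.
    return [child.wait() for child in children]


def summarize(exit_codes):
    """Describe a batch; exit code 0 counts as success, as in Client."""
    failed = sum(1 for code in exit_codes if code != 0)
    if failed == 0:
        return "all succeeded"
    return f"{failed} of {len(exit_codes)} failed"


if __name__ == "__main__":
    # Placeholder child: a Python one-liner that exits with code 0.
    # Substitute the path to Client.exe to approximate the real test.
    demo_child = [sys.executable, "-c", "raise SystemExit(0)"]
    print(summarize(run_batch(demo_child, 10)))
```

Starting all children before waiting on any of them is what makes the batch a burst. Waiting inside the launch loop would serialize the clients and hide the behaviour under investigation.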

To reproduce the findings so far in the thread:
1. Run Server.exe in one console window.
2. Run ClientRunner.exe in another console window. This will spam your system with 100 console windows, but they should all pretty much immediately say "Connected", and then they should disappear one by one as they receive the expected data from the server. If all clients exit with the same code, ClientRunner reports this, and when I run it, the final line of output says:

All check results are "succeeded"

So far so good, this is what we want to see. But now, instead launch the Client instances directly using TCC's START command in a FOR loop:

[C:\code\TestPipeConnection\Client\bin\Debug]for /L %i in (1,1,50) do start Client.exe keepopen

The keepopen command-line option will cause Client.exe to wait until Enter is pressed if an exception occurs. (Otherwise the console window would disappear immediately and you wouldn't be able to see the error.)

I have two different systems where running this produces some significant fraction of windows with the error message:

System.IO.IOException: The semaphore timeout period has expired.

at System.IO.__Error.WinIOError(Int32 errorCode, String maybeFullPath)
at System.IO.Pipes.NamedPipeClientStream.Connect(Int32 timeout)
at Client.Program.Main(String[] args) in C:\code\TestPipeConnection\Client\Program.cs:line 25


(Technically, one of the systems is running 4NT 7.01.370, not TCC. Same behaviour.)
 
Mar 14, 2019
10
0
I am running TCC 19.10.51 x64. I notice a much newer version available for download. How does licensing intersect with that? Does the license purchased for TCC 19 carry forward, or is some sort of upgrade purchase required to run a newer version? I'd like to test with the latest version in addition to the 4NT 7.01.370 and TCC 19.10.51 I have tried so far, but I don't want to risk messing up my current properly-licensed TCC :-)
 

Charles Dye

Super Moderator
Staff member
May 20, 2008
4,446
88
Albuquerque, NM
prospero.unm.edu
Your v19 licence will not work for v24, but there is a 30-day grace period. If you want to test v24, be sure to install it to a different location from your existing copy. (The installer should do this by default, but it never hurts to check.)
 

rconn

Administrator
Staff member
May 14, 2008
12,340
149
This seems to be a timing issue - TCC is creating the new processes faster than your app (or perhaps Windows?) can handle them. On my fairly fast system, almost all of the clients display that timeout error.

However, if I slow things down a bit:

Code:
for /L %i in (1,1,50) do (start Client.exe keepopen & delay /m 200)

then it works perfectly for all of them.
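
The same pacing idea can be tried outside TCC. The sketch below is a Python approximation, assuming that `delay /m 200` amounts to a 200 ms pause between launches; `launch_paced` is a hypothetical helper, and the child command is whatever client is under test.

```python
import subprocess
import sys
import time


def launch_paced(command, count, pause_s=0.2):
    """Start `count` copies of `command`, sleeping `pause_s` between starts."""
    children = []
    for index in range(count):
        children.append(subprocess.Popen(command))
        if index < count - 1:
            time.sleep(pause_s)  # pause only between launches, not after the last
    # Collect exit codes once every child has been started.
    return [child.wait() for child in children]
```

Varying `pause_s` from 0 upward is one way to look for the launch rate at which connection failures begin to appear.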
 
Mar 14, 2019
10
0
But surely my ClientRunner application runs them even faster -- it has no scripting overhead at all and simply calls Process.Start (CreateProcessEx) in a tight loop as fast as possible. I can set it to 200 child instances, and all the windows have appeared within about 5 seconds, and not a single one fails! TCC must be doing something different, but I can't begin to imagine what it might be.
 