WAD @RANDOM

vefatica · Dec 17, 2018

I don't think this should produce negative numbers. It doesn't in v20 and before.

Code:

v:\> do i=1 to 5 ( echo %@random[0,2147483647] )
1448389178
-817923826
-272901703
1035707379
-1074066855

rconn · Dec 17, 2018

WAD - the maximum value allowed is 2147483647 - 1. I've corrected the help to reflect this.

vefatica · Dec 17, 2018

rconn said:
WAD - the maximum value allowed is 2147483647 - 1. I've corrected the help to reflect this.

It's not that simple!

Code:

v:\> do i=1 to 5 ( echo %@random[-1,2147483646] )
-405438847
-272795195
-311299433
1073697800
384306837

rconn · Dec 17, 2018

Is that a real use case?

vefatica · Dec 17, 2018

rconn said:
Is that a real use case?

Someone might have a use for it. But, really, it's all screwed up. Compare it to v20.

Where does the randomness come from ... MAPM? If so, it's a FP number in (0,1), call it x, and this sort of calculation should turn it into an integer >= lo and <= hi. And, for 32-bit numbers, the limits in the help should be OK.

floor(lo + x * ( hi - lo + 1)) assuming floor works like @EVAL's floor, namely, floor(-.5) = -1

vefatica · Dec 17, 2018

V20 does this without complaint.

Code:

set x=%@random[-2147483647,2147483647]

I've run a few simple tests for randomness on it and it seems to do it correctly.

vefatica · Dec 18, 2018

This is better in that the limits from the help seem OK. But there's a big problem with the randomness of it. I'd expect random uniform (each value equally likely). But it's coming out triangular (and very nicely so, as if by design) instead of uniform. Here are pics of the distributions of 10000 instances of %@random[-2147483647,2147483647] for v20 and v24.

vefatica · Dec 20, 2018

I see in the help that you changed @RANDOM (significantly) so that it now can produce 64-bit numbers. Internally it may be OK, but the output isn't. It always outputs a number in the INT range. This one shouldn't be in the INT range.

Code:

p:\4utils\release> echo %@random[%@eval[2**32], %@eval[2**63-2]]
1881353011

This one always gives a negative result (in the INT range).

Code:

p:\4utils\release> echo %@random[8999999999999999000,9000000000000000000]
-494666116

And here (below) while the range meets the stated criterion, there's a bad parameter message.

Code:

p:\4utils\release> echo %@random[%@eval[2**32], %@eval[2**64-3]]
TCC: (Sys) The parameter is incorrect.
 "%@random[4294967296, 18446744073709551613]"

rconn · Dec 20, 2018

vefatica said:
I see in the help that you changed @RANDOM (significantly) so that it now can produce 64-bit numbers. Internally it may be OK, but the output isn't. It always outputs a number in the INT range. This one shouldn't be in the INT range.

Code:

p:\4utils\release> echo %@random[%@eval[2**32], %@eval[2**63-2]] 1881353011

Works here with build 29.

Code:
This one always gives a negative result (in the INT range).

Code:

p:\4utils\release> echo %@random[8999999999999999000,9000000000000000000] -494666116

Works here with build 29.

Code:
And here (below) while the range meets the stated criterion, there's a bad parameter message.

Code:

p:\4utils\release> echo %@random[%@eval[2**32], %@eval[2**64-3]] TCC: (Sys) The parameter is incorrect. "%@random[4294967296, 18446744073709551613]"

WAD - the start and end arguments are 64-bit integers, and your end argument is negative (and < start, and therefore invalid).

vefatica · Dec 20, 2018

Yup! The first two are OK. As for the third, I only read the bit about the range in the help (and my range was OK) ... rather dumb of me since I should have figured you were using unsigned __int64 and I know well what values they can have.

What's your random number generator? Is it the 64-bit version of the Mersenne Twister? Though it's old, I just discovered it and implemented it in a plugin ... a tad faster that the 32-bit one (in only a few quick tests) regardless of whether both are built for x86 or both are built for x64.

rconn · Dec 20, 2018

Yes, I abandoned the (mostly useless) random number generator in the RTL and went with the Mersenne Twister.

vefatica · Dec 20, 2018

Hmmm! I hate to sound like a mathematician, but there's still a problem. I know what it is (had to deal with it myself) and how to work around it. That involves waste (throwing away some random numbers, can't be helped if you want uniformity) resulting in 1.5 generations each time instead of 1 (in this very bad case). That's a very minor problem since calling genrand64_int64() takes only about 1 microsecond (which is nothing compared to the rest of the code needed to implement @RANDOM).

This will happen whenever (hi - lo + 1) is not a power of 2, but can go unnoticed when the range is small. Here's a pretty bad case. I asked for a sample of size 10,000 like this:

Code:

(do i=1 to 10000 (echo %@random[-6148914691236517205,6148914691236517205])) > clip:

Here's a histogram of the results showing that a negative result is twice as likely as a positive one (even though the range is symmetric about 0). It's not surprising when you dig into what's happening. I'll post a work-around when I get to the other computer (and an explanation of what's happening if anyone wants one).

You don't need a histogram to see this ... just do a few and count the negative ones. Here (below) I got 66 negative ones out of 100.

Code:

v:\> do i=1 to 100 (echo %@random[-6148914691236517205,6148914691236517205]) | ffind /v /t"-" /u
  66 lines in      1 file

vefatica · Dec 21, 2018

Here's my version showing the workaround. To get it to be uniform, you have to throw away some bad guys before modding by (hi - lo + 1).

Code:

INT WINAPI f_TEST ( LPWSTR psz )
{
    LONGLONG llLo = 0, llHi = 0;  // inclusive, limited to plus/minus (_I64_MAX-1) = 9223372036854775806

    Sscanf(psz, L" %I64d , %I64d ", &llLo, &llHi);

    if ( llHi < llLo )
    {
        TCError(ERROR_INVALID_PARAMETER, psz);
        return -1;
    }

    ULONGLONG x, span = llHi - llLo + 1, limit = span*(0xFFFFFFFFFFFFFFFF/span);

    do x = genrand64_int64(); while ( x >= limit );

    // take your pick
    //_i64tow_s(llLo + (x % span), psz, 32, 10);
    Sprintf(psz, L"%I64d", llLo + ( x % span ));

    return 0;
}

A very, very slightly better workaround would be to use 2^64 where I used 0xFFFFFFFFFFFFFFFF, but that's hard to do on my computer! :-)

Here's a time comparison. I've done many. Mine's a little faster but you're probably doing more error checking. It also shows mine giving about 50:50 negative:positive and @RANDOM giving about 67:33.

Code:

v:\> timer & do i=1 to 10000 (echo %@random[-6148914691236517205,6148914691236517205]) | ffind /v /t"-"
/u & timer
Timer 1 on: 23:57:03
  6624 lines in      1 file
Timer 1 off: 23:57:06  Elapsed: 0:00:02.65

v:\> timer & do i=1 to 10000 (echo %@test[-6148914691236517205,6148914691236517205]) | ffind /v /t"-" /u
& timer
Timer 1 on: 23:58:07
  5019 lines in      1 file
Timer 1 off: 23:58:10  Elapsed: 0:00:02.53

Joe Caverly · Dec 21, 2018

For those who have not heard of the Mersenne Twister (such as myself), here's the Wikipedia page with details;

Mersenne Twister

Interesting read.

Joe

vefatica · Dec 21, 2018

vefatica said:
A very, very slightly better workaround would be to use 2^64 where I used 0xFFFFFFFFFFFFFFFF, but that's hard to do on my computer! :-)

I found a way around that and added another validity check on the parameters. A little change in the logic kept it fast.

Code:

v:\> timer /q & do i=1 to 10000 (noop %@random[0,1]) & timer
Timer 1 off: 12:06:33  Elapsed: 0:00:02.21

v:\> timer /q & do i=1 to 10000 (noop %@test[0,1]) & timer
Timer 1 off: 12:06:40  Elapsed: 0:00:02.23

There are also no restrictions on the limits except that they be 64 bit signed integers (and lo <= hi). This allows the following extreme case where the limits are exactly the limits of a 64-bit signed integer.

Code:

v:\> echo %@test[%@eval[-(2**63)],%@eval[2**63-1]]
-6901783667727303969

Code:

INT WINAPI f_TEST ( LPWSTR psz )
{
    LONGLONG llLo = 0, llHi = 0;  // inclusive range, no restrictions

    if ( Sscanf(psz, L" %I64d , %I64d ", &llLo, &llHi) != 2 || llHi < llLo )
    {
        TCError(ERROR_INVALID_PARAMETER, psz);
        return -1;
    }

    ULONGLONG x, span = llHi - llLo + 1, limit;

    // span == 0 means span == 2^64 (which is OK)
    // the second disjunct below is equivalent to span being a power of 2
    // in either case there are no bad guys to throw away

    if ( span == 0 || (limit = span*(0xFFFFFFFFFFFFFFFF/span)) + span == 0 )
        
        x = genrand64_int64();

    else    // else there will be bad guys at the top; disallow them

        { do x = genrand64_int64(); while ( x >= limit ); }
    

    //_i64tow_s(llLo + (span == 0 ? x : (x % span)), psz, 32, 10);
    Sprintf(psz, L"%I64d", llLo + (span == 0 ? x : (x % span)));

    return 0;
}

vefatica · Dec 22, 2018

Notes:
1. For what it's worth ...
2. I barely know what I'm doing when it comes to std:anything (so any tips would be appreciated)

You can also get at the Mersenne Twister like this (below). The problem I've discussed (and offered a fix for) seems to have been handled by Microsoft. Here's what I did in my 4UTILS plugin. It's fast and doesn't seem to eat up much space.

In 4UTILS.H

Code:

#include <random>
extern std::default_random_engine g;

In 4UTILS.CPP

Code:

// global scope
std::default_random_engine g; // will be Mersenne Twister ...
// in InitializePlugin()
g.seed(GetTickCount()); // ... according to g.seed's bubble help

In MRAND.CPP

Code:

#include <random>
INT WINAPI f_TEST ( LPWSTR psz ) // compare to @RANDOM
{
    LONGLONG lo = 0, hi = 0;  // inclusive range, no restrictions

    if ( Sscanf(psz, L" %I64d , %I64d ", &lo, &hi) != 2 || hi < lo )
    {
        TCError(ERROR_INVALID_PARAMETER, psz);
        return -1;
    }

    std::uniform_int_distribution<LONGLONG> m(lo, hi);

    Sprintf(psz, L"%I64d", m(g));

    return 0;
}

vefatica · Jan 4, 2019

This persists in build 30. Here's another solution. It might be a bit faster than the last one I offered because it postpones computing "limit" until it's known that it's necessary, and it avoids an integer multiplication (though that might have to be done at some level in computing the modulo (%).

Code:

    ULONGLONG x, span = hi - lo + 1, limit, mod;

    // span == 0 means span == 2^64, which is good ... no bad guys.
    // mod == (span - 1) is also good ... no bad guys.

    if ( span == 0 || (mod = (0xFFFFFFFFFFFFFFFF % span)) == (span - 1) )
        
        x = genrand64_int64();

    else    // else disregard the bad guys at the top
    {
        limit = 0xFFFFFFFFFFFFFFFF - mod;
        do x = genrand64_int64(); while ( x >= limit );
    }

    //_i64tow_s(lo + (span == 0 ? x : (x % span)), psz, 32, 10);
    Sprintf(psz, L"%I64d", lo + (span == 0 ? x : (x % span)));

rconn · Jan 4, 2019

vefatica said:
This persists in build 30.

What, exactly, persists in build 30?

vefatica · Jan 5, 2019

The distribution of values of @RANDOM[lo,hi] is only uniform (every number equally likely) when (hi - lo +1) is a power of 2. That's because you can only divide up 0 ~ UINT64_MAX into intervals of 0 ~ n-1 (using module, %) EVENLY when n is a power of 2. Here's a histogram of the distribution of values of @random[-6148914691236517205,6148914691236517205] which is an almost-worst case. Though the interval is symmetric about 0, the negative numbers are twice as likely than the positive numbers. It's easy to correct and with a correction I can't measure a time difference even in this almost-worst case. The most recent correction (my last post) is the best I have to offer. MS's inplementation of the Mersenne Twister has apparently corrected for this unavoidable effect. I also suggested a way to get that from std ... random.

rconn · Jan 5, 2019

vefatica said:
The distribution of values of @RANDOM[lo,hi] is only uniform (every number equally likely) when (hi - lo +1) is a power of 2. That's because you can only divide up 0 ~ UINT64_MAX into intervals of 0 ~ n-1 (using module, %) EVENLY when n is a power of 2. Here's a histogram of the distribution of values of @random[-6148914691236517205,6148914691236517205] which is an almost-worst case.

Your example @random[-6148914691236517205,6148914691236517205] has invalid arguments - the args are signed int 64's, and the max range is 0x7FFFFFFFFFFFFFFE. So you're specifying a negative range.

When I use a range < INT64_MAX-1 I do not see a distribution issue. For example:

Code:

do i=1 to 100 (echo %@random[-4611686018427387903,4611686018427387903]) | ffind /v /t"-" /u/code]

returns evenly distributed numbers.  Do you have a real-world need for random numbers > -4 quintillion and < 4 quintillion?

vefatica · Jan 5, 2019

rconn said:
Your example @random[-6148914691236517205,] has invalid arguments - the args are signed int 64's, and the max range is 0x7FFFFFFE. So you're specifying a negative range.

That's DWORD_MAX. Don't you mean 0x7FFFFFFFFFFFFFFE? The signed 64-bit integers range from -(2^63) to (2^63-1). That's -9223372036854775808 to 9223372036854775807.

The numbers I used in my example are well within that range (in fact somewhat carefully chosen to be about 2/3 of the limits).

rconn · Jan 5, 2019

The numbers you used in your example are within that range, but the *difference* between them is definitely not. @RANDOM does a (high - low) + 1, and with your example that puts it in the negative range for INT64.

The signed 64-bit integers range from -(2^63) to (2^63-1). That's -9223372036854775808 to 9223372036854775807.

Yes, and you can do @RANDOM[-9223372036854775807,0] or @RANDOM[0,9223372036854775807]. But you cannot do @RANDOM[-9223372036854775808,9223372036854775807]. (Nor, I suspect, does anybody actually *need* to.)

vefatica · Jan 5, 2019

I don't know what you're doing, or why, and I don't want to spend more time trying to figure it out. There is no need for (hi - lo + 1) to be in the (signed) INT64 range. It can be any number in the unsigned INT64 range. In fact it can even be 2^64 (looks more like 0). That puts no limits on lo and hi except that they be in the signed INT 64 range (and lo <= hi). Both my code and MS's std\random allow that, have corrected for non-uniformity, and are as fast as @RANDOM.

Charles Dye · Jan 5, 2019

Sound like you have working code, Vince. Why not publish a plugin to replace @RANDOM?

rconn · Jan 5, 2019

std::rand doesn't support negative numbers (which you apparently want since you keep trying to use them!).

But since you apparently don't actually have a use / need for this, I'm not going to spend any more time on it.

vefatica · Jan 5, 2019

Charles Dye said:
Sound like you have working code, Vince. Why not publish a plugin to replace @RANDOM?

There's already @MRAND in 4UTILS. The public version still uses the 32-bit Mersenne Twister. At home it has been updated to produce any 64-bit unsigned integer (0 ~ 2^64-1); subtract 2^63 for 64-bit signed numbers ( -(2^63) ~ 2^63-1). I haven't published it yet.

vefatica · Jan 5, 2019

rconn said:
std::rand doesn't support negative numbers (which you apparently want since you keep trying to use them!).

I don't know if we're talking about the same thing, but this certainly handles negative numbers, in fact all pairs of LONGLONGs with lo <= hi. And it uses the Mersenne Twister and doesn't need a uniformity correction.

Code:

std::uniform_int_distribution<LONGLONG> m(lo, hi);

I thought the help was more up-to-date but it seems to have reverted to something inappropriate.

vefatica · Jan 5, 2019

I'm not plugging std::random. I tried converting all my 4UTILS random stuff to it. That added 10KB to the DLL and in one case seemed to be buggy. Using my own MT64 is just as fast so I went back to it.

Welcome!

WAD @RANDOM

Administrator

Administrator

Administrator

Administrator

Administrator

Administrator

Administrator

Super Moderator

Administrator

Similar threads