buzzy: Rigby from Regular Show sitting in front of a computer (Computer Rigby)
[personal profile] buzzy

A good chunk of yesterday was trying to come up with some MySQL/MariaDB code that could generate a random row for me. Doing the simplest thing (something like SELECT ID FROM tablename ORDER BY RAND() LIMIT 1) is slow. This huge thing worked, but you can't run a sequence of commands the same way you run prepared [single] statements:

SET @r := (RAND()*(SELECT MAX(ID) FROM tablename));
SET @sql := CONCAT('SELECT ID FROM tablename WHERE ID>=',@r,' ORDER BY ID ASC LIMIT 1');
PREPARE stmt1 FROM @sql;
EXECUTE stmt1;
Every attempt to condense that into one single query with a subquery always gave my an ID number below would only return ID numbers lower than about 310. It's like the statement "(RAND()*(SELECT MAX(ID) FROM tablename))" gives different results when it's run separately versus running it in another statement, like SELECT ID FROM tablename WHERE ID>=(RAND()*(SELECT MAX(ID) FROM tablename)) ORDER BY ID ASC LIMIT 1. Enter mysqli_multi_query. Apologies for any bad formatting, this was whipped up quickly just to see if it worked:
$query = 'SET @r := (' . (mt_rand(0, mt_getrandmax()) / mt_getrandmax()) . '*(SELECT MAX(ID) FROM tablename));' . PHP_EOL . 'SET @sql := CONCAT(\'SELECT ID,\',@r, \' FROM tablename WHERE ID>=\',@r,\' ORDER BY ID ASC LIMIT 1\');' . PHP_EOL . 'PREPARE stmt1 FROM @sql;' . PHP_EOL . 'EXECUTE stmt1;';
if ($mysqli->multi_query($query)) {
while ($mysqli->more_results() && $mysqli->next_result()); // jump to last one
    if ($result = $mysqli->store_result()) {
        while ($row = $result->fetch_row()) $theID = $row[0];
    }
    $result->free();
}
(The code assumes you opened a link to the database as $mysqli.) Note that I used mt_rand instead of the MySQL/MariaDB RAND() because the latter produces too many repeats. I made a test to benchmark and test the randomness the code I posted above and I think the results are satisfactorily random (this is 1000 runs on a database with 5522 rows with IDs ranging from 2 to 6155):
1 results repeated 5 times
1 results repeated 19 times
2 results repeated 4 times
4 results repeated 3 times
76 results repeated 2 times
Highest ID: 6148
Lowest ID: 22
Max Time to Execute (returning only an ID): 0.0022029876708984
Min Time to Execute (returning only an ID): 0.0006871223449707
Average Time to Execute (returning only an ID): 0.00086945366859436
Max Time to Execute (returning all 45 columns): 0.0029420852661133
Min Time to Execute (returning all 45 columns): 0.00085997581481934
Average Time to Execute (returning all 45 columns): 0.001065233707428
First Repeat at Iteration #61
I'm assuming that one that comes up a lot appears because there's a giant hole between IDs 5624 and 5808, and this code just seems to like picking a number between those. If any holes in your table's IDS are fairly evenly distributed, they won't repeat that many times. If you have no holes at all and you can plan on it, something simpler may be in order. Even with the repeating numbers, I don't expect someone to click the link for a random result 61 times either. As for the time to execute, the ORDER BY RAND() LIMIT 1 code took about 0.25 seconds to run, so I'd say the figures here are an improvement. (Don't laugh at my 45 column table; I'm just working with a database someone else designed that I can't change.) Hopefully this is useful to someone.

Profile

buzzy: Steven Universe from the show of the same name with a big smile (Default)
Buzzy

January 2016

S M T W T F S
     12
3456789
101112131415 16
17181920212223
24252627282930
31      

Most Popular Tags

Style Credit

Expand Cut Tags

No cut tags
Page generated Jul. 24th, 2017 04:45 pm
Powered by Dreamwidth Studios