![[personal profile]](https://www.dreamwidth.org/img/silk/identity/user.png)
A good chunk of yesterday was trying to come up with some MySQL/MariaDB code that could generate a random row for me. Doing the simplest thing (something like SELECT ID FROM tablename ORDER BY RAND() LIMIT 1
) is slow. This huge thing worked, but you can't run a sequence of commands the same way you run prepared [single] statements:
SET @r := (RAND()*(SELECT MAX(ID) FROM tablename));
SET @sql := CONCAT('SELECT ID FROM tablename WHERE ID>=',@r,' ORDER BY ID ASC LIMIT 1');
PREPARE stmt1 FROM @sql;
EXECUTE stmt1;
Every attempt to condense that into one single query with a subquery always gave my an ID number below would only return ID numbers lower than about 310. It's like the statement "(RAND()*(SELECT MAX(ID) FROM tablename))
" gives different results when it's run separately versus running it in another statement, like SELECT ID FROM tablename WHERE ID>=(RAND()*(SELECT MAX(ID) FROM tablename)) ORDER BY ID ASC LIMIT 1
. Enter mysqli_multi_query
. Apologies for any bad formatting, this was whipped up quickly just to see if it worked:
$query = 'SET @r := (' . (mt_rand(0, mt_getrandmax()) / mt_getrandmax()) . '*(SELECT MAX(ID) FROM tablename));' . PHP_EOL . 'SET @sql := CONCAT(\'SELECT ID,\',@r, \' FROM tablename WHERE ID>=\',@r,\' ORDER BY ID ASC LIMIT 1\');' . PHP_EOL . 'PREPARE stmt1 FROM @sql;' . PHP_EOL . 'EXECUTE stmt1;';
if ($mysqli->multi_query($query)) {
while ($mysqli->more_results() && $mysqli->next_result()); // jump to last one
if ($result = $mysqli->store_result()) {
while ($row = $result->fetch_row()) $theID = $row[0];
}
$result->free();
}
(The code assumes you opened a link to the database as $mysqli.) Note that I used mt_rand
instead of the MySQL/MariaDB RAND()
because the latter produces too many repeats. I made a test to benchmark and test the randomness the code I posted above and I think the results are satisfactorily random (this is 1000 runs on a database with 5522 rows with IDs ranging from 2 to 6155):
1 results repeated 5 timesI'm assuming that one that comes up a lot appears because there's a giant hole between IDs 5624 and 5808, and this code just seems to like picking a number between those. If any holes in your table's IDS are fairly evenly distributed, they won't repeat that many times. If you have no holes at all and you can plan on it, something simpler may be in order. Even with the repeating numbers, I don't expect someone to click the link for a random result 61 times either. As for the time to execute, the
1 results repeated 19 times
2 results repeated 4 times
4 results repeated 3 times
76 results repeated 2 times
Highest ID: 6148
Lowest ID: 22
Max Time to Execute (returning only an ID): 0.0022029876708984
Min Time to Execute (returning only an ID): 0.0006871223449707
Average Time to Execute (returning only an ID): 0.00086945366859436
Max Time to Execute (returning all 45 columns): 0.0029420852661133
Min Time to Execute (returning all 45 columns): 0.00085997581481934
Average Time to Execute (returning all 45 columns): 0.001065233707428
First Repeat at Iteration #61
ORDER BY RAND() LIMIT 1
code took about 0.25 seconds to run, so I'd say the figures here are an improvement. (Don't laugh at my 45 column table; I'm just working with a database someone else designed that I can't change.) Hopefully this is useful to someone.