Looping over a large data set is very slow. How do I speed it up?
-
I fetch a report from the database; the data arrives in about 1 second, 300,000+ rows. Each iteration needs some manipulation, and one iteration takes about 0.003 seconds on average, so the whole loop runs for more than 15 minutes. The result is an array with summary information (roughly 50 rows). It just takes too long.
Question: any ideas on how to speed this up, or what tricks people use when working with big data?
update:
The manipulation done on each iteration:
if (!isset($consumption[$material_name]['category_name'])) {
    $consumption[$material_name]['category_name'] = $category_name;
}
if (!isset($consumption[$material_name]['material_name'])) {
    $consumption[$material_name]['material_name'] = $material_name;
}
if (isset($consumption[$material_name][$date])) {
    $consumption[$material_name][$date] += $consumption_value;
} else {
    $consumption[$material_name][$date] = $consumption_value;
}
if (isset($consumption_by_invoices["{$material_name}_$date"][$invoice_id])) {
    $consumption_by_invoices["{$material_name}_$date"][$invoice_id] += $consumption_value;
} else {
    $consumption_by_invoices["{$material_name}_$date"][$invoice_id] = $consumption_value;
}
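As a side note, the repeated isset()/else branches can be collapsed with the null-coalescing operators (??= requires PHP 7.4+). This is a minimal sketch of the same per-row logic; the sample variable values are assumptions for illustration, standing in for the fields of one database row:

```php
<?php
// Sample values for one row (assumptions for illustration only).
$category_name     = 'paint';
$material_name     = 'primer';
$date              = '2024-01-01';
$invoice_id        = 42;
$consumption_value = 3.5;

$consumption = [];
$consumption_by_invoices = [];

// ??= assigns only when the key is not yet set (PHP 7.4+).
$consumption[$material_name]['category_name'] ??= $category_name;
$consumption[$material_name]['material_name'] ??= $material_name;

// (?? 0) replaces the isset()/else branches around the running sums.
$consumption[$material_name][$date] =
    ($consumption[$material_name][$date] ?? 0) + $consumption_value;
$consumption_by_invoices["{$material_name}_$date"][$invoice_id] =
    ($consumption_by_invoices["{$material_name}_$date"][$invoice_id] ?? 0)
    + $consumption_value;
```

This mostly improves readability; with 300,000 iterations the real gains come from parallelizing the loop, as described below.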
-
The most obvious way to speed up a large loop is to split it into smaller loops and run them in parallel.
Before:
<?php // test.php
$start = time();
$result = 0;
for ($i = 0; $i < 100000; ++$i) {
    ++$result;
    usleep(100);
}
printf("%d\nin %d sec\n", $result, time() - $start);
After:
<?php // test.php
if (count($argv) > 1) {
    // Worker mode: process only the chunk passed as an argument.
    [, $limit] = $argv;
    $result = 0;
    for ($i = 0; $i < $limit; ++$i) {
        ++$result;
        usleep(100);
    }
    echo "$result\n";
    exit(0);
} else {
    // Parent mode: launch 4 workers in parallel via xargs -P 4.
    $start = time();
    exec('echo 25000 25000 25000 25000 | xargs -P 4 -n 1 php test.php', $results);
    $sum = 0;
    foreach ($results as $result) {
        $sum += (int) $result;
    }
    printf("%d\nin %d sec\n", $sum, time() - $start);
}
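Applied to the question's aggregation, the workers cannot write into one shared $consumption array: each process has its own memory. One common pattern (a sketch under that assumption, not the only way) is for every worker to print its partial sums as JSON and for the parent to merge them, summing values for identical material/date keys:

```php
<?php
// Hypothetical partial results, as two workers might print them.
$worker_outputs = [
    '{"primer":{"2024-01-01":3.5}}',
    '{"primer":{"2024-01-01":1.5},"paint":{"2024-01-02":2.0}}',
];

$consumption = [];
foreach ($worker_outputs as $json) {
    foreach (json_decode($json, true) as $material_name => $by_date) {
        foreach ($by_date as $date => $value) {
            // Merge: sum partial values that share the same key.
            $consumption[$material_name][$date] =
                ($consumption[$material_name][$date] ?? 0) + $value;
        }
    }
}
```

In the real script each worker would build its chunk of $consumption as in the question and finish with echo json_encode($consumption); the parent collects the lines from exec() and merges them as above.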