How do we remove the measurement errors?



  • [image]

    There is a set of points; each point is a GPS coordinate (x, y) of a bus and has a timestamp. The plot shows obvious measurement errors. How can they be removed? The solution must be simple, because there are about 100,000 coordinates in total. Interesting ideas are welcome, but it should be possible to implement them in Java without much trouble.

    Sample of the source data:

      1447037729  <tab>  3054.619968  <tab>  2409.828279  <tab>  570d8
    

    The first field is the UNIX time, the second and third are (x, y) respectively, and the fourth is the bus identifier (there are about 50 buses). Source data: https://drive.google.com/file/d/0B4bA9d5B_O_BcVpPUXpYTmZBUFE/view
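
    Since any filtering has to run per bus and in time order, a record in this format could be read and grouped in Java roughly as follows. This is only a sketch: `TrackReader`, `Fix` and `groupByBus` are illustrative names, and it assumes the fields are separated by tabs or other whitespace.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class TrackReader {
    /** One GPS fix: UNIX time, planar coordinates, bus identifier. */
    static final class Fix {
        final long time;
        final double x;
        final double y;
        final String busId;

        Fix(long time, double x, double y, String busId) {
            this.time = time;
            this.x = x;
            this.y = y;
            this.busId = busId;
        }
    }

    /** Parses one line: time, x, y, bus id (assumed whitespace/tab separated). */
    static Fix parse(String line) {
        String[] f = line.trim().split("\\s+");
        return new Fix(Long.parseLong(f[0]),
                       Double.parseDouble(f[1]),
                       Double.parseDouble(f[2]),
                       f[3]);
    }

    /** Groups fixes by bus id and sorts each track by timestamp. */
    static Map<String, List<Fix>> groupByBus(List<String> lines) {
        Map<String, List<Fix>> tracks = new LinkedHashMap<>();
        for (String line : lines) {
            if (line.trim().isEmpty()) {
                continue; // skip blank lines
            }
            Fix fix = parse(line);
            tracks.computeIfAbsent(fix.busId, k -> new ArrayList<>()).add(fix);
        }
        for (List<Fix> track : tracks.values()) {
            track.sort(Comparator.comparingLong((Fix f) -> f.time));
        }
        return tracks;
    }
}
```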



  • The set of points is really two series, x(t) and y(t), and each of them contains impulse noise. Such data is ideally suited to median filtering in a sliding window of 7-9 elements, where the i-th element is replaced by the median of the elements with indices from i-h to i+h, with h = 3 or 4.
    x(t) and y(t) should be processed independently, after which the original values are replaced with the filtered ones.
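
    As a rough Java sketch of the filter described above (the class and method names are illustrative, and each window is simply re-sorted rather than maintained incrementally, which is still cheap for about 100,000 points and a window of 7-9 elements):

```java
import java.util.Arrays;

final class MedianFilter {
    /**
     * Returns a filtered copy of a. Element i becomes the median of
     * a[i-k .. i+k], where k = min(h, i, n-1-i), so the window shrinks
     * symmetrically near both ends of the array.
     */
    static double[] filter(double[] a, int h) {
        int n = a.length;
        double[] out = new double[n];
        for (int i = 0; i < n; i++) {
            int k = Math.min(h, Math.min(i, n - 1 - i));
            double[] window = Arrays.copyOfRange(a, i - k, i + k + 1);
            Arrays.sort(window);
            out[i] = window[k]; // middle element of 2k+1 sorted values
        }
        return out;
    }
}
```

    The x and y series of each bus would be passed through `filter` separately, for example with h = 3.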

    The filter remains effective even at a high level of impulse noise (in the test example a third of the data is distorted). An additional advantage is that the format of the source data is preserved. The edges are processed with smaller windows.
    The drawback of sliding-window processing shows up on discontinuous sequences, because peaks and dips narrower than h are smoothed out as well.

    The demo program keeps the window contents sorted. To do so, the stretch between the old (removed) element and the new (added) element is shifted toward the old element, after which the new element is written into the slot freed at the end of that stretch. This sharply reduces the computational cost.
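
    The same update step might look like this in Java. It is a minimal sketch with hypothetical names (`SortedWindow`, `replace`); the old element is located by a linear scan, which is adequate for windows of 7-9 elements.

```java
final class SortedWindow {
    /**
     * Replaces one occurrence of oldValue with newValue in the sorted
     * array w and keeps w sorted, shifting only the elements that lie
     * between the two values.
     */
    static void replace(double[] w, double oldValue, double newValue) {
        int i = 0;
        while (i < w.length && w[i] != oldValue) {
            i++; // locate the element leaving the window
        }
        if (i == w.length) {
            throw new IllegalArgumentException("old value not in window");
        }
        if (newValue > oldValue) {
            // shift smaller neighbours left, toward the old element
            while (i + 1 < w.length && w[i + 1] < newValue) {
                w[i] = w[i + 1];
                i++;
            }
        } else {
            // shift larger neighbours right, toward the old element
            while (i > 0 && w[i - 1] > newValue) {
                w[i] = w[i - 1];
                i--;
            }
        }
        w[i] = newValue; // the freed slot at the end of the shifted stretch
    }
}
```

    After each call the median is simply the middle element `w[h]`, so no re-sorting is needed as the window slides.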

    Demo programme (PHP):

    // Print an array with a label.
    function print_a($a, $name){
        print("$name: ");
        foreach($a as $item){
            printf("%2d, ",$item);
        }
    }

    function slide_median($h, $a){
        $size = count($a);
        $result = [];
        $slide = [];
        // Leading edge: windows of 1, 3, ..., 2h+1 elements, re-sorted each time.
        array_push($slide, reset($a));
        array_push($result, $slide[0]);
        print_a($slide, "&emsp;Sorted window");
        print_a($result, "<br>Result array");

        for($i=1; $i<=$h; $i++){
            array_push($slide, next($a), next($a));
            sort($slide);
            array_push($result, $slide[$i]);
            print_a($slide, "&emsp;Sorted window");
            print_a($result, "<br>Result array");
        }

        // Main part: replace the oldest element with the newest one,
        // shifting only the stretch of the sorted window between them.
        for($i=0; $i < $size-2*$h-1; $i++){
            $old = $a[$i];
            $new = $a[$i+2*$h+1];
            if($old < $new){
                for($key = 0; $key <= 2*$h; $key++){
                    if($new < $slide[$key]){
                        break;
                    }
                    // "?? $new" avoids reading past the end of the window
                    if(($old <= $slide[$key])&&($slide[$key] < $new)) $slide[$key] = $slide[$key+1] ?? $new;
                }
                $slide[$key-1] = $new;
            }
            if($old > $new){
                for($key = 2*$h; $key >= 0; $key--){
                    if($new > $slide[$key]){
                        break;
                    }
                    // "?? $new" avoids reading before the start of the window
                    if(($old >= $slide[$key])&&($slide[$key] > $new)) $slide[$key] = $slide[$key-1] ?? $new;
                }
                $slide[$key+1] = $new;
            }
            array_push($result, $slide[$h]);
            print("&emsp;old = $old, new = $new");
            print_a($slide, "&emsp;Sorted window");
            print_a($result, "<br>Result array");
        }

        // Trailing edge: windows of 2h-1, ..., 3 elements, then the last element.
        for($i = $h-1; $i > 0; $i--){
            $slide = array_slice($a, $size-2*$i-1, 2*$i+1);
            sort($slide);
            array_push($result, $slide[$i]);
            print_a($slide, "&emsp;Sorted window");
            print_a($result, "<br>Result array");
        }
        $slide = [$a[$size-1]];
        array_push($result, $slide[0]);
        print_a([end($a)], "&emsp;Sorted window");
        print_a($a, "<br><br>Source array");
        print_a($result, "<br>Result array");

        return $result;
    }

    // Test data: a ramp 20..40 with impulse noise of amplitude 5.
    $a = range(20, 40);
    foreach($a as &$item){
        $item += 5 * mt_rand(-1,1) * (int)(mt_rand(0,199)/100);
    }
    unset($item); // drop the dangling reference left by foreach
    print_a($a, "Source array");
    slide_median(3, $a);

    Results (impulse noise, amplitude 5):

    Source array: 20, 21, 22, 23, 19, 20, 21, 27, 28, 29, 30, 31, 32, 28, 29, 35, 36, 42, 43, 39, 35,  Sorted window: 20,
    Result array: 20,  Sorted window: 20, 21, 22,
    Result array: 20, 21,  Sorted window: 19, 20, 21, 22, 23,
    Result array: 20, 21, 21,  Sorted window: 19, 20, 20, 21, 21, 22, 23,
    Result array: 20, 21, 21, 21,  old = 20, new = 27  Sorted window: 19, 20, 21, 21, 22, 23, 27,
    Result array: 20, 21, 21, 21, 21,  old = 21, new = 28  Sorted window: 19, 20, 21, 22, 23, 27, 28,
    Result array: 20, 21, 21, 21, 21, 22,  old = 22, new = 29  Sorted window: 19, 20, 21, 23, 27, 28, 29,
    Result array: 20, 21, 21, 21, 21, 22, 23,  old = 23, new = 30  Sorted window: 19, 20, 21, 27, 28, 29, 30,
    Result array: 20, 21, 21, 21, 21, 22, 23, 27,  old = 19, new = 31  Sorted window: 20, 21, 27, 28, 29, 30, 31,
    Result array: 20, 21, 21, 21, 21, 22, 23, 27, 28,  old = 20, new = 32  Sorted window: 21, 27, 28, 29, 30, 31, 32,
    Result array: 20, 21, 21, 21, 21, 22, 23, 27, 28, 29,  old = 21, new = 28  Sorted window: 27, 28, 28, 29, 30, 31, 32,
    Result array: 20, 21, 21, 21, 21, 22, 23, 27, 28, 29, 29,  old = 27, new = 29  Sorted window: 28, 28, 29, 29, 30, 31, 32,
    Result array: 20, 21, 21, 21, 21, 22, 23, 27, 28, 29, 29, 29,  old = 28, new = 35  Sorted window: 28, 29, 29, 30, 31, 32, 35,
    Result array: 20, 21, 21, 21, 21, 22, 23, 27, 28, 29, 29, 29, 30,  old = 29, new = 36  Sorted window: 28, 29, 30, 31, 32, 35, 36,
    Result array: 20, 21, 21, 21, 21, 22, 23, 27, 28, 29, 29, 29, 30, 31,  old = 30, new = 42  Sorted window: 28, 29, 31, 32, 35, 36, 42,
    Result array: 20, 21, 21, 21, 21, 22, 23, 27, 28, 29, 29, 29, 30, 31, 32,  old = 31, new = 43  Sorted window: 28, 29, 32, 35, 36, 42, 43,
    Result array: 20, 21, 21, 21, 21, 22, 23, 27, 28, 29, 29, 29, 30, 31, 32, 35,  old = 32, new = 39  Sorted window: 28, 29, 35, 36, 39, 42, 43,
    Result array: 20, 21, 21, 21, 21, 22, 23, 27, 28, 29, 29, 29, 30, 31, 32, 35, 36,  old = 28, new = 35  Sorted window: 29, 35, 35, 36, 39, 42, 43,
    Result array: 20, 21, 21, 21, 21, 22, 23, 27, 28, 29, 29, 29, 30, 31, 32, 35, 36, 36,  Sorted window: 35, 36, 39, 42, 43,
    Result array: 20, 21, 21, 21, 21, 22, 23, 27, 28, 29, 29, 29, 30, 31, 32, 35, 36, 36, 39,  Sorted window: 35, 39, 43,
    Result array: 20, 21, 21, 21, 21, 22, 23, 27, 28, 29, 29, 29, 30, 31, 32, 35, 36, 36, 39, 39,  Sorted window: 35,

    Source array: 20, 21, 22, 23, 19, 20, 21, 27, 28, 29, 30, 31, 32, 28, 29, 35, 36, 42, 43, 39, 35,
    Result array: 20, 21, 21, 21, 21, 22, 23, 27, 28, 29, 29, 29, 30, 31, 32, 35, 36, 36, 39, 39, 35,

    The sliding median and the sliding average were compared on strong impulse noise with the following program:

    // Print an array with a label.
    function print_a($a, $name){
        print("$name: ");
        foreach($a as $item){
            printf("%3d, ",$item);
        }
    }

    function slide_median($h, $a){
        $size = count($a);
        $result = [];
        $slide = [];
        // Leading edge: windows of 1, 3, ..., 2h+1 elements.
        array_push($slide, reset($a));
        array_push($result, $slide[0]);

        for($i=1; $i<=$h; $i++){
            array_push($slide, next($a), next($a));
            sort($slide);
            array_push($result, $slide[$i]);
        }

        // Main part: incremental update of the sorted window.
        for($i=0; $i < $size-2*$h-1; $i++){
            $old = $a[$i];
            $new = $a[$i+2*$h+1];
            if($old < $new){
                for($key = 0; $key <= 2*$h; $key++){
                    if($new < $slide[$key]){
                        break;
                    }
                    // "?? $new" avoids reading past the end of the window
                    if(($old <= $slide[$key])&&($slide[$key] < $new)) $slide[$key] = $slide[$key+1] ?? $new;
                }
                $slide[$key-1] = $new;
            }
            if($old > $new){
                for($key = 2*$h; $key >= 0; $key--){
                    if($new > $slide[$key]){
                        break;
                    }
                    // "?? $new" avoids reading before the start of the window
                    if(($old >= $slide[$key])&&($slide[$key] > $new)) $slide[$key] = $slide[$key-1] ?? $new;
                }
                $slide[$key+1] = $new;
            }
            array_push($result, $slide[$h]);
        }

        // Trailing edge: shrinking windows, then the last element.
        for($i = $h-1; $i > 0; $i--){
            $slide = array_slice($a, $size-2*$i-1, 2*$i+1);
            sort($slide);
            array_push($result, $slide[$i]);
        }
        $slide = [$a[$size-1]];
        array_push($result, $slide[0]);
        print_a($a, "<br><br>Source array ");
        print_a($result, "<br>Median array &emsp;");

        return $result;
    }

    function slide_average($h, $a){
        $size = count($a);
        $b = array_merge([0], $a); // copy shifted by one; its pointer tracks the element leaving the window
        $sum = reset($a);
        $result = [$sum];

        // Leading edge: growing windows of 3, 5, ..., 2h+1 elements.
        for($i=1; $i<=$h; $i++){
            $sum += next($a)+next($a);
            $average = (int)($sum/(2*$i+1)+.5);
            array_push($result, $average);
        }

        // Main part: add the newest element, subtract the oldest.
        reset($b);
        for($i=0; $i < $size-2*$h-1; $i++){
            $sum += next($a) - next($b);
            $average = (int)($sum/(2*$h+1)+.5);
            array_push($result, $average);
        }

        // Trailing edge: drop the two oldest elements at each step.
        for($i = $h-1; $i >= 0; $i--){
            $sum -= (next($b) + next($b));
            $average = (int)($sum/(2*$i+1)+.5);
            array_push($result, $average);
        }
        print_a($a, "<br><br>Source array ");
        print_a($result, "<br>Average array &ensp;");
        return $result;
    }

    // Test data: a ramp 200..240 with impulse noise of amplitude 50.
    $a = range(200, 240);
    foreach($a as &$item){
        $item += 50 * mt_rand(-1,1) * (int)(mt_rand(0,149)/100);
    }
    unset($item); // drop the dangling reference left by foreach
    slide_median(3, $a);
    slide_average(3, $a);

    Results:

    Source array  : 200, 201, 202, 203, 204, 155, 206, 207, 208, 209, 210, 211, 212, 213, 264, 265, 216, 217, 219, 220, 221, 222, 223, 224, 225, 226, 227, 228, 229, 230, 231, 232, 183, 234, 235, 236, 240, 238
    Median array  : 200, 201, 202, 202, 203, 204, 206, 207, 208, 209, 210, 211, 212, 213, 216, 217, 219, 219, 219, 220, 221, 222, 223, 224, 225, 226, 227, 228, 229, 229, 230, 231, 232, 235, 236, 238

    Source array  : 200, 201, 202, 203, 204, 155, 206, 207, 208, 209, 210, 211, 212, 213, 264, 265, 216, 217, 219, 220, 221, 222, 223, 224, 225, 226, 227, 228, 229, 230, 231, 232, 183, 234, 235, 236, 240, 238
    Average array : 200, 201, 202, 196, 197, 198, 199, 200, 201, 209, 210, 218, 226, 227, 228, 229, 230, 231, 225, 219, 220, 221, 222, 223, 224, 225, 225, 226, 227, 228, 229, 223, 224, 225, 226, 240, 236, 244,

    It can be seen that the sliding median copes better with smoothing out strong random outliers in the data.
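
    The effect is easy to check on a single window. The sketch below (illustrative class name, values taken from the source array above around the outlier 155) compares the mean and the median of one 7-element window:

```java
import java.util.Arrays;

final class OutlierDemo {
    static double mean(double[] w) {
        double sum = 0;
        for (double v : w) {
            sum += v;
        }
        return sum / w.length;
    }

    static double median(double[] w) {
        double[] sorted = w.clone(); // leave the caller's array untouched
        Arrays.sort(sorted);
        return sorted[sorted.length / 2]; // odd-length windows only
    }

    public static void main(String[] args) {
        // Window of 7 values centred on the outlier 155.
        double[] window = {202, 203, 204, 155, 206, 207, 208};
        // The mean is about 197.9, pulled down by the outlier;
        // the median is 204, close to the neighbouring values.
        System.out.println("mean = " + mean(window) + ", median = " + median(window));
    }
}
```

    A single outlier shifts the mean of the whole window, so it leaks into every averaged value within h positions, while the median ignores it as long as outliers make up less than half of the window.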

    For real data (x, y rounded to integers):

    Processing of x for 80c48

    Source array  : 10790, 10728, 10565, 10228, 10148, 9911, 9861, 9880, 9894, 9907, 9910, 9917, 9932, 9937, 9925, 9684, 7146, 9040, 8912, 8703, 8350, 8338, 8087, 2571

    Source array  : 10790, 10728, 10565, 10228, 10148, 9911, 9861, 9880, 9894, 9907, 9910, 9917, 9932, 9937, 9925, 9684, 7146, 9040, 8912, 8703, 8350, 8338, 8087, 2571
    Average array : 6090, 10694, 10492, 10319, 10189, 10069, 9927, 9893, 9904, 9912, 9917, 9918, 9886, 9839, 9644, 9498, 9323, 9117, 8927, 8749, 8461, 8418, 8150, 7346, 7723, 7645

    Processing of y for 80c48

    4918, 3528,
    36, 38, 3418,

    4918, 3528,
    3589, 7269, 7230, 7197, 7141, 7071, 6991, 6887, 6775, 6475, 6305, 6114, 5924, 3329, 3544, 5375, 5260, 5110, 5083, 4711, 5087, 5067, 5056, 5051, 5031, 4990

    Compared with the sliding-average algorithm, the sliding median processes the data much more accurately.

    See also: https://math.stackexchange.com/questions/2206175/position-triangulation-of-moving-nodes/2268919#2268919



