# How can I quickly compute the median of a DataFrame over a rolling window?

• I generate a random dataset and compute the median over a rolling window of 1,000 values:

``````
%%time
import numpy as np
import pandas as pd

sr = pd.Series(np.random.randint(0, 100, size=20000))
for i in range(10):  # repeat the same rolling median 10 times
    sr.rolling(1000).apply(lambda x: np.median(x))
``````

Result:

Wall time: 28.8 s

That is about three seconds per pass. I need to run many such calculations, and the real data has 0.5M rows, not 20k.

How can a moving median be computed faster?

• Use the built-in pandas methods:

``````
In [265]: %timeit sr.rolling(1000).median()
14.4 ms ± 513 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
In [266]: %timeit sr.rolling(1000).apply(lambda x: np.median(x))
1.72 s ± 88.3 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
``````

On my laptop, the built-in method is about 119 times faster.
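As a quick sanity check, the built-in rolling median should return the same values as the `apply` version. The sketch below compares the two on a smaller series. The seed, the series size, the window of 100, and the names `fast`, `slow`, and `same` are assumptions chosen for illustration, not part of the original measurement.

```python
import numpy as np
import pandas as pd

# Fixed seed and smaller sizes are assumptions, used only to keep the check quick.
rng = np.random.default_rng(0)
sr = pd.Series(rng.integers(0, 100, size=2000))

# Built-in rolling median.
fast = sr.rolling(100).median()

# Same computation through apply; raw=True hands NumPy arrays to np.median.
slow = sr.rolling(100).apply(np.median, raw=True)

# The first 99 positions are NaN in both results, so compare with equal_nan=True.
same = np.allclose(fast, slow, equal_nan=True)
print(same)
```

If the two results agree, switching to the built-in method changes only the run time, not the output.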

P.S. A question in return: why do this many times? If you need the medians of different columns over the same sliding window, that can also be done with vectorized pandas methods, without loops.
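For the multi-column case, a minimal sketch: calling `rolling(...).median()` on a whole DataFrame computes the rolling median of every column in one call. The column names `a`, `b`, `c`, the sizes, and the seed are hypothetical and stand in for the real data.

```python
import numpy as np
import pandas as pd

# Hypothetical data: three integer columns, 5,000 rows.
rng = np.random.default_rng(0)
df = pd.DataFrame(rng.integers(0, 100, size=(5000, 3)), columns=["a", "b", "c"])

# One call computes the rolling median of each column independently;
# no Python-level loop over columns or windows is needed.
medians = df.rolling(1000).median()

print(medians.tail())
```

Each column of `medians` matches what `df[col].rolling(1000).median()` would give for that column, and the first 999 rows are NaN because the window is not yet full.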
