Opened 8 years ago
Last modified 8 years ago
#35 new task
What are dp and db supposed to be?
Reported by: | dneise | Owned by: | somebody |
---|---|---|---|
Priority: | minor | Milestone: | milestone:milestone1 |
Component: | component1 | Version: | |
Keywords: | Cc: |
Description
In line 188 and 189 of ratecontrol.cc
two values called dp and db are caluclated.
What are they supposed to mean?
Change History (3)
comment:1 by , 8 years ago
comment:2 by , 8 years ago
Ah ... well ... in that case I believe there is a little bug in the implementation.
I show this using the example of the patch rates with a little python program, you can copy this into your ipython shell
import numpy as np np.random.seed(0) # let's get 160 random rates around the value 70, with a sigma of 10. patch_rates = np.random.normal(70, 10, 160) # let's print the median and the std-deviation for reference print('median from numpy:', np.median(patch_rates)) print('68% around the median:', np.diff(np.percentile(patch_rates, [50-34, 50+34])) / 2) ## from this point on I use variable names from ratecontrol.cc and re-do what is done there medp = np.copy(patch_rates) # make new vector called `medp` containing the patch_rates medp = np.sort(medp) # sort it in place devp = np.abs(patch_rates - medp) # make a new vector `devp`, containing the elementwise difference between the original unsorted patch_rates and `medp`. devp = np.sort(devp) # sort `devp` in place mp = (medp[79]+medp[80])/2; dp = devp[109]; print('median from ratecontrol:', mp) print('68% quantile from ratecontrol:', dp)
We see, the median works well, but dp is not the same as the 68% quantile. It is a little larger.
median from numpy: 70.9409611944 68% around the median: [ 10.35922294] median from ratecontrol: 70.9409611944 68% quantile from ratecontrol: 14.5027351115
Even more interesting, dp depends on the order of the input data,
because of line 180 in ratecontrol.cc. One can see it in this example.
import numpy as np np.random.seed(0) patch_rates = np.random.normal(70, 10, 160) # let's print the median and the std-deviation for reference print('68% around the median:', np.diff(np.percentile(patch_rates, [50-34, 50+34])) / 2) # now let's repeat it 3 times but we shuffle the input data for test in range(3): np.random.shuffle(patch_rates) medp = np.copy(patch_rates) # make new vector called `medp` containing the patch_rates medp = np.sort(medp) # sort it in place devp = np.abs(patch_rates - medp) # make a new vector `devp`, containing the elementwise difference between the original unsorted patch_rates and `medp`. devp = np.sort(devp) # sort `devp` in place mp = (medp[79]+medp[80])/2; dp = devp[109]; print('68% quantile from ratecontrol:', dp)
In my case the output is
68% around the median: [ 10.35922294] 68% quantile from ratecontrol: 14.0650573838 68% quantile from ratecontrol: 15.5652876525 68% quantile from ratecontrol: 14.8008871673
I think the C++ implementation of this could e.g. look like this:
// Calculate Median and deviation vector<float> medp(sdata.fPatchRate, sdata.fPatchRate+160); sort(medp.begin(), medp.end()); const double mp = (medp[79]+medp[80])/2; const double dp1 = medp[135] - mp; // or const double dp2 = (medp[135] - medp[25]) /2 ;
the result from a python implementation looks like this then:
median from numpy: 70.9409611944 68% around the median: [ 10.35922294] mp: 70.9409611944 dp1: 11.0828372935 dp2: 10.5760103711
Last but not least ... this little bug, I believe, has a very limited impact on operation, if it has an impact at all. Since dp, is multiplied with 3.5 in line 236, I think, and this 3.5 has been chosen carefully by hand. So one would fix the implementation and thus reduce dp a little bit, one would simply have to increase this factor 3.5 a little bit an nothing would change overall.
Anyway I though I should mention this, since I was puzzled quite a lot when reading the code trying to understand it. From the variable name, I was guessing that dp, might stand for "delta_p" which is sometimes used as "sigma_p" depending on context, but due to the bug in the implementation I was unsure if I misinterpreted the variable name or whether it was an actual bug.
Now it's clear, thanks a lot.
comment:3 by , 8 years ago
I agree. There is a bug :( Fortunately, it seems that a solution is not very urgent, given that it works already (by chance) now for several years. However, I will look into it when I find the time.
This is the 68% quantile around the median.