Logicbro wrote: Simple comparison... Ontario 26000 tests, 180 positive, Florida 30000 tests 3000 positive. It doesn't take a rocket scientist to see that Ontario is in much better shape at a 0.7 percent positive rate vs 10%.
What about the underlying distribution of the data and not just the raw numbers?
What if the age of the patient for 25% of the Ontario tests was 60+ years old and the age of the patient for 75% of the Florida tests was 60+ years old?
What if 40% of the total population in Ontario is aged 60+ years, while 60% of the total population in Florida falls within that same age bracket?
Would that still make for a simple comparison, or would the unequal age distribution among the patients being tested, along with the underlying distribution of ages within each province/state, have an effect on the outcome?
As far as the underlying data is concerned, there are 2 factors that can be controlled to manipulate the number of new cases and positive test rate among those being tested.
#1 - The number of tests administered.
In this case, given a fixed positive test rate, as the number of tests increases, so too does the number of positive tests.
Just the same as the number of times you roll a 7 increases as you attempt more rolls of the dice, the number of positive tests increases as more and more tests are administered.
Even if the probabilities of the expected outcomes are exactly equal, as they are with a coin flip, the raw number of heads/tails increases along with the number of attempts, since one can't happen without the other: you can't flip heads/tails without flipping the coin.
Same with the relationship between tests and positive tests.
A patient cannot test positive without first being tested. As more people are tested, the number of positive test cases will increase in proportion to the number of tests, by a factor defined by the positive test rate.
The number of positive test cases increases at 1/10 the rate of the number of tests when the positive test rate is 10%. Just the same, the number of positive test cases increases at 1/20 the rate of the number of tests when the positive test rate is 5% instead.
If positive test rate = 5% and # of tests = 100, the raw number of positive tests is 5 patients. If the # of tests were to increase to 500, the raw number of positive tests would increase to 25 because 5 times as many people were tested, not because the virus became 5 times more deadly compared to the period you're comparing it against.
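To make the arithmetic concrete, here is a minimal Python sketch. The function name expected_positives is my own for illustration, and it assumes the positive test rate stays fixed no matter how many tests are run.

```python
def expected_positives(num_tests, positive_rate):
    """Expected positive results, assuming a fixed positive test rate."""
    return num_tests * positive_rate


# Same 5% rate, five times as many tests, five times as many positives.
for num_tests in (100, 500):
    positives = expected_positives(num_tests, 0.05)
    print(f"{num_tests} tests at 5% -> about {positives:.0f} positive tests")
```

The rate never changes in this sketch. Only the number of tests does, and the raw count follows it.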
#2 - The underlying distribution of data.
In this case, given a consistent number of daily tests (e.g. 1,000 per day), as the age of the patient increases, so too does the positive test rate.
If you were to roll a set of dice 1,000 times, would you expect to roll a 2 as often as a 7?
FYI... here are the probabilities for all possible outcomes when rolling a pair of 6-sided dice.
2 = 1/36 (2.778%)
3 = 2/36 (5.556%)
4 = 3/36 (8.333%)
5 = 4/36 (11.111%)
6 = 5/36 (13.889%)
7 = 6/36 (16.667%)
8 = 5/36 (13.889%)
9 = 4/36 (11.111%)
10 = 3/36 (8.333%)
11 = 2/36 (5.556%)
12 = 1/36 (2.778%)
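If you'd rather check that table than take my word for it, it can be reproduced by counting all 36 equally likely combinations of two fair 6-sided dice. A short Python sketch (the function name two_dice_counts is just something I made up for this):

```python
from collections import Counter
from itertools import product


def two_dice_counts(sides=6):
    """Count how many of the sides*sides combinations produce each total."""
    faces = range(1, sides + 1)
    return Counter(a + b for a, b in product(faces, repeat=2))


counts = two_dice_counts()
total_outcomes = sum(counts.values())  # 36 for two 6-sided dice

for total in sorted(counts):
    share = counts[total] / total_outcomes
    print(f"{total} = {counts[total]}/{total_outcomes} ({share:.3%})")
```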
What if you only recorded the outcomes of every 3rd roll instead of every single outcome? What if half of the rolls were made with a single die instead of two? What if one of the dice was weighted or changed in some other way that would affect the expected outcomes?
Would you still expect the raw numbers from those 1,000 dice rolls to approximate these expected probabilities or would they be different because the underlying conditions of your test vs. reality are different?
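Take just one of those what-ifs: half of the rolls made with a single die. Here is a rough Python sketch of what that does to the expected probabilities. It assumes fair dice and an exact 50/50 mix, and the helper names (two_dice_pmf, one_die_pmf, mix) are mine, purely for illustration.

```python
from fractions import Fraction
from itertools import product


def two_dice_pmf():
    """Exact probability of each total for two fair 6-sided dice."""
    pmf = {}
    for a, b in product(range(1, 7), repeat=2):
        pmf[a + b] = pmf.get(a + b, Fraction(0)) + Fraction(1, 36)
    return pmf


def one_die_pmf():
    """Exact probability of each face for one fair 6-sided die."""
    return {face: Fraction(1, 6) for face in range(1, 7)}


def mix(pmf_a, pmf_b, weight_a):
    """Blend two distributions: weight_a from pmf_a, the rest from pmf_b."""
    outcomes = sorted(set(pmf_a) | set(pmf_b))
    return {
        o: weight_a * pmf_a.get(o, 0) + (1 - weight_a) * pmf_b.get(o, 0)
        for o in outcomes
    }


# Assumption: exactly half the rolls use two dice, half use a single die.
mixed = mix(two_dice_pmf(), one_die_pmf(), Fraction(1, 2))

for outcome, p in mixed.items():
    print(f"{outcome:>2}: {float(p):.3%}")
```

Under that mix a 7 drops from 6/36 to 1/12, while a 2 climbs from 1/36 to 7/72. Same dice, same number of rolls, different numbers because the underlying conditions changed.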
Suppose that, based on the data collected from previous periods of testing, the positive test rate for people aged 60 or younger = 5% and the positive test rate for people aged 60+ = 10%. What happens when the share of tests administered to each group ranges from 25% to 75%, as in these 3 scenarios?
SCENARIO #1: 50/50 split between both age groups
500 people aged 60 or younger tested + 500 people aged 60+ tested
500 x 5% = 25 positive tests
500 x 10% = 50 positive tests
1,000 tests with 75 positive tests = 7.5% positive test rate
SCENARIO #2: 25/75 split between both age groups
250 people aged 60 or younger tested + 750 people aged 60+ tested
250 x 5% = 12.5 positive tests, rounded up to 13 positive tests
750 x 10% = 75 positive tests
1,000 tests with 88 positive tests = 8.8% positive test rate
SCENARIO #3: 75/25 split between both age groups
750 people aged 60 or younger tested + 250 people aged 60+ tested
750 x 5% = 37.5 positive tests, rounded up to 38 positive tests
250 x 10% = 25 positive tests
1,000 tests with 63 positive tests = 6.3% positive test rate
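For anyone who wants to plug in their own numbers, the three scenarios boil down to a weighted average of the two group rates. Here is a minimal Python sketch. The 5% and 10% rates are the hypothetical ones from this post, not real data, and the names (blended_rate, GROUP_RATES, SCENARIOS) are mine.

```python
from fractions import Fraction

# Hypothetical per-group positive test rates used in this post.
GROUP_RATES = {"60 or younger": Fraction(5, 100), "60+": Fraction(10, 100)}

# Tests administered to each group in each scenario (1,000 total each time).
SCENARIOS = {
    1: {"60 or younger": 500, "60+": 500},
    2: {"60 or younger": 250, "60+": 750},
    3: {"60 or younger": 750, "60+": 250},
}


def blended_rate(tests_by_group, rate_by_group):
    """Overall positive test rate as a test-weighted average of group rates."""
    total_tests = sum(tests_by_group.values())
    positives = sum(n * rate_by_group[g] for g, n in tests_by_group.items())
    return positives / total_tests


for number, tests in SCENARIOS.items():
    rate = blended_rate(tests, GROUP_RATES)
    print(f"Scenario #{number}: {float(rate):.2%} positive test rate")
```

Without rounding the half-tests up to whole people, Scenario #2 works out to 8.75% and Scenario #3 to 6.25%, which is where the 8.8% and 6.3% figures come from.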
Going from Scenario #1 to Scenario #2 = The positive test rate increased by 1.3 percentage points. The media, health officials and government report things are getting worse.
Going from Scenario #3 to Scenario #1 = The positive test rate increased by 1.2 percentage points. The media, health officials and government report things are getting worse.
Going from Scenario #3 to Scenario #2 = The positive test rate increased by 2.5 percentage points. The media, health officials and government report things are getting much worse.
Going from Scenario #1 to Scenario #3 = The positive test rate decreased by 1.2 percentage points. The media, health officials and government report things are getting better.
Going from Scenario #2 to Scenario #1 = The positive test rate decreased by 1.3 percentage points. The media, health officials and government report things are getting better.
Going from Scenario #2 to Scenario #3 = The positive test rate decreased by 2.5 percentage points. The media, health officials and government report things are getting much better.
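All six of those comparisons can be generated from the three scenario totals. A quick Python sketch, using the rounded positive counts per 1,000 tests quoted above (75, 88 and 63); the names here are hypothetical ones of my own:

```python
from itertools import permutations

# Positive tests per 1,000 tests in each scenario (rounded figures from above).
POSITIVES_PER_1000 = {1: 75, 2: 88, 3: 63}


def change_in_points(before, after):
    """Change in positive test rate, in percentage points, between scenarios."""
    return (POSITIVES_PER_1000[after] - POSITIVES_PER_1000[before]) / 10


for before, after in permutations(POSITIVES_PER_1000, 2):
    delta = change_in_points(before, after)
    direction = "increased" if delta > 0 else "decreased"
    print(
        f"Scenario #{before} to Scenario #{after}: "
        f"{direction} by {abs(delta):.1f} percentage points"
    )
```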
Same raw number of tests. Different results and conclusions, depending on the distribution of the underlying data in the current period and in the preceding period it is compared against.
Do you see how easily the numbers can be manipulated?
If 1,000 tests are administered under all 3 scenarios, why are the number of new cases and the positive test rate so different from one set of sample tests to the next?
While Ontario and Florida might have similar numbers of tests being administered, there are many other underlying factors that can contribute to the number of new cases, positive test rate, and other experienced outcomes.
So much so that you don't have to be a rocket scientist to see it should you take a moment to try.