Description
Problem 2 can probably be ignored until I can replicate it in a workflow.
Problem 1: Edge case where the number of bins fits perfectly into the upper/lower bounds
The original problem was raised by Ben Day and occurs when the number of bins (200) fits perfectly into the range between the lower and upper bounds (0-20). Our code originally put the distribution's weight on bin "11", just barely below bin "12" (the correct answer). I originally addressed this in a PR here
In parallel, though, Luke found a different problem and solution:
“It looks like you did have processing for _check_and_update_repeating_values which seems like it just moves the later percentiles up by a bit. Instead, they should go down, so I rewrote that helper for clarity and added that change to a PR for your review: github.com/Metaculus/forecasting-tools/pull/166”
However, I was able to recreate the problem in unit tests (here and here), and those tests pass on my branch (which is why I was originally skeptical of the new PR without more tests). I haven't had time to look at Luke's code and figure out who is right. If Luke is right, then maybe I should have just merged his PR earlier (and frankly he probably is; I just want to recreate and test things to be sure).
Btw, the unit tests that Luke added cover unrelated code (i.e. they test old code that uses a regex to find the numeric prediction in LLM output).
At the very least, some problem questions I found before the first fix were turned into unit tests. For example, the test below maps to a real question that had this problem, where the obvious answer was "12" for the number of nations that joined some treaty:
def test_discrete_distribution_repeated_value() -> None:
    percentiles = [
        Percentile(value=12, percentile=0.1),
        Percentile(value=12, percentile=0.2),
        Percentile(value=12, percentile=0.4),
        Percentile(value=12, percentile=0.6),
        Percentile(value=12, percentile=0.8),
        Percentile(value=12, percentile=0.9),
    ]
    numeric_distribution = NumericDistribution(
        declared_percentiles=percentiles,
        open_upper_bound=False,
        open_lower_bound=False,
        upper_bound=20,
        lower_bound=0,
        zero_point=None,
        # standardize_cdf=True,
        cdf_size=21,
    )
    pmf_diffs = _get_and_log_pmf_diffs(numeric_distribution)
    assert pmf_diffs[12] > 0.5 > pmf_diffs[13]
    assert pmf_diffs[12] > 0.5 > pmf_diffs[11]
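To make the "nudge up vs. nudge down" question from Luke's comment concrete, here is a rough standalone toy, not the forecasting-tools implementation. The helpers nudge_repeats, interpolate and toy_pmf are hypothetical names made up for this sketch. It assumes closed bounds pinned to CDF values 0 and 1, plain linear interpolation, and the convention pmf[i] = cdf[i] - cdf[i-1]; the real library may differ on all three, so the exact bin indices here are only illustrative.

```python
PERCENTILES = [0.1, 0.2, 0.4, 0.6, 0.8, 0.9]


def nudge_repeats(values, eps, direction):
    """Make repeated values strictly increasing by shifting them by eps.

    "up" keeps the first value and pushes later ones up;
    "down" keeps the last value and pushes earlier ones down.
    """
    out = list(values)
    if direction == "up":
        for i in range(1, len(out)):
            if out[i] <= out[i - 1]:
                out[i] = out[i - 1] + eps
    else:
        for i in range(len(out) - 2, -1, -1):
            if out[i] >= out[i + 1]:
                out[i] = out[i + 1] - eps
    return out


def interpolate(x, xs, ys):
    """Linear interpolation of y at x; xs must be strictly increasing."""
    if x <= xs[0]:
        return ys[0]
    for i in range(1, len(xs)):
        if x <= xs[i]:
            frac = (x - xs[i - 1]) / (xs[i] - xs[i - 1])
            return ys[i - 1] + (ys[i] - ys[i - 1]) * frac
    return ys[-1]


def toy_pmf(direction, lower=0.0, upper=20.0, cdf_size=21, eps=1e-6):
    """Build a CDF on an evenly spaced grid and return its first differences."""
    values = nudge_repeats([12.0] * len(PERCENTILES), eps, direction)
    # Assumption: closed bounds, so the CDF is 0 at lower and 1 at upper.
    xs = [lower] + values + [upper]
    ys = [0.0] + PERCENTILES + [1.0]
    step = (upper - lower) / (cdf_size - 1)
    cdf = [interpolate(lower + i * step, xs, ys) for i in range(cdf_size)]
    return [cdf[0]] + [cdf[i] - cdf[i - 1] for i in range(1, cdf_size)]


pmf_up = toy_pmf("up")
pmf_down = toy_pmf("down")
print("heaviest index when nudging up:  ", max(range(21), key=pmf_up.__getitem__))
print("heaviest index when nudging down:", max(range(21), key=pmf_down.__getitem__))
```

Under these assumptions the grid points land exactly on the integers, so the repeated value 12 sits on a grid point. Nudging the repeats up pushes almost all of the jump past that point and into index 13, while nudging them down keeps it at index 12. Which neighbour picks up the stray mass in the real code depends on its own diff convention.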
Problem 2: Unexplored workflow errors
Another problem is that I'm occasionally seeing more errors related to repeated values. However, I'm having trouble finding workflows with these errors (I'll try to comment when I see them again). The error usually mentions that all values in a numeric CDF need to be in increasing order (I think), which makes me think it's related to the above.
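Until I can capture a failing workflow, something like the following could help pin down where a CDF stops increasing. It is only a sketch: non_increasing_indices is a hypothetical helper, not part of forecasting-tools, and min_step is an assumed knob for whatever minimum gap the real validation requires (I have not checked the actual threshold).

```python
def non_increasing_indices(cdf, min_step=0.0):
    """Return every index i where cdf[i] fails to exceed cdf[i-1] by more than min_step."""
    return [
        i
        for i in range(1, len(cdf))
        if cdf[i] - cdf[i - 1] <= min_step
    ]


# Hand-written example: a flat spot at index 2 and a drop at index 4.
example_cdf = [0.0, 0.1, 0.1, 0.5, 0.4, 1.0]
bad = non_increasing_indices(example_cdf)
print("offending indices:", bad)
```

Logging these indices alongside the declared percentiles when the error fires should show whether the flat spots line up with repeated values.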