How to implement float hashing with approximate equality










15















Let's say we have the following Python class (the problem exists in Java just the same with equals and hashCode):



    class Temperature:
        def __init__(self, degrees):
            self.degrees = degrees


where degrees is the temperature in Kelvin as a float. Now, I would like to implement equality testing and hashing for Temperature in a way that



  • compares floats up to an epsilon difference instead of direct equality testing,

  • and honors the contract that a == b implies hash(a) == hash(b).

        def __eq__(self, other):
            return abs(self.degrees - other.degrees) < EPSILON

        def __hash__(self):
            return  # What goes here?


The Python documentation talks a bit about hashing numbers to ensure that hash(2) == hash(2.0) but this is not quite the same problem.



Am I even on the right track? And if so, what is the standard way to implement hashing in this situation?



Update: Now I understand that this type of equality testing for floats eliminates the transitivity of == and equals. But how does that go together with the "common knowledge" that floats should not be compared directly? If you implement an equality operator by comparing floats, static analysis tools will complain. Are they right to do so?





























  • 8





    Why does the question have the Java tag?

    – Laiv
    Apr 29 at 7:53







  • 8





    About your update: I would say that hashing floats is generally a questionable thing. Try to avoid using floats as keys or as set elements.

    – J. Fabian Meier
    Apr 29 at 9:04






  • 6





    @Neil: At the same time, doesn't rounding sound like integers? By that I mean: if you can round to, say, thousandths of degrees, then you could simply use a fixed-point representation -- an integer expressing the temperature in thousandths of degrees. For ease of use, you could have a getter/setter transparently converting from/to floats if you wish to...

    – Matthieu M.
    Apr 29 at 11:12







  • 4





    Kelvins are no longer degrees. Degrees are also ambiguous. Why not just call it kelvin?

    – Solomon Ucko
    Apr 29 at 12:01






  • 5





    Python has more-or-less excellent fixed-point support, maybe that’s something for you.

    – Jonas Schäfer
    Apr 29 at 14:07

















java python hashing floating-point






edited Apr 30 at 9:43
Glorfindel

asked Apr 28 at 23:41
CQQL







6 Answers


















41















implement equality testing and hashing for Temperature in a way that compares floats up to an epsilon difference instead of direct equality testing,




Fuzzy equality violates the requirements that Java places on the equals method, namely transitivity: if x == y and y == z, then x == z. With a fuzzy equality and, for example, an epsilon of 0.1 (counting a difference of up to epsilon as equal), 0.1 == 0.2 and 0.2 == 0.3 hold, but 0.1 == 0.3 does not.



Python does not document such a requirement, but the implications of a non-transitive equality still make it a very bad idea; reasoning about such types is headache-inducing.
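The loss of transitivity is easy to reproduce. The following is a minimal sketch, assuming a tolerance of 0.1 and a hypothetical helper `fuzzy_eq` that mirrors the `__eq__` from the question; the sample temperatures are made up for illustration.

```python
EPSILON = 0.1  # assumed tolerance, for illustration only


def fuzzy_eq(a, b):
    """Epsilon comparison in the style of the question's __eq__ (hypothetical helper)."""
    return abs(a - b) < EPSILON


x, y, z = 300.00, 300.06, 300.12

x_eq_y = fuzzy_eq(x, y)  # neighbours, about 0.06 apart
y_eq_z = fuzzy_eq(y, z)  # neighbours, about 0.06 apart
x_eq_z = fuzzy_eq(x, z)  # end points, about 0.12 apart
```

Here `x_eq_y` and `y_eq_z` are both true while `x_eq_z` is false, so any code that relies on equality being an equivalence relation (sets, dict keys, deduplication) can behave differently depending on which pairs happen to be compared.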



So I strongly recommend you don't do that.



Either provide exact equality and base your hash on that in the obvious way, and provide a separate method to do the fuzzy matching, or go with the equivalence class approach suggested by Kain. Though in the latter case, I recommend you fix your value to a representative member of the equivalence class in the constructor, and then go with simple exact equality and hashing for the rest; it's much easier to reason about the types this way.
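The first option could look roughly like the sketch below. The method name `is_close` and its default tolerance are assumptions chosen for illustration; `math.isclose` is the standard-library helper for tolerance comparisons.

```python
import math


class Temperature:
    def __init__(self, degrees):
        self.degrees = float(degrees)

    def __eq__(self, other):
        if not isinstance(other, Temperature):
            return NotImplemented
        return self.degrees == other.degrees  # exact, hence transitive

    def __hash__(self):
        return hash(self.degrees)  # consistent with the exact __eq__

    def is_close(self, other, tolerance=0.1):
        """Fuzzy comparison, deliberately kept out of == (hypothetical name)."""
        return math.isclose(self.degrees, other.degrees,
                            rel_tol=0.0, abs_tol=tolerance)


a = Temperature(300.0)
b = Temperature(300.05)
```

With this split, `a == b` is false and the two objects may hash differently, while `a.is_close(b)` is true. Callers have to ask for the fuzzy comparison explicitly, which keeps sets and dictionaries well-behaved.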



(But if you do that, you might as well use a fixed point representation instead of floating point, i.e. you use an integer to count thousandths of a degree, or whatever precision you require.)
























  • 2





    Interesting thoughts. So by accumulating millions of epsilons and applying transitivity, you can conclude that anything is equal to anything else :-) But does this mathematical constraint acknowledge the discrete foundation of floating-point numbers, which in many cases are approximations of the number they are intended to represent?

    – Christophe
    Apr 29 at 6:51











  • @Christophe Interesting question. If you think about it, you'll see that this approach will make a single large equivalence class out of floats whose resolution is greater than epsilon (it's centered on 0, of course) and leave the other floats in their own class each. But that's not the point, the real problem is that whether it concludes that 2 numbers are equal depends on whether there is a third one compared and the order in which that is done.

    – Ordous
    Apr 29 at 14:50











  • Addressing @OP's edit, I would add that the incorrectness of floating-point == should "infect" the == of types containing them. That is, if they follow your advice of providing an exact equality, then their static analysis tool should further be configured to warn when equality is used on Temperature. It's the only thing you can do, really.

    – HTNW
    Apr 29 at 16:56











  • @HTNW: That would be too simple. A ratio class might have a float approximation field which does not participate in ==. Besides, the static analysis tool will already give a warning inside the == implementation of classes when one of the members being compared is a float type.

    – MSalters
    Apr 30 at 9:58











  • @MSalters ? Presumably, sufficiently configurable static analysis tools can do what I suggested just fine. If a class has a float field that doesn't participate in ==, then don't configure your tool to warn on == on that class. If the class does, then presumably marking the class's == as "too exact" will cause the tool to ignore that sort of error within the implementation. E.g. in Java, if @Deprecated void foo(), then void bar() { foo(); } is a warning, but @Deprecated void bar() { foo(); } is not. Maybe many tools don't support this, but some might.

    – HTNW
    Apr 30 at 12:24


















16














Good Luck



You are not going to be able to achieve that, without being stupid with hashes, or sacrificing the epsilon.



Example:



Assume that each point hashes to its own unique hash value.



As floating point numbers are sequential there will be up to k numbers prior to a given floating point value, and up to k numbers after a given floating point value which are within some epsilon of the given point.




  1. For every two points within epsilon of each other that do not share the same hash value:



    • Adjust the hashing scheme so that these two points hash to the same value.


  2. By induction over all such pairs, the entire sequence of floating point numbers collapses toward a single hash value.

There are a few cases where this will not hold true:



  • Positive/Negative Infinity

  • NaN

  • A few denormalised ranges that may not be linkable to the main range for a given epsilon.

  • perhaps a few other format specific instances

However >=99% of the floating point range will hash to a single value for any value of epsilon that includes at least one floating point value above or below some given floating point value.
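The collapse argument can be seen on a small scale. This is only a sketch under assumed values: an `EPSILON` of 0.1, a hypothetical `fuzzy_eq` helper, and a chain of values from 0 to about 100 in half-epsilon steps.

```python
EPSILON = 0.1  # assumed tolerance


def fuzzy_eq(a, b):
    """Epsilon comparison (hypothetical helper)."""
    return abs(a - b) < EPSILON


# Walk from 0 to roughly 100 in steps of half an epsilon.
chain = [i * EPSILON / 2 for i in range(2001)]

# Every neighbouring pair compares equal, so the hash contract forces equal
# hashes for each pair, and therefore one shared hash along the whole chain.
all_neighbours_equal = all(fuzzy_eq(a, b) for a, b in zip(chain, chain[1:]))

# Yet the two ends of the chain are nowhere near each other.
ends_equal = fuzzy_eq(chain[0], chain[-1])
```

`all_neighbours_equal` is true and `ends_equal` is false: the only hash function that satisfies the contract for every link is one that returns the same value for the whole chain.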



Outcome



Either >= 99% of the entire floating point range hashes to a single value, seriously compromising the intent of a hash value (and any device/container relying on a fairly distributed, low-collision hash).



Or the epsilon is such that only exact matches are permitted.



Granular



You could of course go for a granular approach instead.



Under this approach you define exact buckets down to a particular resolution, i.e.:



[0.001, 0.002)
[0.002, 0.003)
[0.003, 0.004)
...
[122.999, 123.000)
...


Each bucket has a unique hash, and any floating point within the bucket compares equal to any other float in the same bucket.



Unfortunately it is still possible for two floats to be epsilon distance away, and have two separate hashes.
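A short sketch of the bucket idea and its boundary problem, assuming a bucket width of 0.001 to match the intervals listed above; the function name `bucket` is made up for illustration.

```python
import math

WIDTH = 0.001  # assumed bucket width, matching the intervals listed above


def bucket(x):
    """Index k of the half-open bucket [k * WIDTH, (k + 1) * WIDTH) that holds x,
    up to rounding in the division (hypothetical helper)."""
    return math.floor(x / WIDTH)


same_bucket = bucket(0.0021) == bucket(0.0029)     # 0.0008 apart, one bucket
split_bucket = bucket(0.00299) == bucket(0.00301)  # 0.00002 apart, two buckets
```

`same_bucket` is true and `split_bucket` is false: the second pair is far closer together than the first, yet it straddles the boundary at 0.003 and therefore gets different bucket indices and different hashes.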
























  • 2





    I agree that the granular approach here would probably be best, if that fits OP's requirements. Though I'm afraid OP has like +/- 0.1% type requirements, meaning it can't be granular.

    – Neil
    Apr 29 at 6:35






  • 4





    @DocBrown The "not possible" part is correct. If epsilon based equality should imply that the hash codes are equal, then you automatically have all hash codes equal, so the hash function is not useful anymore. The buckets approach can be fruitful, but you will have numbers with different hash codes that are arbitrarily close to each other.

    – J. Fabian Meier
    Apr 29 at 8:59






  • 2





    The bucket approach can be modified by checking not only the bucket with the exact hash key, but also the two neighbouring buckets (or at least one of them) for their content. That eliminates the problem of those edge cases at the cost of increasing the running time by a factor of at most two (when implemented correctly). However, it does not change the general order of the running time.

    – Doc Brown
    Apr 29 at 15:26












  • While you are right in spirit, not everything will collapse. With a fixed small epsilon, most numbers will only equal themselves. Of course, for those the epsilon will be useless, so again, in spirit you are correct.

    – Carsten S
    Apr 30 at 9:43






  • 1





    @CarstenS Yes, my statement that 99% of the range hashes to a single hash does not actually cover the whole float range. There are many high-range values that are separated by more than epsilon and will hash to their own unique buckets.

    – Kain0_0
    Apr 30 at 23:50


















7














You can model your temperature as an integer under the hood. Temperature has a natural lower bound (-273.15 °C), so -273.15 °C (0 K) maps to 0 in your underlying integer. The second element that you need is the granularity of your mapping. You are already using this granularity implicitly; it is your EPSILON.



Just divide your temperature by EPSILON and take the floor of it; now your hash and your equality will behave in sync. In Python 3 integers are unbounded, so EPSILON can be as small as you like.



BEWARE
If you change the value of EPSILON after you have serialised objects, the serialised objects will no longer be compatible!



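    # Sketch (the EPSILON value is an assumption)
    import math

    EPSILON = 0.001

    class Temperature:
        def __init__(self, degrees):
            # CHECK INVALID VALUES HERE
            # TRANSFORM TO KELVIN HERE (input assumed to be in kelvin already)
            kelvin = degrees
            self.degrees = math.floor(kelvin / EPSILON)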
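To show how equality and hashing stay in sync once the value is an integer, here is a slightly fuller sketch. The attribute name `steps`, the `EPSILON` value, and the sample readings are assumptions made for illustration, not part of the answer above.

```python
import math

EPSILON = 0.001  # assumed granularity in kelvin


class Temperature:
    def __init__(self, kelvin):
        if kelvin < 0:
            raise ValueError("temperature below absolute zero")
        # Integer count of EPSILON-sized steps above 0 K (hypothetical attribute).
        self.steps = math.floor(kelvin / EPSILON)

    def __eq__(self, other):
        if not isinstance(other, Temperature):
            return NotImplemented
        return self.steps == other.steps  # exact integer comparison

    def __hash__(self):
        return hash(self.steps)  # same integer, so a == b implies equal hashes


readings = {Temperature(293.1502), Temperature(293.1507), Temperature(293.1512)}
```

The first two readings fall into the same step and collapse into one set element, while the third lands in the next step, so `readings` ends up with two elements. Note that this is bucket equality rather than epsilon equality: two values closer than EPSILON can still differ if they sit on opposite sides of a step boundary.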





































    1














    Implementing a floating-point hash table that can find things that are "approximately equal" to a given key will require one of the following approaches, or a combination thereof:



    1. Round each value to an increment which is somewhat larger than the "fuzzy" range before storing it in the hash table, and when trying to find a value, check the hash table for the rounded values above and below the value sought.


    2. Store each item in the hash table twice, under the rounded keys above and below its value, so that a single lookup with the rounded value being sought suffices.


    Note that using either approach will likely require that hash table entries not identify items, but rather lists, since there will likely be multiple items associated with each key. The first approach above will minimize the required hash table size, but each search for an item not in the table will require two hash-table lookups. The second approach will quickly be able to identify that items aren't in the table, but will generally require the table to hold about twice as many entries as would otherwise be required. If one is trying to find objects in 2D space, it may be useful to use one approach for the X direction and one for the Y direction, so that instead of having each item stored once but requiring four query operations for each lookup, or being able to use one lookup to find an item but having to store each item four times, one would store each item twice and use two lookup operations to find it.
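The first approach can be sketched as follows. The class name `FuzzyIndex`, the constants, and the sample values are all hypothetical; for simplicity the lookup checks both neighbouring buckets rather than only the nearer one, which costs one extra lookup compared with the description above.

```python
import math
from collections import defaultdict

TOLERANCE = 0.1        # assumed "fuzzy" range
WIDTH = 2 * TOLERANCE  # rounding increment, larger than the fuzzy range


class FuzzyIndex:
    """Hypothetical container: one bucket per rounded key, each holding a list."""

    def __init__(self):
        self._buckets = defaultdict(list)

    def add(self, value):
        self._buckets[math.floor(value / WIDTH)].append(value)

    def find(self, value):
        """Return the stored values within TOLERANCE of value."""
        key = math.floor(value / WIDTH)
        hits = []
        for k in (key - 1, key, key + 1):  # neighbours cover the bucket edges
            hits.extend(v for v in self._buckets.get(k, ())
                        if abs(v - value) <= TOLERANCE)
        return hits


index = FuzzyIndex()
index.add(0.39)
index.add(0.75)
```

Here `index.find(0.41)` finds 0.39 even though the two values round to different buckets, because the neighbouring bucket is inspected as well. Each bucket holds a list, as the note above anticipates, and the final distance check filters out values that share a bucket but are not actually within the tolerance.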




































      0














      You can of course define “almost equal” by deleting say the last eight bits of the mantissa and then comparing or hashing. The problem is that numbers very close to each other may be different.
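As an illustration of the truncation idea and its weakness, here is a sketch for 64-bit floats using the standard `struct` module. The function name `truncate_mantissa` and the sample values are assumptions made for this example.

```python
import struct


def truncate_mantissa(x, bits=8):
    """Clear the lowest `bits` mantissa bits of a 64-bit float (hypothetical helper)."""
    (raw,) = struct.unpack(">Q", struct.pack(">d", x))
    raw &= ~((1 << bits) - 1)  # zero the low-order bits of the bit pattern
    (result,) = struct.unpack(">d", struct.pack(">Q", raw))
    return result


just_above = 1.0 + 2.0 ** -50  # four representable steps above 1.0
just_below = 1.0 - 2.0 ** -53  # the largest double below 1.0
```

After truncation, `just_above` becomes exactly 1.0, so it compares and hashes like 1.0. `just_below` is even closer to 1.0, but it lies on the other side of a power-of-two boundary, so its truncated value differs from 1.0. This is the same boundary problem as with any bucketing scheme.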



      There is some confusion here: if two floating point numbers compare equal, they are equal. To check if they are equal, you use “==”. Sometimes you don’t want to check for equality, but when you do, “==” is the way to go.




































        0














        This isn't an answer, but an extended comment that may be helpful.



        I have been working on a similar problem, while using MPFR (based on GNU MP). The "bucket" approach as outlined by @Kain0_0 seems to give acceptable results, but be aware of the limitations highlighted in that answer.



        I wanted to add that -- depending on what you are trying to do -- using an "exact" (caveat emptor) computer algebra system like Mathematica may help supplement or verify an inexact numerical program. This will allow you to compute results without worrying about rounding; for example, 7*√2 - 5*√2 will yield exactly 2*√2 instead of a rounded decimal such as 2.82842713 or similar. Of course, this will introduce additional complications that may or may not be worth it.









































          6 Answers
          6






          active

          oldest

          votes








          6 Answers
          6






          active

          oldest

          votes









          active

          oldest

          votes






          active

          oldest

          votes









          41















          implement equality testing and hashing for Temperature in a way that compares floats up to an epsilon difference instead of direct equality testing,




          Fuzzy equality violates the requirements that Java places on the equals method, namely transitivity, i.e. that if x == y and y == z, then x == z. But if you do an fuzzy equality with, for example, an epsilon of 0.1, then 0.1 == 0.2 and 0.2 == 0.3, but 0.1 == 0.3 does not hold.



          While Python does not document such a requirement, still the implications of having a non-transitive equality make it a very bad idea; reasoning about such types is headache-inducing.



          So I strongly recommend you don't do that.



          Either provide exact equality and base your hash on that in the obvious way, and provide a separate method to do the fuzzy matching, or go with the equivalence class approach suggested by Kain. Though in the latter case, I recommend you fix your value to a representative member of the equivalence class in the constructor, and then go with simple exact equality and hashing for the rest; it's much easier to reason about the types this way.



          (But if you do that, you might as well use a fixed point representation instead of floating point, i.e. you use an integer to count thousandths of a degree, or whatever precision you require.)






          share|improve this answer


















          • 2





            interesting thoughts. So by accumulating millions of epsilon and with transitivity you can conclude that anything is equal to anything else :-) But does this mathematic constraint acknowledge the discrete foundation of floating points, which in many cases are approximations of the number they are intended to represent ?

            – Christophe
            Apr 29 at 6:51











          • @Christophe Interesting question. If you think about it, you'll see that this approach will make a single large equivalence class out of floats whose resolution is greater than epsilon (it's centered on 0, of course) and leave the other floats in their own class each. But that's not the point, the real problem is that whether it concludes that 2 numbers are equal depends on whether there is a third one compared and the order in which that is done.

            – Ordous
            Apr 29 at 14:50











          • Addressing @OP's edit, I would add that the incorrectness of floating-point == should "infect" the == of types containing them. That is, if they follow your advice of providing an exact equality, then their static analysis tool should further be configured to warn when equality is used on Temperature. It's the only thing you can do, really.

            – HTNW
            Apr 29 at 16:56











          • @HTNW: That would be too simple. A ratio class might have a float approximation field which does not participate in ==. Besides, the static analysis tool will already give a warning inside the == implementation of classes when one of the members being compared is a float type.

            – MSalters
            Apr 30 at 9:58











          • @MSalters ? Presumably, sufficiently configurable static analysis tools can do what I suggested just fine. If a class has a float field that doesn't participate in ==, then don't configure your tool to warn on == on that class. If the class does, then presumably marking the class's == as "too exact" will cause the tool to ignore that sort of error within the implementation. E.g. in Java, if @Deprecated void foo(), then void bar() foo(); is a warning, but @Deprecated void bar() foo(); is not. Maybe many tools don't support this, but some might.

            – HTNW
            Apr 30 at 12:24















          41















          implement equality testing and hashing for Temperature in a way that compares floats up to an epsilon difference instead of direct equality testing,




          Fuzzy equality violates the requirements that Java places on the equals method, namely transitivity, i.e. that if x == y and y == z, then x == z. But if you do an fuzzy equality with, for example, an epsilon of 0.1, then 0.1 == 0.2 and 0.2 == 0.3, but 0.1 == 0.3 does not hold.



          While Python does not document such a requirement, still the implications of having a non-transitive equality make it a very bad idea; reasoning about such types is headache-inducing.



          So I strongly recommend you don't do that.



          Either provide exact equality and base your hash on that in the obvious way, and provide a separate method to do the fuzzy matching, or go with the equivalence class approach suggested by Kain. Though in the latter case, I recommend you fix your value to a representative member of the equivalence class in the constructor, and then go with simple exact equality and hashing for the rest; it's much easier to reason about the types this way.



          (But if you do that, you might as well use a fixed point representation instead of floating point, i.e. you use an integer to count thousandths of a degree, or whatever precision you require.)






          share|improve this answer


















          • 2





            interesting thoughts. So by accumulating millions of epsilon and with transitivity you can conclude that anything is equal to anything else :-) But does this mathematic constraint acknowledge the discrete foundation of floating points, which in many cases are approximations of the number they are intended to represent ?

            – Christophe
            Apr 29 at 6:51











          • @Christophe Interesting question. If you think about it, you'll see that this approach will make a single large equivalence class out of floats whose resolution is greater than epsilon (it's centered on 0, of course) and leave the other floats in their own class each. But that's not the point, the real problem is that whether it concludes that 2 numbers are equal depends on whether there is a third one compared and the order in which that is done.

            – Ordous
            Apr 29 at 14:50











          • Addressing @OP's edit, I would add that the incorrectness of floating-point == should "infect" the == of types containing them. That is, if they follow your advice of providing an exact equality, then their static analysis tool should further be configured to warn when equality is used on Temperature. It's the only thing you can do, really.

            – HTNW
            Apr 29 at 16:56











          • @HTNW: That would be too simple. A ratio class might have a float approximation field which does not participate in ==. Besides, the static analysis tool will already give a warning inside the == implementation of classes when one of the members being compared is a float type.

            – MSalters
            Apr 30 at 9:58











          • @MSalters? Presumably, sufficiently configurable static analysis tools can do what I suggested just fine. If a class has a float field that doesn't participate in ==, then don't configure your tool to warn on == on that class. If the class does, then presumably marking the class's == as "too exact" will cause the tool to ignore that sort of error within the implementation. E.g. in Java, if @Deprecated void foo(), then void bar() { foo(); } is a warning, but @Deprecated void bar() { foo(); } is not. Maybe many tools don't support this, but some might.

            – HTNW
            Apr 30 at 12:24























          answered Apr 29 at 6:30









          Sebastian Redl




















          16














          Good Luck



          You are not going to be able to achieve that without either being stupid with hashes or sacrificing the epsilon.



          Example:



          Assume that each point hashes to its own unique hash value.



          Because floating point numbers form an ordered sequence, there will be up to k numbers before a given floating point value, and up to k numbers after it, that are within some epsilon of that value.




          1. For each pair of points within epsilon of each other that do not share the same hash value:

            • Adjust the hashing scheme so that these two points hash to the same value.

          2. By induction over all such pairs, the entire sequence of floating point numbers collapses toward a single hash value.

          There are a few cases where this will not hold true:



          • Positive/Negative Infinity

          • NaN

          • A few denormalised ranges that may not be linkable to the main range for a given epsilon.

          • perhaps a few other format specific instances

          However >=99% of the floating point range will hash to a single value for any value of epsilon that includes at least one floating point value above or below some given floating point value.
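The collapse can be illustrated on a handful of values. This is only a sketch with a hypothetical `merge_close` helper: it sorts the values and chains neighbours that lie within epsilon of each other, which is what forcing equal hashes for every such pair amounts to.

```python
def merge_close(values, epsilon):
    """Group values so that any two neighbours within epsilon share a group."""
    groups = []
    for v in sorted(values):
        if groups and v - groups[-1][-1] <= epsilon:
            # Within epsilon of the previous value: must share its hash.
            groups[-1].append(v)
        else:
            groups.append([v])
    return groups


samples = [0.0, 0.5, 1.0, 1.5, 2.0]
print(merge_close(samples, 0.5))   # every value chained into one group
print(merge_close(samples, 0.25))  # no two values close enough to merge
```

With epsilon 0.5, the values 0.0 and 2.0 end up in the same group even though they differ by four times epsilon.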



          Outcome



          Either >= 99% of the entire floating point range hashes to a single value, seriously compromising the intent of a hash value (and any device/container relying on a fairly distributed low-collision hash).



          Or the epsilon is such that only exact matches are permitted.



          Granular



          You could of course go for a granular approach instead.



          Under this approach you define exact buckets down to a particular resolution, i.e.:



          [0.001, 0.002)
          [0.002, 0.003)
          [0.003, 0.004)
          ...
          [122.999, 123.000)
          ...


          Each bucket has a unique hash, and any floating point within the bucket compares equal to any other float in the same bucket.



          Unfortunately it is still possible for two floats to be epsilon distance away, and have two separate hashes.
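A minimal sketch of the bucketing idea, assuming a hypothetical `bucket` function and a bucket width of 0.5 chosen purely for illustration:

```python
import math

RESOLUTION = 0.5  # hypothetical bucket width


def bucket(value: float, resolution: float = RESOLUTION) -> int:
    """Index of the half-open bucket [n * resolution, (n + 1) * resolution)."""
    return math.floor(value / resolution)


print(bucket(0.1), bucket(0.4))    # same bucket, so equal and same hash
print(bucket(0.49), bucket(0.51))  # only 0.02 apart, yet different buckets
```

Equality defined as "same bucket index" is a true equivalence relation, and the bucket index is a valid hash for it. The second line shows the caveat above: closeness across a bucket boundary is not recognised.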
























          • 2





            I agree that the granular approach here would probably be best, if that fits OP's requirements. Though I'm afraid OP has like +/- 0.1% type requirements, meaning it can't be granular.

            – Neil
            Apr 29 at 6:35






          • 4





            @DocBrown The "not possible" part is correct. If epsilon based equality should imply that the hash codes are equal, then you automatically have all hash codes equal, so the hash function is not useful anymore. The buckets approach can be fruitful, but you will have numbers with different hash codes that are arbitrarily close to each other.

            – J. Fabian Meier
            Apr 29 at 8:59






          • 2





            The bucket approach can be modified by checking not only the bucket with the exact hash key, but also the two neighbouring buckets (or at least one of them) for their content as well. That eliminates the problem of those edge cases at the cost of increasing the running time by a factor of at most two (when implemented correctly). However, it does not change the general running time order.

            – Doc Brown
            Apr 29 at 15:26
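One way to read this suggestion is sketched below. It is not Doc Brown's code; the `FuzzyIndex` class and its methods are hypothetical. The sketch assumes the bucket width equals epsilon, so that a value within epsilon of the query lies in the same or an adjacent bucket (ignoring rounding exactly at bucket boundaries).

```python
import math
from collections import defaultdict


class FuzzyIndex:
    """Hypothetical lookup structure: buckets of width epsilon."""

    def __init__(self, epsilon: float) -> None:
        self.epsilon = epsilon
        self._buckets = defaultdict(list)

    def _key(self, value: float) -> int:
        return math.floor(value / self.epsilon)

    def add(self, value: float) -> None:
        self._buckets[self._key(value)].append(value)

    def find_close(self, value: float) -> list:
        """Return stored values within epsilon of value."""
        key = self._key(value)
        hits = []
        # Check the value's own bucket and both neighbours.
        for k in (key - 1, key, key + 1):
            for stored in self._buckets.get(k, ()):
                if abs(stored - value) <= self.epsilon:
                    hits.append(stored)
        return hits


index = FuzzyIndex(epsilon=0.5)
for v in (0.49, 0.51, 2.0):
    index.add(v)

print(index.find_close(0.51))
```

This is a lookup structure rather than an `__eq__`/`__hash__` pair, so the transitivity problem does not arise: the fuzzy comparison happens explicitly inside `find_close`.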












          • While you are right in spirit, not everything will collapse. With a fixed small epsilon, most numbers will only equal themselves. Of course, for those the epsilon will be useless, so again, in spirit you are correct.

            – Carsten S
            Apr 30 at 9:43






          • 1





            @CarstenS Yes, my statement that 99% of the range hashes to a single hash does not actually cover the whole float range. There are many high-range values that are separated by more than epsilon and will hash to their own unique buckets.

            – Kain0_0
            Apr 30 at 23:50

























          answered Apr 29 at 2:13









          Kain0_0







          • 2





            I agree that the granular approach here would probably be best, if that fits OP's requirements. Though I'm afraid OP has like +/- 0.1% type requirements, meaning it can't be granular.

            – Neil
            Apr 29 at 6:35






          • 4





            @DocBrown The "not possible" part is correct. If epsilon based equality should imply that the hash codes are equal, then you automatically have all hash codes equal, so the hash function is not useful anymore. The buckets approach can be fruitful, but you will have numbers with different hash codes that are arbitrarily close to each other.

            – J. Fabian Meier
            Apr 29 at 8:59






          • 2





            The bucket approach can be modified by checking not only the bucket with the exact hash key, but also the two neighboured buckets (or at least one of them) for their content as well. That elimininates the problem of those edge cases for the cost of increasing the running time by a factor of at most two (when implemented correctly). However, it does not change the general running time order.

            – Doc Brown
            Apr 29 at 15:26












          • While you are right in spirit, not everything will collapse. With a fixed small epsilon, most numbers will only equal themselves. Of course, for those the epsilon will be useless, so again, in spirit you are correct.

            – Carsten S
            Apr 30 at 9:43






          • 1





            @CarstenS Yes, my statement that 99% of the range hashes to a single hash does not actually cover the whole float range. There are many high range values who are separated by more than epsilon that will hash to their own unique buckets.

            – Kain0_0
            Apr 30 at 23:50












          • 2





            I agree that the granular approach here would probably be best, if that fits OP's requirements. Though I'm afraid OP has like +/- 0.1% type requirements, meaning it can't be granular.

            – Neil
            Apr 29 at 6:35






          • 4





            @DocBrown The "not possible" part is correct. If epsilon based equality should imply that the hash codes are equal, then you automatically have all hash codes equal, so the hash function is not useful anymore. The buckets approach can be fruitful, but you will have numbers with different hash codes that are arbitrarily close to each other.

            – J. Fabian Meier
            Apr 29 at 8:59






          • 2





            The bucket approach can be modified by checking not only the bucket with the exact hash key, but also the two neighboured buckets (or at least one of them) for their content as well. That elimininates the problem of those edge cases for the cost of increasing the running time by a factor of at most two (when implemented correctly). However, it does not change the general running time order.

            – Doc Brown
            Apr 29 at 15:26












          • While you are right in spirit, not everything will collapse. With a fixed small epsilon, most numbers will only equal themselves. Of course, for those the epsilon will be useless, so again, in spirit you are correct.

            – Carsten S
            Apr 30 at 9:43






          • 1





            @CarstenS Yes, my statement that 99% of the range hashes to a single hash does not actually cover the whole float range. There are many high range values who are separated by more than epsilon that will hash to their own unique buckets.

            – Kain0_0
            Apr 30 at 23:50







          2




          2





          I agree that the granular approach here would probably be best, if that fits OP's requirements. Though I'm afraid OP has like +/- 0.1% type requirements, meaning it can't be granular.

          – Neil
          Apr 29 at 6:35





          I agree that the granular approach here would probably be best, if that fits OP's requirements. Though I'm afraid OP has like +/- 0.1% type requirements, meaning it can't be granular.

          – Neil
          Apr 29 at 6:35




          4




          4





          @DocBrown The "not possible" part is correct. If epsilon based equality should imply that the hash codes are equal, then you automatically have all hash codes equal, so the hash function is not useful anymore. The buckets approach can be fruitful, but you will have numbers with different hash codes that are arbitrarily close to each other.

          – J. Fabian Meier
          Apr 29 at 8:59





          @DocBrown The "not possible" part is correct. If epsilon based equality should imply that the hash codes are equal, then you automatically have all hash codes equal, so the hash function is not useful anymore. The buckets approach can be fruitful, but you will have numbers with different hash codes that are arbitrarily close to each other.

          – J. Fabian Meier
          Apr 29 at 8:59




          2




          2





          The bucket approach can be modified by checking not only the bucket with the exact hash key, but also the two neighboured buckets (or at least one of them) for their content as well. That elimininates the problem of those edge cases for the cost of increasing the running time by a factor of at most two (when implemented correctly). However, it does not change the general running time order.

          – Doc Brown
          Apr 29 at 15:26




















          While you are right in spirit, not everything will collapse. With a fixed small epsilon, most numbers will only equal themselves. Of course, for those the epsilon will be useless, so again, in spirit you are correct.

          – Carsten S
          Apr 30 at 9:43





          1





          @CarstenS Yes, my statement that 99% of the range hashes to a single hash does not actually cover the whole float range. There are many high-range values that are separated by more than epsilon and will hash to their own unique buckets.

          – Kain0_0
          Apr 30 at 23:50
















          7














          You can model your temperature as an integer under the hood. Temperature has a natural lower bound (-273.15 degrees Celsius), so the double value -273.15 maps to 0 in your underlying integer. The second thing you need is the granularity of your mapping. You are already using this granularity implicitly; it is your EPSILON.



          Just divide your temperature by EPSILON and take the floor of the result; now your hash and your equality will behave in sync. In Python 3 integers are unbounded, so EPSILON can be smaller if you like.



          BEWARE
          If you change the value of EPSILON after you have serialised objects, the serialised objects will no longer be compatible!



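          # Sketch (Python 3); EPSILON is the granularity in kelvin (example value)
          import math
          EPSILON = 0.01

          class Temperature:
              def __init__(self, degrees):
                  kelvin = degrees + 273.15  # transform to kelvin, assuming degrees Celsius
                  if kelvin < 0:  # check invalid values: below absolute zero
                      raise ValueError("temperature below absolute zero")
                  # integer bucket index; equality and hash should use only this
                  self.degrees = math.floor(kelvin / EPSILON)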
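To make the "hash and equality in sync" idea concrete, here is a minimal sketch. It assumes the input is already in kelvin and uses a bucket width of 0.5 K; the class name `BucketedTemperature` and the value of `EPSILON` are illustrative and not part of the answer above.

```python
import math

EPSILON = 0.5  # assumed bucket width in kelvin (illustrative value)


class BucketedTemperature:
    """Hypothetical class: compares and hashes by bucket index only."""

    def __init__(self, kelvin):
        if kelvin < 0:
            raise ValueError("temperature below absolute zero")
        # Integer bucket index; this is the only state used below.
        self.bucket = math.floor(kelvin / EPSILON)

    def __eq__(self, other):
        if not isinstance(other, BucketedTemperature):
            return NotImplemented
        return self.bucket == other.bucket

    def __hash__(self):
        return hash(self.bucket)


same_a = BucketedTemperature(300.1)    # bucket 600
same_b = BucketedTemperature(300.4)    # bucket 600
split_a = BucketedTemperature(300.49)  # bucket 600
split_b = BucketedTemperature(300.51)  # bucket 601
```

Here `same_a` and `same_b` compare equal and share a hash, so equality and hashing agree. `split_a` and `split_b` are only 0.02 K apart yet land in different buckets, which is the boundary limitation raised in the comments on the bucket approach.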













              – Alessandro Teruzzi
              answered Apr 29 at 21:55, edited Apr 30 at 9:45 by Glorfindel





















                  1














                  Implementing a floating-point hash table that can find things that are "approximately equal" to a given key requires one of the following two approaches, or a combination of them:



                  1. Round each value to an increment which is somewhat larger than the "fuzzy" range before storing it in the hash table, and when trying to find a value, check the hash table for the rounded values above and below the value sought.


                  2. Store each item in the hash table under the rounded keys both above and below its value, so that a lookup only needs to check a single rounded key for the value being sought.


                  Note that either approach will likely require hash table entries to hold lists of items rather than single items, since several items will likely be associated with each key. The first approach minimizes the required hash table size, but each search for an item not in the table requires two hash-table lookups. The second approach can quickly determine that an item isn't in the table, but generally requires the table to hold about twice as many entries as would otherwise be needed.

                  If one is trying to find objects in 2D space, it may be useful to use one approach for the X direction and the other for the Y direction. Instead of storing each item once but needing four query operations per lookup, or finding an item with one lookup but storing it four times, one would store each item twice and use two lookup operations to find it.
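The first approach can be sketched as follows. This is only an illustration under stated assumptions: the class name `FuzzyFloatIndex`, its methods, and the choice of an increment equal to four times the tolerance are all hypothetical, and values that fall extremely close to a bucket boundary may still be affected by floating-point rounding in the division.

```python
import math
from collections import defaultdict


class FuzzyFloatIndex:
    """Hypothetical helper: finds stored floats within `tolerance` of a query."""

    def __init__(self, tolerance):
        self.tolerance = tolerance
        # Assumed increment: any width above 2 * tolerance works in exact
        # arithmetic; 4 * tolerance leaves some margin.
        self.width = 4.0 * tolerance
        self.buckets = defaultdict(list)  # rounded key -> list of values

    def add(self, value):
        # Store under the nearest multiple of the increment.
        key = math.floor(value / self.width + 0.5)
        self.buckets[key].append(value)

    def find(self, query):
        scaled = query / self.width
        # Rounded keys just below and just above the query.
        keys = sorted({math.floor(scaled), math.ceil(scaled)})
        return [
            stored
            for key in keys
            for stored in self.buckets.get(key, ())
            if abs(stored - query) <= self.tolerance
        ]


index = FuzzyFloatIndex(tolerance=0.25)  # increment is 1.0
for value in (20.0, 20.6, 25.0):
    index.add(value)
```

Each bucket holds a list, as the answer suggests, and every query costs at most two dictionary lookups. For instance, `index.find(19.9)` finds `20.0` even though the two values round to different keys.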
















                      – supercat
                      answered Apr 29 at 15:50





















                          0














                          You can of course define "almost equal" by deleting, say, the last eight bits of the mantissa and then comparing or hashing. The problem is that numbers very close to each other may still end up different.



                          There is some confusion here: if two floating point numbers compare equal, they are equal. To check if they are equal, you use "==". Sometimes you don't want to check for equality, but when you do, "==" is the way to go.
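As a rough illustration of the mantissa-truncation idea, the sketch below clears the low eight mantissa bits of an IEEE 754 double using Python's `struct` module. The function name `truncate_mantissa` and the sample values are assumptions made for this example.

```python
import struct


def truncate_mantissa(x, bits=8):
    """Hypothetical helper: zero the lowest `bits` mantissa bits of a double."""
    # Reinterpret the 64-bit double as an unsigned integer.
    (raw,) = struct.unpack("<Q", struct.pack("<d", x))
    # Build a 64-bit mask with the low `bits` bits cleared.
    mask = ~((1 << bits) - 1) & 0xFFFFFFFFFFFFFFFF
    # Reinterpret the masked integer as a double again.
    (truncated,) = struct.unpack("<d", struct.pack("<Q", raw & mask))
    return truncated


# These differ only in the last mantissa bit, so they truncate to one value.
close_a = truncate_mantissa(0.1 + 0.2)
close_b = truncate_mantissa(0.3)

# 1.0 and its immediate predecessor sit on opposite sides of a boundary.
edge_a = truncate_mantissa(1.0)
edge_b = truncate_mantissa(1.0 - 2.0 ** -53)
```

`close_a` and `close_b` come out identical, so they can be compared with `==` or hashed together. `edge_a` and `edge_b` come out different even though the inputs are adjacent doubles, which is exactly the problem the answer points out.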
















                              – gnasher729
                              answered Apr 29 at 14:00





















                                  0














                                  This isn't an answer, but an extended comment that may be helpful.



                                  I have been working on a similar problem, while using MPFR (based on GNU MP). The "bucket" approach as outlined by @Kain0_0 seems to give acceptable results, but be aware of the limitations highlighted in that answer.



                                  I wanted to add that, depending on what you are trying to do, using an "exact" (caveat emptor) computer algebra system like Mathematica may help supplement or verify an inexact numerical program. This will allow you to compute results without worrying about rounding: for example, 7*√2 - 5*√2 will yield exactly 2*√2 instead of a rounded decimal such as 2.82842712 or similar. Of course, this will introduce additional complications that may or may not be worth it.
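Mathematica itself cannot be shown here, so as a much more limited stand-in the sketch below uses Python's standard `fractions` module. It performs exact rational arithmetic only (it cannot represent √2), but it shows the same contrast between exact and rounded results. The variable names are illustrative.

```python
from fractions import Fraction

# Binary floating point: 0.1 and 0.2 are rounded on input, and so is the sum.
float_sum = 0.1 + 0.2

# Exact rational arithmetic: 1/10 + 2/10 is exactly 3/10.
exact_sum = Fraction(1, 10) + Fraction(2, 10)

float_matches = float_sum == 0.3              # False, due to rounding
exact_matches = exact_sum == Fraction(3, 10)  # True
```

Exact values like `exact_sum` can be compared with `==` and hashed without any tolerance, at the cost of slower arithmetic and a restricted set of representable numbers.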
















                                      – BurnsBA
                                      answered Apr 29 at 14:45















                                          protected by gnat Apr 30 at 5:15


