Init.Data.Float

FloatSpec structure

Full source

structure FloatSpec where
  float : Type
  val   : float
  lt    : float → float → Prop
  le    : float → float → Prop
  decLt : DecidableRel lt
  decLe : DecidableRel le

Floating-point number specifications

Informal description

The structure `FloatSpec` represents specifications or properties related to floating-point numbers. Without additional information, the exact nature of these specifications cannot be determined, but they likely include characteristics or constraints for floating-point arithmetic operations.

floatSpec opaque

: FloatSpec

Full source

opaque floatSpec : FloatSpec := {
  float := Unit,
  val   := (),
  lt    := fun _ _ => True,
  le    := fun _ _ => True,
  decLt := fun _ _ => inferInstanceAs (Decidable True),
  decLe := fun _ _ => inferInstanceAs (Decidable True)
}

Floating-point number specifications instance

Informal description

The term `floatSpec` is an instance of the `FloatSpec` structure, which contains specifications or properties related to floating-point numbers. The exact nature of these specifications is not provided, but they likely pertain to characteristics or constraints for floating-point arithmetic operations.

Float structure

Full source

/--
64-bit floating-point numbers.

`Float` corresponds to the IEEE 754 *binary64* format (`double` in C or `f64` in Rust).
Floating-point numbers are a finite representation of a subset of the real numbers, extended with
extra “sentinel” values that represent undefined and infinite results as well as separate positive
and negative zeroes. Arithmetic on floating-point numbers approximates the corresponding operations
on the real numbers by rounding the results to numbers that are representable, propagating error and
infinite values.

Floating-point numbers include [subnormal numbers](https://en.wikipedia.org/wiki/Subnormal_number).
Their special values are:
 * `NaN`, which denotes a class of “not a number” values that result from operations such as
   dividing zero by zero, and
 * `Inf` and `-Inf`, which represent positive and infinities that result from dividing non-zero
   values by zero.
-/
structure Float where
  val : floatSpec.float

64-bit Floating-Point Numbers (IEEE 754 binary64)

Informal description

The structure representing 64-bit floating-point numbers according to the IEEE 754 binary64 standard (also known as `double` in C or `f64` in Rust). Floating-point numbers approximate real numbers with finite precision and include special values: - `NaN` (Not a Number) for undefined results like 0/0, - `Inf` and `-Inf` for positive and negative infinities from operations like non-zero division by zero, - Subnormal numbers for representing values smaller than the smallest normal number. The structure also distinguishes between positive and negative zero.

instNonemptyFloat instance

: Nonempty Float

Full source

instance : Nonempty Float := ⟨{ val := floatSpec.val }⟩

Nonemptiness of Floating-Point Numbers

Informal description

The type of 64-bit floating-point numbers (IEEE 754 binary64) is nonempty.

Float.add opaque

: Float → Float → Float

Full source

/--
Adds two 64-bit floating-point numbers according to IEEE 754. Typically used via the `+` operator.

This function does not reduce in the kernel. It is compiled to the C addition operator.
-/
@[extern "lean_float_add"] opaque Float.add : Float → Float → Float

IEEE 754 Floating-Point Addition Operation

Informal description

The function $+$ performs addition of two 64-bit floating-point numbers according to the IEEE 754 binary64 standard. This operation handles special cases including: - Not a Number (`NaN`) for undefined results, - Positive and negative infinities (`Inf` and `-Inf`), - Subnormal numbers, and - Signed zeros.

Float.sub opaque

: Float → Float → Float

Full source

/--
Subtracts 64-bit floating-point numbers according to IEEE 754. Typically used via the `-` operator.

This function does not reduce in the kernel. It is compiled to the C subtraction operator.
-/
@[extern "lean_float_sub"] opaque Float.sub : Float → Float → Float

IEEE 754 Floating-Point Subtraction

Informal description

The function `Float.sub` performs subtraction on two 64-bit floating-point numbers according to the IEEE 754 binary64 standard. Given two floating-point numbers `x` and `y`, it computes `x - y`, handling special cases like infinities, NaNs, and signed zeros as specified by the standard.

Float.mul opaque

: Float → Float → Float

Full source

/--
Multiplies 64-bit floating-point numbers according to IEEE 754. Typically used via the `*` operator.

This function does not reduce in the kernel. It is compiled to the C multiplication operator.
-/
@[extern "lean_float_mul"] opaque Float.mul : Float → Float → Float

IEEE 754 Floating-Point Multiplication

Informal description

The multiplication operation for 64-bit floating-point numbers (IEEE 754 binary64) takes two floating-point numbers and returns their product. This operation handles special cases like `NaN`, infinities, and subnormal numbers according to IEEE 754 rules.

Float.div opaque

: Float → Float → Float

Full source

/--
Divides 64-bit floating-point numbers according to IEEE 754. Typically used via the `/` operator.

In Lean, division by zero typically yields zero. For `Float`, it instead yields either `Inf`,
`-Inf`, or `NaN`.

This function does not reduce in the kernel. It is compiled to the C division operator.
-/
@[extern "lean_float_div"] opaque Float.div : Float → Float → Float

IEEE 754 Floating-Point Division Operation (`/`)

Informal description

The division operation on 64-bit floating-point numbers (IEEE 754 binary64) is defined as a function that takes two `Float` values and returns their quotient. Unlike standard division in Lean (which returns zero for division by zero), this operation follows IEEE 754 semantics, returning: - `Inf` or `-Inf` for non-zero division by zero (depending on the sign of the numerator), - `NaN` for indeterminate forms like 0/0. The function is compiled to the native C division operator and does not reduce in the Lean kernel.

Float.neg opaque

: Float → Float

Full source

/--
Negates 64-bit floating-point numbers according to IEEE 754. Typically used via the `-` prefix
operator.

This function does not reduce in the kernel. It is compiled to the C negation operator.
-/
@[extern "lean_float_negate"] opaque Float.neg : Float → Float

Negation Operation for 64-bit Floating-Point Numbers

Informal description

The function $\mathrm{neg}$ takes a 64-bit floating-point number $x$ and returns its negation $-x$ according to the IEEE 754 binary64 standard. This operation is typically used via the `-` prefix operator.

Float.lt definition

: Float → Float → Prop

Full source

/--
Strict inequality of floating-point numbers. Typically used via the `<` operator.
-/
def Float.lt : Float → Float → Prop := fun a b =>
  match a, b with
  | ⟨a⟩, ⟨b⟩ => floatSpec.lt a b

Floating-point strict inequality relation

Informal description

The strict inequality relation on 64-bit floating-point numbers, defined as $a < b$ according to the IEEE 754 binary64 standard. This relation handles special floating-point values like `NaN`, infinities, and zeros appropriately.

Float.le definition

: Float → Float → Prop

Full source

/--
Non-strict inequality of floating-point numbers. Typically used via the `≤` operator.
-/
def Float.le : Float → Float → Prop := fun a b =>
  floatSpec.le a.val b.val

Floating-point less-than-or-equal relation

Informal description

The relation $\mathrm{le}$ on 64-bit floating-point numbers is defined as the non-strict inequality $a \leq b$ according to the IEEE 754 binary64 standard. This relation is typically used via the $\leq$ operator and includes special handling for floating-point values like `NaN`, infinities, and zeros.

Float.ofBits opaque

: UInt64 → Float

Full source

/--
Bit-for-bit conversion from `UInt64`. Interprets a `UInt64` as a `Float`, ignoring the numeric value
and treating the `UInt64`'s bit pattern as a `Float`.

`Float`s and `UInt64`s have the same endianness on all supported platforms. IEEE 754 very precisely
specifies the bit layout of floats.

This function does not reduce in the kernel.
-/
@[extern "lean_float_of_bits"] opaque Float.ofBits : UInt64 → Float

Bit-Pattern Conversion from Unsigned 64-bit Integer to Floating-Point Number

Informal description

The function `Float.ofBits` converts a 64-bit unsigned integer (`UInt64`) to a 64-bit floating-point number (`Float`) by interpreting the bit pattern of the `UInt64` as an IEEE 754 binary64 floating-point value. This conversion preserves the exact bit representation without any numerical interpretation or reduction.

Float.toBits opaque

: Float → UInt64

Full source

/--
Bit-for-bit conversion to `UInt64`. Interprets a `Float` as a `UInt64`, ignoring the numeric value
and treating the `Float`'s bit pattern as a `UInt64`.

`Float`s and `UInt64`s have the same endianness on all supported platforms. IEEE 754 very precisely
specifies the bit layout of floats.

This function is distinct from `Float.toUInt64`, which attempts to preserve the numeric value rather
than reinterpreting the bit pattern.
-/
@[extern "lean_float_to_bits"] opaque Float.toBits : Float → UInt64

Bitwise Conversion from Float to Unsigned 64-bit Integer

Informal description

The function `Float.toBits` converts a 64-bit floating-point number (`Float`) to an unsigned 64-bit integer (`UInt64`) by directly reinterpreting the bit pattern of the `Float` without any numerical conversion. This operation preserves the exact binary representation of the `Float` as specified by the IEEE 754 binary64 standard, including special values like `NaN`, infinities, and subnormal numbers.

instAddFloat instance

: Add Float

Full source

instance : Add Float := ⟨Float.add⟩

Addition Operation for 64-bit Floating-Point Numbers

Informal description

The 64-bit floating-point numbers (IEEE 754 binary64) have an addition operation defined according to the IEEE 754 standard, which includes handling of special cases like `NaN`, infinities, and subnormal numbers.

instSubFloat instance

: Sub Float

Full source

instance : Sub Float := ⟨Float.sub⟩

Subtraction Operation for 64-bit Floating-Point Numbers

Informal description

The 64-bit floating-point numbers (IEEE 754 binary64) have a subtraction operation defined according to the IEEE 754 standard, which includes handling of special cases like `NaN`, infinities, and subnormal numbers.

instMulFloat instance

: Mul Float

Full source

instance : Mul Float := ⟨Float.mul⟩

Multiplication Operation for 64-bit Floating-Point Numbers

Informal description

The 64-bit floating-point numbers (IEEE 754 binary64) have a multiplication operation defined according to the IEEE 754 standard, which includes handling of special cases like `NaN`, infinities, and subnormal numbers.

instDivFloat instance

: Div Float

Full source

instance : Div Float := ⟨Float.div⟩

Division Operation for 64-bit Floating-Point Numbers

Informal description

The 64-bit floating-point numbers (IEEE 754 binary64) have a division operation defined according to the IEEE 754 standard, which includes handling of special cases like `NaN`, infinities, and subnormal numbers.

instNegFloat instance

: Neg Float

Full source

instance : Neg Float := ⟨Float.neg⟩

Negation Operation for 64-bit Floating-Point Numbers

Informal description

The 64-bit floating-point numbers (IEEE 754 binary64) have a negation operation defined according to the IEEE 754 standard, which includes handling of special cases like `NaN`, infinities, and subnormal numbers.

instLTFloat instance

: LT Float

Full source

instance : LT Float  := ⟨Float.lt⟩

Strict Inequality Relation on Floating-Point Numbers

Informal description

The 64-bit floating-point numbers (IEEE 754 binary64) have a strict inequality relation $<$ defined according to the IEEE 754 standard, which includes handling of special cases like `NaN`, infinities, and subnormal numbers.

instLEFloat instance

: LE Float

Full source

instance : LE Float  := ⟨Float.le⟩

Non-strict Inequality Relation on Floating-Point Numbers

Informal description

The 64-bit floating-point numbers (IEEE 754 binary64) are equipped with a non-strict inequality relation $\leq$ defined according to the IEEE 754 standard, which includes handling of special cases like `NaN`, infinities, and subnormal numbers.

Float.beq opaque

(a b : Float) : Bool

Full source

/--
Checks whether two floating-point numbers are equal according to IEEE 754.

Floating-point equality does not correspond with propositional equality. In particular, it is not
reflexive since `NaN != NaN`, and it is not a congruence because `0.0 == -0.0`, but
`1.0 / 0.0 != 1.0 / -0.0`.

This function does not reduce in the kernel. It is compiled to the C equality operator.
-/
@[extern "lean_float_beq"] opaque Float.beq (a b : Float) : Bool

IEEE 754 Floating-Point Equality Check

Informal description

The function `Float.beq` checks whether two floating-point numbers `a` and `b` are equal according to the IEEE 754 standard. This equality is not reflexive (since `NaN != NaN`) and is not a congruence (since `0.0 == -0.0` but `1.0 / 0.0 != 1.0 / -0.0`). The function compiles to the C equality operator and does not reduce in the kernel.

instBEqFloat instance

: BEq Float

Full source

instance : BEq Float := ⟨Float.beq⟩

IEEE 754 Floating-Point Boolean Equality

Informal description

The type `Float` of 64-bit floating-point numbers is equipped with a boolean equality relation `==` that follows the IEEE 754 standard. This relation is not reflexive (since `NaN != NaN`) and does not form a congruence (since `0.0 == -0.0` but `1.0 / 0.0 != 1.0 / -0.0`).

Float.decLt opaque

(a b : Float) : Decidable (a < b)

Full source

/--
Compares two floating point numbers for strict inequality.

This function does not reduce in the kernel. It is compiled to the C inequality operator.
-/
@[extern "lean_float_decLt"] opaque Float.decLt (a b : Float) : Decidable (a < b) :=
  match a, b with
  | ⟨a⟩, ⟨b⟩ => floatSpec.decLt a b

Decidability of Strict Inequality for Floating-Point Numbers

Informal description

For any two 64-bit floating-point numbers $a$ and $b$ (IEEE 754 binary64), the proposition $a < b$ is decidable, meaning there exists a constructive proof that either $a < b$ holds or its negation holds.

Float.decLe opaque

(a b : Float) : Decidable (a ≤ b)

Full source

/--
Compares two floating point numbers for non-strict inequality.

This function does not reduce in the kernel. It is compiled to the C inequality operator.
-/
@[extern "lean_float_decLe"] opaque Float.decLe (a b : Float) : Decidable (a ≤ b) :=
  match a, b with
  | ⟨a⟩, ⟨b⟩ => floatSpec.decLe a b

Decidability of Non-Strict Inequality for Floating-Point Numbers

Informal description

For any two 64-bit floating-point numbers $a$ and $b$ (IEEE 754 binary64), the non-strict inequality $a \leq b$ is decidable. This means there exists a constructive proof that either $a \leq b$ holds or its negation holds.

floatDecLt instance

(a b : Float) : Decidable (a < b)

Full source

instance floatDecLt (a b : Float) : Decidable (a < b) := Float.decLt a b

Decidability of Strict Inequality for Floating-Point Numbers

Informal description

For any two 64-bit floating-point numbers $a$ and $b$ (IEEE 754 binary64), the strict inequality $a < b$ is decidable. This means there exists a constructive proof that either $a < b$ holds or its negation holds.

floatDecLe instance

(a b : Float) : Decidable (a ≤ b)

Full source

instance floatDecLe (a b : Float) : Decidable (a ≤ b) := Float.decLe a b

Decidability of Non-Strict Inequality for Floating-Point Numbers

Informal description

For any two 64-bit floating-point numbers $a$ and $b$ (IEEE 754 binary64), the non-strict inequality $a \leq b$ is decidable. This means there exists a constructive proof that either $a \leq b$ holds or its negation holds.

Float.toString opaque

: Float → String

Full source

/--
Converts a floating-point number to a string.

This function does not reduce in the kernel.
-/
@[extern "lean_float_to_string"] opaque Float.toString : Float → String

String Conversion of Floating-Point Numbers

Informal description

The function converts a 64-bit floating-point number (IEEE 754 binary64) to its string representation.

Float.toUInt8 opaque

: Float → UInt8

Full source

/--
Converts a floating-point number to an 8-bit unsigned integer.

If the given `Float` is non-negative, truncates the value to a positive integer, rounding down and
clamping to the range of `UInt8`. Returns `0` if the `Float` is negative or `NaN`, and returns the
largest `UInt8` value (i.e. `UInt8.size - 1`) if the float is larger than it.

This function does not reduce in the kernel.
-/
@[extern "lean_float_to_uint8"] opaque Float.toUInt8 : Float → UInt8

Conversion from `Float` to `UInt8` with Clamping

Informal description

The function converts a floating-point number to an 8-bit unsigned integer as follows: - If the input is non-negative, it truncates the value (rounding down) and clamps it to the range of `UInt8` (i.e., $[0, 255]$). - If the input is negative or `NaN`, it returns $0$. - If the input exceeds the maximum `UInt8` value, it returns $255$.

Float.toUInt16 opaque

: Float → UInt16

Full source

/--
Converts a floating-point number to a 16-bit unsigned integer.

If the given `Float` is non-negative, truncates the value to a positive integer, rounding down and
clamping to the range of `UInt16`. Returns `0` if the `Float` is negative or `NaN`, and returns the
largest `UInt16` value (i.e. `UInt16.size - 1`) if the float is larger than it.

This function does not reduce in the kernel.
-/
@[extern "lean_float_to_uint16"] opaque Float.toUInt16 : Float → UInt16

Conversion from Float to 16-bit Unsigned Integer with Clamping

Informal description

The function converts a floating-point number $x$ to a 16-bit unsigned integer as follows: - If $x$ is non-negative, it truncates $x$ to an integer by rounding down and clamps the result to the range $[0, 2^{16}-1]$. - If $x$ is negative or `NaN`, it returns $0$. - If $x$ exceeds $2^{16}-1$, it returns $2^{16}-1$.

Float.toUInt32 opaque

: Float → UInt32

Full source

/--
Converts a floating-point number to a 32-bit unsigned integer.

If the given `Float` is non-negative, truncates the value to a positive integer, rounding down and
clamping to the range of `UInt32`. Returns `0` if the `Float` is negative or `NaN`, and returns the
largest `UInt32` value (i.e. `UInt32.size - 1`) if the float is larger than it.

This function does not reduce in the kernel.
-/
@[extern "lean_float_to_uint32"] opaque Float.toUInt32 : Float → UInt32

Truncation and Clamping of Floating-Point to 32-bit Unsigned Integer

Informal description

The function converts a floating-point number $x$ to a 32-bit unsigned integer by: 1. Truncating $x$ to the nearest integer below if $x$ is non-negative, and clamping the result to the range $[0, 2^{32}-1]$. 2. Returning $0$ if $x$ is negative or `NaN`. 3. Returning $2^{32}-1$ if $x$ exceeds the maximum representable value of `UInt32`.

Float.toUInt64 opaque

: Float → UInt64

Full source

/--
Converts a floating-point number to a 64-bit unsigned integer.

If the given `Float` is non-negative, truncates the value to a positive integer, rounding down and
clamping to the range of `UInt64`. Returns `0` if the `Float` is negative or `NaN`, and returns the
largest `UInt64` value (i.e. `UInt64.size - 1`) if the float is larger than it.

This function does not reduce in the kernel.
-/
@[extern "lean_float_to_uint64"] opaque Float.toUInt64 : Float → UInt64

Conversion from Floating-Point to 64-bit Unsigned Integer with Clamping

Informal description

The function converts a floating-point number to a 64-bit unsigned integer. For a non-negative input, it truncates the value by rounding down and clamps it to the range of `UInt64`. Specifically: - Returns $0$ if the input is negative or `NaN`. - Returns the maximum `UInt64` value (i.e., $2^{64} - 1$) if the input exceeds this value.

Float.toUSize opaque

: Float → USize

Full source

/--
Converts a floating-point number to a word-sized unsigned integer.

If the given `Float` is non-negative, truncates the value to a positive integer, rounding down and
clamping to the range of `USize`. Returns `0` if the `Float` is negative or `NaN`, and returns the
largest `USize` value (i.e. `USize.size - 1`) if the float is larger than it.

This function does not reduce in the kernel.
-/
@[extern "lean_float_to_usize"] opaque Float.toUSize : Float → USize

Truncating conversion from floating-point to unsigned word-size integer

Informal description

The function $\mathrm{toUSize}$ converts a floating-point number to a word-sized unsigned integer. For a non-negative input $x$, it truncates $x$ to the nearest integer less than or equal to $x$, clamping the result to the range $[0, N-1]$ where $N$ is the maximum value of $\mathrm{USize}$. If $x$ is negative, $\mathrm{NaN}$, or $-\infty$, it returns $0$. If $x$ exceeds $N-1$, it returns $N-1$.

Float.isNaN opaque

: Float → Bool

Full source

/--
Checks whether a floating point number is `NaN` (“not a number”) value.

`NaN` values result from operations that might otherwise be errors, such as dividing zero by zero.

This function does not reduce in the kernel. It is compiled to the C operator `isnan`.
-/
@[extern "lean_float_isnan"] opaque Float.isNaN : Float → Bool

Check for NaN in Floating-Point Numbers

Informal description

The function `Float.isNaN` checks whether a given floating-point number is `NaN` (Not a Number). It returns `true` if the input is `NaN` and `false` otherwise. This function is compiled to the C `isnan` operator and does not reduce in the kernel.

Float.isFinite opaque

: Float → Bool

Full source

/--
Checks whether a floating-point number is finite, that is, whether it is normal, subnormal, or zero,
but not infinite or `NaN`.

This function does not reduce in the kernel. It is compiled to the C operator `isfinite`.
-/
@[extern "lean_float_isfinite"] opaque Float.isFinite : Float → Bool

Finite Check for Floating-Point Numbers

Informal description

A floating-point number $x$ is finite if and only if it is either normal, subnormal, or zero, but not infinite or `NaN`. The function `isFinite` checks this property and returns a Boolean value.

Float.isInf opaque

: Float → Bool

Full source

/--
Checks whether a floating-point number is a positive or negative infinite number, but not a finite
number or `NaN`.

This function does not reduce in the kernel. It is compiled to the C operator `isinf`.
-/
@[extern "lean_float_isinf"] opaque Float.isInf : Float → Bool

Floating-Point Infinity Check

Informal description

The function `Float.isInf` checks whether a given floating-point number $x$ is either positive or negative infinity (i.e., $x = +\infty$ or $x = -\infty$), returning `true` in these cases and `false` otherwise. It does not classify finite numbers or `NaN` as infinite.

Float.frExp opaque

: Float → Float × Int

Full source

/--
Splits the given float `x` into a significand/exponent pair `(s, i)` such that `x = s * 2^i` where
`s ∈ (-1;-0.5] ∪ [0.5; 1)`. Returns an undefined value if `x` is not finite.

This function does not reduce in the kernel. It is implemented in compiled code by the C function
`frexp`.
-/
@[extern "lean_float_frexp"] opaque Float.frExp : Float → FloatFloat × Int

Floating-point significand/exponent decomposition: $x = s \times 2^i$ with $s \in (-1, -0.5] \cup [0.5, 1)$

Informal description

For any finite 64-bit floating-point number $x$, the function returns a pair $(s, i)$ where $s$ is the significand and $i$ is the exponent, such that $x = s \times 2^i$ with $s \in (-1, -0.5] \cup [0.5, 1)$. The result is undefined if $x$ is not finite (i.e., NaN or infinite).

instToStringFloat instance

: ToString Float

Full source

instance : ToString Float where
  toString := Float.toString

String Representation of Floating-Point Numbers

Informal description

The type `Float` of 64-bit floating-point numbers has a canonical string representation function.

UInt8.toFloat opaque

(n : UInt8) : Float

Full source

/-- Obtains the `Float` whose value is the same as the given `UInt8`. -/
@[extern "lean_uint8_to_float"] opaque UInt8.toFloat (n : UInt8) : Float

Conversion from Unsigned 8-bit Integer to 64-bit Floating-Point Number

Informal description

For any unsigned 8-bit integer $n$, the function `UInt8.toFloat` returns a 64-bit floating-point number with the same numerical value as $n$.

UInt16.toFloat opaque

(n : UInt16) : Float

Full source

/-- Obtains the `Float` whose value is the same as the given `UInt16`. -/
@[extern "lean_uint16_to_float"] opaque UInt16.toFloat (n : UInt16) : Float

Conversion from Unsigned 16-bit Integer to Floating-Point Number

Informal description

For any unsigned 16-bit integer $n$, the function `UInt16.toFloat` returns a 64-bit floating-point number with the same numerical value as $n$.

UInt32.toFloat opaque

(n : UInt32) : Float

Full source

/-- Obtains the `Float` whose value is the same as the given `UInt32`. -/
@[extern "lean_uint32_to_float"] opaque UInt32.toFloat (n : UInt32) : Float

Conversion from Unsigned 32-bit Integer to 64-bit Floating-Point Number

Informal description

Given an unsigned 32-bit integer $n$, the function returns a 64-bit floating-point number (IEEE 754 binary64) with the same numerical value as $n$.

UInt64.toFloat opaque

(n : UInt64) : Float

Full source

/--
Obtains a `Float` whose value is near the given `UInt64`.

It will be exactly the value of the given `UInt64` if such a `Float` exists. If no such `Float`
exists, the returned value will either be the smallest `Float` that is larger than the given value,
or the largest `Float` that is smaller than the given value.

This function is opaque in the kernel, but is overridden at runtime with an efficient
implementation.
-/
@[extern "lean_uint64_to_float"] opaque UInt64.toFloat (n : UInt64) : Float

Approximate Conversion from Unsigned 64-bit Integer to Floating-Point

Informal description

Given an unsigned 64-bit integer $n$, the function returns a `Float` value that approximates $n$. If $n$ can be exactly represented as a `Float`, the result is exact. Otherwise, the result is either the smallest `Float` larger than $n$ or the largest `Float` smaller than $n$.

USize.toFloat opaque

(n : USize) : Float

Full source

/--
Obtains a `Float` whose value is near the given `USize`.

It will be exactly the value of the given `USize` if such a `Float` exists. If no such `Float`
exists, the returned value will either be the smallest `Float` that is larger than the given value,
or the largest `Float` that is smaller than the given value.

This function is opaque in the kernel, but is overridden at runtime with an efficient
implementation.
-/
@[extern "lean_usize_to_float"] opaque USize.toFloat (n : USize) : Float

Approximate Conversion from Word-Size Unsigned Integer to Floating-Point Number

Informal description

For any unsigned word-size integer $n$, there exists a floating-point number that approximates $n$. If $n$ can be exactly represented as a `Float`, then the result is exactly $n$. Otherwise, the result is either the smallest `Float` larger than $n$ or the largest `Float` smaller than $n$.

instInhabitedFloat instance

: Inhabited Float

Full source

instance : Inhabited Float where
  default := UInt64.toFloat 0

Inhabited Instance for Floating-Point Numbers

Informal description

The type `Float` of 64-bit floating-point numbers is inhabited, meaning it has a designated default value.

instReprFloat instance

: Repr Float

Full source

instance : Repr Float where
  reprPrec n prec := if n < UInt64.toFloat 0 then Repr.addAppParen (toString n) prec else toString n

Canonical Representation of Floating-Point Numbers

Informal description

The type `Float` of 64-bit floating-point numbers (IEEE 754 binary64) has a canonical representation function.

instReprAtomFloat instance

: ReprAtom Float

Full source

instance : ReprAtom Float  := ⟨⟩

Atomic Representation of Floating-Point Numbers

Informal description

The type `Float` of 64-bit floating-point numbers has a canonical representation as atomic values.

Float.sin opaque

: Float → Float

Full source

/--
Computes the sine of a floating-point number in radians.

This function does not reduce in the kernel. It is implemented in compiled code by the C function
`sin`.
-/
@[extern "sin"] opaque Float.sin : Float → Float

Sine Function for Floating-Point Numbers

Informal description

The function $\sin \colon \mathbb{F} \to \mathbb{F}$ computes the sine of a floating-point number (in radians), where $\mathbb{F}$ denotes the set of 64-bit floating-point numbers (IEEE 754 binary64). This function is implemented via the C `sin` function and does not reduce in the kernel.

Float.cos opaque

: Float → Float

Full source

/--
Computes the cosine of a floating-point number in radians.

This function does not reduce in the kernel. It is implemented in compiled code by the C function
`cos`.
-/
@[extern "cos"] opaque Float.cos : Float → Float

Floating-point cosine function

Informal description

The function $\cos$ computes the cosine of a floating-point number $x$ (given in radians). This is implemented using the C standard library function `cos` and does not reduce in the Lean kernel.

Float.tan opaque

: Float → Float

Full source

/--
Computes the tangent of a floating-point number in radians.

This function does not reduce in the kernel. It is implemented in compiled code by the C function
`tan`.
-/
@[extern "tan"] opaque Float.tan : Float → Float

Floating-Point Tangent Function

Informal description

The tangent function for 64-bit floating-point numbers, denoted as $\tan(x)$, computes the tangent of $x$ radians, where $x$ is a floating-point number. This function is implemented via the C function `tan` and does not reduce in the kernel.

Float.asin opaque

: Float → Float

Full source

/--
Computes the arc sine (inverse sine) of a floating-point number in radians.

This function does not reduce in the kernel. It is implemented in compiled code by the C function
`asin`.
-/
@[extern "asin"] opaque Float.asin : Float → Float

Floating-Point Arc Sine Function ($\text{asin}$)

Informal description

The function $\text{asin} : \mathbb{F}_{64} \to \mathbb{F}_{64}$ computes the arc sine (inverse sine) of a 64-bit floating-point number in radians, where $\mathbb{F}_{64}$ denotes the set of IEEE 754 binary64 floating-point numbers.

Float.acos opaque

: Float → Float

Full source

/--
Computes the arc cosine (inverse cosine) of a floating-point number in radians.

This function does not reduce in the kernel. It is implemented in compiled code by the C function
`acos`.
-/
@[extern "acos"] opaque Float.acos : Float → Float

Floating-Point Arc Cosine Function

Informal description

The function $\text{acos} : \mathbb{F} \to \mathbb{F}$ computes the arc cosine (inverse cosine) of a floating-point number in radians, where $\mathbb{F}$ denotes the set of 64-bit floating-point numbers (IEEE 754 binary64). This function is implemented via the C function `acos` and does not reduce in the kernel.

Float.atan opaque

: Float → Float

Full source

/--
Computes the arc tangent (inverse tangent) of a floating-point number in radians.

This function does not reduce in the kernel. It is implemented in compiled code by the C function
`atan`.
-/
@[extern "atan"] opaque Float.atan : Float → Float

Floating-Point Arc Tangent Function ($\text{atan}$)

Informal description

The function $\text{atan} : \mathbb{F}_{64} \to \mathbb{F}_{64}$ computes the arc tangent (inverse tangent) of a 64-bit floating-point number in radians, where $\mathbb{F}_{64}$ denotes the set of IEEE 754 binary64 floating-point numbers. This function is implemented via the C function `atan` and does not reduce in the kernel.

Float.atan2 opaque

(y x : Float) : Float

Full source

/--
Computes the arc tangent (inverse tangent) of `y / x` in radians, in the range `-π`–`π`. The signs
of the arguments determine the quadrant of the result.

This function does not reduce in the kernel. It is implemented in compiled code by the C function
`atan2`.
-/
@[extern "atan2"] opaque Float.atan2 (y x : Float) : Float

Two-argument Arc Tangent Function ($\text{atan2}$) for Floating-Point Numbers

Informal description

The function $\text{atan2}(y, x)$ computes the arc tangent (inverse tangent) of the ratio $y/x$ in radians, returning a value in the range $(-\pi, \pi]$. The signs of $y$ and $x$ determine the quadrant of the result. This function is implemented via the C function `atan2` and does not reduce in the kernel.

Float.sinh opaque

: Float → Float

Full source

/--
Computes the hyperbolic sine of a floating-point number.

This function does not reduce in the kernel. It is implemented in compiled code by the C function
`sinh`.
-/
@[extern "sinh"] opaque Float.sinh : Float → Float

Hyperbolic Sine Function for Floating-Point Numbers

Informal description

The function $\sinh(x)$ computes the hyperbolic sine of a floating-point number $x$.

Float.cosh opaque

: Float → Float

Full source

/--
Computes the hyperbolic cosine of a floating-point number.

This function does not reduce in the kernel. It is implemented in compiled code by the C function
`cosh`.
-/
@[extern "cosh"] opaque Float.cosh : Float → Float

Hyperbolic Cosine Function for Floating-Point Numbers

Informal description

The function $\cosh(x)$ computes the hyperbolic cosine of a floating-point number $x$.

Float.tanh opaque

: Float → Float

Full source

/--
Computes the hyperbolic tangent of a floating-point number.

This function does not reduce in the kernel. It is implemented in compiled code by the C function
`tanh`.
-/
@[extern "tanh"] opaque Float.tanh : Float → Float

Hyperbolic Tangent Function for Floating-Point Numbers

Informal description

The hyperbolic tangent function $\tanh(x)$ for a 64-bit floating-point number $x$ computes the ratio of the hyperbolic sine to the hyperbolic cosine of $x$, i.e., $\tanh(x) = \frac{\sinh(x)}{\cosh(x)}$.

Float.asinh opaque

: Float → Float

Full source

/--
Computes the hyperbolic arc sine (inverse sine) of a floating-point number.

This function does not reduce in the kernel. It is implemented in compiled code by the C function
`asinh`.
-/
@[extern "asinh"] opaque Float.asinh : Float → Float

Inverse Hyperbolic Sine for Floating-Point Numbers

Informal description

The function $\text{asinh} : \text{Float} \to \text{Float}$ computes the inverse hyperbolic sine (arcsinh) of a 64-bit floating-point number. This function is implemented via the C standard library function `asinh` and does not reduce in the Lean kernel.

Float.acosh opaque

: Float → Float

Full source

/--
Computes the hyperbolic arc cosine (inverse cosine) of a floating-point number.

This function does not reduce in the kernel. It is implemented in compiled code by the C function
`acosh`.
-/
@[extern "acosh"] opaque Float.acosh : Float → Float

Inverse Hyperbolic Cosine Function for Floating-Point Numbers ($\operatorname{acosh}$)

Informal description

The function $\operatorname{acosh} : \mathbb{F} \to \mathbb{F}$ computes the inverse hyperbolic cosine (area hyperbolic cosine) of a floating-point number, where $\mathbb{F}$ represents the set of 64-bit floating-point numbers (IEEE 754 binary64).

Float.atanh opaque

: Float → Float

Full source

/--
Computes the hyperbolic arc tangent (inverse tangent) of a floating-point number.

This function does not reduce in the kernel. It is implemented in compiled code by the C function
`atanh`.
-/
@[extern "atanh"] opaque Float.atanh : Float → Float

Hyperbolic Arc Tangent Function for Floating-Point Numbers

Informal description

The function $\text{atanh} : \mathbb{F} \to \mathbb{F}$ computes the hyperbolic arc tangent (inverse hyperbolic tangent) of a floating-point number, where $\mathbb{F}$ denotes the set of 64-bit floating-point numbers according to the IEEE 754 binary64 standard.

Float.exp opaque

(x : Float) : Float

Full source

/--
Computes the exponential `e^x` of a floating-point number.

This function does not reduce in the kernel. It is implemented in compiled code by the C function
`exp`.
-/
@[extern "exp"] opaque Float.exp (x : Float) : Float

Exponential Function for Floating-Point Numbers

Informal description

The function $\exp(x)$ computes the exponential of a floating-point number $x$, i.e., $e^x$.

Float.exp2 opaque

(x : Float) : Float

Full source

/--
Computes the base-2 exponential `2^x` of a floating-point number.

This function does not reduce in the kernel. It is implemented in compiled code by the C function
`exp2`.
-/
@[extern "exp2"] opaque Float.exp2 (x : Float) : Float

Base-2 Exponential Function for Floating-Point Numbers

Informal description

The function $\mathrm{exp2}(x)$ computes the base-2 exponential $2^x$ for a given floating-point number $x$. This function is implemented in compiled code via the C function `exp2` and does not reduce in the Lean kernel.

Float.log opaque

(x : Float) : Float

Full source

/--
Computes the natural logarithm `ln x` of a floating-point number.

This function does not reduce in the kernel. It is implemented in compiled code by the C function
`log`.
-/
@[extern "log"] opaque Float.log (x : Float) : Float

Natural Logarithm for Floating-Point Numbers

Informal description

The natural logarithm function $\ln(x)$ for a floating-point number $x$ computes $\log_e x$ (the logarithm with base $e$). This function is implemented via the C `log` function and does not reduce in the kernel.

Float.log2 opaque

: Float → Float

Full source

/--
Computes the base-2 logarithm of a floating-point number.

This function does not reduce in the kernel. It is implemented in compiled code by the C function
`log2`.
-/
@[extern "log2"] opaque Float.log2 : Float → Float

Base-2 Logarithm Function for Floating-Point Numbers

Informal description

The function $\log_2(x)$ computes the base-2 logarithm of a floating-point number $x$. This function is implemented in compiled code via the C function `log2` and does not reduce in the Lean kernel.

Float.log10 opaque

: Float → Float

Full source

/--
Computes the base-10 logarithm of a floating-point number.

This function does not reduce in the kernel. It is implemented in compiled code by the C function
`log10`.
-/
@[extern "log10"] opaque Float.log10 : Float → Float

Base-10 Logarithm for Floating-Point Numbers

Informal description

The base-10 logarithm function $\log_{10} : \mathbb{F} \to \mathbb{F}$ for 64-bit floating-point numbers $\mathbb{F}$ (IEEE 754 binary64), where $\mathbb{F}$ includes normal numbers, subnormal numbers, infinities, and NaN values.

Float.pow opaque

: Float → Float → Float

Full source

/--
Raises one floating-point number to the power of another. Typically used via the `^` operator.

This function does not reduce in the kernel. It is implemented in compiled code by the C function
`pow`.
-/
@[extern "pow"] opaque Float.pow : Float → Float → Float

Floating-point exponentiation function

Informal description

The function $\mathrm{pow} : \mathrm{Float} \to \mathrm{Float} \to \mathrm{Float}$ raises one floating-point number to the power of another. This is typically accessed via the `^` operator in Lean.

Float.sqrt opaque

: Float → Float

Full source

/--
Computes the square root of a floating-point number.

This function does not reduce in the kernel. It is implemented in compiled code by the C function
`sqrt`.
-/
@[extern "sqrt"] opaque Float.sqrt : Float → Float

Square Root Function for 64-bit Floating-Point Numbers ($\sqrt{x}$)

Informal description

The square root function for 64-bit floating-point numbers, denoted as $\sqrt{x}$, computes the principal (non-negative) square root of a floating-point number $x$ according to the IEEE 754 binary64 standard. Special cases include: - $\sqrt{\text{NaN}} = \text{NaN}$, - $\sqrt{+\infty} = +\infty$, - $\sqrt{-0.0} = -0.0$, - $\sqrt{x} = \text{NaN}$ for $x < 0$ (excluding $-0.0$).

Float.cbrt opaque

: Float → Float

Full source

/--
Computes the cube root of a floating-point number.

This function does not reduce in the kernel. It is implemented in compiled code by the C function
`cbrt`.
-/
@[extern "cbrt"] opaque Float.cbrt : Float → Float

Floating-point cube root function $\sqrt[3]{x}$

Informal description

The cube root function for 64-bit floating-point numbers, denoted as $\sqrt[3]{x}$, computes the real cube root of a given floating-point number $x$. This function is implemented using the C standard library function `cbrt` and handles all special cases (including NaN, infinities, and zeros) according to IEEE 754 specifications.

Float.ceil opaque

: Float → Float

Full source

/--
Computes the ceiling of a floating-point number, which is the smallest integer that's no smaller
than the given number.

This function does not reduce in the kernel. It is implemented in compiled code by the C function
`ceil`.

Examples:
 * `Float.ceil 1.5 = 2`
 * `Float.ceil (-1.5) = (-1)`
-/
@[extern "ceil"] opaque Float.ceil : Float → Float

Ceiling Function for Floating-Point Numbers

Informal description

For any floating-point number $x$, the ceiling function $\lceil x \rceil$ returns the smallest integer that is greater than or equal to $x$. This function is implemented using the C `ceil` function and does not reduce in the Lean kernel. Examples: - $\lceil 1.5 \rceil = 2$ - $\lceil -1.5 \rceil = -1$

Float.floor opaque

: Float → Float

Full source

/--
Computes the floor of a floating-point number, which is the largest integer that's no larger
than the given number.

This function does not reduce in the kernel. It is implemented in compiled code by the C function
`floor`.

Examples:
 * `Float.floor 1.5 = 1`
 * `Float.floor (-1.5) = (-2)`
-/
@[extern "floor"] opaque Float.floor : Float → Float

Floor Function for Floating-Point Numbers

Informal description

The function $\lfloor \cdot \rfloor : \mathbb{R}_{\text{float}} \to \mathbb{R}_{\text{float}}$ computes the floor of a floating-point number, returning the largest floating-point integer that is less than or equal to the input. Here $\mathbb{R}_{\text{float}}$ denotes the set of 64-bit floating-point numbers (IEEE 754 binary64).

Float.round opaque

: Float → Float

Full source

/--
Rounds to the nearest integer, rounding away from zero at half-way points.

This function does not reduce in the kernel. It is implemented in compiled code by the C function
`round`.
-/
@[extern "round"] opaque Float.round : Float → Float

Floating-Point Rounding Function (IEEE 754 round-to-nearest, ties away from zero)

Informal description

The function $\mathrm{round} : \mathbb{F} \to \mathbb{F}$ takes a 64-bit floating-point number $x$ and returns the nearest integer to $x$, rounding away from zero when $x$ is exactly halfway between two integers (where $\mathbb{F}$ denotes the set of IEEE 754 binary64 floating-point numbers).

Float.abs opaque

: Float → Float

Full source

/--
Computes the absolute value of a floating-point number.

This function does not reduce in the kernel. It is implemented in compiled code by the C function
`fabs`.
-/
@[extern "fabs"] opaque Float.abs : Float → Float

Absolute Value for Floating-Point Numbers

Informal description

The function $\mathrm{abs}$ takes a floating-point number $x$ and returns its absolute value $|x|$, computed using the C function `fabs`.

instHomogeneousPowFloat instance

: HomogeneousPow Float

Full source

instance : HomogeneousPow Float := ⟨Float.pow⟩

Homogeneous Power Operation on Floating-Point Numbers

Informal description

The type `Float` of 64-bit floating-point numbers is equipped with a homogeneous power operation, allowing expressions of the form `x ^ y` where both `x` and `y` are floating-point numbers.

instMinFloat instance

: Min Float

Full source

instance : Min Float := minOfLe

Minimum Operation on Floating-Point Numbers

Informal description

The 64-bit floating-point numbers (IEEE 754 binary64) are equipped with a minimum operation $\min$, which takes two floating-point numbers and returns the smaller one according to the IEEE 754 standard ordering, including handling of special cases like `NaN`, infinities, and subnormal numbers.

instMaxFloat instance

: Max Float

Full source

instance : Max Float := maxOfLe

Maximum Operation on Floating-Point Numbers

Informal description

The 64-bit floating-point numbers (IEEE 754 binary64) are equipped with a maximum operation $\max$, which computes the larger of two floating-point numbers according to the IEEE 754 standard, including handling of special cases like `NaN`, infinities, and subnormal numbers.

Float.scaleB opaque

(x : Float) (i : @& Int) : Float

Full source

/--
Efficiently computes `x * 2^i`.

This function does not reduce in the kernel.
-/
@[extern "lean_float_scaleb"]
opaque Float.scaleB (x : Float) (i : @& Int) : Float

Floating-point scaling by powers of two: $x \cdot 2^i$

Informal description

Given a floating-point number $x$ and an integer $i$, the function `Float.scaleB` efficiently computes $x \cdot 2^i$ without kernel reduction.