doc-next-gen

Init.Data.Float

Module docstring

{}

FloatSpec structure
Full source
structure FloatSpec where
  float : Type
  val   : float
  lt    : float → float → Prop
  le    : float → float → Prop
  decLt : DecidableRel lt
  decLe : DecidableRel le
Floating-point number specifications
Informal description
The structure `FloatSpec` represents specifications or properties related to floating-point numbers. Without additional information, the exact nature of these specifications cannot be determined, but they likely include characteristics or constraints for floating-point arithmetic operations.
floatSpec opaque
: FloatSpec
Full source
opaque floatSpec : FloatSpec := {
  float := Unit,
  val   := (),
  lt    := fun _ _ => True,
  le    := fun _ _ => True,
  decLt := fun _ _ => inferInstanceAs (Decidable True),
  decLe := fun _ _ => inferInstanceAs (Decidable True)
}
Floating-point number specifications instance
Informal description
The term `floatSpec` is an instance of the `FloatSpec` structure, which contains specifications or properties related to floating-point numbers. The exact nature of these specifications is not provided, but they likely pertain to characteristics or constraints for floating-point arithmetic operations.
Float structure
Full source
/--
64-bit floating-point numbers.

`Float` corresponds to the IEEE 754 *binary64* format (`double` in C or `f64` in Rust).
Floating-point numbers are a finite representation of a subset of the real numbers, extended with
extra “sentinel” values that represent undefined and infinite results as well as separate positive
and negative zeroes. Arithmetic on floating-point numbers approximates the corresponding operations
on the real numbers by rounding the results to numbers that are representable, propagating error and
infinite values.

Floating-point numbers include [subnormal numbers](https://en.wikipedia.org/wiki/Subnormal_number).
Their special values are:
 * `NaN`, which denotes a class of “not a number” values that result from operations such as
   dividing zero by zero, and
 * `Inf` and `-Inf`, which represent positive and infinities that result from dividing non-zero
   values by zero.
-/
structure Float where
  val : floatSpec.float
64-bit Floating-Point Numbers (IEEE 754 binary64)
Informal description
The structure representing 64-bit floating-point numbers according to the IEEE 754 binary64 standard (also known as `double` in C or `f64` in Rust). Floating-point numbers approximate real numbers with finite precision and include special values: - `NaN` (Not a Number) for undefined results like 0/0, - `Inf` and `-Inf` for positive and negative infinities from operations like non-zero division by zero, - Subnormal numbers for representing values smaller than the smallest normal number. The structure also distinguishes between positive and negative zero.
Float.add opaque
: Float → Float → Float
Full source
/--
Adds two 64-bit floating-point numbers according to IEEE 754. Typically used via the `+` operator.

This function does not reduce in the kernel. It is compiled to the C addition operator.
-/
@[extern "lean_float_add"] opaque Float.add : FloatFloatFloat
IEEE 754 Floating-Point Addition Operation
Informal description
The function $+$ performs addition of two 64-bit floating-point numbers according to the IEEE 754 binary64 standard. This operation handles special cases including: - Not a Number (`NaN`) for undefined results, - Positive and negative infinities (`Inf` and `-Inf`), - Subnormal numbers, and - Signed zeros.
Float.sub opaque
: Float → Float → Float
Full source
/--
Subtracts 64-bit floating-point numbers according to IEEE 754. Typically used via the `-` operator.

This function does not reduce in the kernel. It is compiled to the C subtraction operator.
-/
@[extern "lean_float_sub"] opaque Float.sub : FloatFloatFloat
IEEE 754 Floating-Point Subtraction
Informal description
The function `Float.sub` performs subtraction on two 64-bit floating-point numbers according to the IEEE 754 binary64 standard. Given two floating-point numbers `x` and `y`, it computes `x - y`, handling special cases like infinities, NaNs, and signed zeros as specified by the standard.
Float.mul opaque
: Float → Float → Float
Full source
/--
Multiplies 64-bit floating-point numbers according to IEEE 754. Typically used via the `*` operator.

This function does not reduce in the kernel. It is compiled to the C multiplication operator.
-/
@[extern "lean_float_mul"] opaque Float.mul : FloatFloatFloat
IEEE 754 Floating-Point Multiplication
Informal description
The multiplication operation for 64-bit floating-point numbers (IEEE 754 binary64) takes two floating-point numbers and returns their product. This operation handles special cases like `NaN`, infinities, and subnormal numbers according to IEEE 754 rules.
Float.div opaque
: Float → Float → Float
Full source
/--
Divides 64-bit floating-point numbers according to IEEE 754. Typically used via the `/` operator.

In Lean, division by zero typically yields zero. For `Float`, it instead yields either `Inf`,
`-Inf`, or `NaN`.

This function does not reduce in the kernel. It is compiled to the C division operator.
-/
@[extern "lean_float_div"] opaque Float.div : FloatFloatFloat
IEEE 754 Floating-Point Division Operation (`/`)
Informal description
The division operation on 64-bit floating-point numbers (IEEE 754 binary64) is defined as a function that takes two `Float` values and returns their quotient. Unlike standard division in Lean (which returns zero for division by zero), this operation follows IEEE 754 semantics, returning: - `Inf` or `-Inf` for non-zero division by zero (depending on the sign of the numerator), - `NaN` for indeterminate forms like 0/0. The function is compiled to the native C division operator and does not reduce in the Lean kernel.
Float.neg opaque
: Float → Float
Full source
/--
Negates 64-bit floating-point numbers according to IEEE 754. Typically used via the `-` prefix
operator.

This function does not reduce in the kernel. It is compiled to the C negation operator.
-/
@[extern "lean_float_negate"] opaque Float.neg : FloatFloat
Negation Operation for 64-bit Floating-Point Numbers
Informal description
The function $\mathrm{neg}$ takes a 64-bit floating-point number $x$ and returns its negation $-x$ according to the IEEE 754 binary64 standard. This operation is typically used via the `-` prefix operator.
Float.lt definition
: Float → Float → Prop
Full source
/--
Strict inequality of floating-point numbers. Typically used via the `<` operator.
-/
def Float.lt : FloatFloat → Prop := fun a b =>
  match a, b with
  | ⟨a⟩, ⟨b⟩ => floatSpec.lt a b
Floating-point strict inequality relation
Informal description
The strict inequality relation on 64-bit floating-point numbers, defined as $a < b$ according to the IEEE 754 binary64 standard. This relation handles special floating-point values like `NaN`, infinities, and zeros appropriately.
Float.le definition
: Float → Float → Prop
Full source
/--
Non-strict inequality of floating-point numbers. Typically used via the `≤` operator.
-/
def Float.le : FloatFloat → Prop := fun a b =>
  floatSpec.le a.val b.val
Floating-point less-than-or-equal relation
Informal description
The relation $\mathrm{le}$ on 64-bit floating-point numbers is defined as the non-strict inequality $a \leq b$ according to the IEEE 754 binary64 standard. This relation is typically used via the $\leq$ operator and includes special handling for floating-point values like `NaN`, infinities, and zeros.
Float.ofBits opaque
: UInt64 → Float
Full source
/--
Bit-for-bit conversion from `UInt64`. Interprets a `UInt64` as a `Float`, ignoring the numeric value
and treating the `UInt64`'s bit pattern as a `Float`.

`Float`s and `UInt64`s have the same endianness on all supported platforms. IEEE 754 very precisely
specifies the bit layout of floats.

This function does not reduce in the kernel.
-/
@[extern "lean_float_of_bits"] opaque Float.ofBits : UInt64Float
Bit-Pattern Conversion from Unsigned 64-bit Integer to Floating-Point Number
Informal description
The function `Float.ofBits` converts a 64-bit unsigned integer (`UInt64`) to a 64-bit floating-point number (`Float`) by interpreting the bit pattern of the `UInt64` as an IEEE 754 binary64 floating-point value. This conversion preserves the exact bit representation without any numerical interpretation or reduction.
Float.toBits opaque
: Float → UInt64
Full source
/--
Bit-for-bit conversion to `UInt64`. Interprets a `Float` as a `UInt64`, ignoring the numeric value
and treating the `Float`'s bit pattern as a `UInt64`.

`Float`s and `UInt64`s have the same endianness on all supported platforms. IEEE 754 very precisely
specifies the bit layout of floats.

This function is distinct from `Float.toUInt64`, which attempts to preserve the numeric value rather
than reinterpreting the bit pattern.
-/
@[extern "lean_float_to_bits"] opaque Float.toBits : FloatUInt64
Bitwise Conversion from Float to Unsigned 64-bit Integer
Informal description
The function `Float.toBits` converts a 64-bit floating-point number (`Float`) to an unsigned 64-bit integer (`UInt64`) by directly reinterpreting the bit pattern of the `Float` without any numerical conversion. This operation preserves the exact binary representation of the `Float` as specified by the IEEE 754 binary64 standard, including special values like `NaN`, infinities, and subnormal numbers.
instAddFloat instance
: Add Float
Full source
instance : Add Float := ⟨Float.add⟩
Addition Operation for 64-bit Floating-Point Numbers
Informal description
The 64-bit floating-point numbers (IEEE 754 binary64) have an addition operation defined according to the IEEE 754 standard, which includes handling of special cases like `NaN`, infinities, and subnormal numbers.
instSubFloat instance
: Sub Float
Full source
instance : Sub Float := ⟨Float.sub⟩
Subtraction Operation for 64-bit Floating-Point Numbers
Informal description
The 64-bit floating-point numbers (IEEE 754 binary64) have a subtraction operation defined according to the IEEE 754 standard, which includes handling of special cases like `NaN`, infinities, and subnormal numbers.
instMulFloat instance
: Mul Float
Full source
instance : Mul Float := ⟨Float.mul⟩
Multiplication Operation for 64-bit Floating-Point Numbers
Informal description
The 64-bit floating-point numbers (IEEE 754 binary64) have a multiplication operation defined according to the IEEE 754 standard, which includes handling of special cases like `NaN`, infinities, and subnormal numbers.
instDivFloat instance
: Div Float
Full source
instance : Div Float := ⟨Float.div⟩
Division Operation for 64-bit Floating-Point Numbers
Informal description
The 64-bit floating-point numbers (IEEE 754 binary64) have a division operation defined according to the IEEE 754 standard, which includes handling of special cases like `NaN`, infinities, and subnormal numbers.
instNegFloat instance
: Neg Float
Full source
instance : Neg Float := ⟨Float.neg⟩
Negation Operation for 64-bit Floating-Point Numbers
Informal description
The 64-bit floating-point numbers (IEEE 754 binary64) have a negation operation defined according to the IEEE 754 standard, which includes handling of special cases like `NaN`, infinities, and subnormal numbers.
instLTFloat instance
: LT Float
Full source
instance : LT Float  := ⟨Float.lt⟩
Strict Inequality Relation on Floating-Point Numbers
Informal description
The 64-bit floating-point numbers (IEEE 754 binary64) have a strict inequality relation $<$ defined according to the IEEE 754 standard, which includes handling of special cases like `NaN`, infinities, and subnormal numbers.
instLEFloat instance
: LE Float
Full source
instance : LE Float  := ⟨Float.le⟩
Non-strict Inequality Relation on Floating-Point Numbers
Informal description
The 64-bit floating-point numbers (IEEE 754 binary64) are equipped with a non-strict inequality relation $\leq$ defined according to the IEEE 754 standard, which includes handling of special cases like `NaN`, infinities, and subnormal numbers.
Float.beq opaque
(a b : Float) : Bool
Full source
/--
Checks whether two floating-point numbers are equal according to IEEE 754.

Floating-point equality does not correspond with propositional equality. In particular, it is not
reflexive since `NaN != NaN`, and it is not a congruence because `0.0 == -0.0`, but
`1.0 / 0.0 != 1.0 / -0.0`.

This function does not reduce in the kernel. It is compiled to the C equality operator.
-/
@[extern "lean_float_beq"] opaque Float.beq (a b : Float) : Bool
IEEE 754 Floating-Point Equality Check
Informal description
The function `Float.beq` checks whether two floating-point numbers `a` and `b` are equal according to the IEEE 754 standard. This equality is not reflexive (since `NaN != NaN`) and is not a congruence (since `0.0 == -0.0` but `1.0 / 0.0 != 1.0 / -0.0`). The function compiles to the C equality operator and does not reduce in the kernel.
instBEqFloat instance
: BEq Float
Full source
instance : BEq Float := ⟨Float.beq⟩
IEEE 754 Floating-Point Boolean Equality
Informal description
The type `Float` of 64-bit floating-point numbers is equipped with a boolean equality relation `==` that follows the IEEE 754 standard. This relation is not reflexive (since `NaN != NaN`) and does not form a congruence (since `0.0 == -0.0` but `1.0 / 0.0 != 1.0 / -0.0`).
Float.decLt opaque
(a b : Float) : Decidable (a < b)
Full source
/--
Compares two floating point numbers for strict inequality.

This function does not reduce in the kernel. It is compiled to the C inequality operator.
-/
@[extern "lean_float_decLt"] opaque Float.decLt (a b : Float) : Decidable (a < b) :=
  match a, b with
  | ⟨a⟩, ⟨b⟩ => floatSpec.decLt a b
Decidability of Strict Inequality for Floating-Point Numbers
Informal description
For any two 64-bit floating-point numbers $a$ and $b$ (IEEE 754 binary64), the proposition $a < b$ is decidable, meaning there exists a constructive proof that either $a < b$ holds or its negation holds.
Float.decLe opaque
(a b : Float) : Decidable (a ≤ b)
Full source
/--
Compares two floating point numbers for non-strict inequality.

This function does not reduce in the kernel. It is compiled to the C inequality operator.
-/
@[extern "lean_float_decLe"] opaque Float.decLe (a b : Float) : Decidable (a ≤ b) :=
  match a, b with
  | ⟨a⟩, ⟨b⟩ => floatSpec.decLe a b
Decidability of Non-Strict Inequality for Floating-Point Numbers
Informal description
For any two 64-bit floating-point numbers $a$ and $b$ (IEEE 754 binary64), the non-strict inequality $a \leq b$ is decidable. This means there exists a constructive proof that either $a \leq b$ holds or its negation holds.
floatDecLt instance
(a b : Float) : Decidable (a < b)
Full source
instance floatDecLt (a b : Float) : Decidable (a < b) := Float.decLt a b
Decidability of Strict Inequality for Floating-Point Numbers
Informal description
For any two 64-bit floating-point numbers $a$ and $b$ (IEEE 754 binary64), the strict inequality $a < b$ is decidable. This means there exists a constructive proof that either $a < b$ holds or its negation holds.
floatDecLe instance
(a b : Float) : Decidable (a ≤ b)
Full source
instance floatDecLe (a b : Float) : Decidable (a ≤ b) := Float.decLe a b
Decidability of Non-Strict Inequality for Floating-Point Numbers
Informal description
For any two 64-bit floating-point numbers $a$ and $b$ (IEEE 754 binary64), the non-strict inequality $a \leq b$ is decidable. This means there exists a constructive proof that either $a \leq b$ holds or its negation holds.
Float.toString opaque
: Float → String
Full source
/--
Converts a floating-point number to a string.

This function does not reduce in the kernel.
-/
@[extern "lean_float_to_string"] opaque Float.toString : FloatString
String Conversion of Floating-Point Numbers
Informal description
The function converts a 64-bit floating-point number (IEEE 754 binary64) to its string representation.
Float.toUInt8 opaque
: Float → UInt8
Full source
/--
Converts a floating-point number to an 8-bit unsigned integer.

If the given `Float` is non-negative, truncates the value to a positive integer, rounding down and
clamping to the range of `UInt8`. Returns `0` if the `Float` is negative or `NaN`, and returns the
largest `UInt8` value (i.e. `UInt8.size - 1`) if the float is larger than it.

This function does not reduce in the kernel.
-/
@[extern "lean_float_to_uint8"] opaque Float.toUInt8 : FloatUInt8
Conversion from `Float` to `UInt8` with Clamping
Informal description
The function converts a floating-point number to an 8-bit unsigned integer as follows: - If the input is non-negative, it truncates the value (rounding down) and clamps it to the range of `UInt8` (i.e., $[0, 255]$). - If the input is negative or `NaN`, it returns $0$. - If the input exceeds the maximum `UInt8` value, it returns $255$.
Float.toUInt16 opaque
: Float → UInt16
Full source
/--
Converts a floating-point number to a 16-bit unsigned integer.

If the given `Float` is non-negative, truncates the value to a positive integer, rounding down and
clamping to the range of `UInt16`. Returns `0` if the `Float` is negative or `NaN`, and returns the
largest `UInt16` value (i.e. `UInt16.size - 1`) if the float is larger than it.

This function does not reduce in the kernel.
-/
@[extern "lean_float_to_uint16"] opaque Float.toUInt16 : FloatUInt16
Conversion from Float to 16-bit Unsigned Integer with Clamping
Informal description
The function converts a floating-point number $x$ to a 16-bit unsigned integer as follows: - If $x$ is non-negative, it truncates $x$ to an integer by rounding down and clamps the result to the range $[0, 2^{16}-1]$. - If $x$ is negative or `NaN`, it returns $0$. - If $x$ exceeds $2^{16}-1$, it returns $2^{16}-1$.
Float.toUInt32 opaque
: Float → UInt32
Full source
/--
Converts a floating-point number to a 32-bit unsigned integer.

If the given `Float` is non-negative, truncates the value to a positive integer, rounding down and
clamping to the range of `UInt32`. Returns `0` if the `Float` is negative or `NaN`, and returns the
largest `UInt32` value (i.e. `UInt32.size - 1`) if the float is larger than it.

This function does not reduce in the kernel.
-/
@[extern "lean_float_to_uint32"] opaque Float.toUInt32 : FloatUInt32
Truncation and Clamping of Floating-Point to 32-bit Unsigned Integer
Informal description
The function converts a floating-point number $x$ to a 32-bit unsigned integer by: 1. Truncating $x$ to the nearest integer below if $x$ is non-negative, and clamping the result to the range $[0, 2^{32}-1]$. 2. Returning $0$ if $x$ is negative or `NaN`. 3. Returning $2^{32}-1$ if $x$ exceeds the maximum representable value of `UInt32`.
Float.toUInt64 opaque
: Float → UInt64
Full source
/--
Converts a floating-point number to a 64-bit unsigned integer.

If the given `Float` is non-negative, truncates the value to a positive integer, rounding down and
clamping to the range of `UInt64`. Returns `0` if the `Float` is negative or `NaN`, and returns the
largest `UInt64` value (i.e. `UInt64.size - 1`) if the float is larger than it.

This function does not reduce in the kernel.
-/
@[extern "lean_float_to_uint64"] opaque Float.toUInt64 : FloatUInt64
Conversion from Floating-Point to 64-bit Unsigned Integer with Clamping
Informal description
The function converts a floating-point number to a 64-bit unsigned integer. For a non-negative input, it truncates the value by rounding down and clamps it to the range of `UInt64`. Specifically: - Returns $0$ if the input is negative or `NaN`. - Returns the maximum `UInt64` value (i.e., $2^{64} - 1$) if the input exceeds this value.
Float.toUSize opaque
: Float → USize
Full source
/--
Converts a floating-point number to a word-sized unsigned integer.

If the given `Float` is non-negative, truncates the value to a positive integer, rounding down and
clamping to the range of `USize`. Returns `0` if the `Float` is negative or `NaN`, and returns the
largest `USize` value (i.e. `USize.size - 1`) if the float is larger than it.

This function does not reduce in the kernel.
-/
@[extern "lean_float_to_usize"] opaque Float.toUSize : FloatUSize
Truncating conversion from floating-point to unsigned word-size integer
Informal description
The function $\mathrm{toUSize}$ converts a floating-point number to a word-sized unsigned integer. For a non-negative input $x$, it truncates $x$ to the nearest integer less than or equal to $x$, clamping the result to the range $[0, N-1]$ where $N$ is the maximum value of $\mathrm{USize}$. If $x$ is negative, $\mathrm{NaN}$, or $-\infty$, it returns $0$. If $x$ exceeds $N-1$, it returns $N-1$.
Float.isNaN opaque
: Float → Bool
Full source
/--
Checks whether a floating point number is `NaN` (“not a number”) value.

`NaN` values result from operations that might otherwise be errors, such as dividing zero by zero.

This function does not reduce in the kernel. It is compiled to the C operator `isnan`.
-/
@[extern "lean_float_isnan"] opaque Float.isNaN : FloatBool
Check for NaN in Floating-Point Numbers
Informal description
The function `Float.isNaN` checks whether a given floating-point number is `NaN` (Not a Number). It returns `true` if the input is `NaN` and `false` otherwise. This function is compiled to the C `isnan` operator and does not reduce in the kernel.
Float.isFinite opaque
: Float → Bool
Full source
/--
Checks whether a floating-point number is finite, that is, whether it is normal, subnormal, or zero,
but not infinite or `NaN`.

This function does not reduce in the kernel. It is compiled to the C operator `isfinite`.
-/
@[extern "lean_float_isfinite"] opaque Float.isFinite : FloatBool
Finite Check for Floating-Point Numbers
Informal description
A floating-point number $x$ is finite if and only if it is either normal, subnormal, or zero, but not infinite or `NaN`. The function `isFinite` checks this property and returns a Boolean value.
Float.isInf opaque
: Float → Bool
Full source
/--
Checks whether a floating-point number is a positive or negative infinite number, but not a finite
number or `NaN`.

This function does not reduce in the kernel. It is compiled to the C operator `isinf`.
-/
@[extern "lean_float_isinf"] opaque Float.isInf : FloatBool
Floating-Point Infinity Check
Informal description
The function `Float.isInf` checks whether a given floating-point number $x$ is either positive or negative infinity (i.e., $x = +\infty$ or $x = -\infty$), returning `true` in these cases and `false` otherwise. It does not classify finite numbers or `NaN` as infinite.
Float.frExp opaque
: Float → Float × Int
Full source
/--
Splits the given float `x` into a significand/exponent pair `(s, i)` such that `x = s * 2^i` where
`s ∈ (-1;-0.5] ∪ [0.5; 1)`. Returns an undefined value if `x` is not finite.

This function does not reduce in the kernel. It is implemented in compiled code by the C function
`frexp`.
-/
@[extern "lean_float_frexp"] opaque Float.frExp : FloatFloatFloat × Int
Floating-point significand/exponent decomposition: $x = s \times 2^i$ with $s \in (-1, -0.5] \cup [0.5, 1)$
Informal description
For any finite 64-bit floating-point number $x$, the function returns a pair $(s, i)$ where $s$ is the significand and $i$ is the exponent, such that $x = s \times 2^i$ with $s \in (-1, -0.5] \cup [0.5, 1)$. The result is undefined if $x$ is not finite (i.e., NaN or infinite).
instToStringFloat instance
: ToString Float
Full source
instance : ToString Float where
  toString := Float.toString
String Representation of Floating-Point Numbers
Informal description
The type `Float` of 64-bit floating-point numbers has a canonical string representation function.
UInt8.toFloat opaque
(n : UInt8) : Float
Full source
/-- Obtains the `Float` whose value is the same as the given `UInt8`. -/
@[extern "lean_uint8_to_float"] opaque UInt8.toFloat (n : UInt8) : Float
Conversion from Unsigned 8-bit Integer to 64-bit Floating-Point Number
Informal description
For any unsigned 8-bit integer $n$, the function `UInt8.toFloat` returns a 64-bit floating-point number with the same numerical value as $n$.
UInt16.toFloat opaque
(n : UInt16) : Float
Full source
/-- Obtains the `Float` whose value is the same as the given `UInt16`. -/
@[extern "lean_uint16_to_float"] opaque UInt16.toFloat (n : UInt16) : Float
Conversion from Unsigned 16-bit Integer to Floating-Point Number
Informal description
For any unsigned 16-bit integer $n$, the function `UInt16.toFloat` returns a 64-bit floating-point number with the same numerical value as $n$.
UInt32.toFloat opaque
(n : UInt32) : Float
Full source
/-- Obtains the `Float` whose value is the same as the given `UInt32`. -/
@[extern "lean_uint32_to_float"] opaque UInt32.toFloat (n : UInt32) : Float
Conversion from Unsigned 32-bit Integer to 64-bit Floating-Point Number
Informal description
Given an unsigned 32-bit integer $n$, the function returns a 64-bit floating-point number (IEEE 754 binary64) with the same numerical value as $n$.
UInt64.toFloat opaque
(n : UInt64) : Float
Full source
/--
Obtains a `Float` whose value is near the given `UInt64`.

It will be exactly the value of the given `UInt64` if such a `Float` exists. If no such `Float`
exists, the returned value will either be the smallest `Float` that is larger than the given value,
or the largest `Float` that is smaller than the given value.

This function is opaque in the kernel, but is overridden at runtime with an efficient
implementation.
-/
@[extern "lean_uint64_to_float"] opaque UInt64.toFloat (n : UInt64) : Float
Approximate Conversion from Unsigned 64-bit Integer to Floating-Point
Informal description
Given an unsigned 64-bit integer $n$, the function returns a `Float` value that approximates $n$. If $n$ can be exactly represented as a `Float`, the result is exact. Otherwise, the result is either the smallest `Float` larger than $n$ or the largest `Float` smaller than $n$.
USize.toFloat opaque
(n : USize) : Float
Full source
/--
Obtains a `Float` whose value is near the given `USize`.

It will be exactly the value of the given `USize` if such a `Float` exists. If no such `Float`
exists, the returned value will either be the smallest `Float` that is larger than the given value,
or the largest `Float` that is smaller than the given value.

This function is opaque in the kernel, but is overridden at runtime with an efficient
implementation.
-/
@[extern "lean_usize_to_float"] opaque USize.toFloat (n : USize) : Float
Approximate Conversion from Word-Size Unsigned Integer to Floating-Point Number
Informal description
For any unsigned word-size integer $n$, there exists a floating-point number that approximates $n$. If $n$ can be exactly represented as a `Float`, then the result is exactly $n$. Otherwise, the result is either the smallest `Float` larger than $n$ or the largest `Float` smaller than $n$.
instInhabitedFloat instance
: Inhabited Float
Full source
instance : Inhabited Float where
  default := UInt64.toFloat 0
Inhabited Instance for Floating-Point Numbers
Informal description
The type `Float` of 64-bit floating-point numbers is inhabited, meaning it has a designated default value.
instReprAtomFloat instance
: ReprAtom Float
Full source
instance : ReprAtom Float  := ⟨⟩
Atomic Representation of Floating-Point Numbers
Informal description
The type `Float` of 64-bit floating-point numbers has a canonical representation as atomic values.
Float.sin opaque
: Float → Float
Full source
/--
Computes the sine of a floating-point number in radians.

This function does not reduce in the kernel. It is implemented in compiled code by the C function
`sin`.
-/
@[extern "sin"] opaque Float.sin : FloatFloat
Sine Function for Floating-Point Numbers
Informal description
The function $\sin \colon \mathbb{F} \to \mathbb{F}$ computes the sine of a floating-point number (in radians), where $\mathbb{F}$ denotes the set of 64-bit floating-point numbers (IEEE 754 binary64). This function is implemented via the C `sin` function and does not reduce in the kernel.
Float.cos opaque
: Float → Float
Full source
/--
Computes the cosine of a floating-point number in radians.

This function does not reduce in the kernel. It is implemented in compiled code by the C function
`cos`.
-/
@[extern "cos"] opaque Float.cos : FloatFloat
Floating-point cosine function
Informal description
The function $\cos$ computes the cosine of a floating-point number $x$ (given in radians). This is implemented using the C standard library function `cos` and does not reduce in the Lean kernel.
Float.tan opaque
: Float → Float
Full source
/--
Computes the tangent of a floating-point number in radians.

This function does not reduce in the kernel. It is implemented in compiled code by the C function
`tan`.
-/
@[extern "tan"] opaque Float.tan : FloatFloat
Floating-Point Tangent Function
Informal description
The tangent function for 64-bit floating-point numbers, denoted as $\tan(x)$, computes the tangent of $x$ radians, where $x$ is a floating-point number. This function is implemented via the C function `tan` and does not reduce in the kernel.
Float.asin opaque
: Float → Float
Full source
/--
Computes the arc sine (inverse sine) of a floating-point number in radians.

This function does not reduce in the kernel. It is implemented in compiled code by the C function
`asin`.
-/
@[extern "asin"] opaque Float.asin : FloatFloat
Floating-Point Arc Sine Function ($\text{asin}$)
Informal description
The function $\text{asin} : \mathbb{F}_{64} \to \mathbb{F}_{64}$ computes the arc sine (inverse sine) of a 64-bit floating-point number in radians, where $\mathbb{F}_{64}$ denotes the set of IEEE 754 binary64 floating-point numbers.
Float.acos opaque
: Float → Float
Full source
/--
Computes the arc cosine (inverse cosine) of a floating-point number in radians.

This function does not reduce in the kernel. It is implemented in compiled code by the C function
`acos`.
-/
@[extern "acos"] opaque Float.acos : FloatFloat
Floating-Point Arc Cosine Function
Informal description
The function $\text{acos} : \mathbb{F} \to \mathbb{F}$ computes the arc cosine (inverse cosine) of a floating-point number in radians, where $\mathbb{F}$ denotes the set of 64-bit floating-point numbers (IEEE 754 binary64). This function is implemented via the C function `acos` and does not reduce in the kernel.
Float.atan opaque
: Float → Float
Full source
/--
Computes the arc tangent (inverse tangent) of a floating-point number in radians.

This function does not reduce in the kernel. It is implemented in compiled code by the C function
`atan`.
-/
@[extern "atan"] opaque Float.atan : FloatFloat
Floating-Point Arc Tangent Function ($\text{atan}$)
Informal description
The function $\text{atan} : \mathbb{F}_{64} \to \mathbb{F}_{64}$ computes the arc tangent (inverse tangent) of a 64-bit floating-point number in radians, where $\mathbb{F}_{64}$ denotes the set of IEEE 754 binary64 floating-point numbers. This function is implemented via the C function `atan` and does not reduce in the kernel.
Float.atan2 opaque
(y x : Float) : Float
Full source
/--
Computes the arc tangent (inverse tangent) of `y / x` in radians, in the range `-π`–`π`. The signs
of the arguments determine the quadrant of the result.

This function does not reduce in the kernel. It is implemented in compiled code by the C function
`atan2`.
-/
@[extern "atan2"] opaque Float.atan2 (y x : Float) : Float
Two-argument Arc Tangent Function ($\text{atan2}$) for Floating-Point Numbers
Informal description
The function $\text{atan2}(y, x)$ computes the arc tangent (inverse tangent) of the ratio $y/x$ in radians, returning a value in the range $(-\pi, \pi]$. The signs of $y$ and $x$ determine the quadrant of the result. This function is implemented via the C function `atan2` and does not reduce in the kernel.
Float.sinh opaque
: Float → Float
Full source
/--
Computes the hyperbolic sine of a floating-point number.

This function does not reduce in the kernel. It is implemented in compiled code by the C function
`sinh`.
-/
@[extern "sinh"] opaque Float.sinh : FloatFloat
Hyperbolic Sine Function for Floating-Point Numbers
Informal description
The function $\sinh(x)$ computes the hyperbolic sine of a floating-point number $x$.
Float.cosh opaque
: Float → Float
Full source
/--
Computes the hyperbolic cosine of a floating-point number.

This function does not reduce in the kernel. It is implemented in compiled code by the C function
`cosh`.
-/
@[extern "cosh"] opaque Float.cosh : FloatFloat
Hyperbolic Cosine Function for Floating-Point Numbers
Informal description
The function $\cosh(x)$ computes the hyperbolic cosine of a floating-point number $x$.
Float.tanh opaque
: Float → Float
Full source
/--
Computes the hyperbolic tangent of a floating-point number.

This function does not reduce in the kernel. It is implemented in compiled code by the C function
`tanh`.
-/
@[extern "tanh"] opaque Float.tanh : FloatFloat
Hyperbolic Tangent Function for Floating-Point Numbers
Informal description
The hyperbolic tangent function $\tanh(x)$ for a 64-bit floating-point number $x$ computes the ratio of the hyperbolic sine to the hyperbolic cosine of $x$, i.e., $\tanh(x) = \frac{\sinh(x)}{\cosh(x)}$.
Float.asinh opaque
: Float → Float
Full source
/--
Computes the hyperbolic arc sine (inverse sine) of a floating-point number.

This function does not reduce in the kernel. It is implemented in compiled code by the C function
`asinh`.
-/
@[extern "asinh"] opaque Float.asinh : FloatFloat
Inverse Hyperbolic Sine for Floating-Point Numbers
Informal description
The function $\text{asinh} : \text{Float} \to \text{Float}$ computes the inverse hyperbolic sine (arcsinh) of a 64-bit floating-point number. This function is implemented via the C standard library function `asinh` and does not reduce in the Lean kernel.
Float.acosh opaque
: Float → Float
Full source
/--
Computes the hyperbolic arc cosine (inverse cosine) of a floating-point number.

This function does not reduce in the kernel. It is implemented in compiled code by the C function
`acosh`.
-/
@[extern "acosh"] opaque Float.acosh : FloatFloat
Inverse Hyperbolic Cosine Function for Floating-Point Numbers ($\operatorname{acosh}$)
Informal description
The function $\operatorname{acosh} : \mathbb{F} \to \mathbb{F}$ computes the inverse hyperbolic cosine (area hyperbolic cosine) of a floating-point number, where $\mathbb{F}$ represents the set of 64-bit floating-point numbers (IEEE 754 binary64).
Float.atanh opaque
: Float → Float
Full source
/--
Computes the hyperbolic arc tangent (inverse tangent) of a floating-point number.

This function does not reduce in the kernel. It is implemented in compiled code by the C function
`atanh`.
-/
@[extern "atanh"] opaque Float.atanh : FloatFloat
Hyperbolic Arc Tangent Function for Floating-Point Numbers
Informal description
The function $\text{atanh} : \mathbb{F} \to \mathbb{F}$ computes the hyperbolic arc tangent (inverse hyperbolic tangent) of a floating-point number, where $\mathbb{F}$ denotes the set of 64-bit floating-point numbers according to the IEEE 754 binary64 standard.
Float.exp opaque
(x : Float) : Float
Full source
/--
Computes the exponential `e^x` of a floating-point number.

This function does not reduce in the kernel. It is implemented in compiled code by the C function
`exp`.
-/
@[extern "exp"] opaque Float.exp (x : Float) : Float
Exponential Function for Floating-Point Numbers
Informal description
The function $\exp(x)$ computes the exponential of a floating-point number $x$, i.e., $e^x$.
Float.exp2 opaque
(x : Float) : Float
Full source
/--
Computes the base-2 exponential `2^x` of a floating-point number.

This function does not reduce in the kernel. It is implemented in compiled code by the C function
`exp2`.
-/
@[extern "exp2"] opaque Float.exp2 (x : Float) : Float
Base-2 Exponential Function for Floating-Point Numbers
Informal description
The function $\mathrm{exp2}(x)$ computes the base-2 exponential $2^x$ for a given floating-point number $x$. This function is implemented in compiled code via the C function `exp2` and does not reduce in the Lean kernel.
Float.log opaque
(x : Float) : Float
Full source
/--
Computes the natural logarithm `ln x` of a floating-point number.

This function does not reduce in the kernel. It is implemented in compiled code by the C function
`log`.
-/
@[extern "log"] opaque Float.log (x : Float) : Float
Natural Logarithm for Floating-Point Numbers
Informal description
The natural logarithm function $\ln(x)$ for a floating-point number $x$ computes $\log_e x$ (the logarithm with base $e$). This function is implemented via the C `log` function and does not reduce in the kernel.
Float.log2 opaque
: Float → Float
Full source
/--
Computes the base-2 logarithm of a floating-point number.

This function does not reduce in the kernel. It is implemented in compiled code by the C function
`log2`.
-/
@[extern "log2"] opaque Float.log2 : FloatFloat
Base-2 Logarithm Function for Floating-Point Numbers
Informal description
The function $\log_2(x)$ computes the base-2 logarithm of a floating-point number $x$. This function is implemented in compiled code via the C function `log2` and does not reduce in the Lean kernel.
Float.log10 opaque
: Float → Float
Full source
/--
Computes the base-10 logarithm of a floating-point number.

This function does not reduce in the kernel. It is implemented in compiled code by the C function
`log10`.
-/
@[extern "log10"] opaque Float.log10 : FloatFloat
Base-10 Logarithm for Floating-Point Numbers
Informal description
The base-10 logarithm function $\log_{10} : \mathbb{F} \to \mathbb{F}$ for 64-bit floating-point numbers $\mathbb{F}$ (IEEE 754 binary64), where $\mathbb{F}$ includes normal numbers, subnormal numbers, infinities, and NaN values.
Float.pow opaque
: Float → Float → Float
Full source
/--
Raises one floating-point number to the power of another. Typically used via the `^` operator.

This function does not reduce in the kernel. It is implemented in compiled code by the C function
`pow`.
-/
@[extern "pow"] opaque Float.pow : FloatFloatFloat
Floating-point exponentiation function
Informal description
The function $\mathrm{pow} : \mathrm{Float} \to \mathrm{Float} \to \mathrm{Float}$ raises one floating-point number to the power of another. This is typically accessed via the `^` operator in Lean.
Float.sqrt opaque
: Float → Float
Full source
/--
Computes the square root of a floating-point number.

This function does not reduce in the kernel. It is implemented in compiled code by the C function
`sqrt`.
-/
@[extern "sqrt"] opaque Float.sqrt : FloatFloat
Square Root Function for 64-bit Floating-Point Numbers ($\sqrt{x}$)
Informal description
The square root function for 64-bit floating-point numbers, denoted as $\sqrt{x}$, computes the principal (non-negative) square root of a floating-point number $x$ according to the IEEE 754 binary64 standard. Special cases include: - $\sqrt{\text{NaN}} = \text{NaN}$, - $\sqrt{+\infty} = +\infty$, - $\sqrt{-0.0} = -0.0$, - $\sqrt{x} = \text{NaN}$ for $x < 0$ (excluding $-0.0$).
Float.cbrt opaque
: Float → Float
Full source
/--
Computes the cube root of a floating-point number.

This function does not reduce in the kernel. It is implemented in compiled code by the C function
`cbrt`.
-/
@[extern "cbrt"] opaque Float.cbrt : FloatFloat
Floating-point cube root function $\sqrt[3]{x}$
Informal description
The cube root function for 64-bit floating-point numbers, denoted as $\sqrt[3]{x}$, computes the real cube root of a given floating-point number $x$. This function is implemented using the C standard library function `cbrt` and handles all special cases (including NaN, infinities, and zeros) according to IEEE 754 specifications.
Float.ceil opaque
: Float → Float
Full source
/--
Computes the ceiling of a floating-point number, which is the smallest integer that's no smaller
than the given number.

This function does not reduce in the kernel. It is implemented in compiled code by the C function
`ceil`.

Examples:
 * `Float.ceil 1.5 = 2`
 * `Float.ceil (-1.5) = (-1)`
-/
@[extern "ceil"] opaque Float.ceil : FloatFloat
Ceiling Function for Floating-Point Numbers
Informal description
For any floating-point number $x$, the ceiling function $\lceil x \rceil$ returns the smallest integer that is greater than or equal to $x$. This function is implemented using the C `ceil` function and does not reduce in the Lean kernel. Examples: - $\lceil 1.5 \rceil = 2$ - $\lceil -1.5 \rceil = -1$
Float.floor opaque
: Float → Float
Full source
/--
Computes the floor of a floating-point number, which is the largest integer that's no larger
than the given number.

This function does not reduce in the kernel. It is implemented in compiled code by the C function
`floor`.

Examples:
 * `Float.floor 1.5 = 1`
 * `Float.floor (-1.5) = (-2)`
-/
@[extern "floor"] opaque Float.floor : FloatFloat
Floor Function for Floating-Point Numbers
Informal description
The function $\lfloor \cdot \rfloor : \mathbb{R}_{\text{float}} \to \mathbb{R}_{\text{float}}$ computes the floor of a floating-point number, returning the largest floating-point integer that is less than or equal to the input. Here $\mathbb{R}_{\text{float}}$ denotes the set of 64-bit floating-point numbers (IEEE 754 binary64).
Float.round opaque
: Float → Float
Full source
/--
Rounds to the nearest integer, rounding away from zero at half-way points.

This function does not reduce in the kernel. It is implemented in compiled code by the C function
`round`.
-/
@[extern "round"] opaque Float.round : FloatFloat
Floating-Point Rounding Function (IEEE 754 round-to-nearest, ties away from zero)
Informal description
The function $\mathrm{round} : \mathbb{F} \to \mathbb{F}$ takes a 64-bit floating-point number $x$ and returns the nearest integer to $x$, rounding away from zero when $x$ is exactly halfway between two integers (where $\mathbb{F}$ denotes the set of IEEE 754 binary64 floating-point numbers).
Float.abs opaque
: Float → Float
Full source
/--
Computes the absolute value of a floating-point number.

This function does not reduce in the kernel. It is implemented in compiled code by the C function
`fabs`.
-/
@[extern "fabs"] opaque Float.abs : FloatFloat
Absolute Value for Floating-Point Numbers
Informal description
The function $\mathrm{abs}$ takes a floating-point number $x$ and returns its absolute value $|x|$, computed using the C function `fabs`.
instHomogeneousPowFloat instance
: HomogeneousPow Float
Full source
instance : HomogeneousPow Float := ⟨Float.pow⟩
Homogeneous Power Operation on Floating-Point Numbers
Informal description
The type `Float` of 64-bit floating-point numbers is equipped with a homogeneous power operation, allowing expressions of the form `x ^ y` where both `x` and `y` are floating-point numbers.
instMinFloat instance
: Min Float
Full source
instance : Min Float := minOfLe
Minimum Operation on Floating-Point Numbers
Informal description
The 64-bit floating-point numbers (IEEE 754 binary64) are equipped with a minimum operation $\min$, which takes two floating-point numbers and returns the smaller one according to the IEEE 754 standard ordering, including handling of special cases like `NaN`, infinities, and subnormal numbers.
instMaxFloat instance
: Max Float
Full source
instance : Max Float := maxOfLe
Maximum Operation on Floating-Point Numbers
Informal description
The 64-bit floating-point numbers (IEEE 754 binary64) are equipped with a maximum operation $\max$, which computes the larger of two floating-point numbers according to the IEEE 754 standard, including handling of special cases like `NaN`, infinities, and subnormal numbers.
Float.scaleB opaque
(x : Float) (i : @& Int) : Float
Full source
/--
Efficiently computes `x * 2^i`.

This function does not reduce in the kernel.
-/
@[extern "lean_float_scaleb"]
opaque Float.scaleB (x : Float) (i : @& Int) : Float
Floating-point scaling by powers of two: $x \cdot 2^i$
Informal description
Given a floating-point number $x$ and an integer $i$, the function `Float.scaleB` efficiently computes $x \cdot 2^i$ without kernel reduction.