Init.Data.Float32

float32Spec opaque

: FloatSpec

Full source

opaque float32Spec : FloatSpec := {
  float := Unit,
  val   := (),
  lt    := fun _ _ => True,
  le    := fun _ _ => True,
  decLt := fun _ _ => inferInstanceAs (Decidable True),
  decLe := fun _ _ => inferInstanceAs (Decidable True)
}

32-bit Floating-Point Specification

Informal description

The specification for the 32-bit floating-point type `Float32`.

Float32 structure

Full source

/--
32-bit floating-point numbers.

`Float32` corresponds to the IEEE 754 *binary32* format (`float` in C or `f32` in Rust).
Floating-point numbers are a finite representation of a subset of the real numbers, extended with
extra “sentinel” values that represent undefined and infinite results as well as separate positive
and negative zeroes. Arithmetic on floating-point numbers approximates the corresponding operations
on the real numbers by rounding the results to numbers that are representable, propagating error and
infinite values.

Floating-point numbers include [subnormal numbers](https://en.wikipedia.org/wiki/Subnormal_number).
Their special values are:
 * `NaN`, which denotes a class of “not a number” values that result from operations such as
   dividing zero by zero, and
 * `Inf` and `-Inf`, which represent positive and infinities that result from dividing non-zero
   values by zero.

-/
structure Float32 where
  val : float32Spec.float

32-bit Floating-Point Numbers (IEEE 754 binary32)

Informal description

The structure representing 32-bit floating-point numbers, corresponding to the IEEE 754 binary32 format (known as `float` in C or `f32` in Rust). Floating-point numbers provide a finite representation of a subset of the real numbers, extended with special values including positive and negative zero, infinities (`Inf` and `-Inf`), and "not a number" (`NaN`) values that arise from undefined operations like division by zero. The arithmetic operations on floating-point numbers approximate real arithmetic by rounding results to representable values, propagating errors and special values appropriately. This format also includes subnormal numbers for improved precision near zero.

instNonemptyFloat32 instance

: Nonempty Float32

Full source

instance : Nonempty Float32 := ⟨{ val := float32Spec.val }⟩

Nonemptiness of 32-bit Floating-Point Numbers

Informal description

The type of 32-bit floating-point numbers is nonempty.

Float32.add opaque

: Float32 → Float32 → Float32

Full source

/--
Adds two 32-bit floating-point numbers according to IEEE 754. Typically used via the `+` operator.

This function does not reduce in the kernel. It is compiled to the C addition operator.
-/
@[extern "lean_float32_add"] opaque Float32.add : Float32 → Float32 → Float32

IEEE 754 binary32 floating-point addition

Informal description

The function $\mathrm{add} : \mathrm{Float32} \times \mathrm{Float32} \to \mathrm{Float32}$ implements IEEE 754 binary32 floating-point addition. Given two 32-bit floating-point numbers $x$ and $y$, it returns their sum $x + y$, computed according to IEEE 754 rules including proper handling of special values (zeros, infinities, NaN) and rounding.

Float32.sub opaque

: Float32 → Float32 → Float32

Full source

/--
Subtracts 32-bit floating-point numbers according to IEEE 754. Typically used via the `-` operator.

This function does not reduce in the kernel. It is compiled to the C subtraction operator.
-/
@[extern "lean_float32_sub"] opaque Float32.sub : Float32 → Float32 → Float32

IEEE 754 Subtraction for 32-bit Floating-Point Numbers

Informal description

The subtraction operation for 32-bit floating-point numbers, denoted as $x - y$, follows the IEEE 754 binary32 standard. This operation is typically accessed via the `-` operator and is compiled to the corresponding C subtraction operator. The operation does not reduce in the kernel and handles special cases such as infinities and NaN values according to IEEE 754 rules.

Float32.mul opaque

: Float32 → Float32 → Float32

Full source

/--
Multiplies 32-bit floating-point numbers according to IEEE 754. Typically used via the `*` operator.

This function does not reduce in the kernel. It is compiled to the C multiplication operator.
-/
@[extern "lean_float32_mul"] opaque Float32.mul : Float32 → Float32 → Float32

IEEE 754 Binary32 Floating-Point Multiplication

Informal description

The multiplication operation for 32-bit floating-point numbers, denoted as $x \cdot y$, follows the IEEE 754 binary32 standard. Given two floating-point numbers $x$ and $y$ of type `Float32`, this operation computes their product, handling special cases such as infinities, zeros, and NaN values according to the standard.

Float32.div opaque

: Float32 → Float32 → Float32

Full source

/--
Divides 32-bit floating-point numbers according to IEEE 754. Typically used via the `/` operator.

In Lean, division by zero typically yields zero. For `Float32`, it instead yields either `Inf`,
`-Inf`, or `NaN`.

This function does not reduce in the kernel. It is compiled to the C division operator.
-/
@[extern "lean_float32_div"] opaque Float32.div : Float32 → Float32 → Float32

IEEE 754 Division for 32-bit Floating-Point Numbers

Informal description

The division operation for 32-bit floating-point numbers, denoted $x / y$, follows the IEEE 754 binary32 standard. Given two floating-point numbers $x$ and $y$ of type `Float32`, this operation computes their quotient, handling special cases such as division by zero (yielding `Inf`, `-Inf`, or `NaN`), infinities, and NaN values according to the standard.

Float32.neg opaque

: Float32 → Float32

Full source

/--
Negates 32-bit floating-point numbers according to IEEE 754. Typically used via the `-` prefix
operator.

This function does not reduce in the kernel. It is compiled to the C negation operator.
-/
@[extern "lean_float32_negate"] opaque Float32.neg : Float32 → Float32

Negation Operation for 32-bit Floating-Point Numbers

Informal description

The unary negation operation on 32-bit floating-point numbers, denoted $-x$, follows the IEEE 754 standard for binary32 format. This operation is typically used via the `-` prefix operator and does not reduce in the kernel, instead being compiled to the C negation operator.

Float32.lt definition

: Float32 → Float32 → Prop

Full source

/--
Strict inequality of floating-point numbers. Typically used via the `<` operator.
-/
def Float32.lt : Float32 → Float32 → Prop := fun a b =>
  match a, b with
  | ⟨a⟩, ⟨b⟩ => float32Spec.lt a b

Strict inequality of 32-bit floating-point numbers

Informal description

The strict inequality relation `<` between two 32-bit floating-point numbers $a$ and $b$, defined according to the IEEE 754 binary32 specification. This relation holds when $a$ is strictly less than $b$ in the floating-point ordering, handling special cases like infinities and NaN values as specified by the standard.

Float32.le definition

: Float32 → Float32 → Prop

Full source

/--
Non-strict inequality of floating-point numbers. Typically used via the `≤` operator.
-/
def Float32.le : Float32 → Float32 → Prop := fun a b =>
  float32Spec.le a.val b.val

Non-strict inequality of 32-bit floating-point numbers

Informal description

The non-strict inequality relation $\leq$ between two 32-bit floating-point numbers $a$ and $b$, defined according to the IEEE 754 binary32 specification. This relation holds when $a$ is less than or equal to $b$ in the floating-point ordering, handling special cases like infinities and NaN values as specified by the standard.

Float32.ofBits opaque

: UInt32 → Float32

Full source

/--
Bit-for-bit conversion from `UInt32`. Interprets a `UInt32` as a `Float32`, ignoring the numeric
value and treating the `UInt32`'s bit pattern as a `Float32`.

`Float32`s and `UInt32`s have the same endianness on all supported platforms. IEEE 754 very
precisely specifies the bit layout of floats.

This function does not reduce in the kernel.
-/
@[extern "lean_float32_of_bits"] opaque Float32.ofBits : UInt32 → Float32

Bitwise Conversion from Unsigned 32-bit Integer to 32-bit Floating-Point Number

Informal description

The function $\text{Float32.ofBits}$ converts a 32-bit unsigned integer to a 32-bit floating-point number by interpreting the bit pattern of the unsigned integer as an IEEE 754 binary32 floating-point number. This conversion preserves the exact bit representation without any numerical interpretation or rounding.

Float32.toBits opaque

: Float32 → UInt32

Full source

/--
Bit-for-bit conversion to `UInt32`. Interprets a `Float32` as a `UInt32`, ignoring the numeric value
and treating the `Float32`'s bit pattern as a `UInt32`.

`Float32`s and `UInt32`s have the same endianness on all supported platforms. IEEE 754 very
precisely specifies the bit layout of floats.

This function is distinct from `Float.toUInt32`, which attempts to preserve the numeric value rather
than reinterpreting the bit pattern.

This function does not reduce in the kernel.
-/
@[extern "lean_float32_to_bits"] opaque Float32.toBits : Float32 → UInt32

Bitwise Conversion from Float32 to UInt32

Informal description

The function $\mathrm{toBits}$ converts a 32-bit floating-point number (IEEE 754 binary32) to an unsigned 32-bit integer by reinterpreting the bit pattern of the floating-point number without changing its numeric value. This operation preserves the exact bit representation, including special values like infinities and NaN, and is distinct from numeric conversion functions that attempt to preserve the numeric value.

instAddFloat32 instance

: Add Float32

Full source

instance : Add Float32 := ⟨Float32.add⟩

Addition Operation on 32-bit Floating-Point Numbers

Informal description

The 32-bit floating-point numbers $\mathrm{Float32}$ are equipped with an addition operation $+ : \mathrm{Float32} \to \mathrm{Float32} \to \mathrm{Float32}$ following the IEEE 754 binary32 standard.

instSubFloat32 instance

: Sub Float32

Full source

instance : Sub Float32 := ⟨Float32.sub⟩

Subtraction Operation on 32-bit Floating-Point Numbers

Informal description

The 32-bit floating-point numbers $\text{Float32}$ are equipped with a subtraction operation $- : \text{Float32} \to \text{Float32} \to \text{Float32}$ following the IEEE 754 binary32 standard.

instMulFloat32 instance

: Mul Float32

Full source

instance : Mul Float32 := ⟨Float32.mul⟩

Multiplication Operation on 32-bit Floating-Point Numbers

Informal description

The 32-bit floating-point numbers $\text{Float32}$ are equipped with a multiplication operation $* : \text{Float32} \to \text{Float32} \to \text{Float32}$ following the IEEE 754 binary32 standard.

instDivFloat32 instance

: Div Float32

Full source

instance : Div Float32 := ⟨Float32.div⟩

Division Operation on 32-bit Floating-Point Numbers

Informal description

The 32-bit floating-point numbers $\text{Float32}$ are equipped with a division operation $/ : \text{Float32} \to \text{Float32} \to \text{Float32}$.

instNegFloat32 instance

: Neg Float32

Full source

instance : Neg Float32 := ⟨Float32.neg⟩

Negation Operation on 32-bit Floating-Point Numbers

Informal description

The 32-bit floating-point numbers $\text{Float32}$ are equipped with a negation operation $- : \text{Float32} \to \text{Float32}$ following the IEEE 754 binary32 standard.

instLTFloat32 instance

: LT Float32

Full source

instance : LT Float32  := ⟨Float32.lt⟩

Strict Inequality on 32-bit Floating-Point Numbers

Informal description

The 32-bit floating-point numbers $\text{Float32}$ are equipped with a strict inequality relation $<$, defined according to the IEEE 754 binary32 standard.

instLEFloat32 instance

: LE Float32

Full source

instance : LE Float32  := ⟨Float32.le⟩

Non-strict Inequality on 32-bit Floating-Point Numbers

Informal description

The 32-bit floating-point numbers $\text{Float32}$ are equipped with a non-strict inequality relation $\leq$, defined according to the IEEE 754 binary32 standard.

Float32.beq opaque

(a b : Float32) : Bool

Full source

/--
Checks whether two floating-point numbers are equal according to IEEE 754.

Floating-point equality does not correspond with propositional equality. In particular, it is not
reflexive since `NaN != NaN`, and it is not a congruence because `0.0 == -0.0`, but
`1.0 / 0.0 != 1.0 / -0.0`.

This function does not reduce in the kernel. It is compiled to the C equality operator.
-/
@[extern "lean_float32_beq"] opaque Float32.beq (a b : Float32) : Bool

IEEE 754 Equality for 32-bit Floating-Point Numbers

Informal description

The function `Float32.beq` checks whether two 32-bit floating-point numbers `a` and `b` are equal according to the IEEE 754 standard. This equality is not reflexive (since `NaN != NaN`) and not a congruence (since `0.0 == -0.0` but `1.0 / 0.0 != 1.0 / -0.0`). The function returns a Boolean value indicating the result of this comparison.

instBEqFloat32 instance

: BEq Float32

Full source

instance : BEq Float32 := ⟨Float32.beq⟩

IEEE 754 Boolean Equality for 32-bit Floating-Point Numbers

Informal description

The type `Float32` of 32-bit floating-point numbers is equipped with a boolean equality relation `==` that follows the IEEE 754 standard. This relation is not reflexive (since `NaN != NaN`) and not a congruence (since `0.0 == -0.0` but `1.0 / 0.0 != 1.0 / -0.0`).

Float32.decLt opaque

(a b : Float32) : Decidable (a < b)

Full source

/--
Compares two floating point numbers for strict inequality.

This function does not reduce in the kernel. It is compiled to the C inequality operator.
-/
@[extern "lean_float32_decLt"] opaque Float32.decLt (a b : Float32) : Decidable (a < b) :=
  match a, b with
  | ⟨a⟩, ⟨b⟩ => float32Spec.decLt a b

Decidability of Strict Inequality for 32-bit Floating-Point Numbers

Informal description

For any two 32-bit floating-point numbers `a` and `b`, the function `Float32.decLt` provides a constructive proof (via the `Decidable` typeclass) that either `a < b` holds or its negation holds, according to the IEEE 754 binary32 standard for strict inequality.

Float32.decLe opaque

(a b : Float32) : Decidable (a ≤ b)

Full source

/--
Compares two floating point numbers for non-strict inequality.

This function does not reduce in the kernel. It is compiled to the C inequality operator.
-/
@[extern "lean_float32_decLe"] opaque Float32.decLe (a b : Float32) : Decidable (a ≤ b) :=
  match a, b with
  | ⟨a⟩, ⟨b⟩ => float32Spec.decLe a b

Decidability of Non-Strict Inequality for 32-bit Floating-Point Numbers

Informal description

For any two 32-bit floating-point numbers $a$ and $b$, the proposition $a \leq b$ is decidable according to the IEEE 754 binary32 standard for non-strict inequality.

float32DecLt instance

(a b : Float32) : Decidable (a < b)

Full source

instance float32DecLt (a b : Float32) : Decidable (a < b) := Float32.decLt a b

Decidability of Strict Inequality for 32-bit Floating-Point Numbers

Informal description

For any two 32-bit floating-point numbers $a$ and $b$, the strict inequality $a < b$ is decidable according to the IEEE 754 binary32 standard.

float32DecLe instance

(a b : Float32) : Decidable (a ≤ b)

Full source

instance float32DecLe (a b : Float32) : Decidable (a ≤ b) := Float32.decLe a b

Decidability of Non-Strict Inequality for 32-bit Floating-Point Numbers

Informal description

For any two 32-bit floating-point numbers $a$ and $b$, the proposition $a \leq b$ is decidable according to the IEEE 754 binary32 standard for non-strict inequality.

Float32.toString opaque

: Float32 → String

Full source

/--
Converts a floating-point number to a string.

This function does not reduce in the kernel.
-/
@[extern "lean_float32_to_string"] opaque Float32.toString : Float32 → String

String Conversion for 32-bit Floating-Point Numbers

Informal description

The function converts a 32-bit floating-point number to its string representation.

Float32.toUInt8 opaque

: Float32 → UInt8

Full source

/--
Converts a floating-point number to an 8-bit unsigned integer.

If the given `Float32` is non-negative, truncates the value to a positive integer, rounding down and
clamping to the range of `UInt8`. Returns `0` if the `Float32` is negative or `NaN`, and returns the
largest `UInt8` value (i.e. `UInt8.size - 1`) if the float is larger than it.

This function does not reduce in the kernel.
-/
@[extern "lean_float32_to_uint8"] opaque Float32.toUInt8 : Float32 → UInt8

Conversion from Float32 to UInt8 with Clamping

Informal description

The function converts a 32-bit floating-point number to an 8-bit unsigned integer by truncating and clamping the value. Specifically: - If the input is non-negative, it rounds down to the nearest integer and clamps the result to the range $[0, 255]$. - If the input is negative or `NaN`, it returns $0$. - If the input exceeds the maximum representable value of `UInt8`, it returns $255$.

Float32.toUInt16 opaque

: Float32 → UInt16

Full source

/--
Converts a floating-point number to a 16-bit unsigned integer.

If the given `Float32` is non-negative, truncates the value to a positive integer, rounding down and
clamping to the range of `UInt16`. Returns `0` if the `Float32` is negative or `NaN`, and returns
the largest `UInt16` value (i.e. `UInt16.size - 1`) if the float is larger than it.

This function does not reduce in the kernel.
-/
@[extern "lean_float32_to_uint16"] opaque Float32.toUInt16 : Float32 → UInt16

Conversion from `Float32` to `UInt16` with Clamping and Truncation

Informal description

The function converts a 32-bit floating-point number to a 16-bit unsigned integer. For a non-negative input, it truncates the value by rounding down and clamps it to the range of `UInt16` (i.e., between 0 and 65535). If the input is negative, `NaN`, or `-Inf`, it returns 0. If the input exceeds the maximum representable `UInt16` value, it returns 65535.

Float32.toUInt32 opaque

: Float32 → UInt32

Full source

/--
Converts a floating-point number to a 32-bit unsigned integer.

If the given `Float32` is non-negative, truncates the value to a positive integer, rounding down and
clamping to the range of `UInt32`. Returns `0` if the `Float32` is negative or `NaN`, and returns
the largest `UInt32` value (i.e. `UInt32.size - 1`) if the float is larger than it.

This function does not reduce in the kernel.
-/
@[extern "lean_float32_to_uint32"] opaque Float32.toUInt32 : Float32 → UInt32

Conversion from `Float32` to `UInt32` with truncation and clamping

Informal description

The function converts a 32-bit floating-point number to a 32-bit unsigned integer. For a non-negative input, it truncates the value towards zero and clamps the result to the range of `UInt32`. Specifically: - If the input is negative or `NaN`, the output is `0`. - If the input exceeds the maximum value of `UInt32`, the output is `UInt32.size - 1`. - Otherwise, the output is the truncated integer value of the input.

Float32.toUInt64 opaque

: Float32 → UInt64

Full source

/--
Converts a floating-point number to a 64-bit unsigned integer.

If the given `Float32` is non-negative, truncates the value to a positive integer, rounding down and
clamping to the range of `UInt64`. Returns `0` if the `Float32` is negative or `NaN`, and returns
the largest `UInt64` value (i.e. `UInt64.size - 1`) if the float is larger than it.

This function does not reduce in the kernel.
-/
@[extern "lean_float32_to_uint64"] opaque Float32.toUInt64 : Float32 → UInt64

Truncation and Clamping of `Float32` to `UInt64`

Informal description

Given a 32-bit floating-point number $x$, the function converts $x$ to a 64-bit unsigned integer as follows: - If $x$ is non-negative, it truncates $x$ to the nearest integer less than or equal to $x$, clamping the result to the range $[0, 2^{64} - 1]$. - If $x$ is negative or `NaN`, it returns $0$. - If $x$ exceeds the maximum representable value of `UInt64`, it returns $2^{64} - 1$.

Float32.toUSize opaque

: Float32 → USize

Full source

/--
Converts a floating-point number to a word-sized unsigned integer.

If the given `Float32` is non-negative, truncates the value to a positive integer, rounding down and
clamping to the range of `USize`. Returns `0` if the `Float32` is negative or `NaN`, and returns the
largest `USize` value (i.e. `USize.size - 1`) if the float is larger than it.

This function does not reduce in the kernel.
-/
@[extern "lean_float32_to_usize"] opaque Float32.toUSize : Float32 → USize

Truncation and Clamping of `Float32` to Platform-Dependent Unsigned Word-Size Integer

Informal description

The function converts a 32-bit floating-point number $x$ to a platform-dependent unsigned word-size integer as follows: - If $x$ is non-negative, it truncates $x$ to the nearest integer less than or equal to $x$, clamping the result to the range $[0, n-1]$ where $n$ is the maximum value of `USize` (i.e., `USize.size - 1`). - If $x$ is negative or `NaN`, it returns $0$. - If $x$ exceeds the maximum representable value of `USize`, it returns $n-1$.

Float32.isNaN opaque

: Float32 → Bool

Full source

/--
Checks whether a floating point number is `NaN` ("not a number") value.

`NaN` values result from operations that might otherwise be errors, such as dividing zero by zero.

This function does not reduce in the kernel. It is compiled to the C operator `isnan`.
-/
@[extern "lean_float32_isnan"] opaque Float32.isNaN : Float32 → Bool

Check for NaN in 32-bit floating-point numbers

Informal description

Given a 32-bit floating-point number $x$, the function $\mathrm{isNaN}(x)$ returns `true` if $x$ is a "not a number" (`NaN`) value, and `false` otherwise. `NaN` values arise from undefined operations such as division of zero by zero.

Float32.isFinite opaque

: Float32 → Bool

Full source

/--
Checks whether a floating-point number is finite, that is, whether it is normal, subnormal, or zero,
but not infinite or `NaN`.

This function does not reduce in the kernel. It is compiled to the C operator `isfinite`.
-/
@[extern "lean_float32_isfinite"] opaque Float32.isFinite : Float32 → Bool

Finite Check for 32-bit Floating-Point Numbers

Informal description

The function `isFinite` checks whether a given 32-bit floating-point number is finite, i.e., it is either normal, subnormal, or zero, but not infinite or `NaN`.

Float32.isInf opaque

: Float32 → Bool

Full source

/--
Checks whether a floating-point number is a positive or negative infinite number, but not a finite
number or `NaN`.

This function does not reduce in the kernel. It is compiled to the C operator `isinf`.
-/
@[extern "lean_float32_isinf"] opaque Float32.isInf : Float32 → Bool

Infinity Check for 32-bit Floating-Point Numbers

Informal description

The function `isInf` checks whether a given 32-bit floating-point number is either positive or negative infinity, returning `true` if it is and `false` otherwise. It does not classify finite numbers or `NaN` as infinite.

Float32.frExp opaque

: Float32 → Float32 × Int

Full source

/--
Splits the given float `x` into a significand/exponent pair `(s, i)` such that `x = s * 2^i` where
`s ∈ (-1;-0.5] ∪ [0.5; 1)`. Returns an undefined value if `x` is not finite.

This function does not reduce in the kernel. It is implemented in compiled code by the C function
`frexp`.
-/
@[extern "lean_float32_frexp"] opaque Float32.frExp : Float32 → Float32Float32 × Int

Floating-point significand/exponent decomposition ($x = s \cdot 2^i$)

Informal description

For any finite 32-bit floating-point number $x$, the function returns a pair $(s, i)$ where $s$ is the significand in the range $(-1, -0.5] \cup [0.5, 1)$ and $i$ is the exponent, such that $x = s \cdot 2^i$. The result is undefined if $x$ is not finite.

instToStringFloat32 instance

: ToString Float32

Full source

instance : ToString Float32 where
  toString := Float32.toString

String Representation of 32-bit Floating-Point Numbers

Informal description

The 32-bit floating-point numbers have a canonical string representation.

UInt8.toFloat32 opaque

(n : UInt8) : Float32

Full source

/-- Obtains the `Float32` whose value is the same as the given `UInt8`. -/
@[extern "lean_uint8_to_float32"] opaque UInt8.toFloat32 (n : UInt8) : Float32

Conversion from Unsigned 8-bit Integer to 32-bit Floating-Point Number

Informal description

For any unsigned 8-bit integer $n$, the function `UInt8.toFloat32` returns a 32-bit floating-point number with the same numerical value as $n$.

UInt16.toFloat32 opaque

(n : UInt16) : Float32

Full source

/-- Obtains the `Float32` whose value is the same as the given `UInt16`. -/
@[extern "lean_uint16_to_float32"] opaque UInt16.toFloat32 (n : UInt16) : Float32

Conversion from Unsigned 16-bit Integer to 32-bit Float

Informal description

Given an unsigned 16-bit integer $n$, the function returns a 32-bit floating-point number with the same value as $n$.

UInt32.toFloat32 opaque

(n : UInt32) : Float32

Full source

/--
Obtains a `Float32` whose value is near the given `UInt32`.

It will be exactly the value of the given `UInt32` if such a `Float32` exists. If no such `Float32`
exists, the returned value will either be the smallest `Float32` that is larger than the given
value, or the largest `Float32` that is smaller than the given value.

This function is opaque in the kernel, but is overridden at runtime with an efficient
implementation.
-/
@[extern "lean_uint32_to_float32"] opaque UInt32.toFloat32 (n : UInt32) : Float32

Approximate Conversion from Unsigned 32-bit Integer to 32-bit Float

Informal description

Given an unsigned 32-bit integer $n$, this function returns a 32-bit floating-point number that approximates $n$. If $n$ can be exactly represented as a `Float32`, the result is exact. Otherwise, the result is either the smallest `Float32` larger than $n$ or the largest `Float32` smaller than $n$.

UInt64.toFloat32 opaque

(n : UInt64) : Float32

Full source

/--
Obtains a `Float32` whose value is near the given `UInt64`.

It will be exactly the value of the given `UInt64` if such a `Float32` exists. If no such `Float32`
exists, the returned value will either be the smallest `Float32` that is larger than the given
value, or the largest `Float32` that is smaller than the given value.

This function is opaque in the kernel, but is overridden at runtime with an efficient
implementation.
-/
@[extern "lean_uint64_to_float32"] opaque UInt64.toFloat32 (n : UInt64) : Float32

Approximate Conversion from Unsigned 64-bit Integer to 32-bit Float

Informal description

Given an unsigned 64-bit integer $n$, the function returns a 32-bit floating-point number that approximates $n$. If $n$ is exactly representable as a `Float32`, the result is exact. Otherwise, the result is either the smallest `Float32` larger than $n$ or the largest `Float32` smaller than $n$.

USize.toFloat32 opaque

(n : USize) : Float32

Full source

/-- Obtains a `Float32` whose value is near the given `USize`.

It will be exactly the value of the given `USize` if such a `Float32` exists. If no such `Float32`
exists, the returned value will either be the smallest `Float32` that is larger than the given
value, or the largest `Float32` that is smaller than the given value.

This function is opaque in the kernel, but is overridden at runtime with an efficient
implementation.
-/
@[extern "lean_usize_to_float32"] opaque USize.toFloat32 (n : USize) : Float32

Approximate Conversion from Word-Size Unsigned Integer to 32-bit Float

Informal description

Given an unsigned word-size integer $n$ (of type `USize`), the function returns a 32-bit floating-point number (of type `Float32`) that approximates $n$. If $n$ can be exactly represented as a `Float32`, the result is exact. Otherwise, the result is either the smallest `Float32` larger than $n$ or the largest `Float32` smaller than $n$.

instInhabitedFloat32 instance

: Inhabited Float32

Full source

instance : Inhabited Float32 where
  default := UInt64.toFloat32 0

Inhabited Instance for 32-bit Floating-Point Numbers

Informal description

The type of 32-bit floating-point numbers is inhabited, with a default value (typically 0.0 or NaN).

instReprFloat32 instance

: Repr Float32

Full source

instance : Repr Float32 where
  reprPrec n prec := if n < UInt64.toFloat32 0 then Repr.addAppParen (toString n) prec else toString n

Canonical Representation of 32-bit Floating-Point Numbers

Informal description

The 32-bit floating-point numbers $\text{Float32}$ have a canonical representation as terms.

instReprAtomFloat32 instance

: ReprAtom Float32

Full source

instance : ReprAtom Float32  := ⟨⟩

Atomic Representation of 32-bit Floating-Point Numbers

Informal description

The 32-bit floating-point numbers (`Float32`) have a canonical representation as atomic terms.

Float32.sin opaque

: Float32 → Float32

Full source

/--
Computes the sine of a floating-point number in radians.

This function does not reduce in the kernel. It is implemented in compiled code by the C function
`sinf`.
-/
@[extern "sinf"] opaque Float32.sin : Float32 → Float32

Sine Function for 32-bit Floating-Point Numbers

Informal description

The sine function for 32-bit floating-point numbers, which computes $\sin(x)$ where $x$ is given in radians. This function is implemented via the C function `sinf` and does not reduce in the Lean kernel.

Float32.cos opaque

: Float32 → Float32

Full source

/--
Computes the cosine of a floating-point number in radians.

This function does not reduce in the kernel. It is implemented in compiled code by the C function
`cosf`.
-/
@[extern "cosf"] opaque Float32.cos : Float32 → Float32

Cosine Function for 32-bit Floating-Point Numbers

Informal description

The function $\cos \colon \text{Float32} \to \text{Float32}$ computes the cosine of a 32-bit floating-point number (in radians). This function is implemented via the C function `cosf` and does not reduce in the kernel.

Float32.tan opaque

: Float32 → Float32

Full source

/--
Computes the tangent of a floating-point number in radians.

This function does not reduce in the kernel. It is implemented in compiled code by the C function
`tanf`.
-/
@[extern "tanf"] opaque Float32.tan : Float32 → Float32

Tangent function for 32-bit floating-point numbers

Informal description

The function $\tan$ computes the tangent of a 32-bit floating-point number given in radians.

Float32.asin opaque

: Float32 → Float32

Full source

/--
Computes the arc sine (inverse sine) of a floating-point number in radians.

This function does not reduce in the kernel. It is implemented in compiled code by the C function
`asinf`.
-/
@[extern "asinf"] opaque Float32.asin : Float32 → Float32

Arc Sine Function for 32-bit Floating-Point Numbers

Informal description

The function $\arcsin$ computes the inverse sine (arc sine) of a 32-bit floating-point number, returning the result in radians. This function is implemented via the C function `asinf` and does not reduce in the Lean kernel.

Float32.acos opaque

: Float32 → Float32

Full source

/--
Computes the arc cosine (inverse cosine) of a floating-point number in radians.

This function does not reduce in the kernel. It is implemented in compiled code by the C function
`acosf`.
-/
@[extern "acosf"] opaque Float32.acos : Float32 → Float32

Arc Cosine Function for 32-bit Floating-Point Numbers

Informal description

The function $\text{acos} : \text{Float32} \to \text{Float32}$ computes the arc cosine (inverse cosine) of a 32-bit floating-point number in radians. This function is implemented via the C function `acosf` and does not reduce in the kernel.

Float32.atan opaque

: Float32 → Float32

Full source

/--
Computes the arc tangent (inverse tangent) of a floating-point number in radians.

This function does not reduce in the kernel. It is implemented in compiled code by the C function
`atanf`.
-/
@[extern "atanf"] opaque Float32.atan : Float32 → Float32

Arc Tangent Function for 32-bit Floating-Point Numbers

Informal description

The function $\text{atan} : \text{Float32} \to \text{Float32}$ computes the principal value of the arc tangent (inverse tangent) of a 32-bit floating-point number, returning the result in radians.

Float32.atan2 opaque

: Float32 → Float32 → Float32

Full source

/--
Computes the arc tangent (inverse tangent) of `y / x` in radians, in the range `-π`–`π`. The signs
of the arguments determine the quadrant of the result.

This function does not reduce in the kernel. It is implemented in compiled code by the C function
`atan2f`.
-/
@[extern "atan2f"] opaque Float32.atan2 : Float32 → Float32 → Float32

Two-Argument Arc Tangent Function for 32-bit Floats

Informal description

The function $\operatorname{atan2}(y, x)$ computes the arc tangent (inverse tangent) of $y/x$ in radians, returning a value in the range $(-\pi, \pi]$. The signs of $x$ and $y$ determine the quadrant of the result. This function is implemented via the C function `atan2f` and does not reduce in the Lean kernel.

Float32.sinh opaque

: Float32 → Float32

Full source

/--
Computes the hyperbolic sine of a floating-point number.

This function does not reduce in the kernel. It is implemented in compiled code by the C function
`sinhf`.
-/
@[extern "sinhf"] opaque Float32.sinh : Float32 → Float32

Hyperbolic Sine Function for 32-bit Floating-Point Numbers

Informal description

The function $\sinh(x)$ computes the hyperbolic sine of a 32-bit floating-point number $x$.

Float32.cosh opaque

: Float32 → Float32

Full source

/--
Computes the hyperbolic cosine of a floating-point number.

This function does not reduce in the kernel. It is implemented in compiled code by the C function
`coshf`.
-/
@[extern "coshf"] opaque Float32.cosh : Float32 → Float32

Hyperbolic Cosine Function for 32-bit Floating-Point Numbers

Informal description

The function $\cosh(x)$ computes the hyperbolic cosine of a 32-bit floating-point number $x$.

Float32.tanh opaque

: Float32 → Float32

Full source

/--
Computes the hyperbolic tangent of a floating-point number.

This function does not reduce in the kernel. It is implemented in compiled code by the C function
`tanhf`.
-/
@[extern "tanhf"] opaque Float32.tanh : Float32 → Float32

Hyperbolic Tangent Function for 32-bit Floating-Point Numbers

Informal description

The hyperbolic tangent function $\tanh(x)$ for a 32-bit floating-point number $x$ is computed as $\tanh(x) = \frac{\sinh(x)}{\cosh(x)}$.

Float32.asinh opaque

: Float32 → Float32

Full source

/--
Computes the hyperbolic arc sine (inverse sine) of a floating-point number.

This function does not reduce in the kernel. It is implemented in compiled code by the C function
`asinhf`.
-/
@[extern "asinhf"] opaque Float32.asinh : Float32 → Float32

Inverse Hyperbolic Sine for 32-bit Floating-Point Numbers

Informal description

The function $\text{asinh} : \text{Float32} \to \text{Float32}$ computes the inverse hyperbolic sine (arcsinh) of a 32-bit floating-point number. This function is implemented via the C function `asinhf` and does not reduce in the Lean kernel.

Float32.acosh opaque

: Float32 → Float32

Full source

/--
Computes the hyperbolic arc cosine (inverse cosine) of a floating-point number.

This function does not reduce in the kernel. It is implemented in compiled code by the C function
`acoshf`.
-/
@[extern "acoshf"] opaque Float32.acosh : Float32 → Float32

Inverse Hyperbolic Cosine for 32-bit Floating-Point Numbers

Informal description

The function $\text{acosh} : \mathbb{F}_{32} \to \mathbb{F}_{32}$ computes the inverse hyperbolic cosine (area hyperbolic cosine) of a 32-bit floating-point number, where $\mathbb{F}_{32}$ denotes the set of IEEE 754 binary32 floating-point numbers.

Float32.atanh opaque

: Float32 → Float32

Full source

/--
Computes the hyperbolic arc tangent (inverse tangent) of a floating-point number.

This function does not reduce in the kernel. It is implemented in compiled code by the C function
`atanhf`.
-/
@[extern "atanhf"] opaque Float32.atanh : Float32 → Float32

Inverse Hyperbolic Tangent for 32-bit Floating-Point Numbers

Informal description

The function $\mathrm{atanh}$ computes the inverse hyperbolic tangent of a 32-bit floating-point number. This function is implemented in compiled code via the C function `atanhf` and does not reduce in the kernel.

Float32.exp opaque

: Float32 → Float32

Full source

/--
Computes the exponential `e^x` of a floating-point number.

This function does not reduce in the kernel. It is implemented in compiled code by the C function
`expf`.
-/
@[extern "expf"] opaque Float32.exp : Float32 → Float32

Exponential Function for 32-bit Floating-Point Numbers

Informal description

The function $\exp \colon \mathrm{Float32} \to \mathrm{Float32}$ computes the exponential $e^x$ of a 32-bit floating-point number $x$, where $e$ is the base of the natural logarithm. This function is implemented via the C function `expf` and does not reduce in the Lean kernel.

Float32.exp2 opaque

: Float32 → Float32

Full source

/--
Computes the base-2 exponential `2^x` of a floating-point number.

This function does not reduce in the kernel. It is implemented in compiled code by the C function
`exp2f`.
-/
@[extern "exp2f"] opaque Float32.exp2 : Float32 → Float32

Base-2 Exponential Function for 32-bit Floating-Point Numbers

Informal description

The function $\exp_2$ computes the base-2 exponential $2^x$ for a given 32-bit floating-point number $x$. This function is implemented in compiled code via the C function `exp2f` and does not reduce in the kernel.

Float32.log opaque

: Float32 → Float32

Full source

/--
Computes the natural logarithm `ln x` of a floating-point number.

This function does not reduce in the kernel. It is implemented in compiled code by the C function
`logf`.
-/
@[extern "logf"] opaque Float32.log : Float32 → Float32

Natural Logarithm for 32-bit Floating-Point Numbers

Informal description

The natural logarithm function $\ln \colon \mathrm{Float32} \to \mathrm{Float32}$ computes $\ln x$ for a 32-bit floating-point number $x$. This function is implemented via the C function `logf` and does not reduce in the Lean kernel.

Float32.log2 opaque

: Float32 → Float32

Full source

/--
Computes the base-2 logarithm of a floating-point number.

This function does not reduce in the kernel. It is implemented in compiled code by the C function
`log2f`.
-/
@[extern "log2f"] opaque Float32.log2 : Float32 → Float32

Base-2 Logarithm for 32-bit Floating-Point Numbers

Informal description

The function $\log_2$ computes the base-2 logarithm of a 32-bit floating-point number. This function is implemented in compiled code via the C function `log2f` and does not reduce in the kernel.

Float32.log10 opaque

: Float32 → Float32

Full source

/--
Computes the base-10 logarithm of a floating-point number.

This function does not reduce in the kernel. It is implemented in compiled code by the C function
`log10f`.
-/
@[extern "log10f"] opaque Float32.log10 : Float32 → Float32

Base-10 Logarithm for 32-bit Floating-Point Numbers

Informal description

The function $\log_{10}$ computes the base-10 logarithm of a 32-bit floating-point number. This function is implemented via the C function `log10f` and does not reduce in the Lean kernel.

Float32.pow opaque

: Float32 → Float32 → Float32

Full source

/--
Raises one floating-point number to the power of another. Typically used via the `^` operator.

This function does not reduce in the kernel. It is implemented in compiled code by the C function
`powf`.
-/
@[extern "powf"] opaque Float32.pow : Float32 → Float32 → Float32

Floating-point exponentiation for 32-bit numbers

Informal description

The function $\mathrm{pow} : \mathrm{Float32} \to \mathrm{Float32} \to \mathrm{Float32}$ computes the first 32-bit floating-point number raised to the power of the second 32-bit floating-point number. This operation is typically accessed via the `^` operator and is implemented using the C function `powf`.

Float32.sqrt opaque

: Float32 → Float32

Full source

/--
Computes the square root of a floating-point number.

This function does not reduce in the kernel. It is implemented in compiled code by the C function
`sqrtf`.
-/
@[extern "sqrtf"] opaque Float32.sqrt : Float32 → Float32

Square Root Function for 32-bit Floating-Point Numbers

Informal description

The function $\sqrt{\cdot}$ computes the square root of a 32-bit floating-point number. This function is implemented via the C function `sqrtf` and does not reduce in the Lean kernel.

Float32.cbrt opaque

: Float32 → Float32

Full source

/--
Computes the cube root of a floating-point number.

This function does not reduce in the kernel. It is implemented in compiled code by the C function
`cbrtf`.
-/
@[extern "cbrtf"] opaque Float32.cbrt : Float32 → Float32

Cube Root Function for 32-bit Floating-Point Numbers

Informal description

The cube root function for 32-bit floating-point numbers, denoted as $\sqrt[3]{x}$, takes a floating-point number $x$ and returns its cube root. This function is implemented using the C function `cbrtf` and does not reduce in the kernel.

Float32.ceil opaque

: Float32 → Float32

Full source

/--
Computes the ceiling of a floating-point number, which is the smallest integer that's no smaller
than the given number.

This function does not reduce in the kernel. It is implemented in compiled code by the C function
`ceilf`.

Examples:
 * `Float32.ceil 1.5 = 2`
 * `Float32.ceil (-1.5) = (-1)`
-/
@[extern "ceilf"] opaque Float32.ceil : Float32 → Float32

Ceiling Function for 32-bit Floating-Point Numbers

Informal description

The ceiling function $\lceil x \rceil$ for a 32-bit floating-point number $x$ returns the smallest integer that is greater than or equal to $x$. This function is implemented using the C `ceilf` function and does not reduce in the kernel.

Float32.floor opaque

: Float32 → Float32

Full source

/--
Computes the floor of a floating-point number, which is the largest integer that's no larger
than the given number.

This function does not reduce in the kernel. It is implemented in compiled code by the C function
`floorf`.

Examples:
 * `Float32.floor 1.5 = 1`
 * `Float32.floor (-1.5) = (-2)`
-/
@[extern "floorf"] opaque Float32.floor : Float32 → Float32

Floor function for 32-bit floating-point numbers

Informal description

The function $\lfloor \cdot \rfloor : \mathbb{F}_{32} \to \mathbb{F}_{32}$ computes the floor of a 32-bit floating-point number, returning the largest representable floating-point number that is less than or equal to the input value. Here $\mathbb{F}_{32}$ denotes the set of IEEE 754 binary32 floating-point numbers.

Float32.round opaque

: Float32 → Float32

Full source

/--
Rounds to the nearest integer, rounding away from zero at half-way points.

This function does not reduce in the kernel. It is implemented in compiled code by the C function
`roundf`.
-/
@[extern "roundf"] opaque Float32.round : Float32 → Float32

Rounding Function for 32-bit Floating-Point Numbers

Informal description

The function $\mathrm{round} : \mathrm{Float32} \to \mathrm{Float32}$ rounds a 32-bit floating-point number to the nearest integer, with ties (numbers exactly halfway between two integers) rounded away from zero.

Float32.abs opaque

: Float32 → Float32

Full source

/--
Computes the absolute value of a floating-point number.

This function does not reduce in the kernel. It is implemented in compiled code by the C function
`fabsf`.
-/
@[extern "fabsf"] opaque Float32.abs : Float32 → Float32

Absolute Value Function for 32-bit Floating-Point Numbers

Informal description

The function $\text{abs} : \text{Float32} \to \text{Float32}$ computes the absolute value of a 32-bit floating-point number. For a given floating-point number $x$, it returns $|x|$, which is $x$ if $x \geq 0$ and $-x$ otherwise. This function handles special floating-point values (such as infinities and NaN) according to the IEEE 754 standard.

instHomogeneousPowFloat32 instance

: HomogeneousPow Float32

Full source

instance : HomogeneousPow Float32 := ⟨Float32.pow⟩

Homogeneous Power Operation on 32-bit Floating-Point Numbers

Informal description

The 32-bit floating-point numbers $\mathrm{Float32}$ have a homogeneous power operation, where both the base and exponent are of type $\mathrm{Float32}$.

instMinFloat32 instance

: Min Float32

Full source

instance : Min Float32 := minOfLe

Minimum Operation on 32-bit Floating-Point Numbers

Informal description

The 32-bit floating-point numbers $\mathrm{Float32}$ are equipped with a minimum operation, where for any two numbers $a$ and $b$, $\min(a, b)$ is defined to be $a$ if $a \leq b$ and $b$ otherwise, according to the IEEE 754 binary32 standard.

instMaxFloat32 instance

: Max Float32

Full source

instance : Max Float32 := maxOfLe

Maximum Operation on 32-bit Floating-Point Numbers

Informal description

The 32-bit floating-point numbers $\mathrm{Float32}$ are equipped with a maximum operation $\max$, defined according to the IEEE 754 binary32 standard.

Float32.scaleB opaque

(x : Float32) (i : @& Int) : Float32

Full source

/--
Efficiently computes `x * 2^i`.

This function does not reduce in the kernel.
-/
@[extern "lean_float32_scaleb"]
opaque Float32.scaleB (x : Float32) (i : @& Int) : Float32

Efficient Scaling of 32-bit Floating-Point Numbers by Powers of Two

Informal description

For a 32-bit floating-point number $x$ and an integer $i$, the function `Float32.scaleB` efficiently computes $x \cdot 2^i$ without reducing the computation in the kernel.

Float32.toFloat opaque

: Float32 → Float

Full source

/--
Converts a 32-bit floating-point number to a 64-bit floating-point number.

This function does not reduce in the kernel.
-/
@[extern "lean_float32_to_float"] opaque Float32.toFloat : Float32 → Float

Conversion from 32-bit to 64-bit Floating-Point Numbers

Informal description

The function converts a 32-bit floating-point number (IEEE 754 binary32) to a 64-bit floating-point number (IEEE 754 binary64).

Float.toFloat32 opaque

: Float → Float32

Full source

/--
Converts a 64-bit floating-point number to a 32-bit floating-point number.
This may lose precision.

This function does not reduce in the kernel.
-/
@[extern "lean_float_to_float32"] opaque Float.toFloat32 : Float → Float32

Conversion from 64-bit to 32-bit floating-point numbers

Informal description

The function converts a 64-bit floating-point number to a 32-bit floating-point number, potentially losing precision in the process. This conversion does not reduce in the kernel.