Init.Data.OfScientific

OfScientific structure

(α : Type u)

Full source

/-- For decimal and scientific numbers (e.g., `1.23`, `3.12e10`).
   Examples:
   - `1.23` is syntax for `OfScientific.ofScientific (nat_lit 123) true (nat_lit 2)`
   - `121e100` is syntax for `OfScientific.ofScientific (nat_lit 121) false (nat_lit 100)`

   Note the use of `nat_lit`; there is no wrapping `OfNat.ofNat` in the resulting term.
-/
class OfScientific (α : Type u) where
  /--
  Produces a value from the given mantissa, exponent sign, and decimal exponent. For the exponent
  sign, `true` indicates a negative exponent.

   Examples:
   - `1.23` is syntax for `OfScientific.ofScientific (nat_lit 123) true (nat_lit 2)`
   - `121e100` is syntax for `OfScientific.ofScientific (nat_lit 121) false (nat_lit 100)`

   Note the use of `nat_lit`; there is no wrapping `OfNat.ofNat` in the resulting term.
  -/
  ofScientific (mantissa : Nat) (exponentSign : Bool) (decimalExponent : Nat) : α

Scientific notation interpretation structure

Informal description

The structure `OfScientific` is used to represent decimal and scientific numbers (e.g., `1.23`, `3.12e10`). For a type `α`, it provides a way to interpret numbers written in scientific notation as elements of `α`. For example: - `1.23` is interpreted as `OfScientific.ofScientific 123 true 2` (where `true` indicates a negative exponent) - `121e100` is interpreted as `OfScientific.ofScientific 121 false 100` (where `false` indicates a positive exponent) The structure does not wrap the number in an `OfNat` constructor, instead using `nat_lit` directly.

Float.ofBinaryScientific definition

(m : Nat) (e : Int) : Float

Full source

/-- Computes `m * 2^e`. -/
def Float.ofBinaryScientific (m : Nat) (e : Int) : Float :=
  let s := m.log2 - 63
  let m := (m >>> s).toUInt64
  let e := e + s
  m.toFloat.scaleB e

Floating-point construction from binary scientific notation: $m \cdot 2^e$

Informal description

The function computes the floating-point number $m \cdot 2^e$ by: 1. Calculating the shift $s = \lfloor \log_2 m \rfloor - 63$ to normalize the mantissa, 2. Right-shifting $m$ by $s$ bits and converting it to a 64-bit unsigned integer, 3. Adjusting the exponent to $e + s$, 4. Converting the normalized mantissa to a floating-point number and scaling it by the adjusted exponent using `Float.scaleB`.

Float.ofScientific opaque

(m : Nat) (s : Bool) (e : Nat) : Float

Full source

/--
Constructs a `Float` from the given mantissa, sign, and exponent values.

This function is part of the implementation of the `OfScientific Float` instance that is used to
interpret floating-point literals.
-/
protected opaque Float.ofScientific (m : Nat) (s : Bool) (e : Nat) : Float :=
  if s then
    let s := 64 - m.log2 -- ensure we have 64 bits of mantissa left after division
    let m := (m <<< (3 * e + s)) / 5^e
    Float.ofBinaryScientific m (-4 * e - s)
  else
    Float.ofBinaryScientific (m * 5^e) e

Floating-point construction from scientific notation: $\text{Float.ofScientific}(m, s, e)$

Informal description

Given a natural number mantissa $m$, a Boolean sign $s$ (where `true` represents a negative exponent), and a natural number exponent $e$, the function constructs a floating-point number according to the IEEE 754 binary64 standard. Specifically: - If $s$ is `true`, the value is interpreted as $m \times 10^{-e}$ - If $s$ is `false`, the value is interpreted as $m \times 10^{e}$

instOfScientificFloat instance

: OfScientific Float

Full source

/--
  The `OfScientific Float` must have priority higher than `mid` since
  the default instance `Neg Int` has `mid` priority.
  ```
  #check -42.0 -- must be Float
  ```
-/
@[default_instance mid+1]
instance : OfScientific Float where
  ofScientific := Float.ofScientific

Scientific Notation Interpretation for Floating-Point Numbers

Informal description

The type `Float` of 64-bit floating-point numbers (IEEE 754 binary64) has a canonical interpretation of scientific notation numbers (e.g., `1.23`, `3.12e10`), where numbers are parsed as $m \times 10^e$ (with sign handling for the exponent). This instance has higher priority than other numeric type interpretations to ensure correct parsing of negative floating-point literals.

Float.ofNat definition

(n : Nat) : Float

Full source

/--
Converts a natural number into the closest-possible 64-bit floating-point number, or an infinite
floating-point value if the range of `Float` is exceeded.
-/
@[export lean_float_of_nat]
def Float.ofNat (n : Nat) : Float :=
  OfScientific.ofScientific n false 0

Conversion from natural numbers to floating-point numbers

Informal description

The function converts a natural number $n$ into the closest possible 64-bit floating-point number (IEEE 754 binary64). If $n$ exceeds the range of representable values in `Float`, the result is an infinite floating-point value.

Float.ofInt definition

: Int → Float

Full source

/--
Converts an integer into the closest-possible 64-bit floating-point number, or positive or negative
infinite floating-point value if the range of `Float` is exceeded.
-/
def Float.ofInt : Int → Float
  | Int.ofNat n => Float.ofNat n
  | Int.negSucc n => Float.neg (Float.ofNat (Nat.succ n))

Conversion from integers to floating-point numbers

Informal description

The function converts an integer $n$ into the closest possible 64-bit floating-point number (IEEE 754 binary64). If the magnitude of $n$ exceeds the range of representable values in `Float`, the result is either positive or negative infinity, depending on the sign of $n$.

instOfNatFloat instance

: OfNat Float n

Full source

instance : OfNat Float n   := ⟨Float.ofNat n⟩

Canonical Interpretation of Natural Number Literals as Floating-Point Numbers

Informal description

For any natural number literal `n`, there is a canonical interpretation of `n` as a 64-bit floating-point number (IEEE 754 binary64). This conversion yields the closest representable floating-point value to `n`, or an infinite value if `n` exceeds the range of representable floating-point numbers.

Nat.toFloat abbrev

(n : Nat) : Float

Full source

@[inherit_doc Float.ofNat] abbrev Nat.toFloat (n : Nat) : Float :=
  Float.ofNat n

Natural Number to Floating-Point Conversion

Informal description

The function converts a natural number $n$ to a 64-bit floating-point number (IEEE 754 binary64), preserving the exact value when possible and otherwise returning the closest representable floating-point approximation. If $n$ exceeds the range of representable values, the result will be an infinite floating-point value.

Float32.ofBinaryScientific definition

(m : Nat) (e : Int) : Float32

Full source

/-- Computes `m * 2^e`. -/
def Float32.ofBinaryScientific (m : Nat) (e : Int) : Float32 :=
  let s := m.log2 - 63
  let m := (m >>> s).toUInt64
  let e := e + s
  m.toFloat32.scaleB e

32-bit floating-point number from binary scientific notation

Informal description

The function computes the 32-bit floating-point number representing $m \times 2^e$, where $m$ is a natural number and $e$ is an integer. It first adjusts the mantissa $m$ to fit within the precision limits of 32-bit floats by shifting it right by $s = \lfloor \log_2 m \rfloor - 63$ bits, then converts the shifted mantissa to a `UInt64` and finally to a `Float32`, scaling it by $2^{e + s}$ to compensate for the shift.

Float32.ofScientific opaque

(m : Nat) (s : Bool) (e : Nat) : Float32

Full source

/--
Constructs a `Float32` from the given mantissa, sign, and exponent values.

This function is part of the implementation of the `OfScientific Float32` instance that is used to
interpret floating-point literals.
-/
protected opaque Float32.ofScientific (m : Nat) (s : Bool) (e : Nat) : Float32 :=
  if s then
    let s := 64 - m.log2 -- ensure we have 64 bits of mantissa left after division
    let m := (m <<< (3 * e + s)) / 5^e
    Float32.ofBinaryScientific m (-4 * e - s)
  else
    Float32.ofBinaryScientific (m * 5^e) e

Construction of 32-bit Float from Scientific Notation Components

Informal description

Given a natural number mantissa $m$, a Boolean sign $s$ (where `true` represents negative and `false` represents positive), and a natural number exponent $e$, the function constructs a 32-bit floating-point number according to the IEEE 754 binary32 format.

instOfScientificFloat32 instance

: OfScientific Float32

Full source

instance : OfScientific Float32 where
  ofScientific := Float32.ofScientific

Scientific Notation Interpretation for 32-bit Floats

Informal description

The type `Float32` of 32-bit floating-point numbers can be constructed from scientific notation, where a number is represented by a mantissa $m$, a sign $s$ (indicating whether the exponent is negative), and an exponent $e$.

Float32.ofNat definition

(n : Nat) : Float32

Full source

/--
Converts a natural number into the closest-possible 32-bit floating-point number, or an infinite
floating-point value if the range of `Float32` is exceeded.
-/
@[export lean_float32_of_nat]
def Float32.ofNat (n : Nat) : Float32 :=
  OfScientific.ofScientific n false 0

Conversion from natural number to 32-bit float

Informal description

The function converts a natural number $n$ into the closest representable 32-bit floating-point number according to the IEEE 754 binary32 format. If $n$ exceeds the range of `Float32`, the result is an infinite floating-point value.

Float32.ofInt definition

: Int → Float32

Full source

/--
Converts an integer into the closest-possible 32-bit floating-point number, or positive or negative
infinite floating-point value if the range of `Float32` is exceeded.
-/
def Float32.ofInt : Int → Float32
  | Int.ofNat n => Float32.ofNat n
  | Int.negSucc n => Float32.neg (Float32.ofNat (Nat.succ n))

Conversion from integer to 32-bit float

Informal description

The function converts an integer into the closest representable 32-bit floating-point number according to the IEEE 754 binary32 format. If the integer exceeds the range of `Float32`, the result is either positive or negative infinity.

instOfNatFloat32 instance

: OfNat Float32 n

Full source

instance : OfNat Float32 n   := ⟨Float32.ofNat n⟩

Natural Number Literals as 32-bit Floating-Point Numbers

Informal description

For any natural number $n$, there is a canonical interpretation of $n$ as a 32-bit floating-point number in the IEEE 754 binary32 format.

Nat.toFloat32 abbrev

(n : Nat) : Float32

Full source

@[inherit_doc Float32.ofNat] abbrev Nat.toFloat32 (n : Nat) : Float32 :=
  Float32.ofNat n

Conversion from Natural Number to 32-bit Float

Informal description

The function converts a natural number $n$ to the closest representable 32-bit floating-point number according to the IEEE 754 binary32 format. If $n$ exceeds the range of `Float32`, the result is an infinite floating-point value.