Closed Bug 1835034 Opened 1 year ago Closed 28 days ago

Implement JIT Support for Float16Array

Categories

(Core :: JavaScript Engine, enhancement, P3)

enhancement

Tracking

()

RESOLVED FIXED
130 Branch
Tracking Status
firefox130 --- fixed

People

(Reporter: dminor, Assigned: anba)

References

(Blocks 1 open bug)

Details

Attachments

(14 files)

48 bytes, text/x-phabricator-request
Details | Review

From https://bugzilla.mozilla.org/show_bug.cgi?id=1833647#c12, a follow up to implement JIT support for Float16Array.

Severity: -- → S3
Priority: -- → P3

There's some discussion of optimizations here: https://github.com/tc39/proposal-float16array/issues/12

Assignee: nobody → andrebargull
Status: NEW → ASSIGNED
Depends on: 1905609

Add more conversion methods from upstream. Later patches will call the new
methods.

Add if constexpr to ElementSpecific::valueToNative to avoid compiler errors:
both js::float16::operator=(double) and js::float16::operator=(float) are
viable assignment operators when assigning from int64_t, so the assignment is
otherwise ambiguous.

Support vcvtph2ps and vcvtps2ph instructions from the F16C instruction set.

F16C requires AVX to be enabled; see the Intel developer manual, Vol. 1,
§14.4.1, "Detection of F16C Instructions".
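
On targets without F16C the same narrowing has to happen in software. The
following bit-level sketch models what vcvtps2ph computes under
round-to-nearest-even (illustrative Python, not SpiderMonkey code; the NaN
case is collapsed to a canonical quiet NaN, whereas the instruction preserves
payload bits):

```python
import struct

def f32_to_f16_bits(x):
    # Bit-level model of a float32 -> float16 narrowing with
    # round-to-nearest-even, i.e. what vcvtps2ph computes under the
    # default rounding mode.
    (u,) = struct.unpack('<I', struct.pack('<f', x))
    sign = (u >> 16) & 0x8000
    exp = (u >> 23) & 0xFF
    mant = u & 0x7FFFFF
    if exp == 0xFF:                      # infinity or NaN
        return (sign | 0x7C00) if mant == 0 else (sign | 0x7E00)
    e = exp - 127                        # unbiased exponent
    if e > 15:                           # too large for float16
        return sign | 0x7C00
    if e >= -14:                         # normal float16 range
        half = sign | ((e + 15) << 10) | (mant >> 13)
        rem = mant & 0x1FFF              # the 13 dropped bits
        if rem > 0x1000 or (rem == 0x1000 and (half & 1)):
            half += 1                    # a carry into the exponent is correct
        return half
    if e < -25:                          # below half the smallest subnormal
        return sign                      # signed zero
    sig = mant | 0x800000                # make the leading 1 explicit
    shift = -e - 1                       # round sig * 2**(e-23) to units of 2**-24
    half = sign | (sig >> shift)
    rem = sig & ((1 << shift) - 1)
    mid = 1 << (shift - 1)
    if rem > mid or (rem == mid and (half & 1)):
        half += 1
    return half
```

For example, 65520.0 lies exactly halfway between 65504 (the largest finite
float16) and 65536, and ties-to-even sends it to infinity.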

Depends on D215762

Upstream already has this fixed.

Depends on D215763

Support float in addition to double in storeCallFloatResult.

Depends on D215764

Hardware support for float16 conversions is limited:

  • ARM supports float32<>float16 conversions with Neon. This is not implemented yet.
  • ARM64 supports float32<>float16 and float64<>float16 conversions.
  • x86/x64 supports float32<>float16 conversions when F16C instructions are supported.

We use the following approach for this initial implementation:

  • Use native conversions when available; otherwise fall back to an ABI call.
  • Represent float16 as float32 throughout the JIT (so no MIRType::Float16 yet),
    because:
    1. float32 is supported on all targets, which minimizes cross-target
      differences,
    2. float32<>float16 conversions are natively supported on the main target
      platforms,
    3. actual float16 arithmetic has even more limited hardware support, so
      float16 values have to be converted to float32 or float64 at some point
      anyway.
  • This also lets us reuse the existing optimisations for MIRType::Float32.
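
Carrying float16 values as float32 is lossless, since every finite float16 is
exactly representable as a float32. A quick exhaustive check, using Python's
struct half-float support as a stand-in (a sketch, not engine code):

```python
import struct

def widening_is_lossless():
    # Every non-NaN float16 value survives a float16 -> float32 round trip
    # unchanged, so float32 can carry float16 values exactly.
    for bits in range(0x10000):
        (h,) = struct.unpack('<e', struct.pack('<H', bits))
        if h != h:
            continue  # skip NaNs; payloads aside, NaN compares unequal to itself
        (as_f32,) = struct.unpack('<f', struct.pack('<f', h))
        if as_f32 != h:
            return False
    return True
```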

The next part will start using the conversion methods from this patch.

Note 1: float64->float16 conversion can't be emulated by a
float64->float32->float16 conversion sequence, because rounding twice (first
to float32, then to float16) can produce a different result than the direct
float64->float16 conversion.
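
This double-rounding hazard can be demonstrated with Python's struct
half-float support; the constant below is one value where the two routes
disagree (a sketch, not SpiderMonkey code):

```python
import struct

def to_f16(x):
    # Round a float64 to the nearest float16 (ties to even), widen back.
    return struct.unpack('<e', struct.pack('<e', x))[0]

def to_f32(x):
    # Round a float64 to the nearest float32.
    return struct.unpack('<f', struct.pack('<f', x))[0]

# 1 + 2^-11 is exactly halfway between the float16 neighbours 1.0 and
# 1 + 2^-10; the tie-breaking 2^-40 is lost when narrowing to float32 first.
d = 1.0 + 2.0**-11 + 2.0**-40

direct   = to_f16(d)          # float64 -> float16: above the midpoint, rounds up
two_step = to_f16(to_f32(d))  # float64 -> float32 -> float16: exact tie, rounds to even
```

Here `direct` is 1 + 2^-10 while `two_step` is 1.0.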

Note 2: float f(int32_t) in "ABIFunctionType.yaml" requires an explicit General -> Float32
entry for the ARM simulator; just adding Int32 -> Float32 led to an error.

Depends on D215765

Inline Math.f16round similarly to how Math.fround is inlined:

  • CacheIRCompiler either calls the conversion methods from part 5
    or calls into the VM.
  • Warp transpiles to MToFloat16, whose implementation is similar to
    that of MToFloat32.
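
The semantics being inlined can be modelled in a few lines, again using
Python's struct half-float support (a hedged sketch of Math.f16round's
observable behaviour, not the engine implementation):

```python
import struct, math

def f16round(x):
    # Round a float64 to the nearest float16, then widen back to float64.
    try:
        return struct.unpack('<e', struct.pack('<e', x))[0]
    except OverflowError:
        # Magnitudes beyond the float16 range round to infinity.
        return math.copysign(math.inf, x)
```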

Depends on D215766

Extend MacroAssembler::loadFromTypedArray to support loading from Float16Array.
This requires passing an additional temp register and a LiveRegisterSet when the
target doesn't natively support float32<>float16 conversions.

Codegen for LoadUnboxedScalar on x86/x64 looks like:

movzwl 0x0(%rdx,%rbx,2), %esi
vmovd %esi, %xmm0
vpmovzxwq %xmm0, %xmm0
vcvtph2ps %xmm0, %xmm0
vucomiss %xmm0, %xmm0
jnp .Lfrom120
movss .Lfrom128(%rip), %xmm0

And on ARM64:

ldr h0, [x2, x3, lsl #1]
fcvt s0, h0
fcmp s0, s0
b.vc -> 1015f
ldr s0, pc+24 (addr 0x70c2b0a96224) ; .const nan
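
Both sequences load a 16-bit element, widen it, and canonicalize NaNs. The
same semantics in Python (a sketch; the helper name is illustrative):

```python
import struct

def load_float16_element(buf, i):
    # Read element i (little-endian) from a Float16Array's backing buffer
    # and widen it, as the JIT widens float16 loads to float32.
    (value,) = struct.unpack_from('<e', buf, 2 * i)
    # Canonicalize NaN, mirroring the compare-and-branch tails in the
    # codegen sequences above.
    return float('nan') if value != value else value
```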

Depends on D215767

Extend the existing DataView code to also support Float16, using similar
changes as the previous part.
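
The DataView path additionally honours a byte offset and an endianness flag;
roughly (a sketch of getFloat16's observable semantics, helper name
illustrative):

```python
import struct

def get_float16(buffer, byte_offset, little_endian=False):
    # DataView.prototype.getFloat16: read two bytes at byte_offset and
    # interpret them as an IEEE 754 half; big-endian unless the flag is set.
    fmt = '<e' if little_endian else '>e'
    return struct.unpack_from(fmt, buffer, byte_offset)[0]
```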

Depends on D215768

Slightly larger changes compared to the previous two parts, because
MacroAssembler::storeToTypedFloatArray had to be changed to perform the
conversions itself instead of leaving them to its caller:

  • CacheIRCompiler::emitStoreTypedArrayElement used ScratchFloat32Scope to
    convert double -> float32, but the same approach won't work for float16,
    because ScratchFloat32Scope is also needed in MacroAssembler::storeFloat16
    to convert float32 -> float16.
  • Therefore the double -> float32 conversion moves into StoreToTypedFloatArray,
    and the double -> float16 conversion into MacroAssembler::storeFloat16.

Codegen for StoreUnboxedScalar on x64 looks like:

vcvtps2ph $0x4, %xmm0, %xmm15
vmovd %xmm15, %r11d
movw %r11w, 0x0(%rdx,%rbx,2)

And on ARM64:

fcvt h31, s0
str h31, [x2, x4, lsl #1]
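
The convert-and-store sequences above narrow the value and write two bytes;
in Python terms (helper name illustrative; out-of-range values narrow to
infinity under round-to-nearest):

```python
import struct, math

def store_float16_element(buf, i, value):
    # Narrow a float64 to float16 and write it little-endian at element i.
    try:
        struct.pack_into('<e', buf, 2 * i, value)
    except OverflowError:
        # Magnitudes beyond the float16 range narrow to same-signed infinity.
        struct.pack_into('<e', buf, 2 * i, math.copysign(math.inf, value))
```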

Depends on D215769

Transpiler and type policies add the following instructions when reading and then
storing a value from a Float16Array:

value = MLoadUnboxedScalar(f16array)
guarded_value = MToDouble(value) <-- Inserted by WarpCacheIRTranspiler
typed_value = MToFloat16(guarded_value) <-- Inserted by StoreUnboxedScalarPolicy
MStoreUnboxedScalar(f16array, typed_value)

Neither MToDouble nor MToFloat16 are needed, so let MToFloat16::foldsTo remove them.

This extra folding is needed because we don't yet have a MIRType::Float16 which we
can handle in MToFloat16::foldsTo.

The WarpCacheIRTranspiler change is an optimisation to avoid generating the following
instructions during transpiling and applying the type policy:

value = MLoadUnboxedScalar(f16array)
double_value = MToDouble(value) <-- Inserted by js::jit::AlwaysBoxAt
boxed_value = MBox(double_value) <-- Inserted by BoxPolicy
unboxed_value = MUnbox(boxed_value, Double) <-- Inserted by WarpCacheIRTranspiler

GVN will remove the MBox->MUnbox sequence, but it seems preferable to avoid generating it
in the first place.
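
A minimal sketch of the folding idea; the node class and opcode names here
are illustrative stand-ins, not SpiderMonkey's actual MIR API:

```python
class Node:
    def __init__(self, op, operand=None):
        self.op = op
        self.operand = operand

def to_float16_folds_to(node):
    # ToFloat16(ToDouble(x)) == x when x already carries a float16 value:
    # widening float16 -> double is exact, so narrowing back is a no-op.
    # Look through the ToDouble and drop both conversions.
    assert node.op == 'ToFloat16'
    input_node = node.operand
    if input_node is not None and input_node.op == 'ToDouble':
        input_node = input_node.operand
    if input_node is not None and input_node.op == 'LoadFloat16':
        return input_node
    return node
```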

Depends on D215771

Remove the TODO note about adding Float16Array JIT support by renaming
OutOfLineLoadTypedArrayOutOfBounds to OutOfLineAsmJSLoadHeapOutOfBounds,
which makes it clearer that Float16 support isn't needed here.

Depends on D215772

Before this change:

[Codegen] vucomiss   %xmm0, %xmm0
[Codegen] jnp        .Lfrom214
[Codegen] movss       .Lfrom222(%rip), %xmm0

After this change:

[Codegen] vucomiss   %xmm0, %xmm0
[Codegen] jnp        .Lfrom214
[Codegen] movss      .Lfrom222(%rip), %xmm0

Note how the label identifiers are now properly aligned.

Depends on D215773

ToFloat32(ToDouble(float32)) is exactly equal to float32, so MToDouble can
produce Float32 when its input can produce Float32. This change is necessary to
enable Float32 optimizations for various instructions, for example MSqrt.

Without this change Float32 optimizations are always disabled, which makes it
hard to verify that Float16 operations correctly handle Float32 inputs and
outputs.
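
The identity can be checked directly. In Python every float is a float64, so
a float32 value is modelled by narrowing first (a sketch, not engine code):

```python
import struct

def to_f32(x):
    # Model a float32 value as the nearest-float32 image of a float64.
    return struct.unpack('<f', struct.pack('<f', x))[0]

# ToFloat32(ToDouble(f)) == f for every float32 f: widening to double is
# exact, so narrowing back returns the original value.
for f in (0.0, -0.0, 1.5, to_f32(0.1), 3.4028234663852886e+38, float('inf')):
    assert to_f32(float(f)) == f
```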

Depends on D215774

Mentor: dminor
Pushed by andre.bargull@gmail.com:
https://hg.mozilla.org/integration/autoland/rev/410329e58599
Part 1: Add float/int conversions to js::float16. r=jandem
https://hg.mozilla.org/integration/autoland/rev/9d01c98d3d64
Part 2: Support encoding vcvtph2ps and vcvtps2ph instructions. r=jandem
https://hg.mozilla.org/integration/autoland/rev/ade51dbcc573
Part 3: Add missing to Float16 conversion when simulating FCVT_dh. r=jandem
https://hg.mozilla.org/integration/autoland/rev/3f2cb72c6348
Part 4: Support `float` results in storeCallFloatResult. r=jandem
https://hg.mozilla.org/integration/autoland/rev/c6ec3155a5f8
Part 5: Add {Double,Float32,Int32}ToFloat16 conversion methods to MacroAssemblers. r=jandem
https://hg.mozilla.org/integration/autoland/rev/f8cdec89d3cc
Part 6: Inline Math.f16round. r=jandem
https://hg.mozilla.org/integration/autoland/rev/d5ea08f74244
Part 7: Inline loading from Float16Array. r=jandem
https://hg.mozilla.org/integration/autoland/rev/816e1e8497d5
Part 8: Inline DataView.prototype.getFloat16. r=jandem
https://hg.mozilla.org/integration/autoland/rev/1543e18cfd43
Part 9: Inline storing into Float16Array. r=jandem
https://hg.mozilla.org/integration/autoland/rev/b39efb191c8e
Part 10: Inline DataView.prototype.setFloat16. r=jandem
https://hg.mozilla.org/integration/autoland/rev/10fb376db8cf
Part 11: Fold ToFloat16 when the input is guaranteed to be Float16. r=jandem
https://hg.mozilla.org/integration/autoland/rev/1006c87386c1
Part 12: AsmJS codegen doesn't support Float16Array. r=jandem
https://hg.mozilla.org/integration/autoland/rev/a67dc538eaee
Part 13: Fix indentation for codegen spew. r=jandem
https://hg.mozilla.org/integration/autoland/rev/f0bc3536dce7
Part 14: Enable float32 optimizations for MToDouble. r=jandem,nbp
https://hg.mozilla.org/integration/autoland/rev/e0c206023ab0
apply code formatting via Lando