This proposal has been implemented with some changes in Scala 3.0.0.
Authors: Erik Osheim and Jorge Vicente Cantero
Supervisor and advisor: Sébastien Doeraene
## History

| Date          | Version       |
|---------------|---------------|
| Sep 20th 2017 | Initial Draft |
## Introduction
This is a proposal to introduce syntax for type aliases that exist only at compile time and emulate wrapper types.
The goal is that operations on these wrapper types must not create any extra overhead at runtime, while still providing type-safe use at compile time.
Some use cases for opaque types are:

- Implementing type members while retaining parametricity. Currently, concrete type definitions are treated as type aliases, i.e. they are expanded in-place.
- New numeric classes, such as unsigned integers. There would no longer need to be a boxing overhead for such classes. This is similar to value types in .NET and `newtype` in Haskell. Many APIs currently use signed integers (to avoid overhead) but document that the values will be treated as unsigned.
- Classes representing units of measure. Again, no boxing overhead would be incurred for these classes.
- Classes representing different entities with the same underlying type, such as `Id` and `Password` being defined in terms of `String`.
We expand on all these points in our motivation section.
For a definition of boxing and previous state-of-the-art, we recommend reading SIP-15.
### Implementation note
Opaque types have been implemented in Scala 3.0.0 with some changes compared to this proposal.
## Opaque types

### Motivation
Authors often introduce type aliases to differentiate many values that
share a very common type (e.g. `String`, `Int`, `Double`, `Boolean`,
etc.). In some cases, these authors may believe that using type
aliases such as `Id` and `Password` means that if they later mix these
values up, the compiler will catch their error. However, since type
aliases are replaced by their underlying type (e.g. `String`), these
values are considered interchangeable (i.e. type aliases are not
appropriate for differentiating various `String` values).
One appropriate solution to the above problem is to create case
classes which wrap `String`. This works, but incurs a runtime overhead
(for every `String` in the previous system we also allocate a wrapper,
or a “box”). In many cases this is fine, but in some it is not.
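To make the allocation trade-off concrete, here is a minimal sketch of the case-class approach (Scala 3 top-level definitions; the `login` helper is a hypothetical consumer, not part of the proposal). Mixing up arguments is now a compile error, but every value carries a wrapper allocation:

```scala
// Case-class wrappers: type-safe, but each value allocates a box at runtime.
case class Id(value: String)
case class Password(value: String)

// Hypothetical consumer: swapping the arguments is a compile error,
// but every Id/Password wraps its String in an extra object.
def login(id: Id, pw: Password): String =
  s"user=${id.value} pw=${"*" * pw.value.length}"

// login(Password("hunter2"), Id("alice")) // does not compile
```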
Value classes, a Scala feature proposed in SIP-15, were introduced to the language to offer classes that could be inlined in some scenarios, thereby removing runtime overhead. These scenarios, while certainly common, do not cover the majority of scenarios that library authors have to deal with. In reality, experimentation shows that they are insufficient, and hence performance-sensitive code suffers (see Appendix A).
In the following example, we show the current limitations of value classes and motivate the need for compile-time wrapper types.
### An example
Imagine we are working with floating-point data that are logarithmically-scaled (e.g. sensor data). We might have gigabytes of data which would be convenient to represent in terms of arrays of primitive doubles (rather than boxing each value).
We might choose to use value classes as follows:
```scala
package object valueclass {
  class Logarithm(val exponent: Double) extends AnyVal {
    def toDouble: Double = math.exp(exponent)
    def +(that: Logarithm): Logarithm = Logarithm(toDouble + that.toDouble)
    def *(that: Logarithm): Logarithm = new Logarithm(exponent + that.exponent)
  }

  object Logarithm {
    def apply(x: Double): Logarithm = new Logarithm(math.log(x))
  }
}

package object usesites {
  import valueclass._
  // 1e7 is approximately (e ** 16.11809565095832),
  // so x is encoded as 16.11809565095832.
  val x: Logarithm = Logarithm(1e7)
}
```
Users of this library can use `new Logarithm(...)` to wrap `Double`
values (and access alternate implementations of `+` and `*`) without
allocating `Logarithm` values. This is confirmed when users have a
look at the generated bytecode for `Logarithm` and notice that the
signature of `apply` is redefined as `def apply(x: Double): Double`,
and that `val x: Logarithm = Logarithm(1e7)` is instead
`val x: Double = Logarithm.apply(1e7)`.
Value classes can rewrite many instances of `Logarithm` in terms of
`Double`, including local variables, method parameters, return types,
and most fields. SIP-15 lays out the exact circumstances when these
rules occur, and extension methods ensure that boxing and unboxing
occurs transparently.
Unfortunately, this transparent boxing is relatively easy to
trigger. Some very common situations where `Logarithm` instances will
be allocated at runtime include:

- use in arrays (e.g. `Array[Logarithm]`)
- use in generic collections (e.g. `List[Logarithm]`)
- parameters or return values of anonymous functions
- calling `equals` or `hashCode` on a `Logarithm`
Concretely, consider:
```scala
val x = Logarithm(1e5)
val xs = List(Logarithm(12345.0), Logarithm(67890.0)).map(_ + x)
```
When looking at the bytecode, the author will notice that the
emitted code boxes `Double` values into allocated `Logarithm` values,
stores those in the list, then produces a
`Function1[Logarithm, Logarithm]` which unboxes each logarithm and
allocates a new result. Users who expect use of value classes to be
allocation-free (and equivalent to working with the underlying values)
will be sorely disappointed in these cases.
Appendix A illustrates more cases where this unintended boxing/unboxing happens, hurting the performance of the code.
Boxing/unboxing of value classes happens anywhere in the program where
the type signatures are generic and require the runtime to pass a
`java.lang.Object`.
Given the frequency of these scenarios in real-world code, we would like to introduce the notion of compile-time wrapper types. Compile-time wrapper types do not depend on the runtime and its capabilities, and are always erased at runtime. The previous code snippet can be implemented using opaque types and will produce code that never boxes/unboxes.
In exchange, opaque types disallow the redefinition of `equals` and
`hashCode`, which in our experience is often unnecessary.
### Definition
Let’s continue with the example in our motivation section, and define `Logarithm` with an opaque type:

```scala
package object opaquetypes {
  opaque type Logarithm = Double
}
```
The opaque type definition is akin to that of type aliases, except that it
is prefixed by a new `opaque` keyword.
Opaque types are always accompanied by type companions. Type companions define the public API of an opaque type, and they are defined in the same way that class companions are. It is possible to define an opaque type without a type companion, but then the opaque type is useless.
Let’s define a type companion for our previous opaque type `Logarithm`:
```scala
package object opaquetypes {
  // The previous opaque type definition
  opaque type Logarithm = Double

  object Logarithm {
    // These are the ways to lift to the logarithm type
    def apply(d: Double): Logarithm = math.log(d)

    def safe(d: Double): Option[Logarithm] =
      if (d > 0.0) Some(math.log(d)) else None

    // This is the first way to unlift the logarithm type
    def exponent(l: Logarithm): Double = l

    // Extension methods define opaque types' public APIs
    implicit class LogarithmOps(val `this`: Logarithm) extends AnyVal {
      // This is the second way to unlift the logarithm type
      def toDouble: Double = math.exp(`this`)
      def +(that: Logarithm): Logarithm = Logarithm(math.exp(`this`) + math.exp(that))
      def *(that: Logarithm): Logarithm = `this` + that // exponents add under multiplication
    }
  }
}
```
The above `Logarithm` type companion contains the following definitions:

- Methods to lift the type from `Double` to `Logarithm` (i.e. `apply` and `safe`)
- Methods to lower the type from `Logarithm` to `Double` (i.e. `exponent`)
- Extension methods to unlift the type from `Logarithm` to `Double` (i.e. `toDouble`)
- Extension methods to define more operations on the type (i.e. `+` and `*`)
The key peculiarity of opaque types is that they behave like normal type aliases inside their type companion; that is, users can convert from the type alias and its equivalent definition interchangeably without the use of explicit lift and unlift methods. We can say that opaque types are “transparent” inside their type companion.
However, the story changes for users of this API. Outside of their type companions, opaque types are not transparent and, therefore, the Scala compiler fails to compile code that pretends they are. A common example is:
```scala
val l: Logarithm = 1.0
```
which fails to compile with a type mismatch error:
```
<console>:11: error: type mismatch;
 found   : Double
 required: Logarithm
       val l: Logarithm = 1.0
                          ^
```
The same happens for `val d: Double = l` where `l` is an instance of `Logarithm`.
The idea, then, is to let library authors create wrapper types and their API in a concrete, isolated part of their code and force users to use this API to lift to and unlift from the opaque type.
By design, downstream users can define more operations on these opaque types via
their own extension methods, but they cannot create a new API to lift/unlift them;
e.g. users need to use the library-provided `toDouble`, `Logarithm.safe` and
`Logarithm.apply`.
The following code showcases legit uses of the `Logarithm` opaque type:
```scala
package object usesites {
  import opaquetypes._
  val l = Logarithm(1.0)
  val l2 = Logarithm(2.0)
  val l3 = l + l2
  val d = l3.toDouble
  val l4: Logarithm = (1.0).asInstanceOf[Logarithm]
}
```
While the following fails to typecheck:
```scala
package object usesites {
  import opaquetypes._
  val l: Logarithm = Logarithm(1.0)
  val d: Double = l       // fails to typecheck
  val l2: Logarithm = 1.0 // fails to typecheck
}
```
For the sake of completeness, this is how you extend these opaque types with more operations:
```scala
package object usesites {
  // ...
  implicit class UserOps(`this`: Logarithm) extends AnyVal {
    // one possible definition: exponentiation via the public unlift/lift API
    def **(that: Logarithm): Logarithm =
      Logarithm(math.pow(`this`.toDouble, that.toDouble))
  }
}
```
Note that the rationale for this encoding is to allow users to convert between the opaque type and the underlying type in constant time. Users have to be able to tag complex type structures without having to reallocate them, iterate over them, or inspect them.
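To illustrate the constant-time property, the following sketch (written in the Scala 3 syntax that eventually shipped; the `fromExponents` helper is our own, not part of the proposal) tags an entire array of exponents without copying it:

```scala
object Logs {
  opaque type Logarithm = Double

  object Logarithm {
    def apply(d: Double): Logarithm = math.log(d)
    def exponent(l: Logarithm): Double = l
    // Inside the defining scope, Array[Logarithm] and Array[Double] are the
    // same type, so this conversion is O(1): no copy, no per-element work.
    def fromExponents(ds: Array[Double]): Array[Logarithm] = ds
  }
}
```

Because `Logarithm` erases to `Double`, the returned array is the very same runtime object that was passed in.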
### Formal definition
The Scala Language doesn’t have a concept of either opaque types or type companions. In the following section, we formalize our previous definitions and specify the required changes to the Scala Language Specification.
#### Opaque type definition

An opaque type follows the same structure as an alias type, but it requires the use of a new `opaque` modifier.
```
LocalModifier ::= ‘abstract’
                | ‘final’
                | ‘sealed’
                | ‘implicit’
                | ‘lazy’
                | ‘opaque’
```
This new modifier is then used to qualify the type alias definition:
```
Def     ::= [‘opaque‘] ‘type’ {nl} TypeDef
TypeDef ::= id [TypeParamClause] ‘=’ Type
```
Opaque modifiers are only valid for type definitions. Note that contrary to type alias definitions, opaque type definitions cannot be overridden.
Here’s a formal definition of opaque types:

> An opaque `type t = T` defines `t` to be an alias name for the type `T` only in the scope of the opaque type companion `t`. The left-hand side of an opaque type may have a type parameter clause, e.g. `type t[tps] = T`. The scope of a type parameter extends over the right-hand side `T` and the type parameter clause `tps` itself.
As per this definition, opaque types modify the type equivalence relationship explained in the 3.5.1. Equivalence section of the Scala Language Specification. In particular, the next item qualifies the type equivalence for opaque types:
> If `t` is defined by an opaque type `opaque type t = T`, then `t` is not equivalent to `T` unless `t` occurs in the template of the opaque type companion.
In the Implementation notes, we explain how this can be achieved in the implementation.
#### Opaque type companion
We define a type companion in a way similar to how companion classes are described in the Scala Language Specification in 5.5. Object Definitions:
Generally, a companion module of an opaque type is an object which has the same name as the opaque type and is defined in the same scope and compilation unit. In contrast with companion classes, the companion of such a module is a type rather than a class.
#### Opaque type companions and implicit search
These opaque type companions are also part of the implicit search scope of the opaque type `t`.
Therefore, uses of extension methods defined in the opaque type companion do not require users
to import the opaque type companion explicitly.
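A minimal sketch of this behavior, in the Scala 3 syntax that eventually shipped (the `Meters` name and its members are hypothetical): the extension method is found through the implicit scope of `Meters`, with no import of the companion's members at the use site.

```scala
object units {
  opaque type Meters = Double

  object Meters {
    def apply(d: Double): Meters = d
    // Lives in the companion, so it is in the implicit scope of Meters.
    extension (m: Meters) def toDouble: Double = m
  }
}

object UseSite {
  import units.Meters // brings in only the name, not the companion's members
  val m: Meters = Meters(2.5)
  val d: Double = m.toDouble // resolved via the companion's implicit scope
}
```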
### Technical discussions
In our current proposal, we make several trade-offs. Next, we defend these trade-offs and propose alternative ways to achieve the same goals, where they exist.
#### `opaque` as a modifier

`opaque` has been added as a modifier to a type definition for simplicity. We believe that a new
keyword for this feature is worthwhile.
Note that adding `opaque` as a keyword prevents the use of `opaque` as an identifier anywhere in
users’ programs, which could possibly break someone’s code. To fix this scenario, we could create a
Scalafix rule that rewrites any place where `opaque` is used as an identifier.
For those SIP Committee members wary of adding a new keyword to the language, we propose an
alternative approach. Instead of defining opaque types with the `opaque` modifier, opaque types
may also be defined by combining the existing `new` and `type` keywords, e.g.
`new type Logarithm = Double`. This option would be akin to the syntax used in Haskell for wrapper
types, e.g. `newtype`.
#### Type companions
This proposal only adds the notion of type companions for opaque types. After discussing with members of the SIP Committee and Scala compiler hackers, we believe that a notion of type companions extended to type aliases would not work because the typechecker aggressively dealiases**, and it is not possible to link to the type companion symbol once type aliases have been dealiased.
** Note that dealiasing means beta reducing a type alias.
#### Static extension methods
The extension methods synthesized for implicit value classes are not static. We have not measured whether the current encoding has an impact on performance, but if so, we can consider changing the encoding of extension methods either in this proposal or in a future one.
### More examples

#### Type tagging
This example focuses on the use of type parameters in opaque types (one of
which is a phantom type). The goal here is to be able to mark concrete
types (`S`) with arbitrary tags (`T`) which can be used to
distinguish them at compile-time.
```scala
package object tagging {
  import scala.reflect.ClassTag

  // Tagged[S, T] means that S is tagged with T
  opaque type Tagged[S, T] = S

  object Tagged {
    def tag[S, T](s: S): Tagged[S, T] = s
    def untag[S, T](st: Tagged[S, T]): S = st

    def tags[F[_], S, T](fs: F[S]): F[Tagged[S, T]] = fs
    def untags[F[_], S, T](fst: F[Tagged[S, T]]): F[S] = fst

    implicit def taggedClassTag[S, T](implicit ct: ClassTag[S]): ClassTag[Tagged[S, T]] =
      ct
  }

  type @@[S, T] = Tagged[S, T]

  implicit class UntagOps[S, T](st: S @@ T) extends AnyVal {
    def untag: S = Tagged.untag(st)
  }

  implicit class UntagsOps[F[_], S, T](fs: F[S @@ T]) extends AnyVal {
    def untags: F[S] = Tagged.untags(fs)
  }

  implicit class TagOps[S](s: S) extends AnyVal {
    def tag[T]: S @@ T = Tagged.tag(s)
  }

  implicit class TagsOps[F[_], S](fs: F[S]) extends AnyVal {
    def tags[T]: F[S @@ T] = Tagged.tags(fs)
  }

  trait Meter
  trait Foot
  trait Fathom

  val x: Double @@ Meter = (1e7).tag[Meter]
  val y: Double @@ Foot = (123.0).tag[Foot]
  val xs: Array[Double @@ Meter] = Array(1.0, 2.0, 3.0).tags[Meter]

  val o: Ordering[Double] = implicitly
  val om: Ordering[Double @@ Meter] = o.tags[Meter]

  om.compare(x, x) // 0
  om.compare(x, y) // does not compile
  xs.min(om)       // 1.0
  xs.min(o)        // does not compile

  // uses ClassTag[Double] via 'Tagged.taggedClassTag'.
  val ys = new Array[Double @@ Foot](20)
}
```
There are a few interesting things to note here.
First, as above, we expect that tagging and untagging will not cause
any boxing of these values at runtime, even though `Tagged` is
generic. We also expect that the `Array[Double @@ Meter]` will be
represented by `Array[Double]` at runtime.
Second, notice that `Ordering[Double]` and `ClassTag[Double]` are not
automatically in scope for `Tagged[Double, Meter]`. Opaque types
currently need to “re-export” (or otherwise provide) their own
implicit values.
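A condensed sketch of such a re-export, in the Scala 3 syntax that eventually shipped (the `Tags` object and the `ordering` name are ours): inside the defining scope, `Tagged[S, T]` and `S` are the same type, so the underlying instance can simply be handed back under the tagged type.

```scala
object Tags {
  opaque type Tagged[S, T] = S

  object Tagged {
    def tag[S, T](s: S): Tagged[S, T] = s
    // Re-export: within the defining scope Tagged[S, T] = S, so the
    // underlying Ordering[S] already has the right shape.
    implicit def ordering[S, T](implicit o: Ordering[S]): Ordering[Tagged[S, T]] = o
  }
}

object OrderingDemo {
  import Tags.Tagged
  trait Meter
  val a: Tagged[Double, Meter] = Tagged.tag(1.0)
  val b: Tagged[Double, Meter] = Tagged.tag(2.0)
  // Found via the companion's implicit scope, no import needed.
  val cmp = implicitly[Ordering[Tagged[Double, Meter]]].compare(a, b)
}
```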
It would be possible to automatically provide `ClassTag` instances,
using an `implicit val` in the case of opaque types wrapping concrete
types (e.g. `opaque type X = Double`) and an `implicit def` in cases such
as the above.
#### Fix-point type

Here’s an interesting little example which defines the recursive
opaque type `Fix`:
```scala
package object fixed {
  opaque type Fix[F[_]] = F[Fix[F]]

  object Fix {
    def fix[F[_]](unfixed: F[Fix[F]]): Fix[F] = unfixed
    def unfix[F[_]](fixed: Fix[F]): F[Fix[F]] = fixed
  }

  sealed abstract class TreeU[R]

  type Tree = Fix[TreeU]

  object TreeU {
    def cata[A](t: Tree)(f: TreeU[A] => A): A =
      f(Fix.unfix(t) match {
        case Branch(l, r) => Branch(cata(l)(f), cata(r)(f))
        case Leaf(s) => Leaf(s)
      })

    case class Branch[R](left: R, right: R) extends TreeU[R]
    case class Leaf[R](label: String) extends TreeU[R]
  }

  import TreeU.{Branch, Leaf}

  def leaf(s: String): Tree = Fix.fix(Leaf(s))
  def branch(lhs: Tree, rhs: Tree): Tree = Fix.fix(Branch(lhs, rhs))

  val tree: Tree = branch(branch(leaf("a"), leaf("b")), leaf("c"))

  println(tree)
  // Branch(Branch(Leaf(a), Leaf(b)), Leaf(c))
}
```
This is an interesting example which is intended to show that opaque
types (unlike type aliases) have an independent existence, even within
the companion. This means that unlike type aliases, `Fix[F]` should not
result in an infinite expansion in the above code.
The `Fix` type is useful for implementing recursion schemes, or just
for creating recursive structures which are parameterized on their
recursive type.
#### Explicit nullable types

There have previously been proposals to provide a “zero-cost `Option`”
using value classes. Opaque types make this very straightforward by
bounding the underlying type (`A`) with `AnyRef`.
```scala
package object nullable {
  opaque type Nullable[A <: AnyRef] = A

  object Nullable {
    def apply[A <: AnyRef](a: A): Nullable[A] = a

    implicit class NullableOps[A <: AnyRef](na: Nullable[A]) {
      def exists(p: A => Boolean): Boolean =
        na != null && p(na)
      def filter(p: A => Boolean): Nullable[A] =
        if (na != null && p(na)) na else null
      def flatMap[B <: AnyRef](f: A => Nullable[B]): Nullable[B] =
        if (na == null) null else f(na)
      def forall(p: A => Boolean): Boolean =
        na == null || p(na)
      def getOrElse(a: => A): A =
        if (na == null) a else na
      def map[B <: AnyRef](f: A => B): Nullable[B] =
        if (na == null) null else f(na)
      def orElse(na2: => Nullable[A]): Nullable[A] =
        if (na == null) na2 else na
      def toOption: Option[A] =
        Option(na)
    }
  }
}
```
This example provides most of the benefits of using `Option` at API
boundaries with libraries that use `null` (such as many Java
libraries). Unlike a value class, we can guarantee that there will
never be a wrapper around the raw values (or raw nulls).
Notice that `Nullable[Nullable[B]]` is not a valid type, because
`Nullable[B]` is not known to be `<: AnyRef`. This is a key difference
between a type like `Option` (which is parametric and can easily wrap
itself) and a type like `Nullable` (which only has one `null` value to
use).
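A condensed, runnable sketch of the idea in the Scala 3 syntax that eventually shipped (only `apply`, `map` and `getOrElse` are reproduced; `parsePort` is a hypothetical consumer wrapping a null-returning Java API):

```scala
object nullableDemo {
  opaque type Nullable[A <: AnyRef] = A

  object Nullable {
    def apply[A <: AnyRef](a: A): Nullable[A] = a
    extension [A <: AnyRef](na: Nullable[A]) {
      def map[B <: AnyRef](f: A => B): Nullable[B] =
        if (na == null) null else f(na)
      def getOrElse(a: => A): A =
        if (na == null) a else na
    }
  }
}

object UseNullable {
  import nullableDemo.Nullable
  // java.util.Map.get returns null for missing keys; Nullable tracks
  // that fact in the types without allocating any wrapper.
  def parsePort(env: java.util.Map[String, String]): String =
    Nullable(env.get("PORT")).map(_.trim).getOrElse("8080")
}
```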
#### Custom instances
Currently, many libraries (including Scalding and Algebird) define wrapper types which change the kind of aggregation used for that type. This is useful in frameworks like MapReduce, Spark, Flink, Storm, etc. where users describe computation in terms of mapping and aggregation.
The following example shows how opaque types can make using these wrappers a bit more elegant:
```scala
package object groups {
  trait Semigroup[A] {
    def combine(x: A, y: A): A
  }

  object Semigroup {
    def instance[A](f: (A, A) => A): Semigroup[A] =
      new Semigroup[A] {
        def combine(x: A, y: A): A = f(x, y)
      }
  }

  type Id[A] = A

  trait Wrapping[F[_]] {
    def wraps[G[_], A](ga: G[A]): G[F[A]]
    def unwraps[G[_], A](gfa: G[F[A]]): G[A]
  }

  abstract class Wrapper[F[_]] { self =>
    def wraps[G[_], A](ga: G[A]): G[F[A]]
    def unwraps[G[_], A](gfa: G[F[A]]): G[A]

    final def apply[A](a: A): F[A] = wraps[Id, A](a)

    implicit object WrapperWrapping extends Wrapping[F] {
      def wraps[G[_], A](ga: G[A]): G[F[A]] = self.wraps(ga)
      def unwraps[G[_], A](gfa: G[F[A]]): G[A] = self.unwraps(gfa)
    }
  }

  opaque type First[A] = A
  object First extends Wrapper[First] {
    def wraps[G[_], A](ga: G[A]): G[First[A]] = ga
    def unwraps[G[_], A](gfa: G[First[A]]): G[A] = gfa
    implicit def firstSemigroup[A]: Semigroup[First[A]] =
      Semigroup.instance((x, y) => x)
  }

  opaque type Last[A] = A
  object Last extends Wrapper[Last] {
    def wraps[G[_], A](ga: G[A]): G[Last[A]] = ga
    def unwraps[G[_], A](gfa: G[Last[A]]): G[A] = gfa
    implicit def lastSemigroup[A]: Semigroup[Last[A]] =
      Semigroup.instance((x, y) => y)
  }

  opaque type Min[A] = A
  object Min extends Wrapper[Min] {
    def wraps[G[_], A](ga: G[A]): G[Min[A]] = ga
    def unwraps[G[_], A](gfa: G[Min[A]]): G[A] = gfa
    implicit def minSemigroup[A](implicit o: Ordering[A]): Semigroup[Min[A]] =
      Semigroup.instance(o.min)
  }

  opaque type Max[A] = A
  object Max extends Wrapper[Max] {
    def wraps[G[_], A](ga: G[A]): G[Max[A]] = ga
    def unwraps[G[_], A](gfa: G[Max[A]]): G[A] = gfa
    implicit def maxSemigroup[A](implicit o: Ordering[A]): Semigroup[Max[A]] =
      Semigroup.instance(o.max)
  }

  opaque type Plus[A] = A
  object Plus extends Wrapper[Plus] {
    def wraps[G[_], A](ga: G[A]): G[Plus[A]] = ga
    def unwraps[G[_], A](gfa: G[Plus[A]]): G[A] = gfa
    implicit def plusSemigroup[A](implicit n: Numeric[A]): Semigroup[Plus[A]] =
      Semigroup.instance(n.plus)
  }

  opaque type Times[A] = A
  object Times extends Wrapper[Times] {
    def wraps[G[_], A](ga: G[A]): G[Times[A]] = ga
    def unwraps[G[_], A](gfa: G[Times[A]]): G[A] = gfa
    implicit def timesSemigroup[A](implicit n: Numeric[A]): Semigroup[Times[A]] =
      Semigroup.instance(n.times)
  }

  opaque type Reversed[A] = A
  object Reversed extends Wrapper[Reversed] {
    def wraps[G[_], A](ga: G[A]): G[Reversed[A]] = ga
    def unwraps[G[_], A](gfa: G[Reversed[A]]): G[A] = gfa
    implicit def reversedOrdering[A](implicit o: Ordering[A]): Ordering[Reversed[A]] =
      o.reverse
  }

  opaque type Unordered[A] = A
  object Unordered extends Wrapper[Unordered] {
    def wraps[G[_], A](ga: G[A]): G[Unordered[A]] = ga
    def unwraps[G[_], A](gfa: G[Unordered[A]]): G[A] = gfa
    implicit def unorderedOrdering[A]: Ordering[Unordered[A]] =
      Ordering.by(_ => ())
  }
}
```
The example demonstrates using an abstract class (`Wrapper`) to share
code between opaque type companion objects. Like the tagging example,
we can use two methods (`wraps` and `unwraps`) to wrap and unwrap `A`
types, even if nested in an arbitrary context (`G[_]`). These methods
cannot be implemented in `Wrapper` because each opaque type companion
contains the only scope where its particular methods can be
implemented.
Similarly to the tagging example, these types are zero-cost wrappers
which can be used to tag how to aggregate the underlying type (for
example `Int`).
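A small runnable sketch of the pattern, in the Scala 3 syntax that eventually shipped (the `agg` object and the `combineAll` helper are ours, not part of the proposal): aggregation strategy is selected purely by the tag, with no boxing of the underlying `Int`s.

```scala
object agg {
  trait Semigroup[A] { def combine(x: A, y: A): A }

  opaque type First[A] = A
  object First {
    def apply[A](a: A): First[A] = a
    def value[A](fa: First[A]): A = fa
    // Aggregation for First keeps the left-most value.
    implicit def firstSemigroup[A]: Semigroup[First[A]] =
      (x, y) => x
  }
}

object AggDemo {
  import agg._
  // Hypothetical helper: fold a non-empty list with its Semigroup.
  def combineAll[A](xs: List[A])(implicit s: Semigroup[A]): A =
    xs.reduce(s.combine)

  // The First semigroup always keeps the first element.
  val firstWins: Int =
    First.value(combineAll(List(First(3), First(1), First(2))))
}
```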
#### Probability interval
Here’s an example that demonstrates how opaque types can limit the underlying type’s range of values in a way that minimizes the required error-checking:
```scala
package object prob {
  opaque type Probability = Double

  object Probability {
    def apply(n: Double): Option[Probability] =
      if (0.0 <= n && n <= 1.0) Some(n) else None

    def unsafe(p: Double): Probability = {
      require(0.0 <= p && p <= 1.0, s"probabilities lie in [0, 1] (got $p)")
      p
    }

    def asDouble(p: Probability): Double = p

    val Never: Probability = 0.0
    val CoinToss: Probability = 0.5
    val Certain: Probability = 1.0

    implicit val ordering: Ordering[Probability] =
      implicitly[Ordering[Double]]

    implicit class ProbabilityOps(p1: Probability) extends AnyVal {
      def unary_~ : Probability = Certain - p1
      def &(p2: Probability): Probability = p1 * p2
      def |(p2: Probability): Probability = p1 + p2 - (p1 * p2)

      def isImpossible: Boolean = p1 == Never
      def isCertain: Boolean = p1 == Certain

      import scala.util.Random

      def sample(r: Random = Random): Boolean = r.nextDouble <= p1
      def toDouble: Double = p1
    }

    val caughtTrain = Probability.unsafe(0.3)
    val missedTrain = ~caughtTrain
    val caughtCab = Probability.CoinToss
    val arrived = caughtTrain | (missedTrain & caughtCab)

    println((1 to 5).map(_ => arrived.sample()).toList)
    // List(true, true, false, true, false)
  }
}
```
Outside of the `Probability` companion, we can be sure that the
underlying `Double` values fall in the interval [0, 1], which means
we don’t need to include error-checking code when working with
`Probability` values. We can be sure that adding this kind of
compile-time safety to a program doesn’t add any additional cost
(except for the error-checking that we explicitly want).
This example is somewhat similar to `Logarithm` above. Other
properties we might want to verify at compile-time: `NonNegative`,
`Positive`, `Finite`, `Unsigned` and so on.
#### Immutable (i.e. write-once) arrays
Often performance sensitive code will use arrays via the following pattern:
- Allocate an array of a fixed, known size.
- Initialize the array via code which mutates it.
- Return the array, which is now treated as immutable.
In this pattern, the vast majority of time is spent in the third step, where the array’s compactness and speed of iteration/access provide major wins over other data types.
This pattern is currently only enforced by convention. However, opaque types create an opportunity to provide more safety without incurring any overhead:
```scala
package object ia {
  import java.util.Arrays

  opaque type IArray[A] = Array[A]

  object IArray {
    @inline final def initialize[A](body: => Array[A]): IArray[A] = body
    @inline final def size[A](ia: IArray[A]): Int = ia.length
    @inline final def get[A](ia: IArray[A], i: Int): A = ia(i)

    // return a sorted copy of the array
    def sorted[A](ia: IArray[A])(implicit ord: Ordering[A]): IArray[A] = {
      val arr = ia.clone()
      scala.util.Sorting.quickSort(arr)
      arr
    }

    // use a standard java method to search a sorted IArray.
    // (note that this doesn't mutate the array).
    def binarySearch(ia: IArray[Long], elem: Long): Int =
      Arrays.binarySearch(ia, elem)
  }

  // same as IArray.binarySearch but implemented by-hand.
  //
  // given a sorted IArray, returns index of `elem`,
  // or a negative value if not found.
  def binaryIndexOf(ia: IArray[Long], elem: Long): Int = {
    var lower: Int = 0
    var upper: Int = IArray.size(ia) - 1
    while (lower <= upper) {
      val middle = (lower + upper) >>> 1
      val n = IArray.get(ia, middle)
      if (n == elem) return middle
      else if (n < elem) lower = middle + 1
      else upper = middle - 1
    }
    -lower - 1
  }
}
```
This example is a bit different from the others: there’s no attempt to
enrich the `IArray` type with syntactic conveniences. Rather, the goal
is to show that traditional, “low-level” code with `Array`, `Int`,
etc. can be written with opaque types without sacrificing any
performance.
Our other examples enrich existing data types with new functionality. This example serves to constrain the operations used with a type (but without introducing any overhead/indirection, which a traditional wrapper would).
### Differences with value classes
Most of the above examples can also be implemented using value classes. This section will highlight some differences between these hypothetical encodings.
#### Used as a type parameter
In many cases an author would introduce opaque types or value classes
to add extra type safety to a “stringly-typed” API, by replacing
instances of the `String` type with a more specific type.
For example:
```scala
package object pkg {
  import Character.{isAlphabetic, isDigit}

  class Alphabetic private[pkg] (val value: String) extends AnyVal

  object Alphabetic {
    def fromString(s: String): Option[Alphabetic] =
      if (s.forall(isAlphabetic(_))) Some(new Alphabetic(s))
      else None
  }

  opaque type Digits = String

  object Digits {
    def fromString(s: String): Option[Digits] =
      if (s.forall(isDigit(_))) Some(s)
      else None

    def asString(d: Digits): String = d
  }
}
```
The code here is relatively comparable. However, when changing
`String` to `Alphabetic` in code, the following types would be changed
(i.e. boxed):

- `Array[Alphabetic]`
- `Option[Alphabetic]` (e.g. the result of `Alphabetic.fromString`)
- `Vector[Alphabetic]`
- `Alphabetic => Boolean`
- `Map[Alphabetic, Int]`
- `Ordering[Alphabetic]`
- `(Alphabetic, String)`
- `Monoid[Alphabetic]`
In many cases users won’t mind the fact that this code will box, but there will certainly be an impact on the bytecode produced (and possibly the runtime performance).
By contrast, replacing `String` with `Digits` is guaranteed to have no
impact (all occurrences of `Digits` are guaranteed to be erased to
`String`). Aside from the ergonomics of calling the `fromString` and
`asString` methods, there’s no runtime impact versus using the
underlying type.
One wrinkle to the above is that built-in primitive types will
naturally box in some situations (but not others). For example
`List[Double]` will be represented as a `List[java.lang.Double]`,
`(Double, Double)` will be represented as a `scala.Tuple2$mcDD$sp`,
and so on. In these cases, an opaque type will exhibit the same
behavior.
#### Default visibility
By default, a value class’ constructor and accessor are public. It
is possible to restrict access, using a `private` constructor and
`private` accessor, but this makes the comparison between opaque types
and value classes less attractive:
```scala
package object pkg {
  opaque type XCoord = Int
  case class YCoord private[pkg] (private[pkg] val n: Int) extends AnyVal
  // in both cases, we'd need public factory constructors
  // to allow users to produce values of these types.
}
```
Opaque types’ default behavior is more appropriate for information-hiding when defining wrapper types.
#### LUB differences
Value classes extend `AnyVal` by virtue of the syntax used to define
them. One reason this is necessary is that value classes cannot be
`null` (otherwise this could create ambiguities between their boxed
and unboxed representations when wrapping `AnyRef` values).
By contrast, when seen from the “outside”, opaque types extend
`Any`. Their bounds are the same as those of a type parameter or type
member without explicit bounds, i.e. `A >: Nothing <: Any`.
This is not a major difference (for example, under `-Xlint` inferring
either type will generate a warning), but it does illustrate that an
opaque type is standing in for an unknown type (i.e. anything),
whereas a value class introduces its own semantics which remain in the
type system even if we hope to never see the instances:
```scala
class Letters(val toString: String) extends AnyVal
class Digits(val toInt: Int) extends AnyVal

// inferred to be List[AnyVal]
val ys = List(new Letters("abc"), new Digits(123))

// inferred to be List[String]
val xs = List("abc", "123")
```
Through covariance `List[String] <: List[AnyRef]` (and `List[Any]`),
but it is not a `List[AnyVal]`.
#### Size of artifacts produced
Since value classes do have a runtime representation, they do increase the size of the runtime artifacts produced (whether a JAR file, a JavaScript file, or something else). Their methods are also compiled to multiple representations (i.e. they support both the boxed and unboxed forms via extension methods). Again, this comes at a cost.
By contrast, opaque types have no inherent runtime footprint. The opaque type’s companion is present at runtime, but it usually contains validation and data transformation code which would have been required even if the author had just stuck to the underlying type, and doesn’t add any extra extension methods.
### Implementation notes
To implement opaque types, we need to modify three compiler phases: parser, namer and typer. At the time of this writing, it is unclear if later phases like erasure must be changed as well, but we think this should not be necessary.
There are several key ideas in the current, work-in-progress implementation:
- To meet the type equivalence relationship for opaque types, we synthesize two implicit conversions inside the opaque type companion, if they do not already exist. If `opaque type t = T`, then two implicit conversions are synthesized: one from `t` to `T`, and another for the other way around. The body of these methods will use `t.asInstanceOf[T]` and vice versa **.
- Phases after typer always dealias opaque types. This way, erasure and codegen can unwrap opaque types out of the box and, at the bytecode level, their underlying representation is used instead.
All these ideas are open to refinements by the SIP Committee.
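A sketch of what the synthesized members could look like for `opaque type Logarithm = Double` (the `wrap`/`unwrap` names are hypothetical, and a plain type alias stands in for the opaque type so the snippet compiles on its own):

```scala
object synthesized {
  type Logarithm = Double // stand-in for `opaque type Logarithm = Double`

  object Logarithm {
    // The two conversions the compiler would synthesize in the companion:
    implicit def unwrap(l: Logarithm): Double = l.asInstanceOf[Double]
    implicit def wrap(d: Double): Logarithm = d.asInstanceOf[Logarithm]
  }
}
```

Since `Logarithm` erases to `Double`, both casts are no-ops at runtime, which is why the next section discusses removing them outright at compile-time.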
#### On removing `asInstanceOf` at compile-time

** Note that these `asInstanceOf` calls can be removed at compile-time, but there is no precedent
for doing so in the Scalac compiler. However, it is not clear whether leaving these casts in will
have an impact on performance: the underlying virtual machine may remove the operation based on
type analysis, given that the cast is from `Double => Double`, via Class Hierarchy Analysis.
Despite the lack of precedent, the authors of the proposal are inclined to remove these checks at
compile-time. This will also require a modification to the spec.
We’re also assuming that implicit enrichment via value types is a sufficient way to provide zero-cost extension methods. We’ll need to benchmark opaque types + value class enrichment to ensure that there aren’t hidden performance problems.
### Cross-platform
We believe that by implementing opaque types early in the pipeline, Scala.js and Scala Native can compile them out-of-the-box. Thus, we do not expect opaque types to have problems for different backends, since erasure will always see the dealiased types.
### Conclusion
We believe that opaque types fit nicely into the language. The combination of type aliases and value classes (for the zero runtime overhead of extension methods) results in compile-time wrapper types that address the performance issues of value classes.
As a summary, opaque types are:
- A subset of type aliases that are transparent only within the type companion.
- Extensible (users can define their own extension methods).
- Fit for performance-sensitive code (no boxing/unboxing by design).
- A lightweight way to add extra type-safety to strings, numbers, etc.