I would like to create a protocol like the following:
protocol Parser {
    func parse() -> ParserOutcome<?>
}
enum ParserOutcome<Result> {
    case result(Result)
    case parser(Parser)
}
I want to have parsers that return either a result of a specific type, or another parser.
If I use an associated type on Parser, then I can't use Parser in the enum. If I specify a generic type on the parse() function, then I can't define it in the implementation without a generic type.
How can I achieve this?
Using generics, I could write something like this:
class Parser<Result> {
    func parse() -> ParserOutcome<Result> { ... }
}
enum ParserOutcome<Result> {
    case result(Result)
    case parser(Parser<Result>)
}
This way, a Parser would be parameterized by the result type. parse() can return a result of the Result type, or any kind of parser that would output either a result of the Result type, or another parser parameterized by the same Result type.
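For illustration, the generic version could be filled in roughly as follows. This is only a sketch: BarParser, FooParser, the run helper and the fatalError default body are hypothetical names and choices added here, not part of the code above.

```swift
// Base class parameterized by the result type; subclasses override parse().
class Parser<Result> {
    func parse() -> ParserOutcome<Result> {
        fatalError("subclasses must override parse()")
    }
}

enum ParserOutcome<Result> {
    case result(Result)
    case parser(Parser<Result>)
}

// Hypothetical parser that produces a final value.
final class BarParser: Parser<String> {
    override func parse() -> ParserOutcome<String> {
        return .result("bar")
    }
}

// Hypothetical parser that hands off to a different parser with the same Result.
final class FooParser: Parser<String> {
    override func parse() -> ParserOutcome<String> {
        return .parser(BarParser())
    }
}

// Hypothetical helper: follows the chain of parsers until a result appears.
func run<Result>(_ parser: Parser<Result>) -> Result {
    var current = parser
    while true {
        switch current.parse() {
        case .result(let value):
            return value
        case .parser(let next):
            current = next
        }
    }
}

let start: Parser<String> = FooParser()
print(run(start)) // prints "bar"
```

FooParser can return a BarParser here because both are Parser<String>; that is the flexibility the protocol version should keep.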
With associated types, however, as far as I can tell, I'll always have a Self constraint:
protocol Parser {
    associatedtype Result
    func parse() -> ParserOutcome<Result, Self>
}
enum ParserOutcome<Result, P: Parser> where P.Result == Result {
    case result(Result)
    case parser(P)
}
In this case, I can no longer return any type of parser that produces the same Result type; it has to be the same type of parser.
I would like to obtain the same behavior with the Parser protocol as I would with a generic definition, and I would like to do that within the bounds of the type system, without introducing new boxed types, just like I can with a normal generic definition.
It seems to me that defining associatedtype OutcomeParser: Parser inside the Parser protocol, then returning an enum parameterized by that type, would solve the problem, but if I try to define OutcomeParser that way, I get the error:
Type may not reference itself as a requirement
I wouldn't be so quick to dismiss type erasures as "hacky" or "working around [...] the type system". In fact, I'd argue that they work with the type system to provide a useful layer of abstraction when working with protocols (and, as already mentioned, they are used in the standard library itself, e.g. AnySequence, AnyIndex and AnyCollection).
As you said yourself, all you want to do here is have the possibility of either returning a given result from a parser, or another parser that works with the same result type. We don't care about the specific implementation of that parser; we just want to know that it has a parse() method that returns a result of the same type, or another parser with that same requirement.
A type erasure is perfect for this kind of situation, as all you need to do is take a reference to a given parser's parse() method, allowing you to abstract away the rest of the implementation details of that parser. It's important to note that you aren't losing any type safety here: you're being exactly as precise about the type of the parser as your requirement specifies.
If we look at a potential implementation of a type-erased parser, AnyParser, hopefully you'll see what I mean:
struct AnyParser<Result> : Parser {
    // A reference to the underlying parser's parse() method
    private let _parse : () -> ParserOutcome<Result>
    // Accept any base that conforms to Parser, and has the same Result type
    // as the type erasure's generic parameter
    init<T: Parser>(_ base: T) where T.Result == Result {
        _parse = base.parse
    }
    // Forward calls to parse() to the underlying parser's method
    func parse() -> ParserOutcome<Result> {
        return _parse()
    }
}
Now in your ParserOutcome, you can simply specify that the parser case has an associated value of type AnyParser<Result>, i.e. any kind of parsing implementation that can work with the given Result generic parameter.
protocol Parser {
    associatedtype Result
    func parse() -> ParserOutcome<Result>
}
enum ParserOutcome<Result> {
    case result(Result)
    case parser(AnyParser<Result>)
}
...
struct BarParser : Parser {
    func parse() -> ParserOutcome<String> {
        return .result("bar")
    }
}
struct FooParser : Parser {
    func parse() -> ParserOutcome<Int> {
        let nextParser = BarParser()
        // error: Cannot convert value of type 'AnyParser<Result>'
        // (aka 'AnyParser<String>') to expected argument type 'AnyParser<_>'
        return .parser(AnyParser(nextParser))
    }
}
let f = FooParser()
let outcome = f.parse()
switch outcome {
case .result(let result):
    print(result)
case .parser(let parser):
    let nextOutcome = parser.parse()
}
You can see from this example that Swift is still enforcing type safety. We're trying to wrap a BarParser instance (that works with Strings) in an AnyParser type-erased wrapper that expects an Int generic parameter, resulting in a compiler error. Once FooParser is parameterised to work with Strings instead of Int, the compiler error will be resolved.
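To make that last point concrete, here is a self-contained sketch of the type-erased version with FooParser switched to String. The run helper is a hypothetical addition for driving the chain, and the where clause is written in the trailing position that current Swift compilers expect.

```swift
protocol Parser {
    associatedtype Result
    func parse() -> ParserOutcome<Result>
}

enum ParserOutcome<Result> {
    case result(Result)
    case parser(AnyParser<Result>)
}

// Type-erased wrapper: stores only the wrapped parser's parse() method.
struct AnyParser<Result>: Parser {
    private let _parse: () -> ParserOutcome<Result>

    init<T: Parser>(_ base: T) where T.Result == Result {
        _parse = base.parse
    }

    func parse() -> ParserOutcome<Result> {
        return _parse()
    }
}

struct BarParser: Parser {
    func parse() -> ParserOutcome<String> {
        return .result("bar")
    }
}

// Now parameterised over String, so wrapping a BarParser type-checks.
struct FooParser: Parser {
    func parse() -> ParserOutcome<String> {
        return .parser(AnyParser(BarParser()))
    }
}

// Hypothetical helper: keeps parsing until some parser yields a result.
func run<P: Parser>(_ parser: P) -> P.Result {
    var outcome = parser.parse()
    while true {
        switch outcome {
        case .result(let value):
            return value
        case .parser(let next):
            outcome = next.parse()
        }
    }
}

print(run(FooParser())) // prints "bar"
```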
In fact, as AnyParser in this case only acts as a wrapper for a single method, another potential solution (if you really detest type erasures) is to simply use this directly as your ParserOutcome's associated value.
protocol Parser {
    associatedtype Result
    func parse() -> ParserOutcome<Result>
}
enum ParserOutcome<Result> {
    case result(Result)
    case anotherParse(() -> ParserOutcome<Result>)
}
struct BarParser : Parser {
    func parse() -> ParserOutcome<String> {
        return .result("bar")
    }
}
struct FooParser : Parser {
    func parse() -> ParserOutcome<String> {
        let nextParser = BarParser()
        return .anotherParse(nextParser.parse)
    }
}
...
let f = FooParser()
let outcome = f.parse()
switch outcome {
case .result(let result):
    print(result)
case .anotherParse(let nextParse):
    let nextOutcome = nextParse()
}
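With the closure-based outcome, a chain of any length can be unwound in a loop rather than with nested switch statements. The sketch below assumes a hypothetical resolved() method on ParserOutcome and a hypothetical CountdownParser that hands off to a fresh copy of itself a few times before finishing; neither is part of the code above.

```swift
protocol Parser {
    associatedtype Result
    func parse() -> ParserOutcome<Result>
}

enum ParserOutcome<Result> {
    case result(Result)
    case anotherParse(() -> ParserOutcome<Result>)
}

extension ParserOutcome {
    // Hypothetical helper: calls each follow-up parse until a result is reached.
    func resolved() -> Result {
        var current = self
        while true {
            switch current {
            case .result(let value):
                return value
            case .anotherParse(let nextParse):
                current = nextParse()
            }
        }
    }
}

// Hypothetical parser that defers `remaining` times before producing a value.
struct CountdownParser: Parser {
    let remaining: Int

    func parse() -> ParserOutcome<String> {
        if remaining == 0 {
            return .result("done")
        }
        let next = CountdownParser(remaining: remaining - 1)
        return .anotherParse(next.parse)
    }
}

print(CountdownParser(remaining: 3).parse().resolved()) // prints "done"
```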
Status of the features needed to make this work:
Looks like this is currently not possible without introducing boxed types (the "type erasure" technique), and is something looked at for a future version of Swift, as described by the Recursive protocol constraints and Arbitrary requirements in protocols sections of the Complete Generics Manifesto (since generic protocols are not going to be supported).
When Swift supports these two features, the following should become valid:
protocol Parser {
    associatedtype Result
    associatedtype SubParser: Parser where SubParser.Result == Result
    func parse() -> ParserOutcome<Result, SubParser>
}
enum ParserOutcome<Result, SubParser: Parser> where SubParser.Result == Result {
    case result(Result)
    case parser(SubParser)
}
With generic typealiases, the subparser type could also be extracted as:
typealias SubParser<Result> = Parser where SubParser.Result == Result
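On a toolchain that accepts both features (to my knowledge, Swift 4.1 and later), the protocol with a constrained SubParser can be tried out as follows. This is a sketch under that assumption: BarParser, FooParser and run are hypothetical, and the associated types are spelled out with explicit typealiases rather than left to inference.

```swift
protocol Parser {
    associatedtype Result
    associatedtype SubParser: Parser where SubParser.Result == Result
    func parse() -> ParserOutcome<Result, SubParser>
}

enum ParserOutcome<Result, SubParser: Parser> where SubParser.Result == Result {
    case result(Result)
    case parser(SubParser)
}

// Terminal parser: it never returns a sub-parser, so it names itself as SubParser.
struct BarParser: Parser {
    typealias Result = String
    typealias SubParser = BarParser

    func parse() -> ParserOutcome<String, BarParser> {
        return .result("bar")
    }
}

// Parser that hands off to BarParser, which has the same Result type.
struct FooParser: Parser {
    typealias Result = String
    typealias SubParser = BarParser

    func parse() -> ParserOutcome<String, BarParser> {
        return .parser(BarParser())
    }
}

// Hypothetical helper: recurses through sub-parsers until a result is produced.
func run<P: Parser>(_ parser: P) -> P.Result {
    switch parser.parse() {
    case .result(let value):
        return value
    case .parser(let next):
        return run(next)
    }
}

print(run(FooParser())) // prints "bar"
```

Each conforming type fixes a single SubParser type at compile time, so this is less flexible than the type-erased version, where any parser with a matching Result can be returned.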