Swift, meet WinRT

Open sourcing our Swift bindings generator for WinRT - and an end-to-end sample application for anyone looking to build a modern Windows application in Swift.

Oct 30, 2023

An early preview of Arc on Windows today, from Steve’s desktop straight to you.

If you’re interested in building a Windows app using Swift, then this blog post is for you.

In this piece, Steve on our team shares how we built a language projection tool to create Swift bindings for WinRT. It is paired with the release of our newest open source repo, Swift/WinRT, alongside an end-to-end sample application that uses the Swift/WinRT code generator and WinUI 3.

The goal of this post is to share how we, at The Browser Company, have made it possible to use Swift to build a modern Windows application. There is no UI framework for Windows written in Swift, and Windows itself is written in C++, so that may leave you wondering, “How can I build my app on Windows?” Modern Windows applications use WinRT, a technology built on top of COM, which can interop really well with Swift, as we presented in our previous post. To be able to build idiomatic UI for Windows in Swift, we have built a language projection tool which creates idiomatic Swift language bindings for WinRT, and today we are open sourcing it at https://github.com/thebrowsercompany/swift-winrt. Swift/WinRT is based on Microsoft’s code generators for C++ (github.com/microsoft/cppwinrt) and C# (github.com/microsoft/cswinrt), and thus is written in C++.

Before we start discussing the Swift/WinRT code generator, we felt that it would be helpful to provide some context on the history of Windows desktop applications. For a Mac or iOS developer curious about building for Windows, knowing where to start can be a daunting task. A simple web search may feel overwhelming, as Microsoft has produced a number of similar desktop UI frameworks throughout the years, all of which still exist in some form today: MFC, Windows Forms, WPF, UWP, and the Windows App SDK.

WinUI 3, a component of the Windows App SDK, is the latest UI framework from Microsoft. It is used by the Windows operating system and is continually updated to match the latest design guidance from Microsoft. WinUI 3 is built using WinRT, so that makes it a perfect candidate for us Swifties, and is what we are using to build Arc on Windows.

A Brief History of Windows Desktop Development

Windows GUI started with Windows 3. The API shipped by the Windows team was known as the Windows API, and it consisted of flat C APIs. In 1992, the Developer Division (DevDiv) team at Microsoft created Microsoft Foundation Classes (MFC), a C++ class-based wrapper around the Windows API. With Windows NT and, later, Windows 95 came 32-bit Windows and a new application model. These APIs, known as Win32, were still flat C APIs, and DevDiv updated MFC to support the new Win32 APIs. To this day, many Windows apps are still built directly on top of Win32.

In the late 1990s, DevDiv started working on .NET, a managed runtime and ecosystem inspired by Java. With this, they created Windows Forms, a lightweight .NET wrapper around the Win32 APIs and other graphics libraries like GDI.

At this time, .NET was becoming popular and looked like the future for Windows development, so the Windows team created WPF, a .NET UI Framework. WPF is a big milestone in the story of Windows UI, because every framework Microsoft has created since then has been modeled after WPF. The first version of Windows that shipped with WPF was Vista, and with it came the Aero design language. However, Vista was not well received, and one of the issues was the performance problems that came with .NET. This caused the Windows team to abandon WPF and give it to DevDiv.

Around that time, Flash applications on the web were becoming popular, and DevDiv was more interested in creating a Flash competitor than owning a Windows-specific UI framework. So they built Silverlight, a C++ implementation of the WPF API which ran on Windows, Mac, and Linux. While Silverlight is no longer supported, its legacy continues on as it was the starting point for all of the UI Frameworks that followed. Silverlight was forked for Windows Phone, Windows 8, and Xbox as the main UI framework for building applications on these platforms.

For Windows, owning 3 separate UI frameworks wasn’t sustainable, and the OneCore effort refactored all Windows devices to be based on the same core operating system. This first came to fruition with Windows 10, and the application model was called the Universal Windows Platform (UWP).

After a few years of UWP not being a popular platform for developers, the Windows team shifted strategy and shipped Windows App SDK. The UI Framework within the WinAppSDK is known as WinUI 3, which is where things are at today.

Swift, meet WinRT

WinUI 3 is the native UI framework of choice on Windows, and is now the best option for Swift developers thanks to our WinRT projection. Below is a small snippet of what it looks like to write some WinUI 3 code in Swift:
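As a rough, self-contained illustration of the shape of such code, the following sketch uses small stand-in types (UIElement, Orientation, Button, StackPanel) rather than the generated WinUI 3 bindings. The stand-ins and their members are assumptions made for illustration; only the property names mirror the StackPanel and ContentControl members discussed in this post.

```swift
// Stand-in types: hypothetical, simplified substitutes for the generated bindings.
class UIElement {}

enum Orientation { case vertical, horizontal }

final class Button: UIElement {
    var content: Any?                  // stand-in for ContentControl.Content
}

final class StackPanel: UIElement {
    var orientation: Orientation = .vertical
    var spacing: Double = 0
    var children: [UIElement] = []     // stand-in for UIElementCollection
}

// Usage: reads like ordinary Swift, with camel-cased properties.
let button = Button()
button.content = "Click me"

let panel = StackPanel()
panel.orientation = .horizontal
panel.spacing = 8
panel.children.append(button)

print(panel.children.count)
```

With the real bindings, the same calls would be forwarded to the underlying WinRT objects instead of plain Swift storage.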

In the following section we’ll dig into how we made this code possible. We’ll first highlight some of the features that make WinRT different from COM, how those features enable us to create bindings which allow us to bridge between WinRT and Swift, and finally follow up with how we use these bindings to build UI in Swift using WinUI 3.

Features of WinRT

WinRT builds on top of COM by extending the familiar IUnknown interface with the IInspectable interface, which provides a few more facilities:

Querying an object for its class name
Querying an object for all of its interfaces

While outside of the scope of this article, these facilities help make the Swift language projection feel more natural to developers.

The Interface Definition Language (IDL) for WinRT is very similar to that of COM, while not a complete superset. Inspired by .NET, it has an object-oriented type system with constructs such as namespaces, interfaces, classes, structs, enums, delegates, and generics. Let’s quickly go over what each of these features is.

Namespaces

Namespaces in WinRT work similarly to how they do in C# or C++. They are used to group similar APIs and disambiguate types which have the same short type name.

Interfaces and Classes

In WinRT, interfaces and classes are very similar. In fact, a class is really just an object which implements one or more interfaces. These types can have properties, methods, and events. What is an event? An event is a .NET term for the observer pattern. Any number of subscribers can subscribe to an event and be notified when it fires. These subscribers are known as delegates in WinRT, and we’ll discuss them later on.

Enums and Structs

Enums and Structs in WinRT are much simpler than those in Swift. Enums are simple 32-bit integers and structs are simple value types with limitations on the field members. For example, reference types like classes cannot be members of a struct.

Delegates

Delegates are also inspired by .NET and represent a single function pointer with a custom signature. A delegate can be used in any property or method just like any type, and they are used to subscribe to event notifications.

Generics

WinRT supports a small number of built-in generic types and does not support custom generics.

Now that we’ve laid the groundwork for what constitutes a WinRT API, we can go into more depth and dive into the anatomy of a WinRT API. We’ll then follow that up with some examples of what these look like in Swift.

Anatomy of a WinRT API

If we recall the earlier snippet of StackPanel, we can see a lot of the WinRT features we just discussed at play. To gain a clearer understanding of what these APIs are, let’s look at a scoped down definition of what the StackPanel IDL might look like. This should help us in the following section as we discuss how these APIs project into Swift.

namespace Microsoft.UI.Xaml.Controls {
    unsealed runtimeclass StackPanel: Panel {
        Orientation Orientation { get; set; };
        Double Spacing { get; set; };
    }
}

You’ll notice that it only has the Spacing and Orientation properties. That’s because it inherits the other members from its superclasses:

unsealed runtimeclass Panel: FrameworkElement {
    UIElementCollection Children { get; };
}

UIElementCollection is a great example of generics:

runtimeclass UIElementCollection: IVector<UIElement> {
}

Button, through its ButtonBase superclass, showcases how events are defined in IDL:

unsealed runtimeclass ButtonBase: ContentControl {
    event RoutedEventHandler Click;
}

ContentControl is a good example of how value boxing works, since its Content property accepts any Object:

unsealed runtimeclass ContentControl: Control {
    Object Content;
}

These APIs are a representation of C interfaces with virtual function calls, and can be used from Swift in many of the same ways that Saleem discussed in the previous blog post on C interop. As he also pointed out, there are ways to create convenient wrappers around these interfaces to make them more manageable to call from Swift. However, working with WinRT objects at that layer is tedious to write, as it lacks a lot of the ergonomic features that modern languages like Swift offer. Furthermore, the size of the WinRT API surface that ships in the Windows SDK would make handwriting them an unwieldy effort. This is where a code generation solution comes in handy.

Code Generation: Projecting WinRT Features to Swift

Earlier in this post we looked at a snippet of code building some UI with WinUI 3 in Swift. It very much looks and feels like idiomatic Swift. Below is a table highlighting how WinRT types project into Swift types:

WinRT properties and methods project into Swift properties and methods and are translated to camel case. Events are projected as a special Event type and are properties on the type. This Event type has an addHandler method which callers can use to subscribe, and a removeHandler to unsubscribe.
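To make the event shape concrete, here is a minimal sketch of an Event type with addHandler and removeHandler. Everything in it is a stand-in written for illustration: the EventToken type, the token-returning addHandler signature, the forEachHandler helper, and simulateClick are assumptions, and the generated Swift/WinRT type may look different.

```swift
// Hypothetical token identifying one subscription.
struct EventToken: Hashable { let id: Int }

// Hypothetical stand-in for the projected Event type.
final class Event<Handler> {
    private var handlers: [EventToken: Handler] = [:]
    private var nextID = 0

    // Subscribe and receive a token that identifies this subscription.
    @discardableResult
    func addHandler(_ handler: Handler) -> EventToken {
        nextID += 1
        let token = EventToken(id: nextID)
        handlers[token] = handler
        return token
    }

    // Unsubscribe using the token returned by addHandler.
    func removeHandler(_ token: EventToken) {
        handlers[token] = nil
    }

    // Used by the event's owner to notify every subscriber.
    func forEachHandler(_ body: (Handler) -> Void) {
        handlers.values.forEach(body)
    }
}

final class Button {
    // Events are exposed as properties on the type.
    let click = Event<(Button) -> Void>()

    // Simulates the framework raising the event.
    func simulateClick() {
        click.forEachHandler { $0(self) }
    }
}

let button = Button()
var clickCount = 0
let token = button.click.addHandler { _ in clickCount += 1 }
button.simulateClick()              // the handler runs once
button.click.removeHandler(token)
button.simulateClick()              // no handlers are left
print(clickCount)
```

Exposing the event as a property keeps subscription next to the object that raises it, which is what lets a call site read as button.click.addHandler { ... }.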

Creating these natural feeling APIs is in large part due to the flexibility of the Swift language. We are able to use properties with custom get and set implementations to do the appropriate WinRT API call under the covers. One place where we can go even further in making the WinRT usage feel natural to Swift developers is collections.
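The idea of a property with custom get and set implementations can be sketched as follows. RawStackPanel is a hypothetical stand-in for the low-level layer, with get_Spacing and put_Spacing returning an HRESULT-style status code; the names and the use of precondition are assumptions for illustration, not the generated code.

```swift
// Hypothetical stand-in for the raw layer: getter and setter are explicit calls
// that return an HRESULT-style status code (0 means success).
final class RawStackPanel {
    private var storedSpacing: Double = 0

    func get_Spacing(_ result: inout Double) -> Int32 {
        result = storedSpacing
        return 0
    }

    func put_Spacing(_ value: Double) -> Int32 {
        storedSpacing = value
        return 0
    }
}

// Swift-facing wrapper: a computed property hides the underlying calls.
final class StackPanel {
    private let raw = RawStackPanel()

    var spacing: Double {
        get {
            var value = 0.0
            let status = raw.get_Spacing(&value)
            precondition(status == 0, "get_Spacing failed")
            return value
        }
        set {
            let status = raw.put_Spacing(newValue)
            precondition(status == 0, "put_Spacing failed")
        }
    }
}

let panel = StackPanel()
panel.spacing = 12
print(panel.spacing)
```

The caller only ever sees panel.spacing, while the status-code plumbing stays inside the wrapper.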

WinRT Projection Case Study: Collections

When we speak of collections in WinRT, we are generally referring to the generic IVector<T> interface. WinRT also has read-only arrays, which project nicely to the Swift Array type, but they are not commonly used in the API surface area. We’ve made IVector easier to use by making the type conform to the Collection protocol in Swift. This enables you to use a lot of the helpful extension methods that come with this protocol:
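A minimal sketch of the idea, using a hypothetical Vector class as a stand-in for an IVector<T>-style interface (the size, getAt, and append members here are simplified assumptions, not the generated bindings):

```swift
// Hypothetical stand-in for an IVector<T>-style interface with index-based access.
final class Vector<Element> {
    private var storage: [Element] = []
    var size: Int { storage.count }
    func getAt(_ index: Int) -> Element { storage[index] }
    func append(_ item: Element) { storage.append(item) }
}

// Conforming to Collection needs only indices and a subscript;
// map, first(where:), for-in, and count then come for free.
extension Vector: Collection {
    var startIndex: Int { 0 }
    var endIndex: Int { size }
    func index(after i: Int) -> Int { i + 1 }
    subscript(position: Int) -> Element { getAt(position) }
}

let names = Vector<String>()
names.append("OK")
names.append("Cancel")

let upper = names.map { $0.uppercased() }
let firstLong = names.first { $0.count > 2 }
for name in names {
    print(name)
}
```

Because the subscript forwards to array-style storage in this sketch, an out-of-range index traps, just as it does for Array.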

Furthermore, conforming to Collection makes the IVector API behave similarly to Array in Swift, where out-of-bounds access through the subscript operator [] will result in the app being terminated. This behavior is slightly different from what the WinRT API presents, which we’ll discuss later on.

At this point, we are pleased with how well Swift and WinRT play together. If you are interested in knowing more about how these projections work under the covers, we encourage you to take a peek at the generated code. The next section discusses some of the challenges we’ve encountered while trying to make WinRT work in a Swift world.

Challenges with the Language Projection

While the shape of the Swift bindings feels very natural to developers, the inherent complexity of the problem and limitations of the WinRT API pose some interesting challenges. This section will discuss some of the main challenges we’ve encountered while building these projections that you may come across as you’re building out a Windows application.

Error Handling

Every WinRT API call can fail, including creating an object, getting or setting a property, and even registering an event handler. Marking every one of these APIs as throws would make writing WinRT code annoying, as it would necessitate the use of try almost everywhere. It also isn’t expected that these WinRT APIs will ever fail when used properly. The IVector subscript discussed above is one such case, where we made the decision not to mark it as throws.

So with that in mind, we weighed the pros and cons and came to the conclusion that constructors, properties, and events wouldn’t be marked throws. We feel like it’s the right balance of developer ergonomics and correctness, and so far it’s been working out for us as we’ve built Arc on Windows. There are always escape hatches in case a developer needs to gracefully handle an error.
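One way to picture this trade-off, assuming that ordinary methods remain throwing while properties do not, is the sketch below. The WinRTError type, the Widget class, and its members are hypothetical names for illustration and are not taken from the Swift/WinRT sources.

```swift
// Hypothetical error type carrying an HRESULT-style status code.
struct WinRTError: Error {
    let hresult: Int32
}

// Hypothetical stand-in for a projected class.
final class Widget {
    private var storedTitle = "Untitled"

    // Properties are not marked throws, so call sites need no try.
    var title: String {
        get { storedTitle }
        set { storedTitle = newValue }
    }

    // A method can still be marked throws so that callers may recover.
    func load(from path: String) throws {
        guard !path.isEmpty else {
            // 0x80070057 is the E_INVALIDARG HRESULT.
            throw WinRTError(hresult: Int32(bitPattern: 0x8007_0057))
        }
        storedTitle = path
    }
}

let widget = Widget()
widget.title = "Hello"              // no try needed for a property

var caught: Int32? = nil
do {
    try widget.load(from: "")
} catch {
    caught = (error as? WinRTError)?.hresult
}
print(caught as Any)
```

The property call site stays clean, while the throwing method shows where a caller could still handle a failure gracefully.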

Null Reference Values

Every class and interface in WinRT represents a reference counted object that is heap allocated. While the Swift type system supports non-null reference types, the WinRT type system does not. This means that all WinRT reference types have to be considered nullable. For example, even read-only properties like StackPanel.Children could theoretically be null, meaning every access would have to include the ? operator. In practice, we know this will never be true, and so all properties and parameters which take reference types use implicitly unwrapped optionals. This feature gives developers the flexibility to safely check for nil when needed, but not be burdened by the check when they don’t.
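The effect of implicitly unwrapped optionals at the call site can be sketched with stand-in types. The StackPanel below is a simplified, hypothetical substitute for the projected class, and the tag property is invented purely to show a value that may be absent.

```swift
class UIElement {}

final class StackPanel {
    // Declared as implicitly unwrapped optionals: nullable in the type system,
    // but usable without `?` or `!` at each call site.
    var children: [UIElement]! = []
    var tag: UIElement! = nil           // hypothetical property, may be nil
}

let panel = StackPanel()
panel.children.append(UIElement())      // no unwrapping needed
let count = panel.children.count

// Callers can still check for nil explicitly when a value may be absent.
if let tag = panel.tag {
    print("has tag: \(tag)")
} else {
    print("no tag set")
}
```

Accessing an implicitly unwrapped optional that is nil without checking first traps at runtime, so the explicit check remains the safe path whenever nil is a real possibility.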

Protected Members

Unlike Swift, WinRT has an access modifier of protected, meaning these APIs are only available to subclasses of the type. In Swift, these APIs have to be marked public, but we mark them with @_spi(WinRTImplements) so that a user has to do an @_spi(WinRTImplements) import in order to call the APIs.

Weak References

Because there is no garbage collector in Swift, weak references are an important part of writing Swift code. WinRT has a memory management model similar to Swift’s, which makes them equally important there. You may be wondering what is difficult about weak references. Well, it’s common that nothing holds a reference to the Swift wrapper created by the language projection. The wrapper is created once a WinRT object bridges back out into Swift, and if nothing in the Swift world keeps it alive, the Swift object will be deallocated. Clearly, this makes using weak var problematic, since the reference can become nil while the underlying WinRT object is still alive. Thankfully, there are the IWeakReferenceSource and IWeakReference interfaces in WinRT to save us. The Swift language projection can query a WinRT object for IWeakReferenceSource and retrieve an IWeakReference to it. What this means for Swift developers is that a weak var object: UIElement? will turn into var object: WeakReference<UIElement>.
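The wrapper-lifetime problem and the WeakReference idea can be sketched in plain Swift. NativeObject stands in for a WinRT object whose lifetime is managed outside Swift, UIElement for the wrapper created on demand, and the WeakReference shown here (including its resolve method and wrap parameter) is a hypothetical simplification, not the type shipped by Swift/WinRT.

```swift
// Stand-in for a native WinRT object whose lifetime is managed outside Swift.
final class NativeObject {
    let name: String
    init(name: String) { self.name = name }
}

// Stand-in for the Swift wrapper the projection creates on demand.
final class UIElement {
    let native: NativeObject
    init(native: NativeObject) { self.native = native }
}

// Hypothetical WeakReference: tracks the native object weakly and
// creates a fresh Swift wrapper whenever it is resolved.
struct WeakReference<Wrapper> {
    private weak var native: NativeObject?
    private let wrap: (NativeObject) -> Wrapper

    init(_ native: NativeObject, wrap: @escaping (NativeObject) -> Wrapper) {
        self.native = native
        self.wrap = wrap
    }

    // Returns a wrapper while the native object is alive, otherwise nil.
    func resolve() -> Wrapper? {
        native.map(wrap)
    }
}

// `owner` plays the role of whatever keeps the native object alive, e.g. the UI tree.
var owner: NativeObject? = NativeObject(name: "button")
let weakElement = WeakReference(owner!, wrap: UIElement.init)

let alive = weakElement.resolve()?.native.name   // wrapper created on demand
owner = nil                                      // the native object goes away
let gone = weakElement.resolve() == nil
print(alive ?? "nil", gone)
```

The key point is that the weak link targets the native object rather than a particular Swift wrapper, so it keeps working even though wrappers come and go.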

Wrapping It All Up

In this blog post, we’ve shared the exciting news that we’ve open sourced Swift/WinRT, a WinRT language bindings generator for Swift. We also discussed some of the history of Windows desktop development, and showed how WinUI 3 is the latest technology to write Windows UI applications. Finally, we rounded it out by showing how a WinRT API projects into Swift, using WinUI 3 as an example.

For a full end-to-end sample of an application that uses the Swift/WinRT code generator and WinUI 3, see https://github.com/thebrowsercompany/windows-samples. In a future blog post, we’ll discuss some of the challenges we’ve come across in building Arc on Windows using WinUI 3.

-Steve

Karl Traunmüller

Dec 6, 2023

This is amazing 👏🏻 Native WinUI apps with Swift - finally a viable path to porting my Swift Mac app to Windows.

Taeksoo Jung

I have a few questions about implementation.

1. How to accomplish SwiftUI-like declarative programming here? I guess that you use TCA for the whole architecture, but for View-Model binding, I wonder how to handle this.

2. Did you simplify the API for real use? I mean that the sample code seems to be a little verbose, because it uses the C++/WinRT as it is.

3. Is there any issue for real use? I found a repository that shows a similar approach in C++ - "WinUI 3 in C++ Without XAML".

https://github.com/sotanakamura/winui3-without-xaml

4. Do you have any plan to provide SwiftUI-ish implementation?

Speaking in Swift by The Browser Company
