3D Modeling Done Easy: Object Capture with Apple’s ARKit


Software Developer






What’s Object Capture?

For years Apple has been upping its AR/VR game through software such as its ARKit and machine learning frameworks. Recently, though, it made two significant leaps in AR. First, Apple added a LiDAR sensor to its Pro devices, which provides exceptional depth tracking. Second, it introduced a new capability called **Object Capture**, which creates a 3D object from a set of images.

That’s right! Now we can create a virtual 3D object from an object in real life!

Over the years, one of the biggest limitations in AR has been the lack of an easy way to add 3D objects to a scene without paying an artist for an expensive, labor-intensive 3D model. With this new capability, that lengthy process can be done easily and for free, starting from photos taken on your iPhone.

In this blog post we will go through why this is an incredible breakthrough for AR and how it works.

Why is this amazing?

Imagine you’re the owner of a new decor shop and you’re worried about competing against IKEA’s online shopping experience. In particular, you know that IKEA has a sophisticated studio for creating the AR furniture models that are featured online. Object Capture makes this ability available to anyone with an iPhone. There’s no need to grab a measuring tape: just point your camera, and a new 3D model of your sofa can be plucked right from your living room.

What used to be an expensive process can now be done for almost no cost. Any product can be rendered in 3D in your customer’s house, so they can try it in a real environment, which removes one of the biggest hassles of online shopping.

Of course, the possibilities go beyond e-commerce; there are implications for games, tours, meetings, collaboration and so on!

How does it work?

In summary, Object Capture takes multiple images of an object as input and converts them into a 3D asset, using computer vision and machine learning to reconstruct the object’s 3D shape from 2D photos. The process of reconstructing a 3D object from 2D images is called photogrammetry, and Apple’s new Photogrammetry API (exposed in RealityKit as `PhotogrammetrySession`) makes this possible.
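To make the flow above concrete, here is a minimal sketch of driving a reconstruction with RealityKit’s `PhotogrammetrySession`. It assumes macOS 12 or later and a folder of photos already on disk; the function name `reconstructModel` and the `CaptureError` type are hypothetical names chosen for this illustration, and error handling is kept to the bare minimum.

```swift
import Foundation
import RealityKit

// Hypothetical error type for this sketch; it is not part of RealityKit.
enum CaptureError: Error {
    case unsupportedHardware
}

// Reconstructs a 3D model from a folder of photos and writes it to `outputFile`.
// Assumption: running on macOS 12 or later on supported hardware.
@available(macOS 12.0, *)
func reconstructModel(
    from imagesFolder: URL,
    to outputFile: URL,
    detail: PhotogrammetrySession.Request.Detail = .reduced
) async throws {
    // Bail out early on machines that cannot run Object Capture.
    guard PhotogrammetrySession.isSupported else {
        throw CaptureError.unsupportedHardware
    }

    let session = try PhotogrammetrySession(
        input: imagesFolder,
        configuration: PhotogrammetrySession.Configuration()
    )

    // Start listening for session messages before submitting the request.
    let listener = Task {
        for try await output in session.outputs {
            switch output {
            case .requestProgress(_, let fraction):
                print("Progress: \(Int(fraction * 100))%")
            case .requestComplete(_, .modelFile(let url)):
                print("Model written to \(url.path)")
            case .requestError(_, let error):
                throw error
            case .processingComplete:
                return
            default:
                break
            }
        }
    }

    do {
        // Ask for a single model file at the requested detail level.
        try session.process(requests: [.modelFile(url: outputFile, detail: detail)])
    } catch {
        listener.cancel()
        throw error
    }

    // Wait until the session reports completion or an error.
    try await listener.value
}
```

From an async context, a call might look like `try await reconstructModel(from: URL(fileURLWithPath: "Captures/Sofa", isDirectory: true), to: URL(fileURLWithPath: "sofa.usdz"))`, where both paths are placeholders.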



The Photogrammetry API works on all Apple Silicon Macs (which are also the fastest, thanks to the built-in Neural Engine) and on Intel Macs with an AMD GPU with at least 4 GB of memory and 16 GB of RAM.

Input images can be HEIC, PNG or JPG, and output models can be exported as USDZ, USDA or OBJ. The API supports four levels of model quality: Reduced, Medium, Full and Raw.
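Since only certain image formats are accepted, it can help to screen a capture folder before starting a reconstruction. The sketch below is a small Foundation-only helper, not part of Apple’s API: the names `supportedExtensions`, `isSupportedImage` and `supportedImages` are made up for this example, and treating `.jpeg` as equivalent to `.jpg` is an assumption.

```swift
import Foundation

// File extensions for the input formats listed above.
// Assumption: ".jpeg" is accepted alongside ".jpg".
let supportedExtensions: Set<String> = ["heic", "png", "jpg", "jpeg"]

// Returns true when the file extension matches a supported input format,
// ignoring case (so "IMG_0001.HEIC" passes).
func isSupportedImage(_ url: URL) -> Bool {
    supportedExtensions.contains(url.pathExtension.lowercased())
}

// Keeps only the files that look like usable input images.
func supportedImages(in urls: [URL]) -> [URL] {
    urls.filter(isSupportedImage)
}

// Example with placeholder paths.
let candidates = [
    URL(fileURLWithPath: "/tmp/capture/IMG_0001.HEIC"),
    URL(fileURLWithPath: "/tmp/capture/IMG_0002.png"),
    URL(fileURLWithPath: "/tmp/capture/notes.txt"),
]
print(supportedImages(in: candidates).map(\.lastPathComponent))
// prints ["IMG_0001.HEIC", "IMG_0002.png"]
```

This only checks file names; it does not verify that the files actually decode as images.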

Reduced and Medium quality levels are suited to displaying items on your phone, including in Safari’s AR Quick Look view, while Full and Raw are recommended for more immersive experiences like video games.
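The guidance above amounts to a small lookup from where a model will be shown to which quality levels make sense. As a rough illustration, it could be encoded like this; `ModelQuality`, `DisplayTarget` and `recommendedQualities` are hypothetical names for this sketch, not types from Apple’s frameworks.

```swift
// The four quality levels described above (names local to this sketch).
enum ModelQuality: String, CaseIterable {
    case reduced, medium, full, raw
}

// Where the finished model is going to be displayed.
enum DisplayTarget {
    case phone, safariQuickLook, videoGame
}

// Maps a display target to the quality levels recommended in the text.
func recommendedQualities(for target: DisplayTarget) -> [ModelQuality] {
    switch target {
    case .phone, .safariQuickLook:
        return [.reduced, .medium]
    case .videoGame:
        return [.full, .raw]
    }
}

print(recommendedQualities(for: .videoGame).map(\.rawValue))
// prints ["full", "raw"]
```

Keeping the mapping in one place makes it easy to adjust if your own testing suggests different trade-offs between file size and visual fidelity.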


Letting sellers easily upload 3D representations of their products, and letting customers try them out at home, will have a big impact on marketplace apps like Amazon, eBay and MercadoLibre. The competition is fierce, too: Microsoft and Facebook are both pushing for improved VR/AR experiences with their HoloLens and Oculus products, respectively.

In the next blog post we’ll go over the API and AR Quick Look, and we’ll build a sample model. Subscribe to follow along!
