Question

I've been poking around at Xcode's project format lately, and I'm trying to understand the best way to load it into memory. For reference, I'm using Python for this experiment.

I'm looking for an easy way to read out the contents of a project so I can locate any of the files on disk. (Once I have paths to files, I hope to be able to read from and write to them.)

Xcode's file format is the xcodeproj, which is a package containing several files. The one of interest is the project.pbxproj file, because it contains the file references and logical groups that you see in the left-hand pane in Xcode. I've found a nice reference for the format at monobjc.net.

The pbxproj file itself is an OpenStep property list. It contains a dictionary which identifies the various files and groups that make up the project. What's most interesting to me is the way the project refers to the file system.

What you need to know from that link is that every framework, file reference, and Xcode project group is represented with a dictionary that has, among other things, a property called isa that specifies the kind of object. Each object also has a unique identifier as a key. Some entries have paths. Files always have them, and groups have them sometimes. Allow me to explain...
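To make that shape concrete, here is a minimal sketch of picking entries out by their isa value. The objects dictionary is a hand-written stand-in for a parsed project, the identifiers are made up, and objects_of_kind is a hypothetical helper rather than anything Xcode provides.

```python
# Hypothetical, hand-written stand-in for the "objects" dictionary of a
# parsed project.pbxproj; the identifiers are invented for illustration.
objects = {
    "AAAA0001": {"isa": "PBXGroup", "name": "Characters",
                 "children": ["AAAA0002"]},
    "AAAA0002": {"isa": "PBXGroup", "path": "Warlock",
                 "children": ["AAAA0003"]},
    "AAAA0003": {"isa": "PBXFileReference", "path": "Accessories.bundle"},
}


def objects_of_kind(objects, isa):
    """Return {identifier: entry} for every entry whose isa matches."""
    return {uid: obj for uid, obj in objects.items() if obj.get("isa") == isa}


groups = objects_of_kind(objects, "PBXGroup")
files = objects_of_kind(objects, "PBXFileReference")

print(sorted(groups))
print([entry["path"] for entry in files.values()])
```

Note that the first group has a name but no path, while the second has a path; that difference is what the rest of the question is about.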

Let's take an example file structure, for purposes of illustration:

[Image: the folder layout on disk]

In this case, I've got a folder named Characters which contains a folder named Warlock. Inside of Xcode, it looks like this:

[Image: the group layout in Xcode]

I've got a group named Characters and a subgroup named Warlock. Essentially, my group structure mirrors my file structure.

In my pbxproj, there are actually two ways to refer to the file on disk. The dictionary representation for the group named "Characters" can have a path attribute, which would point to Characters/Warlock. The Accessories.bundle file would have a path of Accessories.bundle. This tends to be the case if I drag in the entire folder.

The second way to represent this arrangement of the filesystem is to have no path attribute in the group's dictionary, and the file reference will have a more complete path. This will happen if you drag in each file by itself.
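As a rough sketch of why both arrangements can be handled the same way, the two layouts are written out below as plain Python dicts, and a path is resolved by joining a file's own path with the paths of its ancestor groups. The identifiers and the relative_path helper are hypothetical, and the sketch assumes every path is relative to its parent group (it ignores the sourceTree key).

```python
import posixpath

# Layout 1: a group carries the folder path; the file has only its name.
folder_dragged = {
    "G1": {"isa": "PBXGroup", "name": "Characters", "children": ["G2"]},
    "G2": {"isa": "PBXGroup", "path": "Characters/Warlock",
           "children": ["F1"]},
    "F1": {"isa": "PBXFileReference", "path": "Accessories.bundle"},
}

# Layout 2: the groups are purely logical; the file has the fuller path.
files_dragged = {
    "G1": {"isa": "PBXGroup", "name": "Characters", "children": ["G2"]},
    "G2": {"isa": "PBXGroup", "name": "Warlock", "children": ["F1"]},
    "F1": {"isa": "PBXFileReference",
           "path": "Characters/Warlock/Accessories.bundle"},
}


def relative_path(objects, uid):
    """Join the path of `uid` with the paths of all its ancestor groups."""
    # Invert the children lists so each entry can find its parent group.
    parent_of = {child: gid
                 for gid, obj in objects.items()
                 for child in obj.get("children", [])}
    parts = []
    while uid is not None:
        path = objects[uid].get("path")
        if path:  # groups without a path contribute nothing on disk
            parts.append(path)
        uid = parent_of.get(uid)
    return posixpath.join(*reversed(parts)) if parts else ""


print(relative_path(folder_dragged, "F1"))
print(relative_path(files_dragged, "F1"))
```

Both calls resolve to the same project-relative path, which suggests the group association and the on-disk location can be kept as two separate pieces of information.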

I'm trying to figure out the best data structure for traversing these files, considering the group association. I want to be able to read the files, so I need to get complete paths, even if they're relative to the Xcode project.

What's a good way to handle a dictionary in Python? What data structure does Apple use for managing Xcode projects in memory?


Solution

What's a good way to handle a dictionary in Python?

Normally, a dict.

However, in this case, if you're running on a Mac, you might want to use Cocoa's nice property-list APIs, in which case you'll get an NSDictionary instead. That's fine too; either way, the API is the same. For example:

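>>> import os
>>> import AppKit  # provided by PyObjC; macOS only
>>> path = os.path.expanduser('~/src/foo.xcodeproj/project.pbxproj')
>>> # Returns None if the file is missing or cannot be parsed as a plist
>>> d = AppKit.NSDictionary.dictionaryWithContentsOfFile_(path)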

However, this isn't just a plist; it's an NSArchiver-style archive, a higher-level structure, more like a Python pickle: it encodes information about Objective-C classes, etc.

What data structure does Apple use for managing Xcode projects in memory?

Most likely whatever structure the archive decodes to. But the core parts are probably mainly an NSDictionary, with values it probably accesses via KVC key paths.

Do you need to use that yourself? Not necessarily. As you'd already determined, the format isn't that deep or that complicated, so just reading the plist as a dict, throwing away everything but the value of the objects key, and building your own filesystem tree out of the result isn't going to be all that hard.
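A minimal sketch of that approach follows. It starts at the project entry named by the top-level rootObject key, follows its mainGroup, and walks the children lists recursively. The sample dictionary and its identifiers are invented, iter_file_paths is a hypothetical helper, and the sketch assumes every path is relative to its parent group, so it ignores sourceTree values such as SOURCE_ROOT or absolute paths.

```python
import posixpath


def iter_file_paths(plist):
    """Yield (logical group path, project-relative path) per file reference."""
    objects = plist["objects"]
    project = objects[plist["rootObject"]]

    def walk(uid, logical, disk):
        obj = objects[uid]
        path = obj.get("path")
        if path:  # entries without a path add nothing to the on-disk path
            disk = posixpath.join(disk, path)
        if obj["isa"] == "PBXFileReference":
            yield logical, disk
            return
        # Xcode shows a group by its name, falling back to its path.
        label = obj.get("name") or path
        if label:
            logical = posixpath.join(logical, label)
        for child in obj.get("children", []):
            yield from walk(child, logical, disk)

    yield from walk(project["mainGroup"], "", "")


# Hand-written sample standing in for a parsed project.pbxproj.
sample = {
    "rootObject": "P1",
    "objects": {
        "P1": {"isa": "PBXProject", "mainGroup": "G0"},
        "G0": {"isa": "PBXGroup", "children": ["G1"]},
        "G1": {"isa": "PBXGroup", "path": "Characters", "children": ["G2"]},
        "G2": {"isa": "PBXGroup", "path": "Warlock", "children": ["F1"]},
        "F1": {"isa": "PBXFileReference", "path": "Accessories.bundle"},
    },
}

for logical, disk in iter_file_paths(sample):
    print(logical, "->", disk)
```

To get a path you can open, join each result with the directory that contains the .xcodeproj package. Because the walk only uses indexing and .get, it should work on either a plain dict or a bridged NSDictionary, though that has not been checked here.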

Licensed under: CC-BY-SA with attribution