I've been poking around at Xcode's project format lately, and I'm trying to understand the best way to load it into memory. For reference, I'm using Python for this experiment.
I'm looking for an easy way to read out the contents of a project so I can locate any of the files on disk. (Once I have paths to files, I hope to be able to read from and write to them.)
Xcode's project format is the xcodeproj, which is a package containing several files. The one of interest is the project.pbxproj file, because it contains the file references and logical groups that you see in the left-hand pane in Xcode. I've found a nice reference for the format at monobjc.net.
The pbxproj file itself is an OpenStep property list. It contains a dictionary which identifies the various files and groups that make up the project. What's most interesting to me is the way the project refers to the file system.
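For context, here is a minimal sketch of how the file might be pulled into a Python dictionary. Python's plistlib reads XML and binary property lists but not the OpenStep text format, so this sketch assumes it runs on a Mac and shells out to plutil to convert to XML first. The function names load_pbxproj and root_and_objects are made up for illustration, and I'm assuming the top-level dictionary has the objects and rootObject keys described in the reference.

```python
import plistlib
import subprocess


def load_pbxproj(pbxproj_path):
    """Return the whole project as a plain dict (assumes macOS with plutil)."""
    # "-o -" asks plutil to write the converted XML to stdout
    # instead of overwriting the project file in place.
    result = subprocess.run(
        ["plutil", "-convert", "xml1", "-o", "-", pbxproj_path],
        check=True,
        capture_output=True,
    )
    return plistlib.loads(result.stdout)


def root_and_objects(project):
    """Split the loaded dict into (root object id, id -> object dict)."""
    # Assumption: "rootObject" holds the id of the project object and
    # "objects" maps every identifier to its dictionary.
    return project["rootObject"], project["objects"]
```

Usage would be something like `root_id, objects = root_and_objects(load_pbxproj("MyApp.xcodeproj/project.pbxproj"))`, where the path is only a placeholder.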
What you need to know from that link is that every framework, file reference, and Xcode project group is represented with a dictionary that has, among other things, a property called isa that specifies the kind of object. Each object also has a unique identifier as its key. Some entries have paths: files always have them, and groups have them only sometimes. Allow me to explain...
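To make that concrete, here is a rough sketch of what such an objects dictionary looks like once it is in Python, plus a small index from isa to identifiers. The identifiers (GRP1, FILE1, and so on) are invented and much shorter than real ones, and the entries are heavily trimmed; ids_by_isa is a hypothetical helper, not anything Xcode provides.

```python
from collections import defaultdict

# Hypothetical, trimmed-down "objects" section. Real identifiers are long
# hexadecimal strings, and real entries carry more properties than these.
objects = {
    "GRP1": {"isa": "PBXGroup", "name": "Characters", "children": ["GRP2"]},
    "GRP2": {"isa": "PBXGroup", "name": "Warlock", "children": ["FILE1"]},
    "FILE1": {"isa": "PBXFileReference", "path": "Accessories.bundle"},
}


def ids_by_isa(objects):
    """Map each isa value to the list of identifiers that have it."""
    index = defaultdict(list)
    for object_id, obj in objects.items():
        index[obj["isa"]].append(object_id)
    return dict(index)


index = ids_by_isa(objects)
print(index["PBXGroup"])          # identifiers of all groups
print(index["PBXFileReference"])  # identifiers of all file references
```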
Let's take an example file structure, for purposes of illustration:
In this case, I've got a folder named Characters which contains a folder named Warlock. Inside of Xcode, it looks like this:
I've got a group named Characters and a subgroup named Warlock. Essentially, my group structure mirrors my file structure.
In my pbxproj, there are actually two ways to refer to the file on disk. In the first, the dictionary representation for the group named "Characters" has a path attribute, which would point to Characters/Warlock, and the Accessories.bundle file has a path of just Accessories.bundle. This tends to be the case if I drag in the entire folder.
The second way to represent this arrangement of the file system is to have no path attribute in the group's dictionary; the file reference then carries a more complete path instead. This happens if you drag in each file by itself.
I'm trying to figure out the best data structure for traversing these files, considering the group association. I want to be able to read the files, so I need to get complete paths, even if they're relative to the Xcode project.
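A minimal sketch of one possible approach, not necessarily what Apple does: build a child-to-parent map from each group's children list, then walk upward from a file reference and collect every path found along the way. The sample data is invented. In the folder-style sample I'm assuming each group contributes one path component, and I'm also assuming that a sourceTree of "<group>" (or a missing one) means "relative to the parent group", while any other value means the path should not be extended by the parents.

```python
import posixpath


def build_parent_map(objects):
    """Map each child identifier to the identifier of the group holding it."""
    parents = {}
    for object_id, obj in objects.items():
        for child_id in obj.get("children", []):
            parents[child_id] = object_id
    return parents


def resolve_path(object_id, objects, parents):
    """Join the paths of an object and its ancestor groups, outermost first."""
    parts = []
    current = object_id
    while current is not None:
        obj = objects[current]
        if obj.get("path"):
            parts.append(obj["path"])
        # Assumption: only "<group>" paths are relative to the parent group.
        if obj.get("sourceTree", "<group>") != "<group>":
            break
        current = parents.get(current)
    return posixpath.join(*reversed(parts)) if parts else ""


# Hypothetical data: the groups carry the folder names.
folder_style = {
    "GRP1": {"isa": "PBXGroup", "path": "Characters", "children": ["GRP2"]},
    "GRP2": {"isa": "PBXGroup", "path": "Warlock", "children": ["FILE1"]},
    "FILE1": {"isa": "PBXFileReference", "path": "Accessories.bundle"},
}

# Hypothetical data: the groups have no path, the file has the full one.
per_file_style = {
    "GRP1": {"isa": "PBXGroup", "name": "Characters", "children": ["GRP2"]},
    "GRP2": {"isa": "PBXGroup", "name": "Warlock", "children": ["FILE1"]},
    "FILE1": {"isa": "PBXFileReference",
              "path": "Characters/Warlock/Accessories.bundle"},
}

for objects in (folder_style, per_file_style):
    print(resolve_path("FILE1", objects, build_parent_map(objects)))
```

Both samples resolve to the same path relative to the project, which could then be joined onto the directory that contains the xcodeproj package. Keeping the flat dictionary and adding only a parent map avoids building a separate tree structure.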
What's a good way to handle a dictionary like this in Python? What data structure does Apple use for managing Xcode projects in memory?