Question

I'm trying to understand how to do code completion with libclang. I've watched "Thinking beyond the compiler", looked over c-index-test, and found a simple sample program here.

I compiled that program and ran it on this sample file that I whipped up to resemble the one in the video:

struct List {
    int Data;
    struct List *Next;
};

int sumListNode(struct List *Node) {
    int result = 0;
    for (; Node; Node = Node->Next)
        result = result + Node->
}

void test() {
    sumLi
}

If I point the program at the position just after Node->, it prints a few C keywords, but it doesn't print Next or Data like the video says it should.

If I point it at the space after sumLi, it prints those same C keywords. I can get it to print sumListNode if I point it at the column that has the 's' in sumLi, but even then it gives sumListNode the same priority value as the keywords. So it is really just printing everything that I could put there, instead of reading what is under the cursor and making an intelligent guess. Putting the cursor at the beginning of the fragment instead of the end was just grasping at straws anyway.

I've learned a lot about the type of data libclang can give me and how to operate with it from the doxygen, and from poking around in c-index-test, but I just haven't learned how to make it give me relevant data so that I have something to work with.


Solution

First, you should try printing any CXDiagnostic produced by the translation unit, since any error could cause clang to get lost in your code (although this is very unlikely in a case as simple as the one you mention).

Second, be aware that libclang may number lines and columns differently from what you are used to: in clang, both start at 1. If you are getting line/column info from your text editor, you might have to add 1 to the column number to be in sync with clang's definition.
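As a minimal sketch of that conversion, the helper below turns a 0-based byte offset into a 1-based line/column pair. The names ClangPosition and positionFromOffset are made up for this illustration and are not part of libclang. It assumes the editor hands you a byte offset into the buffer, and that every byte other than a newline (tabs included) advances the column by one.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper type: a position with 1-based line and column,
   the convention clang uses for source locations. */
struct ClangPosition {
    unsigned line;
    unsigned column;
};

/* Convert a 0-based byte offset into `text` to a 1-based line/column pair.
   Assumption: each byte except '\n' advances the column by exactly one. */
static struct ClangPosition positionFromOffset(const char *text, size_t offset)
{
    assert(text != NULL);

    struct ClangPosition pos = {1, 1};
    for (size_t i = 0; i < offset && text[i] != '\0'; ++i) {
        if (text[i] == '\n') {
            /* A newline starts the next line at column 1. */
            pos.line++;
            pos.column = 1;
        } else {
            pos.column++;
        }
    }
    return pos;
}
```

The resulting line and column are the kind of values you would then pass on as the completion location.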

Third, you could use the clang compiler itself to check your compilation options and line/column information. This eliminates the uncertainty stemming from your own libclang-based code. For example, you can use the following command line:

clang++ -cc1 -fsyntax-only -code-completion-at FILENAME:LINE:COL CLANG_ARGS

Also note that clang_codeCompleteAt is meant to be called only at the beginning of a token and produces a list of all possible tokens, the client being in charge of filtering the results with the potential partial token already entered in the text editor.

From the documentation (emphasis is mine):

Perform code completion at a given location in a translation unit.

This function performs code completion at a particular file, line, and column within source code, providing results that suggest potential code snippets based on the context of the completion. The basic model for code completion is that Clang will parse a complete source file, performing syntax checking up to the location where code-completion has been requested. At that point, a special code-completion token is passed to the parser, which recognizes this token and determines, based on the current location in the C/Objective-C/C++ grammar and the state of semantic analysis, what completions to provide. These completions are returned via a new CXCodeCompleteResults structure.

Code completion itself is meant to be triggered by the client when the user types punctuation characters or whitespace, at which point the code-completion location will coincide with the cursor. For example, if p is a pointer, code-completion might be triggered after the "-" and then after the ">" in p->. When the code-completion location is after the ">", the completion results will provide, e.g., the members of the struct that "p" points to. The client is responsible for placing the cursor at the beginning of the token currently being typed, then filtering the results based on the contents of the token. For example, when code-completing for the expression p->get, the client should provide the location just after the ">" (e.g., pointing at the "g") to this code-completion hook. Then, the client can filter the results based on the current token text ("get"), only showing those results that start with "get". The intent of this interface is to separate the relatively high-latency acquisition of code-completion results from the filtering of results on a per-character basis, which must have a lower latency.
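To make the "beginning of the token" rule concrete, here is a rough sketch of how a client might find that position. tokenStart and isIdentifierChar are hypothetical helpers written for this answer, not libclang functions. The sketch assumes the token being typed is a plain ASCII C identifier and that cursor is a 0-based index into a single line of text.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* True for characters that can appear in a C identifier (ASCII only). */
static int isIdentifierChar(char c)
{
    return isalnum((unsigned char)c) || c == '_';
}

/* Return the 0-based index in `line` where the identifier fragment that ends
   at `cursor` begins. If the cursor does not follow an identifier character
   (for example right after "->"), the cursor itself is returned. */
static size_t tokenStart(const char *line, size_t cursor)
{
    assert(line != NULL);
    assert(cursor <= strlen(line));

    size_t start = cursor;
    while (start > 0 && isIdentifierChar(line[start - 1]))
        --start;
    return start;
}
```

For the line "    sumLi" with the cursor at the end (index 9), this returns 4, so the 1-based column to request completion at would be 5. For "        result = result + Node->" with the cursor at the end (index 32), it returns 32, i.e. column 33, just after the ">".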

Taking a modified version of your second example:

int main (int argc, char **argv) {
  int i = sumLi
  //      ^
}

Code completion should be called at the marked position (i.e. at the beginning of the token). Clang could then give a long list of results including for example:

  • argc
  • sumListNode(<# struct List *Node #>)

It is then up to you to filter this list based on the partially entered sumLi token and keep the only relevant completion: sumListNode.
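The filtering step could look roughly like the sketch below. filterByPrefix and hasPrefix are hypothetical names, and the candidates are plain strings here. In a real client you would first extract each result's typed text from the completion string (the chunk of kind CXCompletionChunk_TypedText, read with clang_getCompletionChunkText), which is left out to keep the example self-contained.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* True if `candidate` starts with `prefix`; an empty prefix matches anything. */
static int hasPrefix(const char *candidate, const char *prefix)
{
    return strncmp(candidate, prefix, strlen(prefix)) == 0;
}

/* Copy pointers to the candidates that start with `prefix` into `out`,
   which must have room for `count` entries. Returns the number kept. */
static size_t filterByPrefix(const char *const *candidates, size_t count,
                             const char *prefix, const char **out)
{
    assert(candidates != NULL && prefix != NULL && out != NULL);

    size_t kept = 0;
    for (size_t i = 0; i < count; ++i) {
        if (hasPrefix(candidates[i], prefix))
            out[kept++] = candidates[i];
    }
    return kept;
}
```

With the candidates argc and sumListNode and the partial token sumLi, only sumListNode is kept. Since this step involves no parsing, it can be rerun on every keystroke against the same result list.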

If you understand elisp, clang's sources contain an auto-completion library for Emacs, which is a good example of this two-level implementation:

trunk/utils/clang-completion-mode.el

Licensed under: CC-BY-SA with attribution