In this instance the error message "error: NULL _cd_rawData but the object is not being turned into a fault" indicates that you are accessing a managed object outside of its context. Basically your fetch returns all the Tweets from your persistent store as faults. Once you try and access a property on the Managed Object, Core Data will fire a fault and fetch the full object from the store.
By calling the NSArray method indexOfObjectWithOptions:passingTest:
with an option of NSEnumerationConcurrent
you are implying that you want to perform asynchronous execution on the elements in your array. The keyword concurrent indicates that multiple threads can be used to operate on the array elements.
In your context this means that accessing a managed object inside this block might result in accessing it on a different thread from the managed object context that owns the object. So when you access tweetToCheck.text in your conditional check - if ([tweetToCheck.text rangeOfString:obj].location != NSNotFound)
, under the hood Core Data is fetching that managed object from the persistent store and returning it to a thread that is not part of the managed object contexts thread.
Furthermore, it is not necessary to use the method indexOfObjectWithOptions:passingTest:
since you are not actually interested in the result of this operation.
It seems to me that it might be more convenient for you to use an NSSet as you are only testing to see whether or not a given tweet word exists in your profane words. Quoting the documentation for NSSet: "You can use sets as an alternative to arrays when the order of elements isn’t important and performance in testing whether an object is contained in the set is a consideration". Clearly this seems to meet your criteria.
So your init would look like:
-(id)initWithStore:(NSPersistentStoreCoordinator*)store
badWords:(NSSet*)badWords
{
self = [super init];
if(self) {
self.persistentStoreCoordinator = store;
self.badWords = [words copy];
}
return self;
}
Since you are only interested in updating tweets that have not yet been tagged for profanity you would probably only want to fetch tweets that haven't been flagged profane:
//Create new fetch request
NSFetchRequest *request = [[NSFetchRequest alloc] init];
//Setup the Request
[request setEntity:[NSEntityDescription entityForName:@"Tweet" inManagedObjectContext:self.backgroundContext]];
[request setPredicate:[NSPredicate predicateWithFormat:@"profanity = NO"]];
Now that you have an array of tweets that are not profane you could iterate through your tweets and check each word if it contains a profane word. The only thing you will need to deal with is how to separate your tweet into words (ignoring commas and exclamation marks etc). Then for each word you are going to need to strip it of diacritics and probably ignore the case. So you would end up with someone along the lines of:
if([self.badWords containsObject:badWordString]) {
currentTweet.profanity = [NSNumber numberWithBOOL:YES];
}
Remember, you can run predicates on an NSSet so you could actually perform a case and diacritic insensitive query:
NSPredicate *searchPredicate = [NSPredicate predicateWithFormat:@"SELF = %@[cd]",wordToCheck];
BOOL foundABadWord = ([[[self.badWords filteredSetUsingPredicate:searchPredicate] allObjects] count] > 0);
Another thing you might want to consider is removing duplicate words in your tweets, you don't really want to perform the same check multiple times. So depending on how you find the performance you could place each word of your tweet into an NSSet and simply run the query on the unique words in your tweet:
if([[self.badWords intersectsSet:tweetDividedIntoWordsSet]) {
//we have a profane tweet here!
}
Which implementation you choose is up to you but assuming you are only using english in your app you are definitely going to want to run a case and diacritic insensitive search.
EDIT
One final thing to note is that no matter how much you try, people will always be the best means of detecting profane or abusive language. I encourage you to read this SO's post on detecting profanity - How do you implement a good profanity filter?