Question

We could use advice on a scaling/ops issue.

We have a simple mobile app that runs on Rails 3.2.12 and uses MongoMapper instead of ActiveRecord. There is one database call that sporadically performs poorly, causing users to write in and complain. It isn't clear why. We can't install NewRelic because of MongoMapper, and the data returned by Mongo isn't a lot (< 200KB). There isn't much logic being executed in the controller, either.

The problem seems exacerbated with more users. The server runs on a VPS, one shared with 30 nodes. The hosting company says average I/O utilization is at 12%, well below the critical threshold.

Since we can't use NewRelic, what's the best approach for troubleshooting the problem?

Here's output from explain:

User.collection.find({:username_downcase => 'banana2006'}).explain
=> {"cursor"=>"BtreeCursor username_downcase", 
    "isMultiKey"=>false, 
    "n"=>1, 
    "nscannedObjects"=>1, 
    "nscanned"=>1, 
    "nscannedObjectsAllPlans"=>1, 
    "nscannedAllPlans"=>1, 
    "scanAndOrder"=>false, 
    "indexOnly"=>false, 
    "nYields"=>0, 
    "nChunkSkips"=>0, 
    "millis"=>3, 
    "indexBounds"=>{"username_downcase"=>[["banana2006", "banana2006"]]}, 
    "allPlans"=>[{"cursor"=>"BtreeCursor username_downcase", "n"=>1, "nscannedObjects"=>1, "nscanned"=>1, "indexBounds"=>{"username_downcase"=>[["banana2006", "banana2006"]]}}], "oldPlan"=>{"cursor"=>"BtreeCursor username_downcase", "indexBounds"=>{"username_downcase"=>[["banana2006", "banana2006"]]}}, "server"=>"x.com"}

Output from db.serverStatus:

> db.serverStatus()
{
        "host" : "mongo.x.com",
        "version" : "2.2.0",
        "process" : "mongod",
        "pid" : 15957,
        "uptime" : 5232267,
        "uptimeMillis" : NumberLong("5232267460"),
        "uptimeEstimate" : 5178261,
        "localTime" : ISODate("2013-05-19T19:32:14.561Z"),
        "locks" : {
                "." : {
                        "timeLockedMicros" : {
                                "R" : NumberLong(131563265),
                                "W" : NumberLong("2824934127")
                        },
                        "timeAcquiringMicros" : {
                                "R" : NumberLong(536751143),
                                "W" : NumberLong(644540368)
                        }
                },
                "admin" : {
                        "timeLockedMicros" : {
                                "r" : NumberLong(11906),
                                "w" : NumberLong(0)
                        },
                        "timeAcquiringMicros" : {
                                "r" : NumberLong(424),
                                "w" : NumberLong(0)
                        }
                },
                "local" : {
                        "timeLockedMicros" : {
                                "r" : NumberLong(13829064),
                                "w" : NumberLong(0)
                        },
                        "timeAcquiringMicros" : {
                                "r" : NumberLong(96863334),
                                "w" : NumberLong(0)
                        }
                },
                "x-development" : {
                        "timeLockedMicros" : {
                                "r" : NumberLong(22074626),
                                "w" : NumberLong(645528)
                        },
                        "timeAcquiringMicros" : {
                                "r" : NumberLong(2876041),
                                "w" : NumberLong(3693)
                        }
                },
                "x-production" : {
                        "timeLockedMicros" : {
                                "r" : NumberLong("39251850394"),
                                "w" : NumberLong(1466862624)
                        },
                        "timeAcquiringMicros" : {
                                "r" : NumberLong("17410130690"),
                                "w" : NumberLong(858232658)
                        }
                },
                "z-development" : {
                        "timeLockedMicros" : {
                                "r" : NumberLong(1897461),
                                "w" : NumberLong(0)
                        },
                        "timeAcquiringMicros" : {
                                "r" : NumberLong(134836),
                                "w" : NumberLong(0)
                        }
                }
        },
        "globalLock" : {
                "totalTime" : NumberLong("5232267461000"),
                "lockTime" : NumberLong("2824934127"),
                "currentQueue" : {
                        "total" : 0,
                        "readers" : 0,
                        "writers" : 0
                },
                "activeClients" : {
                        "total" : 0,
                        "readers" : 0,
                        "writers" : 0
                }
        },
        "mem" : {
                "bits" : 64,
                "resident" : 87,
                "virtual" : 9071,
                "supported" : true,
                "mapped" : 4207,
                "mappedWithJournal" : 8414
        },
        "connections" : {
                "current" : 3,
                "available" : 9597
        },
        "extra_info" : {
                "note" : "fields vary by platform",
                "heap_usage_bytes" : 198457056,
                "page_faults" : 3176777
        },
        "indexCounters" : {
                "btree" : {
                        "accesses" : 18208995,
                        "hits" : 18208994,
                        "misses" : 0,
                        "resets" : 0,
                        "missRatio" : 0
                }
        },
        "backgroundFlushing" : {
                "flushes" : 87204,
                "total_ms" : 563603,
                "average_ms" : 6.463040686207055,
                "last_ms" : 1,
                "last_finished" : ISODate("2013-05-19T19:31:55.201Z")
        },
        "cursors" : {
                "totalOpen" : 0,
                "clientCursors_size" : 0,
                "timedOut" : 0
        },
        "network" : {
                "bytesIn" : 9286320357,
                "bytesOut" : 148669944094,
                "numRequests" : 5102457
        },
        "opcounters" : {
                "insert" : 0,
                "query" : 3213569,
                "update" : 1989197,
                "delete" : 0,
                "getmore" : 30944,
                "command" : 216139
        },
        "asserts" : {
                "regular" : 0,
                "warning" : 0,
                "msg" : 0,
                "user" : 0,
                "rollovers" : 0
        },
        "writeBacksQueued" : false,
        "dur" : {
                "commits" : 30,
                "journaledMB" : 0.04096,
                "writeToDataFilesMB" : 0.043131,
                "compression" : 0.9447148096039855,
                "commitsInWriteLock" : 0,
                "earlyCommits" : 0,
                "timeMs" : {
                        "dt" : 3069,
                        "prepLogBuffer" : 0,
                        "writeToJournal" : 0,
                        "writeToDataFiles" : 0,
                        "remapPrivateView" : 0
                }
        },
        "recordStats" : {
                "accessesNotInMemory" : 1102532,
                "pageFaultExceptionsThrown" : 657056,
                "admin" : {
                        "accessesNotInMemory" : 0,
                        "pageFaultExceptionsThrown" : 0
                },
                "local" : {
                        "accessesNotInMemory" : 0,
                        "pageFaultExceptionsThrown" : 0
                },
                "x-development" : {
                        "accessesNotInMemory" : 1555,
                        "pageFaultExceptionsThrown" : 1304
                },
                "x-production" : {
                        "accessesNotInMemory" : 1074115,
                        "pageFaultExceptionsThrown" : 639842
                },
                "z-development" : {
                        "accessesNotInMemory" : 0,
                        "pageFaultExceptionsThrown" : 0
                }
        },
        "ok" : 1
}
Was it helpful?

Solution

You can set up MongoDB profiling although the default 100ms should be good enough.

Still, we can't tell you how to optimize Mongo queries if we don't know what they are. What is the "one database call that sporadically performs poorly"?

In your comments you reference a long query that is using the snapshot option. That can cause lock contention, so if you can remove the snapshot option safely you will likely see improvement.

I don't know why you say you can't use New Relic. It might not give you automatic timing of database queries but since you know which one is the problem you can use method tracers to isolate the problem and make sure it is the database rather than some other part of your code.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top