Question

I am trying to run a query that is supposed to select several fields out of a mongodb collection (arround 20,000 objects with an average size of ~ 20k each). In the middle of the query run, I get a 'Killed' message outputed to the console, and the script dies.

The hardware is Amazon AWS EC2 m3.medium, running ubuntu 12.04

db stats:

{
    "ns" : "vita.raw_data",
    "count" : 95976,
    "size" : 21912264248,
    "avgObjSize" : 228309.83004084355,
    "storageSize" : 25440595872,
    "numExtents" : 40,
    "nindexes" : 4,
    "lastExtentSize" : 2146426864,
    "paddingFactor" : 1.0099999999997797,
    "flags" : 1,
    "totalIndexSize" : 12059600,
    "indexSizes" : {
        "_id_" : 3204992,
        "index_1_last_response_1" : 2330160,
        "retailer_id_1" : 3899952,
        "retailer_id_1_category_leaf_1_category_set_1" : 2624496
    },
    "ok" : 1
}

/var/log/mongodb/mongodb.log

  Thu May  8 14:13:48 [initandlisten] connection accepted from 127.0.0.1:48590 #5
    Thu May  8 14:13:48 [conn5] getmore vita.raw_data query: { retailer_id: 2 } cursorid:8677435513443899340 nreturned:59 reslen:4221591 173ms
    Thu May  8 14:13:49 [conn5] getmore vita.raw_data query: { retailer_id: 2 } cursorid:8677435513443899340 nreturned:59 reslen:4242110 162ms
    Thu May  8 14:13:49 [conn5] getmore vita.raw_data query: { retailer_id: 2 } cursorid:8677435513443899340 nreturned:59 reslen:4235627 167ms
    Thu May  8 14:13:49 [conn5] getmore vita.raw_data query: { retailer_id: 2 } cursorid:8677435513443899340 nreturned:59 reslen:4217085 174ms
    Thu May  8 14:13:49 [conn5] getmore vita.raw_data query: { retailer_id: 2 } cursorid:8677435513443899340 nreturned:60 reslen:4247976 156ms
    Thu May  8 14:13:50 [conn5] getmore vita.raw_data query: { retailer_id: 2 } cursorid:8677435513443899340 nreturned:59 reslen:4238294 175ms
    Thu May  8 14:13:50 [conn5] getmore vita.raw_data query: { retailer_id: 2 } cursorid:8677435513443899340 nreturned:59 reslen:4222222 170ms
    Thu May  8 14:13:50 [conn5] getmore vita.raw_data query: { retailer_id: 2 } cursorid:8677435513443899340 nreturned:58 reslen:4194810 183ms
    Thu May  8 14:13:50 [conn5] getmore vita.raw_data query: { retailer_id: 2 } cursorid:8677435513443899340 nreturned:60 reslen:4232253 154ms
    Thu May  8 14:13:50 [conn5] getmore vita.raw_data query: { retailer_id: 2 } cursorid:8677435513443899340 nreturned:59 reslen:4204341 164ms
    Thu May  8 14:13:51 [conn5] getmore vita.raw_data query: { retailer_id: 2 } cursorid:8677435513443899340 nreturned:59 reslen:4222782 162ms
    Thu May  8 14:13:51 [conn5] getmore vita.raw_data query: { retailer_id: 2 } cursorid:8677435513443899340 nreturned:60 reslen:4226366 160ms
    Thu May  8 14:13:51 [conn5] getmore vita.raw_data query: { retailer_id: 2 } cursorid:8677435513443899340 nreturned:60 reslen:4226044 160ms
    Thu May  8 14:13:51 [conn5] getmore vita.raw_data query: { retailer_id: 2 } cursorid:8677435513443899340 nreturned:60 reslen:4236671 159ms
    Thu May  8 14:13:52 [conn5] getmore vita.raw_data query: { retailer_id: 2 } cursorid:8677435513443899340 nreturned:60 reslen:4250988 185ms
    Thu May  8 14:13:52 [conn5] getmore vita.raw_data query: { retailer_id: 2 } cursorid:8677435513443899340 nreturned:60 reslen:4239619 411ms
    Thu May  8 14:13:53 [conn5] getmore vita.raw_data query: { retailer_id: 2 } cursorid:8677435513443899340 nreturned:61 reslen:4253878 670ms
    Thu May  8 14:13:53 [conn5] getmore vita.raw_data query: { retailer_id: 2 } cursorid:8677435513443899340 nreturned:60 reslen:4197101 162ms
    Thu May  8 14:13:53 [conn5] getmore vita.raw_data query: { retailer_id: 2 } cursorid:8677435513443899340 nreturned:60 reslen:4220863 241ms
    Thu May  8 14:13:54 [conn5] getmore vita.raw_data query: { retailer_id: 2 } cursorid:8677435513443899340 nreturned:59 reslen:4223308 239ms
    Thu May  8 14:13:55 [conn5] getmore vita.raw_data query: { retailer_id: 2 } cursorid:8677435513443899340 nreturned:56 reslen:4215477 192ms
    Thu May  8 14:13:56 [conn5] getmore vita.raw_data query: { retailer_id: 2 } cursorid:8677435513443899340 nreturned:56 reslen:4226147 238ms
    Thu May  8 14:13:57 [conn5] getmore vita.raw_data query: { retailer_id: 2 } cursorid:8677435513443899340 nreturned:56 reslen:4230631 246ms
    Thu May  8 14:13:58 [conn5] getmore vita.raw_data query: { retailer_id: 2 } cursorid:8677435513443899340 nreturned:56 reslen:4205313 248ms
    Thu May  8 14:13:59 [conn5] getmore vita.raw_data query: { retailer_id: 2 } cursorid:8677435513443899340 nreturned:56 reslen:4236395 238ms
    Thu May  8 14:14:00 [conn5] getmore vita.raw_data query: { retailer_id: 2 } cursorid:8677435513443899340 nreturned:55 reslen:4216751 259ms
    Thu May  8 14:14:01 [conn5] getmore vita.raw_data query: { retailer_id: 2 } cursorid:8677435513443899340 nreturned:57 reslen:4213662 239ms
    Thu May  8 14:14:01 [conn5] getmore vita.raw_data query: { retailer_id: 2 } cursorid:8677435513443899340 nreturned:57 reslen:4234624 131ms
    Thu May  8 14:14:02 [conn5] getmore vita.raw_data query: { retailer_id: 2 } cursorid:8677435513443899340 nreturned:57 reslen:4211163 128ms
    Thu May  8 14:14:03 [conn5] getmore vita.raw_data query: { retailer_id: 2 } cursorid:8677435513443899340 nreturned:58 reslen:4221606 255ms
    Thu May  8 14:14:05 [conn5] getmore vita.raw_data query: { retailer_id: 2 } cursorid:8677435513443899340 nreturned:59 reslen:4250148 104ms
    Thu May  8 14:14:05 [conn5] getmore vita.raw_data query: { retailer_id: 2 } cursorid:8677435513443899340 nreturned:59 reslen:4219099 267ms
    Thu May  8 14:14:07 [conn5] getmore vita.raw_data query: { retailer_id: 2 } cursorid:8677435513443899340 nreturned:61 reslen:4199884 267ms
    Thu May  8 14:14:08 [conn5] getmore vita.raw_data query: { retailer_id: 2 } cursorid:8677435513443899340 nreturned:58 reslen:4208595 242ms
    Thu May  8 14:14:09 [conn5] getmore vita.raw_data query: { retailer_id: 2 } cursorid:8677435513443899340 nreturned:60 reslen:4220319 262ms
    Thu May  8 14:14:11 [conn5] getmore vita.raw_data query: { retailer_id: 2 } cursorid:8677435513443899340 nreturned:60 reslen:4248640 334ms
    Thu May  8 14:14:17 [conn5] end connection 127.0.0.1:48590

/var/log/syslog

May  8 14:14:01 staging-api CRON[10987]: (root) CMD (/etc/ganglia/scripts/diskstats-xvdap1 > /dev/null 2>&1)
May  8 14:14:17 staging-api kernel: [ 8058.149946] nginx invoked oom-killer: gfp_mask=0x201da, order=0, oom_adj=0, oom_score_adj=0
May  8 14:14:17 staging-api kernel: [ 8058.149952] nginx cpuset=/ mems_allowed=0
May  8 14:14:17 staging-api kernel: [ 8058.149955] Pid: 964, comm: nginx Not tainted 3.2.0-60-virtual #91-Ubuntu
May  8 14:14:17 staging-api kernel: [ 8058.149957] Call Trace:
May  8 14:14:17 staging-api kernel: [ 8058.149964]  [<ffffffff8111b481>] dump_header+0x91/0xe0
May  8 14:14:17 staging-api kernel: [ 8058.149966]  [<ffffffff8111b805>] oom_kill_process+0x85/0xb0
May  8 14:14:17 staging-api kernel: [ 8058.149968]  [<ffffffff8111bbaa>] out_of_memory+0xfa/0x220
May  8 14:14:17 staging-api kernel: [ 8058.149971]  [<ffffffff81121583>] __alloc_pages_nodemask+0x8c3/0x8e0
May  8 14:14:17 staging-api kernel: [ 8058.149976]  [<ffffffff81158746>] alloc_pages_current+0xb6/0x120
May  8 14:14:17 staging-api kernel: [ 8058.149979]  [<ffffffff811183a7>] __page_cache_alloc+0xb7/0xd0
May  8 14:14:17 staging-api kernel: [ 8058.149981]  [<ffffffff8111a372>] filemap_fault+0x212/0x3c0
May  8 14:14:17 staging-api kernel: [ 8058.149984]  [<ffffffff8113aab2>] __do_fault+0x72/0x550
May  8 14:14:17 staging-api kernel: [ 8058.149988]  [<ffffffff81013728>] ? __switch_to+0x138/0x360
May  8 14:14:17 staging-api kernel: [ 8058.149990]  [<ffffffff8113e16a>] handle_pte_fault+0xfa/0x200
May  8 14:14:17 staging-api kernel: [ 8058.149993]  [<ffffffff810063fe>] ? xen_pmd_val+0xe/0x10
May  8 14:14:17 staging-api kernel: [ 8058.149995]  [<ffffffff81005379>] ? __raw_callee_save_xen_pmd_val+0x11/0x1e
May  8 14:14:17 staging-api kernel: [ 8058.149998]  [<ffffffff8113f449>] handle_mm_fault+0x269/0x370
May  8 14:14:17 staging-api kernel: [ 8058.150002]  [<ffffffff8165f584>] do_page_fault+0x184/0x550
May  8 14:14:17 staging-api kernel: [ 8058.150005]  [<ffffffff8165ba6e>] ? _raw_spin_unlock_irqrestore+0x1e/0x30
May  8 14:14:17 staging-api kernel: [ 8058.150009]  [<ffffffff811bcf9c>] ? ep_poll+0x1cc/0x380
May  8 14:14:17 staging-api kernel: [ 8058.150013]  [<ffffffff810607c0>] ? try_to_wake_up+0x200/0x200
May  8 14:14:17 staging-api kernel: [ 8058.150016]  [<ffffffff8165c1b5>] page_fault+0x25/0x30
May  8 14:14:17 staging-api kernel: [ 8058.150017] Mem-Info:
May  8 14:14:17 staging-api kernel: [ 8058.150018] Node 0 DMA per-cpu:
May  8 14:14:17 staging-api kernel: [ 8058.150020] CPU    0: hi:    0, btch:   1 usd:   0
May  8 14:14:17 staging-api kernel: [ 8058.150022] Node 0 DMA32 per-cpu:
May  8 14:14:17 staging-api kernel: [ 8058.150023] CPU    0: hi:  186, btch:  31 usd:   0
May  8 14:14:17 staging-api kernel: [ 8058.150026] active_anon:922299 inactive_anon:47 isolated_anon:0
May  8 14:14:17 staging-api kernel: [ 8058.150026]  active_file:33 inactive_file:89 isolated_file:0
May  8 14:14:17 staging-api kernel: [ 8058.150027]  unevictable:0 dirty:0 writeback:0 unstable:0
May  8 14:14:17 staging-api kernel: [ 8058.150028]  free:4048 slab_reclaimable:1994 slab_unreclaimable:2086
May  8 14:14:17 staging-api kernel: [ 8058.150028]  mapped:28 shmem:56 pagetables:16092 bounce:0
May  8 14:14:17 staging-api kernel: [ 8058.150030] Node 0 DMA free:8336kB min:16kB low:20kB high:24kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:8080kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:0kB slab_unreclaimable:0kB kernel_stack:0kB pagetables:0kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? yes
May  8 14:14:17 staging-api kernel: [ 8058.150036] lowmem_reserve[]: 0 3772 3772 3772
May  8 14:14:17 staging-api kernel: [ 8058.150039] Node 0 DMA32 free:7856kB min:7848kB low:9808kB high:11772kB active_anon:3689196kB inactive_anon:188kB active_file:132kB inactive_file:356kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:3862656kB mlocked:0kB dirty:0kB writeback:0kB mapped:112kB shmem:224kB slab_reclaimable:7976kB slab_unreclaimable:8344kB kernel_stack:856kB pagetables:64368kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:87851 all_unreclaimable? yes
May  8 14:14:17 staging-api kernel: [ 8058.150045] lowmem_reserve[]: 0 0 0 0
May  8 14:14:17 staging-api kernel: [ 8058.150048] Node 0 DMA: 2*4kB 3*8kB 1*16kB 1*32kB 3*64kB 3*128kB 2*256kB 0*512kB 1*1024kB 1*2048kB 1*4096kB = 8336kB
May  8 14:14:17 staging-api kernel: [ 8058.150055] Node 0 DMA32: 838*4kB 5*8kB 21*16kB 1*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 1*4096kB = 7856kB
May  8 14:14:17 staging-api kernel: [ 8058.150062] 191 total pagecache pages
May  8 14:14:17 staging-api kernel: [ 8058.150063] 0 pages in swap cache
May  8 14:14:17 staging-api kernel: [ 8058.150064] Swap cache stats: add 0, delete 0, find 0/0
May  8 14:14:17 staging-api kernel: [ 8058.150065] Free swap  = 0kB
May  8 14:14:17 staging-api kernel: [ 8058.150066] Total swap = 0kB
May  8 14:14:17 staging-api kernel: [ 8058.155922] 985072 pages RAM
May  8 14:14:17 staging-api kernel: [ 8058.155925] 24961 pages reserved
May  8 14:14:17 staging-api kernel: [ 8058.155925] 6551 pages shared
May  8 14:14:17 staging-api kernel: [ 8058.155926] 954054 pages non-shared
May  8 14:14:17 staging-api kernel: [ 8058.155928] [ pid ]   uid  tgid total_vm      rss cpu oom_adj oom_score_adj name
May  8 14:14:17 staging-api kernel: [ 8058.155935] [  198]     0   198     6326       84   0       0             0 mountall
May  8 14:14:17 staging-api kernel: [ 8058.155938] [  284]     0   284     4309       49   0       0             0 upstart-udev-br
May  8 14:14:17 staging-api kernel: [ 8058.155940] [  371]     0   371     5381      133   0     -17         -1000 udevd
May  8 14:14:17 staging-api kernel: [ 8058.155942] [  435]     0   435     5367      102   0     -17         -1000 udevd
May  8 14:14:17 staging-api kernel: [ 8058.155944] [  436]     0   436     5367      102   0     -17         -1000 udevd
May  8 14:14:17 staging-api kernel: [ 8058.155947] [  520]     0   520     4801       71   0       0             0 rpcbind
May  8 14:14:17 staging-api kernel: [ 8058.155949] [  546]     0   546     3798       52   0       0             0 upstart-socket-
May  8 14:14:17 staging-api kernel: [ 8058.155951] [  583]     0   583     1817      124   0       0             0 dhclient3
May  8 14:14:17 staging-api kernel: [ 8058.155953] [  752]   104   752     9444      158   0       0             0 ntpd
May  8 14:14:17 staging-api kernel: [ 8058.155955] [  793]     0   793    12510      151   0     -17         -1000 sshd
May  8 14:14:17 staging-api kernel: [ 8058.155958] [  802]   106   802     5377      115   0       0             0 rpc.statd
May  8 14:14:17 staging-api kernel: [ 8058.155960] [  807]   101   807    62369      130   0       0             0 rsyslogd
May  8 14:14:17 staging-api kernel: [ 8058.155962] [  818]   102   818     5955       73   0       0             0 dbus-daemon
May  8 14:14:17 staging-api kernel: [ 8058.155964] [  874]     0   874     3627       42   0       0             0 getty
May  8 14:14:17 staging-api kernel: [ 8058.155966] [  878]     0   878     6386       56   0       0             0 rpc.idmapd
May  8 14:14:17 staging-api kernel: [ 8058.155968] [  879]     0   879     3627       42   0       0             0 getty
May  8 14:14:17 staging-api kernel: [ 8058.155970] [  891]     0   891     3627       43   0       0             0 getty
May  8 14:14:17 staging-api kernel: [ 8058.155972] [  893]     0   893     3627       42   0       0             0 getty
May  8 14:14:17 staging-api kernel: [ 8058.155974] [  895]     0   895     3627       41   0       0             0 getty
May  8 14:14:17 staging-api kernel: [ 8058.155976] [  920]   108   920 16916403     5645   0       0             0 mongod
May  8 14:14:17 staging-api kernel: [ 8058.155978] [  921]     0   921     1083       36   0       0             0 acpid
May  8 14:14:17 staging-api kernel: [ 8058.155980] [  922]     0   922     4779       53   0       0             0 cron
May  8 14:14:17 staging-api kernel: [ 8058.155982] [  924]     0   924     4228       40   0       0             0 atd
May  8 14:14:17 staging-api kernel: [ 8058.155984] [  931]     0   931    76570      110   0       0             0 automount
May  8 14:14:17 staging-api kernel: [ 8058.155986] [  943]   103   943    46919      320   0       0             0 whoopsie
May  8 14:14:17 staging-api kernel: [ 8058.155989] [  945]   998   945    18965     1747   0       0             0 gmond
May  8 14:14:17 staging-api kernel: [ 8058.155991] [  957]     0   957    15719      261   0       0             0 nginx
May  8 14:14:17 staging-api kernel: [ 8058.155993] [  959]    33   959    15840      360   0       0             0 nginx
May  8 14:14:17 staging-api kernel: [ 8058.155995] [  960]    33   960    15840      360   0       0             0 nginx
May  8 14:14:17 staging-api kernel: [ 8058.155997] [  961]    33   961    15840      360   0       0             0 nginx
May  8 14:14:17 staging-api kernel: [ 8058.155998] [  962]    33   962    15840      360   0       0             0 nginx
May  8 14:14:17 staging-api kernel: [ 8058.156000] [  963]    33   963    15840      360   0       0             0 nginx
May  8 14:14:17 staging-api kernel: [ 8058.156002] [  964]    33   964    15840      360   0       0             0 nginx
May  8 14:14:17 staging-api kernel: [ 8058.156004] [  965]    33   965    15840      360   0       0             0 nginx
May  8 14:14:17 staging-api kernel: [ 8058.156006] [  966]    33   966    15840      360   0       0             0 nginx
May  8 14:14:17 staging-api kernel: [ 8058.156008] [  967]    33   967    15840      360   0       0             0 nginx
May  8 14:14:17 staging-api kernel: [ 8058.156010] [  968]    33   968    15840      360   0       0             0 nginx
May  8 14:14:17 staging-api kernel: [ 8058.156013] [  972]   109   972     5505      149   0       0             0 privoxy
May  8 14:14:17 staging-api kernel: [ 8058.156015] [ 1034]     0  1034    26056      145   0       0             0 monit
May  8 14:14:17 staging-api kernel: [ 8058.156017] [ 1059]     0  1059     3627       40   0       0             0 getty
May  8 14:14:17 staging-api kernel: [ 8058.156019] [ 1062]   999  1062    35145    12734   0       0             0 opsworks-agent
May  8 14:14:17 staging-api kernel: [ 8058.156022] [ 1064]   999  1064    43541    17536   0       0             0 opsworks-agent
May  8 14:14:17 staging-api kernel: [ 8058.156024] [ 1071]   999  1071    43659    17625   0       0             0 opsworks-agent
May  8 14:14:17 staging-api kernel: [ 8058.156026] [ 1077]   999  1077    42805    16782   0       0             0 opsworks-agent
May  8 14:14:17 staging-api kernel: [ 8058.156028] [ 1238]     0  1238    18361      201   0       0             0 sshd
May  8 14:14:17 staging-api kernel: [ 8058.156030] [ 1365]  2003  1365    18361      200   0       0             0 sshd
May  8 14:14:17 staging-api kernel: [ 8058.156032] [ 1366]  2003  1366     6513     1719   0       0             0 bash
May  8 14:14:17 staging-api kernel: [ 8058.156035] [ 7725]     0  7725    18361      200   0       0             0 sshd
May  8 14:14:17 staging-api kernel: [ 8058.156037] [ 7893]  2003  7893    18361      199   0       0             0 sshd
May  8 14:14:17 staging-api kernel: [ 8058.156039] [ 7894]  2003  7894     6580     1784   0       0             0 bash
May  8 14:14:17 staging-api kernel: [ 8058.156041] [10972]  2003 10972     1480       24   0       0             0 tail
May  8 14:14:17 staging-api kernel: [ 8058.156044] [10977]  2003 10977   879279   845692   0       0             0 python
May  8 14:14:17 staging-api kernel: [ 8058.156046] Out of memory: Kill process 10977 (python) score 882 or sacrifice child
May  8 14:14:17 staging-api kernel: [ 8058.156056] Killed process 10977 (python) total-vm:3517116kB, anon-rss:3382768kB, file-rss:

kern.log

Was it helpful?

Solution

You are running out of memory and the Linux OOM killer is selecting processes to kill and free up memory.

May 8 14:14:17 staging-api kernel: [ 8058.156046] Out of memory: Kill process 10977 (python) score 882 or sacrifice child May 8 14:14:17 staging-api kernel: [ 8058.156056] Killed process 10977 (python) total-vm:3517116kB, anon-rss:3382768kB, file-rss:

When Linux runs critically low on memory it may invoke the OOM killer to kill a process to free up memory. From what I can see you are running mongodb, nginx and your python script all on a EC2 medium instance with around 20GB of data in the database. That's probably a bit tight - I'd look at a few options:

  1. Upgrade to a larger instance type with more memory
  2. Move you python code and nginx web site off to another instance
  3. Try tuning your python code - looks to be eating up a fair amount of memory
  4. Try tuning the OOM killer to be less strict - not for the faint of heart, but here are a few articles if you are interested:

https://unix.stackexchange.com/questions/58872/how-to-set-oom-killer-adjustments-for-daemons-permanently

http://backdrift.org/how-to-create-oom-killer-exceptions

http://www.oracle.com/technetwork/articles/servers-storage-dev/oom-killer-1911807.html

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top