Question

I'm having crazy issues on checkout page - it kicks the client out of the checkout:

500 Server error on /checkout/onepage/progress/?toStep=shipping_method

sometimes on billing too.

The issue is intermittent, sometimes it shows up, sometimes not.

I have applied SUPEE-7405, both v1.0 and v1.1, via FTP (direct file upload), but the checkout issue still persists!

Please help!

PS: MCRYPT, MBSTRING and SOAP are all enabled on the server. I'm running PHP 5.4.45 with APC 3.1.13 on an Intel 1230 CPU, 16GB of RAM, 2TB drives.

Update:

Chrome Console spits this out:

prototype.js:1530 POST /checkout/onepage/saveBilling/ 500 (Internal Server Error)

Ajax.Request.Class.create.request @ prototype.js:1530
Ajax.Request.Class.create.initialize @ prototype.js:1495
(anonymous function) @ prototype.js:429
klass @ prototype.js:101
Billing.save @ /skin/frontend/base/default/js/opcheckout.js:313
onclick @ /checkout/onepage/:679

UPDATE: Apache Logs have these types of errors:

24.87.30.186 - - [21/Apr/2016:16:13:43 -0700] "POST /checkout/onepage/saveBilling/ HTTP/1.1" 500 - "https://www.example.com/checkout/onepage/" "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/49.0.2623.112 Safari/537.36"

216.129.65.170 - - [21/Apr/2016:16:08:00 -0700] "GET /checkout/onepage/progress/?toStep=payment HTTP/1.1" 500 461 "https://www.example.com/checkout/onepage/" "Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/49.0.2623.112 Safari/537.36"

216.129.65.170 - - [21/Apr/2016:16:08:00 -0700] "GET /checkout/onepage/progress/?toStep=payment HTTP/1.1" 500 461 "https://www.example.com/checkout/onepage/" "Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/49.0.2623.112 Safari/537.36"


UPDATE

I was following the error_log with tail when the error happened:

tail -n 200 -f /usr/local/apache/logs/error_log

[Fri Apr 22 08:37:36.234374 2016] [fcgid:warn] [pid 20874:tid 140365143197440] (

[Fri Apr 22 08:41:52.653576 2016] [fcgid:warn] [pid 20892:tid 140365143197440] (104)Connection reset by peer: [client 1.1.1.1:29184] mod_fcgid: error reading data from FastCGI server, referer: https://www.example.com/checkout/onepage/

[Fri Apr 22 08:41:52.653629 2016] [core:error] [pid 20892:tid 140365143197440] [client 1.1.1.1:29184] End of script output before headers: index.php, referer: https://www.example.com/checkout/onepage/

[Fri Apr 22 08:41:52.743064 2016] [fcgid:error] [pid 20871:tid 140365330593728] mod_fcgid: process /usr/local/cpanel/cgi-sys/php5(21528) exit(communication error), get signal 11, possible coredump generated

zend_mm_heap corrupted

[Fri Apr 22 08:41:52.791456 2016] [fcgid:warn] [pid 20892:tid 140365143197440] (104)Connection reset by peer: [client 1.1.1.1:29184] mod_fcgid: error reading data from FastCGI server, referer: https://www.example.com/checkout/onepage/

[Fri Apr 22 08:41:52.791491 2016] [core:error] [pid 20892:tid 140365143197440] [client 1.1.1.1:29184] End of script output before headers: index.php, referer: https://www.example.com/checkout/onepage/

[Fri Apr 22 08:45:52.376926 2016] [core:error] [pid 20873:tid 140365284730624] [client 1.1.1.1:29240] End of script output before headers: index.php, referer: https://www.example.com/checkout/onepage/

[Fri Apr 22 08:45:52.540121 2016] [fcgid:error] [pid 20871:tid 140365330593728] mod_fcgid: process /usr/local/cpanel/cgi-sys/php5(22178) exit(communication error), get signal 11, possible coredump generated

[Fri Apr 22 08:45:52.540414 2016] [fcgid:error] [pid 20871:tid 140365330593728] mod_fcgid: process /usr/local/cpanel/cgi-sys/php5(22113) exit(communication error), get signal 11, possible coredump generated
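To see how often the PHP process is actually crashing, the mod_fcgid lines in the log can be tallied. The following is only a rough Python sketch, assuming the error_log keeps the exact line format quoted above; the function name `summarize_crashes` and the regular expression are my own and not part of any Apache or Magento tooling. On Linux, signal 11 is SIGSEGV (a segmentation fault).

```python
import re
from collections import Counter

# Matches mod_fcgid crash lines in the format quoted above, e.g.
# "... mod_fcgid: process /usr/local/cpanel/cgi-sys/php5(21528)
#  exit(communication error), get signal 11, possible coredump generated"
CRASH_RE = re.compile(
    r"\[(?P<stamp>[^\]]+)\].*mod_fcgid: process (?P<binary>\S+)\((?P<pid>\d+)\) "
    r"exit\([^)]*\), get signal (?P<signal>\d+)"
)


def summarize_crashes(lines):
    """Count crash lines per (minute, signal). Hypothetical helper, not a real tool."""
    counts = Counter()
    for line in lines:
        match = CRASH_RE.search(line)
        if match is None:
            continue  # not a mod_fcgid crash line
        # "Fri Apr 22 08:41:52.743064 2016" -> "Fri Apr 22 08:41"
        minute = match.group("stamp")[:16]
        counts[(minute, int(match.group("signal")))] += 1
    return counts
```

Passing an open file handle for /usr/local/apache/logs/error_log to `summarize_crashes` would give a per-minute count of crashes, which can then be compared against the times customers reported being kicked out of the checkout.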

I have checked with Sysadmin of the server:

"That message means the php script is crashing php. Most likely due to opcode caching."

^^^^ I don't understand this.

--------- APRIL-25 UPDATE --- TURNING OFF APC ------------

It looks like removing APC from local.xml fixes this issue. I need to do more testing, but the new version of APC is probably buggy and nobody has time to update this old extension.

It could be that the Dev site and the Production site cannot both use APC (even though I gave them different prefixes).

I tried removing plugins, but that did not help. I'll update as soon as I can test more.

Asked my admin to add Xcache and Memcached to see if I can switch to those.

--------- APRIL-28 UPDATE --- TURNING OFF APC DID NOT HELP ------------

Looks like APC is not the cause, although turning it off did lower the number of clients this was happening to.

I have moved to Memcache and it has the same issue.

So it seems like I'll have to pick apart the whole checkout and maybe go back to the /base version of it, just to make sure there are no weird errors in the custom theme code.

--------- JUNE 7 --- SWITCHING RAM MODULES DID NOT HELP ------------

We have tried to switch RAM modules and add more memory (32GB) to the server.

We are still getting this error.

I'm going to get a VPS from GoDaddy and see if I can reproduce it on another server.

Maybe it's a messed-up server environment (Apache, PHP)?

------------- UPDATE JUNE 24 ---------------

I noticed the page would throw:

form.js:53 GET https://www.example.ca/checkout/onepage/index/ net::ERR_CONTENT_DECODING_FAILED
submit @ form.js:53
onepageLogin @ /checkout/onepage/:645
onclick @ /checkout/onepage/:625

So I looked up this error and found this page:

http://stefantsov.com/fixing-err_content_decoding_failed-in-apachephp/

I edited my php.ini and added:

display_errors=off
zlib.output_compression=On

I also edited these lines out of index.php:

ini_set('display_errors', 1);
error_reporting(E_ALL);

From what I understand, part of the checkout page would get gzipped, but part of it would arrive late (an AJAX request for a shipping quote?) and would not get encoded. The page would then throw an encoding error and kick the customer out of the checkout page.
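One way to picture this failure mode is a response that is labelled as gzip but has some uncompressed bytes mixed in. The Python sketch below only illustrates that idea; it is not the actual Apache/PHP behavior on this server. The helper name `decode_gzip_body`, the sample markup, and the stray "Notice" text are all made up for the illustration.

```python
import gzip
import zlib


def decode_gzip_body(body: bytes) -> str:
    """Decode a body as a client would after seeing 'Content-Encoding: gzip'.

    Hypothetical helper: returns the decoded text, or a short failure message.
    """
    try:
        return gzip.decompress(body).decode("utf-8")
    except (OSError, EOFError, zlib.error) as exc:
        return f"decoding failed: {type(exc).__name__}"


# A properly compressed fragment of the page (placeholder markup).
page = gzip.compress(b"<div id='checkout-step'>...</div>")

# Made-up uncompressed output that ends up in the same response body.
stray = b"Notice: hypothetical uncompressed output\n"

clean = decode_gzip_body(page)          # decodes back to the original markup
mixed = decode_gzip_body(stray + page)  # fails: the body is no longer valid gzip
```

The browser reports this kind of mismatch as ERR_CONTENT_DECODING_FAILED, which matches the console message above.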

I will test this more to see if this is still happening when server is loaded up more.

The underlying issue is still probably there, I just turned off reporting of this error.

------------- UPDATE JUNE 27 ---------------

From all the testing I've done, it seems I solved only part of the problem: SOAP requests for shipping quotes were not sent compressed, so part of the page would not be encoded when the customer went to the checkout page.

I added something like this to Fedex and Canada Post plugins:

$client = new SoapClient($wsdl, array(
    'compression' => SOAP_COMPRESSION_ACCEPT | SOAP_COMPRESSION_GZIP,
    'exceptions' => 1,
    'trace' => $trace,
));

After that the encoding issues disappeared, but the 500 error still shows up.

At this point I assume something is messed up with the Apache or PHP install.

I'm planning to start testing NGINX webserver with our magento install and see how well it works.

------------- UPDATE JULY 3rd ---------------

Our admin has installed the LiteSpeed webserver, but the 503 still showed up. He said he's going to put better logging on the server so we can read the core dumps from the error.


Solution

My sysadmin enabled full core dumps and found that APC was throwing an error.

Apparently APC and PHP 5.4.5 have some kind of bug:

https://bugs.php.net/bug.php?id=62587

After removing APC from the server completely, I confirmed that the error has stopped.

Licensed under: CC-BY-SA with attribution
Not affiliated with magento.stackexchange