How much overhead does the NewRelic PHP agent add?

Question 1

Your mileage may vary based on the settings, your particular site's code base, etc...

The additional overhead you're seeing is less the memory used, but the tracing and profiling of your php code and gathering analytic data on it as well as DB request profiling. Basically some additional overhead hooked into every php function call. You see similar overhead if you left Xdebug or ZendDebugger running on a machine or profiling. Any module will use some resources, ones that hook deep in for profiling can be the costliest, but I've seen new relic has config settings to dial back how intensively it profiles, so you might be able to lighten it's hit more than say Xdebug.

All that being said, with the newrelic shared PHP module loaded with the default setup and config from their site my company's website overall server response latency went up about 15-20% across the board when we turned it on for all our production machines. I'm only talking about the time it takes for php-fpm to generate an initial response. Our site is http://www.nara.me. The newrelic-daemon and newrelic-sysmon services running as well, but I doubt they have any impact on response time.

Don't get me wrong, I love new relic, but the perfomance hit in my specific situation hit doesn't make me want to keep the PHP module running on all our live load balanced machines. We'll probably keep it running on one machine all the time. We do plan to keep the sysmon stuff going 100% and keep the module disabled in case we need it for troubleshooting.

My advice is this:

Wrap any calls to new relic functions in if(function_exists ( $function_name )) blocks so your code can run without error if the new relic module isn't loaded
If you've multiple identical servers behind a loadbalancer sharing the same code, only enable the php module on one image to save performance. You can keep the sysmon stuff running if you use new relic for this.
If you've just one server, only enable the shared php module when you need it--when you're actually profiling your code or mysql unless a 10-20% performance hit isn't a problem.

One other thing to remember if your main source of info is the new relic website: they get paid by the number of machines you're monitoring, so don't expect them to convince you to not use it on anything less than 100% of your machines even if it not needed. I think one of their FAQ's or blogs state basically you should expect some performance impact, but if you use it as intended and fix the issues you see from it, you should recoup the latency lost. I agree, but I think once you fix the issues, limit the exposure to the smallest needed number of servers.

Question 2

The agent shouldn't be adding much overhead the way it is designed. Because of the level of detail required to adequately troubleshoot the problem, this seems like a good question to ask at https://support.newrelic.com