Should I try to avoid the "new" keyword in ultra-low-latency software?

Question 1

In C++ you don't need new to create an object that has limited scope.

void FrequentlyCalledMethod() 
{
    std::vector<Action> actions;
    actions.reserve( 10 );
    for (int i = 0; i < 10; i++) 
    {
        actions.push_back( Action(....) );
    }
    // use actions synchronously
    executor.Execute(actions);
    // actions is destroyed automatically when the function returns
}

If Action is a base class and the elements are actually of derived types, you will need pointers or smart pointers here, and new (or std::make_unique) to create the objects. There is no need for that if Action is a concrete type and every element is of that type; the type only has to be copyable or movable so that it can be stored in a std::vector.
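A minimal sketch of the polymorphic case, assuming a hypothetical Action interface with two made-up derived types (BuyAction, SellAction, and Run() are illustrative names, not from the question). Note that std::make_unique still allocates from the heap; it only removes the explicit new/delete pair.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical base class; the original does not show Action's definition.
struct Action {
    virtual ~Action() = default;
    virtual int Run() const = 0;
};

// Hypothetical derived types, for illustration only.
struct BuyAction : Action {
    int Run() const override { return 1; }
};
struct SellAction : Action {
    int Run() const override { return -1; }
};

// Builds a mixed list of actions and returns the sum of their results.
int RunMixedActions(int count) {
    assert(count >= 0);
    std::vector<std::unique_ptr<Action>> actions;
    actions.reserve(static_cast<std::size_t>(count));
    for (int i = 0; i < count; ++i) {
        // Each element is a separate heap allocation owned by a unique_ptr.
        if (i % 2 == 0)
            actions.push_back(std::make_unique<BuyAction>());
        else
            actions.push_back(std::make_unique<SellAction>());
    }
    int total = 0;
    for (const auto& a : actions) total += a->Run();
    // Every Action is destroyed when 'actions' goes out of scope on return.
    return total;
}
```

If the set of derived types is small and known, a std::variant or a tagged struct stored by value would avoid the per-element allocation, at the cost of a less open design.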

In general, though, it is highly unlikely that your performance gains will come from avoiding new. It is simply good C++ practice to give an object local (automatic) scope when that is the scope it needs. C++ makes you take more care over resource management, and the technique for that is known as "RAII": you decide how a resource will be released (through an object's destructor) at the point where you acquire it.
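To make the RAII point concrete, here is a small sketch. Resource is a hypothetical class that only counts live instances, standing in for anything that must be released (a file, a lock, a buffer).

```cpp
#include <cassert>

// Hypothetical resource wrapper: counts how many instances are alive.
class Resource {
public:
    Resource() { ++Counter(); }    // "acquire" in the constructor
    ~Resource() { --Counter(); }   // "release" in the destructor
    Resource(const Resource&) = delete;
    Resource& operator=(const Resource&) = delete;

    static int Live() { return Counter(); }

private:
    static int& Counter() {
        static int n = 0;
        return n;
    }
};

// Returns the number of live resources while one is held in local scope.
int LiveInsideScope() {
    assert(Resource::Live() == 0);  // assumes nothing is alive before the call
    Resource r;                     // automatic storage, no new
    return Resource::Live();
}   // r's destructor runs here, on normal return or on an exception
```

After LiveInsideScope() returns, Resource::Live() is back to zero without any explicit cleanup call, which is the whole point of tying release to the destructor.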

High performance is more likely to come about through:

proper use of algorithms
proper parallel-processing and synchronisation techniques
effective caching and lazy evaluation.

Question 2

As much as I detest HFT, I'm going to tell you how to get maximum performance out of each thread on a given piece of iron.

Here's an explanation of an example where a program as originally written was made 730 times faster.

You do it in stages. At each stage, you find something that takes a good percentage of the time, and you fix it. The keyword is find, as opposed to guess. Too many people just eyeball the code and fix what they think will help; often, but not always, it does help a little. That's guesswork. To get real speedup, you need to find all the problems, not just the few you can guess.
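As one simple way to measure rather than guess, here is a sketch of a timing helper built on std::chrono::steady_clock. This is plain wall-clock timing, not necessarily the method the linked explanation uses, and ElapsedNanos is a made-up name.

```cpp
#include <cassert>
#include <chrono>
#include <cstdint>

// Hypothetical helper: runs 'work' 'iterations' times and returns the
// total elapsed time in nanoseconds.
template <typename F>
std::int64_t ElapsedNanos(F&& work, int iterations) {
    assert(iterations > 0);
    const auto start = std::chrono::steady_clock::now();
    for (int i = 0; i < iterations; ++i) work();
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration_cast<std::chrono::nanoseconds>(stop - start).count();
}
```

Calling it with two candidate implementations of the same step gives a rough comparison. The work should have an observable side effect, otherwise an optimizing compiler may remove it, and a real profiler will usually tell you more than timings of isolated pieces.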

If your program is doing new, then chances are at some point that will be what you need to fix. But it's not the only thing.

Here's the theory behind it.

Question 3

For high-performance trading engines at good HFT shops, avoiding new/malloc in C++ code is a basic requirement.
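One common way to do that is to preallocate everything at startup and recycle objects afterwards. The sketch below is an illustration under assumptions, not a description of any particular shop's code; Order and OrderPool are hypothetical names, and the pool is single-threaded.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical message type, for illustration only.
struct Order {
    int id = 0;
    double price = 0.0;
};

// Fixed-capacity pool: all storage is allocated once in the constructor,
// so Acquire/Release do not allocate afterwards.
class OrderPool {
public:
    explicit OrderPool(std::size_t capacity) : slots_(capacity) {
        free_.reserve(capacity);
        for (std::size_t i = 0; i < capacity; ++i) free_.push_back(&slots_[i]);
    }

    // Returns nullptr when the pool is exhausted instead of falling back to new.
    Order* Acquire() {
        if (free_.empty()) return nullptr;
        Order* o = free_.back();
        free_.pop_back();
        return o;
    }

    // The caller must release each acquired pointer exactly once.
    void Release(Order* o) {
        assert(o >= slots_.data() && o < slots_.data() + slots_.size());
        *o = Order{};        // reset to a clean state for the next user
        free_.push_back(o);  // capacity was reserved, so this does not reallocate
    }

    std::size_t Available() const { return free_.size(); }

private:
    std::vector<Order> slots_;   // owns the objects
    std::vector<Order*> free_;   // stack of currently unused slots
};
```

The pool is sized once, for example OrderPool pool(4096); at startup, and the hot path only calls Acquire() and Release(). Handling exhaustion explicitly (the nullptr return) is a design choice: it keeps latency predictable instead of silently allocating under load.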