SAS name error when using a variable from a looping macro inside a hash dataset statement

Question 1

First, a few notes on the technical points raised in the other answer. Neither of them is the direct cause of the problem, although both are examples of poor coding.

&i is indeed accessible here, although I would suggest it is poor style to use it the way you do. Macro variables that interior macros rely on should be defined as macro parameters; that makes it clear where they came from. Technically, though, this isn't wrong, as this example shows:

%macro caller;
%do i=1 %to 5;
  %called;
%end;
%mend;
%macro called;
%put &i;
%mend called;

%caller;

However, it would be better to make i a parameter, such as %macro called(i=);, to make your macro clearer and more reusable.
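As a minimal sketch of that suggestion, here is the same pair of macros with i passed explicitly. The macro names are the ones from the example above; only the i= keyword parameter is new.

```sas
/* The inner macro declares what it needs as a keyword parameter. */
%macro called(i=);
  %put &i;
%mend called;

/* The outer macro passes its loop counter explicitly. */
%macro caller;
  %do i=1 %to 5;
    %called(i=&i)
  %end;
%mend caller;

%caller
```

Inside called, &i now resolves to the parameter rather than to whatever happens to exist in an enclosing scope, so the macro can also be called on its own, e.g. %called(i=3).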

Second, the lack of quotes is in fact not the direct problem, although again it does point to an issue, and adding quotes is a solution in a way. SAS does convert the numeric in that expression to a character value (otherwise you'd get a very different error message); however, it does so in a way that is not helpful. The problem is how SAS converts numerics to text: &i is a number (1 in your example). It needs to become "1", but the automatic conversion uses best12., which right-aligns the value in 12 characters, so you get a 1 preceded by leading blanks. Those blanks end up inside the dataset name, and that is a problem. The fix closest to what you wrote is to add compress around it:

hh.output (dataset: compress('_'||&i||put(year, best.-L))) ;

That works. A better implementation would be to intentionally convert to a character value. Macro parameters are very easy to convert: just add " " around them.

hh.output(dataset: cats('_',"&i.",year));

cats strips all spaces off, and makes the -L unnecessary. It would work just as well with &i, although it's certainly better to add quotes.
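To see the difference outside the hash object, the three expressions can be compared in a throwaway data step. This is only an illustrative sketch: the values i=1 and year=2005 are assumptions standing in for your real data, and the variable names are made up.

```sas
/* Assumed stand-in values; not taken from the original program. */
%let i = 1;

data _null_;
  year = 2005;
  length raw squeezed quoted $ 32;

  /* Original expression: &i is converted implicitly with BEST12. */
  raw      = '_' || &i || put(year, best.-L);

  /* Same expression with the blanks removed afterwards. */
  squeezed = compress('_' || &i || put(year, best.-L));

  /* Explicit character value; cats strips blanks from each argument. */
  quoted   = cats('_', "&i.", year);

  /* lengthn counts the embedded blanks, so the lengths show the padding. */
  raw_len      = lengthn(raw);
  squeezed_len = lengthn(squeezed);
  quoted_len   = lengthn(quoted);

  put raw_len= squeezed_len= quoted_len=;
  put squeezed= quoted=;
run;
```

The first assignment should also produce the "numeric values have been converted to character values" note in the log, which is a useful hint that an implicit conversion is happening.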

I would add that you might consider why you're subsetting these. I don't think there's anything conceptually wrong with doing so, but odds are if you are doing something by year, you can use by year and get away with not subsetting them - keeping them in one dataset per treatment group (and perhaps even by group year?). Further, you might be able to do this in fewer steps. What are you going to do, finally? Let's say you had one dataset for each group/year. What code would you then run? It may be that you can write that in one or a few steps without breaking out 48x14 datasets, which is probably not efficient. If you're interested in finding out, start a new question with the details of what you'd like to do with just a pair of datasets.

Question 2

To address your specific point regarding the output line:-

hh.output (dataset: '_'||&i||put(year, best.-L)) ;

A few notes:-

The argument for dataset must be quoted, e.g. ..(dataset: "Out123"). Currently, yours is not.
The major macro does not have access to the macro variable &i. This is created in the calling macro and is available in that scope only. You could add another parameter to the major macro and send &i that way.

In truth, though, the code you're creating looks very difficult to maintain; a hash in a macro is a nightmare to debug. And if you're producing code that creates lots of output datasets, that should set alarm bells ringing; the by statement in SAS lets you apply criteria to separate groups within one dataset and is definitely preferable to many datasets.

Let's look at the problem: you're matching controls to treatment using variables that will yield multiple control groups per treatment? You would then choose a control per treatment based on some distance criteria?

For the first part, a Proc SQL merge sounds about right. You'll get a long dataset with each treatment firm repeated the number of times it was matched to a control. Then sort by treatment descending [distance criteria] and pick the first one per by group. And that should be that. Of course, I'm sure I've misunderstood something...
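A rough sketch of that approach follows. Every dataset and variable name in it (treatment, controls, firm_id, year, industry, size) is hypothetical, as is the distance measure, since the question does not show the real ones. It also assumes that a smaller distance is a better match, so the sort is ascending; use descending if your criterion runs the other way.

```sas
/* Hypothetical inputs: work.treatment and work.controls, each with  */
/* firm_id, year, industry and size. Replace with your own names.    */
proc sql;
  create table matches as
  select t.firm_id              as treat_id,
         c.firm_id              as control_id,
         t.year,
         abs(t.size - c.size)   as distance   /* assumed criterion */
  from treatment as t
       inner join controls as c
         on  t.year     = c.year
         and t.industry = c.industry;
quit;

/* Put the closest control first within each treatment firm and year. */
proc sort data=matches;
  by treat_id year distance;
run;

/* Keep only the first (closest) control per by group. */
data best_match;
  set matches;
  by treat_id year;
  if first.year;
run;
```

This keeps everything in one long dataset with by-group processing instead of one output dataset per group and year.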

A final point; you could just look for matching algorithms in SAS, particularly 'optimal matching' or 'greedy matching'. In my experience, SAS is not great for matching, especially if it involves a random element (yours doesn't), but you should be able to find code more useful than what you're working with right now.

Question 3

I think you may be able to simplify your above code greatly by using wildcards on your set statement instead of using multiple set statements to "loop over them".

For example, the below code would step through all of your datasets that begin with one of the prefixes so that you can work on them without multiple set statements.

data all;
  set out1:
      out2:
      out3:
      out4:
      ;
run;

That may even allow you to remove the need for a macro which in turn would simplify the code. That existing code looks very difficult to maintain/debug so I think simplifying it is the first step.
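If you still need to know which dataset each row came from after stacking them, the indsname= option on the set statement can record it. A small sketch, reusing the out1: to out4: prefixes from the example above; the variable names src and source are made up for illustration.

```sas
data all;
  length source $ 41;          /* libref.member can be up to 41 characters */
  set out1:
      out2:
      out3:
      out4:
      indsname=src;            /* src holds the current input dataset name */
  source = src;                /* copy it, since src itself is not written out */
run;
```

You can then use source (or a group/year value parsed from it) in a by statement rather than looping over the individual datasets.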