Question

I need to assign values to a data.frame, possibly outside its current bounds, based on the arguments passed to a function.

To implement a parametric assignment one could define:

ass=function(x,i,j,a){
  # some operations on x,i,j,a 
  `[<-`(x,i,j,a) 
}

or

ass=function(x,i,j,a){
  # some operations on x,i,j 
  do.call(`[<-`, list(x,i,j,a))
}

The problem comes when I need to emulate x[,j] or x[i,]. In some cases TRUE will work, i.e.:

`[<-`(x,T,j,a); `[<-`(x,i,T,a)  

is like:

x[,j]=a; x[i,]=a; 
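
As a quick check of this equivalence, here is a minimal sketch on a small integer matrix (the object names a and b are only illustrative):

```r
x <- matrix(1:12, ncol = 3)

# Usual form, with the row index omitted.
b <- x
b[, 2] <- 0L

# Functional form, with TRUE recycled over all rows.
a <- `[<-`(x, TRUE, 2, 0L)

identical(a, b)
```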

Now assume x is an n*3 data.frame. I can assign out of bounds without any problem, e.g.:

`[<-`(x,T,4,0) 

still works like

x[,4]=0

But:

`[<-`(x,T,4:5,0) 

gives a "subscript out of bounds" error, while

 x[,4:5]=0

works.

How can I "hack" the notation "[<-"(x,i,j,a) or do.call("[<-", list(x,i,j,a)) in order to take all i (or all j)?


Solution

Formal restatement of the problem

Given the integers i, j and the matrix/data.frame x, find an m such that:

x[i,m]; x[m,j]; x[m,m]

are identical, respectively, to:

x[i,]; x[,j]; x[,]

This should of course apply also to replacements, i.e. for:

x[*] = value

and with the syntax `[` and `[<-` or the related do.call.

Investigation of the problem

If you are not interested in what happens under the hood, you can go straight to the proposed solution.

Since `[` and `[<-` are functions, m must behave like a missing argument when passed as i or j. Therefore one needs to create an "artificial" missing value.
What happens when we inspect actual missing arguments inside a function?

f=function(i,j){
   cat ('A) ');  print(match.call())
   cat ('B) ');  print(sys.call())
   cat ('C)\n'); print(as.list(sys.call()))
}

When using a comma without a preceding argument, we get:

f(,2)

#A) f(j = 2)
#B) f(, 2)
#C)
#[[1]]
#f
# 
#[[2]]
# 
# 
#[[3]]
#[1] 2

The second element of the sys.call list seems empty! So we try to capture this value:

f=function(a,b) as.list(sys.call())[[2]] 
m=f(,)

and:

m
#Error: argument "m" is missing, with no default

Never was an error so welcome: m now holds the same empty value that R sees when an argument is omitted.
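
The same mechanism can be seen without capturing anything. In the sketch below (the helper name sub2 is hypothetical), an argument omitted in the call to a wrapper stays missing when it is handed on to `[`:

```r
# Hypothetical wrapper: forwards i and j to `[` unchanged.
sub2 <- function(x, i, j) x[i, j]

x <- matrix(1:12, ncol = 3)

# i is omitted in the call, so `[` sees an empty row index.
all_rows <- sub2(x, , 2)

# j is omitted in the call, so `[` sees an empty column index.
all_cols <- sub2(x, 1, )

identical(all_rows, x[, 2])
identical(all_cols, x[1, ])
```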

Proposed solution

Set:

m=(function(a,b) as.list(sys.call())[[2]])(,)

m is now an artificial missing value. It operates like the empty space near the comma when subsetting. In fact:

x=matrix(1:12, ncol=3)    

x[m,m]
#     [,1] [,2] [,3]
#[1,]    1    5    9
#[2,]    2    6   10
#[3,]    3    7   11
#[4,]    4    8   12

x[1, m ]
#[1] 1 5 9

x[m, 1 ]
#[1] 1 2 3 4

So far, although more formal, this method offers nothing beyond the recycling trick x[T,T], x[1,T], x[T,1]. But where that workaround fails:

x=data.frame(x)
x[T,4:5]=0
#Error in `*tmp*`[[j]] : subscript out of bounds

the artificial missing value works:

x[m,4:5]=0
x
#  X1 X2 X3 V4 V5
#1  1  5  9  0  0
#2  2  6 10  0  0
#3  3  7 11  0  0
#4  4  8 12  0  0
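
Since the question asks about the do.call form: m cannot simply be placed inside list(...), because list() would try to evaluate it and hit the same "missing" error as above. One possible way around this, sketched below, is to build the argument list with quote(expr = ), which returns the empty symbol directly. The convention that NULL stands for "all rows" or "all columns" is an assumption made for this example, not part of the original answer, so it will not suit code where NULL is a meaningful index.

```r
# Sketch of the parametric assignment from the question.
# Assumption: i = NULL or j = NULL means "take all rows / all columns".
ass <- function(x, i = NULL, j = NULL, a) {
  args <- list(
    x,
    if (is.null(i)) quote(expr = ) else i,  # empty symbol acts like x[, j]
    if (is.null(j)) quote(expr = ) else j,  # empty symbol acts like x[i, ]
    a
  )
  do.call(`[<-`, args)
}

df  <- data.frame(matrix(1:12, ncol = 3))
df2 <- ass(df, j = 4:5, a = 0)                     # like df[, 4:5] <- 0
mx  <- ass(matrix(1:12, ncol = 3), i = 2, a = 0L)  # like x[2, ] <- 0L
```

The function returns the modified copy, so the result has to be assigned back, as with any direct call to `[<-`.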
Licensed under: CC-BY-SA with attribution