Stata: import delimited with duplicate variables

https://stackoverflow.com/questions/22262683

stata

11-06-2023

Question

I have a csv file with two identical columns:

X,X
0,0
1,1
2,2

I would like to import this into Stata 13, but it does not like importing the second X (since the names are the same):

. import delimited "filename.csv"
X already defined
Error creating variables
r(109);

Is there a simple way to force the import?

I do not want to specify the rows to import. The actual dataset has 100+ variables, and the duplicated variables are distributed throughout. Similarly, I do not want to manually rename the variables. I am fine if Stata wants to either drop or rename the second X.

As background, this csv file is being generated by some sloppy SQL code. The duplicated variables are precisely the variables I use for the joins. I could clean up the SQL code or pre-clean (with e.g. Python), but I would ideally like to have Stata force the import.

Solution 2

import delimited was patched for this particular problem in the 07oct2013 update. To update Stata 13, type

. update all

in the Stata Command window.

Other tips

Try insheet.

With this example data in a .csv file:

x,x,y,y
238965,586,127,192864
238965,586,127,192864
1074,198264,5186,2947
1074,198264,5186,2947

All variables are imported and the resulting names in Stata are:

x
v2
y
v4

The command would be:

insheet using "~/some/file.csv"

(I'm on Stata 12.1. According to the Stata 13 [U] manual, p. 21, insheet is superseded by import delimited.)
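If neither updating nor insheet is an option, one possible workaround is to read the header line as data and build the variable names yourself. The following is only a sketch, untested against the unpatched Stata 13 build. It assumes the file is called filename.csv, that the first line holds the column names, and that no header happens to collide with the temporary names v1, v2, ... that Stata assigns. It keeps the first occurrence of each name and drops later duplicates:

```stata
* Read everything as data; the columns are named v1, v2, ... and the
* header text ends up in observation 1 (so every column is a string).
import delimited "filename.csv", varnames(nonames) clear

foreach v of varlist _all {
    * Turn the header cell into a legal Stata name
    local name = strtoname(`v'[1])
    * rename fails if the name is already taken, i.e. a duplicate column
    capture rename `v' `name'
    if _rc != 0 {
        drop `v'
    }
}

* Remove the header row and convert numeric-looking columns back to numbers
drop in 1
destring _all, replace
```

On Stata 12, insheet with the nonames option should play the same role as the import delimited line. Columns that contain non-numeric text are left as strings by destring.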

Licensed under: CC-BY-SA with attribution

Not affiliated with StackOverflow