Question

Hi, I have a string generated by Python, and I need to read it into R to analyze it.

The only difference between the two strings below is their length (the number of elements in the list), and R cannot read the longer one successfully.

textWork <- "[('08/10/2013 01:50:16 AM INFO', 'product1', '', '61.12000', '1'), ('08/10/2013 02:04:23 AM INFO', 'product1', '', '61.12000', '1'), ('08/11/2013 02:29:46 AM INFO', 'product1', '', '61.12000', '1'), ('08/12/2013 12:58:43 AM INFO', 'product1', '', '61.12000', '1'), ('08/12/2013 01:12:18 AM INFO', 'product1', '', '61.12000', '1'), ('08/13/2013 01:14:57 AM INFO', 'product1', '', '61.12000', '1'), ('08/14/2013 02:01:42 AM INFO', 'product1', '', '61.12000', '1'), ('08/14/2013 02:04:43 AM INFO', 'product1', '', '61.12000', '1'), ('08/15/2013 01:09:23 AM INFO', 'product1', '', '61.12000', '1'), ('08/15/2013 01:22:50 AM INFO', 'product1', '', '61.12000', '1'), ('08/16/2013 12:56:52 AM INFO', 'product1', '', '61.12000', '1'), ('08/16/2013 01:09:38 AM INFO', 'product1', '', '61.12000', '1'), ('08/17/2013 12:54:20 AM INFO', 'product1', '', '61.12000', '1'), ('08/17/2013 01:07:51 AM INFO', 'product1', '', '61.12000', '1'), ('08/18/2013 12:54:14 AM INFO', 'product1', '', '61.12000', '1'), ('08/18/2013 01:09:37 AM INFO', 'product1', '', '61.12000', '1'), ('08/19/2013 12:54:13 AM INFO', 'product1', '', '61.12000', '1'), ('08/19/2013 01:10:06 AM INFO', 'product1', '', '61.12000', '1'), ('08/20/2013 02:09:17 AM INFO', 'product1', '', '61.12000', '1'), ('08/20/2013 02:25:56 AM INFO', 'product1', '', '61.12000', '1'), ('08/21/2013 01:21:03 AM INFO', 'product1', '', '61.12000', '1'), ('08/21/2013 01:34:59 AM INFO', 'product1', '', '61.12000', '1'), ('08/22/2013 01:32:54 AM INFO', 'product1', '', '61.12000', '1'), ('08/22/2013 01:55:25 AM INFO', 'product1', '', '61.12000', '1'), ('08/23/2013 01:23:44 AM INFO', 'product1', '', '61.12000', '1'), ('08/23/2013 01:41:08 AM INFO', 'product1', '', '61.12000', '1'), ('08/24/2013 01:17:46 AM INFO', 'product1', '', '61.12000', '1'), ('08/24/2013 01:31:12 AM INFO', 'product1', '', '61.12000', '1'), ('08/25/2013 12:57:21 AM INFO', 'product1', '', '61.12000', '1'), ('08/25/2013 01:10:55 AM INFO', 'product1', '', '61.12000', '1'), 
('08/26/2013 12:56:37 AM INFO', 'product1', '', '61.12000', '1'), ('08/26/2013 01:11:03 AM INFO', 'product1', '', '61.12000', '1'), ('08/27/2013 01:00:15 AM INFO', 'product1', '', '61.12000', '1'), ('08/27/2013 01:13:09 AM INFO', 'product1', '', '61.12000', '1'), ('08/28/2013 01:07:21 AM INFO', 'product1', '', '61.12000', '1'), ('08/28/2013 01:24:13 AM INFO', 'product1', '', '61.12000', '1'), ('08/29/2013 12:57:08 AM INFO', 'product1', '', '61.12000', '1'), ('08/29/2013 01:10:57 AM INFO', 'product1', '', '61.12000', '1'), ('08/30/2013 12:56:22 AM INFO', 'product1', '', '61.12000', '1'), ('08/30/2013 01:10:43 AM INFO', 'product1', '', '61.12000', '1'), ('08/31/2013 12:53:37 AM INFO', 'product1', '', '61.12000', '1'), ('08/31/2013 01:08:01 AM INFO', 'product1', '', '61.12000', '1'), ('09/01/2013 12:52:11 AM INFO', 'product1', '', '61.12000', '1'), ('09/01/2013 01:06:40 AM INFO', 'product1', '', '61.12000', '1'), ('09/02/2013 12:50:31 AM INFO', 'product1', '', '61.12000', '1'), ('09/02/2013 01:05:16 AM INFO', 'product1', '', '61.12000', '1'), ('09/03/2013 12:54:07 AM INFO', 'product1', '', '61.12000', '1'), ('09/03/2013 01:09:32 AM INFO', 'product1', '', '61.12000', '1'), ('09/04/2013 01:16:11 AM INFO', 'product1', '', '61.12000', '1'), ('09/05/2013 12:59:34 AM INFO', 'product1', '', '61.12000', '1'), ('09/06/2013 12:55:00 AM INFO', 'product1', '', '61.12000', '1'), ('09/07/2013 01:13:40 AM INFO', 'product1', '', '61.12000', '1'), ('09/09/2013 01:07:43 AM INFO', 'product1', '', '61.12000', '1')]"

textNotWork <- "[('08/10/2013 01:50:16 AM INFO', 'product1', '', '61.12000', '1'), ('08/10/2013 02:04:23 AM INFO', 'product1', '', '61.12000', '1'), ('08/11/2013 02:29:46 AM INFO', 'product1', '', '61.12000', '1'), ('08/12/2013 12:58:43 AM INFO', 'product1', '', '61.12000', '1'), ('08/12/2013 01:12:18 AM INFO', 'product1', '', '61.12000', '1'), ('08/13/2013 01:14:57 AM INFO', 'product1', '', '61.12000', '1'), ('08/10/2013 01:50:16 AM INFO', 'product1', '', '61.12000', '1'), ('08/10/2013 02:04:23 AM INFO', 'product1', '', '61.12000', '1'), ('08/11/2013 02:29:46 AM INFO', 'product1', '', '61.12000', '1'), ('08/12/2013 12:58:43 AM INFO', 'product1', '', '61.12000', '1'), ('08/12/2013 01:12:18 AM INFO', 'product1', '', '61.12000', '1'), ('08/13/2013 01:14:57 AM INFO', 'product1', '', '61.12000', '1'), ('08/10/2013 01:50:16 AM INFO', 'product1', '', '61.12000', '1'), ('08/10/2013 02:04:23 AM INFO', 'product1', '', '61.12000', '1'), ('08/11/2013 02:29:46 AM INFO', 'product1', '', '61.12000', '1'), ('08/12/2013 12:58:43 AM INFO', 'product1', '', '61.12000', '1'), ('08/12/2013 01:12:18 AM INFO', 'product1', '', '61.12000', '1'), ('08/13/2013 01:14:57 AM INFO', 'product1', '', '61.12000', '1'), ('08/10/2013 01:50:16 AM INFO', 'product1', '', '61.12000', '1'), ('08/10/2013 02:04:23 AM INFO', 'product1', '', '61.12000', '1'), ('08/11/2013 02:29:46 AM INFO', 'product1', '', '61.12000', '1'), ('08/12/2013 12:58:43 AM INFO', 'product1', '', '61.12000', '1'), ('08/12/2013 01:12:18 AM INFO', 'product1', '', '61.12000', '1'), ('08/13/2013 01:14:57 AM INFO', 'product1', '', '61.12000', '1'), ('08/14/2013 02:01:42 AM INFO', 'product1', '', '61.12000', '1'), ('08/14/2013 02:04:43 AM INFO', 'product1', '', '61.12000', '1'), ('08/15/2013 01:09:23 AM INFO', 'product1', '', '61.12000', '1'), ('08/15/2013 01:22:50 AM INFO', 'product1', '', '61.12000', '1'), ('08/16/2013 12:56:52 AM INFO', 'product1', '', '61.12000', '1'), ('08/16/2013 01:09:38 AM INFO', 'product1', '', '61.12000', '1'), 
('08/17/2013 12:54:20 AM INFO', 'product1', '', '61.12000', '1'), ('08/17/2013 01:07:51 AM INFO', 'product1', '', '61.12000', '1'), ('08/18/2013 12:54:14 AM INFO', 'product1', '', '61.12000', '1'), ('08/18/2013 01:09:37 AM INFO', 'product1', '', '61.12000', '1'), ('08/19/2013 12:54:13 AM INFO', 'product1', '', '61.12000', '1'), ('08/19/2013 01:10:06 AM INFO', 'product1', '', '61.12000', '1'), ('08/20/2013 02:09:17 AM INFO', 'product1', '', '61.12000', '1'), ('08/20/2013 02:25:56 AM INFO', 'product1', '', '61.12000', '1'), ('08/21/2013 01:21:03 AM INFO', 'product1', '', '61.12000', '1'), ('08/21/2013 01:34:59 AM INFO', 'product1', '', '61.12000', '1'), ('08/22/2013 01:32:54 AM INFO', 'product1', '', '61.12000', '1'), ('08/22/2013 01:55:25 AM INFO', 'product1', '', '61.12000', '1'), ('08/23/2013 01:23:44 AM INFO', 'product1', '', '61.12000', '1'), ('08/23/2013 01:41:08 AM INFO', 'product1', '', '61.12000', '1'), ('08/24/2013 01:17:46 AM INFO', 'product1', '', '61.12000', '1'), ('08/24/2013 01:31:12 AM INFO', 'product1', '', '61.12000', '1'), ('08/25/2013 12:57:21 AM INFO', 'product1', '', '61.12000', '1'), ('08/25/2013 01:10:55 AM INFO', 'product1', '', '61.12000', '1'), ('08/26/2013 12:56:37 AM INFO', 'product1', '', '61.12000', '1'), ('08/26/2013 01:11:03 AM INFO', 'product1', '', '61.12000', '1'), ('08/27/2013 01:00:15 AM INFO', 'product1', '', '61.12000', '1'), ('08/27/2013 01:13:09 AM INFO', 'product1', '', '61.12000', '1'), ('08/28/2013 01:07:21 AM INFO', 'product1', '', '61.12000', '1'), ('08/28/2013 01:24:13 AM INFO', 'product1', '', '61.12000', '1'), ('08/29/2013 12:57:08 AM INFO', 'product1', '', '61.12000', '1'), ('08/29/2013 01:10:57 AM INFO', 'product1', '', '61.12000', '1'), ('08/30/2013 12:56:22 AM INFO', 'product1', '', '61.12000', '1'), ('08/30/2013 01:10:43 AM INFO', 'product1', '', '61.12000', '1'), ('08/31/2013 12:53:37 AM INFO', 'product1', '', '61.12000', '1'), ('08/31/2013 01:08:01 AM INFO', 'product1', '', '61.12000', '1'), ('09/01/2013 
12:52:11 AM INFO', 'product1', '', '61.12000', '1'), ('09/01/2013 01:06:40 AM INFO', 'product1', '', '61.12000', '1'), ('09/02/2013 12:50:31 AM INFO', 'product1', '', '61.12000', '1'), ('09/02/2013 01:05:16 AM INFO', 'product1', '', '61.12000', '1'), ('09/03/2013 12:54:07 AM INFO', 'product1', '', '61.12000', '1'), ('09/03/2013 01:09:32 AM INFO', 'product1', '', '61.12000', '1'), ('09/04/2013 01:16:11 AM INFO', 'product1', '', '61.12000', '1'), ('09/05/2013 12:59:34 AM INFO', 'product1', '', '61.12000', '1'), ('09/06/2013 12:55:00 AM INFO', 'product1', '', '61.12000', '1'), ('09/07/2013 01:13:40 AM INFO', 'product1', '', '61.12000', '1'), ('09/09/2013 01:07:43 AM INFO', 'product1', '', '61.12000', '1')]"


Question (1): As you can see, this is a list of tuples in Python. The original data (textNotWork) actually contains more tuple elements (the string is longer), and I cannot read the text successfully. Does anyone know what is going on? How can I read a string that long?

Question (2): How can I turn it into a data frame with five variables (one of them appears to be an empty string) in R, so that I can convert it to a time series and analyze it?

Thanks


Solution

One way to transfer your Python structures (I think this approach works for any Python structure made of lists, tuples, strings, and numbers) is to save them from Python in JSON format and then read the file in R. You can do something like this:

python

import json

textNotWork = [('08/10/2013 01:50:16 AM INFO', ...]
# JSON has no tuple type, so each tuple is written as a JSON array
with open("testing.json", "w") as f:
    json.dump(textNotWork, f)
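
If only the printed string is available rather than the original list, it can be parsed back into a list before dumping it. The following is a minimal sketch, assuming the string is a valid Python literal; `repr_to_json` is a hypothetical helper name, and the temporary-directory path is only there so the example runs anywhere.

```python
import ast
import json
import os
import tempfile


def repr_to_json(text, path):
    """Parse the printed form of a list of tuples and save it as JSON.

    ast.literal_eval only evaluates literals (strings, numbers, tuples,
    lists, ...), so it does not execute arbitrary code from the string.
    """
    rows = ast.literal_eval(text)
    with open(path, "w") as f:
        json.dump(rows, f)
    return rows


# Two-element sample taken from the question's data
sample = (
    "[('08/10/2013 01:50:16 AM INFO', 'product1', '', '61.12000', '1'), "
    "('08/10/2013 02:04:23 AM INFO', 'product1', '', '61.12000', '1')]"
)

# Assumption: any writable location works; the answer uses "testing.json"
out_path = os.path.join(tempfile.gettempdir(), "testing.json")
rows = repr_to_json(sample, out_path)
```

The resulting file can then be read in R exactly as in the R step of this answer.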

R

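library(rjson)
# fromJSON returns a list of 5-element lists; flatten it and
# fill the matrix row by row so each tuple becomes one row
matrix(unlist(fromJSON(file='testing.json')),
       ncol=5, byrow=TRUE)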

 [1,] "08/10/2013 01:50:16 AM INFO" "product1" ""   "61.12000" "1" 
 [2,] "08/10/2013 02:04:23 AM INFO" "product1" ""   "61.12000" "1" 
 [3,] "08/11/2013 02:29:46 AM INFO" "product1" ""   "61.12000" "1" 
 [4,] "08/12/2013 12:58:43 AM INFO" "product1" ""   "61.12000" "1" 
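
For the time-series part of the question, the timestamp and the numeric fields could also be cleaned on the Python side before export. This is only a sketch under assumptions: the column names are hypothetical (the question does not name the five variables), the trailing word of the first field is treated as a log level, and CSV is used here instead of JSON so that R's `read.csv` can load the result directly as a data frame.

```python
import csv
import io
from datetime import datetime

# Hypothetical column names; the question does not say what the fields mean
COLUMNS = ["timestamp", "level", "product", "note", "price", "quantity"]


def to_records(rows):
    """Split 'MM/DD/YYYY HH:MM:SS AM LEVEL' and convert the numeric strings."""
    records = []
    for stamp, product, note, price, qty in rows:
        # The last word (e.g. 'INFO') is separated from the date and time
        date_part, level = stamp.rsplit(" ", 1)
        when = datetime.strptime(date_part, "%m/%d/%Y %I:%M:%S %p")
        records.append(
            [when.isoformat(sep=" "), level, product, note, float(price), int(qty)]
        )
    return records


def to_csv_text(rows):
    """Return the cleaned rows as CSV text with a header line."""
    buf = io.StringIO()
    writer = csv.writer(buf)
    writer.writerow(COLUMNS)
    writer.writerows(to_records(rows))
    return buf.getvalue()


rows = [
    ("08/10/2013 01:50:16 AM INFO", "product1", "", "61.12000", "1"),
    ("08/12/2013 12:58:43 AM INFO", "product1", "", "61.12000", "1"),
]
csv_text = to_csv_text(rows)
```

Writing `csv_text` to a file gives one row per tuple with an ISO-style timestamp, which R can then convert to a date-time column.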
License: CC-BY-SA with attribution
Not affiliated with StackOverflow