How do I expand the output display to see more columns of a Pandas DataFrame?

https://stackoverflow.com//questions/11707586

13-12-2019

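Question

Is there a way to widen the display of output in either interactive or script-execution mode?

Specifically, I am using the describe() function on a pandas DataFrame. When the DataFrame is 5 columns (labels) wide, I get the descriptive statistics that I want. However, if the DataFrame has any more columns, the statistics are suppressed and something like this is returned: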

>> Index: 8 entries, count to max  
>> Data columns:  
>> x1          8  non-null values  
>> x2          8  non-null values  
>> x3          8  non-null values  
>> x4          8  non-null values  
>> x5          8  non-null values  
>> x6          8  non-null values  
>> x7          8  non-null values

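The "8" value is given whether there are 6 or 7 columns. What does the "8" signify?

I have already tried dragging the IDLE window larger, as well as increasing the "Configure IDLE" width options, to no avail.

My purpose in using pandas and describe() is to avoid using a second program like Stata to do basic data manipulation and investigation.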

Solution

Update: pandas 0.23.4 onwards

This is not necessary: pandas auto-detects the size of your terminal window if you set pd.options.display.width = 0. (For older versions, see the bottom of this answer.)

pandas.set_printoptions(...) is deprecated. Instead, use pandas.set_option(optname, val), or equivalently pd.options.<opt.hierarchical.name> = val. For example:

import pandas as pd
pd.set_option('display.max_rows', 500)
pd.set_option('display.max_columns', 500)
pd.set_option('display.width', 1000)
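
The two spellings mentioned above (function-style and attribute-style) should address the same underlying option. A minimal sketch, assuming a reasonably recent pandas release; the values 500 and 50 are arbitrary and only serve to make the round trip visible:

```python
import pandas as pd

# Function-style setter, then read the value back through the attribute form.
pd.set_option('display.max_columns', 500)
via_attribute = pd.options.display.max_columns

# Attribute-style setter, then read the value back through get_option.
pd.options.display.max_columns = 50
current = pd.get_option('display.max_columns')

# reset_option puts the option back to the library default.
pd.reset_option('display.max_columns')
restored = pd.get_option('display.max_columns')

print(via_attribute, current, restored)
```

Calling reset_option at the end keeps an experiment like this from leaking into the rest of a session.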

Here is the help for set_option:

set_option(pat,value) - Sets the value of the specified option

Available options:
display.[chop_threshold, colheader_justify, column_space, date_dayfirst,
         date_yearfirst, encoding, expand_frame_repr, float_format, height,
         line_width, max_columns, max_colwidth, max_info_columns, max_info_rows,
         max_rows, max_seq_items, mpl_style, multi_sparse, notebook_repr_html,
         pprint_nest_depth, precision, width]
mode.[sim_interactive, use_inf_as_null]

Parameters
----------
pat - str/regexp which should match a single option.

Note: partial matches are supported for convenience, but unless you use the
full option name (e.g. x.y.z.option_name), your code may break in future
versions if new options with similar names are introduced.

value - new value of option.

Returns
-------
None

Raises
------
KeyError if no such option exists

display.chop_threshold: [default: None] [currently: None]
: float or None
        if set to a float value, all float values smaller then the given threshold
        will be displayed as exactly 0 by repr and friends.
display.colheader_justify: [default: right] [currently: right]
: 'left'/'right'
        Controls the justification of column headers. used by DataFrameFormatter.
display.column_space: [default: 12] [currently: 12]No description available.

display.date_dayfirst: [default: False] [currently: False]
: boolean
        When True, prints and parses dates with the day first, eg 20/01/2005
display.date_yearfirst: [default: False] [currently: False]
: boolean
        When True, prints and parses dates with the year first, eg 2005/01/20
display.encoding: [default: UTF-8] [currently: UTF-8]
: str/unicode
        Defaults to the detected encoding of the console.
        Specifies the encoding to be used for strings returned by to_string,
        these are generally strings meant to be displayed on the console.
display.expand_frame_repr: [default: True] [currently: True]
: boolean
        Whether to print out the full DataFrame repr for wide DataFrames
        across multiple lines, `max_columns` is still respected, but the output will
        wrap-around across multiple "pages" if it's width exceeds `display.width`.
display.float_format: [default: None] [currently: None]
: callable
        The callable should accept a floating point number and return
        a string with the desired format of the number. This is used
        in some places like SeriesFormatter.
        See core.format.EngFormatter for an example.
display.height: [default: 60] [currently: 1000]
: int
        Deprecated.
        (Deprecated, use `display.height` instead.)

display.line_width: [default: 80] [currently: 1000]
: int
        Deprecated.
        (Deprecated, use `display.width` instead.)

display.max_columns: [default: 20] [currently: 500]
: int
        max_rows and max_columns are used in __repr__() methods to decide if
        to_string() or info() is used to render an object to a string.  In case
        python/IPython is running in a terminal this can be set to 0 and pandas
        will correctly auto-detect the width the terminal and swap to a smaller
        format in case all columns would not fit vertically. The IPython notebook,
        IPython qtconsole, or IDLE do not run in a terminal and hence it is not
        possible to do correct auto-detection.
        'None' value means unlimited.
display.max_colwidth: [default: 50] [currently: 50]
: int
        The maximum width in characters of a column in the repr of
        a pandas data structure. When the column overflows, a "..."
        placeholder is embedded in the output.
display.max_info_columns: [default: 100] [currently: 100]
: int
        max_info_columns is used in DataFrame.info method to decide if
        per column information will be printed.
display.max_info_rows: [default: 1690785] [currently: 1690785]
: int or None
        max_info_rows is the maximum number of rows for which a frame will
        perform a null check on its columns when repr'ing To a console.
        The default is 1,000,000 rows. So, if a DataFrame has more
        1,000,000 rows there will be no null check performed on the
        columns and thus the representation will take much less time to
        display in an interactive session. A value of None means always
        perform a null check when repr'ing.
display.max_rows: [default: 60] [currently: 500]
: int
        This sets the maximum number of rows pandas should output when printing
        out various output. For example, this value determines whether the repr()
        for a dataframe prints out fully or just a summary repr.
        'None' value means unlimited.
display.max_seq_items: [default: None] [currently: None]
: int or None

        when pretty-printing a long sequence, no more then `max_seq_items`
        will be printed. If items are ommitted, they will be denoted by the addition
        of "..." to the resulting string.

        If set to None, the number of items to be printed is unlimited.
display.mpl_style: [default: None] [currently: None]
: bool

        Setting this to 'default' will modify the rcParams used by matplotlib
        to give plots a more pleasing visual style by default.
        Setting this to None/False restores the values to their initial value.
display.multi_sparse: [default: True] [currently: True]
: boolean
        "sparsify" MultiIndex display (don't display repeated
        elements in outer levels within groups)
display.notebook_repr_html: [default: True] [currently: True]
: boolean
        When True, IPython notebook will use html representation for
        pandas objects (if it is available).
display.pprint_nest_depth: [default: 3] [currently: 3]
: int
        Controls the number of nested levels to process when pretty-printing
display.precision: [default: 7] [currently: 7]
: int
        Floating point output precision (number of significant digits). This is
        only a suggestion
display.width: [default: 80] [currently: 1000]
: int
        Width of the display in characters. In case python/IPython is running in
        a terminal this can be set to None and pandas will correctly auto-detect the
        width.
        Note that the IPython notebook, IPython qtconsole, or IDLE do not run in a
        terminal and hence it is not possible to correctly detect the width.
mode.sim_interactive: [default: False] [currently: False]
: boolean
        Whether to simulate interactive mode for purposes of testing
mode.use_inf_as_null: [default: False] [currently: False]
: boolean
        True means treat None, NaN, INF, -INF as null (old way),
        False means None and NaN are null, but INF, -INF are not null
        (new way).
Call def:   pd.set_option(self, *args, **kwds)
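
The listing above was captured from an old pandas release, so defaults and available options may differ on your installation. As a rough sketch, the description of a single option can be looked up with pd.describe_option; the choice of display.max_colwidth here is just an example:

```python
import pandas as pd

# Print the description, default, and current value of one option.
pd.describe_option('display.max_colwidth')

# With _print_desc=False the same text is returned as a string instead of printed.
help_text = pd.describe_option('display.max_colwidth', _print_desc=False)
```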

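EDIT: older version information; much of this has been deprecated.

As @bmu mentioned, pandas auto-detects (by default) the size of the display area, and a summary view is used when an object's repr does not fit on the display. You mentioned resizing the IDLE window, to no effect. If you do print(df.describe().to_string()), does it fit in the IDLE window?

The terminal size is determined by pandas.util.terminal.get_terminal_size() (deprecated and removed), which returns a tuple containing the (width, height) of the display. Does the output match the size of your IDLE window? There might be an issue (there was one before when running a terminal in emacs).

Note that it is possible to bypass the auto-detection: with pandas.set_printoptions(max_rows=200, max_columns=10), pandas never switches to the summary view as long as the number of rows and columns does not exceed the given limits.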
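
Because set_printoptions no longer exists in current pandas, the same limits would presumably be expressed through set_option. A minimal sketch, assuming a recent pandas release; the 200 and 10 limits are taken from the call above, while the small frame and the width of 1000 are made up for illustration:

```python
import pandas as pd

# Hypothetical frame: 8 rows and 7 columns, well inside the limits set below.
df = pd.DataFrame({f"x{i}": range(8) for i in range(1, 8)})

pd.set_option('display.max_rows', 200)
pd.set_option('display.max_columns', 10)
pd.set_option('display.width', 1000)  # wide enough that the table is not wrapped

text = repr(df.describe())
print(text)
```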

The 'max_colwidth' option helps in seeing the untruncated form of each column.

Other tips

Try this:

pd.set_option('display.expand_frame_repr', False)

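From the documentation:

display.expand_frame_repr : boolean

Whether to print out the full DataFrame repr for wide DataFrames across multiple lines. max_columns is still respected, but the output will wrap around across multiple "pages" if its width exceeds display.width. [default: True] [currently: True]

See: http://pandas.pydata.org/pandas-docs/stable/generated/pandas.set_option.html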
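
To see what the option changes in practice, the sketch below prints the same frame with wrapping on and off. The frame, its column names, and the deliberately narrow width of 40 are assumptions chosen so the effect shows up in any console:

```python
import pandas as pd

# Hypothetical wide frame: 3 rows, 10 columns with fairly long names.
columns = [f"column_{i}" for i in range(10)]
df = pd.DataFrame([[r * 10 + c for c in range(10)] for r in range(3)], columns=columns)

# A narrow width makes the wrapping behaviour visible.
common = ('display.width', 40, 'display.max_columns', None)

with pd.option_context(*common, 'display.expand_frame_repr', True):
    wrapped = repr(df)   # split into several "pages", continued with a backslash

with pd.option_context(*common, 'display.expand_frame_repr', False):
    flat = repr(df)      # one long line per row, however wide

print(wrapped)
print(flat)
```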

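To temporarily set options to display one large DataFrame, you can use option_context:

# None means "no limit"; newer pandas versions may reject negative values such as -1.
with pd.option_context('display.max_rows', None, 'display.max_columns', 5):
    print(df)

Option values are restored automatically when you exit the with block.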
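
The restore-on-exit behaviour can be checked directly. This is a small sketch that only reads the option before, inside, and after the block; the value 5 is arbitrary:

```python
import pandas as pd

before = pd.get_option('display.max_columns')

with pd.option_context('display.max_columns', 5):
    inside = pd.get_option('display.max_columns')   # temporary value

after = pd.get_option('display.max_columns')        # back to the earlier value

print(before, inside, after)
```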

Using only these 3 lines worked for me:

pd.set_option('display.max_columns', None)
pd.set_option('display.expand_frame_repr', False)
pd.set_option('display.max_colwidth', -1)  # on pandas 1.0 and later, pass None instead of -1

Anaconda / Python 3.6.5 / pandas 0.23.0 / Visual Studio Code 1.26

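Set the maximum column width using:

pd.set_option('display.max_colwidth', 800)

This statement sets the maximum width of each column to 800 characters (the option is measured in characters, not pixels).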
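
A quick way to see that the limit counts characters is to print one long cell under two settings. This is a sketch with a made-up sentence; the width and max_columns settings are only there so nothing else interferes with the comparison:

```python
import pandas as pd

long_text = "pandas truncates long cell values when it prints a frame"
df = pd.DataFrame({"text": [long_text]})

wide = ('display.width', 200, 'display.max_columns', None)

with pd.option_context(*wide, 'display.max_colwidth', 20):
    short = repr(df)   # cell cut to at most 20 characters, ending in "..."

with pd.option_context(*wide, 'display.max_colwidth', 800):
    full = repr(df)    # 800 characters is more than enough for this cell

print(short)
print(full)
```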

You can adjust pandas print options with set_printoptions.

In [3]: df.describe()
Out[3]: 
<class 'pandas.core.frame.DataFrame'>
Index: 8 entries, count to max
Data columns:
x1    8  non-null values
x2    8  non-null values
x3    8  non-null values
x4    8  non-null values
x5    8  non-null values
x6    8  non-null values
x7    8  non-null values
dtypes: float64(7)

In [4]: pd.set_printoptions(precision=2)

In [5]: df.describe()
Out[5]: 
            x1       x2       x3       x4       x5       x6       x7
count      8.0      8.0      8.0      8.0      8.0      8.0      8.0
mean   69024.5  69025.5  69026.5  69027.5  69028.5  69029.5  69030.5
std       17.1     17.1     17.1     17.1     17.1     17.1     17.1
min    69000.0  69001.0  69002.0  69003.0  69004.0  69005.0  69006.0
25%    69012.2  69013.2  69014.2  69015.2  69016.2  69017.2  69018.2
50%    69024.5  69025.5  69026.5  69027.5  69028.5  69029.5  69030.5
75%    69036.8  69037.8  69038.8  69039.8  69040.8  69041.8  69042.8
max    69049.0  69050.0  69051.0  69052.0  69053.0  69054.0  69055.0

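However, this will not work in all cases, because pandas detects your console width and only uses to_string if the output fits in the console (see the docstring of set_printoptions). In that case you can call to_string explicitly, as in the answer by BrenBarn.

Update

With version 0.10, the way wide dataframes are printed changed: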

In [3]: df.describe()
Out[3]: 
                 x1            x2            x3            x4            x5  \
count      8.000000      8.000000      8.000000      8.000000      8.000000   
mean   59832.361578  27356.711336  49317.281222  51214.837838  51254.839690   
std    22600.723536  26867.192716  28071.737509  21012.422793  33831.515761   
min    31906.695474   1648.359160     56.378115  16278.322271     43.745574   
25%    45264.625201  12799.540572  41429.628749  40374.273582  29789.643875   
50%    56340.214856  18666.456293  51995.661512  54894.562656  47667.684422   
75%    75587.003417  31375.610322  61069.190523  67811.893435  76014.884048   
max    98136.474782  84544.484627  91743.983895  75154.587156  99012.695717   

                 x6            x7  
count      8.000000      8.000000  
mean   41863.000717  33950.235126  
std    38709.468281  29075.745673  
min     3590.990740   1833.464154  
25%    15145.759625   6879.523949  
50%    22139.243042  33706.029946  
75%    72038.983496  51449.893980  
max    98601.190488  83309.051963

Further, the API for setting pandas options changed:

In [4]: pd.set_option('display.precision', 2)

In [5]: df.describe()
Out[5]: 
            x1       x2       x3       x4       x5       x6       x7
count      8.0      8.0      8.0      8.0      8.0      8.0      8.0
mean   59832.4  27356.7  49317.3  51214.8  51254.8  41863.0  33950.2
std    22600.7  26867.2  28071.7  21012.4  33831.5  38709.5  29075.7
min    31906.7   1648.4     56.4  16278.3     43.7   3591.0   1833.5
25%    45264.6  12799.5  41429.6  40374.3  29789.6  15145.8   6879.5
50%    56340.2  18666.5  51995.7  54894.6  47667.7  22139.2  33706.0
75%    75587.0  31375.6  61069.2  67811.9  76014.9  72039.0  51449.9
max    98136.5  84544.5  91744.0  75154.6  99012.7  98601.2  83309.1

You can use print(df.describe().to_string()) to force it to show the whole table. (You can use to_string() like this for any DataFrame. The result of describe is just a DataFrame itself.)

The 8 is the number of rows in the DataFrame holding the "description" (because describe computes 8 statistics: min, max, mean, etc.).
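
Both points can be checked on a small frame. The data below is invented to resemble the question's seven numeric columns; only describe() and to_string() come from the answer:

```python
import pandas as pd

# Hypothetical data shaped like the question's frame: 7 numeric columns.
df = pd.DataFrame({f"x{i}": [69000 + i + step for step in range(50)] for i in range(1, 8)})

desc = df.describe()
print(desc.shape)         # 8 statistics (rows) for each of the 7 columns
print(list(desc.index))   # count, mean, std, min, 25%, 50%, 75%, max

# to_string() renders every column on one line per row, without wrapping.
table = desc.to_string()
print(table)
```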

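You can set the output display to match your current terminal width. The pd.util.terminal.get_terminal_size() helper used in the original version of this answer has since been removed from pandas; the standard library's shutil.get_terminal_size() serves the same purpose:

import shutil
pd.set_option('display.width', shutil.get_terminal_size().columns)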

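According to the docs for v0.18.0, if you are running in a terminal (i.e. not the IPython notebook, qtconsole, or IDLE), it takes two lines to have pandas auto-detect your screen width and adapt on the fly to how many columns it shows: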

pd.set_option('display.large_repr', 'truncate')
pd.set_option('display.max_columns', 0)

All of the above answers seem to solve the problem. One more point: instead of pd.set_option('option_name', value), you can use the (auto-complete-able) attribute form:

pd.options.display.width = None

See the pandas docs, Options and Settings:

Options have a full "dotted-style", case-insensitive name (e.g. display.max_rows). You can get/set options directly as attributes of the top-level options attribute:

In [1]: import pandas as pd

In [2]: pd.options.display.max_rows
Out[2]: 15

In [3]: pd.options.display.max_rows = 999

In [4]: pd.options.display.max_rows
Out[4]: 999

[...]

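For the max_... params:

max_rows and max_columns are used in __repr__() methods to decide if to_string() or info() is used to render an object to a string. In case python/IPython is running in a terminal this can be set to 0 and pandas will correctly auto-detect the width of the terminal and swap to a smaller format in case all columns would not fit vertically. The IPython notebook, IPython qtconsole, or IDLE do not run in a terminal and hence it is not possible to do correct auto-detection. 'None' value means unlimited.

For the width param:

Width of the display in characters. In case python/IPython is running in a terminal this can be set to None and pandas will correctly auto-detect the width. Note that the IPython notebook, IPython qtconsole, or IDLE do not run in a terminal and hence it is not possible to correctly detect the width.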

import pandas as pd
pd.set_option('display.max_columns', 100)
pd.set_option('display.width', 1000)

SentenceA = "William likes Piano and Piano likes William"
SentenceB = "Sara likes Guitar"
SentenceC = "Mamoosh likes Piano"
SentenceD = "William is a CS Student"
SentenceE = "Sara is kind"
SentenceF = "Mamoosh is kind"


bowA = SentenceA.split(" ")
bowB = SentenceB.split(" ")
bowC = SentenceC.split(" ")
bowD = SentenceD.split(" ")
bowE = SentenceE.split(" ")
bowF = SentenceF.split(" ")

# Create a set consisting of all words

wordSet = set(bowA).union(set(bowB)).union(set(bowC)).union(set(bowD)).union(set(bowE)).union(set(bowF))
print("Set of all words is: ", wordSet)

# Initialize a dictionary with a count of 0 for every word, one per bag of words

wordDictA = dict.fromkeys(wordSet, 0)
wordDictB = dict.fromkeys(wordSet, 0)
wordDictC = dict.fromkeys(wordSet, 0)
wordDictD = dict.fromkeys(wordSet, 0)
wordDictE = dict.fromkeys(wordSet, 0)
wordDictF = dict.fromkeys(wordSet, 0)

for word in bowA:
    wordDictA[word] += 1
for word in bowB:
    wordDictB[word] += 1
for word in bowC:
    wordDictC[word] += 1
for word in bowD:
    wordDictD[word] += 1
for word in bowE:
    wordDictE[word] += 1
for word in bowF:
    wordDictF[word] += 1

# Printing Term frequency

print("SentenceA TF: ", wordDictA)
print("SentenceB TF: ", wordDictB)
print("SentenceC TF: ", wordDictC)
print("SentenceD TF: ", wordDictD)
print("SentenceE TF: ", wordDictE)
print("SentenceF TF: ", wordDictF)

print(pd.DataFrame([wordDictA, wordDictB, wordDictC, wordDictD, wordDictE, wordDictF]))

Output:

   CS  Guitar  Mamoosh  Piano  Sara  Student  William  a  and  is  kind  likes
0   0       0        0      2     0        0        2  0    1   0     0      2
1   0       1        0      0     1        0        0  0    0   0     0      1
2   0       0        1      1     0        0        0  0    0   0     0      1
3   1       0        0      0     0        1        1  1    0   1     0      0
4   0       0        0      0     1        0        0  0    0   1     1      0
5   0       0        1      0     0        0        0  0    0   1     1      0

I used these settings when the data was large.

# environment settings:
pd.set_option('display.max_columns', None)
pd.set_option('display.max_rows', None)
pd.set_option('display.max_seq_items', None)
pd.set_option('display.max_colwidth', 500)
pd.set_option('display.expand_frame_repr', True)

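You can refer to the documentation.

If you don't want to mess with your display options and just want to see this one particular list of columns without expanding every dataframe you view, you could try: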

df.columns.values

You could also try it in a loop:

for col in df.columns: 
    print(col)
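
As a variation on the two snippets above, converting the column index to a plain Python list avoids any elision when the names are printed. A small sketch with a hypothetical 30-column frame:

```python
import pandas as pd

# Hypothetical frame with 30 columns named c0 ... c29.
df = pd.DataFrame([list(range(30))], columns=[f"c{i}" for i in range(30)])

names = df.columns.tolist()   # plain list, so every name is printed
print(names)
```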

The line below is enough to display all columns of a dataframe:

pd.set_option('display.max_columns', None)
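
The effect of that setting can be compared against a finite limit. In this sketch the 30-column frame, the limit of 20, and the width of 1000 are assumptions picked to make the difference obvious:

```python
import pandas as pd

# Hypothetical frame with 30 columns named c0 ... c29.
df = pd.DataFrame([list(range(30))], columns=[f"c{i}" for i in range(30)])

with pd.option_context('display.max_columns', 20, 'display.width', 1000):
    truncated = repr(df)   # middle columns are replaced by "..."

with pd.option_context('display.max_columns', None, 'display.width', 1000):
    complete = repr(df)    # every column is printed

print(truncated)
print(complete)
```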

You can simply follow these steps:

You can change the pandas max_columns option as follows:
```
import pandas as pd
pd.options.display.max_columns = 10
```
(This allows 10 columns to be displayed; change the value as needed.)
Likewise, you can change the number of rows displayed as follows (if you also need to change the maximum number of rows):
```
pd.options.display.max_rows = 999
```
(This allows up to 999 rows to be printed at a time.)
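
To illustrate the row limit, the sketch below prints a hypothetical 100-row frame under a small limit and under the 999 used above; the limit of 10 is an arbitrary choice for contrast:

```python
import pandas as pd

# Hypothetical frame with 100 rows and a single column.
df = pd.DataFrame({"value": range(100)})

with pd.option_context('display.max_rows', 10):
    short = repr(df)   # head and tail only, with an elided middle

with pd.option_context('display.max_rows', 999):
    full = repr(df)    # all 100 rows, since 100 is below the limit

print(short)
print(len(full.splitlines()))
```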

Please refer to the documentation to change other pandas options and settings.

License: CC-BY-SA with attribution

Not affiliated with StackOverflow