How do I expand the output display to see more columns of a pandas DataFrame?

https://stackoverflow.com//questions/11707586

13-12-2019

Question

Is there a way to widen the output display in either interactive or script-execution mode?

Specifically, I am using the describe() function on a pandas DataFrame. When the DataFrame is 5 columns (labels) wide, I get the descriptive statistics that I want. However, if the DataFrame has any more columns, the statistics are suppressed and something like this is returned:

>> Index: 8 entries, count to max  
>> Data columns:  
>> x1          8  non-null values  
>> x2          8  non-null values  
>> x3          8  non-null values  
>> x4          8  non-null values  
>> x5          8  non-null values  
>> x6          8  non-null values  
>> x7          8  non-null values

The "8" value is given whether there are 6 or 7 columns. What does the "8" refer to?

I have already tried dragging the IDLE window larger, as well as increasing the "Configure IDLE" width options, to no avail.

My purpose in using pandas and describe() is to avoid using a second program like Stata to do basic data manipulation and investigation.

Solution

Update: pandas 0.23.4 onwards

This is not necessary: pandas autodetects the size of your terminal window if you set pd.options.display.width = 0. (For older versions, see the bottom of this answer.)

pandas.set_printoptions(...) is deprecated. Instead, use pandas.set_option(optname, val), or equivalently pd.options.<opt.hierarchical.name> = val. For example:

import pandas as pd
pd.set_option('display.max_rows', 500)
pd.set_option('display.max_columns', 500)
pd.set_option('display.width', 1000)

Here is the help for set_option:

set_option(pat,value) - Sets the value of the specified option

Available options:
display.[chop_threshold, colheader_justify, column_space, date_dayfirst,
         date_yearfirst, encoding, expand_frame_repr, float_format, height,
         line_width, max_columns, max_colwidth, max_info_columns, max_info_rows,
         max_rows, max_seq_items, mpl_style, multi_sparse, notebook_repr_html,
         pprint_nest_depth, precision, width]
mode.[sim_interactive, use_inf_as_null]

Parameters
----------
pat - str/regexp which should match a single option.

Note: partial matches are supported for convenience, but unless you use the
full option name (e.g. x.y.z.option_name), your code may break in future
versions if new options with similar names are introduced.

value - new value of option.

Returns
-------
None

Raises
------
KeyError if no such option exists

display.chop_threshold: [default: None] [currently: None]
: float or None
        if set to a float value, all float values smaller then the given threshold
        will be displayed as exactly 0 by repr and friends.
display.colheader_justify: [default: right] [currently: right]
: 'left'/'right'
        Controls the justification of column headers. used by DataFrameFormatter.
display.column_space: [default: 12] [currently: 12]No description available.

display.date_dayfirst: [default: False] [currently: False]
: boolean
        When True, prints and parses dates with the day first, eg 20/01/2005
display.date_yearfirst: [default: False] [currently: False]
: boolean
        When True, prints and parses dates with the year first, eg 2005/01/20
display.encoding: [default: UTF-8] [currently: UTF-8]
: str/unicode
        Defaults to the detected encoding of the console.
        Specifies the encoding to be used for strings returned by to_string,
        these are generally strings meant to be displayed on the console.
display.expand_frame_repr: [default: True] [currently: True]
: boolean
        Whether to print out the full DataFrame repr for wide DataFrames
        across multiple lines, `max_columns` is still respected, but the output will
        wrap-around across multiple "pages" if it's width exceeds `display.width`.
display.float_format: [default: None] [currently: None]
: callable
        The callable should accept a floating point number and return
        a string with the desired format of the number. This is used
        in some places like SeriesFormatter.
        See core.format.EngFormatter for an example.
display.height: [default: 60] [currently: 1000]
: int
        Deprecated.
        (Deprecated, use `display.height` instead.)

display.line_width: [default: 80] [currently: 1000]
: int
        Deprecated.
        (Deprecated, use `display.width` instead.)

display.max_columns: [default: 20] [currently: 500]
: int
        max_rows and max_columns are used in __repr__() methods to decide if
        to_string() or info() is used to render an object to a string.  In case
        python/IPython is running in a terminal this can be set to 0 and pandas
        will correctly auto-detect the width the terminal and swap to a smaller
        format in case all columns would not fit vertically. The IPython notebook,
        IPython qtconsole, or IDLE do not run in a terminal and hence it is not
        possible to do correct auto-detection.
        'None' value means unlimited.
display.max_colwidth: [default: 50] [currently: 50]
: int
        The maximum width in characters of a column in the repr of
        a pandas data structure. When the column overflows, a "..."
        placeholder is embedded in the output.
display.max_info_columns: [default: 100] [currently: 100]
: int
        max_info_columns is used in DataFrame.info method to decide if
        per column information will be printed.
display.max_info_rows: [default: 1690785] [currently: 1690785]
: int or None
        max_info_rows is the maximum number of rows for which a frame will
        perform a null check on its columns when repr'ing To a console.
        The default is 1,000,000 rows. So, if a DataFrame has more
        1,000,000 rows there will be no null check performed on the
        columns and thus the representation will take much less time to
        display in an interactive session. A value of None means always
        perform a null check when repr'ing.
display.max_rows: [default: 60] [currently: 500]
: int
        This sets the maximum number of rows pandas should output when printing
        out various output. For example, this value determines whether the repr()
        for a dataframe prints out fully or just a summary repr.
        'None' value means unlimited.
display.max_seq_items: [default: None] [currently: None]
: int or None

        when pretty-printing a long sequence, no more then `max_seq_items`
        will be printed. If items are ommitted, they will be denoted by the addition
        of "..." to the resulting string.

        If set to None, the number of items to be printed is unlimited.
display.mpl_style: [default: None] [currently: None]
: bool

        Setting this to 'default' will modify the rcParams used by matplotlib
        to give plots a more pleasing visual style by default.
        Setting this to None/False restores the values to their initial value.
display.multi_sparse: [default: True] [currently: True]
: boolean
        "sparsify" MultiIndex display (don't display repeated
        elements in outer levels within groups)
display.notebook_repr_html: [default: True] [currently: True]
: boolean
        When True, IPython notebook will use html representation for
        pandas objects (if it is available).
display.pprint_nest_depth: [default: 3] [currently: 3]
: int
        Controls the number of nested levels to process when pretty-printing
display.precision: [default: 7] [currently: 7]
: int
        Floating point output precision (number of significant digits). This is
        only a suggestion
display.width: [default: 80] [currently: 1000]
: int
        Width of the display in characters. In case python/IPython is running in
        a terminal this can be set to None and pandas will correctly auto-detect the
        width.
        Note that the IPython notebook, IPython qtconsole, or IDLE do not run in a
        terminal and hence it is not possible to correctly detect the width.
mode.sim_interactive: [default: False] [currently: False]
: boolean
        Whether to simulate interactive mode for purposes of testing
mode.use_inf_as_null: [default: False] [currently: False]
: boolean
        True means treat None, NaN, INF, -INF as null (old way),
        False means None and NaN are null, but INF, -INF are not null
        (new way).
Call def:   pd.set_option(self, *args, **kwds)
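As a small sketch of how these options can be inspected and undone (assuming a reasonably recent pandas, where `pd.get_option` and `pd.reset_option` belong to the same options API as `set_option`), the following sets the three options from the answer, reads them back by their full names, and restores one of them to its default:

```python
import pandas as pd

# Set the display options used in the answer above.
pd.set_option('display.max_rows', 500)
pd.set_option('display.max_columns', 500)
pd.set_option('display.width', 1000)

# Read the current values back using the full option names.
current = {
    name: pd.get_option(name)
    for name in ('display.max_rows', 'display.max_columns', 'display.width')
}
print(current)

# Restore a single option to the library default.
pd.reset_option('display.width')
default_width = pd.get_option('display.width')
print(default_width)
```

Using the full option name, as the help text recommends, avoids surprises if a future version adds an option with a similar name.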

EDIT: information for older versions; much of this has been deprecated.

As @bmu mentioned, pandas auto-detects (by default) the size of the display area, and a summary view is used when an object's repr does not fit on the display. You mentioned resizing the IDLE window, to no effect. If you run print(df.describe().to_string()), does it fit in the IDLE window?

The terminal size is determined by pandas.util.terminal.get_terminal_size() (deprecated and removed), which returns a tuple containing the (width, height) of the display. Does the output match the size of your IDLE window? There might be an issue (there was one before when running a terminal in emacs).

Note that it is possible to bypass the autodetection: pandas.set_printoptions(max_rows=200, max_columns=10) will never switch to the summary view if the number of rows and columns does not exceed the given limits.

The 'max_colwidth' option helps in seeing the untruncated form of each column.

Other tips

Try this:

pd.set_option('display.expand_frame_repr', False)

From the documentation:

display.expand_frame_repr : boolean

Whether to print out the full DataFrame repr for wide DataFrames across multiple lines. max_columns is still respected, but the output will wrap around across multiple "pages" if its width exceeds display.width. [default: True] [currently: True]

See: http://pandas.pydata.org/pandas-docs/stable/generated/pandas.set_option.html
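To make the effect of this option concrete, here is a minimal sketch. The frame and its column names are made up for illustration, and the snippet assumes it runs as a plain script, so pandas uses `display.width` rather than a detected terminal size. It renders the same wide frame once with wrapping enabled and once with it disabled:

```python
import pandas as pd

# A hypothetical wide frame: 3 rows, 30 columns.
df = pd.DataFrame([[0] * 30] * 3, columns=[f'col_{i}' for i in range(30)])

# Wrapping enabled: the repr is split into several "pages" of columns.
with pd.option_context('display.max_columns', None,
                       'display.width', 80,
                       'display.expand_frame_repr', True):
    wrapped = repr(df)

# Wrapping disabled: every row stays on a single (long) line.
with pd.option_context('display.max_columns', None,
                       'display.width', 80,
                       'display.expand_frame_repr', False):
    unwrapped = repr(df)

print(wrapped)
print(unwrapped)
```

With wrapping disabled the output is one header line plus one line per row, which is convenient in a console or editor that scrolls horizontally.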

If you want to set options temporarily to display one large DataFrame, you can use option_context:

# None means "no limit"; newer pandas versions may reject negative values here.
with pd.option_context('display.max_rows', None, 'display.max_columns', 5):
    print(df)

Option values are restored automatically when you exit the with block.
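The restore behaviour can be checked directly. This is only a sketch, and the variable names `before`, `inside` and `after` are chosen for illustration:

```python
import pandas as pd

# Remember whatever value is currently configured.
before = pd.get_option('display.max_columns')

with pd.option_context('display.max_columns', None):
    # Inside the block the temporary value is active (None: no column limit).
    inside = pd.get_option('display.max_columns')

# After the block the previous value is back in place.
after = pd.get_option('display.max_columns')
print(before, inside, after)
```

This makes option_context a good fit for a one-off wide printout inside a script that should otherwise keep its normal display settings.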

Using only these 3 lines worked for me:

pd.set_option('display.max_columns', None)
pd.set_option('display.expand_frame_repr', False)
# -1 disabled the limit in pandas 0.23; newer versions expect None instead
pd.set_option('max_colwidth', -1)

Anaconda / Python 3.6.5 / pandas: 0.23.0 / Visual Studio Code 1.26

Set the maximum column width using:

pd.set_option('max_colwidth', 800)

This particular statement sets the maximum width to 800 characters per column.
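As an illustrative sketch of what this limit does (the one-column frame below is hypothetical), a cell longer than `display.max_colwidth` is cut and marked with "...", while a limit larger than the cell shows it in full:

```python
import pandas as pd

# Hypothetical frame holding one 100-character string.
df = pd.DataFrame({'text': ['x' * 100]})

# A small limit truncates the cell and marks the cut with "...".
with pd.option_context('display.max_colwidth', 20, 'display.width', 1000):
    truncated = repr(df)

# A limit larger than the string shows the cell in full.
with pd.option_context('display.max_colwidth', 800, 'display.width', 1000):
    full = repr(df)

print(truncated)
print(full)
```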

In older pandas versions you could adjust the print options with set_printoptions.

In [3]: df.describe()
Out[3]: 
<class 'pandas.core.frame.DataFrame'>
Index: 8 entries, count to max
Data columns:
x1    8  non-null values
x2    8  non-null values
x3    8  non-null values
x4    8  non-null values
x5    8  non-null values
x6    8  non-null values
x7    8  non-null values
dtypes: float64(7)

In [4]: pd.set_printoptions(precision=2)

In [5]: df.describe()
Out[5]: 
            x1       x2       x3       x4       x5       x6       x7
count      8.0      8.0      8.0      8.0      8.0      8.0      8.0
mean   69024.5  69025.5  69026.5  69027.5  69028.5  69029.5  69030.5
std       17.1     17.1     17.1     17.1     17.1     17.1     17.1
min    69000.0  69001.0  69002.0  69003.0  69004.0  69005.0  69006.0
25%    69012.2  69013.2  69014.2  69015.2  69016.2  69017.2  69018.2
50%    69024.5  69025.5  69026.5  69027.5  69028.5  69029.5  69030.5
75%    69036.8  69037.8  69038.8  69039.8  69040.8  69041.8  69042.8
max    69049.0  69050.0  69051.0  69052.0  69053.0  69054.0  69055.0

However, this will not work in all cases, because pandas detects your console width and only uses to_string if the output fits in the console (see the docstring of set_printoptions). In this case you can explicitly call to_string, as answered by BrenBarn.

Update

With version 0.10 the way wide dataframes are printed changed:

In [3]: df.describe()
Out[3]: 
                 x1            x2            x3            x4            x5  \
count      8.000000      8.000000      8.000000      8.000000      8.000000   
mean   59832.361578  27356.711336  49317.281222  51214.837838  51254.839690   
std    22600.723536  26867.192716  28071.737509  21012.422793  33831.515761   
min    31906.695474   1648.359160     56.378115  16278.322271     43.745574   
25%    45264.625201  12799.540572  41429.628749  40374.273582  29789.643875   
50%    56340.214856  18666.456293  51995.661512  54894.562656  47667.684422   
75%    75587.003417  31375.610322  61069.190523  67811.893435  76014.884048   
max    98136.474782  84544.484627  91743.983895  75154.587156  99012.695717   

                 x6            x7  
count      8.000000      8.000000  
mean   41863.000717  33950.235126  
std    38709.468281  29075.745673  
min     3590.990740   1833.464154  
25%    15145.759625   6879.523949  
50%    22139.243042  33706.029946  
75%    72038.983496  51449.893980  
max    98601.190488  83309.051963

Furthermore, the API for setting pandas options changed:

In [4]: pd.set_option('display.precision', 2)

In [5]: df.describe()
Out[5]: 
            x1       x2       x3       x4       x5       x6       x7
count      8.0      8.0      8.0      8.0      8.0      8.0      8.0
mean   59832.4  27356.7  49317.3  51214.8  51254.8  41863.0  33950.2
std    22600.7  26867.2  28071.7  21012.4  33831.5  38709.5  29075.7
min    31906.7   1648.4     56.4  16278.3     43.7   3591.0   1833.5
25%    45264.6  12799.5  41429.6  40374.3  29789.6  15145.8   6879.5
50%    56340.2  18666.5  51995.7  54894.6  47667.7  22139.2  33706.0
75%    75587.0  31375.6  61069.2  67811.9  76014.9  72039.0  51449.9
max    98136.5  84544.5  91744.0  75154.6  99012.7  98601.2  83309.1

You can use print(df.describe().to_string()) to force it to show the whole table. (You can use to_string() like this for any DataFrame. The result of describe is just a DataFrame itself.)

The 8 is the number of rows in the DataFrame holding the "description" (because describe computes 8 statistics: min, max, mean, etc.).
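A quick way to see where the 8 comes from is to look at the index of the summary frame. The data below is invented for illustration; only the shape matters:

```python
import pandas as pd

# Hypothetical numeric frame with 7 columns, like the one in the question.
df = pd.DataFrame({f'x{i}': list(range(i, i + 8)) for i in range(1, 8)})

desc = df.describe()
print(list(desc.index))  # the statistics that describe() computes
print(desc.shape)        # (rows, columns) of the summary frame

# to_string() renders the table without truncating rows or columns.
print(desc.to_string())
```

The row count of the summary stays at 8 however many columns the original frame has, which is why the question saw the same number for 6 and 7 columns.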

You can set the output display to match your current terminal width:

pd.set_option('display.width', pd.util.terminal.get_terminal_size()[0])
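Since pandas.util.terminal is noted earlier in this thread as deprecated and removed, one possible substitute is the standard library's `shutil.get_terminal_size()`. This is a swapped-in technique, not what the original answer used:

```python
import shutil

import pandas as pd

# Ask the standard library for the terminal size; when no terminal is
# attached it uses its fallback default of 80 columns by 24 lines.
width = shutil.get_terminal_size().columns

pd.set_option('display.width', width)
print(pd.get_option('display.width'))
```

The value is read once, so it does not follow later window resizes; the auto-detection options described elsewhere in this thread are the alternative for that.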

According to the docs for v0.18.0, if you are running in a terminal (i.e. not IPython notebook, qtconsole or IDLE), it is a two-liner to have pandas auto-detect your screen width and adapt on the fly how many columns it shows:

pd.set_option('display.large_repr', 'truncate')
pd.set_option('display.max_columns', 0)

It seems that all of the answers above solve the problem. One more point: instead of pd.set_option('option_name'), you can use the (auto-complete-able)

pd.options.display.width = None

See the pandas doc, Options and Settings:

Options have a full "dotted-style", case-insensitive name (e.g. display.max_rows). You can get/set options directly as attributes of the top-level options attribute:
In [1]: import pandas as pd

In [2]: pd.options.display.max_rows
Out[2]: 15

In [3]: pd.options.display.max_rows = 999

In [4]: pd.options.display.max_rows
Out[4]: 999

[...]

for the max_... params:

max_rows and max_columns are used in __repr__() methods to decide if to_string() or info() is used to render an object to a string. In case python/IPython is running in a terminal this can be set to 0 and pandas will correctly auto-detect the width of the terminal and swap to a smaller format in case all columns would not fit vertically. The IPython notebook, IPython qtconsole, or IDLE do not run in a terminal and hence it is not possible to do correct auto-detection. 'None' value means unlimited. [emphasis in the original]

for the width param:

Width of the display in characters. In case python/IPython is running in a terminal this can be set to None and pandas will correctly auto-detect the width. Note that the IPython notebook, IPython qtconsole, or IDLE do not run in a terminal and hence it is not possible to correctly detect the width.

import pandas as pd
pd.set_option('display.max_columns', 100)
pd.set_option('display.width', 1000)

SentenceA = "William likes Piano and Piano likes William"
SentenceB = "Sara likes Guitar"
SentenceC = "Mamoosh likes Piano"
SentenceD = "William is a CS Student"
SentenceE = "Sara is kind"
SentenceF = "Mamoosh is kind"


bowA = SentenceA.split(" ")
bowB = SentenceB.split(" ")
bowC = SentenceC.split(" ")
bowD = SentenceD.split(" ")
bowE = SentenceE.split(" ")
bowF = SentenceF.split(" ")

# Create a set consisting of all words

wordSet = set(bowA).union(set(bowB)).union(set(bowC)).union(set(bowD)).union(set(bowE)).union(set(bowF))
print("Set of all words is: ", wordSet)

# Initialize one dictionary per sentence with a count of 0 for every word

wordDictA = dict.fromkeys(wordSet, 0)
wordDictB = dict.fromkeys(wordSet, 0)
wordDictC = dict.fromkeys(wordSet, 0)
wordDictD = dict.fromkeys(wordSet, 0)
wordDictE = dict.fromkeys(wordSet, 0)
wordDictF = dict.fromkeys(wordSet, 0)

for word in bowA:
    wordDictA[word] += 1
for word in bowB:
    wordDictB[word] += 1
for word in bowC:
    wordDictC[word] += 1
for word in bowD:
    wordDictD[word] += 1
for word in bowE:
    wordDictE[word] += 1
for word in bowF:
    wordDictF[word] += 1

# Printing Term frequency

print("SentenceA TF: ", wordDictA)
print("SentenceB TF: ", wordDictB)
print("SentenceC TF: ", wordDictC)
print("SentenceD TF: ", wordDictD)
print("SentenceE TF: ", wordDictE)
print("SentenceF TF: ", wordDictF)

print(pd.DataFrame([wordDictA, wordDictB, wordDictC, wordDictD, wordDictE, wordDictF]))

Output:

   CS  Guitar  Mamoosh  Piano  Sara  Student  William  a  and  is  kind  likes
0   0       0        0      2     0        0        2  0    1   0     0      2
1   0       1        0      0     1        0        0  0    0   0     0      1
2   0       0        1      1     0        0        0  0    0   0     0      1
3   1       0        0      0     0        1        1  1    0   1     0      0
4   0       0        0      0     1        0        0  0    0   1     1      0
5   0       0        1      0     0        0        0  0    0   1     1      0

I use these settings when the data is large.

# environment settings (full option names, as the set_option help recommends):
pd.set_option('display.max_columns', None)
pd.set_option('display.max_rows', None)
pd.set_option('display.max_seq_items', None)
pd.set_option('display.max_colwidth', 500)
pd.set_option('display.expand_frame_repr', True)

You can refer to the documentation here.

If you don't want to mess with your display options and just want to see this one particular list of columns without expanding every dataframe you view, you could try:

df.columns.values

You can also try it in a loop:

for col in df.columns: 
    print(col)
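A small variation on the two snippets above, offered as a sketch: `df.columns.tolist()` returns a plain Python list, which prints every label, whereas NumPy abbreviates the printed form of arrays longer than 1000 elements by default. The frame below is hypothetical:

```python
import pandas as pd

# Hypothetical frame; only the column labels matter here.
df = pd.DataFrame(columns=['x1', 'x2', 'x3', 'x4', 'x5', 'x6', 'x7'])

# Convert the column index to a plain Python list of labels.
names = df.columns.tolist()
print(names)
```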

The following line is enough to display all columns of a dataframe:

pd.set_option('display.max_columns', None)

You can simply follow these steps.

You can change the pandas max_columns option as follows:
```
import pandas as pd
pd.options.display.max_columns = 10
```
(this allows 10 columns to be displayed; change the number as needed)
Similarly, you can change the maximum number of rows to display as follows:
```
pd.options.display.max_rows = 999
```
(this allows 999 rows to be printed at once)

Please refer to the docs to change other pandas options/settings.

Licensed under: CC-BY-SA with attribution

Not affiliated with StackOverflow