Question

I have a Python script written with SparkContext and I want to run it. I tried to integrate IPython with Spark, but I couldn't get it to work. So instead I set the Spark path [installation folder/bin] as an environment variable and ran the spark-submit command in the cmd prompt. I believe it finds the SparkContext, but it produces a really big error. Can someone help me with this problem?

PATH environment variable: C:/Users/Name/Spark-1.4; C:/Users/Name/Spark-1.4/bin

After that, in the cmd prompt: spark-submit script.py
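
For reference, a minimal script of the kind being submitted might look like the sketch below (a hypothetical stand-in, since the actual script.py is not shown):

from pyspark import SparkContext

# Create the SparkContext the script relies on
sc = SparkContext(appName="MinimalExample")

# Trivial job to confirm the context works: sum the numbers 0..9
print(sc.parallelize(range(10)).sum())  # should print 45
sc.stop()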



Solution 4

Finally, I resolved the issue. I had to set the pyspark location in the PATH variable and the py4j-0.8.2.1-src.zip location in the PYTHONPATH variable.
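
As a rough sketch of what that setup can look like in a cmd session (my own illustration, reusing the Spark-1.4 install path from the question; adjust paths and the py4j version to your installation):

rem Point SPARK_HOME at the Spark install, then expose pyspark and py4j
set SPARK_HOME=C:\Users\Name\Spark-1.4
set PATH=%PATH%;%SPARK_HOME%\bin;%SPARK_HOME%\python
set PYTHONPATH=%SPARK_HOME%\python;%SPARK_HOME%\python\lib\py4j-0.8.2.1-src.zip;%PYTHONPATH%

Note that set only affects the current cmd session; use the System Properties dialog or setx to make the variables permanent.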

Other suggestions

I'm fairly new to Spark, and have figured out how to integrate it with IPython on Windows 10 and 7. First, check your environment variables for Python and Spark. Here is mine:

SPARK_HOME: C:\spark-1.6.0-bin-hadoop2.6\

I use Enthought Canopy, so Python is already integrated in my system path. Next, launch Python or IPython and use the following code. If you get an error, check what you get for spark_home. Otherwise, it should run just fine.

import os
import sys

# Locate the Spark installation via the SPARK_HOME environment variable
spark_home = os.environ.get('SPARK_HOME', None)
if not spark_home:
    raise ValueError('SPARK_HOME environment variable is not set')

# Put PySpark and its bundled py4j on the module search path
sys.path.insert(0, os.path.join(spark_home, 'python'))
sys.path.insert(0, os.path.join(spark_home, 'python/lib/py4j-0.9-src.zip'))  # adjust the py4j version to match the Spark version you're using

# Python 2: execfile() runs the PySpark shell bootstrap, which creates `sc`
execfile(os.path.join(spark_home, 'python/pyspark/shell.py'))

pySpark on IPython

Check if this link could help you out.

Johnnyboycurtis' answer works for me. If you are using Python 3, use the code below; his code doesn't work in Python 3, since execfile() was removed. I am editing only the last line of his code.

import os
import sys

# Locate the Spark installation via the SPARK_HOME environment variable
spark_home = os.environ.get('SPARK_HOME', None)
print(spark_home)
if not spark_home:
    raise ValueError('SPARK_HOME environment variable is not set')

# Put PySpark and its bundled py4j on the module search path
sys.path.insert(0, os.path.join(spark_home, 'python'))
sys.path.insert(0, os.path.join(spark_home, 'python/lib/py4j-0.9-src.zip'))  # adjust the py4j version to match the Spark version you're using

# Python 3 replacement for execfile(): read, compile, and exec shell.py
filename = os.path.join(spark_home, 'python/pyspark/shell.py')
exec(compile(open(filename, "rb").read(), filename, 'exec'))
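
Once shell.py has executed, the PySpark bootstrap leaves a SparkContext named sc in the session. As a quick smoke test (my own addition, not part of the original answer):

print(sc.version)                       # the Spark version string, e.g. '1.6.1'
print(sc.parallelize(range(10)).sum())  # should print 45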
Licensed under: CC-BY-SA with attribution