Issue with generate_spark_graph() #5

@eirikhoye

Hi, I managed to get sparkhpc and imnet running on our institute's HPC cluster. However, when I run the following code to generate a distributed graph:

import findspark; findspark.init()
import sparkhpc
template_path = '/cluster/home/eirikhoy/sparkhpc/build/lib/sparkhpc/templates/sparkjob.slurm.template'
sj = sparkhpc.sparkjob.SLURMSparkJob(ncores=4, template=template_path)
from pyspark import SparkContext
sc = SparkContext(master=sj.master_url())
import imnet
import numpy as np
from scipy.sparse import csr_matrix 
import pyspark
strings = imnet.random_strings.generate_random_sequences(5000)

g_rdd = imnet.process_strings.generate_spark_graph(strings, sc, max_ld=2).cache()

I get the error:

UnboundLocalError                         Traceback (most recent call last)
<ipython-input-15-af167cc949f4> in <module>()
----> 1 g_rdd = imnet.process_strings.generate_spark_graph(strings, sc, max_ld=2).cache()

/cluster/home/eirikhoy/.conda/envs/imnet_v0.2/lib/python2.7/site-packages/imnet/process_strings.pyc in generate_spark_graph(strings, sc, mat, min_ld, max_ld)
    189         warn("Problem importing pyspark -- are you sure your SPARK_HOME is set?")
    190 
--> 191     sqc = SQLContext(sc)
    192 
    193     strings_b = sc.broadcast(strings)

UnboundLocalError: local variable 'SQLContext' referenced before assignment
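
The traceback shows the pyspark warning on line 189 followed directly by the use of `SQLContext` on line 191, which suggests the import sits inside the function in a `try`/`except` that only warns on failure. A minimal sketch of that pattern, assuming this is roughly how the function is structured (the function name `build_context` and the module `nonexistent_pkg_for_demo` are hypothetical stand-ins, not part of imnet):

```python
import warnings


def build_context():
    try:
        # Hypothetical module that does not exist; it stands in for a
        # pyspark import that fails in the current environment.
        from nonexistent_pkg_for_demo import SQLContext
    except ImportError:
        # The failure is only reported as a warning, so execution continues.
        warnings.warn("Problem importing pyspark -- are you sure your SPARK_HOME is set?")
    # The import statement makes SQLContext a local name, but the failed
    # import never bound it, so this line raises UnboundLocalError.
    return SQLContext(None)


try:
    build_context()
except UnboundLocalError as exc:
    error_name = type(exc).__name__
    print(error_name)
```

If this is what happens, the `UnboundLocalError` is a symptom, and the real problem is whatever makes the pyspark import fail.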

Note: I tested it on a local VM and got the same error, so maybe the issue is not caused by incorrect dependencies?

Both the SPARK_HOME and JAVA_HOME environment variables are set:

>>> os.environ['SPARK_HOME']
'/cluster/software/Spark/2.4.0-intel-2018b-Python-3.6.6'
>>> os.environ['JAVA_HOME']
'/cluster/software/Java/1.8.0_212'
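
Since the warning hides the underlying exception, one way to narrow this down is to attempt the import directly in the same interpreter. The sketch below is only a diagnostic aid; `check_import` is a hypothetical helper, while `pyspark.sql.SQLContext` is the class named in the traceback. It also prints the interpreter version, because the traceback path contains `python2.7` while SPARK_HOME points at a build labelled `Python-3.6.6`.

```python
import importlib
import sys


def check_import(module_name, attr):
    """Try to import attr from module_name and return (ok, detail)."""
    try:
        module = importlib.import_module(module_name)
        getattr(module, attr)
    except (ImportError, AttributeError) as exc:
        # Report the real exception instead of swallowing it.
        return False, "{}: {}".format(type(exc).__name__, exc)
    # On success, report where the module was loaded from.
    return True, getattr(module, "__file__", "<built-in>")


print(sys.version)
print(check_import("pyspark.sql", "SQLContext"))
```

A `False` result here would point to pyspark not being importable from this environment, independent of imnet.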

The rest of the code examples ran fine.
