spark-connections {sparklyr}    R Documentation

Manage Spark Connections

Description

These routines allow you to manage your connections to Spark.
Usage

spark_connect(
  master,
  spark_home = Sys.getenv("SPARK_HOME"),
  method = c("shell", "livy", "databricks", "test"),
  app_name = "sparklyr",
  version = NULL,
  config = spark_config(),
  extensions = sparklyr::registered_extensions(),
  ...
)

spark_connection_is_open(sc)

spark_disconnect(sc, ...)

spark_disconnect_all()

spark_submit(
  master,
  file,
  spark_home = Sys.getenv("SPARK_HOME"),
  app_name = "sparklyr",
  version = NULL,
  config = spark_config(),
  extensions = sparklyr::registered_extensions(),
  ...
)
Arguments

master	Spark cluster url to connect to. Use "local" to connect to a
	local instance of Spark installed via spark_install.

spark_home	The path to a Spark installation. Defaults to the path
	provided by the SPARK_HOME environment variable. If defined, it
	will be used to initialize the Spark connection.

method	The method used to connect to Spark. The default connection
	method is "shell", which connects using spark-submit; use "livy"
	to perform remote connections over HTTP, or "databricks" when
	connecting to a Databricks cluster.

app_name	The application name to be used while running in the Spark
	cluster.

version	The version of Spark to use. Only applicable to "local" Spark
	connections.

config	Custom configuration for the generated Spark connection. See
	spark_config for details.

extensions	Extension packages to enable for this connection. By
	default, all packages enabled through the use of
	sparklyr::register_extension will be passed here.

...	Optional arguments; currently unused.

sc	A spark_connection.

file	Path to R source file to submit for batch execution.
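To illustrate how these arguments fit together, the sketch below opens a local connection with a custom configuration. It assumes a local Spark installation (for example, one set up with spark_install()); the version string and the driver-memory value are illustrative.

```r
library(sparklyr)

# Illustrative custom configuration; see spark_config() for available settings
conf <- spark_config()
conf$`sparklyr.shell.driver-memory` <- "2G"

# Connect to a local Spark instance (assumes a matching installed version)
sc <- spark_connect(master = "local", version = "3.5", config = conf)

spark_connection_is_open(sc)  # TRUE while the connection is alive

spark_disconnect(sc)
```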
Details

When using method = "livy", it is recommended to specify the version
parameter to improve performance by using precompiled code rather than
uploading sources. By default, jars are downloaded from GitHub, but the
path to the correct sparklyr JAR can also be specified through the
livy.jars setting.
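As a sketch, a Livy connection might look like the following; the Livy endpoint URL, the Spark version, and the livy.jars path are placeholders, and the livy.jars setting is only needed when downloading jars from GitHub is not possible.

```r
library(sparklyr)

conf <- spark_config()
# Optional: point Livy at a locally available sparklyr JAR instead of
# downloading it from GitHub (the path below is illustrative)
# conf$livy.jars <- "/opt/sparklyr/jars"

# Specifying version lets sparklyr use precompiled code rather than
# uploading sources over HTTP
sc <- spark_connect(
  master  = "http://livy-host:8998",  # placeholder Livy endpoint
  method  = "livy",
  version = "3.5",                    # illustrative version
  config  = conf
)

spark_disconnect(sc)
```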
Examples

sc <- spark_connect(master = "spark://HOST:PORT")
connection_is_open(sc)
spark_disconnect(sc)
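In contrast to spark_connect(), spark_submit() runs an R script as a batch job rather than opening an interactive connection. A minimal sketch follows; the script path and master URL are placeholders.

```r
library(sparklyr)

# Submit an R source file for batch execution on the cluster
# (master URL and script path are illustrative)
spark_submit(
  master = "spark://HOST:PORT",
  file   = "analysis/job.R"
)
```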