Install Homebrew: If Homebrew is not already installed, open Terminal and run the following command:

/bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/HEAD/install.sh)"

Install Apache Spark: In Terminal, install Apache Spark with Homebrew:

brew install apache-spark

Configure Environment Variables:
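Open your shell's profile file in an editor. The command below is for bash; zsh, the default shell since macOS Catalina, reads ~/.zshrc instead: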
nano ~/.bash_profile
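Add the following lines to the file. The SPARK_HOME path needs the installed version's directory name between apache-spark/ and /libexec, and on Apple Silicon Macs Homebrew's Cellar is under /opt/homebrew rather than /usr/local: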

export SPARK_HOME=/usr/local/Cellar/apache-spark//libexec
export PYSPARK_PYTHON=/usr/bin/python3
export PATH=$SPARK_HOME/bin:$PATH

Save the file (press Ctrl + X, then Y, and Enter).
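
The version directory in the SPARK_HOME path depends on which release Homebrew installed. As a minimal sketch, the helper below lists the libexec directories that actually exist on the machine. The function name find_spark_homes is hypothetical, and the two Cellar locations are assumptions based on Homebrew's default prefixes (/usr/local on Intel Macs, /opt/homebrew on Apple Silicon).

```python
from pathlib import Path

# Assumed Homebrew Cellar locations: /usr/local on Intel Macs,
# /opt/homebrew on Apple Silicon. Adjust if Homebrew lives elsewhere.
DEFAULT_CELLARS = ("/usr/local/Cellar", "/opt/homebrew/Cellar")


def find_spark_homes(cellars=DEFAULT_CELLARS):
    """Return every <cellar>/apache-spark/<version>/libexec directory that exists."""
    homes = []
    for cellar in cellars:
        # Homebrew keeps one subdirectory per installed version;
        # the glob yields nothing if the formula is not installed.
        for libexec in sorted(Path(cellar).glob("apache-spark/*/libexec")):
            if libexec.is_dir():
                homes.append(str(libexec))
    return homes


if __name__ == "__main__":
    # Print ready-to-paste export lines, one per installed version.
    for home in find_spark_homes():
        print(f"export SPARK_HOME={home}")
```

If the script prints nothing, the apache-spark formula was probably not found under either prefix.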
Refresh the Environment: Run the following command in Terminal to apply the changes to your current session (source whichever profile file you edited):
source ~/.bash_profile
Verify the Setup: Open a new Terminal window and run pyspark to launch the PySpark shell. If it starts without errors, the setup is successful.
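
If the shell does not start, checking the three variables one at a time can narrow down the cause. The sketch below is one possible check, not part of Spark itself: check_spark_env is a hypothetical helper, and it assumes that a working SPARK_HOME contains a bin/pyspark launcher.

```python
import os
import shutil


def check_spark_env(env=None):
    """Return a list of problems found; an empty list means every check passed."""
    env = os.environ if env is None else env
    search_path = env.get("PATH", "")
    problems = []

    # SPARK_HOME should point at a directory that contains bin/pyspark.
    spark_home = env.get("SPARK_HOME")
    if not spark_home:
        problems.append("SPARK_HOME is not set")
    elif not os.path.isfile(os.path.join(spark_home, "bin", "pyspark")):
        problems.append(f"no bin/pyspark launcher under SPARK_HOME={spark_home}")

    # PYSPARK_PYTHON should name an executable Python interpreter.
    python = env.get("PYSPARK_PYTHON")
    if not python:
        problems.append("PYSPARK_PYTHON is not set")
    elif shutil.which(python, path=search_path) is None:
        problems.append(f"PYSPARK_PYTHON={python} is not an executable")

    # The pyspark command itself should be reachable through PATH.
    if shutil.which("pyspark", path=search_path) is None:
        problems.append("pyspark is not on PATH")
    return problems


if __name__ == "__main__":
    issues = check_spark_env()
    print("\n".join(issues) if issues else "Environment looks consistent.")
```

Run it in the same Terminal session in which you sourced the profile, so that it sees the exported variables.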

Testing the PySpark Installation:

  • Open a new Terminal window.
  • Run the following command to start the PySpark shell:
pyspark
  • If everything is set up correctly, you should see the PySpark shell starting and a Python prompt (>>>) appearing.
  • You can test PySpark by running simple commands at the prompt, such as creating an RDD or DataFrame and performing a basic operation on it. The shell predefines spark (a SparkSession) and sc (a SparkContext) for this.
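
As an illustration of such a test, the sketch below runs one small RDD job and one small DataFrame job. The function name smoke_test and the sample rows are made up for this example. It takes the session as a parameter so it can use the spark object the shell already provides, and the body has no blank lines so it can be pasted at the >>> prompt.

```python
def smoke_test(spark):
    """Run one RDD job and one DataFrame job on an existing SparkSession."""
    # RDD: distribute four numbers, square each one, and add the results.
    rdd = spark.sparkContext.parallelize([1, 2, 3, 4])
    sum_of_squares = rdd.map(lambda x: x * x).sum()
    # DataFrame: build two rows of made-up sample data, then count ages above 40.
    people = spark.createDataFrame([("alice", 34), ("bob", 45)], ["name", "age"])
    over_40 = people.filter(people.age > 40).count()
    return sum_of_squares, over_40
```

After pasting the definition, call smoke_test(spark) at the prompt. With the sample data above it should return (30, 1); a traceback instead usually points to a Java or Python configuration problem rather than to the code.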