Installing pyspark on macOS and launching jupyterlab when pyspark starts

When working with Spark, you will often want to run pyspark through jupyterlab. This post explains how to set that up on a local machine (macOS). Both pyspark and jupyterlab are installed with Homebrew, a package manager. For a detailed introduction to Homebrew, see the video below.

1. Install pyspark

$ brew install apache-spark


2. Install jupyterlab

$ brew install jupyterlab
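
Before moving on, it can help to confirm that both installs put their commands on PATH. The following is a minimal sketch, assuming Homebrew linked the `pyspark` and `jupyter` commands; `check_tools` is a hypothetical helper name, not part of Spark or Jupyter.

```shell
# Hypothetical helper: report whether each named command is on PATH.
# Prints one line per tool and returns non-zero if any tool is missing.
check_tools() {
  missing=0
  for tool in "$@"; do
    if command -v "$tool" >/dev/null 2>&1; then
      echo "found: $tool"
    else
      echo "missing: $tool"
      missing=1
    fi
  done
  return "$missing"
}
```

Calling `check_tools pyspark jupyter` after the two `brew install` steps should print a `found:` line for each command. A `missing:` line suggests the formula was not linked or the shell needs to be restarted.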


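3. Set the pyspark environment variables
Add the following two environment variables to .bashrc or .zshrc. With them set, running pyspark starts a Jupyter server as the driver instead of the plain Python shell. The 'notebook' value starts the Jupyter Notebook server, which also loads the JupyterLab extension; setting the value to 'lab' instead should open the JupyterLab interface directly.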

export PYSPARK_DRIVER_PYTHON=jupyter
export PYSPARK_DRIVER_PYTHON_OPTS='notebook'

After adding the lines above to .bashrc or .zshrc, run the source command so the environment variables take effect in the current shell.

$ source ~/.zshrc
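
The manual edit in this step can also be scripted so that running it twice does not duplicate the lines. This is a minimal sketch; `append_once` and `add_pyspark_env` are hypothetical helper names, and the sketch assumes the startup file ends with a newline.

```shell
# Hypothetical helper: append a line to a file only if that exact
# line is not already present (the file is created if it is missing).
append_once() {
  line=$1
  file=$2
  touch "$file"
  grep -qxF -- "$line" "$file" || printf '%s\n' "$line" >> "$file"
}

# Hypothetical helper: add the two PySpark driver settings from this
# step to the given shell startup file.
add_pyspark_env() {
  rc_file=$1
  append_once "export PYSPARK_DRIVER_PYTHON=jupyter" "$rc_file"
  append_once "export PYSPARK_DRIVER_PYTHON_OPTS='notebook'" "$rc_file"
}
```

With these defined, `add_pyspark_env ~/.zshrc` followed by `source ~/.zshrc` has the same effect as editing the file by hand.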


4. Run
Running pyspark now launches jupyterlab immediately.

$ pyspark
/usr/local/Cellar/jupyterlab/2.1.5/libexec/lib/python3.8/site-packages/traitlets/config/loader.py:795: SyntaxWarning: "is" with a literal. Did you mean "=="?
  if len(key) is 1:
/usr/local/Cellar/jupyterlab/2.1.5/libexec/lib/python3.8/site-packages/traitlets/config/loader.py:804: SyntaxWarning: "is" with a literal. Did you mean "=="?
  if len(key) is 1:
[I 17:06:52.559 NotebookApp] Writing notebook server cookie secret to /Users/1003855/Library/Jupyter/runtime/notebook_cookie_secret
[I 17:06:53.846 NotebookApp] JupyterLab extension loaded from /usr/local/Cellar/jupyterlab/2.1.5/libexec/lib/python3.8/site-packages/jupyterlab
[I 17:06:53.846 NotebookApp] JupyterLab application directory is /usr/local/Cellar/jupyterlab/2.1.5/libexec/share/jupyter/lab
[I 17:06:53.849 NotebookApp] Serving notebooks from local directory: /Users/1003855/skp
[I 17:06:53.849 NotebookApp] The Jupyter Notebook is running at:
[I 17:06:53.849 NotebookApp] http://localhost:8888/?token=0cb9e3faa0b16b469af3ca0421dc7190a5ac192c9594782a
[I 17:06:53.849 NotebookApp]  or http://127.0.0.1:8888/?token=0cb9e3faa0b16b469af3ca0421dc7190a5ac192c9594782a
[I 17:06:53.849 NotebookApp] Use Control-C to stop this server and shut down all kernels (twice to skip confirmation).
[C 17:06:53.854 NotebookApp]

    To access the notebook, open this file in a browser:
        file:///Users/1003855/Library/Jupyter/runtime/nbserver-33517-open.html
    Or copy and paste one of these URLs:
        http://localhost:8888/?token=0cb9e3faa0b16b469af3ca0421dc7190a5ac192c9594782a
     or http://127.0.0.1:8888/?token=0cb9e3faa0b16b469af3ca0421dc7190a5ac192c9594782a
^C[I 17:06:58.692 NotebookApp] interrupted
Serving notebooks from local directory: /Users/1003855/skp
0 active kernels
The Jupyter Notebook is running at:
http://localhost:8888/?token=0cb9e3faa0b16b469af3ca0421dc7190a5ac192c9594782a
 or http://127.0.0.1:8888/?token=0cb9e3faa0b16b469af3ca0421dc7190a5ac192c9594782a

JupyterLab launched through pyspark
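
Because the two variables live in the shell startup file, every `pyspark` invocation now starts Jupyter. For the occasional plain pyspark shell, one option is to drop the variables for a single command. This is a minimal sketch; `without_jupyter_driver` is a hypothetical wrapper name, and it relies on the `-u` option of `env`.

```shell
# Hypothetical wrapper: run a command with the two PySpark driver
# variables removed from its environment, leaving the current shell
# untouched.
without_jupyter_driver() {
  env -u PYSPARK_DRIVER_PYTHON -u PYSPARK_DRIVER_PYTHON_OPTS "$@"
}
```

Running `without_jupyter_driver pyspark` should then start pyspark with its default Python driver, while a bare `pyspark` keeps launching Jupyter.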