
References

Tutorials

NLP courses at other universities

Python resources

Using python 3.5+ on biglab

Biglab has Python 3.4 installed, which is somewhat out of date. If you want a more recent Python, follow these steps. First, log in to biglab:

$ ssh USERNAME@biglab.seas.upenn.edu

(where USERNAME is your Penn username)

You can either use an existing miniconda installation or install your own.

1. Use existing miniconda installation

For this, open up ~/.bashrc and add this line to the end:

export PATH="/home1/m/mayhew/miniconda3/bin:$PATH"

Restart your terminal (exit and ssh in again), and python should be version 3.6 from anaconda.
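If you would rather not edit ~/.bashrc by hand, the same step can be sketched as a small shell helper. This is only an illustration: add_line_once is a hypothetical name, not something provided by bash or conda, and the usage lines are commented out so nothing changes until you choose to run them.

```shell
# Append a line to a file only if that exact line is not already present,
# so running this more than once does not duplicate the PATH entry.
# add_line_once is a hypothetical helper name used for illustration.
add_line_once() {
    file="$1"
    line="$2"
    touch "$file"
    # -x: match the whole line, -F: treat the pattern as a fixed string
    grep -qxF -- "$line" "$file" || printf '%s\n' "$line" >> "$file"
}

# Usage on biglab (uncomment to apply):
# add_line_once ~/.bashrc 'export PATH="/home1/m/mayhew/miniconda3/bin:$PATH"'

# After logging in again, confirm which interpreter is picked up:
# which python
# python --version
```

If which python still points at the system interpreter, the new PATH line has probably not been read yet; exiting and logging in again usually resolves this.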

2. Install miniconda in your home directory

This is more involved, but may give you more freedom. Anaconda is a collection of scientific packages for python, and also a virtual environment manager. I suggest miniconda, which is a stripped-down version. To install it, go here: https://conda.io/miniconda.html. Alternatively, run this:

$ wget https://repo.continuum.io/miniconda/Miniconda3-latest-Linux-x86_64.sh
$ chmod +x Miniconda3-latest-Linux-x86_64.sh
$ ./Miniconda3-latest-Linux-x86_64.sh

Then restart your terminal (exit and ssh in again), and run this:

$ conda install gensim
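A quick sanity check after these steps might look like the sketch below. It assumes the installer was allowed to add conda to your PATH. have_cmd is a hypothetical helper name, and the messages are only suggestions.

```shell
# have_cmd is a hypothetical helper: succeeds if the named command is on PATH.
have_cmd() {
    command -v "$1" >/dev/null 2>&1
}

if have_cmd conda; then
    conda --version
    # This import only succeeds once `conda install gensim` has finished.
    python -c "import gensim; print(gensim.__version__)" \
        || echo "gensim is not importable yet"
else
    echo "conda not found on PATH; check that the installer updated ~/.bashrc"
fi
```

If conda is reported as missing right after installation, logging out and back in is usually enough for the new PATH to take effect.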

Bash resources

  • John has a basic introduction to bash for NLP here, and a discussion of advanced topics in bash here.
  • Kevin Knight of the University of Southern California has a nice unix skills for NLP tutorial here.

Screen / byobu / tmux

Since you will be running code remotely, we strongly recommend that you use a session manager. I (Stephen) use screen, but byobu and tmux are other options. These let you ssh to a remote machine, start a terminal session, disconnect from it, and reconnect later, which is especially useful for long-running jobs. Here’s a sample screenrc file.
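As a rough illustration of the detach and reconnect workflow, the sketch below starts a command in a detached session. start_detached is a hypothetical helper name and train.py is a placeholder script, so adapt both to your own job. The session ends when the command itself exits.

```shell
# start_detached is a hypothetical helper: run a command in a detached
# screen session (or tmux, if screen is missing) so it survives logout.
start_detached() {
    name="$1"
    shift
    if command -v screen >/dev/null 2>&1; then
        # -d -m: start the session detached; -S: give it a name
        screen -dmS "$name" "$@"
    elif command -v tmux >/dev/null 2>&1; then
        # The command is passed as a single string, so arguments
        # containing spaces would need extra quoting.
        tmux new-session -d -s "$name" "$*"
    else
        echo "neither screen nor tmux is installed" >&2
        return 1
    fi
}

# Usage (train.py is a placeholder script name):
# start_detached train python train.py
# screen -ls             # list running screen sessions
# screen -r train        # reattach; press Ctrl-a then d to detach again
# tmux attach -t train   # tmux equivalent; detach with Ctrl-b then d
```

For interactive work you can instead start a named session directly with screen -S train or tmux new -s train, run your job inside it, and detach with the key bindings noted above.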