spaCy is an excellent Python NLP library. It also has a cleverly named visualization tool, displaCy.
AllenNLP (https://github.com/allenai/allennlp) is an open-source NLP research library from AI2, built on PyTorch.
Using Python 3.5+ on biglab
Biglab has Python 3.4 installed, which is a little out of date. If you want to use a more modern Python, follow these steps. First, to get to biglab:
(where USERNAME is your Penn username)
You can either use an existing Miniconda installation or install your own.
1. Use the existing Miniconda installation
For this, open up ~/.bashrc and add this line to the end:
Restart your terminal (exit and ssh in again), and python should now be version 3.6 from Anaconda.
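After restarting the terminal, a short script can confirm which interpreter the shell actually picked up. The sketch below is only an illustration and not part of the original instructions: the function name check_python and the default minimum of (3, 5) are assumptions chosen to match the heading above, and it relies only on the standard library. It avoids f-strings so that it still runs on the older system Python.

```python
import sys


def check_python(minimum=(3, 5)):
    """Return (ok, message) describing the running interpreter.

    `check_python` and the default `minimum` are hypothetical names chosen
    for this sketch; they are not part of any course tooling.
    """
    # Compare (major, minor, micro) against the required minimum as tuples.
    version = tuple(sys.version_info[:3])
    ok = version >= tuple(minimum)
    # sys.executable is the path of the interpreter that is running this code.
    message = "Python {}.{}.{} at {}".format(
        version[0], version[1], version[2], sys.executable
    )
    return ok, message


ok, message = check_python()
print(message)
print("OK" if ok else "Older than required; check your PATH and ~/.bashrc")
```

Save it under any name, for example check_python.py (a hypothetical filename), and run it with python check_python.py. The printed path helps tell the system Python apart from the Miniconda one.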
2. Install Miniconda in your home directory
This is more involved, but may give you more freedom. Anaconda is a distribution of scientific packages for Python that also includes conda, a package and virtual environment manager. I suggest Miniconda, which is a stripped-down version. To install, go here: https://conda.io/miniconda.html. Alternatively, run this:
Then restart your terminal (exit and ssh in again), and run this:
John has a basic introduction to bash for NLP here, and a discussion of advanced topics in bash here.
Kevin Knight of the University of Southern California has a nice tutorial on Unix skills for NLP here.
Screen / byobu / tmux
Since you will be running code remotely, we strongly recommend that you use some sort of session manager. I (Stephen) use screen, but other options are byobu and tmux. These allow you to ssh to a remote machine, start a terminal session, disconnect from it, and reconnect at a later time. This is especially useful when you want to run long jobs, since the job keeps running after you disconnect. Here’s a sample screenrc file.