Unix Primer

From AstroBaki
Revision as of 13:59, 13 January 2022 by Aparsons (talk | contribs)

Short Topical Videos

Reference Material

A SHORT UNIX PRIMER

This document provides a short primer on how to navigate Unix-like operating systems (e.g. Linux). Unix has a long tradition of use in scientific computing. That tradition was bolstered when Linus Torvalds developed Linux, an open-source Unix-like kernel, which enabled the operating system to be continuously developed and distributed for free. Today, Linux has become almost synonymous with Unix, and you will find many flavors of Linux (e.g. Ubuntu, Debian, CentOS, Raspbian, Fedora, etc.) available supporting most modern hardware. You can even find closely related Unix variants in other major operating systems (e.g. Terminal on macOS and Cygwin on Windows).

Unix (or Linux, if you like) supports graphical user interfaces (e.g. Gnome, KDE), but it was originally written to support command-line (text) processing, and that is where its true power shines. The learning curve can be a bit steep at first. Many of the commands are quite terse/baroque, as they date from a time when keystrokes were slow to be processed. Once you master the tools, though, you may find that you spend a lot less time clicking through menus or manually renaming files, and a lot more time doing fun things like programming/scripting and getting things done.

Everything below assumes you are at a command-line prompt. If you are using a graphical interface, you may need to find the "terminal" icon and click it to get started. The command line is actually an interface to a program that interprets the text you enter. There are various flavors of command-line programs (called "shells," which include bash, zsh, csh, etc.). We will use bash because it is among the most prevalent and full-featured, but most of the commands below will work on any of these shells.

1 DIRECTORY STRUCTURE

The Unix file system is all held inside the root directory (/). All other directories are subdirectories off of this one. When you log in, you will usually start in a user directory bearing your username. For most Unix flavors, this is in /home/<username>. This is where you will put most of your files (preferably in a subdirectory off of your home directory, as it tends to get cluttered). It is also where a lot of configuration files get stored. These files tend to begin with a dot (e.g. .bashrc). These files don’t tend to be visible to you by default, but as you will see below, they are there if you know how to look for them.
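As a minimal sketch of how dot files stay out of a default listing, the following builds a throwaway directory and compares ls with ls -a. The names demo_home, notes.txt, and .demorc are made up for illustration.

```shell
# Work in a scratch directory so nothing in your real home directory is touched.
demo_home=$(mktemp -d)
touch "$demo_home/notes.txt" "$demo_home/.demorc"   # hypothetical file names

# Plain ls skips names that begin with a dot.
visible=$(ls "$demo_home")
echo "ls    : $visible"

# ls -a shows everything, including . (this directory) and .. (its parent).
all=$(ls -a "$demo_home")
echo "ls -a :" $all

rm -rf "$demo_home"   # clean up the scratch directory
```

Running the same two commands in your own home directory should reveal files such as .bashrc, if they exist on your system.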

All of the rest of the stuff your computer and operating system needs is also linked off of root. This includes places where executable programs ("applications," if you like) are stored (e.g. /bin, /usr/bin, /usr/local/bin), where all of the run-time program and system information is stored (/var), where configuration and resource files for your programs are stored (/etc), where external devices (/dev) and mounted disks (/mnt) are held, and even where the boot instructions of your computer are held (/boot). Until you know what you are doing, though, it is best not to mess with these things.

1.1 Navigating the Directory Tree

The first thing you will need to learn is to navigate between directories, find out what is in them, make/remove directories, and copy/move files between directories.

The most important commands relevant to directories include:

  • To see a listing of the files and subdirectories of the present working directory (pwd), use the ls command. It has some useful options. One of my favorite combinations of options is ls -lrt: the l means “long listing format”, the t means “time order”, and the r means “reverse order”, so this lists all files in reverse time order, with the most recent file on the last line of the list. You can also use the * character as a wildcard: for example, ls *pdf lists all files whose names end in pdf (i.e. PDF files). Note: if you want to see all the files (even the invisible configuration files), you need to add the a flag to the command.
  • To change to a directory, e.g. the tex directory under lab1, type the absolute path
cd /home/user_name/lab1/tex

Or, shorter: ~ stands for your home directory, so you can type

cd ~/lab1/tex

Or, if your present working directory (pwd) is /home/user_name/lab1, you can type just a relative path: cd tex.

  • To create a new directory, e.g. a directory called monday under data, type
mkdir ~/lab1/data/monday

Or, if your present working directory is ~/lab1/data, you need only type mkdir monday. If you want to eliminate (remove) a directory, get rid of all the files in it and type rmdir monday; rmdir only removes empty directories.

  • To copy a file from one directory to another, e.g. the file carl from ~ to ~/lab1/src, type
cp ~/carl ~/lab1/src

You end up with two copies of the file. You can instead move the file using mv. To remove a file, use rm.

We have defined the following options for these commands:

  • For cp: cp -ip. The i means “inquire before overwriting a file of the same name”; once you overwrite the original version, it’s gone! The p means “preserve the original time information about the file”; otherwise, the copied file would be tagged with the current time.
  • For mv: mv -i. As above: inquire!
  • For rm: rm -i. Ask to confirm your intention to eliminate the file. Once it’s gone, it’s gone—there’s no “wastebasket”.
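The commands above can be tried safely in a scratch directory. This is only a sketch: the lab1 layout is rebuilt under a temporary directory rather than under your real home directory, the name carl_backup is made up, and the -i flag is left out so that the example runs without prompting.

```shell
# Build a scratch copy of the example layout; all names here are illustrative.
top=$(mktemp -d)
start_dir=$PWD
mkdir -p "$top/lab1/data" "$top/lab1/src"
echo "some text" > "$top/carl"

cd "$top/lab1/data"        # absolute path
mkdir monday               # relative path: creates $top/lab1/data/monday
cd ..                      # up one level, now in $top/lab1
where=$(pwd)

# Copy, keeping the original timestamp (-p). Adding -i would prompt before overwriting.
cp -p "$top/carl" src/
# Move (rename) the copy, then remove the original.
mv src/carl src/carl_backup
rm "$top/carl"

# rmdir only succeeds on an empty directory.
rmdir data/monday

listing=$(ls -lrt src)     # long format, most recently modified last
echo "$listing"

cd "$start_dir"
rm -rf "$top"              # clean up the scratch tree
```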

1.2 Permissions for Directories and Files

Permissions determine who has access. You might want your data to be accessible by everybody, for reading, but you probably don’t want other people writing over your data! And you shouldn’t want your lab writeup to be accessible by anybody, either reading or writing—you don’t want to facilitate plagiarism! And it makes sense to keep all your love letters in a separate directory that isn’t accessible in any way by anybody else—including their recipient(s)!

You set permissions with chmod. Permissions recognize three classes of people: u (user: yourself!), g (group: that’s usually all the users of ugastro), and o (other: everybody else, including somebody in Timbuktu who happens to crash into our system). Note that group is essentially like other: almost everybody! Each class can have three permissions: r (read permission), w (write permission), and x (execute permission). chmod allows you to add, take away, or set exactly the permissions for each class with the operators +, -, and =.

Suppose, for your data subdirectory, you want read permission for everyone and write permission only for yourself. To grant the read permissions to group and other, get into the directory above data (that’s lab1) and then type chmod go+rx data; to eliminate the write permissions, chmod go-w data. To check your work, do a long listing of the directory itself (ls -ld data; without the d flag, ls lists the contents of data instead). On your screen, it would write

drwxr-xr-x 1 heiles bin 10 2007-01-20 17:58 data/

Translation: The first character, d, means it’s a directory. The next three specify the permissions for the user (that’s you): rw means you have read, write privileges, and the x means you can access the directory. The next set of three characters r-x is for the group (all your classmates); the final set of three characters r-x means that everybody can read and access the directory contents, but can’t write into it.

For your love letter directory (called love), you’d want chmod go-rwx love

Permissions apply slightly differently to directories and files:

  1. Directories. In order to access any file in a directory, the person needs execute permission on that directory. Read permission allows listing the names of the files with the ls command (a long listing, ls -l, also needs execute permission), and write permission allows creating, renaming, and deleting files in the directory.
  2. Files. Most files don’t need execute permission. By default (with the common umask setting of 022), when you create a file, it has read/write permissions for you and read for the other two classes.
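The chmod steps described above can be checked in a scratch directory. In this sketch a temporary directory stands in for lab1, and both subdirectories are first reset to a known state so that the result does not depend on your default settings.

```shell
# Scratch directory standing in for lab1; data and love are the names used above.
lab1=$(mktemp -d)
mkdir "$lab1/data" "$lab1/love"

# Start both from a known state: everything for the user, nothing for group or other.
chmod u=rwx,go= "$lab1/data" "$lab1/love"

chmod go+rx "$lab1/data"    # group and other may now read and enter data
chmod go-w "$lab1/data"     # make sure they cannot write (already the case here)
chmod go-rwx "$lab1/love"   # love stays private

# -d lists the directory itself rather than its contents; keep only the mode column.
data_mode=$(ls -ld "$lab1/data" | cut -c1-10)
love_mode=$(ls -ld "$lab1/love" | cut -c1-10)
echo "data: $data_mode"     # data: drwxr-xr-x
echo "love: $love_mode"     # love: drwx------

rm -rf "$lab1"
```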

2 OTHER ASPECTS…

2.1 Typing on the Command-Line in Linux/UNIX, Emacs

One of the joys of Unix is command-line editing using shortcuts. If you master a few of these, you will find that you can interact much more quickly with your computer.

  • Ctrl-r Reverse search through your history for what you’ve already started typing. Hit Enter to run the matching command, or press an arrow key (or Ctrl-e) to pull it onto the command line, where you can modify it before entering.
  • up/down arrow scroll up (or down) through commands you’ve already entered.
  • Ctrl-e moves the cursor to the end of the line.
  • Ctrl-a moves the cursor to the beginning of the line.
  • Ctrl-k deletes the rest of the line.
  • Ctrl-w deletes the word or portion of the word preceding the cursor.

2.2 Stopping or Suspending Processes

When you run a program (from the command line or otherwise), the operating system kernel (that’s the main program that makes your computer operate) launches a process that carries out its instructions. Often, that process ends naturally, but sometimes you’d like to end it before it is finished. Or perhaps you’d like the computer to keep executing the program in the background, but you would like control over your command line again to start typing something else. Here are some basic tools:

  • Ctrl-c Break out of the current process. If the process is responsive, this will land you back at the command line. If it is not responsive, this will not work and you might need to try suspending or killing (see below) the process.
  • Ctrl-z Suspends the current process. The process still exists, but it has been frozen by the kernel. You will land back at the command line. Sometimes this succeeds in escaping from a process when Ctrl-c fails. Once you are at the command line, you can do whatever you wish. If you wish to continue the process, you can type fg (foreground) to jump back in. Alternately, you can type bg (background) to continue the process, but keep the command line available for running other commands.
  • jobs This command lists all processes (foreground or background) that you have started from this command line. Note that this does not list all the processes that are running on your machine.
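Ctrl-c and Ctrl-z are keystrokes, so they cannot be shown in a script, but the same ideas can be sketched non-interactively. The example below relies on two things not covered above: a trailing &, which starts a command directly in the background (roughly what Ctrl-z followed by bg achieves), and kill, which is listed under the common commands at the end of this primer.

```shell
# Start a long-running command in the background with a trailing &.
sleep 30 &
pid=$!                      # $! holds the process ID of the most recent background job

jobs                        # lists jobs started from this shell

# kill -0 sends no signal; it only checks that the process exists.
if kill -0 "$pid" 2>/dev/null; then
    running_before=yes
else
    running_before=no
fi

kill "$pid"                       # ask the process to terminate
wait "$pid" 2>/dev/null || true   # collect its exit status so the shell forgets the job

if kill -0 "$pid" 2>/dev/null; then
    running_after=yes
else
    running_after=no
fi
echo "before: $running_before, after: $running_after"
```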

2.3 Piping, etc:

Piping (|) directs the text output of a command to the next command. For example,

ls -p | grep /

directs the output of the listing command to grep, which here selects all names containing the string “/”. The p flag makes ls append “/” to directory names, so this gives a list of directories just under the current directory.

Unix has a rich set of commands for manipulating text output exchanged through pipes. Examples include:

  • cat Send the contents of a file to the output pipe (stdout).
  • less Like more. Buffers output and allows you to interactively page through it. Can also operate directly on a file.
  • sort Sort the lines of the output (with optional control over how to sort).
  • uniq Collapse consecutive identical lines into one (pair with sort to get all the unique lines).
  • wc Word count. Counts the characters, words, and lines in the output.
  • awk A very powerful parser for extracting fields from text output. Often coupled with sed (a stream editor) to reparse and format information.
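A short sketch of how these tools chain together, using a made-up file (fruits.txt) with repeated lines. The tr -d ' ' step is an addition here: it only strips the padding that some versions of wc put in front of the count.

```shell
# Scratch file with repeated lines; the contents are made up for illustration.
work=$(mktemp -d)
printf 'apple\nbanana\napple\ncherry\nbanana\napple\n' > "$work/fruits.txt"

# cat feeds the file into the pipe; wc -l counts lines.
total=$(cat "$work/fruits.txt" | wc -l | tr -d ' ')

# uniq only collapses adjacent duplicates, so sort first.
distinct=$(sort "$work/fruits.txt" | uniq | wc -l | tr -d ' ')

# uniq -c prefixes each line with its count; sort -rn puts the largest count first;
# awk prints the second whitespace-separated field (the name).
most_common=$(sort "$work/fruits.txt" | uniq -c | sort -rn | head -n 1 | awk '{print $2}')

echo "$total lines, $distinct distinct, most common: $most_common"
# prints: 6 lines, 3 distinct, most common: apple
rm -rf "$work"
```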

Normally the result of a UNIX command is written to the terminal for you to see. However, you can direct the output elsewhere. For example,

more love1

prints the file love1 on your screen, while

more love1 > love2

creates the file love2 and writes the content into it.

Normally the input to a UNIX command is expected to be from the terminal. However, you can get the input from elsewhere. For example,

mail aparsons < idea.txt

uses the file idea.txt as input to the command mail, which means that it mails the file idea.txt to the user aparsons on this computer. You can try it, but mail on this account is not checked.
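The redirections above can be sketched in a scratch directory as follows. The file contents are made up, cat is used in place of more, and the >> operator (append rather than overwrite) is an addition not covered above.

```shell
work=$(mktemp -d)

# > creates the file (or overwrites it); >> appends to it.
echo "first line"  > "$work/love1"
echo "second line" >> "$work/love1"

# Redirect a command's output into a new file instead of the terminal.
cat "$work/love1" > "$work/love2"

# < feeds a file to a command's standard input.
line_count=$(wc -l < "$work/love2" | tr -d ' ')
echo "love2 has $line_count lines"   # love2 has 2 lines

rm -rf "$work"
```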

2.4 Text Editing

Unix comes with some built-in text editors (e.g. vim, emacs, nano/pico). You can always use a different one, but it’s a good idea to be reasonably conversant in one of these, since at least one of them is almost certain to be installed on any UNIX system. Moreover, you may find that vim and emacs, in particular, are actually incredibly powerful and streamlined. Of the two, vim (VI iMproved) is terser, more baroque, and arguably faster for expert use, but it has a steeper learning curve. emacs is similarly powerful, with perhaps an easier learning curve, at the cost of more Ctrl key presses (which slows down editing).

2.5 Scripting

Once you can write files, it becomes easy to write scripts. The easiest way is to put commands into a file and then run that file from the command line. These commands could be bash, but they could also be python or any other programming language. For example, edit a file test.py with the following:

a = 1
b = 2
print(a + b)

Then from the command line, run

$ python test.py

If you’d like to not have to type "python", but instead just run the script as if it were a program, you need to do two things:

  1. Add #! /usr/bin/env python to the top of the text of your script. This uses the env command to find where python is on this computer.
  2. Make your script executable (chmod a+x test.py)

Now you can run your script directly from the command line. Bash scripting works just as easily. Edit test.sh:

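#! /usr/bin/env bash
# Loop over every entry in the current directory.
for file in *; do
    # Skip anything that is not a regular file (e.g. subdirectories).
    [ -f "$file" ] || continue
    echo "$file"
    # Backticks substitute the output of wc into the echo command.
    echo `wc "$file"`
done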

You can run this script (don’t forget to make it executable) to print out the word count of each file in the current directory. A full bash script tutorial is beyond the scope of this document, but there are many tutorials online. For reading this script, it helps to know that backticks enclosing a command replace the text of the command with the text that command outputs, and that dollar signs are used to extract a value from a variable name. If you find yourself (as a python programmer) using bash scripting, you may find that using python -c "<some quick one-line script here>" is very helpful for gluing bash scripts together.

All of this is to say, if you want to edit code and run it exclusively over the command line, there are many versatile tools at your disposal.
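The whole cycle (write a script, make it executable, run it) can itself be sketched from the command line. The script name count.sh and the test file two.txt are hypothetical, and the script uses $(...), which is the modern spelling of backticks.

```shell
work=$(mktemp -d)

# Write a small script; the quoted 'EOF' keeps $ from being expanded while writing.
cat > "$work/count.sh" <<'EOF'
#! /usr/bin/env bash
# Print the number of lines in each file named on the command line.
for file in "$@"; do
    lines=$(wc -l < "$file" | tr -d ' ')
    echo "$file: $lines"
done
EOF

printf 'a\nb\n' > "$work/two.txt"     # a two-line test file

chmod a+x "$work/count.sh"            # make the script executable
result=$("$work/count.sh" "$work/two.txt")
echo "$result"

rm -rf "$work"
```

Without the chmod step, the shell would refuse to run the file directly, and you would have to type bash count.sh instead.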

2.6 Remote Logging In

You can log in from home if your computer has the secure login software, called ssh. If you have Linux, type ssh ugastro.berkeley.edu

If you’re logging in from home on a (ugh!) Windows machine, you need to be able to use X windows and an SSH client. To enable X11 forwarding (X windows), run ssh with the -X flag (or -Y for trusted forwarding).

3 COMMON COMMANDS

  • passwd Change your account password.
  • man commandname Gets voluminous (usually overly so) info for you on a specific command. Old-school StackExchange that works without an internet connection.
  • pwd Shows your “present working directory”.
  • cd dirname Moves you into the subdirectory, below your present directory.
  • cd .. Moves you out of a subdirectory into the directory above it.
  • cd - Moves you to previous directory you were in.
  • mkdir dirname Creates a subdirectory named dirname.
  • rmdir dirname Removes a subdirectory named dirname; it must be empty.
  • rm filename Removes a file named filename.
  • cp file1 file2 Copies the contents of file1 into file2. You are left with two files.
  • mv oldfile newfile Moves (or renames) oldfile as newfile.
  • cat file1 file2 > fileboth Concatenates file1 and file2, writing them into the new fileboth.
  • which cp Tells which cp will run when you type it, i.e. the full path of the executable (this works for any command).
  • history Gives a numbered list of the previous commands you’ve typed; typing !number repeats that command.
  • !! Repeats the previous UNIX command.
  • find dirname -name filename Finds all files with filename in and under the directory with dirname.
  • find dirname -name '*love*' Finds all files whose names contain the string “love”.
  • ls -lrt Lists the files and subdirectories in the present directory. The -lrt gives a long format in reverse time order.
  • ls -lrtp | grep / Pipes the output to grep, which selects only those lines containing “/”. The p flag appends “/” to directory names, so these are the directories.
  • grep -il text file Searches the file for occurrences of the string text. The -i ignores capitalization and the l lists only the filename.
  • less filename Shows you the contents of the file named filename one screen at a time; more flexible than more.
  • tail -40 filename Shows you the last 40 lines of the file filename.
  • du -h dirname Tells the disk space used by everything in dirname. The “h” means in “human units”. Also handy for giving the directory tree structure.
  • df -k Tells kilobytes used and available on all disks. (“-h” works here too!)
  • top Shows CPU usage, etc, for jobs on your machine.
  • ps -u username List the programs that username is currently running on the machine you are logged onto.
  • kill processnum Kills the process listed with processnum. You must own the process.
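A few of the commands in this list, sketched on made-up files in a scratch directory. The directory and file names (letters, love1.txt, notes.txt) are hypothetical, and tail -n 2 is used as the portable spelling of tail -2.

```shell
work=$(mktemp -d)
mkdir "$work/letters"
printf 'Dear reader,\nWith LOVE,\nme\n' > "$work/letters/love1.txt"
printf 'line 1\nline 2\nline 3\n'      > "$work/notes.txt"

# find: names containing "love", searching $work and everything under it.
found=$(find "$work" -name '*love*')

# grep -il: case-insensitive search, printing only the names of matching files.
matches=$(grep -il 'love' "$work/notes.txt" "$work/letters/love1.txt")

# tail: the last two lines of a file.
last_two=$(tail -n 2 "$work/notes.txt")

du -h "$work"     # disk usage of the directory tree, in human-readable units
echo "$found"
echo "$matches"
echo "$last_two"
rm -rf "$work"
```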