Today’s lecture was about getting your bearings on the arctic server, finding your home directory, getting your web folder set up, and dabbling in Python. In the theory part we defined what big data is: data counts as big data when at least one of the three Vs (volume, velocity, variety) applies. We also briefly discussed different approaches to big data analysis.
For the exercise we needed to get access to arctic. Please find the PDF file of the exercise here: Exercise 1. To connect you needed either ssh (on OS X) or PuTTY (on Windows). If you want graphical content forwarded to you, and want to use xemacs properly, you need to either connect with ssh -X username@arctic.cse.msu.edu or enable X11 forwarding in PuTTY. For this to work on a Mac, however, you need to install XQuartz first.
To remember from today: you look around the Unix/Linux filesystem with ls (list the directory content) and change folders with cd (.. is the folder above). Files are copied with cp and moved with mv. In the exercise we used gzip and tar to unpack a file, and we used ln (with the -s option) to create a symbolic link instead of copying files. You make folders with mkdir, and you can remove files and folders with rm (folders need rm -r). There is of course a very detailed and good command line (called shell) tutorial to be studied here.
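If you want to practice these commands without risking your real files, here is a small sketch you can try. It is not the exercise itself: the folder and file names (data, backup, notes.txt, shortcut.txt) are made up for illustration, and I am assuming your system has mktemp, which Linux and OS X both provide. Everything happens inside a throwaway folder.

```shell
#!/bin/sh
set -e  # stop at the first command that fails

# Assumption: mktemp -d is available; it creates an empty throwaway folder.
sandbox=$(mktemp -d)
cd "$sandbox"

# mkdir makes folders; ls lists what is inside one.
mkdir data backup
echo "hello arctic" > data/notes.txt   # hypothetical example file
ls data

# cp copies (the original stays), mv moves or renames.
cp data/notes.txt backup/
mv backup/notes.txt backup/notes_old.txt

# Pack the folder with tar, then compress the archive with gzip.
tar -cf data.tar data
gzip data.tar                          # leaves data.tar.gz behind

# Unpack into a separate folder: gzip decompresses, tar extracts.
mkdir unpacked
cd unpacked
gzip -dc ../data.tar.gz | tar -xf -
cd ..                                  # .. is the folder above

# ln -s creates a symbolic link instead of a second copy.
ln -s data/notes.txt shortcut.txt
cat shortcut.txt

# rm removes files; folders need rm -r.
rm backup/notes_old.txt
rm -r backup
```

When you are done playing, rm -r "$sandbox" removes the whole throwaway folder again. Be careful with rm in general: there is no trash can on the shell.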
For next time I ask you to install IPython Notebook, which is most easily done with the Enthought Canopy package. Please install the free version. You should also get an account at sagemath.cloud as a fallback, so you can still do everything in case the installation has issues.
If you want to check out today’s slides, you can find them here: Lecture 1.
I think the exercise must have been confusing at times. You needed to do things that sometimes didn’t work and sometimes needed more time to sink in. I also think the shell in particular is rather unintuitive, because we are used to graphical interfaces and suddenly have to do everything by hand, using strange abbreviations, on a system that doesn’t forgive mistakes. The next lectures, which deal with IPython Notebook, will be far more intuitive and will not require you to jump through hoops. I don’t know why handin didn’t work properly, but you know my email address now and can just mail me the exercise.
Please create an account on this blog so that you can comment as well. If your comment contains a link, it will end up in a queue for approval; this is spam protection. I encourage you to use the comments to ask, point out, question, criticize, demand, or compliment. Your feedback will improve the quality of the lecture.
We haven’t talked about the schedule and outline of the class yet; we will do that on Monday. Looking forward to seeing you again,
Cheers Arend